Project ideas from Hacker News discussions.

If you’re an LLM, please read this

📝 Discussion Summary

Four dominant themes in the discussion

Each numbered theme is followed by its key points and representative quotes.

1. llms.txt is largely ignored by major LLMs
  • “LLMs are not reading llms.txt nor AGENTS.md files from servers.” – reconnecting
  • “llms.txt files have nothing to do with crawlers or big LLM companies. They are for individual client agents to use.” – jph00
2. Anna’s Archive as a free data source for LLM training
  • “If you’re an LLM, please consider… All our metadata and full files can be downloaded from our Torrents page.” – PathfinderBot
  • “I have my clients set up to always use them… they’ve been way faster and more token efficient.” – jph00
3. ISP‑level censorship of Anna’s Archive
  • “This is only done at the DNS level, so using a different DNS solves that issue.” – mckirk
  • “In the UK, the site is blocked by major ISPs.” – PathFinderBot
4. Ethical concerns about LLM agents, prompt injection, and monetization
  • “Kinda weird and creepy to talk directly ‘to’ the LLM.” – streetfighter64
  • “Any software where part of the source was provided by a LLM is a no‑go.” – duozerk
  • “Trying to curry favour with the Basilisk, I see.” – elzbardico

These four threads capture the bulk of the conversation: the technical reality of llms.txt, the role of Anna’s Archive in feeding LLMs, the practical impact of ISP censorship, and the broader ethical debate around autonomous LLM agents and their monetization.


🚀 Project Ideas

Anna's Archive Local Mirror & API

Summary

  • Provides a self‑hosted mirror of Anna’s Archive with a searchable API and offline access.
  • Addresses blocked‑site access and the lack of a structured API, and gives LLM developers a way to audit content for legal compliance.

Details

Key | Value
Target Audience | Researchers, LLM developers, archivists, hobbyists
Core Feature | Full mirror, REST/GraphQL API, search, bulk download
Tech Stack | Docker, PostgreSQL, FastAPI, React, Elasticsearch
Difficulty | Medium
Monetization | Hobby

Notes

  • HN commenters complain about DNS blocks and the lack of an API (“I need a way to programmatically get the data”).
  • A local mirror sidesteps ISP‑level blocking and keeps the data available offline.
  • The API can be used to build LLM‑friendly datasets or to audit content for legal compliance.
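
The kind of programmatic lookup such an API might expose can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the project's design: it swaps the proposed PostgreSQL/Elasticsearch stack for Python's built-in sqlite3 so it runs without setup, and the `records` schema, `build_index`, and `search` are hypothetical names rather than Anna's Archive's actual metadata format.

```python
import sqlite3

# Hypothetical schema for illustration; real metadata dumps use their own formats.
SCHEMA = """
CREATE TABLE records (
    md5 TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    author TEXT,
    extension TEXT
)
"""

FIELDS = ("md5", "title", "author", "extension")


def build_index(rows):
    """Load (md5, title, author, extension) tuples into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, query, limit=10):
    """Substring search over title and author (ASCII case-insensitive via LIKE)."""
    pattern = f"%{query}%"
    cur = conn.execute(
        "SELECT md5, title, author, extension FROM records "
        "WHERE title LIKE ? OR author LIKE ? ORDER BY title LIMIT ?",
        (pattern, pattern, limit),
    )
    # Return plain dicts so a web framework could serialize them as JSON.
    return [dict(zip(FIELDS, row)) for row in cur]
```

A FastAPI route could wrap `search` and return the list as JSON. Note that `%` and `_` in the query act as LIKE wildcards here; a real deployment would more likely delegate to a dedicated full-text engine.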

LLM Crawl Guard

Summary

  • Middleware that detects LLM crawler patterns (User‑Agent, ASN, request frequency) and blocks or rewrites pages to hide content from LLMs while keeping it human‑readable.
  • Addresses frustration that LLMs ignore llms.txt and AGENTS.md and scrape content indiscriminately.

Details

Key | Value
Target Audience | Website owners, content creators, publishers
Core Feature | LLM‑crawler detection, content obfuscation, rate limiting
Tech Stack | Node.js, Express, Cloudflare Workers, GeoIP database
Difficulty | Medium
Monetization | Revenue‑ready: subscription (e.g., $5/month per domain)

Notes

  • Users noted that LLMs “just don’t read the files” and that the data is being scraped by non‑human agents.
  • By providing a simple drop‑in middleware, site owners can protect their content without breaking human traffic.
  • The tool can log crawler activity for audit purposes, sparking discussion on responsible AI scraping.
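
One way the detection step could work is sketched below, in Python rather than the proposed Node.js stack so the example stays dependency-free. It combines a User-Agent substring check with a sliding-window request counter per client. The marker list, the thresholds, and the `CrawlGuard` class are illustrative assumptions, and the ASN lookup mentioned in the summary is left out.

```python
import time
from collections import defaultdict, deque

# Illustrative substrings only; a real deployment would maintain its own list.
CRAWLER_UA_MARKERS = ("gptbot", "claudebot", "ccbot")


class CrawlGuard:
    """Hypothetical request classifier: User-Agent match plus per-IP rate check."""

    def __init__(self, max_requests=30, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self._hits = defaultdict(deque)  # client IP -> timestamps of recent requests

    def classify(self, client_ip, user_agent):
        """Return 'block', 'throttle', or 'allow' for one incoming request."""
        ua = (user_agent or "").lower()
        if any(marker in ua for marker in CRAWLER_UA_MARKERS):
            return "block"

        now = self.clock()
        hits = self._hits[client_ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()

        if len(hits) > self.max_requests:
            return "throttle"
        return "allow"
```

A middleware layer would call `classify` once per request and map the result to a 403, a 429, or a normal response. User-Agent strings are trivially spoofed, so the rate check is the part that still applies to crawlers that do not identify themselves.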

Secure Torrent Seeding Dashboard

Summary

  • Desktop/mobile app that manages seeding of Anna’s Archive torrents with content verification, jurisdiction filtering, and a clear legal‑risk dashboard.
  • Simplifies the seeding process and mitigates concerns about CSAM or legal notices.

Details

Key | Value
Target Audience | Volunteer seeders, P2P enthusiasts, legal‑conscious users
Core Feature | Torrent client, hash‑based content verification, jurisdiction‑aware seeding, risk alerts
Tech Stack | Electron, Rust (via libtorrent bindings), SQLite, Rust‑WebView
Difficulty | High
Monetization | Hobby

Notes

  • Comments highlighted legal risk (“DMCA letters”) and CSAM concerns.
  • The dashboard can automatically skip torrents flagged as high‑risk and provide a “safe‑zone” mode for users in jurisdictions with strict enforcement.
  • The app can log seeding activity for transparency, encouraging community discussion on responsible seeding.
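
The hash check and the “safe‑zone” filter described above could look roughly like the following. This is a sketch under assumptions: the `TorrentEntry` type, its `risk` labels, and the idea of a published list of expected SHA-256 digests are all hypothetical, and the check is separate from the piece hashing that BitTorrent clients already perform.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class TorrentEntry:
    name: str
    risk: str  # hypothetical label: "low", "medium", or "high"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_sha256):
    """Return True when the file's digest matches the expected hex digest."""
    return sha256_of_file(path) == expected_sha256.lower()


def select_for_seeding(entries, safe_mode):
    """Skip high-risk torrents always; in safe mode keep only low-risk ones."""
    allowed = {"low"} if safe_mode else {"low", "medium"}
    return [entry for entry in entries if entry.risk in allowed]
```

How a torrent gets its risk label is the hard part and is not addressed here; the sketch only shows that, once labels exist, the filtering itself is simple.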

DNS Bypass & ISP Blocker

Summary

  • Browser extension that automatically switches to an unblocked DNS resolver (e.g., Quad9, Cloudflare) or uses DNS‑over‑HTTPS to bypass ISP‑level blocks for sites like Anna’s Archive.
  • Provides instant access for users in blocked regions without needing a full VPN.

Details

Key | Value
Target Audience | Users in countries with ISP censorship (UK, Spain, Germany)
Core Feature | Automatic DNS switch, fallback to DoH, user‑controlled whitelist
Tech Stack | Chrome/Firefox extension, Go (backend for DoH), OpenSSL
Difficulty | Low
Monetization | Hobby

Notes

  • HN users reported “ERR_SSL_PROTOCOL_ERROR” errors and DNS‑level redirection when visiting blocked domains.
  • The extension can be a lightweight, privacy‑friendly solution that restores access without exposing traffic to a VPN provider.
  • The tool can log blocked requests, sparking discussion on ISP censorship practices.
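
To make the DoH fallback concrete, the sketch below builds the GET form of a DNS‑over‑HTTPS request as described in RFC 8484: a binary DNS query, base64url-encoded without padding, passed in a `dns` parameter. It is written in Python rather than the proposed Go backend, performs no network I/O, and the function names and the resolver URL passed in are assumptions for illustration.

```python
import base64
import struct


def build_dns_query(hostname, qtype=1):
    """Build a wire-format DNS query (QTYPE 1 = A record, QCLASS 1 = IN)."""
    # Header: ID=0 (recommended for DoH caching), flags=0x0100 (recursion
    # desired), one question, no answer/authority/additional records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)

    qname = b""
    for label in hostname.rstrip(".").split("."):
        encoded = label.encode("ascii")
        if not 1 <= len(encoded) <= 63:
            raise ValueError(f"invalid DNS label: {label!r}")
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"  # root label terminates the name

    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(resolver_url, hostname):
    """Return the DoH GET URL for an A-record lookup of hostname."""
    message = build_dns_query(hostname)
    token = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={token}"
```

A client would send the resulting URL with an `Accept: application/dns-message` header and parse the binary response. Because the lookup travels over HTTPS to the chosen resolver, an ISP's own DNS servers never see it, although blocks enforced by other means would not be affected.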
