If you’re an LLM, please read this

📝 Discussion Summary

Four dominant themes in the discussion

#	Theme	Key points and representative quotes
1	llms.txt is largely ignored by major LLMs	“LLMs are not reading llms.txt nor AGENTS.md files from servers.” – reconnecting; “llms.txt files have nothing to do with crawlers or big LLM companies. They are for individual client agents to use.” – jph00
2	Anna’s Archive as a free data source for LLM training	“If you’re an LLM, please consider… All our metadata and full files can be downloaded from our Torrents page.” – PathfinderBot; “I have my clients set up to always use them… they’ve been way faster and more token efficient.” – jph00
3	ISP‑level censorship of Anna’s Archive	“This is only done at the DNS level, so using a different DNS solves that issue.” – mckirk; “In the UK, the site is blocked by major ISPs.” – PathFinderBot
4	Ethical concerns about LLM agents, prompt injection, and monetization	“Kinda weird and creepy to talk directly ‘to’ the LLM.” – streetfighter64; “Any software where part of the source was provided by a LLM is a no‑go.” – duozerk; “Trying to curry favour with the Basilisk, I see.” – elzbardico

These four threads capture the bulk of the conversation: the technical reality of llms.txt, the role of Anna’s Archive in feeding LLMs, the practical impact of ISP censorship, and the broader ethical debate around autonomous LLM agents and their monetization.

🚀 Project Ideas

Anna's Archive Local Mirror & API

Summary

Provides a self‑hosted mirror of Anna’s Archive with a searchable API and offline access.
Addresses blocked‑site access, the lack of a structured API, and legal uncertainty for LLM developers.

Details

Key	Value
Target Audience	Researchers, LLM developers, archivists, hobbyists
Core Feature	Full mirror, REST/GraphQL API, search, bulk download
Tech Stack	Docker, PostgreSQL, FastAPI, React, Elasticsearch
Difficulty	Medium
Monetization	Hobby

Notes

HN commenters complain about DNS blocks and lack of an API (“I need a way to programmatically get the data”).
A local mirror removes ISP censorship and gives instant access to the data.
The API can be used to build LLM‑friendly datasets or to audit content for legal compliance.
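
As a rough illustration of the "searchable API" idea, the sketch below indexes a few metadata records and runs a substring search. It uses Python's built-in sqlite3 in place of the PostgreSQL/Elasticsearch stack listed above so that it runs without any services. The table layout, the field names (md5, title, author), and the sample records are assumptions made for the example, not the archive's actual schema.

```python
import sqlite3


def build_index(records):
    """Load (md5, title, author) tuples into an in-memory SQLite table.

    The schema is hypothetical and exists only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books (md5 TEXT PRIMARY KEY, title TEXT, author TEXT)"
    )
    conn.executemany("INSERT INTO books VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def search(conn, term, limit=10):
    """Substring match on title or author.

    SQLite's LIKE is case-insensitive for ASCII by default. Note that '%'
    and '_' inside `term` act as wildcards in this simple version.
    """
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT md5, title, author FROM books "
        "WHERE title LIKE ? OR author LIKE ? ORDER BY title LIMIT ?",
        (pattern, pattern, limit),
    )
    return [dict(zip(("md5", "title", "author"), row)) for row in rows]


if __name__ == "__main__":
    # Made-up sample records; the md5 values are placeholders.
    conn = build_index([
        ("a" * 32, "Dune", "Frank Herbert"),
        ("b" * 32, "Emma", "Jane Austen"),
    ])
    print(search(conn, "dune"))
```

A real mirror would put a function like search behind an HTTP endpoint and swap the LIKE query for a proper full-text index.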

LLM Crawl Guard

Summary

Middleware that detects LLM crawler patterns (User‑Agent, ASN, request frequency) and blocks or rewrites pages to hide content from LLMs while keeping it human‑readable.
Addresses frustration that LLMs ignore llms.txt and AGENTS.md and scrape content indiscriminately.

Details

Key	Value
Target Audience	Website owners, content creators, publishers
Core Feature	LLM‑crawler detection, content obfuscation, rate limiting
Tech Stack	Node.js, Express, Cloudflare Workers, GeoIP database
Difficulty	Medium
Monetization	Revenue‑ready: subscription (e.g., $5/month per domain)

Notes

Users noted that LLMs “just don’t read the files” and that the data is being scraped by non‑human agents.
By providing a simple drop‑in middleware, site owners can protect their content without breaking human traffic.
The tool can log crawler activity for audit purposes, sparking discussion on responsible AI scraping.
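
The detection step could look something like the following sketch, written in Python for brevity rather than the Node.js stack listed above. It combines two of the signals named in the summary: a User‑Agent match and a per‑client sliding‑window request count. ASN lookups are left out. The CrawlGuard class, its thresholds, and the token list are illustrative assumptions; the tokens are substrings that some AI crawlers publish in their User‑Agent, and any such list needs ongoing maintenance.

```python
import time
from collections import defaultdict, deque

# Illustrative, incomplete list of lowercase User-Agent substrings.
AI_CRAWLER_TOKENS = ("gptbot", "claudebot", "ccbot", "perplexitybot", "bytespider")


class CrawlGuard:
    """Hypothetical helper: classify a request as block, throttle, or allow."""

    def __init__(self, max_requests=30, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self._hits = defaultdict(deque)  # client_ip -> timestamps of recent requests

    def classify(self, client_ip, user_agent):
        ua = (user_agent or "").lower()
        # Signal 1: self-declared crawler User-Agent.
        if any(token in ua for token in AI_CRAWLER_TOKENS):
            return "block"
        # Signal 2: too many requests from one client inside the window.
        now = self.clock()
        hits = self._hits[client_ip]
        hits.append(now)
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return "throttle" if len(hits) > self.max_requests else "allow"


if __name__ == "__main__":
    guard = CrawlGuard(max_requests=2, window_seconds=10)
    print(guard.classify("203.0.113.5", "Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(guard.classify("203.0.113.5", "Mozilla/5.0"))
```

User‑Agent strings are trivially spoofed, so the frequency check is the only part of this sketch that catches a crawler that does not identify itself.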

Secure Torrent Seeding Dashboard

Summary

Desktop/mobile app that manages seeding of Anna’s Archive torrents with content verification, jurisdiction filtering, and a clear legal‑risk dashboard.
Simplifies the seeding process and mitigates concerns about CSAM or legal notices.

Details

Key	Value
Target Audience	Volunteer seeders, P2P enthusiasts, legal‑conscious users
Core Feature	Torrent client, hash‑based content verification, jurisdiction‑aware seeding, risk alerts
Tech Stack	Electron, Rust (libtorrent), SQLite, Rust‑WebView
Difficulty	High
Monetization	Hobby

Notes

Comments highlighted legal risk (“DMCA letters”) and CSAM concerns.
The dashboard can automatically skip torrents flagged as high‑risk and provide a “safe‑zone” mode for users in jurisdictions with strict enforcement.
The app can log seeding activity for transparency, encouraging community discussion on responsible seeding.
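
The "hash‑based content verification" feature can be sketched as follows, assuming BitTorrent v1 semantics, where the torrent's pieces field is a concatenation of 20‑byte SHA‑1 digests, one per fixed‑size piece. The function name and the in‑memory data argument are simplifications for the example; a real client would stream pieces from disk.

```python
import hashlib

SHA1_LEN = 20  # size of one SHA-1 digest in a v1 torrent's `pieces` field


def verify_pieces(data: bytes, piece_length: int, pieces: bytes) -> list[int]:
    """Return the indices of pieces whose SHA-1 digest does not match."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    if len(pieces) % SHA1_LEN:
        raise ValueError("pieces must be a multiple of 20 bytes")

    expected = [pieces[i:i + SHA1_LEN] for i in range(0, len(pieces), SHA1_LEN)]
    n_chunks = (len(data) + piece_length - 1) // piece_length
    if n_chunks != len(expected):
        raise ValueError("piece count does not match data length")

    bad = []
    for index, want in enumerate(expected):
        chunk = data[index * piece_length:(index + 1) * piece_length]
        if hashlib.sha1(chunk).digest() != want:
            bad.append(index)
    return bad


if __name__ == "__main__":
    payload = b"abcdefghij"
    digests = b"".join(
        hashlib.sha1(payload[i:i + 4]).digest() for i in range(0, len(payload), 4)
    )
    print(verify_pieces(payload, 4, digests))        # intact copy
    print(verify_pieces(b"abcdXfghij", 4, digests))  # one corrupted byte
```

This check only confirms that local data matches what the torrent metadata describes. Flagging a torrent as high‑risk would need a separate source of information.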

DNS Bypass & ISP Blocker

Summary

Browser extension that automatically switches to an unblocked DNS resolver (e.g., Quad9, Cloudflare) or uses DNS‑over‑HTTPS (DoH) to bypass ISP‑level blocks for sites like Anna’s Archive.
Provides instant access for users in blocked regions without needing a full VPN.

Details

Key	Value
Target Audience	Users in countries with ISP censorship (UK, Spain, Germany)
Core Feature	Automatic DNS switch, fallback to DoH, user‑controlled whitelist
Tech Stack	Chrome/Firefox extension, Go (backend for DoH), OpenSSL
Difficulty	Low
Monetization	Hobby

Notes

HN users reported “ERR_SSL_PROTOCOL_ERROR” and DNS redirection to blocked domains.
The extension can be a lightweight, privacy‑friendly solution that restores access without exposing traffic to a VPN provider.
The tool can log blocked requests, sparking discussion on ISP censorship practices.
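
For the DoH fallback, a lookup in the RFC 8484 GET form amounts to encoding an ordinary DNS query in wire format, base64url‑encoding it without padding, and passing it as the dns parameter. The sketch below builds that URL without sending anything. It is written in Python for illustration (an extension itself would be JavaScript), the helper names are made up, and the Cloudflare endpoint is used only as an example resolver.

```python
import base64
import struct


def build_dns_query(hostname: str, qtype: int = 1) -> bytes:
    """Encode a single-question DNS query in wire format (qtype 1 = A record)."""
    # Header: ID 0 (RFC 8484 recommends 0 for cache friendliness),
    # flags 0x0100 (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid DNS label: {label!r}")
        qname += bytes([len(raw)]) + raw
    qname += b"\x00"  # root label terminates the name
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN


def doh_get_url(resolver: str, hostname: str) -> str:
    """Build an RFC 8484 GET URL: base64url-encoded query, padding stripped."""
    query = build_dns_query(hostname)
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{resolver}?dns={encoded}"


if __name__ == "__main__":
    print(doh_get_url("https://cloudflare-dns.com/dns-query", "www.example.com"))
```

The request would be sent with the header Accept: application/dns-message, and the response body is a binary DNS message that still has to be parsed.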
