Project ideas from Hacker News discussions.

We replaced RAG with a virtual filesystem for our AI documentation assistant

📝 Discussion Summary

Three dominant themes from the discussion

  • Rediscovering non‑embedding, “library‑style” search
    Softwaredoug notes that people are returning to traditional, file‑system‑based semantic search that resembles how librarians organize shelves.

    "The real thing I think people are rediscovering with file system based search is that there’s a type of semantic search that’s not embedding based retrieval." – softwaredoug

  • Agents can drive any retrieval backend (Lucene, ontological NLP, etc.)
    Morkalork demonstrates that letting an LLM interact with a Lucene index yields strong results, showing retrieval is not limited to vector‑DB pipelines.

    "Doesn't have to be tho, I've had great success letting an agent loose on an Apache Lucene instance. Turns out LLMs are great at building queries." – morkalork

  • Practical and cost hurdles in real‑world deployments
    Mandeeepj highlights the steep cost of sandbox environments, questioning the viability of $70k‑plus annual expenses. Meanwhile, pboulos points out that messy organizational structures make RAG adoption especially hard.

    "even a minimal setup ... would put us north of $70,000 a year..." – mandeeepj
    "From personal experience, getting RAG to work well in places where the structure of the organisation ... is far from hierarchical ... is a very hard task." – pboulos


🚀 Project Ideas


Librarian‑Style Semantic File Search

Summary

  • A desktop‑style search UI that organizes files into domain‑based “shelves” and lets users query them with natural‑language terms, mimicking librarian intuition.
  • Provides deterministic, inspectable search results without relying on embeddings or opaque vector similarity.

Details

  • Target Audience: Knowledge workers, researchers, and developers managing large local collections of notes, docs, and code.
  • Core Feature: Hierarchical domain indexing with NL query parsing that maps to folder paths and optional Boolean filters.
  • Tech Stack: Node.js/TypeScript front‑end, SQLite/TinyDB for storage, Rust fuzzy‑matcher, optional Electron wrapper.
  • Difficulty: Medium
  • Monetization: Hobby

Notes

  • HN commenters highlighted “rediscovering semantic search that works like a librarian” and “LLMs are great at building queries,” showing clear community enthusiasm.

  • Addresses a concrete pain point: users want interpretable, editable search results rather than black‑box embedding matches.
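
The core feature above can be sketched in a few lines. This is a minimal illustration of "librarian-style" query routing, where natural-language terms map to domain "shelves" (folder paths) plus Boolean filters; the `SHELVES` mapping, `route_query`, and all terms are illustrative assumptions, not taken from the thread (a real version would use the project's Node.js/TypeScript stack).

```python
# Minimal sketch: map a free-text query to "shelves" (folders) plus
# Boolean must/must-not filters. All names here are illustrative.
from dataclasses import dataclass, field

# Hypothetical shelf catalog: query term -> candidate folders.
SHELVES = {
    "billing": ["docs/finance", "docs/invoices"],
    "deploy": ["ops/runbooks", "infra"],
    "api": ["docs/api", "src/handlers"],
}

@dataclass
class Query:
    paths: list = field(default_factory=list)     # folders to search
    must: list = field(default_factory=list)      # AND terms
    must_not: list = field(default_factory=list)  # NOT terms

def route_query(text: str) -> Query:
    """Translate free text into shelf paths and Boolean filters."""
    paths, must, must_not = [], [], []
    negate = False
    for tok in text.lower().split():
        if tok in ("not", "-"):
            negate = True           # next token is excluded
            continue
        if tok in SHELVES:
            paths.extend(SHELVES[tok])
        elif negate:
            must_not.append(tok)
        else:
            must.append(tok)
        negate = False
    return Query(paths or ["."], must, must_not)

q = route_query("deploy checklist not staging")
```

Because the routing is a plain lookup plus token filters, every result is explainable: the user can see exactly which shelf matched and edit the mapping by hand.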

AgentQueryEngine

Summary

  • A lightweight tool that lets LLM agents issue natural‑language queries against a local Lucene (or similar inverted‑index) database, auto‑generating the appropriate indexing and retrieval calls.
  • Enables agents to treat file‑system primitives as first‑class tools, reducing reliance on heavyweight VM sandboxes.

Details

  • Target Audience: Developers building agentic RAG pipelines, hobbyists experimenting with local LLMs, and teams needing low‑overhead knowledge search.
  • Core Feature: Natural‑language to index‑query translation that drives Lucene queries, with fallback to plain‑text file reads.
  • Tech Stack: Python backend (Whoosh or PyLucene), FastAPI for the API, Docker for an optional sandbox, JavaScript front‑end for agent integration.
  • Difficulty: Medium
  • Monetization: Revenue‑ready: $9/mo subscription for a cloud‑hosted index service, plus usage‑based compute.

Notes

  • Community remarks such as “LLMs are great at building queries” and “agents can call any retrieval backend” confirm demand for this capability.

  • Offers immediate utility for anyone wanting fast, executable search without spinning up full VMs.
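
The NL→index‑query step can be sketched with a toy inverted index standing in for Lucene/Whoosh; the translation shape is the same, only the backend is swapped. The documents, stop‑word list, and the `translate`/`search` helpers are assumptions for illustration (in practice an LLM agent would emit the query structure, or a Lucene query string, directly).

```python
# Toy inverted index in place of Lucene/Whoosh, to show the
# NL -> boolean-index-query translation an agent would perform.
from collections import defaultdict

DOCS = {
    "auth.md": "token refresh flow for the auth api",
    "deploy.md": "staging deploy checklist and rollback steps",
    "billing.md": "invoice export and billing api limits",
}

# Build posting lists: term -> set of doc ids containing it.
index = defaultdict(set)
for doc_id, text in DOCS.items():
    for term in text.split():
        index[term].add(doc_id)

def translate(nl_query: str) -> dict:
    """Naive NL->query translation: keep content words as MUST terms.
    An LLM agent would produce this structure (or a query string)."""
    stop = {"the", "a", "for", "and", "how", "do", "i"}
    return {"must": [t for t in nl_query.lower().split() if t not in stop]}

def search(query: dict) -> set:
    """AND-intersect posting lists, like a boolean Lucene query."""
    postings = [index[t] for t in query["must"] if t in index]
    return set.intersection(*postings) if postings else set()

hits = search(translate("how do I deploy staging"))
```

When the boolean query returns nothing, the tool would fall back to plain‑text file reads, as the core feature describes.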

SandboxCostSaver

Summary

  • A managed platform that provides ultra‑low‑cost, short‑lived sandbox environments for LLM agents (e.g., 1 vCPU, 2 GiB RAM for $0.005/h).

  • Dynamically scales resources and bills per second, dramatically lowering the barrier for experimentation.

Details

  • Target Audience: Start‑ups, indie hackers, and hobbyists developing agent‑based workflows who are constrained by high sandbox pricing.
  • Core Feature: Pay‑per‑second VM provisioning with pre‑configured LLM runtimes and isolated network storage.
  • Tech Stack: K3s + KubeVirt for lightweight VMs, Prometheus for usage metering, Stripe for billing.
  • Difficulty: High
  • Monetization: Revenue‑ready: usage‑based pricing with a free tier up to 100 hrs/month; $0.005 per vCPU‑hour and $0.001 per GiB‑hour thereafter.

Notes

  • Direct response to HN concerns about $70k/year sandbox costs (“how about if we round off one zero?”), indicating a sizable market for affordable alternatives.

  • Could spark discussion on sustainable pricing models for developer sandboxes while providing immediate, practical utility.
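
A back‑of‑envelope metering sketch makes the pricing model concrete, using the rates quoted above ($0.005/vCPU‑hour, $0.001/GiB‑hour, 100 free hours/month). The session shape and the assumption that the free tier applies to wall‑clock seconds are illustrative choices, not a spec.

```python
# Pay-per-second billing sketch for the quoted rates. The free tier is
# assumed (for illustration) to apply to wall-clock time across sessions.
VCPU_RATE_PER_HOUR = 0.005   # $ per vCPU-hour
GIB_RATE_PER_HOUR = 0.001    # $ per GiB-hour
FREE_TIER_HOURS = 100        # per month

def bill(sessions, free_hours=FREE_TIER_HOURS):
    """sessions: list of (seconds, vcpus, gib) tuples.
    Bills per second, deducting the free tier first."""
    total = 0.0
    remaining_free = free_hours * 3600  # free tier, in seconds
    for seconds, vcpus, gib in sessions:
        billable = max(0, seconds - remaining_free)
        remaining_free = max(0, remaining_free - seconds)
        hours = billable / 3600
        total += hours * (vcpus * VCPU_RATE_PER_HOUR
                          + gib * GIB_RATE_PER_HOUR)
    return round(total, 4)

# A month of 150 wall-clock hours on a 1 vCPU / 2 GiB sandbox:
# 50 billable hours x (0.005 + 2 x 0.001) = $0.35
cost = bill([(150 * 3600, 1, 2)])
```

At these rates even heavy hobbyist usage stays in cents per month, which is exactly the contrast with the $70k/year figure the thread objects to.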
