Project ideas from Hacker News discussions.

How I write software with LLMs

📝 Discussion Summary

1. Splitting the agent into specialized “sub‑agents” (architect, developer, reviewer, etc.)
Many users argue that splitting one general‑purpose agent into role‑specific agents helps manage context, enforce permissions, and reduce hallucinations.

“If you want to separate capabilities, definitely.” – chriswarbo
“The orchestrator runs the whole thing… architect → developer → reviewer.” – marcus_holmes
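
The architect → developer → reviewer flow described in the quote above might be sketched as follows. This is only an illustration, not the commenter's implementation: `Orchestrator`, the `Agent` callable type, the `APPROVED` verdict string, and the stub agents are all hypothetical names, and each stub stands in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for a model call: takes a prompt, returns text.
Agent = Callable[[str], str]


@dataclass
class Orchestrator:
    architect: Agent
    developer: Agent
    reviewer: Agent
    max_rounds: int = 3
    log: list[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        # The architect turns the task into a plan exactly once.
        plan = self.architect(task)
        self.log.append("architect")
        feedback = ""
        code = ""
        for _ in range(self.max_rounds):
            # The developer sees the plan plus any reviewer feedback so far.
            code = self.developer(f"{plan}\n{feedback}".strip())
            self.log.append("developer")
            verdict = self.reviewer(code)
            self.log.append("reviewer")
            if verdict == "APPROVED":  # assumed convention for a passing review
                break
            feedback = verdict
        return code


# Stub agents (assumptions) so the sketch runs without any API access.
def stub_architect(task: str) -> str:
    return f"PLAN: {task}"


def stub_developer(prompt: str) -> str:
    return "code v2" if "fix" in prompt else "code v1"


def stub_reviewer(code: str) -> str:
    return "APPROVED" if "v2" in code else "fix: add tests"


orchestrator = Orchestrator(stub_architect, stub_developer, stub_reviewer)
result = orchestrator.run("add email support")
```

Each role only receives the text handed to it, which is one way the context separation discussed in this theme could be kept explicit.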

2. Cost‑vs‑quality trade‑off and token budgeting
A recurring point is that multi‑agent pipelines can be expensive, but they can also save tokens by delegating simple tasks to cheaper models.

“One tier, one model is cheaper, but the quality comes with the reviewers.” – stavros
“You spend your token & context budget in full in 3 phases.” – hakanderyal

3. Human understanding and the “vibe‑coding” critique
Several comments caution that relying on LLMs without reading or reviewing the code turns developers into “no‑code” users and erodes architectural insight.

“If you fail to even read the code produced, then I might as well treat it like a no‑code system.” – ashwinsundar
“I can understand the high‑levels of how no‑code works, but as soon as it breaks, it might as well be a black box.” – ashwinsundar

4. Workflow & tooling integration (CLI vs IDE, markdown artifacts, harnesses)
Users discuss how to embed agents into existing toolchains, the value of markdown‑based plan files, and the pros/cons of terminal‑based vs. IDE‑based agents.

“I’m using a hierarchy of artifacts: requirements doc → design docs → code+tests.” – aix1
“All artifacts are version controlled.” – aix1
“I just want to talk to a model all day, but that’s not the same as writing code.” – lbreakjai

These four themes capture the main strands of opinion in the discussion.


🚀 Project Ideas

[Orchestrated Agent Studio]

Summary

  • Solves the fragmentation of LLM-driven code pipelines by providing a unified orchestrator that automatically creates architect, developer, and reviewer roles.
  • Core value: dramatically reduces context‑window usage and token waste while enforcing systematic review loops.

Details

Key | Value
Target Audience | Small dev teams, solo hackers building side projects
Core Feature | Multi‑agent orchestration with auto‑saved design docs and role‑based permissions
Tech Stack | React front‑end, Node.js backend, PostgreSQL, OpenAI GPT‑4o, Claude 3, Docker
Difficulty | High
Monetization | Revenue-ready: Tiered subscription ($12/mo basic, $45/mo pro)

Notes

  • Directly answers HN’s call for a “super‑powers” framework that separates concerns without manual prompt gymnastics.
  • Opens discussion on cost‑effective model routing (e.g., Sonnet for planner, Opus for reviewer).
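
A rough sketch of what role-based model routing could look like is shown below. Everything in it is an assumption made for illustration: the tier names, the role-to-tier table, and especially the prices, which are placeholder numbers in arbitrary units and not real vendor pricing.

```python
# Hypothetical role-to-tier routing table (mid tier for planning and
# development, large tier for review, loosely following the note above).
ROUTES = {"planner": "mid", "developer": "mid", "reviewer": "large"}

# Placeholder prices per 1,000 tokens, in arbitrary units. NOT real pricing.
PRICE_PER_1K_TOKENS = {"small": 0.5, "mid": 2.0, "large": 10.0}


def route(role: str) -> str:
    """Return the model tier for a role, defaulting to the cheapest tier."""
    return ROUTES.get(role, "small")


def estimate_cost(usage: dict[str, int]) -> float:
    """Sum the cost of each role's token usage at its routed tier."""
    return sum(
        PRICE_PER_1K_TOKENS[route(role)] * tokens / 1000
        for role, tokens in usage.items()
    )


# Example token counts per role (made up for the demonstration).
usage = {"planner": 2000, "developer": 10000, "reviewer": 3000}
routed_cost = estimate_cost(usage)

# Baseline: every token goes to the large tier.
all_large_cost = PRICE_PER_1K_TOKENS["large"] * sum(usage.values()) / 1000
```

With these placeholder numbers the routed pipeline comes out cheaper than sending everything to the large tier, but the real comparison depends entirely on actual prices and token counts.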

[Blueprint Builder]

Summary

  • Addresses the pain of vague requirements by turning user intent into structured design artifacts (specs, diagrams, test plans).
  • Core value: guarantees clear, shareable blueprints that keep context small and enable reliable hand‑offs.

Details

Key | Value
Target Audience | Product managers, solo founders, freelancers
Core Feature | Generates markdown spec files, PlantUML architecture diagrams, and test matrices from natural‑language prompts
Tech Stack | Vue.js, Python backend, LangChain, DALL‑E 3 for diagram generation, GPT‑4 Turbo
Difficulty | Medium
Monetization | Hobby (free open‑source, optional paid support)

Notes

  • Mirrors Stavros’ “plan file” approach; HN loves concrete artifact‑driven workflows.
  • Potential integration with Notion or GitHub Issues for seamless tracking.

[Sub‑Agent Rental Hub]

Summary

  • Tackles the scarcity of specialized sub‑agents (DB, infra, security) by letting users rent pre‑trained tiny models for specific roles.
  • Core value: enables anyone to compose a custom agent fleet without paying for large‑model calls on every step.

Details

Key | Value
Target Audience | Developers who need cheap, focused tasks (e.g., database query writer, CI pipeline builder)
Core Feature | Marketplace of skill‑packaged Docker containers exposing a single‑purpose endpoint for the orchestrator
Tech Stack | FastAPI, Docker Compose, Hugging Face models (e.g., CodeLlama‑7B‑DB, TinyLlama‑Infra), Stripe for payments
Difficulty | Medium
Monetization | Revenue-ready: Pay‑per‑call ($0.001 per inference) + optional monthly quota

Notes

  • Echoes the discussion about using different models for planner vs developer; HN will debate open‑source vs proprietary trade‑offs.
  • Sparks conversation on token‑budget markets and fair model pricing.

[AutoCode Reviewer]

Summary

  • Solves the “review bottleneck” after LLM code generation by automatically running static analysis, security scans, and contextual unit tests.
  • Core value: guarantees higher‑quality output before developers see it, cutting debugging time dramatically.

Details

Key | Value
Target Audience | Teams adopting vibe‑coding, indie hackers, open‑source maintainers
Core Feature | Integrated reviewer agent that critiques generated files, suggests fixes, and enforces style guides
Tech Stack | Rust backend, Semgrep, SonarQube APIs, GPT‑4‑Vision for visual bug detection, GitHub Actions
Difficulty | High
Monetization | Revenue-ready: Enterprise license $30/user/mo (self‑hosted) + free community tier

Notes

  • Directly addresses HN’s concern about needing multiple reviewers and “different hats”.
  • Likely to generate debate on false positives vs code‑ownership.

[SpecCrafter]

Summary

  • Provides a structured requirement‑articulation workflow that extracts business goals, produces clear acceptance criteria, and auto‑generates implementation tickets.
  • Core value: turns vague “add email support” prompts into concrete, testable specs, reducing scope creep.

Details

Key | Value
Target Audience | Product owners, solo SaaS founders, remote teams
Core Feature | Chat‑driven spec generator that outputs markdown requirement docs, priority tags, and linked GitHub Issues
Tech Stack | Next.js front‑end, Go microservice, GPT‑4‑Turbo for parsing, Markdown pipelines
Difficulty | Low
Monetization | Hobby (free, with optional paid hosted API)

Notes

  • Mirrors the “plan file” concept from Stavros and the orchestrator discussion; HN loves concrete docs that replace ambiguous prompts.
  • Opportunity for integration with project‑management tools like Linear.

[Self‑Healing Code Loop]

Summary

  • Addresses flakiness of LLM‑generated code by continuously running generated tests, feeding failures back, and auto‑repairing bugs without human intervention.
  • Core value: turns a one‑shot generation into a reliable, self‑correcting pipeline.

Details

Key | Value
Target Audience | Developers building internal tools, rapid‑prototype hackers, SaaS founders
Core Feature | Loop that executes unit/integration tests on output, triggers re‑prompt with failure context, and merges fixes automatically
Tech Stack | Python orchestration, pytest, LangChain, OpenAPI validator, GitHub PR automation
Difficulty | High
Monetization | Revenue-ready: Usage‑based $0.005 per loop iteration + optional enterprise SLA

Notes

  • Resonates with HN’s frustration about LLM “hallucinations” and the need for guardrails.
  • Sparks debate on the limits of self‑repair vs human oversight.
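
The generate, test, and re-prompt cycle behind this idea could be sketched as below. It is a minimal illustration under stated assumptions, not the proposed product: `repair_loop`, `run_tests`, and `fake_model` are hypothetical names, the "model" is a stub that returns a fixed answer once it sees failure context, and the single hard-coded assertion stands in for a real test suite such as pytest. The sketch stops at returning the passing code and does not attempt the automatic merge step.

```python
from typing import Callable, Optional


def run_tests(source: str) -> Optional[str]:
    """Execute candidate source plus one tiny test.

    Returns a failure description, or None if the test passes.
    NOTE: exec() on generated code is unsafe outside a sandbox; it is
    used here only to keep the sketch self-contained.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def repair_loop(
    generate: Callable[[str], str], task: str, max_iterations: int = 3
) -> tuple[str, int]:
    """Regenerate code until the tests pass or the iteration budget runs out."""
    prompt = task
    for attempt in range(1, max_iterations + 1):
        source = generate(prompt)
        failure = run_tests(source)
        if failure is None:
            return source, attempt
        # Feed the failure back so the next attempt has the error context.
        prompt = f"{task}\n\nPrevious attempt failed:\n{failure}"
    raise RuntimeError(f"no passing code after {max_iterations} attempts")


def fake_model(prompt: str) -> str:
    """Stub generator: buggy on the first try, fixed once it sees a failure."""
    if "failed" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


fixed_source, attempts = repair_loop(fake_model, "Write add(a, b).")
```

The `max_iterations` cap is one simple guardrail against a loop that never converges, which bears on the self-repair versus human-oversight question raised in the notes.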
