Project ideas from Hacker News discussions.

Many SWE-bench-Passing PRs would not be merged

🚀 Project Ideas

RepoTailor: Customized LLM PR Evaluation Engine

Summary

  • A plug‑in service that evaluates pull requests generated by LLMs against a project’s own coding standards, test coverage, and architectural constraints.
  • Provides a “fit score” that predicts merge likelihood, reducing manual review overload.

Details

Key | Value
Target Audience | Engineering teams using AI code assistants at scale
Core Feature | Automated PR scoring that combines test outcomes, style compliance, and architectural deviation metrics
Tech Stack | FastAPI backend, PostgreSQL, Docker, React front‑end, Rust inference engine
Difficulty | Medium
Monetization | Revenue-ready: tiered pricing per active repository

Notes

  • HN users repeatedly note that “tests pass ≠ merge ready” and that maintainers reject AI PRs over style; this product directly addresses that pain.
  • Opens a market for repo‑specific evaluation tools, a gap highlighted by bisonbear’s call for custom metrics.
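
The "fit score" described above could be sketched as a weighted combination of the three signals. This is a minimal illustration, not a specification: the `PRSignals` fields, the `WEIGHTS` values, and the `fit_score` function are all hypothetical names and numbers, and each input is assumed to be normalized to the range 0 to 1 beforehand.

```python
from dataclasses import dataclass


@dataclass
class PRSignals:
    """Hypothetical per-PR inputs, each normalized to the range [0, 1]."""
    test_pass_rate: float    # fraction of the test suite that passes
    style_compliance: float  # fraction of style checks that pass
    arch_deviation: float    # 0 = matches the architecture, 1 = fully diverges


# Illustrative weights only (assumption); a real service would tune them per repository.
WEIGHTS = {"tests": 0.4, "style": 0.3, "arch": 0.3}


def fit_score(s: PRSignals) -> float:
    """Weighted average of the three signals, with deviation inverted."""
    for name, value in vars(s).items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (
        WEIGHTS["tests"] * s.test_pass_rate
        + WEIGHTS["style"] * s.style_compliance
        + WEIGHTS["arch"] * (1.0 - s.arch_deviation)
    )
```

For example, a PR whose tests all pass but whose style compliance is 0.5 and architectural deviation is 0.6 scores about 0.67 under these weights, which illustrates how passing tests alone would not produce a high score.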

StylePrompt Hub: Reusable Architectural Taste Prompts

Summary

  • A marketplace of vetted “taste” prompts that encode preferred code structure, naming conventions, and refactor habits for LLMs.
  • Users can adopt, share, or customize prompts to steer AI output toward maintainable code.

Details

Key | Value
Target Audience | AI‑augmented developers and teams seeking consistent code quality
Core Feature | Curated prompt library with metadata tags (e.g., “low entropy”, “high DRY”) and versioning
Tech Stack | Next.js UI, GraphQL API, Node.js workers, Markdown prompt storage
Difficulty | Low
Monetization | Revenue-ready: freemium with premium prompt packs

Notes

  • Discussions touch on “prompt engineering for taste” and the difficulty of conveying architectural intent; this product makes such prompts reusable.
  • Aligns with requests for better “steering” of LLMs and reduces time spent crafting ad‑hoc prompts.
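
One way to picture the tagged, versioned prompt library is a small record type plus a lookup that returns the newest version of each prompt carrying a given tag. This is a sketch under assumptions: `TastePrompt`, its fields, and `latest_by_tag` are hypothetical names, and versions are modeled as `(major, minor, patch)` tuples purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TastePrompt:
    """Hypothetical library entry; field names are illustrative."""
    name: str
    version: tuple[int, int, int]  # (major, minor, patch)
    tags: frozenset[str]
    body: str                      # Markdown prompt text


def latest_by_tag(prompts: list[TastePrompt], tag: str) -> list[TastePrompt]:
    """For each prompt name, return the newest version that carries `tag`."""
    newest: dict[str, TastePrompt] = {}
    for p in prompts:
        if tag not in p.tags:
            continue
        current = newest.get(p.name)
        # Tuples compare element by element, so (1, 2, 0) > (1, 1, 9).
        if current is None or p.version > current.version:
            newest[p.name] = p
    return sorted(newest.values(), key=lambda p: p.name)
```

Comparing version tuples rather than strings avoids the usual pitfall where "1.10.0" sorts before "1.9.0".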

EntropyLens: Codebase Complexity & Maintainability Dashboard

Summary

  • A SaaS that measures code entropy, cyclomatic complexity, and abstraction depth to surface hidden technical debt in AI‑generated code.
  • Generates visual heatmaps and actionable refactor suggestions to improve maintainability.

Details

Key | Value
Target Audience | DevOps engineers, senior engineers, and maintainers of large codebases
Core Feature | Static analysis engine calculating entropy, cross‑entropy, and entropy‑adjusted diff size; UI dashboard with trend alerts
Tech Stack | Python backend, Elasticsearch, D3.js visualizations, Docker Compose
Difficulty | High
Monetization | Revenue-ready: usage‑based pricing per million lines analyzed

Notes

  • Multiple comments stress measuring “entropy” and signals beyond test passes (e.g., cyclomatic complexity, diff size) to judge maintainability.
  • Directly addresses the desire for structural metrics highlighted by users like code_biologist and jlandersen.

PatternGuard: Auto‑Generated Linter Rules from AI Refactors

Summary

  • A service that learns from successful AI refactors and automatically creates lint rules that enforce desired code patterns and prevent regressions.
  • Integrates with CI to block PRs that violate the learned patterns.

Details

Key | Value
Target Audience | Teams using AI code assistants who face “spaghetti” outputs and need enforceable style guardrails
Core Feature | Lint rule generator that analyzes diffs from accepted AI PRs and produces custom ESLint, Ruff, or Pylint rules
Tech Stack | Node.js rule parser, TypeScript AST transformer, GitHub Action integration
Difficulty | Medium
Monetization | Revenue-ready: monthly subscription per repository

Notes

  • Commenters repeatedly mention the inability to enforce “taste” automatically; this product turns ad‑hoc fixes into reusable lint rules.
  • Responds to remarks about “AI making weird choices” and the need for “pattern enforcement” to keep codebases coherent.
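
The idea of learning rules from accepted refactors can be illustrated with a deliberately simple heuristic: if the same function call disappears in several accepted before/after pairs, treat it as banned and flag it in new code. The sketch uses Python's `ast` module rather than the TypeScript stack listed above, and `called_names`, `learn_banned_calls`, `check`, and the `min_support` threshold are hypothetical names chosen for this example. A real generator would need far richer pattern mining than call removal.

```python
import ast
from collections import Counter


def called_names(source: str) -> Counter:
    """Count simple function-call names such as `print(...)` in the source."""
    return Counter(
        node.func.id
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    )


def learn_banned_calls(refactors: list[tuple[str, str]],
                       min_support: int = 2) -> set[str]:
    """Names called before, and absent after, in at least `min_support` refactors."""
    removed: Counter = Counter()
    for before, after in refactors:
        gone = set(called_names(before)) - set(called_names(after))
        removed.update(gone)
    return {name for name, count in removed.items() if count >= min_support}


def check(source: str, banned: set[str]) -> list[tuple[int, str]]:
    """Return (line number, name) for every call to a banned name."""
    return sorted(
        (node.lineno, node.func.id)
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in banned
    )
```

In a CI step, a non-empty result from `check` could fail the build. Requiring a minimum number of supporting refactors keeps a single one-off change from becoming a rule.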
