Project ideas from Hacker News discussions.

Many SWE-bench-Passing PRs would not be merged

🚀 Project Ideas

RepoTailor: Customized LLM PR Evaluation Engine

Summary

A plug‑in service that evaluates pull requests generated by LLMs against a project’s own coding standards, test coverage, and architectural constraints.
Provides a “fit score” that predicts merge likelihood, reducing manual review overload.

Details

Key	Value
Target Audience	Engineering teams using AI code assistants at scale
Core Feature	Automated PR scoring that combines test outcomes, style compliance, and architectural deviation metrics
Tech Stack	FastAPI backend, PostgreSQL, Docker, React front‑end, Rust inference engine
Difficulty	Medium
Monetization	Revenue-ready: tiered pricing per active repository

Notes

HN users repeatedly cite “tests pass ≠ merge ready” and maintainers rejecting AI PRs over style; this directly addresses that pain.
Opens a market for repo‑specific evaluation tools, a gap highlighted by bisonbear’s call for custom metrics.
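
As a rough illustration of the “fit score” idea, the sketch below combines the three signals named in the Core Feature row into one number. Everything in it is an assumption for illustration: the `PRSignals` fields, the normalization to [0, 1], and the weights are hypothetical, and the source does not say how RepoTailor would actually compute its score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PRSignals:
    """Hypothetical per-PR inputs, each normalized to the range [0, 1]."""
    test_pass_rate: float           # fraction of tests passing
    style_compliance: float         # fraction of changed lines with no lint findings
    architectural_deviation: float  # 0 = follows project structure, 1 = ignores it


# Illustrative weights only; a real deployment would tune these per repository.
WEIGHTS = {"tests": 0.4, "style": 0.3, "architecture": 0.3}


def fit_score(signals: PRSignals) -> float:
    """Combine the three signals into one score in [0, 1] (higher is better)."""
    for name, value in vars(signals).items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (
        WEIGHTS["tests"] * signals.test_pass_rate
        + WEIGHTS["style"] * signals.style_compliance
        # Deviation is "lower is better", so invert it before weighting.
        + WEIGHTS["architecture"] * (1.0 - signals.architectural_deviation)
    )


# A PR whose tests all pass but which ignores style and structure.
passing_but_off_style = PRSignals(
    test_pass_rate=1.0, style_compliance=0.5, architectural_deviation=0.8
)
print(f"{fit_score(passing_but_off_style):.2f}")  # prints 0.61
```

A weighted sum is the simplest option and keeps the score explainable; the point of the example is only that a fully passing test suite can still yield a mediocre score.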

StylePrompt Hub: Reusable Architectural Taste Prompts

Summary

A marketplace of vetted “taste” prompts that encode preferred code structure, naming conventions, and refactor habits for LLMs.
Users can adopt, share, or customize prompts to steer AI output toward maintainable code.

Details

Key	Value
Target Audience	AI‑augmented developers and teams seeking consistent code quality
Core Feature	Curated prompt library with metadata tags (e.g., “low entropy”, “high DRY”) and versioning
Tech Stack	Next.js UI, GraphQL API, Node.js workers, Markdown prompt storage
Difficulty	Low
Monetization	Revenue-ready: freemium with premium prompt packs

Notes

Commenters discuss “prompt engineering for taste” and the difficulty of conveying architectural intent; this product makes those prompts reusable.
Aligns with requests for better “steering” of LLMs and reduces time spent crafting ad‑hoc prompts.
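
A minimal sketch of what a tagged, versioned prompt library could look like in memory. The class and field names (`TastePrompt`, `PromptLibrary`, `latest`) are hypothetical, and the dotted numeric version scheme is an assumption; the source only says prompts carry metadata tags and versioning.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class TastePrompt:
    """One versioned prompt entry; all field names are hypothetical."""
    name: str
    version: str          # assumed dotted numeric form, e.g. "1.2.0"
    tags: frozenset[str]  # e.g. {"low entropy", "high DRY"}
    body: str             # the Markdown prompt text


@dataclass
class PromptLibrary:
    entries: list[TastePrompt] = field(default_factory=list)

    def add(self, prompt: TastePrompt) -> None:
        self.entries.append(prompt)

    def find_by_tag(self, tag: str) -> list[TastePrompt]:
        """Return every stored prompt version carrying the given tag."""
        return [p for p in self.entries if tag in p.tags]

    def latest(self, name: str) -> TastePrompt | None:
        """Return the highest version of a prompt, comparing parts numerically."""
        versions = [p for p in self.entries if p.name == name]
        if not versions:
            return None
        return max(
            versions,
            key=lambda p: tuple(int(part) for part in p.version.split(".")),
        )
```

Comparing version parts as integers rather than strings matters here: as text, "1.10.0" would sort before "1.2.0".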

EntropyLens: Codebase Complexity & Maintainability Dashboard

Summary

A SaaS that measures code entropy, cyclomatic complexity, and abstraction depth to surface hidden technical debt in AI‑generated code.
Generates visual heatmaps and actionable refactor suggestions to improve maintainability.

Details

Key	Value
Target Audience	DevOps engineers, senior engineers, and maintainers of large codebases
Core Feature	Static analysis engine calculating entropy, cross‑entropy, and entropy‑adjusted diff size; UI dashboard with trend alerts
Tech Stack	Python backend, Elasticsearch, D3.js visualizations, Docker Compose
Difficulty	High
Monetization	Revenue-ready: usage‑based pricing per million lines analyzed

Notes

Multiple comments stress measuring “entropy” and signals beyond test passes (e.g., cyclomatic complexity, diff size) to judge maintainability.
Directly addresses the desire for structural metrics highlighted by users like code_biologist and jlandersen.
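
Two of the metrics mentioned above can be sketched with the Python standard library alone. The source does not define “code entropy”, so this example assumes one plausible reading: Shannon entropy over the distribution of source tokens. The complexity function is a simplified approximation of McCabe's metric, not a full implementation, and both function names are hypothetical.

```python
import ast
import io
import math
import tokenize
from collections import Counter


def token_entropy(source: str) -> float:
    """Shannon entropy, in bits per token, of the token strings in `source`."""
    tokens = [
        tok.string
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.string.strip()  # skip NEWLINE, INDENT, DEDENT, ENDMARKER
    ]
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Node types counted as one decision point each (an assumed, simplified set).
_BRANCH_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 plus the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


sample = (
    "def f(x):\n"
    "    if x > 0 and x < 10:\n"
    "        return 1\n"
    "    for i in range(x):\n"
    "        pass\n"
    "    return 0\n"
)
print(cyclomatic_complexity(sample))  # prints 4
```

Tracking these numbers per diff rather than per file would be one way to approximate the “entropy‑adjusted diff size” signal, though that combination is left undefined in the source.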

PatternGuard: Auto‑Generated Linter Rules from AI Refactors

Summary

A service that learns from successful AI refactors and automatically creates lint rules that enforce desired code patterns and prevent regressions.
Integrates with CI to block PRs that violate the learned patterns.

Details

Key	Value
Target Audience	Teams using AI code assistants who face “spaghetti” outputs and need enforceable style guardrails
Core Feature	Lint rule generator that analyzes diffs from accepted AI PRs and produces custom ESLint, Ruff, or other Python lint rules
Tech Stack	Node.js rule parser, TypeScript AST transformer, GitHub Action integration
Difficulty	Medium
Monetization	Revenue-ready: monthly subscription per repository

Notes

The community repeatedly mentions the inability to enforce “taste” automatically; this product turns ad‑hoc fixes into reusable lint rules.
Responds to remarks about “AI making weird choices” and the need for “pattern enforcement” to keep codebases coherent.
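
One very naive way to read “learning rules from accepted PRs” is sketched below: look at before/after pairs of Python source and flag calls that reviewers repeatedly accepted the removal of. This is an assumption-heavy illustration, not the product's method. The function names and the `min_support` threshold are hypothetical, and the output is only a list of candidate names that a human would still need to turn into an actual lint rule.

```python
import ast
from collections import Counter


def called_names(source: str) -> Counter:
    """Count calls by callee text, e.g. `print(x)` -> "print"."""
    counts: Counter = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            counts[ast.unparse(node.func)] += 1  # ast.unparse needs Python 3.9+
    return counts


def propose_banned_calls(accepted_changes, min_support=2):
    """Suggest callee names that accepted changes consistently removed.

    `accepted_changes` is an iterable of (before_source, after_source) pairs.
    A name is proposed if it disappears entirely in at least `min_support`
    changes and is never newly introduced by any change.
    """
    removed: Counter = Counter()
    added: Counter = Counter()
    for before, after in accepted_changes:
        calls_before = called_names(before)
        calls_after = called_names(after)
        for name in calls_before:
            if name not in calls_after:
                removed[name] += 1
        for name in calls_after:
            if name not in calls_before:
                added[name] += 1
    return sorted(
        name
        for name, count in removed.items()
        if count >= min_support and added[name] == 0
    )


changes = [
    ("print(x)\n", "log.info(x)\n"),
    ("print(y)\nfoo()\n", "log.info(y)\nfoo()\n"),
]
print(propose_banned_calls(changes))  # prints ['print']
```

Requiring a minimum number of supporting changes keeps a single one-off refactor from becoming a project-wide rule.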