Project ideas from Hacker News discussions.

All the bugs they found

📝 Discussion Summary

Top 3 Themes from the Discussion

| # | Theme | Supporting Quote | Commenter |
|---|-------|------------------|-----------|
| 1 | Skepticism toward Anthropic’s “Mythos” security stance | “The whole point of Mythos/Glasswing is our best models are scary good at security research, so much so that we won’t let them help you find vulnerabilities unless you are a trusted partner.” | simonw |
|   |   | “Considering Anthropic had a sandbox‑bypass vulnerability in CC for a year, silently patched it, and still hasn't made a disclosure statement, no one on Earth should trust them or believe a word they say.” | nullbio |
|   |   | “Trying to work around Anthropic blocking security‑related prompts does get pretty tiring though.” | vachanmn123 |
| 2 | LLMs as tools for security research: promise and limits | “Nice writeup. A practical example of a project, what was found, how it was found, the quality of the findings, reproducible.” | tptacek |
|   |   | “The bad thing about software is that there’s infinite ways to solve the same problem… I’ve not had much success with them writing code that simply has no bugs.” | onlyrealcuzzo |
|   |   | “If you can get better at identifying complexity, LLMs can get much better at suggesting viable fixes.” | amelius |
| 3 | Community dynamics & incentives: authenticity vs. career‑building | “People want promotions, money and a job in general, and they will do stupid stuff to keep their jobs and increase their pay.” | hootz |
|   |   | “It is not at all, in the slightest, weird heuristic to deploy in the Agentic Era. It’s a heuristic after all. There is no proof one way or the other.” | tptacek |
|   |   | “Is there no room in your model of the world for someone to figure out something interesting using AI tools and then write about it just because they like sharing interesting information?” | simonw |

Summary:
The conversation clusters around (1) distrust of Anthropic’s restrictive security policy, (2) a realistic appraisal of how effective LLMs are at uncovering bugs, and (3) the broader HN culture where career incentives and authenticity are frequently questioned.


🚀 Project Ideas

Security PromptBroker

Summary

  • Mediates security‑research requests that an LLM provider's default security filters would otherwise block, so that legitimate research prompts can be handled safely.
  • Provides a simple API and UI for users to obtain “research mode” access without navigating provider bureaucracy.

Details

| Key | Value |
|-----|-------|
| Target Audience | Security researchers, penetration testers, red‑team engineers |
| Core Feature | Automatic prompt negotiation with LLM providers, fallback to alternative open‑source models when denied |
| Tech Stack | FastAPI backend, Redis queue, Docker, OpenAPI spec |
| Difficulty | Medium |
| Monetization | Revenue-ready: SaaS subscription $15/mo per user |
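
The "fallback to alternative open‑source models when denied" part of the core feature could be sketched roughly as below. This is only an illustration of ordered fallback routing, not a design from the discussion: `route_with_fallback`, `ProviderResult`, `hosted_model`, and `local_model` are hypothetical names, the providers are stubs instead of real API clients, and a refusal is assumed to be signalled by returning `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class ProviderResult:
    provider: str          # name of the provider that answered (or the last one tried)
    text: Optional[str]    # None when every provider declined
    refused: bool


# Hypothetical provider signature: takes a prompt, returns text or None on refusal.
Provider = Callable[[str], Optional[str]]


def route_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> ProviderResult:
    """Try each provider in order and return the first answer that is not a refusal."""
    last_name = ""
    for name, call in providers:
        last_name = name
        text = call(prompt)
        if text is not None:
            return ProviderResult(provider=name, text=text, refused=False)
    return ProviderResult(provider=last_name, text=None, refused=True)


def hosted_model(prompt: str) -> Optional[str]:
    # Stub standing in for a hosted API that declines the request.
    return None


def local_model(prompt: str) -> Optional[str]:
    # Stub standing in for a self-hosted open-source model.
    return f"[local] analysis of: {prompt}"


result = route_with_fallback(
    "Review this parser for memory-safety bugs",
    [("hosted", hosted_model), ("local", local_model)],
)
print(result.provider, result.refused)
```

In a real service the stubs would wrap actual provider clients, and the ordered list would probably come from configuration so that the fallback policy can change without code edits.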

Notes

  • Directly addresses vachanmn123’s frustration with Anthropic blocking security prompts.
  • Offers a practical workaround that could be discussed on HN as a “useful hack”.

VeriChain Composable LLM Security Pipeline

Summary

  • A framework that chains LLM steps (code generation → test creation → vulnerability analysis → auto‑fix) into a reproducible loop.
  • Integrates with CI/CD to continuously improve code safety.

Details

| Key | Value |
|-----|-------|
| Target Audience | Dev teams, AI‑augmented developers, security engineers |
| Core Feature | End‑to‑end pipeline with artifact storage, verification checks, and result tagging |
| Tech Stack | Python, LangChain, PostgreSQL, GitHub Actions, React dashboard |
| Difficulty | High |
| Monetization | Revenue-ready: Enterprise tier $50/mo per repo |
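
A minimal sketch of the generation → tests → analysis → auto‑fix loop, with artifact storage and result tagging, might look like the following. All names here (`PipelineState`, `run_pipeline`, the step functions) are hypothetical, the LLM calls are replaced by hard-coded stubs, and "vulnerability analysis" is reduced to running the generated test. A real pipeline would execute generated code in a sandbox instead of calling `exec` directly.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineState:
    artifacts: Dict[str, str] = field(default_factory=dict)  # stored step outputs
    tags: List[str] = field(default_factory=list)            # one result tag per round
    verified: bool = False


Step = Callable[[PipelineState], PipelineState]


def generate_code(state: PipelineState) -> PipelineState:
    # Stub for an LLM call; the first draft is deliberately buggy (subtracts).
    state.artifacts.setdefault("code", "def add(a, b):\n    return a - b\n")
    return state


def create_tests(state: PipelineState) -> PipelineState:
    # Stub for LLM-generated tests.
    state.artifacts.setdefault("tests", "assert add(2, 3) == 5\n")
    return state


def analyze(state: PipelineState) -> PipelineState:
    # Verification check: run the tests against the code and tag the outcome.
    namespace: dict = {}
    try:
        exec(state.artifacts["code"], namespace)
        exec(state.artifacts["tests"], namespace)
    except AssertionError:
        state.verified = False
        state.tags.append("failed")
    else:
        state.verified = True
        state.tags.append("verified")
    return state


def auto_fix(state: PipelineState) -> PipelineState:
    # Stub for an LLM repair step; only acts when verification failed.
    if not state.verified:
        state.artifacts["code"] = "def add(a, b):\n    return a + b\n"
    return state


def run_pipeline(steps: List[Step], max_rounds: int = 3) -> PipelineState:
    """Run all steps repeatedly until verification passes or rounds run out."""
    state = PipelineState()
    for _ in range(max_rounds):
        for step in steps:
            state = step(state)
        if state.verified:
            break
    return state


final = run_pipeline([generate_code, create_tests, analyze, auto_fix])
print(final.tags)  # ['failed', 'verified']
```

Keeping each step as a plain function over a shared state object makes individual stages easy to swap or rerun, which is one plausible way to get the reproducibility the summary asks for.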

Notes

  • Mirrors simonw’s suggestion of composing LLMs for security testing; would resonate with HN’s interest in practical AI tooling.
  • Enables systematic bug discovery and verification, fitting the “practical example” desire.

BugScore Code Complexity Analyzer

Summary

  • Analyzes source code for hidden complexity and bug‑prone patterns, assigning a severity score.
  • Generates targeted test cases to expose issues that LLMs often miss.

Details

| Key | Value |
|-----|-------|
| Target Audience | Software engineers, code reviewers, open‑source maintainers |
| Core Feature | Static analysis + LLM‑driven test synthesis, produces maintainability recommendations |
| Tech Stack | Rust (analysis engine), Python (LLM wrapper), SQLite (scores), VS Code extension |
| Difficulty | Medium |
| Monetization | Hobby |
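
To give a sense of what the static-analysis scoring could involve, here is a small sketch that counts branch points per function, a rough approximation of cyclomatic complexity. It uses Python's standard `ast` module as a stand-in for the Rust engine named in the tech stack. The function names and the severity thresholds are assumptions made for illustration, not part of the original idea.

```python
import ast
from typing import Dict

# Node types treated as branch points (an assumed, simplified selection).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def complexity_scores(source: str) -> Dict[str, int]:
    """Return 1 + number of branch points for each function in the source."""
    tree = ast.parse(source)
    scores: Dict[str, int] = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Note: branches inside nested functions also count toward the parent.
            branches = sum(isinstance(child, BRANCH_NODES) for child in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores


def severity(score: int) -> str:
    # Hypothetical thresholds; a real tool would calibrate these on actual code.
    if score >= 10:
        return "high"
    if score >= 5:
        return "medium"
    return "low"


SAMPLE = '''
def simple(x):
    return x + 1

def tangled(x, y):
    if x and y:
        for i in range(x):
            if i % 2:
                y += i
    return y
'''

scores = complexity_scores(SAMPLE)
for name, score in scores.items():
    print(name, score, severity(score))
```

For the sample, `simple` scores 1 and `tangled` scores 5 (two `if` statements, one boolean operator, one loop, plus the base of 1). Functions with higher scores would be the natural targets for the LLM‑driven test synthesis step.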

Notes

  • Tackles keybored’s concern about “working vs. good enough” code and amelius’s call for concrete metrics.
  • Provides actionable insights that could spark discussion on HN about improving code quality with AI assistance.
