All the bugs they found

📝 Discussion Summary

Top 3 Themes from the Discussion

#	Theme	Supporting Quote
1	Skepticism toward Anthropic’s “Mythos” security stance	“The whole point of Mythos/Glasswing is our best models are scary good at security research, so much so that we won’t let them help you find vulnerabilities unless you are a trusted partner.” — simonw
		“Considering Anthropic had a sandbox‑bypass vulnerability in CC for a year, silently patched it, and still hasn't made a disclosure statement, no one on Earth should trust them or believe a word they say.” — nullbio
		“Trying to work around Anthropic blocking security‑related prompts does get pretty tiring though.” — vachanmn123
2	LLMs as tools for security research – promise and limits	“Nice writeup. A practical example of a project, what was found, how it was found, the quality of the findings, reproducible.” — tptacek
		“The bad thing about software is that there’s infinite ways to solve the same problem… I’ve not had much success with them writing code that simply has no bugs.” — onlyrealcuzzo
		“If you can get better at identifying complexity, LLMs can get much better at suggesting viable fixes.” — amelius
3	Community dynamics & incentives – authenticity vs. career‑building	“People want promotions, money and a job in general, and they will do stupid stuff to keep their jobs and increase their pay.” — hootz
		“It is not at all, in the slightest, weird heuristic to deploy in the Agentic Era. It’s a heuristic after all. There is no proof one way or the other.” — tptacek
		“Is there no room in your model of the world for someone to figure out something interesting using AI tools and then write about it just because they like sharing interesting information?” — simonw

Summary:
The conversation clusters around (1) distrust of Anthropic’s restrictive security policy, (2) a realistic appraisal of how effective LLMs are at uncovering bugs, and (3) the broader HN culture where career incentives and authenticity are frequently questioned.

🚀 Project Ideas

Security PromptBroker

Summary

Mediates user requests to LLMs that are blocked by default security filters, enabling safe security‑research prompts.
Provides a simple API and UI for users to obtain “research mode” access without navigating provider bureaucracy.

Details

Key	Value
Target Audience	Security researchers, penetration testers, red‑team engineers
Core Feature	Automatic prompt negotiation with LLM providers, fallback to alternative open‑source models when denied
Tech Stack	FastAPI backend, Redis queue, Docker, OpenAPI spec
Difficulty	Medium
Monetization	Revenue-ready: SaaS subscription $15/mo per user

Notes

Directly addresses vachanmn123’s frustration with Anthropic blocking security prompts.
Offers a practical workaround that could be discussed on HN as a “useful hack”.

VeriChain Composable LLM Security Pipeline

Summary

A framework that chains LLM steps (code generation → test creation → vulnerability analysis → auto‑fix) into a reproducible loop.
Integrates with CI/CD to continuously improve code safety.
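The loop described above can be sketched roughly as follows. This is a minimal illustration, not VeriChain's actual design: the `Pipeline` class, the stage functions, and the `div` example are all hypothetical, and the stages are plain stubs standing in for LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stage signature: take the shared state, return the updated state.
Stage = Callable[[Dict[str, object]], Dict[str, object]]


@dataclass
class Pipeline:
    stages: List[Stage]
    max_rounds: int = 3
    history: List[Dict[str, object]] = field(default_factory=list)

    def run(self, state: Dict[str, object]) -> Dict[str, object]:
        for round_no in range(1, self.max_rounds + 1):
            for stage in self.stages:
                state = stage(dict(state))  # each stage works on a shallow copy
            state["round"] = round_no
            self.history.append(dict(state))  # keep one snapshot per round
            if not state["findings"]:  # stop once analysis comes back clean
                break
        return state


def generate_code(state):
    # Stand-in for an LLM call; the first draft is deliberately unsafe.
    if "code" not in state:
        state["code"] = "def div(a, b):\n    return a / b\n"
    return state


def create_tests(state):
    # Stand-in for LLM-written tests: a single edge-case input.
    state["tests"] = [(1, 0)]
    return state


def analyze(state):
    # Runs the candidate code against the tests. A real pipeline would
    # sandbox this step instead of calling exec() in-process.
    namespace = {}
    exec(state["code"], namespace)
    findings = []
    for a, b in state["tests"]:
        try:
            namespace["div"](a, b)
        except ZeroDivisionError:
            findings.append(f"div({a}, {b}) raises ZeroDivisionError")
    state["findings"] = findings
    return state


def auto_fix(state):
    # Stand-in for an LLM-proposed patch, applied only when there are findings.
    if state["findings"]:
        state["code"] = "def div(a, b):\n    return a / b if b else None\n"
    return state


pipeline = Pipeline([generate_code, create_tests, analyze, auto_fix])
final = pipeline.run({})
print(final["round"], final["findings"])
```

Keeping a snapshot per round is what makes the loop reproducible in this sketch: the first snapshot records the finding, and the second records the clean re-run after the fix.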

Details

Key	Value
Target Audience	Dev teams, AI‑augmented developers, security engineers
Core Feature	End‑to‑end pipeline with artifact storage, verification checks, and result tagging
Tech Stack	Python, LangChain, PostgreSQL, GitHub Actions, React dashboard
Difficulty	High
Monetization	Revenue-ready: Enterprise tier $50/mo per repo

Notes

Mirrors simonw’s suggestion of composing LLMs for security testing; would resonate with HN’s interest in practical AI tooling.
Enables systematic bug discovery and verification, fitting the “practical example” desire.

BugScore Code Complexity Analyzer

Summary

Analyzes source code for hidden complexity and bug‑prone patterns, assigning a severity score.
Generates targeted test cases to expose issues that LLMs often miss.
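One way the scoring half could work is a branch count per function, similar in spirit to cyclomatic complexity. The sketch below uses Python's standard `ast` module purely for illustration (the idea lists Rust for the real engine); the function names and the severity thresholds are assumptions, not part of the proposal.

```python
import ast
from typing import Dict

# Node types treated as one extra decision point each (an assumed, simplified set).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def complexity_scores(source: str) -> Dict[str, int]:
    """Return a rough branch-count score for every function in `source`."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1  # a straight-line function starts at 1
            # Note: branches in nested functions also count toward the outer one.
            for child in ast.walk(node):
                if isinstance(child, BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # `a and b and c` adds two extra paths
                    score += len(child.values) - 1
            scores[node.name] = score
    return scores


def severity(score: int) -> str:
    # Hypothetical thresholds, chosen only for this example.
    if score <= 2:
        return "low"
    if score <= 4:
        return "medium"
    return "high"


SAMPLE = '''
def simple(x):
    return x + 1

def tangled(x, y):
    if x > 0 and y > 0:
        for i in range(x):
            if i % 2:
                y += i
    return y
'''

for name, score in complexity_scores(SAMPLE).items():
    print(name, score, severity(score))
```

Functions flagged as "high" would be the natural candidates for the LLM-driven test synthesis step.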

Details

Key	Value
Target Audience	Software engineers, code reviewers, open‑source maintainers
Core Feature	Static analysis + LLM‑driven test synthesis, produces maintainability recommendations
Tech Stack	Rust (analysis engine), Python (LLM wrapper), SQLite (scores), VS Code extension
Difficulty	Medium
Monetization	Hobby

Notes

Tackles keybored’s concern about “working vs. good enough” code and amelius’s call for concrete metrics.
Provides actionable insights that could spark discussion on HN about improving code quality with AI assistance.
