Project ideas from Hacker News discussions.

Anthropic tries to hide Claude's AI actions. Devs hate it

📝 Discussion Summary

1. Transparency vs. “clean‑UI” design
Developers complain that Anthropic is hiding the very information they need to trust the tool.

“Hiding filenames turns the workflow into a black box.” – KurSix
“They changed it from showing just number of files read to showing the actual paths/filenames.” – lkbm

2. Autonomy is a double‑edged sword
Agents that run unattended are praised for speed but feared for silently corrupting code.

“If Claude starts digging into node_modules or opening some stale config from 2019, I need to know immediately so I can smash Ctrl+C.” – KurSix
“The problem is the tool turning against the user.” – snvzz

3. Productivity gains vs. quality & maintainability
Many users see agents as a way to automate tedious tasks, yet the output often requires heavy review or breaks existing code.

“They use agents to automate boring chores, but the output quality is low.” – krastanov
“They can do better with multiple agents but still need review.” – adastra22

4. Monetization and feature erosion
Token‑cost, slow performance, and feature removal are cited as evidence that Anthropic is prioritizing revenue over developer experience.

“They are burning tokens, making it slow.” – nullbio
“They hide features to make UI cleaner.” – frigg

These four threads capture the core tensions in the discussion: how much visibility to give an autonomous agent, whether that autonomy is useful or dangerous, whether the productivity promises hold up in real codebases, and how business decisions are reshaping the tool.


🚀 Project Ideas

Agent Watchtower

Summary

  • Real‑time, terminal‑based monitor for LLM agents (Claude Code, OpenAI, Gemini, etc.) that streams file accesses, tool calls, and plan steps.
  • Gives developers instant visibility to abort or redirect an agent before it corrupts a codebase.
  • Core value: turns a black‑box AI into a controllable, observable collaborator.

Details

  • Target Audience: Individual developers, small teams using AI‑powered coding tools.
  • Core Feature: Live event stream (file read/write, tool invocation, plan generation) with pause/abort controls and configurable verbosity.
  • Tech Stack: Rust/Go for performance, a TUI library (tui-rs / termbox), WebSocket bridge to agent APIs, optional FUSE integration for file‑system hooks.
  • Difficulty: Medium
  • Monetization: Hobby

Notes

  • HN users lament “Claude hides filenames” and “I can’t stop it once it starts digging into node_modules.”
  • A live stream lets you press Ctrl+C or Esc to halt the agent exactly when it starts reading the wrong directory.
  • The tool can be dropped into existing workflows (e.g., watchtower | claude-code) and is open‑source, encouraging community extensions.
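The event-classification core can be sketched in a few lines of Python (the idea itself proposes Rust/Go for the real tool). The JSONL event format with `type` and `path` fields, and the `ALERT_PATTERNS` list, are illustrative assumptions, not any actual agent's output:

```python
import fnmatch
import json

# Paths worth interrupting for -- illustrative, user-configurable in a real tool.
ALERT_PATTERNS = ["node_modules/*", "*.env", ".git/*"]

def classify(event_line, patterns=ALERT_PATTERNS):
    """Parse one JSONL event and flag it if it touches a watched path."""
    event = json.loads(event_line)
    alert = any(fnmatch.fnmatch(event.get("path", ""), p) for p in patterns)
    return event, alert

def render(event, alert):
    """One terminal line per event; '!!' marks events that deserve a Ctrl+C."""
    marker = "!!" if alert else "  "
    return f"{marker} {event['type']:<10} {event.get('path', '')}"

# Demo with canned events; a real monitor would read the agent's stream on stdin.
for line in [
    '{"type": "file_read", "path": "src/app.py"}',
    '{"type": "file_read", "path": "node_modules/left-pad/index.js"}',
]:
    print(render(*classify(line)))
```

A full monitor would add the pause/abort key handling and verbosity filtering on top of this loop; the classification step is what turns a raw stream into something a developer can react to in time.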

PermissionGuard

Summary

  • A lightweight library that wraps any LLM agent, enforcing fine‑grained file‑system permissions and logging every action.
  • Provides an audit trail that can be exported to GitHub or a SIEM for compliance.
  • Core value: gives teams the confidence that autonomous agents cannot accidentally touch critical files.

Details

  • Target Audience: Enterprises, open‑source maintainers, CI/CD pipelines.
  • Core Feature: Declarative permission rules (glob patterns, read/write flags), runtime enforcement, immutable log of actions.
  • Tech Stack: Python library, optional Rust FUSE shim, JSONL‑formatted logs, integration with GitHub Actions.
  • Difficulty: Medium
  • Monetization: Revenue‑ready: subscription for enterprise support + open‑source core.

Notes

  • Comments such as “Claude starts reading the entire utils folder” highlight the need for a guard.
  • PermissionGuard can be configured once (guard.yaml) and applied to any agent, ensuring that even if the model mis‑interprets a prompt, it cannot write to config/production.yaml.
  • The audit log can be fed into a CI job that automatically flags unauthorized file accesses.
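A minimal sketch of the enforcement core, assuming a hypothetical rule format mirroring a guard.yaml file: each rule pairs a glob pattern with a set of allowed modes, the first matching rule wins, and unmatched paths are denied. The rule contents below are illustrative:

```python
import fnmatch

# Hypothetical rules as they might be loaded from guard.yaml:
# (glob pattern, set of allowed modes); first match wins, default deny.
RULES = [
    ("config/production.yaml", set()),   # never readable or writable by the agent
    ("src/**", {"read", "write"}),
    ("**", {"read"}),                    # everything else is read-only
]

def check(path, mode, rules=RULES, audit=None):
    """Return True if `mode` ('read' or 'write') is allowed on `path`.

    Every decision is appended to `audit` (a list) when one is supplied,
    producing the action log described above.
    """
    allowed, matched = False, None
    for pattern, modes in rules:
        if fnmatch.fnmatch(path, pattern):
            allowed, matched = mode in modes, pattern
            break
    if audit is not None:
        audit.append({"path": path, "mode": mode, "allowed": allowed, "rule": matched})
    return allowed
```

An agent wrapper would call check() before every file operation and refuse the call when it returns False, so even a misinterpreted prompt cannot reach config/production.yaml.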

TokenCap

Summary

  • A cost‑control service that sets per‑task token budgets, monitors usage in real time, and aborts the agent when the budget is exceeded.
  • Provides instant cost estimates and a dashboard for historical usage.
  • Core value: prevents “blank‑cheque” runs that burn tokens and money.

Details

  • Target Audience: Developers, product managers, teams paying for LLM APIs.
  • Core Feature: Token budget enforcement, real‑time usage meter, auto‑abort, cost‑projection widget.
  • Tech Stack: Node.js microservice, OpenAI/Anthropic API wrappers, WebSocket for live updates, Grafana dashboard.
  • Difficulty: Medium
  • Monetization: Revenue‑ready: tiered pricing per token‑cap plan.

Notes

  • HN users complain that “Claude keeps running for 30 minutes on a simple prompt.”
  • TokenCap lets you set a 5‑minute, 200‑token budget and guarantees the agent stops when it hits the limit, saving money and time.
  • The dashboard can be embedded in a GitHub Action to enforce budgets on PRs automatically.
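The budget-and-abort mechanic can be sketched compactly; the class and method names below are illustrative (the idea proposes a Node.js service), assuming the caller reports token usage from each API response:

```python
import time

class BudgetExceeded(Exception):
    """Raised when a run crosses its token or wall-clock budget."""

class TokenBudget:
    """Tracks cumulative usage against a hard token cap and an optional time limit."""

    def __init__(self, max_tokens, max_seconds=None):
        self.max_tokens = max_tokens
        self.max_seconds = max_seconds
        self.used = 0
        self.started = time.monotonic()

    def charge(self, tokens):
        """Record usage from one API response; raise to abort the agent loop."""
        self.used += tokens
        if self.used > self.max_tokens:
            raise BudgetExceeded(f"token budget hit: {self.used}/{self.max_tokens}")
        if self.max_seconds is not None and time.monotonic() - self.started > self.max_seconds:
            raise BudgetExceeded(f"time budget hit: {self.max_seconds}s")

    @property
    def remaining(self):
        return max(self.max_tokens - self.used, 0)
```

An agent driver would call charge() with the usage figure from every model call and stop cleanly on BudgetExceeded instead of letting a blank-cheque run continue.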

PlanGate

Summary

  • A web‑based approval system that visualizes an LLM’s proposed plan, diffs, and execution steps before any code is written.
  • Allows reviewers to approve, modify, or reject the plan, and then triggers the agent only after explicit consent.
  • Core value: restores human oversight and reduces “vibe‑coding” risk.

Details

  • Target Audience: Teams using AI agents for code generation, CI/CD pipelines.
  • Core Feature: Plan rendering (JSON → tree view), diff viewer, comment threads, approval workflow, webhook to trigger agent.
  • Tech Stack: React + TypeScript, Node.js backend, PostgreSQL for audit logs, GitHub API integration.
  • Difficulty: Medium
  • Monetization: Hobby (open‑source) with optional paid GitHub App add‑on.

Notes

  • Comments such as “I want to see the plan before it writes files” and “Claude’s plan is too noisy” show the demand for an approval step.
  • PlanGate shows the exact files and changes the agent intends to make, letting you spot a wrong assumption before it runs.
  • The approval webhook can be hooked into GitHub Actions, ensuring that only reviewed code ever lands in the repo.
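The JSON → tree rendering step can be sketched directly. The plan schema below (`steps`, `action`, `path`) is a made-up illustration, not any agent's actual plan format:

```python
# Hypothetical plan format: nested steps, each with an action and a file target.
PLAN = {
    "goal": "add input validation",
    "steps": [
        {"action": "read", "path": "src/forms.py"},
        {"action": "edit", "path": "src/forms.py", "steps": [
            {"action": "insert", "path": "src/forms.py"},
        ]},
    ],
}

def render(node, depth=0):
    """Flatten a plan tree into indented lines a reviewer can approve or reject."""
    lines = []
    for step in node.get("steps", []):
        lines.append("  " * depth + f"- {step['action']} {step.get('path', '')}".rstrip())
        lines.extend(render(step, depth + 1))
    return lines

for line in render(PLAN):
    print(line)
```

In the web UI each rendered line would carry an approve/reject control and a comment thread; only a fully approved tree triggers the webhook that starts the agent.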
