Project ideas from Hacker News discussions.

1M context is now generally available for Opus 4.6 and Sonnet 4.6

📝 Discussion Summary

1. Long‑context windows are a double‑edged sword
“I’ve been using the 1 M context at work through our enterprise plan … it seems to have been holding up pretty well until about 700k+. Sometimes it would continue to do okay past that, sometimes it started getting a bit dumb around there.” (a_e_k)
“The 1 M context has been a welcome addition, but the per‑task cost goes up while the time‑to‑correct‑output drops significantly.” (gregharned)
Users praise the ability to avoid mid‑task compaction, but many note that coherence degrades after ~600‑700 k tokens and that compaction can erase useful information.


2. Workflow engineering is key to making long context useful
“I start with a PRD, ask for a step‑by‑step plan, and just execute on each step at a time.” (frannky)
“I keep a CLAUDE.md file in my project root with key decisions and context.” (nvardakas)
“Plan mode is great, but the plan file names are randomly generated, so it can delete the plan without asking.” (hedora)
Successful users rely on explicit planning, markdown logs, sub‑agents, and frequent context resets to keep the model focused and avoid “dumb” loops.
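To make the CLAUDE.md practice concrete, here is a minimal sketch of the kind of file nvardakas describes; the sections and their contents are purely illustrative, not a prescribed format:

```markdown
# CLAUDE.md — project context for the agent

## Key decisions
- PostgreSQL, not SQLite: we need concurrent writers.
- All API errors return problem+json (RFC 7807).

## Current plan
1. Finish the auth middleware (see PLAN.md).
2. Then migrate the session store.

## Conventions
- Run `make test` before proposing a commit.
- Never edit files under vendor/.
```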


3. Pricing and token‑budget anxiety dominate conversations
“I’ve been poking at it today, and it definitely changes my workflow – I feel like a full three or four hour parallel coding session with subagents is now generally fitting into a single master session.” (vessenes)
“The 1 M context will be great for this, but it’s expensive – $500–1000 for a session.” (tudelo)
“I’m paying $200/mo for the most expensive plan, but I still get charged extra for 1 M context usage.” (LoganDark)
Users are torn between the promise of larger windows and the reality of higher per‑token costs, especially when compaction or tool calls inflate the input size.


4. Model differences matter – Claude vs Codex vs Gemini
“Codex continues working great post‑compaction since 5.2.” (furyofantares)
“Gemini gets real real bad when you get far into the context – it gets into loops, forgets how to call tools, etc.” (girvo)
“Opus 4.6 is nuts. Everything I throw at it works.” (frannky)
Users compare the same prompt across models: Claude’s long‑context performance is still uneven, Gemini reportedly falls into loops and forgets how to call tools deep into the context, while Codex is said to hold up well after compaction.


5. Real‑world use cases reveal both promise and limits
“I’ve had Opus fix a Rust B‑rep CSG classification pipeline successfully over the course of a week, unsupervised.” (virtualritz)
“I’m building a game with Opus – it can write a test harness, but it also creates a test suite for an existing tool.” (sarchertech)
“I’m using it for refactoring, code reviews, and generating feature ideas, but I still have to step in for architectural decisions.” (avereveard)
While many developers can generate large amounts of code quickly, they still need to review, correct, and sometimes rewrite the output, especially for complex or safety‑critical tasks.


6. Community sentiment oscillates between hype and skepticism
“I’m not sure if AI will replace human developers, but it’s already changing how we work.” (popcorncowboy)
“I think we’re just getting better at using LLMs across the board, not that the models themselves are getting better.” (fbrncci)
“The shift is that 1 M context makes ‘load the whole codebase once, run many agents’ viable, whereas before you were constantly re‑chunking.” (gregharned)
Debate centers on whether the technology is truly transformative or simply a productivity boost that still requires human oversight.


🚀 Project Ideas

ContextWatch – Real‑Time LLM Context Dashboard

Summary

  • Provides live metrics on token usage, context window fill, compaction events, and cost per request for any LLM harness (Claude, GPT‑4, Gemini, etc.).
  • Gives alerts when approaching “dumb zone” thresholds or when compaction is triggered, helping users keep conversations coherent.

Details

  • Target Audience: Developers, AI‑ops teams, product managers using LLMs in production.
  • Core Feature: Web dashboard + API that streams context stats, compaction logs, and cost estimates.
  • Tech Stack: React + D3 for UI, Node.js + Express for API, WebSocket for real‑time updates, PostgreSQL for persistence.
  • Difficulty: Medium
  • Monetization: Revenue‑ready; subscription tiers ($10/mo for 10k tokens/day, $50/mo for 100k tokens/day).

Notes

  • HN users like “MikeNotThePope” and “tudelo” complain about not knowing when compaction will happen; this tool gives that visibility.
  • Enables teams to decide when to reset context or start a new session, reducing wasted tokens and cost.
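The alerting behavior could be prototyped in a few lines. A minimal sketch, assuming a 1 M‑token window and using the thread's ~700k observation as the alert threshold; `ContextWatch` is a toy class for illustration, not a real API:

```python
from dataclasses import dataclass, field

CONTEXT_LIMIT = 1_000_000     # assumed window size
DUMB_ZONE_START = 700_000     # coherence reportedly degrades past ~700k tokens

@dataclass
class ContextWatch:
    """Tracks cumulative token usage for one session and raises alerts."""
    used: int = 0
    events: list = field(default_factory=list)

    def record(self, prompt_tokens: int, completion_tokens: int):
        """Record one request; return an alert string once in the dumb zone."""
        self.used += prompt_tokens + completion_tokens
        if self.used >= DUMB_ZONE_START:
            fill = self.used / CONTEXT_LIMIT
            alert = f"dumb-zone: {self.used} tokens ({fill:.0%} of window)"
            self.events.append(alert)
            return alert
        return None
```

A real implementation would stream these events over the WebSocket channel instead of returning strings.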

PruneBot – Automated Context Pruning CLI

Summary

  • Analyzes conversation logs, identifies low‑value or redundant messages, and suggests deletions or summarizations before sending the next prompt.
  • Integrates with popular harnesses (Claude Code, Copilot CLI, OpenAI API) via hooks.

Details

  • Target Audience: AI developers, CI/CD pipelines, automated agents.
  • Core Feature: CLI that parses JSONL logs, scores tokens by semantic value, and outputs a prune plan.
  • Tech Stack: Python 3.11, FastAPI for an optional web UI, spaCy for NLP scoring, GitHub Actions for CI integration.
  • Difficulty: Medium
  • Monetization: Hobby (open source).

Notes

  • Addresses frustration from “fnordpiglet” and “dominotw” about manual context editing.
  • Reduces token cost by removing noise, improving model coherence.
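A toy version of the scoring pass, using word novelty as a stand‑in for the spaCy‑based semantic scoring the idea proposes; the threshold and scoring formula are illustrative assumptions:

```python
def score_message(msg: str, seen_words: set) -> float:
    """Naive value score: longer messages with more novel words score higher."""
    words = set(msg.lower().split())
    novelty = len(words - seen_words) / max(len(words), 1)
    return novelty * min(len(words), 50)

def prune_plan(messages: list, threshold: float = 5.0) -> list:
    """Return indices of messages the caller could drop or summarize."""
    seen = set()
    drop = []
    for i, msg in enumerate(messages):
        if score_message(msg, seen) < threshold:
            drop.append(i)
        seen |= set(msg.lower().split())
    return drop
```

Short acknowledgements and repeated content score near zero and land in the prune plan, while substantive novel messages survive.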

PlanKeeper – Persistent Plan & Log File System

Summary

  • Automatically creates, updates, and version‑controls plan (PLAN.md), task (TASK.md), and log (LOG.md) files in a repo.
  • Ensures the LLM always starts with the latest plan and can resume long tasks across sessions.

Details

  • Target Audience: Teams using agentic workflows, CI pipelines, solo devs.
  • Core Feature: Git‑based file generator with hooks for plan creation, update, and archiving.
  • Tech Stack: Node.js, Git, YAML for config, VS Code extension for in‑IDE integration.
  • Difficulty: Medium
  • Monetization: Revenue‑ready; $5/mo per repo, free tier for open source.

Notes

  • Solves pain points from “avereveard” and “visarga” who need persistent context across sessions.
  • Encourages best practices: plan first, then implement, then review.
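The core file‑update loop could look like this sketch; `update_plan` is a hypothetical helper, and the idea as described would sit in Node.js with git hooks rather than plain file writes:

```python
from datetime import datetime, timezone
from pathlib import Path

def update_plan(repo: Path, plan_text: str, note: str) -> None:
    """Overwrite PLAN.md with the latest plan and append a timestamped LOG.md entry."""
    (repo / "PLAN.md").write_text(plan_text)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with (repo / "LOG.md").open("a") as log:
        log.write(f"- {stamp}: {note}\n")
```

Because PLAN.md always holds the current plan and LOG.md is append‑only, a fresh session can be primed with both files and resume where the last one stopped.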

AgentMesh – Lightweight Sub‑Agent Orchestration Framework

Summary

  • Provides a simple API to spawn isolated sub‑agents, each with its own context, and orchestrate them from a master agent.
  • Tracks cost per sub‑agent, handles compaction automatically, and supports rollback.

Details

  • Target Audience: AI‑ops engineers, product managers building complex workflows.
  • Core Feature: Context isolation, cost accounting, automatic compaction triggers, rollback support.
  • Tech Stack: Go for concurrency, gRPC for communication, Redis for state, Docker for sandboxing.
  • Difficulty: High
  • Monetization: Revenue‑ready; $20/mo per deployment, enterprise licensing.

Notes

  • Addresses “brookst” and “s900mhz” frustrations with long autonomous tasks and compaction.
  • Enables parallel execution without polluting the master context.
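Context isolation and per‑agent cost accounting can be sketched without the proposed Go/gRPC stack. This toy Python version assumes `llm_call` is a caller‑supplied function that takes an agent's private context and returns a reply plus a token count:

```python
from concurrent.futures import ThreadPoolExecutor

class SubAgent:
    """Each sub-agent owns an isolated context, so runs never pollute the master."""
    def __init__(self, name, task):
        self.name, self.task = name, task
        self.context = []   # private to this agent
        self.cost = 0       # tokens consumed by this agent

    def run(self, llm_call):
        self.context.append(self.task)
        reply, tokens = llm_call(self.context)
        self.cost += tokens
        return self.name, reply, self.cost

def orchestrate(tasks, llm_call, workers=4):
    """Spawn one sub-agent per task and run them in parallel, preserving order."""
    agents = [SubAgent(f"agent-{i}", t) for i, t in enumerate(tasks)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda a: a.run(llm_call), agents))
```

The master only ever sees each agent's final reply and cost, which is the whole point: parallel work without inflating the master context.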

CostSight – Long‑Context Cost Optimizer

Summary

  • Predicts token usage and cost for a given context size and task type, suggesting optimal compaction strategy and budget allocation.
  • Integrates with billing APIs to provide real‑time cost dashboards.

Details

  • Target Audience: Developers on tight budgets, product managers, finance teams.
  • Core Feature: Machine‑learning model trained on historical usage that produces cost‑saving recommendations.
  • Tech Stack: Python, TensorFlow, Flask, Stripe API for billing, Grafana for dashboards.
  • Difficulty: Medium
  • Monetization: Revenue‑ready; $15/mo per user, enterprise tier.

Notes

  • Responds to “tudelo” and “aenis” concerns about $200/month plans and extra usage charges.
  • Helps teams decide when to use 1M context vs. 200k context.
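A first‑cut cost estimator might look like the sketch below; the per‑token prices and the 2× long‑context surcharge past 200k input tokens are placeholder assumptions, not actual Anthropic pricing:

```python
# Hypothetical per-million-token prices; real pricing varies by model and tier.
INPUT_PRICE = 3.00    # $ per 1M input tokens (assumed)
OUTPUT_PRICE = 15.00  # $ per 1M output tokens (assumed)

def estimate_cost(input_tokens: int, output_tokens: int,
                  long_context_surcharge: float = 2.0,
                  threshold: int = 200_000) -> float:
    """Estimate request cost in dollars; input past `threshold` billed at a surcharge."""
    base = min(input_tokens, threshold)
    extra = max(input_tokens - threshold, 0)
    cost = (base * INPUT_PRICE
            + extra * INPUT_PRICE * long_context_surcharge
            + output_tokens * OUTPUT_PRICE) / 1_000_000
    return round(cost, 4)
```

Running the same task profile through both a 200k and a 1 M configuration makes the "is the big window worth it" question a number rather than a guess.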

PromptCraft – Interactive Context‑Aware Prompt Builder

Summary

  • Guided UI that helps users build research → plan → implement → review prompts, automatically inserting context references and compaction triggers.
  • Supports markdown templates and auto‑generation of plan files.

Details

  • Target Audience: Developers, technical writers, AI enthusiasts.
  • Core Feature: Step‑by‑step prompt wizard, context‑aware suggestions, auto‑inserted /compact commands.
  • Tech Stack: Vue.js, Node.js, OpenAI/Claude API wrappers, local storage for templates.
  • Difficulty: Medium
  • Monetization: Hobby (open source).

Notes

  • Directly tackles “comboy” and “copperx” frustrations with manual /compact usage.
  • Encourages best practices like the RPI Framework and frequent intentional compaction.
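The wizard's output stage could be as simple as a template function; `build_prompt` and its stage names are hypothetical, mirroring the research → plan → implement → review flow described above:

```python
STAGES = ("research", "plan", "implement", "review")

def build_prompt(stage: str, goal: str, context_files: list) -> str:
    """Assemble a stage-specific prompt with explicit context references."""
    if stage not in STAGES:
        raise ValueError(f"stage must be one of {STAGES}")
    refs = "\n".join(f"- @{path}" for path in context_files)
    return (f"Stage: {stage}\nGoal: {goal}\n"
            f"Context files:\n{refs}\n"
            f"When the context window is nearly full, run /compact before continuing.")
```

The compaction reminder is baked into every generated prompt, so intentional /compact use becomes the default rather than something the user remembers mid‑session.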
