Improving 15 LLMs at Coding in One Afternoon. Only the Harness Changed

📝 Discussion Summary

1. Harness design is the real lever for productivity

“The harness is where the open source should shine. It doesn’t require millions of dollars of compute but the search space is vast and explorable with limited budgets.” – scotty79
“The best results I get aren’t from fully autonomous agents, they’re from tight human‑in‑the‑loop cycles where I steer in real time.” – logicprog

2. Token‑saving edit formats (hash‑line vs line‑number) matter

“I’m still seeing 25‑50 % reduction in tokens.” – andai
“The token savings alone are worth it.” – kachapopopow

3. Subscription lock‑in vs open, local harnesses

“Being locked into a specific harness because you pay 20 bucks per month … is kinda dumb.” – withinboredom
“The right route is open models and open harnesses, ideally on local hardware.” – eshaham78

4. Security, privacy and policy friction

“They can capture some additional telemetry owning the harness as well, but… borders on unethical spyware.” – CuriouslyC
“Getting banned from Gemini while attempting to improve Gemini is the most Googley thing ever.” – nekitamo

Four threads dominate the discussion: harness engineering, token‑efficient editing, economic lock‑in, and policy/security.

🚀 Project Ideas

HashLine Editor

Summary

Provides a lightweight CLI and editor plugin that lets LLMs edit files using hash‑based line identifiers instead of fragile line numbers.
Aims to make edits idempotent, detect concurrent modifications instead of overwriting them, and reduce token usage by sending only diffs.
Core value: eliminates “clobbering” and “wrong‑file” errors while keeping context minimal.
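A minimal sketch of what hash-based line addressing could look like, assuming each line is shown to the model with its number plus a short content hash, and an edit is rejected when the hash no longer matches. The names `line_tag`, `render`, `replace_line`, and `StaleEdit`, the `number:tag|` display format, and the 8-character tag length are all illustrative assumptions, not part of any existing tool.

```python
import hashlib


def line_tag(text: str, length: int = 8) -> str:
    """Short content hash used as a line identifier (length is an assumed default)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def render(lines: list[str]) -> list[str]:
    """Hypothetical view sent to the model: each line prefixed with 'number:tag|'."""
    return [f"{i}:{line_tag(line)}|{line}" for i, line in enumerate(lines, start=1)]


class StaleEdit(Exception):
    """Raised when the file no longer matches what the model read."""


def replace_line(lines: list[str], number: int, tag: str, new_text: str) -> list[str]:
    """Replace line `number` only if its current content still hashes to `tag`."""
    if not 1 <= number <= len(lines):
        raise StaleEdit(f"line {number} does not exist")
    if line_tag(lines[number - 1]) != tag:
        raise StaleEdit(f"line {number} changed since it was read")
    updated = list(lines)  # return a copy so the caller's list is not mutated
    updated[number - 1] = new_text
    return updated
```

Because the tag is derived from the line's content, replaying the same edit after it has been applied raises `StaleEdit` rather than silently editing whatever now sits at that position. This is one way to approach the "clobbering" problem; it does not by itself cover inserted or deleted lines.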

Details

Key	Value
Target Audience	Developers using LLM‑powered coding assistants (e.g., Claude Code, Cursor, Pi).
Core Feature	Hash‑based line addressing, concurrency‑safe patch application, optional editor integration.
Tech Stack	Rust/Go for CLI, Node.js for VS Code/Neovim plugin, JSON Patch, Git for version control.
Difficulty	Medium
Monetization	Hobby

Notes

HN users complained about “clobbering a file if the file changed between read and write” and “line numbers causing wrong edits” (energy123, withinboredom).
The tool’s hash approach directly addresses these frustrations and offers a token‑efficient workflow, a point many commenters highlighted (e.g., jahala’s tilth).
Discussion potential: compare token savings vs. traditional line‑number methods; benchmark against existing tools like apply_patch.

OpenHarness Hub

Summary

A web‑based marketplace for open‑source harnesses that can be plugged into any LLM (Claude, Gemini, OpenAI, etc.) with OAuth and pay‑per‑use billing.
Enables users to swap models and harnesses without vendor lock‑in, and to monetize custom harnesses.
Core value: removes subscription friction and gives developers control over the “body” of their AI agent.

Details

Key	Value
Target Audience	AI developers, small teams, open‑source contributors.
Core Feature	OAuth‑based authentication, model‑agnostic harness API, marketplace, billing integration.
Tech Stack	Django/Node.js backend, PostgreSQL, Stripe/PayPal, Docker for sandboxing, OpenAPI spec.
Difficulty	High
Monetization	Revenue‑ready: subscription + marketplace fees

Notes

Users expressed anger at “locked‑in harnesses” and subscription plans (withinboredom, horsawlarway, aurornis).
The platform satisfies the demand for “open models and open harnesses” (eshaham78, deaux).
Potential discussion: how to enforce security, rate limits, and compliance while keeping the ecosystem open.

Semantic Search & Refactor Engine

Summary

A library that builds a semantic index of a codebase using tree‑sitter and language‑specific parsers, exposing a structured API for LLMs to perform search, navigation, and refactoring.
Reduces token churn by returning only relevant AST nodes or code snippets, and supports multi‑file edits with minimal context.
Core value: improves LLM accuracy and speed for code‑centric tasks, addressing the “search‑replace” pain point.
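As a rough illustration of returning only the relevant node rather than a whole file, the sketch below indexes top-level and nested definitions and extracts a single symbol's source. It deliberately swaps in Python's standard-library `ast` module for tree-sitter so the example stays self-contained, which means it only handles Python source. `Symbol`, `index_source`, and `snippet` are hypothetical names.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    kind: str   # "function" or "class"
    start: int  # 1-based line of the def/class keyword (decorators excluded)
    end: int    # 1-based last line, inclusive


def index_source(source: str) -> list[Symbol]:
    """Collect every function and class definition with its line span."""
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.append(Symbol(node.name, "function", node.lineno, node.end_lineno))
        elif isinstance(node, ast.ClassDef):
            symbols.append(Symbol(node.name, "class", node.lineno, node.end_lineno))
    return symbols


def snippet(source: str, name: str) -> str:
    """Return only the source lines of the first symbol called `name`."""
    lines = source.splitlines()
    for sym in index_source(source):
        if sym.name == name:
            return "\n".join(lines[sym.start - 1 : sym.end])
    raise KeyError(name)
```

An assistant asking for `snippet(code, "a")` receives just that function's lines, which is the kind of context reduction the summary describes. A real engine would also need cross-file resolution and persistence, which this sketch omits.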

Details

Key	Value
Target Audience	LLM‑powered coding assistants, IDE extensions, CI pipelines.
Core Feature	Tree‑sitter based indexing, semantic search, structured refactor commands, diff generation.
Tech Stack	Rust (tree‑sitter bindings), Python API, SQLite/FAISS for vector search, Docker for sandboxed execution.
Difficulty	Medium
Monetization	Hobby

Notes

Comments highlighted the need for better “search‑replace” and “structured diff” tools (pcwelder, jahala, kachapopopow).
The engine directly tackles the token‑heavy “cat + grep + manual line counting” workflow, offering a more efficient alternative.
Discussion angle: compare performance against existing tools like Serena, tilth, and the benefits of AST‑level operations.

Secure Edit Validator

Summary

A service that automatically compiles, tests, and verifies LLM‑generated edits before they are applied to a codebase.
Provides blast‑radius analysis, rollback, and audit logs, ensuring that automated changes do not introduce regressions or security issues.
Core value: addresses the “tool output is gospel” frustration and builds trust in AI‑assisted coding.
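The gate-before-apply idea can be sketched in a few lines, assuming a validator that runs a list of checks against a candidate edit and keeps the current version if any check fails. Only a syntax check is shown here; test runs, static analysis, and sandboxing are left out. `Check`, `syntax_check`, and `gate_edit` are hypothetical names, and the check only understands Python source.

```python
from typing import Callable, Optional

# A check returns a problem description, or None when the candidate passes.
Check = Callable[[str], Optional[str]]


def syntax_check(source: str) -> Optional[str]:
    """Reject candidates that do not even compile (nothing is executed)."""
    try:
        compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return f"syntax error at line {exc.lineno}: {exc.msg}"
    return None


def gate_edit(current: str, candidate: str, checks: list[Check]) -> tuple[str, list[str]]:
    """Return the source to keep plus an audit log of what each check decided."""
    audit = []
    for check in checks:
        problem = check(candidate)
        if problem is not None:
            audit.append(f"rejected by {check.__name__}: {problem}")
            return current, audit  # roll back: the current version is kept
        audit.append(f"passed {check.__name__}")
    return candidate, audit
```

Returning the audit log alongside the result keeps the decision reviewable, which is the part of the proposal aimed at the "tool output is gospel" complaint.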

Details

Key	Value
Target Audience	Teams using LLM agents for code changes, CI/CD pipelines, security‑critical projects.
Core Feature	Automated build/test, static analysis, diff‑based rollback, audit trail, integration with GitHub Actions.
Tech Stack	Go/Python microservice, Docker, Kubernetes, GitHub API, OWASP ZAP, SonarQube.
Difficulty	High
Monetization	Revenue‑ready: SaaS subscription + per‑run fee

Notes

HN users noted that “the model suggests edits that break auth logic” and “no compile checks” (the_harpia_io).
The validator directly mitigates these concerns, providing a safety net for AI‑generated code.
Potential discussion: how to balance speed vs. thoroughness, integration with existing CI workflows, and open‑source vs. commercial deployment.
