Project ideas from Hacker News discussions.

FAWK: LLMs can write a language interpreter

📝 Discussion Summary

The discussion revolves primarily around the use of Large Language Models (LLMs) in developing programming languages or interpreters. Here are the three most prevalent themes:

1. LLMs as Accelerators for Language Prototyping/Implementation

Many participants view LLMs as powerful tools that significantly speed up the development of new, often niche or "toy," languages, making ambitious projects accessible to those lacking the time or expertise to write everything manually.

  • Supporting Quote: Regarding why an individual would use an LLM instead of writing a compiler in a few weeks: > "It's between 'do it with LLMs or don't do it at all' - because most people don't have the time to take on an ambitious project like implementing a new programming language just for fun," said "simonw".

  • Supporting Quote: A user described the speed and cost gain on a substantial project: > "I'm creating my language to do AoC in this year! ... It took between a week and ten days. Cost about €10 ... I'm still getting my head around how incredible that is," said "igravious" about porting Ruby to Cosmopolitan using Claude Code.

2. Concerns Over Code Quality, Maintainability, and Depth of Understanding

A significant counterpoint involves skepticism regarding the quality, correctness, and auditability of LLM-generated language implementations, especially when developers are not deeply involved in the low-level details.

  • Supporting Quote: A user expressed doubt about the resulting language structure: > "The whole blog post does not mention the word 'grammar'. As presented, it is examples based and the LLM spit out its plagiarized code and beat it into shape until the examples passed," stated "bgwalter".

  • Supporting Quote: Another user highlighted the risk of relying on initial output without deep understanding: > "The problem comes, that you dig too deep and unearth the Balrog of 'how TF does this work?' You're creating future problems for yourself," noted "cmrdporcupine".

3. Validation Through Testing and Iterative Refinement

Despite concerns about initial output quality, several users emphasized that robust testing frameworks are crucial for successful LLM-assisted development, allowing the model to iterate and self-correct based on objective feedback (passing/failing tests).

  • Supporting Quote: A successful user credits their tests for enabling effective LLM use: > "because I was diligent about test coverage, sonnet 4.5 perfectly converted the entire parser to tree-sitter for me. all tests passed. that was wild," commented "l9o".

  • Supporting Quote: This point was echoed as a general best practice for interacting with coding agents: > "I often suspect that people who complain about getting poor results from agents haven't yet started treating automated tests as a hard requirement for working with them," suggested "simonw".


🚀 Project Ideas

LLM-Assisted Racket HashLang Generator

Summary

  • A tool using advanced LLMs (like GPT-4o or Claude 3 Opus) to automatically translate a user-provided Racket package definition (especially those targeting other languages like C, Python, or Lua) into a working, idiomatic Racket "hashlang" implementation, including the necessary lexer/parser scaffolding where required.
  • Core value proposition: Dramatically accelerates the creation of Racket language extensions/dialects by automating the tedious boilerplate and initial parsing logic, allowing users to focus on semantic implementation.

Details

  • Target Audience: Racket developers interested in language creation, academics, and users wanting to quickly prototype domain-specific languages (DSLs).
  • Core Feature: Takes a description or an existing language's package link/name and generates the necessary Racket code structure to embed that language via #lang (hashlang).
  • Tech Stack: Python/TypeScript backend for API interaction (OpenAI/Anthropic), with a potential Racket CLI wrapper for initial setup and hooking into raco.
  • Difficulty: Medium (the difficulty lies in reliably prompting the model to produce runnable, idiomatic Racket code that correctly interacts with Racket's language machinery, not just generating code snippets).
  • Monetization: Hobby

Notes

  • Why HN commenters would love it: Addresses Y_Y's specific goal: "I've been trying to get LLMs to make Racket 'hashlangs'† for years now..." This product directly solves that frustration by focusing LLM effort on Racket's specific language tooling.
  • Potential for discussion or practical utility: The discussion around whether LLMs can truly succeed at generating complex parser interactions (mentioned by bgwalter and jamesu) would be immediately tested by this focused tool.
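The prompt-assembly step for such a generator could be sketched as follows. Everything here is an illustrative assumption (the template text, function name, and example language are hypothetical); the real tool would send the assembled prompt to the OpenAI/Anthropic API and feed the returned Racket code into raco.

```python
# Hypothetical sketch of the prompt-construction step for the hashlang
# generator. Template wording and function names are illustrative only.

HASHLANG_TEMPLATE = """\
You are generating a Racket #lang implementation.
Target language: {lang}
Produce: a reader module, an expander module, and a raco-installable package layout.
Example programs the implementation must accept:
{examples}
Output only runnable Racket code."""


def build_hashlang_prompt(lang: str, example_programs: list[str]) -> str:
    """Assemble the LLM prompt from a language name and example programs."""
    numbered = "\n".join(
        f"--- example {i + 1} ---\n{src}"
        for i, src in enumerate(example_programs)
    )
    return HASHLANG_TEMPLATE.format(lang=lang, examples=numbered)


if __name__ == "__main__":
    print(build_hashlang_prompt("mini-lua", ["print('hi')"]))
```

Keeping prompt assembly as a pure function like this makes it trivial to unit-test the scaffolding layer separately from the (nondeterministic) model call.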

LLM-Validated Compiler/Interpreter Scaffolder

Summary

  • A platform that scaffolds the initial structure for a custom programming language interpreter or compiler, ensuring the generated boilerplate (Lexer, Parser, Token definitions) is immediately testable against a user-provided set of example programs.
  • Core value proposition: Overcomes the initial inertia and multi-week commitment required to set up foundational compiler components, allowing users like epolanski to "play as quickly as possible with their ideas."

Details

  • Target Audience: Aspiring language developers, hobbyists, and computer science students learning compilation/interpretation.
  • Core Feature: Generates lexer/parser code in a chosen target language (e.g., C++, Python, Rust), paired with an auto-generated, mandatory test suite derived from user-provided example programs, all of which must pass before the final scaffold is output.
  • Tech Stack: Multi-language output (Python, Rust, or C++), using a sandboxed code-execution service (e.g., Piston) to securely validate generated components against the examples.
  • Difficulty: Medium (requires a robust, secure execution environment for testing generated parser output against user-provided inputs).
  • Monetization: Hobby

Notes

  • Why HN commenters would love it: Directly addresses the barrier to entry mentioned by epolanski ("it took me a lot of time and effort to have something solid and functional") and the desire for high-leverage prototyping (keepamovin). The focus on making tests pass mitigates concerns about "vibe-coded" trash (lionkor).
  • Potential for discussion or practical utility: Greatly democratizes PL theory exploration; sparks debate about whether automated test validation is the key to trustworthy LLM-generated infrastructure code.
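The test-gated loop at the heart of this idea can be sketched in Python. The stand-in lexer and example set below are illustrative assumptions, and the LLM call that would regenerate a failing lexer is left out:

```python
# Sketch of the "all examples must pass" gate: a candidate lexer (which in
# the real platform would come from the LLM) is only accepted once every
# user-provided example tokenizes to its expected output.
import re


def simple_lexer(src: str) -> list[str]:
    """Stand-in 'generated' lexer: numbers, identifiers, and basic operators."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[+\-*/=()]", src)


def failing_examples(lexer, examples):
    """Return the (source, expected_tokens) pairs the lexer gets wrong."""
    return [(src, want) for src, want in examples if lexer(src) != want]


# Hypothetical user-provided examples that define the test suite.
EXAMPLES = [
    ("x = 1 + 2", ["x", "=", "1", "+", "2"]),
    ("foo(bar)", ["foo", "(", "bar", ")"]),
]

if __name__ == "__main__":
    # In the real platform, a non-empty failure list would be fed back to
    # the model for another iteration; here we just report it.
    print(failing_examples(simple_lexer, EXAMPLES))
```

The key design choice is that failures are data, not log output: the failing pairs can be serialized straight back into the next prompt, which is exactly the objective-feedback loop the discussion credits for successful LLM-assisted development.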

Tooling Context Uplift Service (TCUS)

Summary

  • A service designed to solve the context retrieval problem for specialized code generation by automatically indexing a user's local codebase, configuration files, and relevant documentation into a dedicated, fast-retrieval context layer for an LLM coding agent.
  • Core value proposition: Provides the LLM with deep, localized project knowledge ("search ~/dev/datasette/docs for documentation") without relying on slow or rate-limited GitHub lookups, ensuring the LLM uses correct internal APIs and architectural patterns.

Details

  • Target Audience: Developers extensively using coding agents (like Claude Code) on large, proprietary, or non-public projects.
  • Core Feature: A lightweight indexing agent that monitors specified local directories (e.g., ~/dev/) and builds vector embeddings or optimized RAG indices, then automatically injects context-aware retrieval commands into the prompt stream when interacting with the LLM API.
  • Tech Stack: Python indexing agent, local vector database (e.g., ChromaDB, FAISS), and a prompt-engineering layer that manages context injection via pre-prompting/tool-calling hooks.
  • Difficulty: Medium (the complexity is in managing synchronization, indexing speed, and the orchestration layer that correctly surfaces local data retrieval into the LLM conversation).
  • Monetization: Hobby

Notes

  • Why HN commenters would love it: Directly supports simonw's demonstrated technique ("I often suspect that people who complain about getting poor results from agents haven't yet started treating automated tests as a hard requirement...") but generalizes access to local knowledge beyond just tests, solving the "nonexistent APIs" problem (Razengan).
  • Potential for discussion or practical utility: Becomes a necessary utility layer for serious AI-assisted development, leading to discussions on local RAG relevance versus cloud-based data provision.
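A minimal sketch of the retrieval layer, with bag-of-words vectors and cosine similarity standing in for the real embedding model and vector database (ChromaDB/FAISS); all class and method names here are illustrative:

```python
# Toy local context index: stdlib-only stand-in for embeddings + vector DB.
import math
import re
from collections import Counter


def _vectorize(text: str) -> Counter:
    """Crude bag-of-words 'embedding' over lowercased word tokens."""
    return Counter(re.findall(r"[a-z_]\w*", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class LocalContextIndex:
    """Indexes (path, text) snippets; retrieves the best matches for a query.

    The real service would watch directories for changes and re-embed on
    edit; this sketch only covers the add/retrieve core.
    """

    def __init__(self):
        self.docs = []  # list of (path, text, vector)

    def add(self, path: str, text: str) -> None:
        self.docs.append((path, text, _vectorize(text)))

    def retrieve(self, query: str, k: int = 2):
        qv = _vectorize(query)
        ranked = sorted(self.docs, key=lambda d: _cosine(qv, d[2]), reverse=True)
        return [(path, text) for path, text, _ in ranked[:k]]
```

Swapping `_vectorize`/`_cosine` for a real embedding model and an approximate-nearest-neighbor store is the only structural change needed to scale this shape up; the add/retrieve interface is what the prompt-injection layer would call.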