Three prevailing themes in the discussion
| Theme | Key idea | Representative quotes |
|---|---|---|
| 1. Self‑generated skills are largely ineffective without human guidance | LLMs that write their own procedural knowledge tend to add little value; curated or human‑refined skills outperform them. | “The finding that self‑generated skills provide negative benefit (-1.3pp) while curated skills give +16.2pp is the most interesting result here imo.” – secbear<br>“Self‑generated Skills are useless (-1.3pp) and human‑curated ones help a lot (+16.2pp).” – rriley |
| 2. Human feedback and iterative refinement are essential | Agents need a human in the loop to steer, reflect, and update skills; the loop itself is the real value. | “I treat them like mini CLAUDE.mds that are specific only to certain workflows… I ask it to reflect on why, and update the Skill to clarify.” – turnsout<br>“I only generate skills after I've worked through a problem… I have no idea why people would think it can zero‑shot a problem space without any guidance.” – rcarmo |
| 3. Information degrades through repeated LLM calls (“telephone” effect) | Each successive LLM layer or iteration loses fidelity; explicit constraints and prompts are needed to preserve intent. | “The more layers you automate with LLMs, the worse each successive layer gets.” – embedding‑shape<br>“It's like those sequences of images where we ask the LLM to reproduce the same image exactly… we get a grotesque collapse after a few dozen iterations.” – nimonian<br>“The model knows damn well when it's written ugly code… unless explicitly prompted for it with constraints.” – embedding‑shape |
These three threads — the limits of pure self‑generation, the indispensable role of human oversight, and the semantic drift that accumulates across repeated LLM calls — capture the core concerns voiced by the community.