Skills Officially Comes to Codex

📝 Discussion Summary

1. Skills Superior to MCPs/Tools

Commenters praise skills for their simplicity, context efficiency, composability, and on-demand loading, in contrast to token-heavy MCPs.
"much bigger deal long-term than e.g. MCP" (cube2222); "This is much better than MCP, which also stuffs every session's precious context with potentially irrelevant instructions" (wahnfrieden); "Skills are much simpler than mcps, which are hopelessly overengineered" (Sammi).

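The on-demand loading that commenters describe can be illustrated with a minimal sketch. It assumes a skill is a short description plus a longer body, and that only the descriptions are shown to the agent until a skill is activated. The names `Skill`, `SkillRegistry`, and `activate` are hypothetical and do not come from Codex or any other agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Skill:
    name: str
    description: str               # short text that is always visible to the agent
    load_body: Callable[[], str]   # hypothetical loader, called only on first use
    _body: Optional[str] = field(default=None, repr=False)

    def body(self) -> str:
        # Load the full instructions lazily and cache them.
        if self._body is None:
            self._body = self.load_body()
        return self._body


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def index(self) -> str:
        # Only names and descriptions go into the context by default.
        return "\n".join(
            f"- {s.name}: {s.description}" for s in self._skills.values()
        )

    def activate(self, name: str) -> str:
        # The full body is read only when the agent asks for this skill.
        return self._skills[name].body()


loads: List[str] = []


def _load_pr_body() -> str:
    loads.append("submit-pr")  # record that the body was actually read
    return "Steps for submitting a PR according to team conventions."


registry = SkillRegistry()
registry.register(
    Skill("submit-pr", "Open a PR following team conventions", _load_pr_body)
)
index_text = registry.index()   # builds the index without loading any body
```

Calling `registry.activate("submit-pr")` is the only point at which the body is read, which is the property the quotes above contrast with MCP.
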
2. Practical Use Cases & Customization

Custom skills for workflows (e.g., DB access, PRs, testing, niche tools) enable reusability and team sharing.
"the ones I create myself... very specific and proprietary. For instance, a skill on how to write a service in my back-testing framework" (frankc); "skills: writing tests, fetching sentry info, using playwright... submitting a PR according to team conventions" (JamesSwift); "a skill to access your database for testing purposes" (jonrosner).

3. Verifiability & Token Concerns

The free-form Markdown body makes skills hard to evaluate, and the skill index still adds to the context on every turn despite the on-demand loading promise.
"free-form nature of the 'body' part... lead to an inevitably unverifiable process?" (mikaelaast); "skill descriptions are all essentially prompt injections... add to your input tokens on every agentic turn" (btown); "the index seems as much a liability as a boon. Keeping the context clean... is one of the most important things" (Sammi).

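To make the index cost concrete, here is a rough back-of-the-envelope sketch. It assumes the index of descriptions is resent on every agentic turn, as btown's comment describes, and uses a crude 4-characters-per-token heuristic rather than a real tokenizer. The function names and the sample numbers are illustrative only.

```python
from typing import Dict, List


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, round(len(text) / chars_per_token))


def index_overhead(descriptions: List[str], turns: int) -> Dict[str, int]:
    """Estimate tokens spent on skill descriptions per turn and over a session."""
    per_turn = sum(estimate_tokens(d) for d in descriptions)
    return {"per_turn": per_turn, "total": per_turn * turns}


# Two made-up descriptions of 40 and 80 characters, over a 50-turn session.
sample = ["a" * 40, "b" * 80]
overhead = index_overhead(sample, turns=50)
print(overhead)  # {'per_turn': 30, 'total': 1500}
```

The total grows with both the number of installed skills and the number of turns, which is why Sammi calls the index "as much a liability as a boon".
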
🚀 Project Ideas

SkillEval: Structured Evaluation Framework for Agent Skills

Summary

A CLI tool and web dashboard for defining structured schemas (JSON/YAML) for skills, running automated evals with test suites, DSPy/GEPA integration, and generating reports on reliability/accuracy.
Core value: Turns unverifiable free-form Markdown skills into parameterized, testable artifacts for iteration and trust.
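
As a minimal sketch of what the schema-validation step might look like, the code below assumes a skill file starts with a `---` delimited header of simple `key: value` lines followed by a free-form Markdown body, and that `name` and `description` are required. Both the layout and the required fields are assumptions made for illustration, and the parser handles only flat keys rather than full YAML.

```python
from typing import Dict, List, Tuple

REQUIRED_FIELDS = ("name", "description")  # assumed schema, not an official one


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a skill file into (metadata, body); flat `key: value` lines only."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat everything as body


def validate_skill(text: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    meta, body = split_frontmatter(text)
    errors = [f"missing field: {f}" for f in REQUIRED_FIELDS if not meta.get(f)]
    if not body.strip():
        errors.append("empty body")
    return errors


good_skill = (
    "---\n"
    "name: backtest-service\n"
    "description: Write a service in the back-testing framework\n"
    "---\n"
    "1. Create the service module.\n"
)
bad_skill = "---\nname: backtest-service\n---\n"

print(validate_skill(good_skill))  # []
print(validate_skill(bad_skill))   # ['missing field: description', 'empty body']
```

A check like this covers only the structured header; the free-form body that mikaelaast worries about still needs behavioral evals.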

Details

Key	Value
Target Audience	Developers and teams building/maintaining agent skills (e.g., Claude/Codex users like frankc, mikaelaast)
Core Feature	Schema validator, multi-run eval harness with LLM-as-judge alternatives, skill optimization via DSPy
Tech Stack	Python (DSPy, Pydantic), React for dashboard, SQLite/Postgres for test storage
Difficulty	Medium
Monetization	Revenue-ready: Freemium SaaS ($10/mo pro evals)
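
The multi-run eval harness in the table could be sketched roughly as follows. The idea is to run the same task several times and report a pass rate, since a single run says little about a non-deterministic agent. `run_once` stands in for a real agent call and `check` for a programmatic grader; both are hypothetical placeholders, and the stub below cycles through canned outputs so the example is deterministic.

```python
import itertools
from typing import Callable


def pass_rate(
    run_once: Callable[[], str],
    check: Callable[[str], bool],
    runs: int = 20,
) -> float:
    """Run the task `runs` times and return the fraction of passing outputs."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    passed = sum(1 for _ in range(runs) if check(run_once()))
    return passed / runs


# Stub agent: three good outputs, then one bad one, repeated.
_canned = itertools.cycle(["tests pass", "tests pass", "tests pass", "error"])


def stub_agent() -> str:
    return next(_canned)


rate = pass_rate(stub_agent, lambda out: out == "tests pass", runs=20)
print(rate)  # 0.75
```

A programmatic `check` is one way to read the "LLM-as-judge alternatives" entry; the source does not say which alternatives are intended.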

Notes

"I would like my agents to be inherently evaluable... structured skills file format help you evaluate" (mikaelaast, coldtea); "DSPy + GEPA... systematic evaluation" (joshka).
HN would love it for its practical utility in agent workflows; it is also likely to spark discussion of eval best practices.

SkillsForge: Curated Gallery and Team Sharing Hub

Summary

A GitHub-like platform for discovering, forking, and rating skills.md files with vetted submissions, search by tech stack/task, and org-private repos for team standardization.
Core value: Solves discovery/inspiration gap without spam, enables cross-team reuse like rdli's common repo but with UI/ranking.

Details

Key	Value
Target Audience	Dev teams and individuals sharing skills (e.g., startups like rdli, hu3 needing Django/DB examples)
Core Feature	Skill browser with previews, ratings/comments, import to local .codex/skills, compatibility checker for Claude/Codex
Tech Stack	Next.js, Supabase (auth/storage/search), Git integration
Difficulty	Medium
Monetization	Revenue-ready: Org tiers ($20/user/mo private repos)

Notes

"If there was a marketplace or directory of skills.md files that were ranked with comments" (orliesaurus); "gallery than a marketplace... inspiration" (true2octave, relativeadv).
High discussion potential on curation vs. open-source; utility for quick onboarding like marimo or Django skills.

SecretSkill: Secure Config Injector for Distributable Skills

Summary

A CLI/service that packages skills with encrypted secret vaults (.env-like but prompt-injected), one-time setup wizard for non-tech users, and auto-adapts to agent sandboxes.
Core value: Enables easy client/team distribution without hardcoding creds or manual .env, fixing proprietary workflow sharing.
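
The "skill templating with vars" idea can be sketched with Python's standard `string.Template`. The sketch assumes setup values are collected once (for example by a setup wizard) into a plain dictionary and substituted into the skill text at load time. Encryption and vault storage are deliberately left out, and the placeholder names `DB_HOST` and `DB_USER` are made up for illustration.

```python
from string import Template
from typing import Dict, List


def required_vars(template_text: str) -> List[str]:
    """List the $name / ${name} placeholders in order of first appearance."""
    names: List[str] = []
    for match in Template.pattern.finditer(template_text):
        name = match.group("named") or match.group("braced")
        if name and name not in names:
            names.append(name)
    return names


def render_skill(template_text: str, values: Dict[str, str]) -> str:
    """Fill in placeholders, failing loudly if any setup value is missing."""
    missing = [n for n in required_vars(template_text) if n not in values]
    if missing:
        raise KeyError(f"missing setup values: {', '.join(missing)}")
    return Template(template_text).substitute(values)


skill_template = "Connect to $DB_HOST as ${DB_USER} for testing purposes."
setup_values = {"DB_HOST": "db.example.internal", "DB_USER": "tester"}

print(required_vars(skill_template))  # ['DB_HOST', 'DB_USER']
print(render_skill(skill_template, setup_values))
# Connect to db.example.internal as tester for testing purposes.
```

Listing the required variables up front is what would let a first-use wizard ask for each value exactly once, as jonrosner suggests.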

Details

Key	Value
Target Audience	Consultants/freelancers distributing skills (e.g., jonrosner to clients, frankc's proprietary back-testing)
Core Feature	Vault encryption (age/sops), skill templating with vars, first-use setup UI, multi-agent export (Claude/Codex/Gemini)
Tech Stack	Node.js CLI, Tauri for desktop app, age/sops for secrets
Difficulty	Low
Monetization	Revenue-ready: $5/mo per user unlimited vaults

Notes

"some kind of secret storage... ask for those setup-parameters once" (jonrosner); "distribute my logic easily... non-technical users" (jonrosner, freakynit).
Practical for real workflows; HN praises simplicity over MCP complexity.
