Project ideas from Hacker News discussions.

Implications of AI for schools

📝 Discussion Summary

The discussion revolves heavily around the impact of Large Language Models (LLMs) on education, focusing on assessment methods, the unreliability of detection software, and the broader implications for institutional integrity.

Here are the three most prevalent themes:

1. Necessity of In-Person/Proctored Assessment Methods

There is a strong consensus that traditional take-home assignments are obsolete due to AI, leading many users to advocate for returning to methods that ensure work is done solely under supervision.

  • Supporting Quotation: User "wffurr" noted, "Andrej and Garry Trudeau are in agreement that 'blue book exams' (I.e. the teacher gives you a blank exam booklet, traditionally blue) to fill out in person for the test, after confiscating devices, is the only way to assess students anymore."
  • Supporting Quotation: User "mavhc" suggested a scalable, modern alternative: "Tests were created to save money, more students per teacher, we're just going back to the older, actually useful, method of talking to people to see if they understand what they've been taught."

2. Distrust in AI Detection Software and False Positives

A major theme is the deep skepticism regarding the accuracy of current AI detection tools, specifically citing their potential to unfairly penalize honest students (false positives).

  • Supporting Quotation: User "phh" stated, regarding 80% accuracy claims, "In a classroom of 30 all honest pupils, 6 will get a 0 mark because the software says its AI?"
  • Supporting Quotation: User "polynomial" illustrated the absurdity of relying on these tools: "Famously, a popular AI detector 'determined' the Declaration of Independence was written by AI."

3. Shift in Educational Goal: Assessing Understanding Over Output

Several users argue that the crisis forces educators to reassess what they are testing. If AI can generate proficient output, assessment must pivot to verifying genuine comprehension, often through direct interaction.

  • Supporting Quotation: User "muldvarp" summarized the core issue driving this shift: "AI research tools are increasingly useful if you're a competent researcher that can judge the output and detect BS."
  • Supporting Quotation: User "ubj" relayed an anecdote highlighting the only viable defense against false accusations: "My only suggestion was for her to ask the teacher to sit down with her and have a 30-60 minute oral discussion on the essay so she could demonstrate she in fact knew the material."

🚀 Project Ideas

AI Output Authenticity Proofing Service (AOAPS)

Summary

  • A service that leverages ubiquitous version control/document history (e.g., Google Docs, Git) to create a provable, time-stamped chain of custody for student work, countering false AI detection accusations.
  • Core value proposition: Providing verifiable proof of the human composition process, protecting honest students from detection tools like Turnitin, whose high false-positive rates were raised in the discussion (e.g., the "80% accuracy" figure).

Details

  • Target Audience: Students frequently facing unsubstantiated AI detection claims, and educators seeking fair assessment methods beyond black-box tools.
  • Core Feature: CLI/Web tool that ingests links/exports from common platforms (Google Docs, Overleaf/Git) and generates a signed, immutable proof-of-work artifact (e.g., a specific Merkle tree commitment or verifiable credential); a minimal sketch follows this list.
  • Tech Stack: Go/Rust for the backend processor, standard APIs for Google Drive/GitHub integration, cryptography libraries for signing (e.g., using a public key infrastructure to issue claims).
  • Difficulty: Medium. Integrating with all major platforms securely is complex, but the core challenge is the auditing/signing mechanism.
  • Monetization: Hobby
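
A minimal sketch of the proof-of-work artifact, assuming the service has already pulled timestamped revision snapshots from a platform API (Google Docs revision history, a Git log, etc.). The Revision struct, the chained SHA-256 commitment standing in for a full Merkle tree, and the ed25519 issuer key are illustrative assumptions, not a fixed design:

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// Revision is one snapshot of the student's document history
// (e.g., pulled from the Google Docs revisions API or a Git log).
type Revision struct {
	Timestamp time.Time
	Content   string
}

// commit folds every revision into a single chained SHA-256 digest, so
// dropping, reordering, or back-dating a revision changes the result.
func commit(revs []Revision) [32]byte {
	var chain [32]byte
	for _, r := range revs {
		h := sha256.New()
		h.Write(chain[:])
		h.Write([]byte(r.Timestamp.UTC().Format(time.RFC3339)))
		h.Write([]byte(r.Content))
		copy(chain[:], h.Sum(nil))
	}
	return chain
}

func main() {
	// The service's issuing key pair; in practice this would live in the
	// PKI mentioned in the tech stack rather than be generated ad hoc.
	pub, priv, _ := ed25519.GenerateKey(nil)

	revs := []Revision{
		{time.Date(2024, 5, 1, 9, 0, 0, 0, time.UTC), "Outline: thesis plus three sources"},
		{time.Date(2024, 5, 2, 14, 30, 0, 0, time.UTC), "First full draft"},
		{time.Date(2024, 5, 3, 8, 15, 0, 0, time.UTC), "Final draft with citations"},
	}

	digest := commit(revs)
	sig := ed25519.Sign(priv, digest[:])

	fmt.Println("commitment:", hex.EncodeToString(digest[:]))
	fmt.Println("signature verifies:", ed25519.Verify(pub, digest[:], sig))
}
```

An instructor can check the signed commitment with only the service's public key; the full revision history stays with the student unless they choose to share it.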

Notes

  • Why HN commenters would love it: Addresses the pain point raised by ubj ("His sister had written... an essay... 'AI detection tool' had classified it as having been written by AI with '100% confidence'") by providing "edit history link" verification suggested by johanam and FloorEgg.
  • Potential for discussion or practical utility: It directly addresses the fairness concern ("unfair to those students to have to prove their innocence," kelseyfrog), positioning against unreliable, opaque detection tools by focusing on verifiable transparency.

Scalable Oral Examination Simulator (SOES)

Summary

  • A service that uses advanced LLMs (fine-tuned for pedagogical questioning) to conduct automated, secure 1:1 oral examinations/viva voce sessions based on submitted homework or essays.
  • Core value proposition: Offers a scalable alternative to pen-and-paper exams or in-person grilling sessions, assessing genuine understanding as suggested by mavhc ("Use AI to talk to the student to find out if they understand").

Details

  • Target Audience: University/Higher Education instructors looking for robust, scalable assessment methods that test comprehension beyond written output.
  • Core Feature: Students submit written work; the system then initiates a timed, randomized, multi-turn conversation about specific sections of that work. The session is recorded (audio/transcript) and graded (or flagged for human review) based on conversational depth and how well the student defends their choices; a minimal sketch of the session flow follows this list.
  • Tech Stack: LLMs (e.g., high-context models like GPT-4o or Claude Opus), Speech-to-Text/Text-to-Speech services, secure video/audio recording architecture (potentially with browser-enforced session controls).
  • Difficulty: High. Ensuring the AI interrogator cannot be easily circumvented (e.g., by another AI impersonating the student, as noted by rixed) requires sophisticated session security and behavioral verification.
  • Monetization: Hobby
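
A minimal sketch of one examination session, assuming the submission has already been split into sections. The random section sampling, the examiner prompt wording, and the stubbed model/speech-to-text steps are placeholders rather than a specified design:

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// Turn is one exchange in the recorded viva voce transcript.
type Turn struct {
	Question string
	Answer   string
}

// pickSections samples a random subset of the submission to probe, so two
// students with similar essays still face different questions.
func pickSections(sections []string, n int) []string {
	picked := make([]string, 0, n)
	for _, i := range rand.Perm(len(sections))[:n] {
		picked = append(picked, sections[i])
	}
	return picked
}

// examinerPrompt assembles the instruction for the LLM interrogator; the
// wording here is illustrative and would be tuned per course.
func examinerPrompt(section string) string {
	return strings.Join([]string{
		"You are an oral examiner. Ask one probing follow-up question about",
		"the passage below, focusing on why the student made the choices they",
		"did, then wait for their spoken answer.",
		"Passage: " + section,
	}, " ")
}

func main() {
	sections := []string{
		"Introduction: claims AI detectors are unreliable",
		"Method: survey of 30 students",
		"Conclusion: recommends oral follow-ups",
	}

	var transcript []Turn
	for _, s := range pickSections(sections, 2) {
		// In a real session this prompt would go to the model API and the
		// student's spoken answer through speech-to-text; both are stubbed.
		fmt.Println("examiner prompt:", examinerPrompt(s))
		transcript = append(transcript, Turn{
			Question: "[model-generated question]",
			Answer:   "[student audio, transcribed]",
		})
	}
	fmt.Printf("recorded %d turns for grading or human review\n", len(transcript))
}
```

Randomizing which sections are probed and keeping the full transcript is one way to make impersonation (the concern rixed raised) at least auditable after the fact.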

Notes

  • Why HN commenters would love it: Directly supports the idea floated by mavhc and nmfisher ("I'd be much more in favour of oral examinations") while addressing the logistical concerns (throwaway31131, HDThoreaun) about teacher bandwidth.
  • Potential for discussion or practical utility: Sparks intense debate on the role of AI in pedagogy—whether AI can be an effective, unbiased interrogator, or if it just creates a new arms race (bluefirebrand's skepticism about "more AI" solving AI problems).

Pedagogical Shift Assessment Tool (PSAT)

Summary

  • A tool for educational institutions to rapidly analyze their assessment strategies, simulating how well current assignments hold up against AI assistance and what level of cognitive work they actually demand.
  • Core value proposition: Moves assessment away from "teaching to the test" (nmfisher) and credential chasing by quantifying which skills (memorization vs. application) are actually being measured, thereby guiding educators toward more AI-resistant assignments.

Details

  • Target Audience: Curriculum designers, academic administrators, and department heads struggling to redesign coursework post-LLM adoption.
  • Core Feature: Educators input assignment prompts, expected outcomes (linked to Bloom's Taxonomy levels), and examples of current assessment artifacts. The tool analyzes each prompt's susceptibility to high-quality LLM generation versus tasks requiring genuine novel synthesis or physical demonstration (e.g., "show your work" analysis); a minimal sketch follows this list.
  • Tech Stack: Statistical modeling, LLM analysis (for prompt evaluation), taxonomy mapping libraries (for structuring learning objectives).
  • Difficulty: Medium. The technical challenge is less about the code and more about establishing reliable taxonomy mapping and convincing educators to adopt nuanced analytical frameworks rather than simple detection tools.
  • Monetization: Hobby
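
A minimal sketch of the susceptibility analysis, using a keyword match on Bloom's-taxonomy verbs as a stand-in for the LLM-based prompt evaluation named in the tech stack; the susceptibility weights are purely illustrative and do not come from the discussion:

```go
package main

import (
	"fmt"
	"strings"
)

// bloomLevels orders the taxonomy from recall-heavy to synthesis-heavy.
// AISusceptibility is an illustrative prior for how easily a high-quality
// LLM answer substitutes for the skill at that level (1 = trivially).
var bloomLevels = []struct {
	Name             string
	Verbs            []string
	AISusceptibility float64
}{
	{"Remember", []string{"define", "list", "recall"}, 0.95},
	{"Understand", []string{"summarize", "explain", "describe"}, 0.90},
	{"Apply", []string{"apply", "calculate", "demonstrate"}, 0.70},
	{"Analyze", []string{"compare", "contrast", "examine"}, 0.60},
	{"Evaluate", []string{"critique", "justify", "defend"}, 0.45},
	{"Create", []string{"design", "compose", "invent"}, 0.35},
}

// classify maps an assignment prompt onto the most synthesis-heavy Bloom
// level whose verbs appear in it (later matches overwrite earlier ones).
func classify(prompt string) (string, float64) {
	p := strings.ToLower(prompt)
	level, risk := "Unclassified", 0.50
	for _, lvl := range bloomLevels {
		for _, v := range lvl.Verbs {
			if strings.Contains(p, v) {
				level, risk = lvl.Name, lvl.AISusceptibility
			}
		}
	}
	return level, risk
}

func main() {
	prompts := []string{
		"List the causes of the French Revolution.",
		"Design an experiment for your own hypothesis and defend the method in class.",
	}
	for _, p := range prompts {
		level, risk := classify(p)
		fmt.Printf("%-78s -> %-10s (AI susceptibility %.2f)\n", p, level, risk)
	}
}
```

A production version would replace the verb lookup with the LLM analysis from the tech stack and calibrate the weights against real assignments, but even this crude mapping makes the memorization-vs.-application distinction visible to a curriculum designer.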

Notes

  • Why HN commenters would love it: It attacks the root cause identified by many: the reliance on easily gamed metrics (array_key_first, calgoo). It supports the sentiment that "If you can’t tell what your students are independently capable of... you’re not doing a good job" (peebee67).
  • Potential for discussion or practical utility: It provides a structured framework for the meta-discussion happening in the thread about what education should be, moving beyond stop-gap detection measures to proactive redesign.