A recent experience with ChatGPT 5.5 Pro

📝 Discussion Summary

1. AI timeline & the need for natural‑language math search

“I saw Tim Gowers give a talk … where he predicted that in 100 years humans would no longer be doing research mathematics. I wonder if he’s adjusted his timeline.”
— bustermellotron

The community wonders whether AI will truly replace mathematicians within a century and what tools (e.g., a math‑oriented search engine) are required to bridge the gap.

2. Grading & exam design under AI assistance

“> 90 % of the final grade are in room examinations … This is really just a glorified undergraduate education, the real point of graduate school is to learn to do real‑world relevant research.”
— zozbot234

When students use LLMs for homework or exams, instructors must redesign assessments (e.g., give broken code to debug) to retain meaningful evaluation.

3. Credit and authenticity of LLM‑aided proofs

“If a mathematician solved a major problem by having a long exchange with an LLM … would we regard that as a major achievement of the mathematician? I don’t think we would.”
— doginasuit

There is ongoing debate about whether credit should be given to humans who merely guide an LLM that produces the bulk of a proof.

4. Access & funding inequities for AI tools

“Can you tell me what is the budget necessary to supply AI tools capable of substantial research assistance to all academic staff at a university?”
— NotOscarWilde

High‑cost frontier models exacerbate existing disparities between well‑funded and under‑resourced institutions or individuals.

5. Motivational and existential impact on researchers

“I always believed that my work speaks for itself and transcends beyond my limited time on this cosmic experience. This notion of immortality was just a small intangible bonus I hoped for when I jumped into grad school.”
— MinimalAction

Many fear that if AI can “solve” easy problems, the personal sense of achievement and legacy that motivates graduate students may evaporate.

6. Technical limits – verification and digestibility of results

“For the latter, I think LLM use will be accepted but there will be a heavy expectation on the author of making the result very easily digestible for human mathematicians and linking it thoroughly with the existing literature – something that LLMs are very much not successful at.”
— crocdundae

Even capable models produce outputs that are hard for humans to assess, requiring additional layers of scrutiny before acceptance in research.

🚀 Project Ideas

MathLit Search

Summary

Natural‑language search engine for mathematics that returns papers, lemmas, and proof snippets relevant to a user’s query.
Core value: lets researchers find literature without needing exact terminology or jargon.

Details

Key	Value
Target Audience	Graduate students, mathematicians, research librarians
Core Feature	Semantic search with citation retrieval and proof‑snippet preview
Tech Stack	Elasticsearch + sentence‑transformers + fine‑tuned Llama‑3‑70B + LaTeX parser
Difficulty	Medium
Monetization	Revenue-ready: $15/mo per user (academic license)

Notes

Directly addresses HN complaints about “can’t find relevant math papers quickly.”
Provides provenance links for source verification, reducing plagiarism risk.
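
The ranking step behind such a search can be sketched in a few lines. This is a minimal illustration, not the project's design: the corpus entries, the `embed`, `cosine`, and `search` names, and the bag-of-words "embedding" are all assumptions made so the example runs with the standard library alone. A real system would replace `embed` with a sentence-embedding model, since word counts cannot match a query to a paper that uses different terminology.

```python
import math
import re
from collections import Counter

# Toy corpus; the ids and snippets are invented for illustration only.
CORPUS = {
    "doc-graph": "every planar graph can be colored with four colors",
    "doc-prime": "the number of primes below x grows like x over log x",
    "doc-group": "every finite group of prime order is cyclic",
}

def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase bag-of-words counts.

    Assumption: a real deployment would call a sentence-embedding model here.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def search(query: str, corpus: dict[str, str], top_k: int = 2) -> list[tuple[str, float]]:
    """Return the top_k (doc_id, score) pairs, best match first."""
    q = embed(query)
    scored = [(doc_id, cosine(q, embed(text))) for doc_id, text in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

results = search("how many colors does a planar graph need", CORPUS)
```

Returning the document id alongside the score keeps each hit traceable to its source, which is what the provenance links in the notes above would build on.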

ProofGuard

Summary

Collaborative AI‑assisted proof verification platform for graduate‑level mathematics.
Core value: integrates LLM‑generated proof steps with real‑time human annotation and inconsistency detection.

Details

Key	Value
Target Audience	PhD candidates, postdoctoral researchers, journal reviewers
Core Feature	Live co‑authoring interface where LLM suggests proof steps and a verifier flags logical gaps
Tech Stack	React front‑end, FastAPI backend, GPT‑4o, Lean/Coq interactivity layer
Difficulty	High
Monetization	Revenue-ready: $30/mo per reviewer (team plans)

Notes

Solves the grading and verification concerns raised in the discussion about AI‑generated assignments.
Aligns with Gowers’ call for “human‑in‑the‑loop” verification of AI‑produced mathematics.

AutoGradFix

Summary

Automated tool that detects AI‑generated code in student assignments and creates targeted debugging exercises to test true comprehension.
Core value: preserves assessment integrity while reducing manual grading workload.

Details

Key	Value
Target Audience	Undergraduate and graduate instructors in CS/math‑heavy courses
Core Feature	Static analysis + LLM‑generated “broken” code snippets that students must fix to earn points
Tech Stack	Node.js backend, GPT‑4 for code mutation, CodeBERT embeddings for detection
Difficulty	Low
Monetization	Hobby

Notes

Responds to the Claude Code grading nightmare described by crocdundae.
Implements the “ask what the code does” approach that HN users praised, automating its creation.
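
One way to produce a "broken" snippet for students to fix is a small, deterministic mutation of working code. The sketch below swaps in plain AST rewriting for the LLM-based mutation named in the tech stack, and it is written in Python rather than Node.js to keep it self-contained; the class name `FlipFirstComparison`, the helper `make_broken`, and the sample function are all hypothetical. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast

class FlipFirstComparison(ast.NodeTransformer):
    """Plant an off-by-one bug by loosening one strict comparison."""

    # Strict operators and the non-strict operators that replace them.
    SWAPS = {ast.Lt: ast.LtE, ast.Gt: ast.GtE}

    def __init__(self) -> None:
        self.done = False  # mutate at most one comparison per run

    def visit_Compare(self, node: ast.Compare) -> ast.Compare:
        self.generic_visit(node)  # handle nested comparisons first
        if not self.done:
            for i, op in enumerate(node.ops):
                swap = self.SWAPS.get(type(op))
                if swap is not None:
                    node.ops[i] = swap()
                    self.done = True
                    break
        return node

def make_broken(source: str) -> str:
    """Return source code with one comparison mutated."""
    tree = ast.parse(source)
    mutated = FlipFirstComparison().visit(tree)
    return ast.unparse(ast.fix_missing_locations(mutated))

ORIGINAL = """
def count_below(values, limit):
    total = 0
    for v in values:
        if v < limit:
            total += 1
    return total
"""

BROKEN = make_broken(ORIGINAL)
```

Because the mutation is known in advance, the grader can check a student's fix by running the original test cases against the repaired function.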

MathAudit

Summary

Service that audits AI‑generated mathematical claims before they appear in preprints or journals.
Core value: provides instant formal‑verification reports using proof assistants.

Details

Key	Value
Target Audience	Researchers submitting to arXiv, journal editors, conference reviewers
Core Feature	Submit a LaTeX proof; backend translates to Lean and runs an automated checker, returning a verification score
Tech Stack	Lean proof checker, GPT‑4 for informal‑to‑formal translation, Dockerized workflow
Difficulty	High
Monetization	Revenue-ready: $0.01 per verification (pay‑per‑use)

Notes

Tackles the “AI slop” and publishing‑quality concerns discussed in the thread.
Offers a practical solution to the verification bottleneck mentioned by multiple commenters.

ResearchCoPilot

Summary

Integrated research workflow platform that combines LLM‑driven idea generation, citation suggestion, and version control for math papers.
Core value: streamlines the “reading‑and‑writing” loop that currently forces scholars to juggle multiple tools.

Details

Key	Value
Target Audience	Mathematicians, PhD students, research groups
Core Feature	Draft management with auto‑generated bibliography, change tracking, and LLM‑suggested extensions
Tech Stack	Django + Git backend, LlamaIndex for retrieval, Redis for caching
Difficulty	Medium
Monetization	Revenue-ready: $20/mo per user (institutional subscription)

Notes

Turns an LLM into a persistent research assistant, addressing the “mentoring” analogy from the discussion.
Reduces friction around learning how to prompt effectively, a pain point highlighted by many HN participants.

LeanSketch

Summary

Interactive proof‑sketching tool that converts hand‑drawn diagrams or informal outlines into Lean‑formalized proofs.
Core value: lowers the barrier to entering formal verification by letting users focus on intuition first.

Details

Key	Value
Target Audience	Graduate students, math educators, proof‑assistant newcomers
Core Feature	Sketch → LLM generates Lean skeleton, with step‑by‑step verification hints
Tech Stack	Electron UI, Whisper for audio notes, GPT‑4 for sketch parsing, Lean for checking
Difficulty	High
Monetization	Hobby

Notes

Leverages the observation that “most published research sits ignored,” providing a visual entry point to formal work.
Supports the future of mathematics education discussed in several HN comments.
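
To make "Lean skeleton" concrete, here is a sketch of what such a tool might emit for the informal outline "the sum of two even numbers is even". It is an illustration only, not actual tool output: the theorem name and the choice of statement are assumptions, and it uses only core Lean 4 tactics. The `sorry` marks the step left open for the user or a later verification hint.

```lean
-- Hypothetical skeleton for the outline:
--   1. write a = 2k and b = 2m
--   2. propose k + m as the witness
--   3. finish with arithmetic
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ m, b = 2 * m) :
    ∃ n, a + b = 2 * n := by
  -- Step 1: unpack the two witnesses.
  cases ha with
  | intro k hk =>
    cases hb with
    | intro m hm =>
      -- Step 2: propose k + m as the witness.
      refine ⟨k + m, ?_⟩
      -- Step 3: the arithmetic goal a + b = 2 * (k + m) is left open.
      sorry
```

Lean accepts a file containing `sorry` with a warning, so the user can check that the overall structure type-checks before filling in the remaining step.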
