Creative Problem Solving in Science
We focus on the theoretical sciences (mathematics, theoretical computer science, theoretical physics, and related fields), where progress centers on definitions, conjectures, reductions, constructions, heuristics, and proof strategies.
What we collect (beta)
- Task briefs (problem framing, priors, constraints)
- Interaction transcripts (curated “episodes” with turns, branches, rationales)
- Milestones (new lemmas, promising directions, dead-ends)
- Outcome labels (novelty, utility, correctness proxy, “breakthrough points”)
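As a rough illustration of how the four record types above could fit together, here is a minimal Python sketch. It is not the project's actual schema: every class name, field name, and value (for example `parent_id`, `kind`, `breakthrough_turns`, and the 0-to-1 score range) is a hypothetical choice made for this example.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, List, Optional
import json


@dataclass
class TaskBrief:
    # Problem framing plus the priors and constraints given to participants.
    framing: str
    priors: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)


@dataclass
class Turn:
    # One step of an episode. parent_id links a turn to the turn it
    # continues from, so two turns sharing a parent form a branch.
    turn_id: int
    speaker: str
    text: str
    rationale: Optional[str] = None
    parent_id: Optional[int] = None


@dataclass
class Milestone:
    # kind is assumed to be one of "lemma", "direction", "dead_end".
    turn_id: int
    kind: str
    note: str


@dataclass
class OutcomeLabels:
    # Scores are assumed to lie in [0, 1]; the real scale is not specified.
    novelty: float
    utility: float
    correctness_proxy: float
    breakthrough_turns: List[int] = field(default_factory=list)


@dataclass
class Episode:
    brief: TaskBrief
    turns: List[Turn]
    milestones: List[Milestone]
    labels: OutcomeLabels

    def branch_points(self) -> List[int]:
        """Return ids of turns that have more than one child turn."""
        children: Dict[int, int] = {}
        for turn in self.turns:
            if turn.parent_id is not None:
                children[turn.parent_id] = children.get(turn.parent_id, 0) + 1
        return sorted(pid for pid, n in children.items() if n > 1)


def example_episode() -> Episode:
    """Build a tiny placeholder episode with one branch at turn 1."""
    return Episode(
        brief=TaskBrief(framing="Placeholder problem statement."),
        turns=[
            Turn(0, "human", "Pose the problem."),
            Turn(1, "model", "Propose a reduction.", parent_id=0),
            Turn(2, "model", "Try approach A.", parent_id=1),
            Turn(3, "model", "Try approach B.", parent_id=1),
        ],
        milestones=[
            Milestone(2, "dead_end", "Approach A stalls."),
            Milestone(3, "direction", "Approach B looks promising."),
        ],
        labels=OutcomeLabels(0.5, 0.5, 0.5, breakthrough_turns=[3]),
    )


def to_json(episode: Episode) -> str:
    """Serialize an episode, nested records included, to a JSON string."""
    return json.dumps(asdict(episode))
```

Storing branches as parent links rather than nested lists keeps each turn a flat record, which makes it straightforward to attach milestones and breakthrough labels by `turn_id`.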
Why this matters
- Training: episodes become instructional curricula for creative reasoning, not just short Q&A.
- Evaluation: we score creative moves and trajectory quality, not only final answers.
Roadmap
- v0 (now): recruit beta testers and pilot the schema with guided prompts
- v1: public benchmark + leaderboard for creative scientific tasks
- v2: open Arena for model-vs-model co-creative sessions (observer-rated)