Fine-Tuned Small LLMs

Published: 12 June 2026 | Version 1 | DOI: 10.17632/rbwzjfw4rd.1
Contributors: Wei-Ta Fang

Description

We report what is, to our knowledge, the largest properly powered expert human evaluation of this task to date. Four experienced curriculum specialists rated outputs in a fully crossed design spanning five open-source model families, three school-age bands, and three weighted educational dimensions, and the human ratings were triangulated against two automatic protocols. The dual-rubric result in particular is a methodological contribution to how our field evaluates AI-generated educational materials.

Files

Categories

Artificial Intelligence, Taiwan

Funders

Licence