Fine-Tuned Small LLMs
Published: 12 June 2026 | Version 1 | DOI: 10.17632/rbwzjfw4rd.1
Contributors:
, , Wei-Ta Fang
Description
We report what is, to our knowledge, the largest properly powered expert human evaluation of this task to date: four experienced curriculum specialists rating outputs in a fully crossed design that spans five open-source model families, three school-age bands, and three weighted educational dimensions, triangulated against two automatic protocols. The dual-rubric result in particular is a methodological contribution to how our field evaluates AI-generated educational materials.
Categories
Artificial Intelligence, Taiwan