BegoniaGPT: Cultivating the large language model to be an exceptional K-12 English teacher
2025 (English). In: Neural Networks, ISSN 0893-6080, E-ISSN 1879-2782, Vol. 189, p. 1-15, article id 107488. Article in journal (Refereed). Published
Abstract [en]
Large language models (LLMs) have taken the natural language processing (NLP) domain by storm, and their transformative momentum has surged into the domain of education, giving rise to a nascent wave of education-tailored LLMs. Despite their potential to facilitate homework assistance, such LLMs fall short in the fine-grained domain of elementary and secondary school (i.e., K-12) education. They often indiscriminately incorporate broad knowledge across diverse disciplines, overlooking the stark disparities in cognitive demands and curricular content among the elementary, middle, and high school phases. To fill this gap, we propose a new English teaching LLM, called BegoniaGPT, which discards irrelevant knowledge from other disciplines and shapes a general LLM into an exceptional English teacher by emphasizing four key aspects: foundational English knowledge, professional proficiency, international vision, and psychological support. In particular, we build a large-scale English corpus named EngCorpus, comprising 35,000 instructions and conversations tailored to three roles (students, teachers, and parents), as well as 30,000 emotional conversations. By continued pre-training and supervised fine-tuning of the general LLM on the carefully curated EngCorpus, and by aligning it with reinforcement learning with expert feedback, BegoniaGPT can provide refined, specialized, personalized, and compassionate English education. Through a comprehensive empirical comparison on four English benchmarks, namely E-EVAL, the 2023–2024 PEP edition of the entrance examination for middle school in China (EEM), the 2024 PEP edition of the entrance examination for high school (EEH), and the 2024 Gaokao National Paper I (Eng-Gaokao), we show that BegoniaGPT achieves state-of-the-art performance over 10 SOTA LLMs. Claude 3-opus and expert manual evaluations further validate BegoniaGPT's teaching advantages. © 2025 Elsevier Ltd
Place, publisher, year, edition, pages
Oxford: Elsevier, 2025. Vol. 189, p. 1-15, article id 107488
Keywords [en]
English teaching, Large language model, K-12 education, Fine-tuning
National Category
Studies of Specific Languages Didactics
Identifiers
URN: urn:nbn:se:hh:diva-56147
DOI: 10.1016/j.neunet.2025.107488
ISI: 001493193500001
PubMedID: 40375418
Scopus ID: 2-s2.0-105004878325
OAI: oai:DiVA.org:hh-56147
DiVA, id: diva2:1975467
Note
Funding: This paper is partly supported by a grant under the Hong Kong RGC Theme-based Research Scheme (project no. T45-401/22-N). This work is supported by the National Natural Science Foundation of China under grant No. 62006212, a Fellowship from the China Postdoctoral Science Foundation, China (2023M733907), the Natural Science Foundation of Hunan Province of China (242300421412), and the Foundation of the Key Laboratory of Dependable Service Computing in Cyber-Physical-Society (Ministry of Education), Chongqing University, China (PJ.No: CPSDSC202103).
2025-06-24, 2025-06-24, 2025-10-01. Bibliographically approved