Developing an Intelligent Tutoring System for Personalized Skill Development Using Reinforcement Learning


Sherlyn T. Guzman
Maria Crisella Dela Cruz-Mercado

Abstract

The one-size-fits-all model of traditional education is increasingly inadequate for addressing diverse learner needs. Intelligent Tutoring Systems (ITS) offer a solution but are often limited by static, hand-crafted pedagogical rules that cannot optimize long-term learning trajectories. This paper presents the design, implementation, and empirical validation of RL-Tutor, a novel ITS that uses deep reinforcement learning (RL) to provide dynamic, personalized instruction. RL-Tutor integrates a Deep Knowledge Tracing model based on a Dynamic Key-Value Memory Network (DKVMN) to maintain a rich, continuous representation of the student's knowledge state. This state is the input to a Proximal Policy Optimization (PPO) agent, which functions as the pedagogical module and selects actions from a hierarchical space that includes problem selection, hint provision, and instructional review. A critical contribution is the formulation of a multi-faceted reward function that balances immediate performance, learning efficiency, and long-term knowledge retention. Because RL is sample-inefficient, the agent was first trained in a high-fidelity simulated environment with a population of 10,000 synthetic students. The trained system was then evaluated against a rule-based tutor and a static tutor in a between-subjects human study (N=90) in the domain of introductory Python programming. RL-Tutor led to significantly higher normalized learning gains (0.72 vs. 0.58 for the rule-based tutor and 0.45 for the static tutor, p < 0.01) and better retention on a one-week delayed post-test. Analysis of the learned policy revealed emergent, pedagogically sound strategies such as adaptive hinting and implicit spaced repetition. This work shows that RL can autonomously discover complex, effective teaching policies that are tailored to individual learners and outperform traditional ITS architectures.
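To make two quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the multi-faceted reward is a weighted sum of three terms (immediate performance, learning efficiency, retention) and that "normalized learning gain" follows the common definition (post - pre) / (max - pre). The abstract specifies neither formula, so the class `StepOutcome`, its fields, the weights, and `time_scale` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Hypothetical per-step signals; the abstract does not list the real features."""
    correct: bool              # immediate performance on the current problem
    mastery_gain: float        # change in estimated mastery from the knowledge tracer
    seconds_spent: float       # time spent on the step, used as an efficiency proxy
    retention_estimate: float  # predicted probability of later recall, in [0, 1]


def composite_reward(outcome: StepOutcome,
                     w_perf: float = 1.0,
                     w_eff: float = 0.5,
                     w_ret: float = 0.5,
                     time_scale: float = 60.0) -> float:
    """Weighted sum of three reward terms (assumed form, illustrative weights)."""
    # Immediate performance: 1 for a correct answer, 0 otherwise.
    perf = 1.0 if outcome.correct else 0.0
    # Learning efficiency: mastery gained, discounted by the time it took.
    eff = outcome.mastery_gain / (1.0 + outcome.seconds_spent / time_scale)
    # Long-term retention: the predicted recall probability itself.
    ret = outcome.retention_estimate
    return w_perf * perf + w_eff * eff + w_ret * ret


def normalized_gain(pre: float, post: float, max_score: float = 100.0) -> float:
    """Fraction of the available improvement that was achieved (assumed definition)."""
    if pre >= max_score:
        raise ValueError("pre-test score must be below the maximum score")
    return (post - pre) / (max_score - pre)


if __name__ == "__main__":
    step = StepOutcome(correct=True, mastery_gain=0.2,
                       seconds_spent=60.0, retention_estimate=0.5)
    print(composite_reward(step))
    print(normalized_gain(50.0, 86.0))
```

Under this assumed definition, a student who moves from 50 to 86 on a 100-point test has a normalized gain of 0.72, the same magnitude reported for RL-Tutor. The pre- and post-test scores here are invented for the example and are not data from the study.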

How to Cite

Guzman, S. T., & Dela Cruz-Mercado, M. C. (2025). Developing an Intelligent Tutoring System for Personalized Skill Development Using Reinforcement Learning. Qubahan Techno Journal, 4(1). https://doi.org/10.48161/qtj.v4n1a43
