
JEEHP : Journal of Educational Evaluation for Health Professions



Research article
Comparison of real data and simulated data analysis of a stopping rule based on the standard error of measurement in computerized adaptive testing for medical examinations in Korea: a psychometric study
Dong Gi Seo1,2*, Jeongwook Choi1,2, Jinha Kim1

DOI: [Epub ahead of print]
Published online: July 9, 2024
1Department of Psychology, Hallym Applied Psychology Institute, College of Social Science, Hallym University, Chuncheon, Korea
2The CAT Korea Company, Chuncheon, Korea
*Corresponding email:

Editor: Sun Huh, Hallym University, Korea

• Received: 13 June 2024   • Accepted: 27 June 2024

This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under 2 stopping rules based on the standard error of measurement (SEM 0.30 and SEM 0.25), using both real and simulated data from medical examinations in Korea.
This study used post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were the responses of 3rd-year medical students to examinations administered in 2020 at Hallym University College of Medicine. Simulated data were generated in R using parameters estimated from a real item bank. Outcome variables included the number of examinees passing or failing under SEM values of 0.25 and 0.30, the number of items administered, and the correlation between ability estimates. The consistency of the real CAT results was evaluated by comparing pass/fail decisions at a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules.
Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r = 0.99) between ability estimates. The simulation results confirmed these findings, with similar average numbers of items administered in the real and simulated data.
The findings suggest that SEM 0.25 and SEM 0.30 are both effective termination criteria for CAT under the Rasch model, balancing accuracy and efficiency.
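To make the SEM-based stopping rule concrete, the following is a minimal Python sketch of a Rasch-model CAT that stops once the SEM of the ability estimate falls to a target value. It is not the authors' procedure, which was carried out in R with parameters from a real item bank. The item bank (400 difficulties drawn from a standard normal distribution), the maximum-likelihood estimator clamped to [-4, 4], the closest-difficulty item selection, and all function names here are assumptions made for illustration. Responses are generated from the model rather than taken from real examinees, so this is a Monte Carlo sketch rather than a post-hoc simulation.

```python
import math
import random


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_theta(responses, difficulties, n_iter=50):
    """Maximum-likelihood ability estimate and its SEM (Newton-Raphson).

    Assumption: the estimate is clamped to [-4, 4] so that all-correct or
    all-incorrect patterns, which have no finite MLE, stay bounded.
    """
    theta = 0.0
    for _ in range(n_iter):
        probs = [rasch_prob(theta, b) for b in difficulties]
        gradient = sum(u - p for u, p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        step = max(-1.0, min(1.0, gradient / information))  # limit step size
        theta = max(-4.0, min(4.0, theta + step))
        if abs(step) < 1e-6:
            break
    probs = [rasch_prob(theta, b) for b in difficulties]
    information = sum(p * (1.0 - p) for p in probs)
    return theta, 1.0 / math.sqrt(information)  # SEM = 1 / sqrt(test information)


def make_bank(n_items=400, seed=1):
    """Hypothetical item bank: difficulties drawn from N(0, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_items)]


def run_cat(true_theta, bank, sem_target, rng):
    """Administer items until SEM <= sem_target or the bank is exhausted."""
    remaining = list(bank)
    used, responses = [], []
    theta, sem = 0.0, float("inf")
    while remaining and sem > sem_target:
        # Under the Rasch model the most informative item is the one whose
        # difficulty is closest to the current ability estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        u = 1 if rng.random() < rasch_prob(true_theta, b) else 0
        used.append(b)
        responses.append(u)
        theta, sem = estimate_theta(responses, used)
    return theta, sem, len(used)


def compare_rules(n_examinees=200, cut_score=0.0, seed=2024):
    """Average test length under each rule and pass/fail agreement rate."""
    bank = make_bank()
    rng = random.Random(seed)
    total_30 = total_25 = agree = 0
    for i in range(n_examinees):
        true_theta = rng.gauss(0.0, 1.0)
        # Same seed for both runs: responses match until the looser rule stops.
        est_30, _, n_30 = run_cat(true_theta, bank, 0.30, random.Random(i))
        est_25, _, n_25 = run_cat(true_theta, bank, 0.25, random.Random(i))
        total_30 += n_30
        total_25 += n_25
        agree += (est_30 >= cut_score) == (est_25 >= cut_score)
    return total_30 / n_examinees, total_25 / n_examinees, agree / n_examinees


if __name__ == "__main__":
    mean_30, mean_25, agreement = compare_rules()
    print(f"mean items (SEM 0.30): {mean_30:.1f}")
    print(f"mean items (SEM 0.25): {mean_25:.1f}")
    print(f"pass/fail agreement:   {agreement:.3f}")
```

Because a Rasch item contributes at most 0.25 units of information, this maximum-likelihood sketch cannot stop before 45 items under SEM 0.30 or before 64 items under SEM 0.25. Other estimators or scaling conventions would change those bounds.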
