JEEHP : Journal of Educational Evaluation for Health Professions

Search results for "Statistical model": 2 articles
Comparison of real data and simulated data analysis of a stopping rule based on the standard error of measurement in computerized adaptive testing for medical examinations in Korea: a psychometric study  
Dong Gi Seo, Jeongwook Choi, Jinha Kim
J Educ Eval Health Prof. 2024;21:18.   Published online July 9, 2024
DOI: https://doi.org/10.3352/jeehp.2024.21.18
Purpose
This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under 2 stopping rules (standard error of measurement [SEM]=0.3 and 0.25) using both real and simulated data in medical examinations in Korea.
Methods
This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students to examinations administered in 2020 at Hallym University College of Medicine. Simulated data were generated in R using item parameters estimated from a real item bank. Outcome variables included the number of examinees passing or failing under SEM values of 0.25 and 0.30, the number of items administered, and the correlation between ability estimates under the 2 stopping rules. The consistency of the real CAT results was evaluated by examining pass/fail agreement based on a cut score of 0.0. The efficiency of each CAT design was assessed by comparing the average number of items administered under the 2 stopping rules.
Results
Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r=0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average numbers of administered items for the real and simulated data.
Conclusion
The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
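The SEM-based stopping rule evaluated above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it assumes a Rasch-model item bank, maximum-information item selection (the item whose difficulty is nearest the current ability estimate), and a maximum-likelihood ability update, with all function names, parameters, and the simulated bank invented for illustration.

```python
import math
import random

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses, bs, theta=0.0):
    """Maximum-likelihood ability estimate via Newton-Raphson,
    clamped to [-4, 4] to handle all-correct/all-incorrect patterns."""
    for _ in range(50):
        ps = [rasch_p(theta, b) for b in bs]
        grad = sum(x - p for x, p in zip(responses, ps))
        info = sum(p * (1 - p) for p in ps)
        if info == 0:
            break
        step = grad / info
        theta += step
        if abs(step) < 1e-6:
            break
    return max(-4.0, min(4.0, theta))

def run_cat(true_theta, bank, sem_stop=0.30, max_items=100, rng=random):
    """Administer items until the SEM of the ability estimate
    drops to sem_stop, or the item/administration limit is hit."""
    remaining = list(bank)
    used_bs, responses = [], []
    theta, sem = 0.0, float("inf")
    while remaining and len(used_bs) < max_items:
        # Rasch information is maximal where difficulty equals ability,
        # so pick the item with difficulty nearest the current estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        responses.append(1 if rng.random() < rasch_p(true_theta, b) else 0)
        used_bs.append(b)
        theta = estimate_theta(responses, used_bs, theta)
        info = sum(rasch_p(theta, d) * (1 - rasch_p(theta, d)) for d in used_bs)
        sem = 1.0 / math.sqrt(info)  # SEM = 1 / sqrt(test information)
        if sem <= sem_stop:
            break
    return theta, len(used_bs), sem

rng = random.Random(42)
bank = [rng.gauss(0, 1) for _ in range(300)]  # hypothetical item bank
theta_03, n_03, _ = run_cat(0.5, bank, sem_stop=0.30, rng=rng)
theta_025, n_025, _ = run_cat(0.5, bank, sem_stop=0.25, rng=rng)
print(n_03, n_025)  # the stricter SEM=0.25 rule generally needs more items
```

Because each Rasch item contributes at most 0.25 to the test information, SEM=0.30 requires at least 45 items and SEM=0.25 at least 64, which is the efficiency trade-off the study quantifies.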
The accuracy and consistency of mastery for each content domain using the Rasch and deterministic inputs, noisy “and” gate diagnostic classification models: a simulation study and a real-world analysis using data from the Korean Medical Licensing Examination  
Dong Gi Seo, Jae Kum Kim
J Educ Eval Health Prof. 2021;18:15.   Published online July 5, 2021
DOI: https://doi.org/10.3352/jeehp.2021.18.15
Purpose
Diagnostic classification models (DCMs) were developed to identify mastery or non-mastery of the attributes required to solve test items. However, their application has been limited to very low-level attributes, and the accuracy and consistency of high-level attributes estimated using DCMs have rarely been reported in comparison with classical test theory (CTT) and item response theory models. This paper compared the accuracy of high-level attribute mastery between the deterministic inputs, noisy “and” gate (DINA) model and the Rasch model, along with sub-scores based on CTT.
Methods
First, a simulation study explored the effects of attribute length (the number of items per attribute) and the correlations among attributes on the accuracy of mastery classification. Second, a real-data study examined model and item fit and investigated the consistency of mastery for each attribute among the 3 models, using the 2017 Korean Medical Licensing Examination with 360 items.
Results
The accuracy of mastery classification increased with the number of items measuring each attribute across all conditions. The DINA model was more accurate than the CTT and Rasch models for attributes with high correlations (>0.5) and few items. In the real-data analysis, the DINA and Rasch models generally showed better item fit and appropriate model fit. The consistency of mastery between the Rasch and DINA models ranged from 0.541 to 0.633, and the correlations of person attribute scores between the 2 models ranged from 0.579 to 0.786.
Conclusion
Although all 3 models provide a mastery decision for each examinee, the individual mastery profile using the DINA model provides more accurate decisions for attributes with high correlations than the CTT and Rasch models. The DINA model can also be directly applied to tests with complex structures, unlike the CTT and Rasch models, and it provides different diagnostic information from the CTT and Rasch models.
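The “noisy and” gate at the heart of the DINA model can be illustrated with a short sketch: an examinee responds like a master only when they possess every attribute an item requires (per the Q-matrix), subject to slip and guess noise, and mastery profiles are classified from the posterior over all attribute patterns. The Q-matrix, slip/guess values, and responses below are toy values invented for illustration and are not from the study.

```python
import itertools
import math

def dina_p_correct(alpha, q_row, slip, guess):
    """DINA response probability: the 'and' gate fires only if the
    examinee holds ALL attributes the item requires."""
    eta = all(a >= q for a, q in zip(alpha, q_row))
    return (1 - slip) if eta else guess

def posterior_profiles(responses, Q, slip, guess, n_attrs):
    """Posterior over all 2^K attribute profiles under a uniform prior."""
    profiles = list(itertools.product([0, 1], repeat=n_attrs))
    weights = []
    for alpha in profiles:
        ll = 0.0
        for x, q_row in zip(responses, Q):
            p = dina_p_correct(alpha, q_row, slip, guess)
            ll += math.log(p if x else 1 - p)
        weights.append(math.exp(ll))
    total = sum(weights)
    return {alpha: w / total for alpha, w in zip(profiles, weights)}

# Toy Q-matrix: 4 items measuring 2 attributes.
Q = [(1, 0), (0, 1), (1, 1), (1, 0)]
responses = [1, 0, 0, 1]  # correct only on items requiring attribute 1 alone
post = posterior_profiles(responses, Q, slip=0.1, guess=0.2, n_attrs=2)
best = max(post, key=post.get)
print(best)  # → (1, 0): mastery of attribute 1 but not attribute 2
```

Note that item 3 requires both attributes, so under DINA a miss there is consistent with lacking either one; this conjunctive structure is what lets the model handle complex attribute patterns that the CTT and Rasch models cannot represent directly.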

Citations to this article, as recorded by Crossref:
  • Stable Knowledge Tracing Using Causal Inference
    Jia Zhu, Xiaodong Ma, Changqin Huang
    IEEE Transactions on Learning Technologies. 2024;17:124.
  • Just When You Thought that Quantitizing Merely Involved Counting: A Renewed Call for Expanding the Practice of Quantitizing in Mixed Methods Research With a Focus on Measurement-Based Quantitizing
    Tony Onwuegbuzie
    Journal of Mixed Methods Studies. 2024;(10):99.
  • Development of a character qualities test for medical students in Korea using polytomous item response theory and factor analysis: a preliminary scale development study
    Yera Hur, Dong Gi Seo
    Journal of Educational Evaluation for Health Professions. 2023;20:20.
