-
Comparison of real data and simulated data analysis of a stopping rule based on the standard error of measurement in computerized adaptive testing for medical examinations in Korea: a psychometric study
-
Dong Gi Seo, Jeongwook Choi, Jinha Kim
-
J Educ Eval Health Prof. 2024;21:18. Published online July 9, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.18
-
-
Abstract
- Purpose
This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under 2 stopping rules (standard error of measurement [SEM]=0.3 and 0.25) using both real and simulated data in medical examinations in Korea.
Methods
This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated in R using parameters estimated from a real item bank. Outcome variables included the number of examinees passing or failing under SEM values of 0.25 and 0.30, the number of items administered, and the correlation between ability estimates. The consistency of the real CAT results was evaluated by examining the agreement of pass/fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules.
Results
Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r=0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average item numbers between real and simulated data.
Conclusion
The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
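To make the stopping rule concrete, the following is a minimal Python sketch of an SEM-based termination check under the Rasch model. It is not the authors' code (the abstract states only that R was used); the function names `rasch_prob`, `sem`, `should_stop`, and `min_items` are hypothetical. It relies on two standard facts: a Rasch item contributes information p(1-p) at ability theta, and the SEM is the reciprocal square root of the summed information.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def sem(theta, difficulties):
    """SEM at theta given the difficulties of the items administered so far."""
    info = 0.0
    for b in difficulties:
        p = rasch_prob(theta, b)
        info += p * (1.0 - p)  # Fisher information of one Rasch item
    return 1.0 / math.sqrt(info) if info > 0 else math.inf

def should_stop(theta, difficulties, sem_target):
    """True once the SEM has fallen to the target (e.g. 0.30 or 0.25)."""
    return sem(theta, difficulties) <= sem_target

def min_items(sem_target):
    """Lower bound on test length: each Rasch item adds at most 0.25 information."""
    required_info = 1.0 / sem_target ** 2
    return math.ceil(required_info / 0.25)
```

Because a Rasch item can contribute at most 0.25 units of information, this bound implies at least 45 items for SEM 0.30 and at least 64 items for SEM 0.25 when no prior information is used. Actual test lengths depend on the item bank and the scoring method.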
-
Introduction to the LIVECAT web-based computerized adaptive testing platform
-
Dong Gi Seo, Jeongwook Choi
-
J Educ Eval Health Prof. 2020;17:27. Published online September 29, 2020
-
DOI: https://doi.org/10.3352/jeehp.2020.17.27
-
-
-
Abstract
- This study introduces LIVECAT, a web-based computerized adaptive testing platform. The platform provides many functions, including writing item content, managing an item bank, creating and administering a test, reporting test results, and providing information about a test and its examinees. LIVECAT gives examination administrators an easy and flexible environment for composing and managing examinations. It is available at http://www.thecatkorea.com/. Several tools were used to program LIVECAT, as follows: operating system, Amazon Linux; web server, nginx 1.18; WAS, Apache Tomcat 8.5; database, Amazon RDMS (MariaDB); and languages, Java 8, HTML5/CSS, JavaScript, and jQuery. The LIVECAT platform can be used to implement several item response theory (IRT) models, such as the Rasch model and the 1-, 2-, and 3-parameter logistic models. The administrator can choose a specific model for test construction in LIVECAT. Multimedia data such as images, audio files, and movies can be uploaded to items in LIVECAT. Two scoring methods (maximum likelihood estimation and expected a posteriori) are available in LIVECAT, and the maximum Fisher information item selection method is applied to every IRT model. The LIVECAT platform showed equal or better performance compared with a conventional test platform. It enables users without psychometric expertise to easily implement and perform computerized adaptive testing at their institutions. The most recent LIVECAT version provides only dichotomous item response models and the basic components of CAT. In the near future, LIVECAT will include advanced functions, such as polytomous item response models, the weighted likelihood estimation method, and a content balancing method.
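As an illustration of maximum Fisher information item selection, here is a short Python sketch. It is not LIVECAT's implementation (which the abstract describes as written in Java); the names `item_information` and `select_next_item` and the `(a, b, c)` bank layout are assumptions made for this example. It uses the standard 3-parameter logistic information function, which reduces to the 2-parameter case when c = 0 and to the Rasch case when a = 1 and c = 0.

```python
import math

def item_information(theta, a=1.0, b=0.0, c=0.0):
    """Fisher information of a 3PL item at ability theta."""
    p = c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
    q = 1.0 - p
    return a * a * (q / p) * ((p - c) / (1.0 - c)) ** 2

def select_next_item(theta, bank, administered):
    """Return the id of the unused item with the most information at theta.

    bank: dict mapping item id -> (a, b, c) parameters (hypothetical layout).
    administered: set of item ids already given to the examinee.
    """
    candidates = [item for item in bank if item not in administered]
    if not candidates:
        return None  # item bank exhausted
    return max(candidates, key=lambda item: item_information(theta, *bank[item]))
```

For example, with `bank = {"easy": (1.0, -2.0, 0.0), "mid": (1.0, 0.1, 0.0), "hard": (1.0, 1.5, 0.0)}` and a current estimate of 0.0, the sketch selects `"mid"` first, because a Rasch item is most informative when its difficulty is closest to the examinee's ability.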
-
Post-hoc simulation study of computerized adaptive testing for the Korean Medical Licensing Examination
-
Dong Gi Seo, Jeongwook Choi
-
J Educ Eval Health Prof. 2018;15:14. Published online May 17, 2018
-
DOI: https://doi.org/10.3352/jeehp.2018.15.14
-
Correction in: J Educ Eval Health Prof 2018;15(0):27
-
-
Abstract
- Purpose
Computerized adaptive testing (CAT) has been adopted in licensing examinations because it improves the efficiency and accuracy of the tests, as shown in many studies. This simulation study investigated CAT scoring and item selection methods for the Korean Medical Licensing Examination (KMLE).
Methods
This study used a post-hoc (real data) simulation design. The item bank used in this study included all items from the January 2017 KMLE. All CAT algorithms for this study were implemented using the ‘catR’ package in the R program.
Results
In terms of accuracy, the Rasch and 2-parameter logistic (PL) models performed better than the 3PL model. The ‘modal a posteriori’ and ‘expected a posteriori’ methods provided more accurate estimates than maximum likelihood estimation or weighted likelihood estimation. Furthermore, maximum posterior weighted information and minimum expected posterior variance performed better than other item selection methods. In terms of efficiency, the Rasch model is recommended to reduce test length.
Conclusion
Before implementing live CAT, a simulation study should be performed under varied test conditions. Based on the results of that simulation study, specific scoring and item selection methods should be predetermined.
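A post-hoc simulation replays a CAT over responses that examinees already gave on a fixed-form test: whenever the algorithm selects an item, the recorded answer is looked up instead of being collected live. The Python sketch below illustrates that idea for a single examinee. It is not the study's code, which used the ‘catR’ package in R. The function names, the dictionary layout of `bank` and `recorded`, the quadrature grid, the standard normal prior, and the use of the posterior standard deviation as the stopping criterion are all assumptions made for this example. It combines the Rasch model, expected a posteriori scoring, and maximum information selection.

```python
import math

GRID = [-4.0 + 0.1 * k for k in range(81)]  # quadrature points for theta

def p_correct(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap(responses):
    """EAP estimate and posterior SD from (difficulty, score) pairs."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalised N(0, 1) prior
        for b, u in responses:
            p = p_correct(t, b)
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def post_hoc_cat(bank, recorded, sem_target, max_items=None):
    """Replay a CAT for one examinee.

    bank: {item_id: difficulty}; recorded: {item_id: 0 or 1} from the real exam.
    Returns (theta estimate, posterior SD, number of items used).
    """
    remaining = dict(bank)
    used = []
    theta, sd = eap(used)  # prior-only starting point
    limit = max_items or len(bank)
    while remaining and len(used) < limit and sd > sem_target:
        # Under the Rasch model, information p(1 - p) is largest for the
        # item whose difficulty is closest to the current estimate.
        item = min(remaining, key=lambda i: abs(remaining[i] - theta))
        used.append((remaining.pop(item), recorded[item]))
        theta, sd = eap(used)
    return theta, sd, len(used)
```

Running `post_hoc_cat` over every examinee's recorded responses, once per candidate configuration, yields the test lengths and ability estimates that a study of this kind would compare against full-length scores.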
-