This study examines the legality and appropriateness of keeping the multiple-choice question items of the Korean Medical Licensing Examination (KMLE) confidential. Through an analysis of cases from the United States, Canada, and Australia, where medical licensing exams are conducted using item banks and computer-based testing, we found that exam items are kept confidential to ensure fairness and prevent cheating. In Korea, the Korea Health Personnel Licensing Examination Institute (KHPLEI) has been disclosing KMLE questions despite concerns over exam integrity. Korean courts have consistently ruled that multiple-choice question items prepared by public institutions are non-public information under Article 9(1)(v) of the Korea Official Information Disclosure Act (KOIDA), which exempts disclosure if it significantly hinders the fairness of exams or research and development. The Constitutional Court of Korea has upheld this provision. Given the time and cost involved in developing high-quality items and the need to accurately assess examinees’ abilities, there are compelling reasons to keep KMLE items confidential. As a public institution responsible for selecting qualified medical practitioners, KHPLEI should establish its disclosure policy based on a balanced assessment of public interest, without influence from specific groups. We conclude that KMLE questions qualify as non-public information under KOIDA, and KHPLEI may choose to maintain their confidentiality to ensure exam fairness and efficiency.
Purpose The Dr. LEE Jong-wook Fellowship Program, established by the Korea Foundation for International Healthcare (KOFIH), aims to strengthen healthcare capacity in partner countries. The aim of the study was to develop new performance evaluation indicators for the program to better assess long-term educational impact across various courses and professional roles.
Methods A 3-stage process was employed. First, a literature review of established evaluation models (Kirkpatrick’s 4 levels, context/input/process/product evaluation model, Organization for Economic Cooperation and Development Assistance Committee criteria) was conducted to devise evaluation criteria. Second, these criteria were validated via a 2-round Delphi survey with 18 experts in training projects from May 2021 to June 2021. Third, the relative importance of the evaluation criteria was determined using the analytic hierarchy process (AHP), calculating weights and ensuring consistency through the consistency index and consistency ratio (CR), with CR values below 0.1 indicating acceptable consistency.
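The consistency check described above can be sketched as follows. This is a minimal illustration, not the authors' computation: the pairwise comparison matrices are hypothetical, the weights are approximated with the row geometric-mean method (an assumption; the abstract does not say how weights were derived), and the random index values are Saaty's commonly tabulated ones.

```python
import math

# Saaty's random index (RI) by matrix size, as commonly tabulated.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(matrix):
    """Approximate priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(matrix):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    w = ahp_weights(matrix)
    # Multiply the comparison matrix by the weight vector.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    # Estimate the principal eigenvalue as the mean of (Aw)_i / w_i.
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

# Hypothetical 3x3 judgments (not taken from the study).
consistent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
slightly_inconsistent = [[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]]

cr_consistent = consistency_ratio(consistent)
cr_inconsistent = consistency_ratio(slightly_inconsistent)
acceptable = cr_inconsistent < 0.1  # the acceptance rule stated in the Methods
```

A perfectly consistent matrix yields a CR of essentially zero, while mildly conflicting judgments yield a small positive CR that can still fall under the 0.1 cutoff.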
Results The literature review led to a combined evaluation model, resulting in 4 evaluation areas, 20 items, and 92 indicators. The Delphi surveys confirmed the validity of these indicators, with content validity ratio values exceeding 0.444. The AHP analysis assigned weights to each indicator, and CR values below 0.1 indicated consistency. The final set of evaluation indicators was confirmed through a workshop with KOFIH and adopted as the new evaluation tool.
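For orientation, the content validity ratio can be computed with Lawshe's formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an indicator as valid and N is the panel size. The sketch below assumes this standard formula; with 18 panelists, 13 favorable ratings give (13 - 9) / 9, or about 0.444, which matches the cutoff quoted above. Whether the authors derived the cutoff in exactly this way is an assumption.

```python
def content_validity_ratio(n_favorable, n_panelists):
    """Lawshe's CVR: ranges from -1 (no one agrees) to 1 (everyone agrees)."""
    half = n_panelists / 2
    return (n_favorable - half) / half

# Panel size of 18 comes from the Methods; the rating counts are hypothetical.
cvr_13_of_18 = content_validity_ratio(13, 18)
cvr_all = content_validity_ratio(18, 18)
cvr_half = content_validity_ratio(9, 18)
```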
Conclusion The developed evaluation framework provides a comprehensive tool for assessing the long-term outcomes of the Dr. LEE Jong-wook Fellowship Program. It enhances evaluation capabilities and supports improvements in the training program’s effectiveness and international healthcare collaboration.
Purpose This study aimed to explore how the grading system affected medical students’ academic performance based on their perceptions of the learning environment and intrinsic motivation in the context of changing from norm-referenced A–F grading to criterion-referenced honors/pass/fail grading.
Methods The study involved 238 second-year medical students from 2014 (n=127, A–F grading) and 2015 (n=111, honors/pass/fail grading) at Yonsei University College of Medicine in Korea. Scores on the Dundee Ready Education Environment Measure, the Academic Motivation Scale, and the Basic Medical Science Examination were used to measure overall learning environment perceptions, intrinsic motivation, and academic performance, respectively. Serial mediation analysis was conducted to examine the pathways between the grading system and academic performance, focusing on the mediating roles of student perceptions and intrinsic motivation.
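The serial mediation pathway described above (grading system, then learning environment perception, then intrinsic motivation, then academic performance) can be illustrated with three ordinary least squares regressions, where the serial indirect effect is the product of the three path coefficients. This is a sketch on simulated data with made-up effect sizes; it is not the study's data or software, and the helper names `ols` and `serial_indirect_effect` are hypothetical. A real analysis would also bootstrap a confidence interval for the indirect effect.

```python
import random

def ols(y, columns):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented system (X'X | X'y).
    a = [[sum(r[p] * r[q] for r in rows) for q in range(k)]
         + [sum(r[p] * yi for r, yi in zip(rows, y))] for p in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * z for v, z in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]

def serial_indirect_effect(x, m1, m2, y):
    a1 = ols(m1, [x])[1]          # X -> M1
    d21 = ols(m2, [x, m1])[2]     # M1 -> M2, controlling for X
    b2 = ols(y, [x, m1, m2])[3]   # M2 -> Y, controlling for X and M1
    return a1 * d21 * b2

# Simulated data: x is a 0/1 grading-system indicator (hypothetical values).
rng = random.Random(0)
n = 2000
x = [i % 2 for i in range(n)]
m1 = [0.5 * xi + rng.gauss(0, 0.5) for xi in x]   # learning environment perception
m2 = [0.6 * v + rng.gauss(0, 0.5) for v in m1]    # intrinsic motivation
y = [0.7 * v + rng.gauss(0, 0.5) for v in m2]     # academic performance
indirect = serial_indirect_effect(x, m1, m2, y)   # true value is 0.5 * 0.6 * 0.7
```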
Results The honors/pass/fail grading class students reported more positive perceptions of the learning environment, higher intrinsic motivation, and better academic performance than the A–F grading class students. Mediation analysis demonstrated a serial mediation effect between the grading system and academic performance through learning environment perceptions and intrinsic motivation. Student perceptions and intrinsic motivation did not independently mediate the relationship between the grading system and performance.
Conclusion Reducing the number of grades and eliminating rank-based grading might have created an affirming learning environment that fulfills basic psychological needs and reinforces the intrinsic motivation linked to academic performance. The cumulative effect of these 2 mediators suggests that a comprehensive approach should be used to understand student performance.
Purpose This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under 2 stopping rules (standard error of measurement [SEM]=0.3 and 0.25) using both real and simulated data in medical examinations in Korea.
Methods This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated in R using parameters estimated from a real item bank. Outcome variables included the number of examinees passing or failing with SEM values of 0.25 and 0.30, the number of items administered, and the correlation between ability estimates. The consistency of the real CAT results was evaluated by examining the consistency of pass/fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules.
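A standard-error stopping rule of this kind can be sketched as below. The example is an illustration under stated assumptions rather than the study's procedure (which used R): the item bank is synthetic, items are chosen by closeness of difficulty to the current ability estimate, and ability is estimated by expected a posteriori (EAP) scoring with a standard normal prior, so the posterior standard deviation stands in for the SEM. Only the Rasch model, the two SEM targets, and the cut score of 0.0 come from the text.

```python
import math
import random

def p_correct(theta, b):
    """Rasch model probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

GRID = [-4 + 0.1 * k for k in range(81)]  # ability grid for EAP scoring

def eap(responses):
    """responses: list of (difficulty, 0/1). Returns (estimate, posterior SD)."""
    post = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized standard normal prior
        for b, u in responses:
            p = p_correct(t, b)
            w *= p if u else (1.0 - p)
        post.append(w)
    total = sum(post)
    mean = sum(t * w for t, w in zip(GRID, post)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, post)) / total
    return mean, math.sqrt(var)

def run_cat(true_theta, bank, sem_target, rng):
    """Administer items until the SEM target is met or the bank is exhausted."""
    remaining = list(bank)
    responses = []
    theta, sem = 0.0, 1.0
    while remaining and sem > sem_target:
        b = min(remaining, key=lambda d: abs(d - theta))  # most informative item
        remaining.remove(b)
        u = 1 if rng.random() < p_correct(true_theta, b) else 0
        responses.append((b, u))
        theta, sem = eap(responses)
    return theta, sem, len(responses)

bank = [-3 + 0.02 * k for k in range(301)]  # synthetic difficulties
theta30, sem30, n30 = run_cat(0.5, bank, 0.30, random.Random(2020))
theta25, sem25, n25 = run_cat(0.5, bank, 0.25, random.Random(2020))
pass30, pass25 = theta30 >= 0.0, theta25 >= 0.0  # cut score of 0.0
```

Because both runs share the same random seed, the stricter 0.25 rule simply continues the test that the 0.30 rule would have stopped, which makes the efficiency cost of the tighter criterion easy to see as extra items administered.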
Results Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r=0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average item numbers between real and simulated data.
Conclusion The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
Tai-hwan Uhm, Heakyung Choi, Seok Hwan Hong, Hyungsub Kim, Minju Kang, Keunyoung Kim, Hyejin Seo, Eunyoung Ki, Hyeryeong Lee, Heejeong Ahn, Uk-jin Choi, Sang Woong Park
J Educ Eval Health Prof. 2024;21:13. Published online June 12, 2024
Purpose The duties of paramedics and emergency medical technicians (P&EMTs) are continuously changing due to developments in medical systems. This study presents evaluation goals for P&EMTs by analyzing their work, especially the tasks that new P&EMTs (with less than 3 years’ experience) find difficult, to foster the training of P&EMTs who could adapt to emergency situations after graduation.
Methods A questionnaire was created based on prior job analyses of P&EMTs. The survey questions were reviewed through focus group interviews, from which 253 task elements were derived. A survey was conducted from July 10, 2023 to October 13, 2023 on the frequency, importance, and difficulty of the 6 occupations in which P&EMTs were employed.
Results The P&EMTs’ most common tasks involved obtaining patients’ medical histories and measuring vital signs, whereas the most important task was cardiopulmonary resuscitation (CPR). The task elements that the P&EMTs found most difficult were newborn delivery and infant CPR. New paramedics reported that treating patients with fractures, poisoning, and childhood fever was difficult, while new EMTs reported that they had difficulty keeping diaries, managing ambulances, and controlling infection.
Conclusion Communication was the most important item for P&EMTs, whereas CPR was the most important skill. It is important for P&EMTs to have knowledge of all tasks; however, they also need to master frequently performed tasks and those that pose difficulties in the field. By deriving goals for evaluating P&EMTs, changes could be made to their education, thereby making it possible to train more capable P&EMTs.
Purpose This study aimed to propose a revision of the evaluation objectives of the Korean Dentist Clinical Skill Test by analyzing the opinions of those involved in the examination after a review of those objectives.
Methods The clinical skill test objectives were reviewed based on the national-level dental practitioner competencies, dental school educational competencies, and the third dental practitioner job analysis. Current and former examinees were surveyed about their perceptions of the evaluation objectives. The validity of 22 evaluation objectives and overlapping perceptions based on area of specialty were surveyed on a 5-point Likert scale by professors who participated in the clinical skill test and dental school faculty members. Additionally, focus group interviews were conducted with experts on the examination.
Results It was necessary to consider including competency assessments for “emergency rescue skills” and “planning and performing prosthetic treatment.” There were no significant differences between current and former examinees in their perceptions of the clinical skill test’s objectives. The professors who participated in the examination and dental school faculty members recognized that most of the objectives were valid. However, some responses stated that “oromaxillofacial cranial nerve examination,” “temporomandibular disorder palpation test,” and “space management for primary and mixed dentition” were unfeasible evaluation objectives and overlapped with dental specialty areas.
Conclusion When revising the Korean Dentist Clinical Skill Test’s objectives, it is advisable to consider incorporating competency assessments related to “emergency rescue skills” and “planning and performing prosthetic treatment.”
Purpose This study aimed to explore the perceptions held by practicing dietitians of the importance of their tasks performed in current work environments, the frequency at which those tasks are performed, and predictions about the importance of those tasks in future work environments.
Methods This was a cross-sectional survey study. An online survey was administered to 350 practicing dietitians. They were asked to assess the importance, performance frequency, and predicted changes in the importance of 27 tasks using a 5-point scale. Descriptive statistics were calculated, and the means of the variables were compared across categorized work environments using analysis of variance.
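The comparison of means across work environments rests on the one-way analysis of variance F statistic, the ratio of the between-group mean square to the within-group mean square. The sketch below computes it for made-up 5-point ratings in three hypothetical workplace groups; it does not reproduce the study's data or software.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical importance ratings for one task in three workplace types.
ratings = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(ratings)
```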
Results The importance scores of all surveyed tasks were higher than 3.0, except for the marketing management task. Self-development, nutrition education/counseling, menu planning, food safety management, and documentation/data management were all rated higher than 4.0. The highest performance frequency score was related to documentation/data management. The importance scores of all duties, except for professional development, differed significantly by workplace. As for predictions about the future importance of the tasks surveyed, dietitians responded that the importance of all 27 tasks would either remain at current levels or increase in the future.
Conclusion Twenty-seven tasks were confirmed to represent dietitians’ job functions in various workplaces. These tasks can be used to improve the test specifications of the Korean Dietitian Licensing Examination and the curriculum of dietetic education programs.
Purpose This study assessed the performance of 6 generative artificial intelligence (AI) platforms on the learning objectives of medical arthropodology in a parasitology class in Korea. We examined the AI platforms’ performance by querying in Korean and English to determine their information amount, accuracy, and relevance in prompts in both languages.
Methods From December 15 to 17, 2023, 6 generative AI platforms—Bard, Bing, Claude, Clova X, GPT-4, and Wrtn—were tested on 7 medical arthropodology learning objectives in English and Korean. Clova X and Wrtn are platforms from Korean companies. Responses were evaluated using specific criteria for the English and Korean queries.
Results Bard had abundant information but was 4th in accuracy and relevance. GPT-4, with high information content, ranked 1st in accuracy and relevance. Clova X was 4th in amount but 2nd in accuracy and relevance. Bing provided less information, with moderate accuracy and relevance. Wrtn’s answers were short, with average accuracy and relevance. Claude AI had reasonable information, but lower accuracy and relevance. The responses in English were superior in all aspects. Clova X was notably optimized for Korean, leading in relevance.
Conclusion In a study of 6 generative AI platforms applied to medical arthropodology, GPT-4 excelled overall, while Clova X, a Korea-based AI product, achieved 100% relevance in Korean queries, the highest among its peers. Utilizing these AI platforms in classrooms improved the authors’ self-efficacy and interest in the subject, offering a positive experience of interacting with generative AI platforms to question and receive information.
Purpose This study aimed to evaluate the impact of a transcultural nursing course on enhancing the cultural competency of graduate nursing students in Korea. We hypothesized that participants’ cultural competency would significantly improve in areas such as communication, biocultural ecology and family, dietary habits, death rituals, spirituality, equity, and empowerment and intermediation after completing the course. Furthermore, we assessed the participants’ overall satisfaction with the course.
Methods A before-and-after study was conducted with graduate nursing students at Hallym University, Chuncheon, Korea, from March to June 2023. A transcultural nursing course was developed based on Giger & Haddad’s transcultural nursing model and Purnell’s theoretical model of cultural competence. Data were collected using a cultural competence scale for registered nurses developed by Kim and his colleagues. A total of 18 students participated, and the paired t-test was employed to compare pre- and post-intervention scores.
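The paired t-test used for the pre/post comparison reduces to a one-sample test on the score differences: t = mean(d) / (sd(d) / sqrt(n)) with n - 1 degrees of freedom. The sketch below uses made-up scores, not the study's data. For 18 participants (17 degrees of freedom), a two-sided test at the 0.01 level requires |t| above roughly 2.898.

```python
import math
import statistics

def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Hypothetical competence scores for four students before and after a course.
pre_scores = [1, 2, 3, 4]
post_scores = [2, 4, 5, 5]
t_stat, df = paired_t(pre_scores, post_scores)
```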
Results The study revealed significant improvements in all 7 categories of cultural nursing competence (P<0.01). Specifically, the mean differences in scores (pre–post) ranged from 0.74 to 1.09 across the categories. Additionally, participants expressed high satisfaction with the course, with an average score of 4.72 out of a maximum of 5.0.
Conclusion The transcultural nursing course effectively enhanced the cultural competency of graduate nursing students. Such courses are imperative to ensure quality care for the increasing multicultural population in Korea.
Purpose This study presents item analysis results of the 26 health personnel licensing examinations managed by the Korea Health Personnel Licensing Examination Institute (KHPLEI) in 2022.
Methods The item difficulty index, item discrimination index, and reliability were calculated. The item discrimination index was calculated using a discrimination index based on the upper and lower 27% rule and the item-total correlation.
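The indices named above can be computed as in the following sketch. It assumes dichotomously scored items, takes the difficulty index as the proportion of correct answers, forms the upper and lower groups from the top and bottom 27% of total scores, and uses an uncorrected item-total correlation (the item is included in the total). Those details, the function names, and the small response matrix are illustrative assumptions, not the KHPLEI procedure.

```python
import math
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)

def item_statistics(responses, item):
    """responses: one list of 0/1 item scores per examinee."""
    totals = [sum(r) for r in responses]
    scores = [r[item] for r in responses]
    n = len(responses)
    difficulty = sum(scores) / n                      # proportion correct
    order = sorted(range(n), key=lambda i: totals[i])
    k = max(1, round(0.27 * n))                       # size of each 27% group
    lower, upper = order[:k], order[-k:]
    discrimination = (sum(scores[i] for i in upper)
                      - sum(scores[i] for i in lower)) / k
    return difficulty, discrimination, pearson(scores, totals)

def cronbach_alpha(responses):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(responses[0])
    item_vars = [statistics.pvariance([r[j] for r in responses]) for j in range(k)]
    total_var = statistics.pvariance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 10 examinees, 4 items.
data = [
    [0, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0],
    [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1],
]
difficulty, discrimination, item_total_r = item_statistics(data, 2)
alpha = cronbach_alpha(data)
```

Note that a higher difficulty index here means an easier item, which is why the Results describe high average values as the norm across examinations.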
Results Out of 468,352 total examinees, 418,887 (89.4%) passed. The pass rates ranged from 27.3% for health educators level 1 to 97.1% for oriental medical doctors. Most examinations had a high average difficulty index, albeit to varying degrees, ranging from 61.3% for prosthetists and orthotists to 83.9% for care workers. The average discrimination index based on the upper and lower 27% rule ranged from 0.17 for oriental medical doctors to 0.38 for radiological technologists. The average item-total correlation ranged from 0.20 for oriental medical doctors to 0.38 for radiological technologists. The Cronbach α, as a measure of reliability, ranged from 0.872 for health educators level 3 to 0.978 for medical technologists. The correlation coefficient between the average difficulty index and average discrimination index was -0.2452 (P=0.1557), that between the average difficulty index and the average item-total correlation was 0.3502 (P=0.0392), and that between the average discrimination index and the average item-total correlation was 0.7944 (P<0.0001).
Conclusion This technical report presents the item analysis results and reliability of the recent examinations by the KHPLEI, demonstrating an acceptable range of difficulty index and discrimination index values, as well as good reliability.
Purpose This study aimed to analyze patterns of using ChatGPT before and after group activities and to explore medical students’ perceptions of ChatGPT as a feedback tool in the classroom.
Methods The study included 99 2nd-year pre-medical students who participated in a “Leadership and Communication” course from March to June 2023. Students engaged in both individual and group activities related to negotiation strategies. ChatGPT was used to provide feedback on their solutions. A survey was administered to assess students’ perceptions of ChatGPT’s feedback, its use in the classroom, and the strengths and challenges of ChatGPT from May 17 to 19, 2023.
Results The students indicated that ChatGPT’s feedback was helpful, and they revised and resubmitted their group answers in various ways after receiving feedback. The majority of respondents agreed with the use of ChatGPT during class. The most common response concerning the appropriate context for using ChatGPT’s feedback was “after the first round of discussion, for revisions.” There was a significant difference in satisfaction with ChatGPT’s feedback, including correctness, usefulness, and ethics, depending on whether or not ChatGPT was used during class, but there was no significant difference according to gender or whether students had previous experience with ChatGPT. The strongest advantages were “providing answers to questions” and “summarizing information,” and the worst disadvantage was “producing information without supporting evidence.”
Conclusion The students were aware of the advantages and disadvantages of ChatGPT, and they had a positive attitude toward using ChatGPT in the classroom.
Purpose This study aims to suggest the number of test items in each of 8 nursing activity categories of the Korean Nursing Licensing Examination, which comprises 134 activity statements including 275 items. The examination will be able to evaluate the minimum ability that nursing graduates must have to perform their duties.
Methods Two opinion surveys involving the members of 7 academic societies were conducted from March 19 to May 14, 2021. The survey results were reviewed by members of 4 expert associations from May 21 to June 4, 2021. The results for revised numbers of items in each category were compared with those reported by Tak and his colleagues and the National Council License Examination for Registered Nurses of the United States.
Results Based on 2 opinion surveys and previous studies, the suggestions for item allocation to 8 nursing activity categories of the Korean Nursing Licensing Examination in this study are as follows: 50 items for management of care and improvement of professionalism, 33 items for safety and infection control, 40 items for management of potential risk, 28 items for basic care, 47 items for physiological integrity and maintenance, 33 items for pharmacological and parenteral therapies, 24 items for psychosocial integrity and maintenance, and 20 items for health promotion and maintenance. Twenty other items related to health and medical laws were not included due to their mandatory status.
Conclusion These suggestions for the number of test items for each activity category will be helpful in developing new items for the Korean Nursing Licensing Examination.
Purpose The number of Korean midwifery licensing examination applicants has steadily decreased due to the low birth rate and lack of training institutions for midwives. This study aimed to evaluate the adequacy of the examination-based licensing system and the possibility of a training-based licensing system.
Methods A survey questionnaire was developed and dispatched to 230 professionals from December 28, 2022 to January 13, 2023, through an online form using Google Surveys. Descriptive statistics were used to analyze the results.
Results Responses from 217 persons (94.3%) were analyzed after excluding incomplete responses. Out of the 217 participants, 198 (91.2%) agreed with maintaining the current examination-based licensing system; 94 (43.3%) agreed with implementing a training-based licensing system to cover the examination costs due to the decreasing number of applicants; 132 (60.8%) agreed with establishing a midwifery education evaluation center for a training-based licensing system; 163 (75.1%) said that the quality of midwifery might be lowered if midwives were produced only by a training-based licensing system, and 197 (90.8%) said that the training of midwives as birth support personnel should be promoted in Korea.
Conclusion Favorable results were reported for the examination-based licensing system; however, if a training-based licensing system is implemented, it will be necessary to establish a midwifery education evaluation center to manage the quality of midwives. As the annual number of candidates for the Korean midwifery licensing examination has been approximately 10 in recent years, it is necessary to consider more actively granting midwifery licenses through a training-based licensing system.
Purpose The aim of this study was to identify factors influencing the learning transfer of nursing students in a non-face-to-face educational environment through structural equation modeling and suggest ways to improve the transfer of learning.
Methods In this cross-sectional study, data were collected via online surveys from February 9 to March 1, 2022, from 218 nursing students in Korea. Learning transfer, learning immersion, learning satisfaction, learning efficacy, self-directed learning ability and information technology utilization ability were analyzed using IBM SPSS for Windows ver. 22.0 and AMOS ver. 22.0.
Results The assessment of structural equation modeling showed adequate model fit, with normed χ2=1.74 (P<0.024), goodness-of-fit index=0.97, adjusted goodness-of-fit index=0.93, comparative fit index=0.98, root mean square residual=0.02, Tucker-Lewis index=0.97, normed fit index=0.96, and root mean square error of approximation=0.06. In a hypothetical model analysis, 9 out of 11 pathways of the hypothetical structural model for learning transfer in nursing students were statistically significant. Learning self-efficacy and learning immersion of nursing students directly affected learning transfer, and subjective information technology utilization ability, self-directed learning ability, and learning satisfaction were variables with indirect effects. The explanatory power of immersion, satisfaction, and self-efficacy for learning transfer was 44.4%.
Conclusion The assessment of structural equation modeling indicated an acceptable fit. It is necessary to improve the transfer of learning through the development of a self-directed program for learning ability improvement, including the use of information technology in nursing students’ learning environment in non-face-to-face conditions.
Purpose To ensure faculty members’ active participation in education in response to growing demand, medical schools should clearly describe educational activities in their promotion regulations. This study analyzed how medical education activities were evaluated in the promotion regulations of medical schools in Korea in 2022.
Methods Data were collected from promotion regulations retrieved by searching the websites of 22 medical schools/universities in August 2022. To categorize educational activities and evaluation methods, the Association of American Medical Colleges framework for educational activities was utilized. Correlations between medical schools’ characteristics and the evaluation of medical educational activities were analyzed.
Results We defined 6 categories, including teaching, development of education products, education administration and service, scholarship in education, student affairs, and others, and 20 activities with 57 sub-activities. The average number of included activities was highest in the development of education products category and lowest in the scholarship in education category. The weight adjustment factors of medical educational activities were the characteristics of the target subjects and faculty members, the number of involved faculty members, and the difficulty of activities. Private medical schools tended to have more educational activities in the regulations than public medical schools. The greater the number of faculty members, the greater the number of educational activities in the education administration and service categories.
Conclusion Medical schools in Korea included various medical education activities and their evaluation methods in their promotion regulations. This study provides basic data for improving the reward system for medical faculty members’ efforts in education.