Journal of Educational Evaluation for Health Professions

Research articles

Information amount, accuracy, and relevance of generative artificial intelligence platforms’ answers regarding learning objectives of medical arthropodology evaluated in English and Korean queries in December 2023: a descriptive study: Hyunju Lee, Soobin Park; J Educ Eval Health Prof. 2023;20:39. Published online December 28, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.39

1,065 View
139 Download
1 Crossref

Purpose
This study assessed the performance of 6 generative artificial intelligence (AI) platforms on the learning objectives of medical arthropodology in a parasitology class in Korea. We examined the AI platforms’ performance by querying in Korean and English to determine their information amount, accuracy, and relevance in prompts in both languages.
Methods
From December 15 to 17, 2023, 6 generative AI platforms—Bard, Bing, Claude, Clova X, GPT-4, and Wrtn—were tested on 7 medical arthropodology learning objectives in English and Korean. Clova X and Wrtn are platforms from Korean companies. Responses were evaluated using specific criteria for the English and Korean queries.
Results
Bard had abundant information but was 4th in accuracy and relevance. GPT-4, with high information content, ranked 1st in accuracy and relevance. Clova X was 4th in amount but 2nd in accuracy and relevance. Bing provided less information, with moderate accuracy and relevance. Wrtn’s answers were short, with average accuracy and relevance. Claude had reasonable information, but lower accuracy and relevance. The responses in English were superior in all aspects. Clova X was notably optimized for Korean, leading in relevance.
Conclusion
In a study of 6 generative AI platforms applied to medical arthropodology, GPT-4 excelled overall, while Clova X, a Korea-based AI product, achieved 100% relevance in Korean queries, the highest among its peers. Utilizing these AI platforms in classrooms improved the authors’ self-efficacy and interest in the subject, offering a positive experience of interacting with generative AI platforms to question and receive information.

Negative effects on medical students’ scores for clinical performance during the COVID-19 pandemic in Taiwan: a comparative study: Eunice Jia-Shiow Yuan, Shiau-Shian Huang, Chia-An Hsu, Jiing-Feng Lirng, Tzu-Hao Li, Chia-Chang Huang, Ying-Ying Yang, Chung-Pin Li, Chen-Huan Chen; J Educ Eval Health Prof. 2023;20:37. Published online December 26, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.37

751 View
71 Download

Purpose
Coronavirus disease 2019 (COVID-19) has heavily impacted medical clinical education in Taiwan. Medical curricula have been altered to minimize exposure and limit transmission. This study investigated the effect of COVID-19 on Taiwanese medical students’ clinical performance using online standardized evaluation systems and explored the factors influencing medical education during the pandemic.
Methods
Medical students were scored from 0 to 100 based on their clinical performance from 1/1/2018 to 6/31/2021. The students were placed into pre-COVID-19 (before 2/1/2020) and midst-COVID-19 (on and after 2/1/2020) groups. Each group was further categorized into COVID-19-affected specialties (pulmonary, infectious, and emergency medicine) and other specialties. Generalized estimating equations (GEEs) were used to compare and examine the effects of relevant variables on student performance.
Results
In total, 16,944 clinical scores were obtained for COVID-19-affected specialties and other specialties. For the COVID-19-affected specialties, the midst-COVID-19 score (88.51±3.52) was significantly lower than the pre-COVID-19 score (90.14±3.55) (P<0.0001). For the other specialties, the midst-COVID-19 score (88.32±3.68) was also significantly lower than the pre-COVID-19 score (90.06±3.58) (P<0.0001). There were 1,322 students (837 males and 485 females). Male students had significantly lower scores than female students (89.33±3.68 vs. 89.99±3.66, P=0.0017). GEE analysis revealed that the COVID-19 pandemic (unstandardized beta coefficient=-1.99, standard error [SE]=0.13, P<0.0001), COVID-19-affected specialties (B=0.26, SE=0.11, P=0.0184), female students (B=1.10, SE=0.20, P<0.0001), and female attending physicians (B=-0.19, SE=0.08, P=0.0145) were independently associated with students’ scores.
Conclusion
COVID-19 negatively impacted medical students' clinical performance, regardless of their specialty. Female students outperformed male students, irrespective of the pandemic.

Use of learner-driven, formative, ad-hoc, prospective assessment of competence in physical therapist clinical education in the United States: a prospective cohort study: Carey Holleran, Jeffrey Konrad, Barbara Norton, Tamara Burlis, Steven Ambler; J Educ Eval Health Prof. 2023;20:36. Published online December 8, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.36

709 View
110 Download

Purpose
The purpose of this project was to implement a process for learner-driven, formative, prospective, ad-hoc, entrustment assessment in Doctor of Physical Therapy clinical education. Our goals were to develop an innovative entrustment assessment tool, and then explore whether the tool detected (1) differences between learners at different stages of development and (2) differences within learners across the course of a clinical education experience. We also investigated whether there was a relationship between the number of assessments and change in performance.
Methods
A prospective, observational, cohort of clinical instructors (CIs) was recruited to perform learner-driven, formative, ad-hoc, prospective, entrustment assessments. Two entrustable professional activities (EPAs) were used: (1) gather a history and perform an examination and (2) implement and modify the plan of care, as needed. CIs provided a rating on the entrustment scale and provided narrative support for their rating.
Results
Forty-nine learners participated across 4 clinical experiences (CEs), resulting in 453 EPA learner-driven assessments. For both EPAs, statistically significant changes were detected both between learners at different stages of development and within learners across the course of a CE. Improvement within each CE was significantly related to the number of feedback opportunities.
Conclusion
The results of this pilot study provide preliminary support for the use of learner-driven, formative, ad-hoc assessments of competence based on EPAs with a novel entrustment scale. The number of formative assessments requested correlated with change on the EPA scale, suggesting that formative feedback may augment performance improvement.

Effect of a transcultural nursing course on improving the cultural competency of nursing graduate students in Korea: a before-and-after study: Kyung Eui Bae, Geum Hee Jeong; J Educ Eval Health Prof. 2023;20:35. Published online December 4, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.35

794 View
122 Download

Purpose
This study aimed to evaluate the impact of a transcultural nursing course on enhancing the cultural competency of graduate nursing students in Korea. We hypothesized that participants’ cultural competency would significantly improve in areas such as communication, biocultural ecology and family, dietary habits, death rituals, spirituality, equity, and empowerment and intermediation after completing the course. Furthermore, we assessed the participants’ overall satisfaction with the course.
Methods
A before-and-after study was conducted with graduate nursing students at Hallym University, Chuncheon, Korea, from March to June 2023. A transcultural nursing course was developed based on Giger & Haddad’s transcultural nursing model and Purnell’s theoretical model of cultural competence. Data were collected using a cultural competence scale for registered nurses developed by Kim and his colleagues. A total of 18 students participated, and the paired t-test was employed to compare pre- and post-intervention scores.
Results
The study revealed significant improvements in all 7 categories of cultural nursing competence (P<0.01). Specifically, the mean differences in scores (pre–post) ranged from 0.74 to 1.09 across the categories. Additionally, participants expressed high satisfaction with the course, with an average score of 4.72 out of a maximum of 5.0.
Conclusion
The transcultural nursing course effectively enhanced the cultural competency of graduate nursing students. Such courses are imperative to ensure quality care for the increasing multicultural population in Korea.
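
The paired t-test named in the Methods above can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name `paired_t` and the scores are hypothetical, and the differences are taken as post minus pre, which is an assumption about sign convention. It returns only the t statistic and degrees of freedom; a P value would need a t-distribution table or a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    Differences are computed as post minus pre (an assumed convention).
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical scores for illustration only, not data from the study.
pre = [3.0, 3.5, 2.5, 4.0, 3.0]
post = [4.0, 4.0, 3.5, 4.5, 4.5]
t, df = paired_t(pre, post)
print(f"t = {t:.3f}, df = {df}")
```

Because each participant serves as their own control, the test works on the per-person differences rather than on the two groups of scores separately.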

Effect of motion-graphic video-based training on the performance of operating room nurse students in cataract surgery in Iran: a randomized controlled study: Behnaz Fatahi, Samira Fatahi, Sohrab Nosrati, Masood Bagheri; J Educ Eval Health Prof. 2023;20:34. Published online November 28, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.34

1,069 View
81 Download

Purpose
The present study was conducted to determine the effect of motion-graphic video-based training on the performance of operating room nurse students in cataract surgery using phacoemulsification at Kermanshah University of Medical Sciences in Iran.
Methods
This was a randomized controlled study conducted among 36 students training to become operating room nurses. The control group only received routine training, and the intervention group received motion-graphic video-based training on the scrub nurse’s performance in cataract surgery in addition to the educator’s training. The performance of the students in both groups as scrub nurses was measured through a researcher-made checklist in a pre-test and a post-test.
Results
The mean scores for performance in the pre-test and post-test were 17.83 and 26.44 in the control group and 18.33 and 50.94 in the intervention group, respectively, and a significant difference was identified between the mean scores of the pre- and post-test in both groups (P=0.001). The intervention also led to a significant increase in the mean performance score in the intervention group compared to the control group (P=0.001).
Conclusion
Considering the significant difference in the performance score of the intervention group compared to the control group, motion-graphic video-based training had a positive effect on the performance of operating room nurse students, and such training can be used to improve clinical training.

Assessment of the viability of integrating virtual reality programs in practical tests for the Korean Radiological Technologists Licensing Examination: a survey study: Hye Min Park, Eun Seong Kim, Deok Mun Kwon, Pyong Kon Cho, Seoung Hwan Kim, Ki Baek Lee, Seong Hu Kim, Moon Il Bong, Won Seok Yang, Jin Eui Kim, Gi Bong Kang, Yong Su Yoon, Jung Su Kim; J Educ Eval Health Prof. 2023;20:33. Published online November 28, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.33

923 View
89 Download

Purpose
The objective of this study was to assess the feasibility of incorporating virtual reality/augmented reality (VR/AR) programs into practical tests administered as part of the Korean Radiological Technologists Licensing Examination (KRTLE). This evaluation is grounded in a comprehensive survey that targeted enrolled students in departments of radiology across the nation.
Methods
In total, 682 students from radiology departments across the nation were participants in the survey. An online survey platform was used, and the questionnaire was structured into 5 distinct sections and 27 questions. A frequency analysis for each section of the survey was conducted using IBM SPSS ver. 27.0.
Results
Direct or indirect exposure to VR/AR content was reported by 67.7% of all respondents. Furthermore, 55.4% of the respondents expressed that VR/AR could be integrated into their classes, which signified a widespread acknowledgment of VR among the students. With regard to the integration of a VR/AR or mixed reality program into the practical tests of the KRTLE, a substantial proportion of the respondents (57.3%) exhibited a positive inclination and recommended its introduction.
Conclusion
The application of VR/AR programs within practical tests of the KRTLE will be used as an alternative for evaluating clinical examination procedures and validating job skills.

Brief report

ChatGPT (GPT-3.5) as an assistant tool in microbial pathogenesis studies in Sweden: a cross-sectional comparative study: Catharina Hultgren, Annica Lindkvist, Volkan Özenci, Sophie Curbo; J Educ Eval Health Prof. 2023;20:32. Published online November 22, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.32

820 View
94 Download
1 Web of Science
2 Crossref

ChatGPT (GPT-3.5) has entered higher education and there is a need to determine how to use it effectively. This descriptive study compared the ability of GPT-3.5 and teachers to answer questions from dental students and construct detailed intended learning outcomes. When analyzed according to a Likert scale, we found that GPT-3.5 answered the questions from dental students in a similar or even more elaborate way compared to the answers that had previously been provided by a teacher. GPT-3.5 was also asked to construct detailed intended learning outcomes for a course in microbial pathogenesis, and when these were analyzed according to a Likert scale they were, to a large degree, found irrelevant. Since students are using GPT-3.5, it is important that instructors learn how to make the best use of it both to be able to advise students and to benefit from its potential.

Research articles

Performance of ChatGPT, Bard, Claude, and Bing on the Peruvian National Licensing Medical Examination: a cross-sectional study: Betzy Clariza Torres-Zegarra, Wagner Rios-Garcia, Alvaro Micael Ñaña-Cordova, Karen Fatima Arteaga-Cisneros, Xiomara Cristina Benavente Chalco, Marina Atena Bustamante Ordoñez, Carlos Jesus Gutierrez Rios, Carlos Alberto Ramos Godoy, Kristell Luisa Teresa Panta Quezada, Jesus Daniel Gutierrez-Arratia, Javier Alejandro Flores-Cohaila; J Educ Eval Health Prof. 2023;20:30. Published online November 20, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.30

1,219 View
159 Download
4 Web of Science
4 Crossref

Purpose
We aimed to describe the performance and evaluate the educational value of justifications provided by artificial intelligence chatbots, including GPT-3.5, GPT-4, Bard, Claude, and Bing, on the Peruvian National Medical Licensing Examination (P-NLME).
Methods
This was a cross-sectional analytical study. On July 25, 2023, each multiple-choice question (MCQ) from the P-NLME was entered into each chatbot (GPT-3.5, GPT-4, Bing, Bard, and Claude) 3 times. Then, 4 medical educators categorized the MCQs in terms of medical area, item type, and whether the MCQ required Peru-specific knowledge. They assessed the educational value of the justifications from the 2 top performers (GPT-4 and Bing).
Results
GPT-4 scored 86.7% and Bing scored 82.2%, followed by Bard and Claude, and the historical performance of Peruvian examinees was 55%. Among the factors associated with correct answers, only MCQs that required Peru-specific knowledge had lower odds (odds ratio, 0.23; 95% confidence interval, 0.09–0.61), whereas the remaining factors showed no associations. In assessing the educational value of justifications provided by GPT-4 and Bing, neither showed any significant differences in certainty, usefulness, or potential use in the classroom.
Conclusion
Among chatbots, GPT-4 and Bing were the top performers, with Bing performing better at Peru-specific MCQs. Moreover, the educational value of justifications provided by GPT-4 and Bing could be deemed appropriate. However, it is essential to start addressing the educational value of these chatbots, rather than merely their performance on examinations.

Medical students’ patterns of using ChatGPT as a feedback tool and perceptions of ChatGPT in a Leadership and Communication course in Korea: a cross-sectional study: Janghee Park; J Educ Eval Health Prof. 2023;20:29. Published online November 10, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.29

1,371 View
136 Download
2 Web of Science
4 Crossref

Purpose
This study aimed to analyze patterns of using ChatGPT before and after group activities and to explore medical students’ perceptions of ChatGPT as a feedback tool in the classroom.
Methods
The study included 99 2nd-year pre-medical students who participated in a “Leadership and Communication” course from March to June 2023. Students engaged in both individual and group activities related to negotiation strategies. ChatGPT was used to provide feedback on their solutions. A survey was administered to assess students’ perceptions of ChatGPT’s feedback, its use in the classroom, and the strengths and challenges of ChatGPT from May 17 to 19, 2023.
Results
The students responded by indicating that ChatGPT’s feedback was helpful, and revised and resubmitted their group answers in various ways after receiving feedback. The majority of respondents expressed agreement with the use of ChatGPT during class. The most common response concerning the appropriate context of using ChatGPT’s feedback was “after the first round of discussion, for revisions.” There was a significant difference in satisfaction with ChatGPT’s feedback, including correctness, usefulness, and ethics, depending on whether or not ChatGPT was used during class, but there was no significant difference according to gender or whether students had previous experience with ChatGPT. The strongest advantages were “providing answers to questions” and “summarizing information,” and the worst disadvantage was “producing information without supporting evidence.”
Conclusion
The students were aware of the advantages and disadvantages of ChatGPT, and they had a positive attitude toward using ChatGPT in the classroom.

Mentorship and self-efficacy are associated with lower burnout in physical therapists in the United States: a cross-sectional survey study: Matthew Pugliese, Jean-Michel Brismée, Brad Allen, Sean Riley, Justin Tammany, Paul Mintken; J Educ Eval Health Prof. 2023;20:27. Published online September 27, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.27

2,642 View
265 Download

Purpose
This study investigated the prevalence of burnout in physical therapists in the United States and the relationships between burnout and education, mentorship, and self-efficacy.
Methods
This was a cross-sectional survey study. An electronic survey was distributed to practicing physical therapists across the United States over a 6-week period from December 2020 to January 2021. The survey was completed by 2,813 physical therapists from all states. The majority were female (68.72%), White or Caucasian (80.13%), and employed full-time (77.14%). Respondents completed questions on demographics, education, mentorship, self-efficacy, and burnout. The Burnout Clinical Subtypes Questionnaire 12 (BCSQ-12) and self-reports were used to quantify burnout, and the General Self-Efficacy Scale (GSES) was used to measure self-efficacy. Descriptive and inferential analyses were performed.
Results
Respondents from home health (median BCSQ-12=42.00) and skilled nursing facility settings (median BCSQ-12=42.00) displayed the highest burnout scores. Burnout was significantly lower among those who provided formal mentorship (median BCSQ-12=39.00, P=0.0001) compared to no mentorship (median BCSQ-12=41.00). Respondents who received formal mentorship (median BCSQ-12=38.00, P=0.0028) displayed significantly lower burnout than those who received no mentorship (median BCSQ-12=41.00). A moderate negative correlation (rho=-0.49) was observed between the GSES and burnout scores. A strong positive correlation was found between self-reported burnout status and burnout scores (r_rb=0.61).
Conclusion
Burnout is prevalent in the physical therapy profession, as almost half of respondents (49.34%) reported burnout. Providing or receiving mentorship and higher self-efficacy were associated with lower burnout. Organizations should consider measuring burnout levels, investing in mentorship programs, and implementing strategies to improve self-efficacy.

Development and validation of the student ratings in clinical teaching scale in Australia: a methodological study: Pin-Hsiang Huang, Anthony John O’Sullivan, Boaz Shulruf; J Educ Eval Health Prof. 2023;20:26. Published online September 5, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.26

917 View
113 Download

Purpose
This study aimed to devise a valid measurement for assessing clinical students’ perceptions of teaching practices.
Methods
A new tool was developed based on a meta-analysis encompassing effective clinical teaching-learning factors. Seventy-nine items were generated using a frequency (never to always) scale. The tool was applied to the University of New South Wales year 2, 3, and 6 medical students. Exploratory and confirmatory factor analysis (exploratory factor analysis [EFA] and confirmatory factor analysis [CFA], respectively) were conducted to establish the tool’s construct validity and goodness of fit, and Cronbach’s α was used for reliability.
Results
In total, 352 students (44.2%) completed the questionnaire. The EFA identified student-centered learning, problem-solving learning, self-directed learning, and visual technology (reliability, 0.77 to 0.89). CFA showed acceptable goodness of fit (chi-square P<0.01, comparative fit index=0.930 and Tucker-Lewis index=0.917, root mean square error of approximation=0.069, standardized root mean square residual=0.06).
Conclusion
The established tool—Student Ratings in Clinical Teaching (STRICT)—is a valid and reliable tool that demonstrates how students perceive clinical teaching efficacy. STRICT measures the frequency of teaching practices to mitigate the biases of acquiescence and social desirability. Clinical teachers may use the tool to adapt their teaching practices with more active learning activities and to utilize visual technology to facilitate clinical learning efficacy. Clinical educators may apply STRICT to assess how these teaching practices are implemented in current clinical settings.
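
As an illustration of the reliability statistic named in the Methods above, Cronbach’s α can be computed as k/(k−1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below is a minimal version under stated assumptions: the function name `cronbach_alpha` and the response matrix are hypothetical, and this is not the authors' code or data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(rows) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least 2 items and equal-length rows")
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses (4 respondents x 3 items), for illustration only.
responses = [
    [1, 2, 2],
    [2, 3, 3],
    [3, 3, 4],
    [4, 5, 5],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together, which is what the reported range of 0.77 to 0.89 reflects for each factor.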

Effect of an interprofessional simulation program on patient safety competencies of healthcare professionals in Switzerland: a before and after study: Sylvain Boloré, Thomas Fassier, Nicolas Guirimand; J Educ Eval Health Prof. 2023;20:25. Published online August 28, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.25

1,342 View
141 Download

Purpose
This study aimed to identify the effects of a 12-week interprofessional simulation program, operated between February 2020 and January 2021, on the patient safety competencies of healthcare professionals in Switzerland.
Methods
The simulation training was based on 2 scenarios of hospitalized patients with septic shock and respiratory failure, and trainees were expected to demonstrate patient safety competencies. A single-group before-and-after study of the intervention (the simulation program) was conducted, using a measurement tool (the Health Professional Education in Patient Safety Survey) to measure the perceived competencies of physicians, nurses, and nursing assistants. Out of 57 participants, 37 answered the questionnaire surveys 4 times: a pre-survey 48 hours before the training, followed by post-surveys at 24 hours, 6 weeks, and 12 weeks after the training. A linear mixed-effects model was applied for the analysis.
Results
Four of the 6 perceived patient safety competencies improved at 6 weeks but returned to a level similar to that before training at 12 weeks. The competencies of “communicating effectively,” “managing safety risks,” “understanding human and environmental factors that influence patient safety,” and “recognize and respond to remove immediate risks of harm” showed statistically significant changes both overall and in the comparison between before the training and 6 weeks after the training.
Conclusion
Interprofessional simulation programs contributed to developing some areas of patient safety competencies of healthcare professionals, but only for a limited time. Interprofessional simulation programs should be repeated and combined with other forms of support, including case discussions and debriefings, to ensure lasting effects.

Development of a character qualities test for medical students in Korea using polytomous item response theory and factor analysis: a preliminary scale development study: Yera Hur, Dong Gi Seo; J Educ Eval Health Prof. 2023;20:20. Published online June 26, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.20

1,271 View
101 Download

Purpose
This study aimed to develop a test scale to measure the character qualities of medical students as a follow-up study on the 8 core character qualities revealed in a previous report.
Methods
In total, 160 preliminary items were developed to measure 8 core character qualities. Twenty questions were assigned to each quality, and a questionnaire survey was conducted among 856 students in 5 medical schools in Korea. Using the partial credit model, polytomous item response theory analysis was carried out to analyze the goodness-of-fit, followed by exploratory factor analysis. Finally, confirmatory factor and reliability analyses were conducted with the final selected items.
Results
The preliminary items for the 8 core character qualities were administered to the participants. Data from 767 students were included in the final analysis. Of the 160 preliminary items, 25 were removed by classical test theory analysis and 17 more by polytomous item response theory assessment. A total of 118 items and sub-factors were selected for exploratory factor analysis. Finally, 79 items were selected, and the validity and reliability were confirmed through confirmatory factor analysis and intra-item relevance analysis.
Conclusion
The character qualities test scale developed through this study can be used to measure the character qualities corresponding to the educational goals and visions of individual medical schools in Korea. Furthermore, this measurement tool can serve as primary data for developing character qualities tools tailored to each medical school’s vision and educational goals.

Enhancement of the technical and non-technical skills of nurse anesthesia students using the Anesthetic List Management Assessment Tool in Iran: a quasi-experimental study: Ali Khalafi, Maedeh Kordnejad, Vahid Saidkhani; J Educ Eval Health Prof. 2023;20:19. Published online June 16, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.19

1,095 View
80 Download

Purpose
This study investigated the effect of evaluations based on the Anesthetic List Management Assessment Tool (ALMAT) form on improving the technical and non-technical skills of final-year nurse anesthesia students at Ahvaz Jundishapur University of Medical Sciences (AJUMS).
Methods
This was a semi-experimental study with a pre-test and post-test design. It included 45 final-year nurse anesthesia students of AJUMS and lasted for 3 months. The technical and non-technical skills of the intervention group were assessed at 4 university hospitals using formative-feedback evaluation based on the ALMAT form, from induction of anesthesia until reaching mastery and independence. Finally, the students’ degree of improvement in technical and non-technical skills was compared between the intervention and control groups. Statistical tests (the independent t-test, paired t-test, and Mann-Whitney test) were used to analyze the data.
Results
The rate of improvement in post-test scores of technical skills was significantly higher in the intervention group than in the control group (P<0.0001). Similarly, the students in the intervention group received significantly higher post-test scores for non-technical skills than the students in the control group (P<0.0001).
Conclusion
The findings of this study showed that the use of ALMAT as a formative-feedback evaluation method to evaluate technical and non-technical skills had a significant effect on improving these skills and was effective in helping students learn and reach mastery and independence.

Brief report

Comparing ChatGPT’s ability to rate the degree of stereotypes and the consistency of stereotype attribution with those of medical students in New Zealand in developing a similarity rating test: a methodological study: Chao-Cheng Lin, Zaine Akuhata-Huntington, Che-Wei Hsu; J Educ Eval Health Prof. 2023;20:17. Published online June 12, 2023; DOI: https://doi.org/10.3352/jeehp.2023.20.17

1,717 View
128 Download
1 Web of Science
1 Crossref

Learning about one’s implicit bias is crucial for improving one’s cultural competency and thereby reducing health inequity. To evaluate bias among medical students following a previously developed cultural training program targeting New Zealand Māori, we developed a text-based, self-evaluation tool called the Similarity Rating Test (SRT). The development process of the SRT was resource-intensive, limiting its generalizability and applicability. Here, we explored the potential of ChatGPT, an automated chatbot, to assist in the development process of the SRT by comparing ChatGPT’s and students’ evaluations of the SRT. Despite results showing non-significant equivalence and difference between ChatGPT’s and students’ ratings, ChatGPT’s ratings were more consistent than students’ ratings. The consistency rate was higher for non-stereotypical than for stereotypical statements, regardless of rater type. Further studies are warranted to validate ChatGPT’s potential for assisting in SRT development for implementation in medical education and evaluation of ethnic stereotypes and related topics.


Data sharing