Purpose This study evaluated the Dr Lee Jong-wook Fellowship Program’s impact on Tanzania’s health workforce, focusing on relevance, effectiveness, efficiency, impact, and sustainability in addressing healthcare gaps.
Methods A mixed-methods research design was employed. Data were collected from 97 out of 140 alumni through an online survey, 35 in-depth interviews, and one focus group discussion. The study was conducted from November to December 2023 and included alumni from 2009 to 2022. Measurement instruments included structured questionnaires for quantitative data and semi-structured guides for qualitative data. Quantitative analysis involved descriptive and inferential statistics (Spearman’s rank correlation, non-parametric tests) using Python ver. 3.11.0 and Stata ver. 14.0. Thematic analysis was employed to analyze qualitative data using NVivo ver. 12.0.
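As an illustration of the correlational step described above, here is a minimal Python sketch of a Spearman rank correlation between two evaluation criteria; the score vectors and variable names are hypothetical placeholders, not the study’s data.

    # Minimal sketch: Spearman's rank correlation between two criterion scores.
    # The arrays are illustrative placeholders, not the study's dataset.
    from scipy.stats import spearmanr

    effectiveness = [86, 92, 78, 88, 95, 81, 90]  # hypothetical per-alumnus scores
    impact = [84, 94, 75, 90, 97, 80, 88]

    rho, p_value = spearmanr(effectiveness, impact)
    print(f"rho={rho:.3f}, P={p_value:.3f}")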
Results Findings indicated high relevance (mean=91.6, standard deviation [SD]=8.6), effectiveness (mean=86.1, SD=11.2), efficiency (mean=82.7, SD=10.2), and impact (mean=87.7, SD=9.9), with improved skills, confidence, and institutional service quality. However, sustainability had a lower score (mean=58.0, SD=11.1), reflecting challenges in follow-up support and resource allocation. Effectiveness strongly correlated with impact (ρ=0.746, P<0.001). The qualitative findings revealed that participants valued tailored training but highlighted barriers, such as language challenges and insufficient practical components. Alumni-led initiatives contributed to knowledge sharing, but limited resources constrained sustainability.
Conclusion The Fellowship Program enhanced Tanzania’s health workforce capacity, but it requires localized curricula and strengthened alumni networks for sustainability. These findings provide actionable insights for improving similar programs globally, confirming the hypothesis that tailored training positively influences workforce and institutional outcomes.
Purpose To generate Cronbach’s α reliability estimates and further mixed-methods construct validity evidence for the Blended Learning Usability Evaluation–Questionnaire (BLUE-Q).
Methods Forty interprofessional clinicians completed the BLUE-Q after finishing a 3-month blended learning professional development program in Ontario, Canada. Reliability was assessed with Cronbach’s α for each of the 3 sections of the BLUE-Q and for all quantitative items together. Construct validity was evaluated through the Grand-Guillaume-Perrenoud et al. framework, which consists of 3 elements: congruence, convergence, and credibility. To compare quantitative and qualitative results, descriptive statistics, including means and standard deviations for each Likert-scale item of the BLUE-Q, were calculated.
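As an illustration of the reliability analysis, the following is a minimal Python sketch of Cronbach’s α for one questionnaire section, computed from a hypothetical response matrix (the data and item count are placeholders, not BLUE-Q responses).

    # Minimal sketch: Cronbach's alpha for one questionnaire section.
    # Rows are respondents, columns are items; values are placeholders.
    import numpy as np

    responses = np.array([
        [5, 4, 5, 5],
        [4, 4, 4, 5],
        [5, 5, 5, 5],
        [3, 4, 4, 4],
    ])

    k = responses.shape[1]                         # number of items
    item_vars = responses.var(axis=0, ddof=1)      # variance of each item
    total_var = responses.sum(axis=1).var(ddof=1)  # variance of summed scores
    alpha = k / (k - 1) * (1 - item_vars.sum() / total_var)
    print(f"Cronbach's alpha = {alpha:.2f}")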
Results Cronbach’s α was 0.95 for the pedagogical usability section, 0.85 for the synchronous modality section, 0.93 for the asynchronous modality section, and 0.96 for all quantitative items together. Mean ratings (with standard deviations) were 4.77 (0.506) for pedagogy, 4.64 (0.654) for synchronous learning, and 4.75 (0.536) for asynchronous learning. Of the 239 qualitative comments received, 178 were identified as substantive, of which 88% were considered congruent and 79% were considered convergent with the high means. Among all congruent responses, 69% were considered confirming statements and 31% were considered clarifying statements, suggesting appropriate credibility. Analysis of the clarifying statements assisted in identifying 5 categories of suggestions for program improvement.
Conclusion The BLUE-Q demonstrates high reliability and appropriate construct validity in the context of a blended learning program with interprofessional clinicians, making it a valuable tool for comprehensive program evaluation, quality improvement, and evaluative research in health professions education.
The peer review process ensures the integrity of scientific research. This is particularly important in the medical field, where research findings directly impact patient care. However, the rapid growth of publications has strained reviewers, causing delays and potential declines in quality. Generative artificial intelligence, especially large language models (LLMs) such as ChatGPT, may assist researchers with efficient, high-quality reviews. This review explores the integration of LLMs into peer review, highlighting their strengths in linguistic tasks and challenges in assessing scientific validity, particularly in clinical medicine. Key points for integration include initial screening, reviewer matching, feedback support, and language review. However, implementing LLMs for these purposes will necessitate addressing biases, privacy concerns, and data confidentiality. We recommend using LLMs as complementary tools under clear guidelines to support, not replace, human expertise in maintaining rigorous peer review standards.
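As a hypothetical illustration of the language-review use case mentioned above, the sketch below confines an LLM to linguistic feedback via the OpenAI Python SDK; the model name, prompt wording, and helper function are illustrative assumptions, not a recommended production configuration.

    # Hypothetical sketch: an LLM confined to the language-review step.
    # Model name and prompt are assumptions; scientific judgment stays human.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def language_review(manuscript_excerpt: str) -> str:
        """Return language-level suggestions only, not a validity assessment."""
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": ("You are a copyeditor. Comment only on grammar, "
                             "clarity, and style. Do not judge scientific "
                             "validity or novelty.")},
                {"role": "user", "content": manuscript_excerpt},
            ],
        )
        return response.choices[0].message.content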
Purpose This research presents an experimental study using validated questionnaires to quantitatively assess the outcomes of art-based observational training in medical students, residents, and specialists. The study tested the hypothesis that art-based observational training would lead to measurable effects on judgement skills (tolerance of ambiguity) and empathy in medical students and doctors.
Methods An experimental cohort study with pre- and post-intervention assessments was conducted using validated questionnaires and qualitative evaluation forms to examine the outcomes of art-based observational training in medical students and doctors. Between December 2023 and June 2024, 15 art courses were conducted in the Rijksmuseum in Amsterdam. Participants were assessed on empathy using the Jefferson Scale of Empathy (JSE) and tolerance of ambiguity using the Tolerance of Ambiguity in Medical Students and Doctors scale (TAMSAD).
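As an illustration of the pre/post design, below is a minimal Python sketch of a paired-samples comparison; the abstract does not name the statistical test used, so the paired t-test and the score vectors are assumptions for illustration only.

    # Minimal sketch: pre/post comparison of questionnaire scores.
    # The paired t-test and the data are illustrative assumptions.
    from scipy.stats import ttest_rel

    pre_scores = [105, 98, 112, 101, 95]    # hypothetical JSE totals before the course
    post_scores = [109, 102, 114, 106, 99]  # hypothetical JSE totals after the course

    t_stat, p_value = ttest_rel(post_scores, pre_scores)
    print(f"t={t_stat:.2f}, P={p_value:.3f}")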
Results In total, 91 participants were included; 29 completed the JSE and 62 completed the TAMSAD. The results showed statistically significant post-test increases in mean JSE and TAMSAD scores (3.71 points on the JSE, which ranges from 20 to 140, and 1.86 points on the TAMSAD, which ranges from 0 to 100). The qualitative findings were predominantly positive.
Conclusion The results suggest that incorporating art-based observational training in medical education improves empathy and tolerance of ambiguity. This study highlights the value of such training for the professional development of medical students and doctors.
Purpose This study aimed to explore pharmacy students’ perceptions of remote flipped classrooms in Malaysia, focusing on their learning experiences and identifying areas for potential improvement to inform future educational strategies.
Methods A qualitative approach was employed, utilizing inductive thematic analysis. Twenty Bachelor of Pharmacy students (18 women, 2 men; age range, 19–24 years) from Monash University participated in 8 focus group discussions over 2 rounds during the coronavirus disease 2019 pandemic (2020–2021). Participants were recruited via convenience sampling. The focus group discussions, led by experienced academics, were conducted in English via Zoom, recorded, and transcribed for analysis using NVivo. Themes were identified through emergent coding and iterative discussions to ensure thematic saturation.
Results Five major themes emerged: flexibility, communication, technological challenges, skill-based learning challenges, and time-based effects. Students appreciated the flexibility of accessing and reviewing pre-class materials at their convenience. Increased engagement through anonymous question submission was noted, yet communication difficulties and lack of non-verbal cues in remote workshops were significant drawbacks. Technological issues, such as internet connectivity problems, hindered learning, especially during assessments. Skill-based learning faced challenges in remote settings, including lab activities and clinical examinations. Additionally, prolonged remote learning led to feelings of isolation, fatigue, and a desire to return to in-person interactions.
Conclusion Remote flipped classrooms offer flexibility and engagement benefits but present notable challenges related to communication, technology, and skill-based learning. To improve remote education, institutions should integrate robust technological support, enhance communication strategies, and incorporate virtual simulations for practical skills. Balancing asynchronous and synchronous methods while addressing academic success and socioemotional wellness is essential for effective remote learning environments.
Purpose This study aimed to examine the reliability and validity of a measurement tool for portfolio assessments in medical education. Specifically, it investigated scoring consistency among raters and assessment criteria appropriateness according to an expert panel.
Methods A cross-sectional observational study was conducted from September to December 2018 for the Introduction to Clinical Medicine course at the Ewha Womans University College of Medicine. Data were collected for 5 randomly selected portfolios scored by a gold-standard rater and 6 trained raters. An expert panel assessed the validity of 12 assessment items using the content validity index (CVI). Statistical analysis included Pearson correlation coefficients for rater alignment, the intraclass correlation coefficient (ICC) for inter-rater reliability, and the CVI for item-level validity.
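As an illustration of the two reliability statistics named above, here is a minimal Python sketch computing a rater-versus-gold-standard Pearson correlation and an inter-rater ICC; the pingouin package and all scores are assumptions for illustration, as the abstract does not state the software used.

    # Minimal sketch: Pearson correlation with a gold-standard rater, plus ICC.
    # All scores are illustrative placeholders, not the study's data.
    import pandas as pd
    import pingouin as pg
    from scipy.stats import pearsonr

    gold = [18, 22, 15, 20, 17]   # hypothetical gold-standard portfolio scores
    rater = [17, 23, 14, 21, 18]  # hypothetical scores from one trained rater
    r, p = pearsonr(rater, gold)
    print(f"Pearson r = {r:.4f} (P={p:.3f})")

    # Long-format table: each portfolio scored by each rater.
    long_df = pd.DataFrame({
        "portfolio": [1, 2, 3, 1, 2, 3],
        "rater": ["A", "A", "A", "B", "B", "B"],
        "score": [18, 22, 15, 17, 23, 14],
    })
    icc = pg.intraclass_corr(data=long_df, targets="portfolio",
                             raters="rater", ratings="score")
    print(icc[["Type", "ICC"]])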
Results Rater 1 had the highest Pearson correlation with the gold-standard rater (0.8916), while Rater 5 had the lowest (0.4203). The ICC for all raters was 0.3821, improving to 0.4415 after excluding Raters 1 and 5, indicating a 15.6% increase in reliability. Most assessment items met the CVI threshold of ≥0.75, with some achieving a perfect score (CVI=1.0); however, items such as “sources” and “level and degree of performance” showed lower validity (CVI=0.72).
Conclusion The present measurement tool for portfolio assessments demonstrated moderate reliability and strong validity, supporting its use as a credible tool. For a more reliable portfolio assessment, more faculty training is needed.
Purpose Paramedicine education often uses high-fidelity simulations that mimic real-life emergencies. These experiences can trigger stress responses characterized by physiological changes, including alterations in cerebral blood flow and oxygenation. Functional near-infrared spectroscopy (fNIRS) is emerging as a promising tool for assessing cognitive stress in educational settings. This study explored the use of fNIRS to measure neural activity in paramedicine students during high-acuity simulations.
Methods Eight final-year undergraduate paramedicine students completed 2 high-acuity scenarios 7 days apart. Real-time continuous recording of cerebral blood flow and oxygenation levels in the prefrontal cortex was undertaken via fNIRS as a means of assessing neural activity during stressful scenarios.
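As a hypothetical illustration of an fNIRS preprocessing pipeline, the sketch below converts a raw recording into haemoglobin concentration signals with MNE-Python; the library, file path, and parameters are assumptions, since the abstract does not specify the analysis software.

    # Hypothetical sketch: raw fNIRS data to haemoglobin concentrations.
    # Library choice, path, and ppf value are illustrative assumptions.
    import mne

    raw = mne.io.read_raw_nirx("scenario1_recording/", preload=True)  # placeholder path
    raw_od = mne.preprocessing.nirs.optical_density(raw)
    raw_haemo = mne.preprocessing.nirs.beer_lambert_law(raw_od, ppf=6.0)

    # Prefrontal oxygenation can then be inspected or epoched around key
    # events, such as a significant clinical decision in the scenario.
    raw_haemo.plot(duration=60)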
Results fNIRS accurately identified periods of increased cerebral oxygenation when participants were performing highly technical skills or making significant clinical decisions.
Conclusion fNIRS holds potential for objectively measuring the cognitive load in undergraduate paramedicine students. By providing real-time insights into neurophysiological responses, fNIRS may enhance training outcomes in paramedicine programs and improve student well-being (Australian New Zealand Clinical Trials Registry: ACTRN12623001214628).
Purpose This study aimed to develop and validate the 21st Century Skills Assessment Scale (21CSAS) for Thai public health (PH) undergraduate students using the Partnership for 21st Century Skills framework.
Methods A cross-sectional survey was conducted among 727 first- to fourth-year PH undergraduate students from 4 autonomous universities in Thailand. Data were collected using self-administered questionnaires between January and March 2023. Exploratory factor analysis (EFA) was used to explore the underlying dimensions of 21CSAS, while confirmatory factor analysis (CFA) was conducted to test the hypothesized factor structure using Mplus software (Muthén & Muthén). Reliability and item discrimination were assessed using Cronbach’s α and the corrected item-total correlation, respectively.
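The study ran its factor analyses in Mplus; as a rough Python analogue of the EFA step, the sketch below extracts a 6-factor solution with the factor_analyzer package from a placeholder response matrix (random data standing in for 21CSAS item scores).

    # Rough Python analogue of the EFA step (the study itself used Mplus).
    # The response matrix is a random placeholder, not real 21CSAS data.
    import numpy as np
    from factor_analyzer import FactorAnalyzer

    rng = np.random.default_rng(0)
    responses = rng.integers(1, 6, size=(300, 20))  # 300 students x 20 Likert items

    fa = FactorAnalyzer(n_factors=6, rotation="promax")
    fa.fit(responses)
    print(fa.loadings_.round(2))        # item-by-factor loading matrix
    print(fa.get_factor_variance()[1])  # proportion of variance per factor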
Results EFA performed on a dataset of 300 students revealed a 20-item scale with a 6-factor structure: (1) creativity and innovation; (2) critical thinking and problem-solving; (3) information, media, and technology; (4) communication and collaboration; (5) initiative and self-direction; and (6) social and cross-cultural skills. The rotated eigenvalues ranged from 1.73 to 2.12. CFA performed on another dataset of 427 students confirmed a good model fit (χ2/degrees of freedom=2.67, comparative fit index=0.93, Tucker-Lewis index=0.91, root mean square error of approximation=0.06, standardized root mean square residual=0.06), explaining 34%–71% of variance in the items. Item loadings ranged from 0.58 to 0.84. The 21CSAS had a Cronbach’s α of 0.92.
Conclusion The 21CSAS proved to be a valid and reliable tool for assessing 21st century skills among Thai PH undergraduate students. These findings provide insights for educational systems to inform policy, practice, and research regarding 21st-century skills among undergraduate students.
The introduction of modern Western medicine in the late 19th century, notably through vaccination initiatives, marked the beginning of governmental involvement in medical licensure, with the licensing of doctors who performed vaccinations. The establishment of the national medical school “Euihakkyo” in 1899 further formalized medical education and licensure, granting graduates the privilege to practice medicine without additional examinations. The Regulations on Doctors, enacted by the Joseon government in 1900, aimed to comprehensively define doctor qualifications, encompassing both modern and traditional practitioners. However, resistance from the traditional medical community hindered its full implementation. During the Japanese colonial occupation of the Korean Peninsula from 1910 to 1945, the medical licensure system was controlled by colonial authorities, leading to the marginalization of traditional Korean medicine and the imposition of imperial hierarchical structures. Following liberation from Japanese colonial rule in 1945, the Korean government undertook significant reforms, culminating in the National Medical Law, enacted in 1951. This law redefined doctor qualifications and reinstated the status of traditional Korean medicine. The introduction of national examinations for physicians increased state involvement in ensuring medical competence. The privatization of the Korean Medical Licensing Examination led to the establishment of the Korea Health Personnel Licensing Examination Institute in 1992, which assumed responsibility for administering licensing examinations for all healthcare workers. This shift reflected a move toward specialized management of professional standards. The evolution of Korea’s medical licensure system illustrates a dynamic process shaped by historical context, balancing the protection of public health with the rights of medical practitioners.
Purpose This study evaluates the use of ChatGPT-4o in creating tailored continuing professional development (CPD) plans for radiography students, addressing the challenge of aligning CPD with Medical Radiation Practice Board of Australia (MRPBA) requirements. We hypothesized that ChatGPT-4o could support students in CPD planning while meeting regulatory standards.
Methods A descriptive, experimental design was used to generate 3 unique CPD plans with ChatGPT-4o, each tailored to a hypothetical graduate radiographer in a different clinical setting. Each plan followed MRPBA guidelines, focusing on computed tomography specialization by the second year. From October 2024 to November 2024, 3 MRPBA-registered academics assessed the plans against the criteria of appropriateness, timeliness, relevance, reflection, and completeness. Ratings were analyzed using the Friedman test and the intraclass correlation coefficient (ICC) to measure consistency among evaluators.
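As an illustration of the rater-consistency analysis, a minimal Python sketch of the Friedman test across 3 raters follows; the rating vectors are hypothetical placeholders, not the study’s scores.

    # Minimal sketch: Friedman test for consistency across 3 raters.
    # Each vector holds one rater's criterion ratings for a single scenario.
    from scipy.stats import friedmanchisquare

    rater1 = [4, 5, 4, 3, 4]  # hypothetical ratings (appropriateness, timeliness, ...)
    rater2 = [4, 4, 4, 3, 5]
    rater3 = [5, 4, 4, 4, 4]

    stat, p_value = friedmanchisquare(rater1, rater2, rater3)
    print(f"Friedman chi2={stat:.2f}, P={p_value:.3f}")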
Results The CPD plans generated by ChatGPT-4o generally adhered to regulatory standards across scenarios. The Friedman test indicated no significant differences among raters (P=0.420, 0.761, and 0.807 for each scenario), suggesting consistent scores within scenarios. However, ICC values were low (–0.96, 0.41, and 0.058 for scenarios 1, 2, and 3, respectively), revealing variability among raters, particularly on the timeliness and completeness criteria, and suggesting limitations in ChatGPT-4o’s ability to address individualized, context-specific needs.
Conclusion ChatGPT-4o demonstrates the potential to ease the cognitive demands of CPD planning, offering structured support in CPD development. However, human oversight remains essential to ensure plans are contextually relevant and deeply reflective. Future research should focus on enhancing artificial intelligence’s personalization for CPD evaluation, highlighting ChatGPT-4o’s potential and limitations as a tool in professional education.
Purpose This study aimed to compare cognitive, non-cognitive, and overall learning outcomes for sepsis and trauma resuscitation skills in novices with virtual patient simulation (VPS) versus in-person simulation (IPS).
Methods A randomized controlled trial was conducted on junior doctors in 1 emergency department from January to December 2022, comparing 70 minutes of VPS (n=19) versus IPS (n=21) in sepsis and trauma resuscitation. Using the nominal group technique, we created skills assessment checklists and determined Bloom’s taxonomy domains for each checklist item. Two blinded raters observed participants leading 1 sepsis and 1 trauma resuscitation simulation. Satisfaction was measured using the Student Satisfaction with Learning Scale (SSLS). The SSLS and checklist scores were analyzed using the Wilcoxon rank sum test and the 2-tailed t-test, respectively.
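As an illustration of the two named tests, here is a minimal Python sketch comparing hypothetical VPS and IPS scores; the data are placeholders, not trial results.

    # Minimal sketch: the two between-arm comparisons named in the abstract.
    # All score vectors are illustrative placeholders, not trial data.
    from scipy.stats import ranksums, ttest_ind

    ssls_vps = [18, 20, 17, 19, 21]  # hypothetical satisfaction scores, VPS arm
    ssls_ips = [22, 23, 21, 24, 22]  # hypothetical satisfaction scores, IPS arm
    stat, p = ranksums(ssls_vps, ssls_ips)
    print(f"Wilcoxon rank sum: P={p:.3f}")

    checklist_vps = [30, 28, 33, 31, 29]  # hypothetical checklist totals, VPS arm
    checklist_ips = [29, 30, 31, 32, 28]  # hypothetical checklist totals, IPS arm
    t, p = ttest_ind(checklist_vps, checklist_ips)
    print(f"Two-tailed t-test: P={p:.3f}")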
Results For sepsis, there was no significant difference between VPS and IPS in overall scores (2.0; 95% confidence interval [CI], -1.4 to 5.4; Cohen’s d=0.38), as well as in items that were cognitive (1.1; 95% CI, -1.5 to 3.7) and not only cognitive (0.9; 95% CI, -0.4 to 2.2). Likewise, for trauma, there was no significant difference in overall scores (-0.9; 95% CI, -4.1 to 2.3; Cohen’s d=0.19), as well as in items that were cognitive (-0.3; 95% CI, -2.8 to 2.1) and not only cognitive (-0.6; 95% CI, -2.4 to 1.3). The median SSLS scores were lower with VPS than with IPS (-3.0; 95% CI, -5.0 to -1.0).
Conclusion For novices, there were no major differences in overall and non-cognitive learning outcomes for sepsis and trauma resuscitation between VPS and IPS. Learners were more satisfied with IPS than with VPS (clinicaltrials.gov identifier: NCT05201950).
Purpose With the coronavirus disease 2019 pandemic, online high-stakes exams have become a viable alternative. This study evaluated the feasibility of computer-based testing (CBT) for medical residency applications in Brazil and its impacts on item quality and applicants’ access compared to paper-based testing.
Methods In 2020, an online CBT was conducted at the Ribeirao Preto Clinical Hospital in Brazil. In total, 120 multiple-choice question items were constructed. Two years later, the exam was administered as paper-based testing, with similar item construction processes for both exams. Difficulty and discrimination indexes and point-biserial coefficients were computed under classical test theory; difficulty, discrimination, and guessing parameters were estimated under item response theory; and Cronbach’s α coefficient was calculated. Internet stability for applicants was monitored.
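As an illustration of the classical-test-theory item statistics, the sketch below computes difficulty indexes and a point-biserial coefficient from a placeholder response matrix (1=correct, 0=incorrect); the data are illustrative, not exam responses.

    # Minimal sketch: classical-test-theory item statistics.
    # Rows are applicants, columns are items; values are placeholders.
    import numpy as np
    from scipy.stats import pointbiserialr

    responses = np.array([
        [1, 0, 1, 1],
        [1, 1, 1, 0],
        [0, 0, 1, 1],
        [1, 1, 0, 1],
        [0, 1, 1, 1],
    ])

    difficulty = responses.mean(axis=0)  # proportion correct per item
    print("Difficulty indexes:", difficulty.round(2))

    total = responses.sum(axis=1)        # total score per applicant
    r_pb, p = pointbiserialr(responses[:, 0], total)
    print(f"Item 1 point-biserial = {r_pb:.2f}")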
Results In 2020, 4,846 individuals (57.1% female, mean age of 26.64±3.37 years) applied to the residency program, versus 2,196 individuals (55.2% female, mean age of 26.47±3.20 years) in 2022. For CBT, there was an increase of 2,650 applicants (120.7%), albeit with significant differences in demographic characteristics. There was a significant increase in applicants from more distant and lower-income Brazilian regions, such as the North (5.6% vs. 2.7%) and Northeast (16.9% vs. 9.0%). No significant differences were found in difficulty and discrimination indexes, point-biserial coefficients, and Cronbach’s α coefficients between the 2 exams.
Conclusion Online CBT with multiple-choice questions was a viable format for a residency application exam, improving accessibility without compromising exam integrity and quality.
Purpose The primary aim of this study is to validate the Blended Learning Usability Evaluation–Questionnaire (BLUE-Q) for use in the field of health professions education through a Bayesian approach. As Bayesian questionnaire validation remains uncommon in the field, a secondary aim of this article is to serve as a simplified tutorial for engaging in such validation practices in health professions education.
Methods A total of 10 experts in blended learning, based in health professions education, were recruited to participate in a 30-minute interviewer-administered survey. On a 5-point Likert scale, experts rated how well they perceived each item of the BLUE-Q to reflect its underlying usability domain (i.e., effectiveness, efficiency, satisfaction, accessibility, organization, and learner experience). Ratings were descriptively analyzed and converted into beta prior distributions. Participants were also given the option to provide qualitative comments for each item.
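As a hypothetical illustration of converting expert ratings into a beta prior, the sketch below flags an item whose probability of endorsement is low; the endorsement rule (ratings of 4 or 5) and the 0.7 threshold are illustrative assumptions, not the authors’ exact elicitation procedure.

    # Hypothetical sketch: one item's expert ratings -> beta prior -> flag.
    # The endorsement rule and threshold are illustrative assumptions.
    from scipy.stats import beta

    ratings = [5, 4, 4, 3, 5, 4, 2, 4, 5, 4]   # one item's ratings from 10 experts
    successes = sum(r >= 4 for r in ratings)   # assume >=4 counts as endorsement
    failures = len(ratings) - successes

    prior = beta(1 + successes, 1 + failures)  # uniform Beta(1,1) updated by ratings
    p_low = prior.cdf(0.7)                     # P(endorsement rate < 0.7)
    print(f"P(low endorsement) = {p_low:.2f}")
    if p_low > 0.5:
        print("Candidate for removal from the questionnaire")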
Results After reviewing the computed expert prior distributions, 31 quantitative items were identified as having a probability of “low endorsement” and were thus removed from the questionnaire. Additionally, qualitative comments were used to revise the phrasing and order of items to ensure clarity and logical flow. The BLUE-Q’s final version comprises 23 Likert-scale items and 6 open-ended items.
Conclusion Questionnaire validation can generally be a complex, time-consuming, and costly process, inhibiting many from engaging in proper validation practices. In this study, we demonstrate that a Bayesian questionnaire validation approach can be a simple, resource-efficient, yet rigorous solution to validating a tool for content and item-domain correlation through the elicitation of domain expert endorsement ratings.
Purpose This study aimed to develop and validate the Student Perceptions of Real Patient Use in Physical Therapy Education (SPRP-PTE) survey to assess physical therapy student (SPT) perceptions regarding real patient use in didactic education.
Methods This cross-sectional observational study developed a 48-item survey and tested the survey on 130 SPTs. Face and content validity were determined by an expert review and content validity index (CVI). Construct validity and internal consistency reliability were determined via exploratory factor analysis (EFA) and Cronbach’s α.
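As an illustration of the content validity index, here is a minimal Python sketch of an item-level CVI; the common convention that ratings of 3 or 4 on a 4-point relevance scale count as relevant is assumed, and the ratings are placeholders.

    # Minimal sketch: item-level content validity index (I-CVI).
    # Expert ratings are placeholders on an assumed 4-point relevance scale.
    def item_cvi(ratings):
        """Proportion of experts rating the item 3 or 4 (i.e., relevant)."""
        return sum(r >= 3 for r in ratings) / len(ratings)

    expert_ratings = [4, 4, 3, 4, 3, 4]  # hypothetical ratings for one survey item
    print(f"I-CVI = {item_cvi(expert_ratings):.2f}")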
Results Three main constructs were identified (value, satisfaction, and confidence), each having 4 subconstruct components (overall, cognitive, psychomotor, and affective learning). Expert review demonstrated adequate face and content validity (CVI=96%). The initial EFA of the 48-item survey revealed items with inconsistent loadings and low correlations, leading to the removal of 18 items. An EFA of the 30-item survey demonstrated 1-factor loadings of all survey constructs except satisfaction and the entire survey. All constructs had adequate internal consistency (Cronbach’s α >0.85).
Conclusion The SPRP-PTE survey provides a reliable and valid way to assess student perceptions of real patient use. Future studies are encouraged to validate the SPRP-PTE survey further.