Online assessment for the final year medical students during COVID-19 pandemics; the exam quality and students' performance

Bahaeldin Hassan; Ayed A. Shati; Abdulaziz Alamri; Ayyub Patel; Ali Alsuheel Asseri; Muhammed Abid; Saleh M. Al-Qahatani; Ismaeel Satti

Research Article - Onkologia i Radioterapia (2020) Volume 14, Issue 6

Bahaeldin Hassan¹*, Ayed A. Shati², Abdulaziz Alamri³, Ayyub Patel⁴, Ali Alsuheel Asseri², Muhammed Abid⁵, Saleh M. Al-Qahatani² and Ismaeel Satti¹

¹Department of Obstetrics and Gynaecology, College of Medicine, King Khalid University, Abha, Kingdom of Saudi Arabia
²Department of Child Health, College of Medicine, King Khalid University, Abha, Kingdom of Saudi Arabia
³Department of Surgery, College of Medicine, King Khalid University, Abha, Kingdom of Saudi Arabia
⁴Department of Biochemistry, College of Medicine, King Khalid University, Abha, Kingdom of Saudi Arabia
⁵Department of Medical Education, College of Medicine, King Khalid University, Abha, Kingdom of Saudi Arabia

*Corresponding Author:
Bahaeldin Hassan, Department of Obstetrics and Gynaecology, College of Medicine, King Khalid University, Abha, Kingdom of Saudi Arabia, Email: bahasuikt@hotmail.com

Received: 12-Nov-2020 Accepted: 20-Nov-2020 Published: 27-Nov-2020

Abstract

Background: Saudi Arabia responded early to the coronavirus disease (COVID-19) pandemic; a lockdown was imposed in March 2020, and education and assessment have continued through e-learning since then. Objectives: We aimed to assess the quality of the online MCQs tests taken by final-year medical students after the onset of the COVID-19 pandemic and to review students' performance in online assessment. Methods: This study was carried out at the College of Medicine, King Khalid University, Saudi Arabia. Participants were undergraduate final-year medical students who completed their four major clinical courses. Item analysis parameters of the online MCQs tests were compared with those of the paper-based tests, which had assessed the cohort of students in semester one before the COVID-19 lockdown. Overall student performance in classical face-to-face assessment was compared with performance in online assessment. The chi-square test was used, and p-values < 0.05 were considered statistically significant. Results: In two courses out of four, the test reliability of the online MCQs tests improved significantly compared with the paper-based tests. Three courses out of four showed significantly increased average discrimination indices among the online MCQs items. The average difficulty indices of all courses increased significantly in the online MCQs tests. Out of a maximum raw score of 100, the mean students' score for online assessment in three courses was significantly higher than that for traditional assessment. Conclusion: We studied the impact of the COVID-19 pandemic on the assessment of final-year medical students. Online MCQs tests proved to be more reliable and to have better discrimination ability, but to be easier, than paper-based examinations. Overall student performance in theoretical and practical assessment improved significantly in online assessment.

Keywords

COVID-19, online assessment, paper-based assessment, item analysis, MCQs

Introduction

In March 2020, the World Health Organization (WHO) declared the novel coronavirus disease (COVID-19) a worldwide pandemic; since then, lockdown policies have been adopted in many countries. The education sector all over the world faced difficulties in running schools and universities. In order to continue the learning process, major changes in assessment and curricula have been implemented [1]. Saudi Arabia was one of the first countries to respond to the pandemic: a lockdown was imposed in March 2020, and education has continued through e-learning since then. Our medical schools cancelled clinical teaching to reduce the risk of viral infection among students. The faculty prepared recorded video sessions on history taking and examination, which were delivered electronically to the students through the official platform of the university (the Blackboard system). Globally, institutions removed written assessments and replaced them with remote online assessments [2, 3]. Online tests raise questions of honesty and fairness. Online assessments lack supervision of students and offer no guarantee against cheating. Cheating can take the form of open-book test behaviour, which includes using multiple media to search quickly for answers, and there is an increased possibility of students taking the test in small groups. To control some of these practices, e-proctoring systems that monitor students have been widely adopted by universities [4]. Online tests include Multiple Choice Questions (MCQs), true/false questions, short-answer questions, and matching questions. Among these methods of online assessment, MCQs are the most frequently used tool. Applying Bloom's taxonomy, studies nominate MCQs as most suitable for the first three cognitive levels (remembering, comprehending, and applying) and, to some extent, the level of analysis [5, 6].
Researchers recommend the use of online formative and summative multiple-choice tests to support independent and self-directed learning. Online MCQs tests improve students' and faculty performance when compared with paper-based tests [7, 8]. However, other studies observed no difference in scores between online tests and paper-based tests [9]. Final-year medical students in Saudi Arabia are required to meet learning objectives set by the Saudi National Commission for Academic Accreditation and Assessment (NCAAA) as graduation requirements [10]. This is the first time that our students were exposed to summative online assessment instead of face-to-face assessment. Implementing remote online summative assessments in medical curricula necessitates the development of robust systems to guarantee the fairness of the examinations [11]. A report on the experience of online examinations at the Italian University of Catanzaro during COVID-19 concluded that they were suboptimal for evaluating students in health education [12]. In our institution, the College of Medicine, King Khalid University, Abha, Saudi Arabia, e-learning was activated at the onset of the COVID-19 pandemic. Recorded lectures, collaborative virtual sessions, and clinical video sessions were the methods of teaching. At the time of assessment, all assessment methods were converted to online assessment, including online MCQs tests and clinical assessments. This study was conducted to assess the quality of the online MCQs tests taken by final-year medical students after the onset of the COVID-19 pandemic and to review overall student performance in online assessment.

Methods

This study was carried out at the College of Medicine, King Khalid University, Saudi Arabia. Participants were undergraduate final-year medical students who completed their four major clinical courses in obstetrics and gynaecology, surgery, medicine, and pediatrics. The final year of the MBBS program (Levels 11 and 12) is composed of four major clinical courses. Each course is taught over 8 weeks and is a requirement for graduation. In response to the COVID-19 pandemic, our institution decided to deliver all courses through the Blackboard system. Assessment for final-year medical students was also conducted electronically through the university Blackboard system. Assessment methods were online MCQs tests for the theoretical part and an Oral Structured Practical Examination (OSPE) and/or Oral Structured Clinical Examination (OSCE) for the clinical assessment. This study aimed to assess the quality of the online MCQs tests and to compare students' performance in online assessment (theoretical plus practical) with their performance in classical face-to-face assessment. MCQs tests of the single-answer type with four options (one correct answer and three distractors) were delivered online to the final-year medical students in semester two of the academic year 2019-2020 (512 students) after the COVID-19 lockdown. There were four tests (obstetrics and gynaecology, surgery, medicine, and pediatrics) with a total of 124 items. In order to avoid gatherings during the COVID-19 pandemic, all students received the MCQs tests on their own devices at home after logging in to the Blackboard system. Questions were displayed on the screen one by one; students could move to the next question and could review and modify their answers to previous questions. The total exam duration was calculated by allowing two minutes per question. Before the COVID-19 lockdown, semester-one students in the four courses (512 students) sat for class-controlled paper-based MCQs tests (231 items).
Post-test item analysis data were obtained from the assessment office after permission for research use was granted by the vice dean of academic affairs. Item analysis parameters were used to assess the quality of the online MCQs tests. Item analysis parameters of the online MCQs tests were compared with those of the paper-based tests, which had assessed the cohort of students in semester one before the COVID-19 lockdown. For each MCQs examination, the Kuder-Richardson formula 20 (KR-20) reliability coefficient was used as an estimate of score reliability. A KR-20 of more than 0.70 is acceptable for medical schools [13]. Computed item difficulty and discrimination indices reflect how items perform against the objectives of the assessment. Items with difficulty values of more than 0.7 were considered easy, items in the 0.3-0.7 range were considered moderately difficult, and items below 0.3 were considered very difficult [13]. Item discrimination values indicate the ability of an item to discriminate between low-performing and high-performing students. A discrimination index above 0.2 was considered satisfactory. Negatively discriminating items are items that poor performers answer correctly more often than good performers. Zero discrimination represents equal performance of poor and good candidates [13]. Overall student performance in classical face-to-face assessment was compared with performance in online assessment.
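The item analysis parameters described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by the assessment office: the text does not say how the discrimination index was computed, so the sketch assumes the common upper-group minus lower-group definition with 27% groups, and it uses the population variance of total scores in KR-20. The function names `item_analysis` and `difficulty_band` and the small response matrix are hypothetical.

```python
from statistics import pvariance


def item_analysis(responses, group_fraction=0.27):
    """Return (difficulty, discrimination, kr20) for a 0/1 response matrix.

    responses: one list per student, one 0/1 score per item.
    Assumes at least two items and non-constant total scores.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]

    # Difficulty index: proportion of students answering the item correctly.
    difficulty = [
        sum(row[j] for row in responses) / n_students for j in range(n_items)
    ]

    # Discrimination index (assumed definition): proportion correct in the
    # top-scoring group minus proportion correct in the bottom-scoring group.
    order = sorted(range(n_students), key=lambda i: totals[i], reverse=True)
    g = max(1, round(group_fraction * n_students))
    upper, lower = order[:g], order[-g:]
    discrimination = [
        (sum(responses[i][j] for i in upper)
         - sum(responses[i][j] for i in lower)) / g
        for j in range(n_items)
    ]

    # KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores).
    sum_pq = sum(p * (1 - p) for p in difficulty)
    kr20 = (n_items / (n_items - 1)) * (1 - sum_pq / pvariance(totals))
    return difficulty, discrimination, kr20


def difficulty_band(p):
    """Bands used in this study: >0.7 easy, 0.3-0.7 moderate, <0.3 difficult."""
    if p > 0.7:
        return "easy"
    if p >= 0.3:
        return "moderate"
    return "difficult"


# Hypothetical data: 6 students, 4 items (not taken from the study).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
difficulty, discrimination, kr20 = item_analysis(responses)
bands = [difficulty_band(p) for p in difficulty]
print(bands, discrimination, round(kr20, 3))
```

Other conventions exist (for example, point-biserial correlation for discrimination, or sample variance in KR-20), so values from a real item analysis report may differ slightly from this sketch.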

Statistics

Data were transferred from Excel to SPSS version 20 for analysis. Continuous variables were reported as mean ± standard deviation, and qualitative variables as frequencies and percentages. The chi-square test and t-test were used to assess the significance of differences in the parameters between the courses; p-values < 0.05 were considered statistically significant.
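As an illustration of the kind of comparison described here, the sketch below runs a Pearson chi-square test on a 2x2 table of item counts using only the Python standard library. It is not the SPSS procedure used in the study. The item counts are hypothetical (the per-course item counts are not reported in the text); they are chosen only so that the proportions of easy items are 40% and 78%. The function name `chi_square_2x2` is likewise an assumption, and no continuity correction is applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (chi2, p_value). With 1 degree of freedom the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 20 easy / 30 not easy items on a paper-based test,
# 39 easy / 11 not easy items on an online test.
chi2, p_value = chi_square_2x2(20, 30, 39, 11)
significant = p_value < 0.05
print(round(chi2, 2), significant)
```

Because the real item counts differ from these hypothetical ones, the resulting p-value is not expected to reproduce the values in the tables.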

Results

In two courses out of four, the test reliability of the online MCQs tests improved significantly compared with the paper-based tests. KR-20 for online vs paper-based tests was 0.92 vs 0.72 (p-value=0.00001) for surgery and 0.83 vs 0.62 (p-value=0.04) for pediatrics (Table 1).

Courses	Online MCQs tests					Paper-based MCQs tests					P-values		
	Average discrimination index (mean)	Average discrimination index (S.D)	Average difficulty index (mean)	Average difficulty index (S.D)	KR-20 reliability	Average discrimination index (mean)	Average discrimination index (S.D)	Average difficulty index (mean)	Average difficulty index (S.D)	KR-20 reliability	Discrimination index	Difficulty index	KR-20 reliability
Medicine	0.57	0.29	0.76	0.33	0.76	0.24	0.79	0.55	0.24	0.74	0.0484*	0.001*	0.84
Pediatrics	0.33	0.2	0.81	0.25	0.83	0.64	0.28	0.61	0.46	0.64	0.00001*	0.0151*	0.07
Surgery	0.51	0.4	0.83	0.27	0.94	0.32	0.48	0.35	0.68	0.72	0.000001*	0.0006*	0.01*
Obstetrics and gynecology	0.68	0.39	0.89	0.26	0.54	0.34	0.73	0.64	0.27	0.75	0.0049*	0.0001*	0.04*

S.D: Standard Deviation
*p-value of <0.05

Tab. 1. Item analysis of online MCQs tests Vs paper-based tests

The KR-20 of the medicine course was not significantly different between the online and paper-based MCQs tests. The obstetrics and gynaecology course was the only course that showed lower reliability for the online MCQs test (0.54) than for the paper-based MCQs test (0.75), p-value=0.04. Three courses out of four showed significantly increased average discrimination indices among the online MCQs items.

These were the medicine course (0.57 ± 0.29 vs 0.24 ± 0.79, p-value=0.048), the surgery course (0.51 ± 0.40 vs 0.35 ± 0.68, p-value=0.000001), and the obstetrics and gynaecology course (0.68 ± 0.39 vs 0.34 ± 0.73, p-value=0.0049).

In the pediatrics course, the average discrimination index was significantly lower for the online MCQs test items than for the paper-based MCQs test items (0.33 ± 0.20 vs 0.64 ± 0.28, p-value=0.00001). The average difficulty indices of all courses increased significantly in the online MCQs tests compared with the paper-based MCQs tests. The average difficulty indices for online vs paper-based MCQs tests were 0.76 ± 0.33 vs 0.55 ± 0.24 (p-value=0.001) for medicine, 0.81 ± 0.25 vs 0.61 ± 0.46 (p-value=0.0151) for pediatrics, 0.83 ± 0.27 vs 0.35 ± 0.68 (p-value=0.0006) for surgery, and 0.89 ± 0.26 vs 0.64 ± 0.27 (p-value=0.0001) for obstetrics and gynaecology (Table 1). All courses demonstrated increased proportions of easy questions in the online MCQs tests. The proportions of easy questions increased from 40% to 78% (p-value=0.0003) in pediatrics, from 14% to 72% (p-value=0.0001) in medicine, from 32% to 77% (p-value=0.0002) in surgery, and from 36% to 90% (p-value=0.0001) in obstetrics and gynaecology (Table 2).

Courses	Difficulty index of paper based MCQs tests items			Difficulty index of online MCQs tests items			p-values
Items	Easy	Moderate	Difficult	Easy	Moderate	Difficult	Easy	Moderate	Difficult
Pediatrics	40%	46%	14%	78%	11%	11%	*0.0003	*0.002	0.668
Medicine	14%	73%	13%	72%	8%	20%	*0.0001	*0.0001	0.394
Surgery	32%	57%	12%	77%	15%	8%	*0.0002	*0.0005	0.5967
Obstetrics and Gynecology	36%	58%	6%	90%	0%	10%	*0.0001	*0.0001	0.49

*p-value of <0.05

Tab. 2. Proportion of difficulty indices of online MCQs tests items Vs paper-based tests

The percentage of questions with a good discrimination index (≥ 0.2) decreased significantly in the online MCQs tests of surgery (from 65% to 17%, p-value=0.0001) and of obstetrics and gynaecology (from 61% to 17%, p-value=0.0001) (Table 3).

Courses	Discrimination index of paper-based MCQs tests items				Discrimination index of online MCQs tests items				p-values			
Items	Negative discrimination	Zero values	0-0.19	0.2 or above (good questions)	Negative discrimination	Zero values	0-0.19	0.2 or above (good questions)	Negative discrimination	Zero values	0-0.19	0.2 or above (good questions)
Pediatrics	7%	28%	3%	62%	16%	2%	22%	60%	0.183	0.0006*	0.006*	0.84
Medicine	21%	24%	1%	54%	4%	20%	12%	64%	0.049	0.6817	0.012*	0.384
Surgery	5%	22%	8%	65%	3%	57%	23%	17%	0.689	0.002*	0.06	0.0001*
Obstetrics and Gynecology	6%	27%	6%	61%	3%	57%	23%	17%	0.54	0.006*	0.019*	0.001*

*P-value of <0.05

Tab. 3. Proportions of discrimination indices of online MCQs tests items Vs paper-based tests

We observed that, out of a maximum raw score of 100, the mean students' score for online assessment in three courses was significantly higher than that for traditional assessment.

The average students' scores in online vs traditional assessment were 94.10 ± 6.30 vs 74.90 ± 8.40 (p-value=0.0001) for obstetrics and gynaecology, 86.25 ± 7.06 vs 80.20 ± 9.40 (p-value=0.0004) for pediatrics, and 82.80 ± 4.71 vs 69.19 ± 7.59 (p-value=0.0001) for medicine.

In the surgery course, students performed better in traditional assessment than in online assessment.

The average students' score out of 100 in traditional vs online assessment was 84.00 ± 5.14 vs 73.10 ± 8.60 (p-value=0.0001) (Table 4).

Courses	Traditional (face-to-face) assessment/100		Online assessment/100		p-value
	Mean	S.D	Mean	S.D	p-value
Obstetrics and gynecology	74.9	8.4	94.1	6.3	p<0.0001*
Pediatrics	80.2	9.4	86.25	7.06	p=0.0004*
Surgery	84	5.14	73.1	8.6	p<0.0001*
Medicine	69.19	7.59	82.8	4.71	p<0.0001*

*p-value of <0.05

Tab. 4. Average student’s score in traditional assessment Vs online assessment

Discussion

Our institution used the Blackboard system for learning and assessment; students and faculty were well trained to use it.

The Blackboard platform scores tests accurately because computerized marking eliminates human error; hence, it supports the reliability of online assessments. However, Blackboard systems are only applicable to MCQs and/or short-answer questions [14]. Reliability is one of the psychometric parameters of an MCQs test that ensures the consistency of the results. Our data showed a significant improvement in reliability in online MCQs tests compared with paper-based tests in two of the four courses. Consistent with our observations, previously published studies noted the reliability and consistency of students' scores in online tests [14, 15]. Difficulty and discrimination indices are good measures of the quality of MCQs tests, and both should be used when building a good question bank [16]. We observed significantly increased average discrimination indices among the online MCQs items in three courses. However, the proportions of questions with a good discrimination index (≥ 0.2) decreased significantly in the online MCQs tests of the surgery and obstetrics and gynecology courses.

This could be explained by the jump in the average difficulty index in the online tests of these two courses. Our data showed that the average difficulty index rose from 0.35 ± 0.68 in the paper-based test to 0.83 ± 0.27 in the online test for surgery (p-value=0.0006), and from 0.64 ± 0.27 to 0.89 ± 0.26 for obstetrics and gynecology (p-value=0.0001). Consistent with our findings, a Malaysian study reported that the discrimination power of test items was reduced at difficulty levels above 70% [17]. The current study demonstrated increased proportions of easy questions in the online MCQs tests of all courses. Similar to our results, two online MCQs tests in a previous study were noted to have increased ease of test items. The increased proportions of easy items in our online MCQs tests might be due to cheating. An e-proctoring system was not used in our exams. Moreover, the duration of the tests was relatively long, which offered a chance for small-group discussion before answering the questions. Careful interpretation of difficulty and discrimination indices is essential for building a question bank. In a previously published study, when very difficult and very easy questions were removed, the relationship between difficulty and discrimination indices became linear, and the easy questions gained a higher discriminatory value [18].
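One way to see why very easy items tend to discriminate poorly is to look at the ceiling on the discrimination index at a given difficulty. The sketch below assumes the upper-group minus lower-group definition of the discrimination index with equal-sized groups, each a fraction f (at most 0.5) of the class; the text does not state which definition was used, so this is an illustration rather than a reanalysis. Under that assumption, an item answered correctly by a proportion p of students cannot have a discrimination index above min(1, p/f, (1 - p)/f), because there are too few wrong answers (or too few right answers) to separate the two groups. The function name `max_discrimination` is hypothetical.

```python
def max_discrimination(p, group_fraction=0.27):
    """Upper bound on the upper-minus-lower discrimination index.

    p: difficulty index (proportion of all students answering correctly).
    group_fraction: size of each extreme group as a fraction of the class
    (assumed to be at most 0.5).
    """
    f = group_fraction
    # Best case: the upper group holds as many correct answers as possible
    # and the lower group holds as many incorrect answers as possible.
    return min(1.0, p / f, (1 - p) / f)


# Ceiling on discrimination for increasingly easy items (27% groups).
for p in (0.5, 0.7, 0.9, 0.95):
    print(p, round(max_discrimination(p), 3))
```

With 27% groups, the ceiling for an item with a difficulty index of 0.95 falls below the 0.2 threshold used in this study, so such an item cannot count as a good question under this definition regardless of how it is written.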

Our MCQs were composed of one correct answer (the key) and three incorrect distractors; a distractor is considered functioning if it draws more than 5% of students away from the right answer.

A published study from Bahrain stated that reducing the number of non-functioning distractors improved the quality of the MCQs test [19]. The function of distractors was not assessed in our study, and this is considered a limitation.

In the present study, we investigated the effect on student performance of the change from face-to-face theoretical and practical assessment to online assessment.

Previously published studies, including one study in Saudi Arabia, noted that the results of online tests and paper-based tests were not significantly different [21].

Our observations have shown that the mean scores for virtual education in theoretical tests and OSCE were higher than those for the traditional education group in three courses; a recent Iranian study involving fourth-year dental students of Shiraz University reported similar findings [20]. Consistent with our findings, the mean score for online tests was significantly greater than that for the paper-based test. The online assessment experience in our institution was encouraging; item analysis of MCQs in this study supported the reliability and discrimination of online tests compared with paper-based tests. Previous Saudi studies concluded that students preferred paper-based tests, although a significant proportion of students preferred online examinations in view of automatic results delivery, feedback, and time management. More studies concerning the perceptions of our faculty and students of this online assessment experience are needed in the future.

Conclusion

We studied the impact of the sudden change in assessment that resulted from the COVID-19 pandemic. Online MCQs tests proved to be more reliable and to have better discrimination ability, but to be easier, than paper-based examinations. Overall student performance in theoretical and practical assessment improved significantly in online assessment. The only weakness observed in our data was the increased proportion of easy items in the online MCQs tests. This might be attributed to cheating; we recommend implementing e-proctoring, minimizing exam time, randomizing questions, and using a question bank to improve exam quality.

References

1. O’Byrne L, Gavin B, McNicholas F. Medical students and COVID-19: the need for pandemic preparedness. J Med Ethics. 2020;46:623-626.
2. Alsafi Z, Abbas AR, Hassan A, Ali MA. The coronavirus (COVID-19) pandemic: Adaptations in medical education. 2020;78:64-65.
3. Ahmed H, Allaf M, Elghazaly H. COVID-19 and medical education. Lancet Infect Dis. 2020;20:777-778.
4. Boitshwarelo B, Reedy AK, Billany T. Envisioning the use of online tests in assessing twenty-first-century learning: a literature review. Res Pract Technol Enhanc Learn. 2017;12:1-16.
5. David N. E-assessment by design: Using multiple-choice tests to good effect. J Furth High Educ. 2007;31:53-64.
6. Costello E, Holland J, Kirwan C. The future of online testing and assessment: question quality in MOOCs. Int J Educ Technol High Educ. 2018;15:1-14.
7. Douglas M, Wilson J, Ennis S. Multiple-choice question tests: a convenient, flexible and effective learning tool? A case study. Innovat Educ Teach Intern. 2012;49:111-121.
8. Washburn S, Herman J, Stewart R. Evaluation of performance and perceptions of electronic vs. paper multiple-choice exams. Adv Physiol Educ. 2017;41:548-555.
9. Čandrlić S, Katić MA, Dlab MH. Online vs. paper-based testing: a comparison of test results. In: 2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2014-Proceedings. IEEE Comput Soc. 2014;1:657-662.
10. Ali W, Balaha M, Kaliyadan F, Bahgat M, Aboulmagd E. A framework for a competency-based medical curriculum in Saudi Arabia. Mater Socio Medica. 2013;25:148.
11. Choi B, Jegatheeswaran L, Minocha A, Alhilani M, Nakhoul M, et al. The impact of the COVID-19 pandemic on the final year medical students in the United Kingdom: a national survey. BMC Med Educ. 2020;20:206.
12. Bennardo F, Buffone C, Fortunato L, Giudice A. COVID-19 is a challenge for dental education-A commentary. Eur J Dent Educ. 2020.
13. Champlain AF. A primer on classical test theory and item response theory for assessments in medical education. Med Educ. 2010;44:109-117.
14. Farzin S, Shervin F. Attitude of students towards e-examination system: an application of e-learning. Sci J Educ. 2016;4:222-227.
15. Shraim K. Online Examination practices in higher education institutions: learners’ perspectives. Turkish Online J Distance Educ. 2019;20:185-196.
16. Gamage SHPW, Ayres JR, Behrend MB, Smith EJ. Optimizing Moodle quizzes for online assessments. Int J STEM Educ. 2019;6:1-14.
17. Mitra N, Bindal U, Koshy S. Analysis of quality of test items and students’ perception of the online formative tests in Anatomy. J Contemp Med Educ. 2015;3:150-154.
18. Subramaniam AVV, Gupta R, Singh N, Ravishankar M. Usefulness of multiple choice question-based online formative assessments for determination of item statistics. J Res Med Educ Ethics. 2019;9:119.
19. Kheyami D, Jaradat A, Al-Shibani T, Ali FA. Item analysis of multiple-choice questions at the department of paediatrics, Arabian Gulf University, Manama, Bahrain. Sultan Qaboos Univ Med J. 2018;18:e68-e74.
20. Soltanimehr E, Bahrampour E, Imani MM, Rahimi F, Almasi B, et al. Effect of virtual versus traditional education on theoretical knowledge and reporting skills of dental students in radiographic interpretation of bony lesions of the jaw. BMC Med Educ. 2019;19:33.
21. Al-Qdah M, Ababneh I. Comparing online and paper exams: performances and perceptions of Saudi students. Int J Inf Educ Technol. 2017;7:106-109.