https://doi.org/10.31449/inf.v47i1.4297
Informatica 47 (2023) 97–108

A Prediction Model for Student Academic Performance Using Machine Learning

Harjinder Kaur 1, Tarandeep Kaur 1, Rachit Garg 2
1 School of Computer Applications, Lovely Professional University, Phagwara, 144401, India
2 COS, School of Computer Science and Engineering, Lovely Professional University, Phagwara, 144401, India
E-mail: Harjinder.12962@lpu.co.in, Tarandeep.24836@lpu.co.in, rachit.garg@lpu.co.in

Keywords: academic performance, decision tree, education data mining, ensemble model, naïve bayes, performance prediction

Received: July 15, 2022

Abstract: Academic data mining plays a prime role in accumulating, studying, and analyzing academic data, and it significantly impacts a large number of educational institutions. The accumulated academic data can be processed and analyzed for various purposes. It can be used to predict student academic performance and thereby improve the retention rate of academic institutions. Predicting students’ academic performance at an early stage helps students identify the subjects in which they are weak, so that they can focus on those subjects and improve their academic performance. Currently, academic institutions use numerous machine learning techniques to extract, analyze, and predict students’ academic performance and to identify fast and slow learners. This paper proposes an ensemble model that uses the voting method for early prediction of student academic performance. Poor performers can use the predicted results to concentrate on the courses in which they are weak. Accordingly, instructors can focus on creating and implementing new strategies, or on amending existing pedagogical tools and approaches, to help slow learners improve their performance. The proposed model has been tested on the academic data of an educational institution using the RapidMiner tool.
The results show how the number of E grades proportionally affects students’ academic performance. The proposed ensemble model generates predictions with an accuracy of 90.83%.

Summary (Povzetek): A machine learning method for predicting academic success is presented.

1 Introduction
Academic Data Mining (ADM) has attracted considerable interest in recent years. The need to analyze and assess the factors that affect students’ academic performance has increased the demand for Academic Data Mining (ADM), also called Educational Data Mining (EDM) [1]. Such factors include the final grades obtained, course attendance, mid-assessment marks, etc. [2]. ADM plays a pivotal role in analyzing student performance based on these factors and thereby classifying students into fast and slow learners. Additionally, ADM can provide suggestions and recommendations that help both instructors and students improve their performance. This can involve processes such as academic performance prediction and academic performance recommendation. Both processes are essential for every educational institution, as its reputation is centered upon the academic accomplishments of its students [3].

The primary goal of predicting learners’ academic performance is to identify students at risk at an early stage of their academic career. This identification helps the instructor analyze the factors affecting performance, so that corrective actions can be taken for students at risk of lower achievement levels. Moreover, the timely analysis of weak performers helps academic institutions increase their retention rate [4]. The academic performance of students is predicted using different supervised learning techniques such as classification and prediction. Learning Analytics (LA) plays a very significant role in the field of education.
The motivation for using LA in academic institutions is to analyze the patterns obtained from educational data after prediction. After the academic performance prediction, LA in association with ADM is used to generate effective results that lead to the categorization of different types of students [5]. This research proposes a model that serves as an early-warning system for educational organizations. Students can use the proposed model to discover and concentrate on their weak subjects, while faculty can focus on improving their teaching strategies for such students. Currently, many machine learning algorithms are available for predicting student educational performance and for ADM [6, 7]. The proposed model is an ensemble machine learning model that predicts a student’s academic performance using an ensemble of machine learning algorithms: Decision Tree, Naïve Bayes, and K-Nearest Neighbor. For performance prediction, records are collected from the academic institution and then pre-processed to eliminate anomalies, so that only anomaly-free data useful for the analysis remains. The cleaned data is then applied to the model, which produces the predicted results.

1.1 Motivation for the work
Currently, the majority of academic institutions face challenges related to declining student academic performance and the resulting rise in the student dropout ratio. This poses an alarming situation that puts the institutions’ interests at stake. They consistently struggle to maintain the retention rate of their students. Similarly, a decline in academic performance impacts a student physiologically, economically and socially. Some students get demotivated and consequently think of discontinuing their degree. This leads to an increase in the dropout rate for the academic institution.
Such circumstances are challenging for the teaching fraternity as well, since failure or a decrease in student academic performance puts a question mark on the overall conduct of the teacher. It raises concerns about the teacher’s capabilities and pedagogical approach. The proposed ensemble prediction model has been developed with such circumstances in mind. It helps reduce dropout rates and improve the retention rate of students. It addresses the rising dropout problem faced by institutions by predicting students’ academic performance precisely and proficiently. The proposed model has been trained using the historic data of students and then tested using the testing dataset. The predicted results classify the students into slow learners and fast learners. The proposed model serves as an early-warning system for slow learners, that is, students who are at academic risk at an early stage of their career, and it also identifies the courses affecting their performance. The early identification of students at academic risk helps instructors create new pedagogies, strategies and special academic counselling sessions for weak students. Additionally, such initiatives help slow learners concentrate on their weak areas so that they can perform well and improve their academic performance. Improving academic performance at an early stage helps slow learners complete their degree on time, which improves the retention rate and, in turn, the reputation of the academic institution.

Overall, the proposed model is useful for academic stakeholders including learners, instructors/teachers and educational institutions. It helps learners assess their own academic standing by providing the reasons responsible for their academic downfall.
The model assists instructors in keeping track of the academic growth of students and helps them give special attention to slow learners. The predicted results of the proposed model help educational institutions devise new strategies and steps for supporting and educating slow learners, thereby increasing the retention rate of the institutions.

The rest of the paper is divided into four sections. Section 2 gives a tabular representation of the existing techniques used for predicting students’ academic performance. Section 3 discusses the proposed model in detail, along with its structure and working. Section 4 covers the empirical analysis of the proposed model on the collected data. Section 5 concludes with a brief description of why the ensemble approach has been preferred for predicting students’ academic performance, and with an insight into future extensions of the proposed model.

2 Literature review
The existing educational research shows that the intersection of academic data and machine learning techniques is advantageous for carrying out interdisciplinary work [8]. Research on educational data helps in identifying and selecting the factors that drive academic results, supported by both argument and empirical evidence. Applying machine learning techniques to collected academic records can help in developing dynamic early-warning systems. Such systems benefit both instructors/tutors and learners by pointing them to the areas that need work [9, 10]. Subsequently, learners can improve their academic performance based on the feedback from the predicted results of these systems, so that they can complete their respective degrees on time and with minimum dropouts or backlogs. Table 1 reviews the literature along with the techniques used and the objective of each model, and Figure 1 shows the categorization of different prediction models based on machine learning.

Table 1: Existing academic performance prediction models.
Prediction Models | Machine Learning Technique(s) Used | Core Objective
[1] | Decision Trees, Support Vector Machines, Naive Bayes, Bagged Trees, and Boosted Trees | Early segmentation of students based upon their performance in the first year, which helps in achieving better results by course completion.
[3] | Decision Trees | To categorize the students based upon their performance.
[4] | Logistic Regression, Neural Networks, Random Forests | To identify the various challenges faced by students in their first educational year based upon student registration data.
[5] | Decision Trees, Rule and Fuzzy Rule Induction Methods, and Neural Networks | To predict the marks of university students in their final exams.
[11] | Logistic/Linear Regression, Matrix Factorization | To use educational data for an intelligent tutoring system.
[12] | Linear Regression, Neural Networks, Support Vector Machines | To predict the student score based upon mid-term marks.
[13] | Neural Networks, Random Forests, and Decision Tree | To predict the academic performance of first-year students.
[14] | Linear Regression, Neural Networks, Support Vector Machines, Decision Trees, Naive Bayes, K-Nearest Neighbor | To provide various courses based upon the existing data, which helps in improving the academic performance of a student.
[15] | Decision Tree, Gradient Boost algorithm, and Naïve Bayes | To identify the weak students and provide special counselling for their betterment.
[16] | SVM and Naïve Bayes | To predict the student’s academic performance using Naïve Bayes and compare the predicted results with the results generated by SVM.
[17] | K-Nearest Neighbor, Naïve Bayes, Decision Tree, and Logistic Regression | To predict the student’s academic performance along with the factors affecting that performance.
[18,19] | Decision Tree | To assess the student’s academic performance using the decision tree. The predicted results were used to provide recommendations to weak students so that they can improve their performance, which lowers the failure rate.
[20] | Naïve Bayes, Neural Network, and Decision Tree | To use various data mining techniques to predict and analyse the academic performance of students based on academic data made available by a participating forum.
[21,22] | Random Forest, Neural Networks, SVMs, and Regression Techniques | EDM was used to identify the weak students based upon their performance. It also helps in identifying the factors responsible for affecting and deteriorating the academic performance of the students.

Figure 1: Categorization of existing prediction models based on the machine learning techniques used.

3 Proposed ensemble model
The primary goal of creating an ensemble model is to produce more accurate results than an individual classifier can. The proposed model uses an ensemble of heterogeneous classifiers. It accepts the output of multiple classifiers, namely Decision Tree, Naïve Bayes, and K-NN, and combines the outputs of these heterogeneous classifiers using the voting approach to produce the final prediction. An ensemble is useful only when the selected classifiers do not always agree on the same decision, that is, when they produce different class labels for some instances. Figure 2 depicts the flow of the ensemble method.

Figure 2: Basic ensemble approach for prediction.
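The voting step described above can be sketched in a few lines. This is a minimal illustration of hard (majority) voting only, not the RapidMiner process used in the paper; the function name `majority_vote` and the three prediction lists are hypothetical, and the label convention (0 = fast learner, 1 = slow learner) is assumed for the example.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine class labels from several classifiers by hard (majority) voting.

    predictions_per_classifier: list of equal-length label lists, one list
    per base classifier. Returns one combined label per instance.
    """
    n_instances = len(predictions_per_classifier[0])
    if any(len(p) != n_instances for p in predictions_per_classifier):
        raise ValueError("all classifiers must predict the same number of instances")

    combined = []
    # zip(*...) groups the votes that all classifiers cast for one instance.
    for votes in zip(*predictions_per_classifier):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three base learners (assumed: 0 = fast, 1 = slow).
decision_tree_pred = [0, 1, 1, 0, 1]
naive_bayes_pred = [0, 1, 0, 0, 0]
knn_pred = [1, 1, 1, 0, 0]

ensemble_pred = majority_vote([decision_tree_pred, naive_bayes_pred, knn_pred])
print(ensemble_pred)  # [0, 1, 1, 0, 0]
```

With three base learners and two class labels a tie cannot occur, which is one practical reason to combine an odd number of classifiers for a binary decision.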
The proposed ensemble model classifies students based on their academic performance, considering their marks in each course inclusive of their attendance in that course. The data for classification has been collected using a Google Form and a purpose-built interface. Certain attributes contain irrelevant values, such as incomplete data, duplicate data, or naming and identification problems, and hence do not contribute to the classification process. Such attributes were removed from the classification process, because using them could have increased the classification errors and the complexity of the selected algorithm. This helped in making the predictions more accurate. The proposed ensemble model has been designed to predict student academic performance using an ensemble of machine learning algorithms. The main design requirement for an ensemble is that the selected classifiers complement each other in their judgments, so that higher accuracy can be achieved [23]. The model intends to compute the student academic performance (in terms of Cumulative Grade Points) and achieve an early separation of learners, segregating them into slow and fast learners based upon their educational performance.

[Figure 1 content] Techniques used and the models that apply them:
Decision Tree: [1], [9], [13], [14], [15], [16], [18], [19], [20], [26]
Support Vector Machines: [1], [12], [14], [17], [21]
Naive Bayes: [1], [14], [15], [17], [18], [20]
Neural Networks: [9], [12], [13], [14], [22], [27]
Random Forest: [13], [22], [27]
Bagged Trees and Boosted Trees: [1], [15]
Linear Regression: [11], [12], [14], [22]
K-Nearest Neighbour: [14], [18]
Logistic Regression: [18], [28], [27]

3.1 Working of the proposed ensemble model
When it comes to predicting student academic performance, a single classification model might not produce the appropriate outcome.
Moreover, single classification models suffer from high variance [24, 25]. In the proposed ensemble approach, the outputs of multiple models are combined, which enhances the overall accuracy of the prediction results. Several ensemble approaches exist, such as bagging, boosting, stacking, and voting, each with its pros and cons. The proposed model uses the voting technique, in which the prediction is produced by combining the outputs of multiple classifiers. The results generated by the voting approach are better than those of a single classifier because in voting the decision depends upon the majority vote [26]. The voting approach was chosen because it produces predictions with lower variance than a single classification model [27, 28].

The students are the key component of the proposed ensemble model, as they provide their academic details as input. The academic details comprise their courses, the marks/grade in each course, and the attendance in individual courses, since these academic parameters are considered crucial factors for measuring students’ academic performance. An interface has been designed to collect the academic details of the students that are used for model testing. The interface supports heterogeneous devices, so learners can provide their academic inputs using their smartphones, laptops, or desktops. The student academic data entered through this interface is stored in an academic database and is the core asset for the prediction process. The stored data forms the student records, which are pre-processed and then used to train the proposed model. During the pre-processing stage, the academic records are integrated and then checked for inconsistencies such as duplicates, missing values, etc.
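The two checks named above (duplicates and missing values) can be illustrated with a short sketch. The paper does not specify how its pre-processing was implemented, so everything here is an assumption for illustration: the function name `preprocess`, the field names `reg_no`, `course`, `marks`, `attendance`, the course code, and the policy of dropping (rather than imputing) incomplete records.

```python
def preprocess(records, required_fields):
    """Drop records with missing required values, then drop exact duplicates.

    records: list of dicts, one per submitted student/course entry.
    required_fields: field names that must be present and non-empty.
    """
    seen = set()
    cleaned = []
    for rec in records:
        # Missing-value check: skip records lacking any required field.
        # A legitimate value of 0 (e.g. zero attendance) is kept.
        if any(rec.get(field) in (None, "") for field in required_fields):
            continue
        # Duplicate check: identical values in all required fields
        # are treated as the same record; the first copy is kept.
        key = tuple(rec[field] for field in required_fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Hypothetical raw submissions (field names and values are assumed).
fields = ["reg_no", "course", "marks", "attendance"]
raw = [
    {"reg_no": "S001", "course": "CAP101", "marks": 72, "attendance": 85},
    {"reg_no": "S001", "course": "CAP101", "marks": 72, "attendance": 85},    # duplicate
    {"reg_no": "S002", "course": "CAP101", "marks": None, "attendance": 90},  # missing marks
    {"reg_no": "S003", "course": "CAP101", "marks": 38, "attendance": 60},
]

cleaned = preprocess(raw, fields)
print([rec["reg_no"] for rec in cleaned])  # ['S001', 'S003']
```

Dropping incomplete records is the simplest policy; whether the original study dropped or repaired such records is not stated.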
Consequently, the pre-processing stage generates the refined data, which is then used to train the proposed ensemble model. In the proposed ensemble model, the training dataset is used to generate the rules used for prediction, as shown in Figure 3. The testing dataset is applied to the constructed ensemble model to obtain the predicted academic performance based upon the rules generated from the training dataset.

Figure 3: Proposed ensemble model.

The predicted results of the model are beneficial for both the instructor and the learner. They enable the instructor to scrutinize the students’ academic results and derive their performance from them, which can be used to take new strategic actions for improving the performance of slow learners. This also helps in recognizing students at academic risk at an early stage of their academics, which helps in raising the student retention rate and in completing degrees on time. The predicted academic performance is also used as feedback by the students.

3.2 Mathematical formulation and proposed algorithm
Analytically, the proposed algorithm categorizes learners into strong and weak learners. The differentiation identifies the weak learners and also the courses in which they have underperformed. This helps weak learners concentrate on the subjects in which they were lagging and thereby improve their performance. Identifying weak performers at an early stage guides them to perform well in their end-term exams. Mathematically, in order to categorize the students, their CGPA has been calculated by considering their grade points and the credit for each course. For calculating the CGPA, the student’s grade points are first computed from the marks obtained in each course, as shown in Table 2.
The proposed model is based on the following assumptions:
• The CGPA of students has been calculated by considering the grade points of each course. In the proposed model, grade points for the CGPA calculation are on a 10-point scale.
• The results of the proposed ensemble model are intended for 2nd-semester students and can further inform course recommendations, because in the majority of universities course selection begins from the second year onwards.
• The number of subjects considered for the calculation of CGPA was 8.
• The total marks of each course are inclusive of attendance marks.
• For predicting student academic performance, grades from A to E are considered.

The following table describes the grade points and grades based upon the marks:

Table 2: Grade as per marks range.
Range of Marks | Grade Point | Grade
90 - 100 | 9.0 - 10.0 | A+
80 - 89 | 8.0 - 8.9 | A
70 - 79 | 7.0 - 7.9 | B+
60 - 69 | 6.0 - 6.9 | B
50 - 59 | 5.0 - 5.9 | C
40 - 49 | 4.0 - 4.9 | D
< 40 | 0.0 - 3.9 | E

Objective Function:
\[ \mathrm{Map}\left(Stu_i,\ Cou_j,\ MC_{ij}\right) \xrightarrow{\text{yields}} cgpa_i \tag{1} \]
Where:
$Stu$: Students
$Cou$: Courses
$MC_{ij}$: Marks obtained by the $i$th student in the $j$th course, with $i \in S$ and $j \in R$
$CGPA$: Cumulative Grade Point Average
$i$: Index of students, $i \in S$, where $S = \{ i \mid 1 \le i \le n \}$ is the set of students and $n$ is the maximum number of students
$j$: Index of courses, $j \in R$, where $R = \{ j \mid 1 \le j \le m \}$ is the set of courses and $m$ is the maximum number of courses

To accomplish the objective function, a map function has been devised. The mapping function predicts the performance of the students by calculating their CGPA based upon the academic details given by the students. Here, Map is the function that maps the $i$th student to the corresponding CGPA by considering the student’s courses and the marks in each course.
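The mapping from marks to grade, grade point and CGPA can be sketched as follows. Two points are assumptions made for this illustration and are labeled in the code: Table 2 gives only ranges, so the grade point is taken here as marks divided by 10 (which reproduces every range in the table), and the CGPA is computed as the credit-weighted average of grade points, following the statement that grade points and credits of each course are considered. The function names, course codes and credit values are hypothetical.

```python
# Grade bands from Table 2: (lowest mark in the band, letter grade).
GRADE_BANDS = [(90, "A+"), (80, "A"), (70, "B+"), (60, "B"),
               (50, "C"), (40, "D"), (0, "E")]


def letter_grade(marks):
    """Return the Table 2 letter grade for marks in the range 0..100."""
    if not 0 <= marks <= 100:
        raise ValueError("marks must lie between 0 and 100")
    for lower_bound, grade in GRADE_BANDS:
        if marks >= lower_bound:
            return grade


def grade_point(marks):
    """Grade point on the 10-point scale.

    ASSUMPTION: grade point = marks / 10. Table 2 lists only ranges
    (e.g. 80-89 -> 8.0-8.9); this linear rule is consistent with them.
    """
    letter_grade(marks)  # reuse the range validation
    return marks / 10


def cgpa(marks_by_course, credits_by_course):
    """Credit-weighted average of grade points over a student's courses."""
    total_credits = sum(credits_by_course[c] for c in marks_by_course)
    weighted_sum = sum(grade_point(marks_by_course[c]) * credits_by_course[c]
                       for c in marks_by_course)
    return weighted_sum / total_credits


# Hypothetical student: course codes and credits are made up for the example.
marks = {"C1": 85, "C2": 72, "C3": 38}
credits = {"C1": 4, "C2": 3, "C3": 3}

for course, m in marks.items():
    print(course, letter_grade(m), grade_point(m))
print(round(cgpa(marks, credits), 2))  # 6.7
```

In this example the E grade in C3 pulls the CGPA down from what the other two courses alone would give, which is the kind of effect the model is meant to surface for slow learners.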
The general formula for the calculation of CGPA is given in Eq. (2):
\[ CGPA = \frac{\sum (G \times CR)}{\sum CR} \tag{2} \]
Where:
$CGPA$: Cumulative Grade Point Average
$CR$: Credit score of a course
$G$: Grade points obtained by the student in a course

The proposed model is composed of a set $Stu = \{stu_1, stu_2, stu_3, \ldots, stu_n\}$ of $n$ students, such that $Stu = \{stu_i \mid 1 \le i \le n\}$, and a set $Cou = \{cou_1, cou_2, cou_3, \ldots, cou_m\}$ of $m$ different subjects, such that $Cou = \{cou_j \mid 1 \le j \le m\}$. Let $g_{ij}$ denote the grade points obtained by the $i$th student in the $j$th course. If $cgpa_i$ is the CGPA of the $i$th student, it can be obtained by the matrix computation specified in Eq. (3):
\[
\begin{bmatrix} cgpa_1 \\ cgpa_2 \\ \vdots \\ cgpa_n \end{bmatrix}
= \frac{1}{\sum_{j=1}^{m} cr_j}
\begin{bmatrix}
g_{11} & g_{12} & \cdots & g_{1m} \\
g_{21} & g_{22} & \cdots & g_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
g_{n1} & g_{n2} & \cdots & g_{nm}
\end{bmatrix}
\begin{bmatrix} cr_1 \\ cr_2 \\ \vdots \\ cr_m \end{bmatrix}
\tag{3}
\]
where $cr_j$ denotes the credits corresponding to the $j$th course, $\forall\, 1 \le j \le m$. The proposed algorithm is as follows:

Objective Function: Mapping of students to their CGPA by considering their program courses and the marks in each individual course, which affect the student’s academic performance; consists of {Students, Courses, Marks in each course}.
Input: Student academic details.
Output: Student categorization into weak and strong learners; special inputs to weak students for improving their performance.
1. Perform preprocessing of the collected data.
2. Use the pre-processed data as a training dataset.
3. The training dataset is used to train the model for the generation of rules.
4. Testing data is used for the prediction of performance using the trained model; $\{stu_i, cou_j, MC_{ij}\}$ is applied to Map to get $cgpa_i$, using Eq. (1).
5. (a) Eq. (2) specifies the general formula for the calculation of CGPA.
   (b) $cgpa_i$ is computed using Eq. (3), where $cgpa_i$ is the CGPA of an individual student.
6. The calculated CGPA helps in the identification of weak and strong learners.
7. The predicted results are used by:
   Learners (to improve their performance).
   Instructors (to provide suggestive measures to poor performers).

4 Results
The experimental results have been obtained using data from the department of computer science of an academic institution. The dataset contains 400 records of current students belonging to different sections of the computer science department. The dataset has been divided using the split operator: 70% of the data is used for training the model and the remaining 30% is used for testing the ensemble model. The major attributes considered for analyzing performance are the attendance in each course, the grade obtained in each course, the overall CGPA of the student, and the number of pending E grades. Figure 4 shows the results generated by the proposed ensemble model.

Figure 4: Results generated by ensemble method.

The performance vector in Table 3 reports the accuracy of the ensemble method using a vote operator, which takes the majority vote of the base learners to predict the results. The ensemble method shows an accuracy of 90.83%. In the confusion matrix, 0 represents good performers (fast learners) and 1 denotes bad performers (slow learners). Of the 92 instances predicted as good performers, 82 are correctly identified and 10 are actually bad performers. Similarly, of the 28 instances predicted as bad performers, 27 are correctly identified and 1 is actually a good performer.

Table 3: Performance vector (ensemble method).
 | true 0 | true 1 | class precision
pred. 0 | 82 | 10 | 89.13%
pred. 1 | 1 | 27 | 96.43%
class recall | 98.80% | 72.97% |

Figure 5: Relationship between actual and predicted performance.
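As a cross-check, the figures in Table 3 can be recomputed directly from the four confusion-matrix counts. The sketch below uses only the counts reported in the table; the dictionary layout and function names are illustrative choices, not part of the original RapidMiner output.

```python
# Counts from Table 3, keyed as (predicted class, actual class);
# 0 = good performer, 1 = bad performer.
counts = {(0, 0): 82, (0, 1): 10, (1, 0): 1, (1, 1): 27}

total = sum(counts.values())
accuracy = (counts[(0, 0)] + counts[(1, 1)]) / total


def precision(cls):
    """Correct predictions of cls divided by all predictions of cls (table row)."""
    predicted = sum(n for (pred, _actual), n in counts.items() if pred == cls)
    return counts[(cls, cls)] / predicted


def recall(cls):
    """Correct predictions of cls divided by all actual members of cls (table column)."""
    actual = sum(n for (_pred, act), n in counts.items() if act == cls)
    return counts[(cls, cls)] / actual


print(total)                   # 120
print(f"{accuracy:.2%}")       # 90.83%
print(f"{precision(0):.2%}")   # 89.13%
print(f"{precision(1):.2%}")   # 96.43%
print(f"{recall(0):.2%}")      # 98.80%
print(f"{recall(1):.2%}")      # 72.97%
```

The total of 120 test instances is consistent with a 30% test split of the 400 records. The lower recall for class 1 (72.97%) means that 10 of the 37 students who were actually poor performers were predicted as good performers, which is the costlier error for an early-warning use case.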
[Chart: Predicted performance by proposed ensemble model; series: true 0 and true 1; categories: pred. 0 and pred. 1; values as in Table 3.]

The 3D scatter plot of the relationship between the actual and predicted results generated by the proposed ensemble model is shown in Figure 5, where the x axis represents the RegdNo and the value column signifies performance in terms of slow (1) and fast (0) learners.

Figure 6: Actual and predicted results based on E grades.

Figure 7: Predicted performance based upon pending E grades.

The actual and predicted results based on E grades are depicted in Figures 6 and 7. Figure 8 illustrates the registration-wise predicted performance of students after the re-appear exam has been given. The blue circles indicate good performance, signified by 0, whereas the green circles represent poor performance, denoted by 1. The results show that the more re-appear exams a student has, the more likely the student is to be categorized as a poor performer. Therefore, corrective actions for such students need to be taken in time by the student as well as by the instructor.

Figure 8: Predicted results considering reappear exam given.

5 Conclusion and future directions
Presently, academic institutions are facing difficulty in sustaining their student retention rates. The retention rate can only be maintained by reducing the dropout ratio of students. A high student retention rate depends significantly on student academic performance. It is therefore highly important for academic institutions to predict student performance for subsequent sessions, so that the retention rate can be maintained and student performance can be improved.
Also, predicting student academic performance at an early phase of the degree helps students assess the reasons for their downfall, so that they can take corrective actions to improve their performance in time. The model is helpful for instructors as well, who can verify and revise their pedagogical approaches if required. A lot of research is being carried out to develop models for predicting student academic performance using academic data mining strategies. Various machine learning techniques have been used to develop such prediction models, which act as an aid for academic institutions. This paper proposes an ensemble model based on the machine learning classification algorithms Decision Tree, Naïve Bayes, and K-NN to address such problems. It helps in identifying weak learners by predicting their performance based upon historical academic data. The model has been implemented on a gathered dataset and achieves an accuracy of 90.83%.

The research work presented in this paper can be extended to develop a recommender system that uses the performance prediction results to recommend suitable elective courses to students. Such recommendations tend to augment student skills depending on their performance. Additionally, a recommender system can be developed that offers students interest-oriented or choice-driven suggestions regarding course selection by mapping the student’s previous performance to the student’s choices. Most research on academic performance prediction considers direct factors (such as courses, marks in each course, attendance and grades). Incorporating indirect factors (such as physiological, behavioral, economic and social factors) that affect student academic performance can be carried out in future work. Recently, several edtech companies have emerged during the COVID 2019 era.
Such companies are engaged in incorporating Information Technology (IT) and digital tools for student learning and engagement. The edtech companies are now using predictive analytics for mining student academic records, enrollment, attendance, class engagement, etc. The edtech companies can use prediction as well as recommendation models to help students by suggesting an appropriate course based upon their predicted performance.

Acknowledgement: Mohamed Alwanin would like to thank Deanship of Scientific Research at Majmaah University for supporting this work under Project No. R-2022-###. The authors deeply acknowledge the Researchers Supporting Program (TUMA-Project-2021-14), AlMaarefa University, Riyadh, Saudi Arabia for supporting steps of this work.

Funding Statement: Mohamed Alwanin would like to thank Deanship of Scientific Research at Majmaah University for supporting this work under Project No. R-2022-###. This research was supported by Researchers Supporting Program (TUMA-Project-2021-14), AlMaarefa University, Riyadh, Saudi Arabia.

Conflicts of Interest: Authors declare that there is no conflict of interest associated with this study.

References
[1] V.L. Miguéi, A. Freitas, P.J.V. Garcia and A. Silva, "Early segmentation of students according to their academic performance: A predictive modelling approach," Decision Support System, vol. 6, no. 5, pp. 65-78, 2018.
[2] S. J. Lakshmi and M. Thangaraj, "Recommender system for stimulating the learning skill of slow learner in higher educational institution using EDM," International Journal on Recent Technological Engineering, vol. 5, pp. 98-109, 2019.
[3] D. T. Ha, P. T. T. Loan, C. N. Giap and N. T. L. Huong, "An empirical study for student academic performance prediction using machine learning techniques," International Journal of Computer Science and Information Security (IJCSIS), vol. 18, no. 3, pp. 75-82, 2020.
[4] R. Umer, T. Susnjak, A. Mathrani and S. Suriadi, "On predicting academic performance with process mining in learning analytics," Journal of Resource Innovation and Teach Learning, vol. 78, pp. 155-168, 2017.
[5] O.H.T. Lu, A.Y.Q. Huang, J.C.H. Huang, A.J.Q Lin, H. Ogata et al., "Applying learning analytics for the early prediction of students’ academic performance in blended learning," Educational Technological Society, vol. 55, pp. 111-123, 2018.
[6] O. Viberg, M. Hatakka, O. Bälter, A. Mavroudi, "The current landscape of learning analytics in higher education," Computers in Human Behavior, vol. 18, pp. 1001-1222, 2018.
[7] M. S. B. M. Azmi and I. H. B. M. Paris, "Academic performance prediction based on voting technique," in 2011 IEEE 3rd International Conference on Communication Software and Networks, Calcutta, India, pp. 24-27, 2011.
[8] Tarandeep Kaur, Harjinder Kaur, "Machine Learning: An Internal Review", Journal of Emerging Technologies and Innovative Research, 5, no. 11, 6, 2018.
[9] C. Romero, P.G. Espejo, A. Zafra, J.R. Romero and S. Ventura, "Web usage mining for predicting final marks of students that use Moodle courses," Computer Application in Engineering and Education, vol. 65, pp. 555-578, 2013.
[10] A. M. Shahiri and W. Husain, "A review on predicting student's performance using data mining techniques," Procedia Computer Science, vol. 72, pp. 414-422, 2015.
[11] N. Thai-Nghe, L. Drumond, A. Krohn-Grimberghe and L. Schmidt-Thieme, "Recommender system for predicting student performance," Procedia Computer Science, vol. 20, pp. 55-65, 2010.
[12] S. Huang and N. Fang, "Predicting student academic performance in an engineering dynamics course: A comparison of four types of predictive mathematical models," Comput Education, vol. 55, no. 6, pp. 33-42, 2013.
[13] M. Imran, S. Latif, D. Mehmood and M. S. Shah, "Student academic performance prediction using supervised learning techniques," International Journal on Emerging Technologies in Learning, vol. 77, pp. 102-120, 2019.
[14] P. Strecht, L. Cruz, C. Soares, J. Mendes-Moreira and R. Abreu, "A Comparative Study of Classification and Regression Algorithms for Modelling Students’ Academic Performance," in Proc. ICEDM, Noida, India, pp. 55-64, 2015.
[15] P. Kamal and S. Ahuja, "An ensemble-based model for prediction of academic performance of students in undergrad professional course," Journal of Engineering Design and Technology, vol. 98, pp. 654-672, 2019.
[16] V. Skrbinjek and V. Dermol, "Predicting students’ satisfaction using a decision tree," Tert Education and Management, vol. 64, pp. 210-218, 2019.
[17] Dr. Antino Marelino. (2014). Customer Satisfaction Analysis based on Customer Relationship Management. International Journal of New Practices in Management and Engineering, 3(01), 07 - 12. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/26
[18] Dr. Sandip Kadam. (2014). An Experimental Analysis on performance of Content Management Tools in an Organization. International Journal of New Practices in Management and Engineering, 3(02), 01 - 07. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/27
[19] Ms. Nora Zilam Runera. (2014). Performance Analysis on Knowledge Management System on Project Management. International Journal of New Practices in Management and Engineering, 3(02), 08 - 13. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/28
[20] Mrs. Leena Rathi. (2014). Ancient Vedic Multiplication Based Optimized High Speed Arithmetic Logic. International Journal of New Practices in Management and Engineering, 3(03), 01 - 06. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/29 Kaur H, Kushwaha AS., "An elicit elucidation on the process of education data mining", International Conference on Intelligent Computing and Control Systems, ICCS 2019.
[21] S. Roy and A. Garg, "Predicting academic performance of student using classification techniques," in 2017 4th IEEE Uttar Pradesh Section International Conference on Electrical, Computer and Electronics (UPCON), Korat, Thailand, pp. 568-572, 2017.
[22] S. Poonam, S. Ahuja, V. Jaitly and S. Jain, "A framework to alleviate common problems from recommender system: A case study for technical course recommendation," Journal of Discrete Mathematical Sciences and Cryptography, vol. 23, no. 2, pp. 451-460, 2020.
[23] A. Rajak, A. K. Shrivastava and V. Vidushi, "Applying and comparing machine learning classification algorithms for predicting the results of students," Journal of Discrete Mathematical Sciences and Cryptography, vol. 23, no. 2, pp. 419-427, 2020.
[24] H. Guruler, A. Istanbullu and M. Karahasan, "A new student performance analysing system using knowledge discovery in higher educational databases," Computer Education, vol. 6, no. 5, pp. 125-138, 2010.
[25] A. Rajak, A. K. Shrivastava and V. Vidushi, "Applying and comparing machine learning classification algorithms for predicting the results of students," Journal of Discrete Mathematical Sciences and Cryptography, vol. 23, no. 2, pp. 419-427, 2020.
[26] A. Siddique, A. Jan, F. Majeed, A.I. Qahmash, N.N. Quadri et al., "Predicting Academic Performance Using an Efficient Model Based on Fusion of Classifiers," Applied Sciences, vol. 11, no. 24, pp. 11845, 2021.
[27] A. S. Hoffait and M. Schyns, "Early detection of university students with potential difficulties," Decision Support System, vol. 9, no. 5, pp. 5-20, 2017.