Completion Date
Summer 8-10-2019
Document Type
Thesis
Degree Name
Master of Science (MS)
Program or Discipline Name
Analytics
First Advisor
Siamak Aram
Second Advisor
Kevin Huggins
Abstract
Students are chronically absent when they miss at least 15 days of the school year. Past researchers have identified income and environment as factors that affect school absenteeism. Alabama is a poor state with a high crime rate. The hypothesis for this research is that the absenteeism of female students in Alabama is high. Do we reject or fail to reject this hypothesis? If we fail to reject it, what other factors can affect absenteeism in schools? How can we best predict the absenteeism of female students in Alabama? What is the effect of bad data on predictive models? This research aims to answer these questions.
Machine learning has proven to be one of the best methods for making good predictions that support better decision-making. Different machine learning models are used for making predictions, but the outstanding question is how to identify the best model to predict dependent features. Are features essential when choosing the type of model to use for prediction?
This research aims to analyze and compare the percentage of prediction and the accuracy of prediction of supervised machine learning models while considering features. Based on the findings, a recommendation was made for the best model to predict female students' absenteeism in Alabama school districts. The hypothesis that the absenteeism of female students in Alabama is high was rejected. This research is limited to supervised machine learning models. Information on male students in Alabama is not included in this research.
Keywords: Absenteeism, Machine Learning, Predictive Model, Naïve, Random Forest, Boruta
Recommended Citation
Okelana, F. (2019). PREDICTING ABSENTEEISM OF FEMALE STUDENTS IN ALABAMA. Retrieved from https://digitalcommons.harrisburgu.edu/anms_dandt/2