Paper: Model Comparison - NPORS US Voting Data

Paper Title: Comparison of Logistic Regression and Random Forest Models to Predict Voting Outcomes

Abstract

This paper compares logistic regression and random forest models. Both are applied to the 2021 National Public Opinion Reference Survey (NPORS) to predict each respondent’s political party vote. The dataset and the background of each model are summarized. The models are implemented in R. The random forest model achieves slightly higher final accuracy, though the difference is not significant. The logistic regression model is shown to be easier to interpret and faster to train. The variable importance plots reveal key features of the dataset, such as the respondent’s ideal government size. Newer model selection algorithms for logistic regression and more comprehensive parameter optimization for random forests are identified as potential improvements.

To view the paper (PDF), click here.