Database Normalization and Performance Prediction Using Linear Regression on a Student Dataset

Authors

  • Laveena Ganwani, Assistant Professor, Department of Computer Science, Sophia College (Autonomous), Ajmer, Rajasthan, India

DOI:

https://doi.org/10.63671/ijsssr.v3i4.542

Keywords:

Database Normalization, Linear Regression, Feature Engineering, Educational Data Mining, Student Performance Prediction

Abstract

This paper presents a practical integration of database management systems (DBMS) and machine learning to forecast students' academic performance. The Student Performance Dataset, which comprises 1000 records, is first structured and normalized in SQL up to Third Normal Form (3NF) to remove redundancy and guarantee referential integrity. After normalization, the tables are merged and feature engineering is applied, including one-hot encoding of categorical attributes and generation of the target variable (mean score). The engineered features are then used in a linear regression model to predict each student's average score. The dataset is split into training and test sets (80:20 ratio), and model performance is evaluated using mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²). The findings demonstrate a moderate degree of predictive power, suggesting that academic performance is influenced by variables such as exam preparation courses, parents' educational attainment, and lunch type. This research shows how structured database design can efficiently support machine learning workflows in a real-world setting.
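The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the table and column names (lunch_type, prep_course, student, score), the synthetic data, and the helper functions one_hot and fit_ols are all assumptions made for illustration, and only two of the categorical attributes are included. The sketch uses Python's standard library only, with SQLite standing in for the DBMS and a small normal-equations solver standing in for a regression library.

```python
import math
import random
import sqlite3

# Hypothetical 3NF-style schema: categorical attributes live in lookup tables,
# and student/score rows reference them by key (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lunch_type  (lunch_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE prep_course (prep_id  INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    lunch_id   INTEGER REFERENCES lunch_type(lunch_id),
    prep_id    INTEGER REFERENCES prep_course(prep_id));
CREATE TABLE score (
    student_id INTEGER PRIMARY KEY REFERENCES student(student_id),
    math REAL, reading REAL, writing REAL);
""")
conn.executemany("INSERT INTO lunch_type VALUES (?, ?)",
                 [(1, "free/reduced"), (2, "standard")])
conn.executemany("INSERT INTO prep_course VALUES (?, ?)",
                 [(1, "none"), (2, "completed")])

# Synthetic records, NOT the paper's data: scores follow a made-up linear rule plus noise.
rng = random.Random(0)
for sid in range(1, 201):
    lunch_id, prep_id = rng.choice([1, 2]), rng.choice([1, 2])
    base = 60 + 8 * (lunch_id == 2) + 6 * (prep_id == 2)
    conn.execute("INSERT INTO student VALUES (?, ?, ?)", (sid, lunch_id, prep_id))
    conn.execute("INSERT INTO score VALUES (?, ?, ?, ?)",
                 (sid, *(base + rng.gauss(0, 5) for _ in range(3))))

# Merge the normalized tables and derive the target variable (mean score).
rows = conn.execute("""
    SELECT l.name, p.name, (s.math + s.reading + s.writing) / 3.0
    FROM student st
    JOIN lunch_type  l ON l.lunch_id   = st.lunch_id
    JOIN prep_course p ON p.prep_id    = st.prep_id
    JOIN score       s ON s.student_id = st.student_id
    ORDER BY st.student_id""").fetchall()

def one_hot(values):
    """Indicator columns for every level except the first (in sorted order),
    which avoids collinearity with the intercept column."""
    levels = sorted(set(values))[1:]
    return [[1.0 if v == lev else 0.0 for lev in levels] for v in values]

lunch_cols = one_hot([r[0] for r in rows])   # indicator for "standard"
prep_cols = one_hot([r[1] for r in rows])    # indicator for "none"
X = [[1.0] + a + b for a, b in zip(lunch_cols, prep_cols)]
y = [r[2] for r in rows]

def fit_ols(X, y):
    """Ordinary least squares: solve (X^T X) b = X^T y by Gauss-Jordan elimination."""
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]

# 80:20 train/test split on shuffled row indices.
idx = list(range(len(y)))
random.Random(42).shuffle(idx)
cut = int(0.8 * len(idx))
train, test = idx[:cut], idx[cut:]

coef = fit_ols([X[i] for i in train], [y[i] for i in train])
pred = [sum(c * x for c, x in zip(coef, X[i])) for i in test]
actual = [y[i] for i in test]

# Evaluation metrics on the held-out test set.
errors = [a - p for a, p in zip(actual, pred)]
mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
mean_actual = sum(actual) / len(actual)
r2 = 1 - sum(e * e for e in errors) / sum((a - mean_actual) ** 2 for a in actual)
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  R2={r2:.2f}")
```

Dropping one level per categorical attribute keeps the design matrix full rank when an intercept is present. In practice, a library such as pandas or scikit-learn would typically replace the hand-written encoding and solver.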

Published

2026-03-10

Section

Articles

How to Cite

Database Normalization and Performance Prediction Using Linear Regression on a Student Dataset. (2026). International Journal of Science and Social Science Research, 3(4), 193-212. https://doi.org/10.63671/ijsssr.v3i4.542
