APPLICATION OF DATA ANALYSIS AND PREPROCESSING IN PYTHON TO THE HOUSING PRICE PREDICTION PROBLEM

Authors

  • Minh Thai Tran, Faculty of Information Technology, Ho Chi Minh City University of Foreign Languages - Information Technology, https://orcid.org/0000-0002-7671-4785
  • Khánh Vy Nguyễn Mai, Faculty of Information Technology, HUFLIT
  • Thảo Ngân Trần Ngọc, Faculty of Information Technology, HUFLIT

Keywords:

Predicting house prices, data analysis, data preprocessing, Support Vector Regressor, Random Forest Regressor, Python

Abstract

Predictive modeling is one of the most important and widely applicable problems in machine learning. It underpins many applications in daily life, from familiar ones such as weather forecasting and price prediction to more complex ones such as disease diagnosis, fraud detection, and autonomous driving. Its goal is to predict the future outcome of an event or the future value of a variable from historical data, by learning automatically from that data and building a prediction model. This paper focuses on building a model to predict housing prices in Ho Chi Minh City. Using data analysis and preprocessing techniques from Python libraries, the data is cleaned: missing values are handled, duplicates and outliers are addressed, categorical variables are encoded, the data is normalized, and feature selection and dimensionality reduction are performed. Machine learning models are then trained to predict housing prices using the Support Vector Regressor (SVR) and Random Forest Regressor (RFR) methods. Experimental results show that RFR captures complex, nonlinear relationships, is less affected by outliers and noise, and achieves better predictive performance than SVR.
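The workflow described in the abstract can be illustrated with a minimal sketch. It assumes pandas and scikit-learn as the Python libraries, which the abstract does not name, and it runs on synthetic data with hypothetical column names (area_m2, bedrooms, district, price) rather than the paper's dataset. Feature selection and dimensionality reduction are left out for brevity, and the hyperparameters are arbitrary choices, not the authors' settings.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data; column names are hypothetical, not the paper's.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "area_m2": rng.uniform(30, 200, n),
    "bedrooms": rng.integers(1, 6, n).astype(float),
    "district": rng.choice(["D1", "D3", "D7"], n),
})
df["price"] = 50 * df["area_m2"] + 300 * df["bedrooms"] + rng.normal(0, 200, n)
df.loc[::25, "bedrooms"] = np.nan                      # inject missing values
df = pd.concat([df, df.iloc[:5]], ignore_index=True)   # inject duplicate rows

# Cleaning: drop duplicate rows, then drop target outliers with the IQR rule.
df = df.drop_duplicates().reset_index(drop=True)
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

numeric = ["area_m2", "bedrooms"]
categorical = ["district"]

# Preprocessing: impute and scale numeric columns, one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X = df[numeric + categorical]
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate models; hyperparameters are illustrative assumptions.
models = {
    "SVR": SVR(kernel="rbf", C=100.0),
    "RFR": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model behind its own copy of the preprocessing step and
# record the R^2 score on the held-out split.
scores = {}
for name, model in models.items():
    pipe = Pipeline([("prep", clone(preprocess)), ("model", model)])
    pipe.fit(X_train, y_train)
    scores[name] = r2_score(y_test, pipe.predict(X_test))

print(scores)
```

Wrapping preprocessing and model in a single Pipeline means the imputer, scaler, and encoder are fitted on the training split only, which avoids leaking test-set statistics into the comparison. The scores from this synthetic data say nothing about the paper's results.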

Published

31-10-2024

How to Cite

Tran, M. T., Nguyễn Mai, K. V., & Trần Ngọc, T. N. (2024). APPLICATION OF DATA ANALYSIS AND PREPROCESSING IN PYTHON TO THE HOUSING PRICE PREDICTION PROBLEM. HUFLIT Journal of Science, 8(4), 72. Retrieved from https://hjs.huflit.edu.vn/index.php/hjs/article/view/217

Section

Articles