Predict House Prices with Linear Regression with Python. Data science and data analytics can be used to help real estate professionals gain insights into the real estate market and make more informed decisions. Data science can help identify trends in the market, analyze consumer behaviour, and predict future market movements. Data analytics can be used to gain insights into pricing trends, identify potential investments, and manage risk.

Data science and data analytics can also be used to optimize marketing strategies, improve customer service, and better manage customer relationships. Additionally, data science and data analytics can be used to identify and analyze patterns in the real estate industry, identify new opportunities, and improve operational efficiency.

In this article, we will be working on a dataset consisting of information about the house’s location, price, and other aspects of buying the house. We will learn how to make a model that can give us a good prediction of the price of the house based on different variables. We are going to use Linear Regression for this dataset and see if it gives us good accuracy or not.

Video: – Estimating price of houses (Data Analysis & Statics)

Copy to Clipboard

Step 2. Load Boston data from sklearn datasets and print boston house data keys.

Copy to Clipboard

Step 3. Add price column with data set

Now, add the “PRICE” index with our data for analysis of price with a different – 2 Index. also, here we find min, max & median values of the house price

Copy to Clipboard

Step 4. To analyse data, we find out basic information like null values and number of columns.

Copy to Clipboard

Output:

Note: we do not have any null values in data set.

Step 5. Replace Null Values

If you have null values, replace these null values with the median values of features.

Copy to Clipboard

Step 6. Find insight values.

Find out the description of the price column data set to see the most occurrence price & minimum price occurrence values.

Copy to Clipboard

Step 7. Correlation with price.

To use key columns for predicting the price of the house, find out the most co-related key column.

A. Correlation Matrix: It is a table showing correlation coefficients between sets of features. Values close to 1 and -1 indicate strong correlation.

Copy to Clipboard

Output:

B. Relationship b/w “PRICE” & “RM”(Room per dwelling).

Copy to Clipboard

Output:

C. Relationship b/w age of house and price.

Copy to Clipboard

Output:

D. Relation b/w price and per capita crime rate by town.

Copy to Clipboard

Output:

Note: According to the previous data analysis, find out room per dwelling. It shows the highest correlation with price. Next, to predict the price value of the house, use room per dwelling as a dependent data set for linear regression. 

Linear Regression Model

Guiding step for predicting price of the house.

Step 1. Import dependent functions and libraries.

Copy to Clipboard

Step 2. Split data set into test and training data set.

According to the previous data visualization, “RM” feature shows a strong correlation with the price. So, we take test and train data sets of RM and Price features.

Note: The whole data set is divided into the training set and test set, in which the train in the sklearn library is used_ test_ Split function, which is used to divide the data set. Set the training data set to 80%.

Copy to Clipboard

Output:

So, here we divide two datasets into training and testing data.

Step 3. Reshape array with numpy array.

Establish a linear regression model and use the training set to train_ Train and Y_ Trains are all one-dimensional arrays, which need to be converted into two-dimensional matrices. Here, we use a reshape function in a numpy library to transform them into two-dimensional matrices.

Copy to Clipboard

Step 4. Define linear regression function to train our dataset to find out predicted values.

Copy to Clipboard

Step 5. Test model with test data set.

Use the test set data to test, get the prediction results, and draw a line chart between the predicted results and the real results, which can directly show the fitting effect.

Copy to Clipboard

Note: “pred” is the predicted value of price for “RM” test data sets.

Copy to Clipboard

Output:

Step 6. Plot regression line.

Also, draw the fitting function and test set data, and you can directly see the fit degree of the regression function and training set. The model uses the function fitted by the training set data to predict, so the fitting function can be obtained by inputting the training set data into the model.

Copy to Clipboard

Output:

Predict House Prices with Linear Regression

Step 7. Find out the standard deviation of error.

Copy to Clipboard

Output:

Conclusion:

This data science and data analytics tutorial in real estate has provided you with an overview of how data science and analytics tools can be used to improve decision-making in the real estate industry. You have seen how data science and analytics can be used to identify trends, optimize marketing campaigns, and analyze customer behaviour. With the help of this tutorial, you now have the necessary knowledge to use data science and analytics to make better decisions in the real estate industry.

Also you may like – A Guide to Face Detection in Python