Introduction to Linear Regression & Train Your First Model.

Ritesh Kumar
4 min read · Jun 1, 2020

This post covers my second week of the course Deep Learning with PyTorch: Zero to GANs.

Learning Model
Create our first model.

In this blog post, we will look at what regression is, the types of regression, and linear regression using PyTorch, and we will learn how to train our first model.

Before going further, we should know what training data is and why it is important for training our model.

What is Training Data?

Training data is the data a machine learning model learns from in order to make predictions. Large datasets are typically used so that the model can learn well and give good results.

Why is it important to train our model?

First, it is impossible to train a model without training data. And if the available training data is incomplete or incorrect, the model will be trained inaccurately and will behave like an illiterate human that can't understand. Hence, it is important to train our model on accurate data so that it can achieve the best level of accuracy.

What is Regression?

Regression is a predictive modeling technique that describes the relationship between a dependent variable and one or more independent variables.

Types of Regression:

There are many types of regression, such as ‘Linear Regression’, ‘Polynomial Regression’, ‘Logistic Regression’, and others.

Linear Regression

Linear regression is a basic and very commonly used type of predictive analysis that usually works on continuous data. It is generally used for finding a linear relationship between the dependent variable and one or more independent variables.
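To make the idea concrete, here is a minimal sketch of the prediction a linear regression model makes: a weighted sum of the inputs plus a bias. The weights, bias, and input values below are made-up numbers chosen only for illustration.

```python
import torch

# Hypothetical weights and bias for a model with 3 input features (illustrative values only).
weights = torch.tensor([[2.0, 0.5, -1.0]])  # shape: (1 output, 3 inputs)
bias = torch.tensor([10.0])

# Two made-up rows of inputs.
x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 0.0, 1.0]])

# Linear regression prediction: y = x @ W^T + b
preds = x @ weights.t() + bias
print(preds)
```

Training a linear regression model means finding the weights and bias that bring these predictions as close as possible to the known targets.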

How To Train Our First Model?

Before getting started, we should know a few things:

A. What will we do with our model?

Our model is going to use information like a person’s age, sex, BMI, number of children, and smoking habit to predict their yearly medical charges.

B. Why is this model useful?

This kind of model is useful for insurance companies to determine the yearly insurance premium for a person.

C. How does our model get its data, and from where?

We use the download_url function from torchvision (torchvision.datasets.utils) to download the data as a CSV (comma-separated values) file.

The dataset for this problem is taken from https://www.kaggle.com/mirichoi0218/insurance. You can check this link for more information.

Let's get started.

STEP 1: Import the necessary libraries and packages.

#1. Import all the necessary libraries:

Now import the required packages:

#2. Download & Explore the data:

#3. To use the dataset we’ll use the read_csv function from the pandas library.
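As a rough sketch of this step, the snippet below loads CSV text with read_csv and takes a first look at it. The rows are invented for illustration and are not real records; the column names follow the layout I assume the Kaggle insurance file uses, and the file name insurance.csv in the comment is hypothetical.

```python
import io
import pandas as pd

# A few invented rows in the assumed column layout of the insurance CSV.
csv_text = """age,sex,bmi,children,smoker,region,charges
25,female,24.5,0,no,northwest,3000.0
40,male,31.2,2,yes,southeast,21000.0
33,male,28.0,1,no,southwest,5200.0
"""

# With the downloaded file you would pass its path instead,
# e.g. pd.read_csv("insurance.csv") (hypothetical file name).
dataframe = pd.read_csv(io.StringIO(csv_text))

print(dataframe.head())   # first few rows
print(dataframe.shape)    # (number of rows, number of columns)
print(dataframe.dtypes)   # which columns are numeric and which are text
```

Checking the dtypes early is useful because text columns such as sex and smoker have to be turned into numbers before training.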

For more visit My Notebook: CLICK HERE

STEP 2: Prepare the dataset for training:

We need to convert the data from a pandas DataFrame into PyTorch tensors for training. The first step is to convert it to NumPy arrays.

inputs_array, targets_array = dataframe_to_arrays(dataframe)
inputs_array, targets_array
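The dataframe_to_arrays helper is not shown in this post, so here is one way it could look. This is a sketch under assumptions: the column lists are my guess at the dataset layout, and the tiny DataFrame is invented just to show the output shapes.

```python
import pandas as pd

# Assumed column names for illustration; adjust them to the actual dataset.
input_cols = ["age", "sex", "bmi", "children", "smoker"]
categorical_cols = ["sex", "smoker"]
output_cols = ["charges"]

def dataframe_to_arrays(dataframe):
    """Convert a DataFrame into NumPy arrays of inputs and targets."""
    df = dataframe.copy(deep=True)  # leave the original DataFrame untouched
    for col in categorical_cols:
        # Replace each text category with an integer code (e.g. "no" -> 0, "yes" -> 1).
        df[col] = df[col].astype("category").cat.codes
    inputs_array = df[input_cols].to_numpy()
    targets_array = df[output_cols].to_numpy()
    return inputs_array, targets_array

# Tiny invented DataFrame to show the result.
dataframe = pd.DataFrame({
    "age": [25, 40], "sex": ["female", "male"], "bmi": [24.5, 31.2],
    "children": [0, 2], "smoker": ["no", "yes"], "charges": [3000.0, 21000.0],
})
inputs_array, targets_array = dataframe_to_arrays(dataframe)
print(inputs_array.shape, targets_array.shape)
```

Category codes are assigned in alphabetical order of the category labels, so the mapping stays the same as long as the set of labels does not change.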

Next, convert the NumPy arrays into PyTorch tensors:

inputs = torch.from_numpy(inputs_array).type(torch.float32)
targets = torch.from_numpy(targets_array).type(torch.float32)
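The training step later uses train_loader and val_loader. One common way to build them from the tensors is sketched below; the random stand-in tensors, the 15% validation split, and the batch size of 32 are arbitrary choices for illustration, not values taken from the notebook.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Stand-in tensors with made-up shapes: 100 rows, 5 input columns, 1 target column.
inputs = torch.randn(100, 5)
targets = torch.randn(100, 1)

# Pair each input row with its target.
dataset = TensorDataset(inputs, targets)

# Hold out part of the rows for validation (15% here is an arbitrary choice).
val_size = int(0.15 * len(dataset))
train_size = len(dataset) - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size])

batch_size = 32  # arbitrary choice for illustration
train_loader = DataLoader(train_ds, batch_size, shuffle=True)
val_loader = DataLoader(val_ds, batch_size)

# Peek at one batch to confirm the shapes.
xb, yb = next(iter(train_loader))
print(xb.shape, yb.shape)
```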


STEP 3: Create Our Model:

Now create a model using the InsuranceModel class.

Let’s check out the weights and biases of the model using model.parameters().
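The class definition is not reproduced in this post, so the following is only a sketch of what an InsuranceModel could look like. It assumes 5 input columns, a single linear layer, and L1 loss (mean absolute error); the actual notebook may differ on any of these.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

input_size = 5   # assumed inputs: age, sex, BMI, children, smoker
output_size = 1  # yearly charges

class InsuranceModel(nn.Module):
    def __init__(self):
        super().__init__()
        # A single linear layer: prediction = inputs @ weights^T + bias
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, xb):
        return self.linear(xb)

    def training_step(self, batch):
        inputs, targets = batch
        out = self(inputs)
        return F.l1_loss(out, targets)  # assumed loss: mean absolute error

    def validation_step(self, batch):
        inputs, targets = batch
        out = self(inputs)
        return {"val_loss": F.l1_loss(out, targets).detach()}

model = InsuranceModel()

# Print the name and shape of each parameter (the weights and the bias).
for name, param in model.named_parameters():
    print(name, tuple(param.shape))
```

The parameters start out as small random values, which is why the loss is large before any training has happened.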

STEP 4: Train the model to fit the data:

To train our model, we’ll use the fit function.

The evaluate function is used to calculate the loss on the validation set before training.
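Neither function is shown here, so below is a minimal sketch of how fit and evaluate could be written. It assumes plain SGD and L1 loss, and it is demonstrated on synthetic data (y = 3x + 2) with a bare nn.Linear model rather than on the insurance data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader

def evaluate(model, val_loader):
    """Return the average of the per-batch L1 losses on the validation set."""
    with torch.no_grad():
        losses = [F.l1_loss(model(xb), yb) for xb, yb in val_loader]
    return {"val_loss": torch.stack(losses).mean().item()}

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    """Train with gradient descent and record the validation loss after each epoch."""
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        for xb, yb in train_loader:
            loss = F.l1_loss(model(xb), yb)  # assumed loss function
            loss.backward()                  # compute gradients
            optimizer.step()                 # update weights and bias
            optimizer.zero_grad()            # reset gradients for the next batch
        history.append(evaluate(model, val_loader))
    return history

# Synthetic data for the demo: y = 3x + 2.
torch.manual_seed(0)
x = torch.rand(200, 1)
y = 3 * x + 2
train_loader = DataLoader(TensorDataset(x[:160], y[:160]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(x[160:], y[160:]), batch_size=32)

model = nn.Linear(1, 1)
before = evaluate(model, val_loader)  # loss before training
history = fit(100, 0.05, model, train_loader, val_loader)
print(before["val_loss"], history[-1]["val_loss"])
```

Calling evaluate before fit gives a baseline, so you can tell whether training is actually reducing the loss.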

Here, our validation loss is 10881.130859375.

Train the model 4–5 times and try to get to as low a loss as possible.

# Try different learning rates and different numbers of epochs.
epochs = 2000  # number of epochs
lr = 1e-2  # learning rate
history1 = fit(epochs, lr, model, train_loader, val_loader)
epochs = 1500
lr = 1e-3
history2 = fit(epochs, lr, model, train_loader, val_loader)
epochs = 1000
lr = 1e-4
history3 = fit(epochs, lr, model, train_loader, val_loader)
epochs = 500
lr = 1e-5
history4 = fit(epochs, lr, model, train_loader, val_loader)
epochs = 100
lr = 1e-6
history5 = fit(epochs, lr, model, train_loader, val_loader)

The final validation loss of the model is:

Log the final val_loss:

STEP 5: Make predictions using the trained model:

The predict_single function is used to make predictions on a single input.
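A possible shape for predict_single is sketched below. The untrained nn.Linear stand-in model and the input values are hypothetical, so the printed prediction here is meaningless; the point is only to show how a single row is passed through the model.

```python
import torch
import torch.nn as nn

def predict_single(x, target, model):
    """Run the model on one input row and print it next to the known target."""
    inputs = x.unsqueeze(0)          # add a batch dimension: (5,) -> (1, 5)
    with torch.no_grad():            # no gradients needed for prediction
        predictions = model(inputs)
    prediction = predictions[0]      # drop the batch dimension again
    print("Input:", x)
    print("Target:", target)
    print("Prediction:", prediction)
    return prediction

# Untrained stand-in model and a made-up input row.
model = nn.Linear(5, 1)
x = torch.tensor([30.0, 1.0, 25.0, 0.0, 0.0])
target = torch.tensor([4000.0])
pred = predict_single(x, target, model)
```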

Here the target is 1319.4648 and the model predicts 1193.1343, which is close to the target value, so our model is working reasonably well.

For this sample, the prediction is within about 10% of the target.
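The gap between the target and the prediction quoted above can be checked with a little arithmetic. Note that this measures the error on one sample only, not the accuracy of the model over the whole validation set.

```python
target = 1319.4648
prediction = 1193.1343

# How far off the prediction is, in absolute terms and relative to the target.
abs_error = abs(target - prediction)
rel_error = abs_error / target

print(f"Absolute error: {abs_error:.4f}")  # Absolute error: 126.3305
print(f"Relative error: {rel_error:.1%}")  # Relative error: 9.6%
```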

Resources:

  1. Use this data visualization cheatsheet for reference: Click Here.
  2. Read through the Pandas documentation to understand how we’re converting categorical variables into numbers.
  3. For good loss functions: Click Here.
  4. For full code check My Notebook.
  5. Google Search
  6. Jovian.ml

Final Advice: Good things take time.

I hope you liked this post. Feel free to share your ideas, thoughts, and suggestions below.

Good luck!

Connect with me: My LinkedIn
