
by Jawad Haider

03 - Basic PyTorch Neural Network


Copyright Qalmaqihir
For more information, visit us at www.github.com/qalmaqihir/




Now it’s time to put the pieces together. In this section we’ll:

* create a multi-layer deep learning model
* load data
* train and validate the model

We’ll also introduce a new step:

* save and load a trained model

Our goal is to develop a model capable of classifying an iris plant based on four features. This is a multi-class classification problem in which each sample belongs to ONE of 3 classes (Iris setosa, Iris versicolor or Iris virginica). The network will have 4 input neurons (the flower measurements) and 3 output neurons (one score per class). Our loss function will compare the target label (ground truth) to the corresponding output score.

NOTE: Multi-class classifications usually involve converting the target vector to a one-hot encoded matrix. That is, if 5 labels show up as
tensor([0,2,1,0,1])
then we would encode them as:
tensor([[1, 0, 0],
        [0, 0, 1],
        [0, 1, 0],
        [1, 0, 0],
        [0, 1, 0]])
This is easily accomplished with torch.nn.functional.one_hot().
However, our loss function torch.nn.CrossEntropyLoss() takes care of this for us; it accepts the integer class labels directly.
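
For illustration only (this step isn’t needed in this notebook), a minimal sketch of that conversion:

import torch
import torch.nn.functional as F

labels = torch.tensor([0, 2, 1, 0, 1])
print(F.one_hot(labels, num_classes=3))  # prints the 5x3 matrix shown above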

Perform standard imports

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, TensorDataset
from sklearn.model_selection import train_test_split

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

Create a model class

For this exercise we’re using the Iris dataset. Since the three classes can’t all be separated by straight lines (versicolor and virginica overlap), we should include at least one hidden layer in our model.

In the forward section we’ll use the rectified linear unit (ReLU) function

f(x) = max(0, x)

as our activation function. This is available as a full module, torch.nn.ReLU, or as a functional call, torch.nn.functional.relu.
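
Both forms behave identically; a quick sketch:

x = torch.tensor([-2.0, -0.5, 0.0, 3.0])
print(nn.ReLU()(x))   # tensor([0., 0., 0., 3.])
print(F.relu(x))      # identical output from the functional form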

class Model(nn.Module):
    def __init__(self, in_features=4, h1=8, h2=9, out_features=3):
        super().__init__()
        self.fc1 = nn.Linear(in_features, h1)   # input layer -> hidden layer 1
        self.fc2 = nn.Linear(h1, h2)             # hidden layer 1 -> hidden layer 2
        self.out = nn.Linear(h2, out_features)   # hidden layer 2 -> output layer

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.out(x)   # raw scores (logits); CrossEntropyLoss handles the softmax
        return x
# Instantiate the Model class using parameter defaults:
torch.manual_seed(32)
model = Model()
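
With these defaults the model is small enough to count its learnable parameters by hand; a quick check:

# weights + biases: (4*8 + 8) + (8*9 + 9) + (9*3 + 3) = 151
print(sum(p.numel() for p in model.parameters()))  # 151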

Load the iris dataset

df = pd.read_csv('../Data/iris.csv')
df.head()
   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
0                5.1               3.5                1.4               0.2     0.0
1                4.9               3.0                1.4               0.2     0.0
2                4.7               3.2                1.3               0.2     0.0
3                4.6               3.1                1.5               0.2     0.0
4                5.0               3.6                1.4               0.2     0.0
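
Before modeling, it’s worth a quick sanity check that the classes are balanced; the standard iris dataset has 50 samples of each class:

print(df.shape)                     # (150, 5)
print(df['target'].value_counts())  # 50 samples for each of the 3 classes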

Plot the dataset

The iris dataset has 4 features. To get an idea of how they correlate, we can plot four different relationships among them.
We’ll use the index positions of the columns to grab their names in pairs with plots = [(0,1),(2,3),(0,2),(1,3)].
Here (0,1) sets “sepal length (cm)” as x and “sepal width (cm)” as y.

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10,7))
fig.tight_layout()

plots = [(0,1),(2,3),(0,2),(1,3)]
colors = ['b', 'r', 'g']
labels = ['Iris setosa','Iris versicolor','Iris virginica']  # ordered to match target codes 0, 1, 2

for i, ax in enumerate(axes.flat):
    for j in range(3):
        x = df.columns[plots[i][0]]
        y = df.columns[plots[i][1]]
        ax.scatter(df[df['target']==j][x], df[df['target']==j][y], color=colors[j])
        ax.set(xlabel=x, ylabel=y)

fig.legend(labels=labels, loc=3, bbox_to_anchor=(1.0,0.85))
plt.show()

Perform a Train/Test Split

X = df.drop('target',axis=1).values
y = df['target'].values

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2,random_state=33)

X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)
# y_train = F.one_hot(torch.LongTensor(y_train))  # not needed with Cross Entropy Loss
# y_test = F.one_hot(torch.LongTensor(y_test))
y_train = torch.LongTensor(y_train)
y_test = torch.LongTensor(y_test)
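
A quick shape check confirms the 80/20 split of the 150 samples:

print(X_train.shape)  # torch.Size([120, 4])
print(X_test.shape)   # torch.Size([30, 4])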

Prepare DataLoader

For this analysis we don’t need to create a custom Dataset object, but we should take advantage of PyTorch’s DataLoader tool. Even though our dataset is small (120 training samples), we’ll load it in two batches. This technique becomes very helpful with large datasets. Note that the simple training loop below still passes the full training set through the model at once; the loaders here mainly demonstrate the batching workflow. We pair the features and labels with TensorDataset so that each batch carries both.

Note that scikit-learn already shuffled the source dataset before preparing the train and test sets. We’ll still benefit from the DataLoader shuffle utility during training if we make multiple passes through the dataset.

trainloader = DataLoader(TensorDataset(X_train, y_train), batch_size=60, shuffle=True)

testloader = DataLoader(TensorDataset(X_test, y_test), batch_size=60, shuffle=False)
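
Since the features and labels are paired, each pass over trainloader now yields two (features, labels) batches of 60:

for batch_X, batch_y in trainloader:
    print(batch_X.shape, batch_y.shape)
# torch.Size([60, 4]) torch.Size([60])
# torch.Size([60, 4]) torch.Size([60])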

Define the loss function and optimizer

As before, we’ll utilize Cross Entropy Loss with torch.nn.CrossEntropyLoss().
For the optimizer, we’ll use a variation of Stochastic Gradient Descent called Adam (short for Adaptive Moment Estimation), with torch.optim.Adam().

# Re-seed and re-instantiate the model so this training run is reproducible:
torch.manual_seed(4)
model = Model()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
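
Note that CrossEntropyLoss expects the raw, unnormalized scores (logits) that our model outputs; it applies log-softmax internally. A minimal sketch of that equivalence for a single sample:

logits = torch.tensor([[2.0, 0.5, -1.0]])
target = torch.tensor([0])
manual = -F.log_softmax(logits, dim=1)[0, 0]
print(criterion(logits, target).item(), manual.item())  # the two values match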

Train the model

epochs = 100
losses = []

for i in range(1, epochs+1):
    y_pred = model(X_train)            # forward pass on the full training set
    loss = criterion(y_pred, y_train)
    losses.append(loss.item())         # store the float value, detached from the graph

    # a neat trick to save screen space:
    if i%10 == 1:
        print(f'epoch: {i:2}  loss: {loss.item():10.8f}')

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
epoch:  1  loss: 1.09568226
epoch: 11  loss: 0.98190653
epoch: 21  loss: 0.75652307
epoch: 31  loss: 0.49447522
epoch: 41  loss: 0.34981874
epoch: 51  loss: 0.22807853
epoch: 61  loss: 0.13547322
epoch: 71  loss: 0.09162075
epoch: 81  loss: 0.07378192
epoch: 91  loss: 0.06546164

Plot the loss function

plt.plot(range(epochs), losses)
plt.ylabel('Loss')
plt.xlabel('epoch');

Validate the model

Now we run the test set through the model to see whether its loss is comparable to what we saw on the training data.

# TO EVALUATE THE ENTIRE TEST SET
with torch.no_grad():
    y_val = model(X_test)
    loss = criterion(y_val, y_test)
print(f'{loss:.8f}')
0.06246195
correct = 0
with torch.no_grad():
    for i, data in enumerate(X_test):
        y_val = model(data)
        print(f'{i+1:2}. {str(y_val):38}  {y_test[i]}')
        if y_val.argmax().item() == y_test[i]:
            correct += 1
print(f'\n{correct} out of {len(y_test)} = {100*correct/len(y_test):.2f}% correct')
 1. tensor([-0.3355,  7.3630,  1.3783])     1
 2. tensor([0.2775, 8.1554, 0.4269])        1
 3. tensor([ 11.9969,   6.1847, -19.1976])  0
 4. tensor([-2.0187,  7.9664,  4.2447])     1
 5. tensor([-6.1348,  7.9516, 11.0913])     2
 6. tensor([-10.2635,   8.3101,  17.9998])  2
 7. tensor([ 12.0542,   6.4321, -19.2909])  0
 8. tensor([ 12.9507,   6.4819, -20.7540])  0
 9. tensor([-5.7723,  8.2435, 10.5083])     2
10. tensor([-7.8867,  8.6126, 14.0731])     2
11. tensor([-8.7055,  8.6074, 15.4337])     2
12. tensor([ 11.6358,   5.8167, -18.6220])  0
13. tensor([-8.1009,  8.2331, 14.3888])     2
14. tensor([-2.0791,  7.7752,  4.3188])     1
15. tensor([-6.0828,  8.3916, 11.0586])     2
16. tensor([0.1360, 7.8660, 0.6409])        1
17. tensor([-4.0875,  7.7217,  7.6642])     2
18. tensor([ 13.1522,   6.5911, -21.0798])  0
19. tensor([-1.5644,  8.0222,  3.4754])     1
20. tensor([-6.2859,  8.9728, 11.4248])     2
21. tensor([ 12.3859,   6.2571, -19.8275])  0
22. tensor([ 13.8200,   7.0859, -22.1528])  0
23. tensor([-8.8470,  8.3180, 15.6476])     2
24. tensor([ 12.1979,   6.1264, -19.5260])  0
25. tensor([-5.8084,  7.5468, 10.5340])     2
26. tensor([-4.4526,  7.7876,  8.2865])     2
27. tensor([-1.4284,  7.7786,  3.2328])     1
28. tensor([ 0.5356,  7.5360, -0.0492])     1
29. tensor([-5.8230,  8.1573, 10.5975])     2
30. tensor([-5.2569,  7.7476,  9.6105])     2

29 out of 30 = 96.67% correct

Here we can see that #17 was misclassified: its class 1 score (7.7217) narrowly beat the score for the true class 2 (7.6642).
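
As a side note, the per-sample loop above is handy for inspection, but the same accuracy can be computed in one vectorized step:

with torch.no_grad():
    preds = model(X_test).argmax(dim=1)
print(f'{(preds == y_test).sum().item()} out of {len(y_test)} correct')  # 29 out of 30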

Save the trained model to a file

Right now the model has been trained and validated, and it seems to correctly classify an iris about 97% of the time. Let’s save this to disk.
The tools we’ll use are torch.save() and torch.load().

There are two basic ways to save a model.

The first saves/loads the state_dict (learned parameters) of the model, but not the model class. The syntax follows:
Save: torch.save(model.state_dict(), PATH)

Load: model = TheModelClass(*args, **kwargs)
      model.load_state_dict(torch.load(PATH))
      model.eval()

The second saves the entire model, including its class and parameters, as a pickle file. Care must be taken if you want to load this into another notebook: the model’s class definition must be available wherever the file is loaded.
Save: torch.save(model, PATH)

Load: model = torch.load(PATH)
      model.eval()

In either method, you must call model.eval() to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results.
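
For our particular model the call is technically a no-op, since it has no dropout or batch-norm layers, but it’s still good practice. A quick sketch of what eval() toggles:

model.eval()            # sets every submodule's training flag to False
print(model.training)   # False
model.train()           # switches back for further training
print(model.training)   # True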

For more information visit https://pytorch.org/tutorials/beginner/saving_loading_models.html

Save the model

torch.save(model.state_dict(), 'IrisDatasetModel.pt')

Load a new model

We’ll load a new model object and test it as we did before to make sure it works.

new_model = Model()
new_model.load_state_dict(torch.load('IrisDatasetModel.pt'))
new_model.eval()
Model(
  (fc1): Linear(in_features=4, out_features=8, bias=True)
  (fc2): Linear(in_features=8, out_features=9, bias=True)
  (out): Linear(in_features=9, out_features=3, bias=True)
)
with torch.no_grad():
    y_val = new_model(X_test)
    loss = criterion(y_val, y_test)
print(f'{loss:.8f}')
0.06246195

Apply the model to classify new, unseen data

mystery_iris = torch.tensor([5.6,3.7,2.2,0.5])

Let’s plot this new iris in yellow to see where it falls in relation to the others:

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10,7))
fig.tight_layout()

plots = [(0,1),(2,3),(0,2),(1,3)]
colors = ['b', 'r', 'g']
labels = ['Iris setosa','Iris versicolor','Iris virginica','Mystery iris']  # class labels ordered to match target codes 0, 1, 2

for i, ax in enumerate(axes.flat):
    for j in range(3):
        x = df.columns[plots[i][0]]
        y = df.columns[plots[i][1]]
        ax.scatter(df[df['target']==j][x], df[df['target']==j][y], color=colors[j])
        ax.set(xlabel=x, ylabel=y)

    # Add a plot for our mystery iris:
    ax.scatter(mystery_iris[plots[i][0]],mystery_iris[plots[i][1]], color='y')

fig.legend(labels=labels, loc=3, bbox_to_anchor=(1.0,0.85))
plt.show()

Now run it through the model:

with torch.no_grad():
    scores = new_model(mystery_iris)
    print(scores)
    print()
    print(labels[scores.argmax()])
tensor([ 12.2116,   7.1285, -19.5247])

Iris setosa
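
If we’d rather see class probabilities than raw scores, we can apply softmax ourselves; a short sketch:

with torch.no_grad():
    probs = F.softmax(new_model(mystery_iris), dim=0)
print(probs)   # the Iris setosa entry dominates, close to 1.0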

Great job!