MNIST Dataset

This dataset contains 28x28 grayscale images of handwritten digits from 0 to 9. The train.csv file contains 42,000 labeled examples, one flattened image per row.
In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
In [2]:
!pip install git+https://github.com/netbrainml/nbml.git
from nbml.workshops.mnist.utils import *
Collecting git+https://github.com/netbrainml/nbml.git
  Cloning https://github.com/netbrainml/nbml.git to /tmp/pip-req-build-djsnmf4m
  Running command git clone -q https://github.com/netbrainml/nbml.git /tmp/pip-req-build-djsnmf4m
Building wheels for collected packages: nbml
  Building wheel for nbml (setup.py) ... - \ done
  Created wheel for nbml: filename=nbml-0.0.1-cp36-none-any.whl size=11985 sha256=43521fcea8cfefa03648fec3b8f911db413985922de04cae90760d8e6c381774
  Stored in directory: /tmp/pip-ephem-wheel-cache-0vssqhle/wheels/3a/b1/27/4431be29eb1fbe8f0912364e44fecc078167c19415ed958b11
Successfully built nbml
Installing collected packages: nbml
Successfully installed nbml-0.0.1

To feed the data into the model, we first need to wrap the NumPy arrays as PyTorch tensors.

In [3]:
# getMNIST (from nbml) loads train.csv and returns train/validation splits of features and labels
x_train, x_test, y_train, y_test = getMNIST("../input/train.csv")
# Wrap the NumPy arrays as PyTorch tensors
x_train, x_test = torch.Tensor(x_train), torch.Tensor(x_test)
y_train, y_test = torch.Tensor(y_train), torch.Tensor(y_test)
shapes(x_train, x_test, y_train, y_test)
arg_0: torch.Size([37800, 784])
arg_1: torch.Size([4200, 784])
arg_2: torch.Size([37800])
arg_3: torch.Size([4200])
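
Since each row is a flattened 28x28 image, we can sanity-check the data by reshaping one row back into a square and plotting it. A minimal sketch (not one of the original cells); it assumes matplotlib is available, which the %matplotlib inline call above suggests:

import matplotlib.pyplot as plt

plt.imshow(x_train[0].reshape(28, 28), cmap="gray")   # un-flatten the 784 pixels into a 28x28 image
plt.title(f"label: {int(y_train[0])}")
plt.show()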

What are DataLoaders?

DataLoaders give us iterators that zip features and targets together and serve them in batches, reshuffled each epoch if requested (see the quick check after the next cell).
In [4]:
from torch.utils.data import DataLoader, TensorDataset

tdl = DataLoader(TensorDataset(x_train, y_train), batch_size=256, shuffle=True)  # training batches, reshuffled each epoch
vdl = DataLoader(TensorDataset(x_test, y_test), batch_size=256, shuffle=True)    # validation batches (shuffling is harmless here but not required)
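
As a quick sanity check (a sketch, not one of the original cells), pulling a single batch from the training DataLoader shows the zipped (features, targets) pairs and the batch size:

xb, yb = next(iter(tdl))       # one batch of features and targets
print(xb.shape, yb.shape)      # expected: torch.Size([256, 784]) torch.Size([256])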

Single Layer Perceptron

In [5]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class SLP(BasicTrainableClassifier):
    def __init__(self, in_c, out_c, num_units):
        super().__init__()
        # Two stacked linear layers (no nonlinearity between them)
        self.slp = nn.Sequential(nn.Linear(in_c, num_units),
                                 nn.Linear(num_units, out_c))
    def forward(self, x):  # define forward() rather than __call__ so nn.Module's call machinery and hooks still run
        return self.slp(x)

Look inside the Linear layer

In [6]:
nn.Linear??
In [7]:
F.linear??
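
Both boil down to the same affine map: a Linear layer stores a weight matrix and a bias and computes x @ W.T + b. A small check of that equivalence (a sketch, not one of the original cells):

lin = nn.Linear(784, 512)
x = torch.randn(3, 784)
manual = x @ lin.weight.t() + lin.bias     # the affine map the layer applies
print(torch.allclose(lin(x), manual))      # True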

Create an instance of our SLP class

In [8]:
slp_m = SLP(784, 10, 512).cuda()  # .cuda() moves the model to the GPU for much faster, parallel computation; use .cpu() to keep it on the CPU
slp_m
Out[8]:
SLP(
  (crit): CrossEntropyLoss()
  (slp): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): Linear(in_features=512, out_features=10, bias=True)
  )
)

What is BasicTrainableClassifier?

A parent class we wrote that bundles some useful training utilities (the fit() method, the loss criterion, and so on).
You can write your own class instead; notice that BasicTrainableClassifier itself inherits from nn.Module.
nn.Module is PyTorch's base class for all models: it tracks parameters, handles device moves, and provides most of the machinery a model needs.
All you really need to write yourself is __init__() and forward().
Inside __init__(), call the superclass's initializer with super().__init__().
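
For reference, the same pattern with nn.Module directly, without the training helpers that BasicTrainableClassifier adds (a bare sketch; TinyNet is just an illustrative name):

class TinyNet(nn.Module):                  # nn.Module is the base class for all models
    def __init__(self, in_c, out_c):
        super().__init__()                 # call the superclass initializer first
        self.fc = nn.Linear(in_c, out_c)
    def forward(self, x):                  # nn.Module's __call__ dispatches to forward()
        return self.fc(x)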
In [9]:
BasicTrainableClassifier??
In [10]:
slp_m(x_train[:20].cuda()).shape  # forward pass on 20 examples: one row of 10 class scores per example
Out[10]:
torch.Size([20, 10])
In [11]:
slp_m.fit(tdl, valid_ds=vdl, epochs=1,
            cbs=True, learning_rate=1e-3)
 17%|█▋        | 25/148 [00:00<00:00, 242.69it/s]
Epoch 1
100%|██████████| 148/148 [00:00<00:00, 244.09it/s]
Accuracy: (V:0.9182515565086814, T:0.9161903254083685), Loss: (V:0.29042427855379443, T:0.29330215079558863)
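
The reported validation accuracy can also be checked by hand by running the trained model over the validation DataLoader (a sketch; fit() above already computes these numbers):

correct, total = 0, 0
with torch.no_grad():                      # no gradients needed for evaluation
    for xb, yb in vdl:
        preds = slp_m(xb.cuda()).argmax(dim=1)
        correct += (preds == yb.cuda().long()).sum().item()
        total += yb.size(0)
print(correct / total)                     # should be close to the accuracy printed above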

Multi-Layer Perceptron

In [12]:
class MLP(BasicTrainableClassifier):
    def __init__(self, ls):
        super().__init__()
        # One (Linear -> ReLU) block per (in, out) pair; note a ReLU also follows the final layer
        self.model = nn.Sequential(*[nn.Sequential(nn.Linear(*n), nn.ReLU()) for n in ls])

    def forward(self, X):
        return self.model(X)
In [13]:
def get_layers(start, hs, end, step):
    # Build consecutive (in, out) size pairs: start -> hs, then shrink by `step` down to `end`
    lse = [*list(range(hs, end, -step)), end]
    return list(zip([start, *lse], [*lse, end]))[:-1]
In [14]:
ls = get_layers(784,512,10,256)
ls
Out[14]:
[(784, 512), (512, 256), (256, 10)]
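
Each (in, out) pair unpacks directly into a Linear layer, which is what the MLP constructor does with nn.Linear(*n):

print(nn.Linear(*ls[0]))   # Linear(in_features=784, out_features=512, bias=True)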
In [15]:
mlp_m = MLP(ls).cuda()
mlp_m.fit(tdl, valid_ds=vdl, epochs=5,
            cbs=True, learning_rate=1e-3)
 17%|█▋        | 25/148 [00:00<00:00, 243.27it/s]
Epoch 1
100%|██████████| 148/148 [00:00<00:00, 248.28it/s]
 16%|█▌        | 23/148 [00:00<00:00, 223.90it/s]
Accuracy: (V:0.6753393692128798, T:0.6721087718332136), Loss: (V:0.82520250713124, T:0.842455328719036)
Epoch 2
100%|██████████| 148/148 [00:00<00:00, 245.58it/s]
 17%|█▋        | 25/148 [00:00<00:00, 242.93it/s]
Accuracy: (V:0.6788744365467745, T:0.679912474107098), Loss: (V:0.8000558404361501, T:0.7962220335328901)
Epoch 3
100%|██████████| 148/148 [00:00<00:00, 248.49it/s]
 17%|█▋        | 25/148 [00:00<00:00, 243.44it/s]
Accuracy: (V:0.6856440901756287, T:0.6874007099383587), Loss: (V:0.7735241967089036, T:0.7605302390214559)
Epoch 4
100%|██████████| 148/148 [00:00<00:00, 243.99it/s]
 18%|█▊        | 26/148 [00:00<00:00, 251.84it/s]
Accuracy: (V:0.6905048103893504, T:0.6902160253879186), Loss: (V:0.7489228774519527, T:0.7473121869402963)
Epoch 5
100%|██████████| 148/148 [00:00<00:00, 236.62it/s]
Accuracy: (V:0.6911764705882353, T:0.6915972893302506), Loss: (V:0.7444819176898283, T:0.7331306571896011)