This dataset contains 28x28 grayscale images of the digits 0 through 9. The train csv file holds 42,000 examples/images; since each 28x28 image flattens to a 784-dimensional vector, the models below take 784 input features.
%load_ext autoreload
%autoreload 2
%matplotlib inline
!pip install git+https://github.com/netbrainml/nbml.git
from nbml.workshops.mnist.utils import *
To feed the data into the model, we need to wrap the NumPy arrays as torch tensors.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

x_train, x_test, y_train, y_test = getMNIST("../input/train.csv")
x_train, x_test = torch.Tensor(x_train), torch.Tensor(x_test)
# Class labels are cast to Long, since cross-entropy-style losses expect integer class indices
y_train, y_test = torch.Tensor(y_train).long(), torch.Tensor(y_test).long()
shapes(x_train, x_test, y_train, y_test)
DataLoaders give us iterators that yield zipped (features, targets) batches.
from torch.utils.data import DataLoader, TensorDataset
tdl = DataLoader(TensorDataset(x_train, y_train), batch_size=256, shuffle=True)
vdl = DataLoader(TensorDataset(x_test, y_test), batch_size=256)  # no need to shuffle validation data
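As a quick sanity check, we can pull a single batch from the training loader (a minimal sketch; the shapes assume getMNIST returns flattened 784-pixel rows, which is consistent with the 784-input models below):
xb, yb = next(iter(tdl))
xb.shape, yb.shape  # expected: torch.Size([256, 784]), torch.Size([256])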
class SLP(BasicTrainableClassifier):
    def __init__(self, in_c, out_c, num_units):
        super().__init__()
        self.slp = nn.Sequential(nn.Linear(in_c, num_units),
                                 nn.Linear(num_units, out_c))
    def forward(self, x):  # define forward(), not __call__(), so nn.Module's hook machinery still runs
        return self.slp(x)
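Note that without a nonlinearity between them, two stacked Linear layers collapse into a single affine map: W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2). A quick sketch to verify this (the layer sizes here are arbitrary):
with torch.no_grad():
    l1, l2 = nn.Linear(4, 3), nn.Linear(3, 2)
    combined = nn.Linear(4, 2)
    combined.weight.copy_(l2.weight @ l1.weight)        # W2 W1
    combined.bias.copy_(l2.weight @ l1.bias + l2.bias)  # W2 b1 + b2
    x = torch.randn(5, 4)
    print(torch.allclose(l2(l1(x)), combined(x), atol=1e-6))  # True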
Look inside the Linear layer:
nn.Linear??
F.linear??
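nn.Linear ultimately calls F.linear, which computes y = x Aᵀ + b. A small sketch confirming the formula (sizes are arbitrary):
x, w, b = torch.randn(5, 8), torch.randn(3, 8), torch.randn(3)
print(torch.allclose(F.linear(x, w, b), x @ w.t() + b, atol=1e-6))  # True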
Create an instance of our SLP class
slp_m = SLP(784, 10, 512).cuda()  # .cuda() moves the model to the GPU for fast parallel computation; use .cpu() to stay on the CPU
slp_m
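To see how many trainable parameters the model holds, we can sum over its parameter tensors (a quick sketch; the total follows from 784*512 + 512 weights and biases in the first layer plus 512*10 + 10 in the second):
sum(p.numel() for p in slp_m.parameters())  # 407050 for SLP(784, 10, 512)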
BasicTrainableClassifier is a parent class we wrote that bundles some useful training functions.
You can write your own class instead; notice that the parent class of BasicTrainableClassifier is nn.Module.
nn.Module is the base class PyTorch provides for all models, and it supplies most of the machinery you need.
All you really have to write is __init__() and forward().
In __init__(), call the superclass's dunder init via super().__init__().
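For example, a minimal subclass might look like this (TinyNet is a hypothetical name, not part of the workshop code):
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()            # initialize nn.Module's internals first
        self.fc = nn.Linear(784, 10)
    def forward(self, x):             # nn.Module.__call__ dispatches to forward()
        return self.fc(x)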
BasicTrainableClassifier??
slp_m(x_train[:20].cuda()).shape  # x_train is already a tensor; expect torch.Size([20, 10]), one score per class per image
slp_m.fit(tdl, valid_ds=vdl, epochs=1,
cbs=True, learning_rate=1e-3)
class MLP(BasicTrainableClassifier):
    def __init__(self, ls):
        super().__init__()
        layers = []
        for i, n in enumerate(ls):
            layers.append(nn.Linear(*n))
            if i < len(ls) - 1:  # no ReLU after the output layer: keep the raw class scores
                layers.append(nn.ReLU())
        self.model = nn.Sequential(*layers)
    def forward(self, X):
        return self.model(X)
def get_layers(start, hs, end, step):
    # Layer sizes: start at `hs`, step down by `step`, then finish at `end`
    lse = [*range(hs, end, -step), end]
    # Pair consecutive sizes into (in, out) tuples; [:-1] drops the trailing (end, end) pair
    return list(zip([start, *lse], [*lse, end]))[:-1]
ls = get_layers(784,512,10,256)
ls  # [(784, 512), (512, 256), (256, 10)]
mlp_m = MLP(ls).cuda()
mlp_m.fit(tdl, valid_ds=vdl, epochs=5,
cbs=True, learning_rate=1e-3)
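Beyond whatever metrics fit reports, here is a minimal sketch of a manual accuracy check over the validation loader (assuming the model and integer labels set up above):
correct = total = 0
with torch.no_grad():
    for xb, yb in vdl:
        preds = mlp_m(xb.cuda()).argmax(dim=1)   # predicted class per image
        correct += (preds == yb.cuda()).sum().item()
        total += yb.size(0)
print(f"validation accuracy: {correct / total:.4f}")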