15. PyTorch

PyTorch is a framework for differentiation and optimization, in particular the optimization of neural networks (hence it is sometimes just called a deep learning framework), using the native imperative style of Python.

15.1. Some optional information about PyTorch and how it differs from its competitors

Most deep learning frameworks, such as TensorFlow and Theano, work with symbolic differentiation: it is necessary to declare the model structure before actually supplying any data and then ask the framework to compile the model (at which point the gradients are derived symbolically); after that, the model is good to go as long as it does not change.

While these frameworks have advantages, such as possible algebraic simplifications when working out the derivatives, they also come at a price: they impose a "recompilation" of the model whenever it changes, they make code less intuitive to write and more difficult to debug, and they are potentially more restrictive with respect to model characteristics.

On the other hand, the framework of choice for this section, PyTorch, works with reverse-mode automatic differentiation, which consists of recording the chain of operations (on a "tape") on the fly. That is, after the last operation is done (in our case, that's the calculation of the loss), the chain of operations is back-propagated (i.e., the "tape" is run backwards) and the gradients of the parameters of interest are calculated using the chain rule. The toy sketch below makes the idea concrete.
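
Here is a toy sketch of a reverse-mode "tape": each operation records how to push gradients back to its inputs, and running the tape in reverse applies the chain rule. This is purely an illustration of the concept, not how PyTorch is actually implemented.

tape = []

def var(value):
    return {"value": value, "grad": 0.0}

def add(a, b):
    out = var(a["value"] + b["value"])
    def backward():
        a["grad"] += out["grad"]
        b["grad"] += out["grad"]
    tape.append(backward)
    return out

def mul(a, b):
    out = var(a["value"] * b["value"])
    def backward():
        a["grad"] += b["value"] * out["grad"]
        b["grad"] += a["value"] * out["grad"]
    tape.append(backward)
    return out

x = var(3.0)
y = add(mul(x, x), x)  # y = x*x + x
y["grad"] = 1.0        # seed the gradient of the output
for backward in reversed(tape):  # run the "tape" backwards
    backward()
print(x["grad"])       # 7.0, matching dy/dx = 2x + 1 at x = 3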

Frameworks that work using symbolic differentiation are often called static, while the ones that use automatic differentiation are called dynamic. Regardless of this distinction, most (if not all) deep learning frameworks have two common characteristics worth emphasizing: they allow one to use their differentiation facilities for problems other than deep learning, neural networks, or optimization (e.g., Markov chain Monte Carlo and Bayesian inference), and they natively support GPU acceleration (generally using Nvidia CUDA); see the sketch below for an example of the first point.
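
For instance, PyTorch can just as easily differentiate a Gaussian log-density, the kind of gradient that samplers such as Hamiltonian Monte Carlo need. A minimal sketch (the standard-normal example here is ours, not tied to any particular library):

import torch

theta = torch.tensor([0.3], requires_grad=True)
log_p = -0.5 * theta**2  # standard normal log-density, up to a constant
log_p.backward()
print(theta.grad)        # the score function -theta, i.e. tensor([-0.3000])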

GPU acceleration is a common denominator among deep learning frameworks because neural networks are highly parallelizable problems, which makes them well suited to GPUs. This explains, at least in part, their recent surge in popularity: such methods scale well to big datasets, which in turn are becoming increasingly common and important.

15.2. Now getting started

First of all, let's start by importing some basic packages:

[1]:
import numpy as np
import scipy.stats as stats
import pandas as pd
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim

import joblib  # formerly vendored as sklearn.externals.joblib

15.3. Tensors and GPU

Now, let me present the basic storage type of PyTorch: the tensor. Tensors work in a very similar fashion to numpy arrays.

[2]:
x = torch.tensor([.45, .53])
y = x**2
y[0] = .3

a = torch.ones((2,4))
a = a * 2
a = a + 1
a[0,0] = .4

b = torch.zeros((4,2))
b = (b + 3) / 2
b[0,1] = .2

bt = b.transpose(0,1) # transpose

a + bt

torch.mm(a,b) # matrix multiplication
[2]:
tensor([[14.1000, 13.5800],
        [18.0000, 14.1000]])
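
Tensors also interoperate with numpy arrays; a minimal sketch (the variable names are just illustrative):

arr = np.array([1.0, 2.0, 3.0])
t = torch.from_numpy(arr)  # numpy -> tensor; shares memory and keeps float64
back = t.numpy()           # tensor -> numpy; also shares memory
t2 = torch.tensor(arr)     # makes a copy of the data instead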

But PyTorch tensors have a special caveat: they can live on the GPU (if you have one properly installed… if you don't, the cell below will raise an error rather than make your computer explode)!

[3]:
a = torch.rand((3, 5)) # some random numbers
a = a.cuda()

b = torch.ones((5, 3))
b = b.cuda()
b = (b + 3) / 2
b[0,1] = .2

torch.mm(a, b) # matrix multiplication
[3]:
tensor([[4.9632, 3.9542, 4.9632],
        [2.3442, 2.1493, 2.3442],
        [5.6313, 4.4108, 5.6313]], device='cuda:0')
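
If you want the same code to run with or without a GPU, a common pattern is to choose the device at runtime and move tensors with .to(); a minimal sketch:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
a = torch.rand((3, 5)).to(device)  # a no-op if device is the CPU
b = torch.ones((5, 3)).to(device)
torch.mm(a, b)  # works either way: both operands live on the same device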

15.4. Float data type

Note that the default data type of PyTorch floats is the 32-bit precision type:

[4]:
torch.rand((3, 5)).dtype
[4]:
torch.float32

While numpy defaults to 64-bit types (here a 64-bit integer; numpy floats likewise default to float64):

[5]:
np.arange(4).dtype
[5]:
dtype('int64')

You can request a custom data type via the dtype parameter or with special constructors:

[6]:
print(torch.rand((3, 5), dtype=torch.float64).dtype)
print(torch.FloatTensor([.5]).dtype)
print(torch.HalfTensor([.5]).dtype)
print(torch.DoubleTensor([.5]).dtype)
torch.float64
torch.float32
torch.float16
torch.float64

But it's recommended to stick with the default float32 for most deep learning applications, due to the speed-up it can give on vectorized operations, especially on GPUs.
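
If you nonetheless want a different default, it can be changed globally; a minimal sketch (restoring float32 right away, since the rest of this chapter assumes it):

torch.set_default_dtype(torch.float64)
print(torch.rand(2).dtype)              # torch.float64
torch.set_default_dtype(torch.float32)  # restore the default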

15.5. Differentiation

We can use PyTorch to compute derivatives automatically using the code below:

[7]:
x = torch.tensor([.45], requires_grad=True)
y = x**2
y.backward()
x.grad
[7]:
tensor([0.9000])

x.grad gives us the gradient of y with respect to x. Now pay attention to (and play with) this other example:

[8]:
x = torch.tensor([.5,.3,.6], requires_grad=True)
y = x**2
z = y.sum()
z.backward()
x.grad
[8]:
tensor([1.0000, 0.6000, 1.2000])
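
A detail worth knowing: each call to backward accumulates gradients into .grad instead of overwriting them, so they must be reset between passes (this is why the training loops below call optimizer.zero_grad()). A minimal sketch:

x = torch.tensor([.5, .3, .6], requires_grad=True)
for _ in range(2):
    y = (x**2).sum()
    y.backward()
print(x.grad)   # twice the gradient: tensor([2.0000, 1.2000, 2.4000])
x.grad.zero_()  # reset the accumulated gradients in place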

15.6. Optimization

Now let's try using this differentiation framework for optimization:

[9]:
x = torch.tensor([.45], requires_grad=True)

# "declares" that x is the variable being optimized by the Adam optimization algorith
optimizer = optim.Adam([x])

y = 2 * x**2 - 7 * x
y.backward() # "declares" that y is value being minimized

optimizer.step() # i.e. find the x that minimizes y
x
[9]:
tensor([0.4510], requires_grad=True)

Here optimizer.step() moved x a small step in the direction opposite to its gradient; i.e., it moved x in a direction that decreases y.

(Note that requires_grad=True is necessary for PyTorch to know that it must keep track of the operations done with x, so that it can later backpropagate through them and obtain the gradient with respect to x. PyTorch requires you to ask for this explicitly because, otherwise, it can save computational resources by not creating these structures.)
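
Conversely, when you do not need gradients (e.g., when only evaluating a model), you can switch the tracking off; a minimal sketch:

x = torch.tensor([.45], requires_grad=True)
with torch.no_grad():
    y = 2 * x**2 - 7 * x  # no computation graph is built inside this block
print(y.requires_grad)     # False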

However, this is just one small step towards the optimum; we must repeat it many times to get there:

[10]:
x = torch.tensor([-2.45], requires_grad=True)

optimizer = optim.Adam([x], lr=0.05)
for _ in range(1000):
    optimizer.zero_grad()
    y = 2 * x**2 - 7 * x
    y.backward()
    optimizer.step()

print("Numerical optimization solution:", x)
print("Analytic optimization solution:", 7/4)
Numerical optimization solution: tensor([1.7500], requires_grad=True)
Analytic optimization solution: 1.75

Great, we did it! Now let's try something more difficult: given i.i.d. Gaussian samples, let's find the \(\hat{\mu}\) that minimizes the mean of the squared errors between those samples and \(\hat{\mu}\). We know from statistical theory that the analytical solution to this problem is the sample average.
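
As a quick sanity check, setting the derivative of the loss to zero recovers the sample average:

\[
\frac{\partial}{\partial \hat{\mu}} \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^2 = -\frac{2}{n} \sum_{i=1}^{n} (x_i - \hat{\mu}) = 0 \quad \Longrightarrow \quad \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i .
\]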

[11]:
mu_hat = torch.tensor([0.1], requires_grad=True)
mu_true = 1.5
x = stats.norm.rvs(size=2000, loc=mu_true, scale=3, random_state=0)
x = torch.as_tensor(x, dtype=torch.float32)

optimizer = optim.Adam([mu_hat], lr=0.05)
criterion = nn.MSELoss()
for _ in range(1000):
    optimizer.zero_grad()
    loss = criterion(x, mu_hat)
    loss.backward()
    optimizer.step()

print("Numerical optimization solution:", mu_hat)
print("Analytic optimization solution:", x.mean())
Numerical optimization solution: tensor([1.4525], requires_grad=True)
Analytic optimization solution: tensor(1.4525)

And voila! It worked again!

15.7. Neural networks with PyTorch

Now, probably the most anticipated part: your first neural network with PyTorch:

[12]:
# Declares the structure of our neural network
class Net(nn.Module):
    def __init__(self):
        # this is strictly necessary!
        super(Net, self).__init__()

        # fully connected layer with input of size 10 and output of size 120
        self.fc1 = nn.Linear(10, 120)

        # fully connected layer with input of size 120 and output of size 84
        self.fc2 = nn.Linear(120, 84)

        # fully connected layer with input of size 84 and output of size 1
        self.fc3 = nn.Linear(84, 1)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net() # Construct the neural network object

# Creates some data using a linear regression
beta = torch.rand(10, 1)
inputv = torch.randn(70, 10)
target = torch.mm(inputv, beta)
target = target + torch.randn(70, 1)

# If a GPU is available, move the network parameters and the data onto it
if torch.cuda.is_available():
    net.cuda()
    inputv = inputv.cuda()
    target = target.cuda()

criterion = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

for epoch in range(1000):
    optimizer.zero_grad()
    output = net(inputv)
    loss = criterion(output, target)
    print('Loss', np.round(loss.item(), 2), 'in epoch', epoch + 1)
    loss.backward()
    optimizer.step()
Loss 3.46 in epoch 1
Loss 3.42 in epoch 2
Loss 3.39 in epoch 3
Loss 3.35 in epoch 4
Loss 3.32 in epoch 5
... (output truncated: the loss keeps decreasing steadily) ...
Loss 0.02 in epoch 999
Loss 0.02 in epoch 1000
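
As an aside, a feed-forward architecture as simple as this one can also be written without defining a class, using nn.Sequential; a minimal sketch of an equivalent model (subclassing nn.Module, as above, pays off once the forward pass needs custom logic):

net_alt = nn.Sequential(
    nn.Linear(10, 120),
    nn.ReLU(),
    nn.Linear(120, 84),
    nn.ReLU(),
    nn.Linear(84, 1),
)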

Now let’s create more data from the same linear regression and see how well our network is able to predict it:

[13]:
# Moves the network back to the CPU if it was on a GPU.
net.cpu()

# Since we are not training the network anymore, let's put it in
# evaluation mode (this changes the behavior of layers such as
# dropout and batch normalization, although this network has none).
# In case you need to train it again, call net.train()
net.eval()

# Creates some new data from the same linear regression (noiseless this time)
inputv = torch.randn(5, 10)
correct_values = torch.mm(inputv, beta)
predicted_values = net(inputv)
criterion(correct_values, predicted_values).item()
[13]:
0.4604862332344055

Exercise: try decreasing and increasing the amount of training data and see whether this error goes down! Also try changing the number of features to see how it affects the error. One possible way to organize the experiment is sketched below.
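
A minimal sketch of one way to run it; the helper function and its parameter names here are illustrative, not part of the chapter's code:

def train_and_eval(n_train, n_test=5, n_features=10, epochs=1000):
    # Hypothetical helper: regenerate data, train a fresh network,
    # and return the test MSE against the noiseless regression line.
    torch.manual_seed(0)
    beta = torch.rand(n_features, 1)
    x_train = torch.randn(n_train, n_features)
    y_train = torch.mm(x_train, beta) + torch.randn(n_train, 1)

    model = nn.Sequential(
        nn.Linear(n_features, 120), nn.ReLU(),
        nn.Linear(120, 84), nn.ReLU(),
        nn.Linear(84, 1),
    )
    criterion = nn.MSELoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = criterion(model(x_train), y_train)
        loss.backward()
        optimizer.step()

    model.eval()
    x_test = torch.randn(n_test, n_features)
    with torch.no_grad():
        return criterion(model(x_test), torch.mm(x_test, beta)).item()

for n_train in (70, 700, 7000):
    print(n_train, train_and_eval(n_train))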