PyTorch: Build model & consistent naming

Commit c9aaa8ec authored 1 year ago by Lars Sowa
--- a/pre-exercises/intro_pytorch_build_model.ipynb
+++ b/pre-exercises/intro_pytorch_build_model.ipynb
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "# PyTorch: Build a model\n",
+        "\n",
+        "Make sure that you have understood the tutorial on tensors and PyTorch's autograd before you start. In this tutorial we will have a look at how to build a PyTorch model.\n",
+        "\n",
+        "This tutorial is adapted from https://pytorch.org/tutorials/beginner/basics/intro.html"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "%matplotlib inline"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Build the Neural Network\n",
+        "\n",
+        "Neural networks are composed of layers/modules that perform operations on data.\n",
+        "The [torch.nn](https://pytorch.org/docs/stable/nn.html) namespace provides all the building blocks you need to\n",
+        "build your own neural network. Every module in PyTorch subclasses [nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html).\n",
+        "A neural network is a module itself that consists of other modules (layers). This nested structure allows for\n",
+        "building and managing complex architectures easily.\n",
+        "\n",
+        "In the following sections, we'll build a neural network to classify images in the FashionMNIST dataset.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "import os\n",
+        "import torch\n",
+        "from torch import nn"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Get Device for Training\n",
+        "We want to be able to train our model on a hardware accelerator like the GPU,\n",
+        "if available. Let's check to see if [torch.cuda](https://pytorch.org/docs/stable/notes/cuda.html) is available, otherwise we use the CPU.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "device = (\n",
+        "    \"cuda\"\n",
+        "    if torch.cuda.is_available()\n",
+        "    else \"cpu\"\n",
+        ")\n",
+        "print(f\"Using {device} device\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Define the Class\n",
+        "We define our neural network by subclassing ``nn.Module``, and\n",
+        "initialize the neural network layers in ``__init__``. Every ``nn.Module`` subclass implements\n",
+        "the operations on input data in the ``forward`` method. The individual layers are discussed in the next section.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "class NeuralNetwork(nn.Module):\n",
+        "    def __init__(self):\n",
+        "        super().__init__()\n",
+        "        self.flatten = nn.Flatten()\n",
+        "        self.linear_relu_stack = nn.Sequential(\n",
+        "            nn.Linear(28*28, 512),\n",
+        "            nn.ReLU(),\n",
+        "            nn.Linear(512, 512),\n",
+        "            nn.ReLU(),\n",
+        "            nn.Linear(512, 10),\n",
+        "            nn.Softmax(dim=1)\n",
+        "        )\n",
+        "\n",
+        "    def forward(self, x):\n",
+        "        x = self.flatten(x)\n",
+        "        logits = self.linear_relu_stack(x)\n",
+        "        return logits"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "We create an instance of ``NeuralNetwork``, move it to the ``device``, and print\n",
+        "its structure.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "model = NeuralNetwork().to(device)\n",
+        "print(model)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "To use the model, we pass it the input data. This executes the model's ``forward``,\n",
+        "along with some [background operations](https://github.com/pytorch/pytorch/blob/270111b7b611d174967ed204776985cefca9c144/torch/nn/modules/module.py#L866).\n",
+        "Do not call ``model.forward()`` directly!\n",
+        "\n",
+        "Calling the model on the input returns a 2-dimensional tensor: dim=0 indexes the samples in the batch, and dim=1 indexes the 10 values predicted for the classes of each sample.\n",
+        "Because ``NeuralNetwork`` already ends with an ``nn.Softmax`` module, these values are prediction probabilities rather than raw logits, even though ``forward`` names them ``logits``.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "X = torch.rand(1, 28, 28, device=device)\n",
+        "logits = model(X)\n",
+        "pred_probab = logits  # the model already ends in nn.Softmax, so these are probabilities\n",
+        "y_pred = pred_probab.argmax(1)\n",
+        "print(f\"Predicted class: {y_pred}\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Model Layers\n",
+        "\n",
+        "Let's break down the layers in the FashionMNIST model. To illustrate it, we\n",
+        "will take a sample minibatch of 3 images of size 28x28 and see what happens to it as\n",
+        "we pass it through the network.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "input_image = torch.rand(3,28,28)\n",
+        "input_image.size()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "#### nn.Flatten\n",
+        "We initialize the [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html)\n",
+        "layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values.\n",
+        "The minibatch dimension (at dim=0) is maintained.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "flatten = nn.Flatten()\n",
+        "flat_image = flatten(input_image)\n",
+        "print(flat_image.size())"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "#### nn.Linear\n",
+        "The [linear layer](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)\n",
+        "is a module that applies a linear transformation $\\vec{x}\\,\\mathbf{w}^T + \\vec{b}$ to the input using its stored weights and biases.\n",
+        "Have a look at the documentation of [nn.Linear()](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)!\n",
+        "\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "layer1 = nn.Linear(in_features=28*28, out_features=20)\n",
+        "hidden1 = layer1(flat_image)\n",
+        "print(hidden1.size())"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Note that this layer contains trainable parameters (weights and bias) of our model, so `requires_grad` is set to `True` by default."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "print(layer1.weight.requires_grad) # this is w\n",
+        "print(layer1.bias.requires_grad)   # this is b"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "#### nn.ReLU\n",
+        "Non-linear activations are what create the complex mappings between the model's inputs and outputs.\n",
+        "They are applied after linear transformations to introduce *nonlinearity*, helping neural networks to\n",
+        "learn a wide variety of phenomena.\n",
+        "\n",
+        "![ReLU activation function](figures/ReLu.png)\n",
+        "\n",
+        "\n",
+        "\n",
+        "In this model, we use [nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html) (see the figure above) between our\n",
+        "linear layers. There are also other activations that introduce non-linearity in your model; have a look at [them](https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity)!\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "print(f\"Before ReLU: {hidden1}\\n\\n\")\n",
+        "hidden1 = nn.ReLU()(hidden1)\n",
+        "print(f\"After ReLU: {hidden1}\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "#### nn.Sequential\n",
+        "[nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html) is an ordered\n",
+        "container of modules. The data is passed through all the modules in the same order as defined. You can use\n",
+        "sequential containers to put together a quick network like ``seq_modules``."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "seq_modules = nn.Sequential(\n",
+        "    flatten,\n",
+        "    layer1,\n",
+        "    nn.ReLU(),\n",
+        "    nn.Linear(20, 10)\n",
+        ")\n",
+        "input_image = torch.rand(3,28,28)\n",
+        "logits = seq_modules(input_image)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "#### nn.Softmax\n",
+        "\n",
+        "The last activation function of your model depends on the underlying problem. For a regression task, you want your model to produce values in $(-\\infty, \\infty)$, so it is common to use no activation function in the final layer at all.\n",
+        "\n",
+        "In our model for FashionMNIST classification, the last linear layer of the neural network returns `logits`, raw values in $(-\\infty, \\infty)$, which are passed to the\n",
+        "[nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html) module. The logits are scaled to values in\n",
+        "[0, 1] representing the model's predicted probabilities for each class. The ``dim`` parameter indicates the dimension along\n",
+        "which the values must sum to 1.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "softmax = nn.Softmax(dim=1)\n",
+        "pred_probab = softmax(logits)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Create your own neural network! Use the `seq_modules` module defined above as a basis. Maybe you can extend it?"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "class MyOwnNN(???):\n",
+        "    def __init__(self):\n",
+        "        # your code goes here\n",
+        "    def forward(self):\n",
+        "        # your code goes here"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Model Parameters\n",
+        "Many layers inside a neural network are *parameterized*, i.e. have associated weights\n",
+        "and biases that are optimized during training. Subclassing ``nn.Module`` automatically\n",
+        "tracks all fields defined inside your model object, and makes all parameters\n",
+        "accessible using your model's ``parameters()`` or ``named_parameters()`` methods.\n",
+        "\n",
+        "In this example, we iterate over each parameter, and print its size and a preview of its values.\n",
+        "\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "collapsed": false
+      },
+      "outputs": [],
+      "source": [
+        "print(f\"Model structure: {model}\\n\\n\")\n",
+        "\n",
+        "for name, param in model.named_parameters():\n",
+        "    print(f\"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \\n\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Use the `count_parameters()` function defined below to compare the number of learnable parameters of a `NeuralNetwork` instance with that of a `MyOwnNN` instance."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "def count_parameters(model):\n",
+        "    # source https://discuss.pytorch.org/t/how-do-i-check-the-number-of-parameters-of-a-model/4325/7\n",
+        "    # user baldassarre.fe\n",
+        "    return sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
+        "\n",
+        "# your code goes here"
+      ]
+    }
+  ],
+  "metadata": {
+    "kernelspec": {
+      "display_name": "Python 3",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.9.16"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
+%% Cell type:markdown id: tags:
+
+# PyTorch: Build a model
+
+Make sure that you have understood the tutorial on tensors and PyTorch's autograd before you start. In this tutorial we will have a look at how to build a PyTorch model.
+
+This tutorial is adapted from https://pytorch.org/tutorials/beginner/basics/intro.html
+
+%% Cell type:code id: tags:
+
+``` python
+%matplotlib inline
+```
+
+%% Cell type:markdown id: tags:
+
+## Build the Neural Network
+
+Neural networks are composed of layers/modules that perform operations on data.
+The [torch.nn](https://pytorch.org/docs/stable/nn.html) namespace provides all the building blocks you need to
+build your own neural network. Every module in PyTorch subclasses [nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html).
+A neural network is a module itself that consists of other modules (layers). This nested structure allows for
+building and managing complex architectures easily.
+
+In the following sections, we'll build a neural network to classify images in the FashionMNIST dataset.
+
+%% Cell type:code id: tags:
+
+``` python
+import os
+import torch
+from torch import nn
+```
+
+%% Cell type:markdown id: tags:
+
+### Get Device for Training
+We want to be able to train our model on a hardware accelerator like the GPU,
+if available. Let's check to see if [torch.cuda](https://pytorch.org/docs/stable/notes/cuda.html) is available, otherwise we use the CPU.
+
+
+%% Cell type:code id: tags:
+
+``` python
+device = (
+    "cuda"
+    if torch.cuda.is_available()
+    else "cpu"
+)
+print(f"Using {device} device")
+```
+
+%% Cell type:markdown id: tags:
+
+### Define the Class
+We define our neural network by subclassing ``nn.Module``, and
+initialize the neural network layers in ``__init__``. Every ``nn.Module`` subclass implements
+the operations on input data in the ``forward`` method. The individual layers are discussed in the next section.
+
+
+%% Cell type:code id: tags:
+
+``` python
+class NeuralNetwork(nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.flatten = nn.Flatten()
+        self.linear_relu_stack = nn.Sequential(
+            nn.Linear(28*28, 512),
+            nn.ReLU(),
+            nn.Linear(512, 512),
+            nn.ReLU(),
+            nn.Linear(512, 10),
+            nn.Softmax(dim=1)
+        )
+
+    def forward(self, x):
+        x = self.flatten(x)
+        logits = self.linear_relu_stack(x)
+        return logits
+```
+
+%% Cell type:markdown id: tags:
+
+We create an instance of ``NeuralNetwork``, move it to the ``device``, and print
+its structure.
+
+
+%% Cell type:code id: tags:
+
+``` python
+model = NeuralNetwork().to(device)
+print(model)
+```
+
+%% Cell type:markdown id: tags:
+
+To use the model, we pass it the input data. This executes the model's ``forward``,
+along with some [background operations](https://github.com/pytorch/pytorch/blob/270111b7b611d174967ed204776985cefca9c144/torch/nn/modules/module.py#L866).
+Do not call ``model.forward()`` directly!
+
+Calling the model on the input returns a 2-dimensional tensor: dim=0 indexes the samples in the batch, and dim=1 indexes the 10 values predicted for the classes of each sample.
+Because ``NeuralNetwork`` already ends with an ``nn.Softmax`` module, these values are prediction probabilities rather than raw logits, even though ``forward`` names them ``logits``.
+
+
+%% Cell type:code id: tags:
+
+``` python
+X = torch.rand(1, 28, 28, device=device)
+logits = model(X)
+pred_probab = logits  # the model already ends in nn.Softmax, so these are probabilities
+y_pred = pred_probab.argmax(1)
+print(f"Predicted class: {y_pred}")
+```
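The advice not to call ``forward`` directly can be illustrated with a forward hook, one of the background operations that run only when the module itself is called. This is a minimal sketch, not part of the original notebook: the names `toy`, `calls`, and `record_call` are made up for the illustration.

```python
import torch
from torch import nn

# Hypothetical toy module, used only for this illustration.
toy = nn.Linear(4, 2)

calls = []  # collects the output shape each time the hook fires


def record_call(module, inputs, output):
    # A forward hook receives the module, its positional inputs, and its output.
    calls.append(tuple(output.shape))


handle = toy.register_forward_hook(record_call)

x = torch.rand(3, 4)
toy(x)          # goes through nn.Module.__call__, so the hook runs
toy.forward(x)  # bypasses __call__, so the hook is skipped
handle.remove()

print(calls)  # [(3, 2)]
```

Only one entry is recorded, because the direct ``forward`` call never reaches the hook machinery.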
+
+%% Cell type:markdown id: tags:
+
+### Model Layers
+
+Let's break down the layers in the FashionMNIST model. To illustrate it, we
+will take a sample minibatch of 3 images of size 28x28 and see what happens to it as
+we pass it through the network.
+
+
+%% Cell type:code id: tags:
+
+``` python
+input_image = torch.rand(3,28,28)
+input_image.size()
+```
+
+%% Cell type:markdown id: tags:
+
+#### nn.Flatten
+We initialize the [nn.Flatten](https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html)
+layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values.
+The minibatch dimension (at dim=0) is maintained.
+
+
+%% Cell type:code id: tags:
+
+``` python
+flatten = nn.Flatten()
+flat_image = flatten(input_image)
+print(flat_image.size())
+```
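To make the effect of ``nn.Flatten`` concrete, the sketch below compares it with a manual ``reshape`` that keeps dim=0 and merges the rest. The variable names `batch`, `flat_module`, and `flat_manual` are chosen for this illustration only.

```python
import torch
from torch import nn

batch = torch.rand(3, 28, 28)  # 3 images of size 28x28

# nn.Flatten defaults to start_dim=1, so the batch dimension is left alone.
flat_module = nn.Flatten()(batch)

# The same result by hand: keep dim=0, merge everything else.
flat_manual = batch.reshape(batch.size(0), -1)

print(flat_module.shape)                       # torch.Size([3, 784])
print(torch.equal(flat_module, flat_manual))   # True
```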
+
+%% Cell type:markdown id: tags:
+
+#### nn.Linear
+The [linear layer](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)
+is a module that applies a linear transformation $\vec{x}\,\mathbf{w}^T + \vec{b}$ to the input using its stored weights and biases.
+Have a look at the documentation of [nn.Linear()](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)!
+
+
+
+%% Cell type:code id: tags:
+
+``` python
+layer1 = nn.Linear(in_features=28*28, out_features=20)
+hidden1 = layer1(flat_image)
+print(hidden1.size())
+```
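The linear transformation can be reproduced by hand. ``nn.Linear`` stores its weight with shape ``(out_features, in_features)``, so the manual version multiplies by the transpose. This is a small sketch with illustrative names (`layer`, `x`, `manual`); the tolerance passed to ``torch.allclose`` is an assumption that leaves room for floating-point rounding.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random weights and inputs reproducible

layer = nn.Linear(in_features=28 * 28, out_features=20)
x = torch.rand(3, 28 * 28)  # a batch of 3 flattened images

# Manual version of the layer: x times the transposed weight, plus the bias.
manual = x @ layer.weight.T + layer.bias

print(layer.weight.shape)  # torch.Size([20, 784])
print(layer.bias.shape)    # torch.Size([20])
print(torch.allclose(layer(x), manual, atol=1e-5))
```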
+
+%% Cell type:markdown id: tags:
+
+Note that this layer contains trainable parameters (weights and bias) of our model, so `requires_grad` is set to `True` by default.
+
+%% Cell type:code id: tags:
+
+``` python
+print(layer1.weight.requires_grad) # this is w
+print(layer1.bias.requires_grad)   # this is b
+```
+
+%% Cell type:markdown id: tags:
+
+#### nn.ReLU
+Non-linear activations are what create the complex mappings between the model's inputs and outputs.
+They are applied after linear transformations to introduce *nonlinearity*, helping neural networks to
+learn a wide variety of phenomena.
+
+![ReLU activation function](figures/ReLu.png)
+
+
+
+In this model, we use [nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html) (see the figure above) between our
+linear layers. There are also other activations that introduce non-linearity in your model; have a look at [them](https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity)!
+
+
+%% Cell type:code id: tags:
+
+``` python
+print(f"Before ReLU: {hidden1}\n\n")
+hidden1 = nn.ReLU()(hidden1)
+print(f"After ReLU: {hidden1}")
+```
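With random inputs the effect of ReLU is easy to miss, so here is a sketch on a small hand-picked tensor. ReLU computes max(0, x) element-wise, which is the same as clamping at zero. The names `values`, `relu_out`, and `clamped` are illustrative.

```python
import torch
from torch import nn

values = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

relu_out = nn.ReLU()(values)          # negative entries become 0
clamped = torch.clamp(values, min=0)  # the same operation written as a clamp

print(relu_out.tolist())              # [0.0, 0.0, 0.0, 0.5, 2.0]
print(torch.equal(relu_out, clamped)) # True
```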
+
+%% Cell type:markdown id: tags:
+
+#### nn.Sequential
+[nn.Sequential](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html) is an ordered
+container of modules. The data is passed through all the modules in the same order as defined. You can use
+sequential containers to put together a quick network like ``seq_modules``.
+
+%% Cell type:code id: tags:
+
+``` python
+seq_modules = nn.Sequential(
+    flatten,
+    layer1,
+    nn.ReLU(),
+    nn.Linear(20, 10)
+)
+input_image = torch.rand(3,28,28)
+logits = seq_modules(input_image)
+```
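What "passed through all the modules in the same order" means can be sketched by iterating over the container by hand. The network below mirrors the layer sizes of ``seq_modules`` but is rebuilt here under the hypothetical name `net`, so that the snippet runs on its own.

```python
import torch
from torch import nn

torch.manual_seed(0)

net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 20),
    nn.ReLU(),
    nn.Linear(20, 10),
)
x = torch.rand(3, 28, 28)

# nn.Sequential is iterable and yields its modules in definition order,
# so feeding each output into the next module reproduces net(x).
out = x
for module in net:
    out = module(out)

print(out.shape)                     # torch.Size([3, 10])
print(torch.allclose(out, net(x)))   # True
```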
+
+%% Cell type:markdown id: tags:
+
+#### nn.Softmax
+
+The last activation function of your model depends on the underlying problem. For a regression task, you want your model to produce values in $(-\infty, \infty)$, so it is common to use no activation function in the final layer at all.
+
+In our model for FashionMNIST classification, the last linear layer of the neural network returns `logits`, raw values in $(-\infty, \infty)$, which are passed to the
+[nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html) module. The logits are scaled to values in
+[0, 1] representing the model's predicted probabilities for each class. The ``dim`` parameter indicates the dimension along
+which the values must sum to 1.
+
+
+%% Cell type:code id: tags:
+
+``` python
+softmax = nn.Softmax(dim=1)
+pred_probab = softmax(logits)
+```
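The scaling done by softmax can be written out by hand as exp(x) divided by the sum of exp(x) along ``dim``. The sketch below uses a small hand-picked tensor under the illustrative name `example_logits` and checks that each row sums to 1 and that the largest logit stays the most probable class.

```python
import torch
from torch import nn

example_logits = torch.tensor([[1.0, 2.0, 3.0],
                               [3.0, 1.0, 0.5]])

probs = nn.Softmax(dim=1)(example_logits)

# Manual softmax along dim=1: exponentiate, then normalise each row.
exp = torch.exp(example_logits)
manual = exp / exp.sum(dim=1, keepdim=True)

print(torch.allclose(probs, manual))   # True
print(probs.sum(dim=1))                # each row sums to 1 (up to rounding)
print(probs.argmax(dim=1).tolist())    # [2, 0]
```

Softmax is monotonic, so ``argmax`` gives the same class whether it is applied to the logits or to the probabilities.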
+
+%% Cell type:markdown id: tags:
+
+Create your own neural network! Use the `seq_modules` module defined above as a basis. Maybe you can extend it?
+
+%% Cell type:code id: tags:
+
+``` python
+class MyOwnNN(???):
+    def __init__(self):
+        # your code goes here
+    def forward(self):
+        # your code goes here
+```
+
+%% Cell type:markdown id: tags:
+
+### Model Parameters
+Many layers inside a neural network are *parameterized*, i.e. have associated weights
+and biases that are optimized during training. Subclassing ``nn.Module`` automatically
+tracks all fields defined inside your model object, and makes all parameters
+accessible using your model's ``parameters()`` or ``named_parameters()`` methods.
+
+In this example, we iterate over each parameter, and print its size and a preview of its values.
+
+
+
+%% Cell type:code id: tags:
+
+``` python
+print(f"Model structure: {model}\n\n")
+
+for name, param in model.named_parameters():
+    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")
+```
+
+%% Cell type:markdown id: tags:
+
+Use the `count_parameters()` function defined below to compare the number of learnable parameters of a `NeuralNetwork` instance with that of a `MyOwnNN` instance.
+
+%% Cell type:code id: tags:
+
+``` python
+def count_parameters(model):
+    # source https://discuss.pytorch.org/t/how-do-i-check-the-number-of-parameters-of-a-model/4325/7
+    # user baldassarre.fe
+    return sum(p.numel() for p in model.parameters() if p.requires_grad)
+
+# your code goes here
+```
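As a sanity check for the counting function, the number of parameters of the ``NeuralNetwork`` layers can be worked out by hand: each ``nn.Linear(n_in, n_out)`` holds ``n_in * n_out`` weights plus ``n_out`` biases, while ``nn.ReLU`` and ``nn.Softmax`` hold none. The sketch below rebuilds the same layer stack under the hypothetical name `stack`, so that it runs without the class definition.

```python
from torch import nn

# Same layers as linear_relu_stack in NeuralNetwork, rebuilt for illustration.
stack = nn.Sequential(
    nn.Linear(28 * 28, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10), nn.Softmax(dim=1),
)

# Weights plus biases for each of the three linear layers.
by_hand = (784 * 512 + 512) + (512 * 512 + 512) + (512 * 10 + 10)

# The same expression that count_parameters() evaluates.
counted = sum(p.numel() for p in stack.parameters() if p.requires_grad)

print(by_hand, counted)  # 669706 669706
```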
--- a/pre-exercises/PyTorch_TensorsAndAutograd.ipynb
+++ b/pre-exercises/PyTorch_TensorsAndAutograd.ipynb