Commit 0e4bcf8

Update slides
Adds overview of PyTorch concepts.
Includes mention of other frameworks.
Adds example derivative of MSE loss.
Adds diagram of scatter plot with error lines.
1 parent 16662c1

File tree: 1 file changed (+224, −40)

slides/slides.qmd

Lines changed: 224 additions & 40 deletions
@@ -1,6 +1,6 @@
 ---
 title: "Introduction to Neural Networks with PyTorch"
-subtitle: "ICCS Summer School 2024"
+subtitle: "ICCS Summer School 2025"
 bibliography: references.bib
 format:
   revealjs:
@@ -22,9 +22,8 @@ authors:
   - name: Matt Archer
     affiliations: ICCS/Cambridge
     orcid: 0009-0002-7043-6769
-  - name: Surbhi Goel
+  - name: Isaac Akanho
     affiliations: ICCS/Cambridge
-    orcid: 0009-0005-0237-756X

 revealjs-plugins:
   - attribution
@@ -37,19 +36,18 @@ revealjs-plugins:
 :::: {.columns}
 ::: {.column width=50%}

-* 9:00-9:30 - NN lecture
-* 9:30-10:30 - Teaching/Code-along
-* 10:30-11:00 - Coffee
-* 11:00-12:00 - Teaching/Code-along
+### Wednesday
+* 9:30-10:00 - NN lecture
+* 10:00-10:30 - Teaching/Code-along
+* 13:30-15:00 - Teaching/Code-along

-Lunch

-* 12:00 - 13:30
+### Thursday
+
+* 9:30-10:30 - Teaching/Code-along

 ::: {style="color: turquoise;"}
-Helping Today:

-* Person 1 - Cambridge RSE
 :::
 :::
 ::::
@@ -189,39 +187,33 @@ $$-\frac{dy}{dx}$$
 - When fitting a function, we are essentially creating a model, $f$, which describes some data, $y$.
 - We therefore need a way of measuring how well a model's predictions match our observations.

+## Fitting a straight line with SGD IV {.smaller}

-::: {.fragment .fade-in}

-:::: {.columns}
-::: {.column width="30%"}
+![](error-line.png)
+
+- We can measure the distance between $f(x_{i})$ and $y_{i}$.
+
+
+<!-- :::: {.columns} -->
+<!-- ::: {.column width="30%"} -->

-- Consider the data:
+<!-- - Consider the data:

 | $x_{i}$ | $y_{i}$ |
 |:--------:|:-------:|
 | 1.0 | 2.1 |
 | 2.0 | 3.9 |
-| 3.0 | 6.2 |
+| 3.0 | 6.2 | -->

-:::
-::: {.column width="70%"}
-- We can measure the distance between $f(x_{i})$ and $y_{i}$.
-- Normally we might consider the mean-squared error:
+## Fitting a straight line with SGD V {.smaller}

-$$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$

-:::
-::::

-:::

-::: {.fragment .fade-in}
-- We can differentiate the loss function w.r.t. to each parameter in the the model $f$.
-- We can use these directions of steepest descent to iteratively 'nudge' the parameters in a direction which will reduce the loss.
-:::
+<!-- ::: {.column width="70%"} -->

+- Normally we might consider the mean-squared error:

-## Fitting a straight line with SGD IV {.smaller}
+$$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$

 :::: {.columns}
 ::: {.column width="45%"}
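
To make the loss concrete, the following minimal sketch evaluates $L_{\text{MSE}}$ for the three example data points in the table above; the slope and intercept values are illustrative guesses, not values from the slides.

```python
import torch

# example data points from the slides
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([2.1, 3.9, 6.2])

def f(x, m=2.0, c=0.0):
    # straight-line model y = m*x + c; m and c are illustrative guesses
    return m * x + c

# mean-squared error between predictions f(x_i) and observations y_i
mse = torch.mean((y - f(x)) ** 2)
print(mse.item())
```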
@@ -233,19 +225,43 @@ $$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$
 - Loss: \ $\frac{1}{n}\sum_{i=1}^{n}(y_{i} - x_{i})^{2}$

 :::
-::: {.column width="55%"}

+::: {.column width="55%"}
+
+- We can differentiate the loss function w.r.t. each parameter in the model $f$.
 $$
 \begin{align}
 L_{\text{MSE}} &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - f(x_{i}))^{2}\\
 &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - (mx_{i} + c))^{2}
 \end{align}
 $$
 :::
 ::::

-::: {.fragment .fade-in}
+
+####
+
+## Fitting a straight line with SGD VI {.smaller}
+
+- Derivatives:
+
+$$
+\frac{\partial L}{\partial m}
+\;=\;
+\frac{1}{n}\sum_{i=1}^{n} 2\bigl(m\,x_{i}+c-y_{i}\bigr)\,x_{i}.
+$$
+
+$$
+\frac{\partial L}{\partial c}
+\;=\;
+\frac{1}{n}\sum_{i=1}^{n} 2\bigl(m\,x_{i}+c-y_{i}\bigr).
+$$
+
+- This gradient is used to find the parameters that **minimise the loss**, thereby reducing overall error.
+
## Update Rule
264+
249265
- We can iteratively minimise the loss by stepping the model's parameters in the direction of steepest descent:
250266

251267
::: {layout="[0.5, 1, 0.5, 1, 0.5]"}
@@ -266,7 +282,6 @@ $$c_{n + 1} = c_{n} - \frac{dL}{dc} \cdot l_{r}$$
266282
:::
267283

268284
- where $l_{\text{r}}$ is a small constant known as the _learning rate_.
269-
:::
270285

271286

272287
## Quick recap {.smaller}
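
Combining the derivatives with the update rule gives the complete fitting loop. A minimal sketch reusing `gradients` from the sketch above; the learning rate and step count are arbitrary illustrative choices:

```python
m, c = 0.0, 0.0  # initial parameter guesses
lr = 0.05        # learning rate l_r

for _ in range(500):
    grad_m, grad_c = gradients(m, c, x, y)
    m = m - lr * grad_m  # m_{n+1} = m_n - dL/dm * l_r
    c = c - lr * grad_c  # c_{n+1} = c_n - dL/dc * l_r

print(float(m), float(c))  # approaches the least-squares slope and intercept
```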
@@ -305,7 +320,7 @@ $$a_{l+1} = \sigma \left( W_{l}a_{l} + b_{l} \right)$$
 :::
 ::::

-![](https://3b1b-posts.us-east-1.linodeobjects.com//images/topics/neural-networks.jpg){style="border-radius: 50%;" .absolute top=35% left=42.5% width=65%}
+![](https://web.archive.org/web/20230105124836if_/https://3b1b-posts.us-east-1.linodeobjects.com//images/topics/neural-networks.jpg){style="border-radius: 50%;" .absolute top=35% left=42.5% width=65%}

 ::: {.attribution}
 Image source: [3Blue1Brown](https://www.3blue1brown.com/topics/neural-networks)
@@ -329,9 +344,178 @@ Image source: [3Blue1Brown](https://www.3blue1brown.com/topics/neural-networks)

 - In this workshop, we will implement some straightforward neural networks in PyTorch, and use them for different classification and regression problems.
 - PyTorch is a deep learning framework that can be used in both Python and C++.
-- I have never met anyone actually training models in C++; I find it a bit weird.
+- There are other frameworks, such as JAX, TensorFlow, and PyTorch Lightning.
 - See the PyTorch website: [https://pytorch.org/](https://pytorch.org/)

+# Datasets, DataLoaders & `nn.Module`
+
+---
+
+## What a `Dataset` class does
+
+- Provides a **uniform API** to your data
+- Handles
+  - **Loading** raw files (images, CSVs, audio …)
+  - **Train / validation / test** split logic
+  - **Transforms / augmentation** per item
+  - **Item retrieval** so the rest of PyTorch can stay agnostic
+
+---
+
+## Anatomy of a custom `Dataset`
+
+```python
+class MyDataset(torch.utils.data.Dataset):
+    def __init__(self, root_dir, split="train", transform=None):
+        # (1) load or download files / labels
+        self.paths, self.labels = load_index_file(root_dir, split)
+        self.transform = transform  # (2) save transforms
+```
+
+*The constructor is where you gather file paths, download archives, read CSVs, etc.*
+
+---
+
+## `__len__` & `__getitem__`
+
+```python
+    def __len__(self):
+        return len(self.paths)  # total #samples
+
+    def __getitem__(self, idx):
+        img = PIL.Image.open(self.paths[idx]).convert("RGB")
+        if self.transform:  # (3) apply transforms
+            img = self.transform(img)
+        label = self.labels[idx]
+        return img, label  # (4) single example
+```
+
+With these two methods PyTorch knows **how big** the dataset is and **how to fetch** one record.
+
+---
+
+## Using the custom dataset
+
+```python
+from torchvision import transforms
+
+train_ds = MyDataset(
+    "data/cats_vs_dogs",
+    split="train",
+    transform=transforms.ToTensor()
+)
+print(len(train_ds))  # e.g. ➜ 20_000
+img, y = train_ds[0]  # one (tensor, label) pair
+```
+
+---
+
+## The **DataLoader** at a glance
+
+- Wraps any `Dataset` in an **iterable**
+- **Batches** samples together
+- **Shuffles** if asked
+- Uses **multiprocessing** (`num_workers`) to pre‑fetch data in parallel
+- Returns `(batch, labels)` tuples ready for the GPU
+
+---
+
+## Typical DataLoader code
+
+```python
+train_loader = torch.utils.data.DataLoader(
+    dataset=train_ds,
+    batch_size=64,
+    shuffle=True,
+    num_workers=4,  # 4 CPU workers
+)
+
+for images, labels in train_loader:
+    ...
+```
+
+---
+
+## Quick networks with `nn.Sequential`
+
+```python
+mlp = torch.nn.Sequential(
+    torch.nn.Linear(784, 256), torch.nn.ReLU(),
+    torch.nn.Linear(256, 64), torch.nn.ReLU(),
+    torch.nn.Linear(64, 10)
+)
+
+out = mlp(torch.rand(32, 784))  # 32‑sample batch
+```
+
+Great for simple feed‑forward stacks when no branching logic is needed.
+
+---
+
+## `nn.Module` overview
+
+- The **base class** for *all* neural‑network parts in PyTorch
+- You **sub‑class**, then implement
+  - `__init__(self)`: declare layers
+  - `forward(self, x)`: define the forward pass
+
+---
+
+## Declaring layers in `__init__`
+
+```python
+class MyCNN(torch.nn.Module):
+    def __init__(self, num_classes=2):
+        super().__init__()
+        self.features = torch.nn.Sequential(
+            torch.nn.Conv2d(3, 32, 3, padding=1), torch.nn.ReLU(),
+            torch.nn.MaxPool2d(2),
+            torch.nn.Conv2d(32, 64, 3, padding=1), torch.nn.ReLU(),
+            torch.nn.MaxPool2d(2)
+        )
+        # 64 channels x a 56x56 feature map (assumes 224x224 inputs)
+        self.classifier = torch.nn.Linear(64*56*56, num_classes)
+```
+
+---
+
+## The `forward` pass
+
+```python
+    def forward(self, x):
+        x = self.features(x)    # conv stack
+        x = x.flatten(1)        # flatten to (N, features)
+        x = self.classifier(x)  # logits
+        return x
+```
+
+Only **forward** is needed – back‑prop is handled automatically.
+
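
To see what "handled automatically" means: once a forward pass produces a scalar, a single `backward()` call populates `.grad` on every parameter. A minimal sketch; the 224×224 input size is an assumption implied by the `64*56*56` classifier above:

```python
import torch

model = MyCNN()
x = torch.rand(8, 3, 224, 224)  # batch of 8 RGB images
loss = model(x).sum()           # any scalar output works for illustration
loss.backward()                 # autograd computes every parameter gradient

print(model.classifier.weight.grad.shape)  # gradients are now populated
```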
+---
+
+## Calling the model ≈ calling `forward`
+
+```python
+model = MyCNN()
+logits1 = model(images)          # preferred ✔
+logits2 = model.forward(images)  # works, but avoid
+```
+
+`model(input)` internally routes to `model.forward(input)` via `__call__`.
+
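
A simplified sketch of that routing: the real `nn.Module.__call__` also runs registered hooks, but the core dispatch is essentially the following (illustration only, not PyTorch's actual implementation):

```python
class SimplifiedModule:
    def __call__(self, *args, **kwargs):
        # defining __call__ makes instances callable, so model(x) works;
        # the call is delegated to the subclass's forward()
        return self.forward(*args, **kwargs)
```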
+---
+
+## Key Take‑Aways
+
+1. **Dataset** = organized access to *individual* samples
+2. **DataLoader** = batching, shuffling, parallel I/O
+3. `nn.Module` = reusable building block; override `__init__` & `forward`
+4. `model(x)` is the idiomatic way to run a forward pass
+5. Use `nn.Sequential` for quick layer chains
+

 # Exercises

@@ -506,13 +690,13 @@ For more information we can be reached at:

 ::: {.column width="25%"}

-{{< fa pencil >}} \ Surbhi Goel
+{{< fa pencil >}} \ Isaac Akanho

 {{< fa solid person-digging >}} \ [ICCS/UoCambridge](https://iccs.cam.ac.uk/about-us/our-team)

-{{< fa solid envelope >}} \ [sg2147[AT]cam.ac.uk](mailto:sg2147@cam.ac.uk)
+{{< fa solid envelope >}} \ [ia464[AT]cam.ac.uk](mailto:ia464@cam.ac.uk)

-{{< fa brands github >}} \ [surbhigoel77](https://github.com/surbhigoel77)
+{{< fa brands github >}} \ [isaacaka](https://github.com/isaacaka)

 :::
