
Deep Learning Practical Session 6


Introduction


The objective of this session is to observe the impact of residual connections and batch-normalization on the gradient norm at different depths in a residual network.

You can start this session with an embryo of code that includes an implementation of a residual network and an example of graph drawing with Matplotlib:

https://fleuret.org/dlc/src/dlc_practical_6_embryo.py


    • Modification of the ResNet implementation


Edit the implementation of the ResNet and ResNetBlock so that you can pass two Boolean flags skip_connections and batch_normalization to specify whether these features are activated.
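A minimal sketch of what the modified classes could look like, assuming the embryo's block follows the usual conv / batch-norm / ReLU / conv / batch-norm / add / ReLU pattern. The names conv1, bn1, and resnet_blocks match the hint given further below; the stem, pooling, and classifier layers are plausible guesses to be adapted to the actual embryo code:

import torch
from torch import nn
from torch.nn import functional as F

class ResNetBlock(nn.Module):
    def __init__(self, nb_channels, kernel_size,
                 skip_connections = True, batch_normalization = True):
        super().__init__()
        self.skip_connections = skip_connections
        self.batch_normalization = batch_normalization
        self.conv1 = nn.Conv2d(nb_channels, nb_channels, kernel_size,
                               padding = (kernel_size - 1) // 2)
        self.bn1 = nn.BatchNorm2d(nb_channels)
        self.conv2 = nn.Conv2d(nb_channels, nb_channels, kernel_size,
                               padding = (kernel_size - 1) // 2)
        self.bn2 = nn.BatchNorm2d(nb_channels)

    def forward(self, x):
        y = self.conv1(x)
        if self.batch_normalization: y = self.bn1(y)
        y = F.relu(y)
        y = self.conv2(y)
        if self.batch_normalization: y = self.bn2(y)
        # With the flag off, the block degenerates to a plain stack of convolutions
        if self.skip_connections: y = y + x
        y = F.relu(y)
        return y

class ResNet(nn.Module):
    def __init__(self, nb_residual_blocks, nb_channels, kernel_size,
                 skip_connections = True, batch_normalization = True):
        super().__init__()
        # Hypothetical MNIST-style stem and head; adapt to the embryo code
        self.conv = nn.Conv2d(1, nb_channels, kernel_size = 1)
        self.resnet_blocks = nn.Sequential(
            *(ResNetBlock(nb_channels, kernel_size,
                          skip_connections, batch_normalization)
              for _ in range(nb_residual_blocks))
        )
        self.avg = nn.AvgPool2d(kernel_size = 28)
        self.fc = nn.Linear(nb_channels, 10)

    def forward(self, x):
        x = F.relu(self.conv(x))
        x = self.resnet_blocks(x)
        x = F.relu(self.avg(x)).view(x.size(0), -1)
        return self.fc(x)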


    • Monitoring the gradient norm

Write a function get_stats(skip_connections, batch_normalization) that

    1. creates a model with 30 residual blocks, 10 channels, and 3×3 kernels,

    2. computes the norm of the gradient of the cross-entropy loss with respect to the weights of the first convolutional layer of each residual block, on 100 individual samples,

    3. returns the resulting 30×100 tensor.

Hint: You can create a list of the weight tensors of the first convolution layer of each block with:

monitored_parameters = [ b.conv1.weight for b in model.resnet_blocks ]

and use it to get the gradient norm for each; a sketch of the whole function follows.
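One possible shape for the function, assuming the ResNet sketched above and MNIST-shaped inputs; the random tensors are a stand-in for the 100 real training samples, which should come from the embryo's data loading code:

def get_stats(skip_connections, batch_normalization):
    model = ResNet(nb_residual_blocks = 30, nb_channels = 10, kernel_size = 3,
                   skip_connections = skip_connections,
                   batch_normalization = batch_normalization)
    criterion = nn.CrossEntropyLoss()

    nb_samples = 100
    # Placeholder data: replace with the first 100 training samples
    # loaded by the embryo code
    x = torch.randn(nb_samples, 1, 28, 28)
    t = torch.randint(10, (nb_samples,))

    monitored_parameters = [ b.conv1.weight for b in model.resnet_blocks ]
    result = torch.empty(len(monitored_parameters), nb_samples)

    for n in range(nb_samples):
        # One forward/backward pass per individual sample
        model.zero_grad()
        loss = criterion(model(x[n:n+1]), t[n:n+1])
        loss.backward()
        for d, p in enumerate(monitored_parameters):
            result[d, n] = p.grad.norm()

    return result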


    • Graph


Plot, for the four configurations of the two Boolean flags skip_connections and batch_normalization, the average of the gradient norm vs. depth.
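A minimal plotting sketch using the get_stats function above; the logarithmic y-axis is a choice, since without skip connections the gradient norms can span several orders of magnitude across depth:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

for skip in (False, True):
    for bn in (False, True):
        stats = get_stats(skip, bn)           # 30x100 tensor
        ax.plot(stats.mean(dim = 1).numpy(),  # average over the 100 samples
                label = f'skip_connections={skip}, batch_normalization={bn}')

ax.set_xlabel('Residual block depth')
ax.set_ylabel('Average gradient norm')
ax.set_yscale('log')
ax.legend()
plt.show()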


If you use a notebook, you can set the Matplotlib backend to the 'inline' one to have graphs appear in it with

%matplotlib inline
