The Residual Network (ResNet) Standard
ResNet, short for Residual Network, is a landmark model in the history of deep learning. Before ResNet, training very deep neural networks was difficult because of the vanishing gradient problem: the signal used to update the weights becomes very small as it travels backward through many layers.
ResNet addressed this problem by adding skip connections, also known as shortcuts. These connections let the gradient flow around groups of layers instead of passing through every one of them. As a result, very deep networks can be trained, even with hundreds of layers, as in ResNet-50, ResNet-101, and ResNet-152.[1]
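To make the shortcut idea concrete, the following is a minimal sketch of a single residual block built with MATLAB's Deep Learning Toolbox. The input size, filter count, kernel size, and layer names are illustrative assumptions, not the configuration of any specific ResNet stage.

    % Minimal residual block: two conv-BN-ReLU stages plus a shortcut.
    % Sizes and names below are illustrative assumptions.
    layers = [
        imageInputLayer([56 56 64], 'Name', 'in', 'Normalization', 'none')
        convolution2dLayer(3, 64, 'Padding', 'same', 'Name', 'conv1')
        batchNormalizationLayer('Name', 'bn1')
        reluLayer('Name', 'relu1')
        convolution2dLayer(3, 64, 'Padding', 'same', 'Name', 'conv2')
        batchNormalizationLayer('Name', 'bn2')
        additionLayer(2, 'Name', 'add')   % sums the block output and the shortcut input
        reluLayer('Name', 'relu_out')];

    lgraph = layerGraph(layers);          % connects the layers above sequentially
    % The skip connection: route the block input directly into the addition layer.
    lgraph = connectLayers(lgraph, 'in', 'add/in2');

Because the addition layer receives both the transformed features and the untouched input, the gradient has a path that bypasses the two convolutional layers entirely.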
Performance
ResNet-50 is widely regarded as the industry "workhorse." It offers a good balance between accuracy (about 76% top-1 on ImageNet) and the computing power it requires, performs well on most tasks, and is broadly trusted.[2]
MATLAB Usage
It is easy to use in MATLAB via resnet50. Many people treat it as the default option for transfer learning experiments because its behavior is well known and well understood.[3]
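As a hedged sketch of that transfer learning workflow (not the exact recipe from the MATLAB documentation), the pretrained network can be loaded and its classification head swapped out for a new task. The value of numClasses and the new layer names are hypothetical, and the names of the replaced layers ('fc1000', 'ClassificationLayer_fc1000') should be confirmed with analyzeNetwork, since they may differ between releases.

    % Load ResNet-50 pretrained on ImageNet (requires the ResNet-50 support package).
    net = resnet50;
    lgraph = layerGraph(net);

    % Replace the classification head for a new task; numClasses is a hypothetical value.
    numClasses = 5;
    lgraph = replaceLayer(lgraph, 'fc1000', ...
        fullyConnectedLayer(numClasses, 'Name', 'new_fc'));
    lgraph = replaceLayer(lgraph, 'ClassificationLayer_fc1000', ...
        classificationLayer('Name', 'new_output'));
    % The edited graph can then be passed to trainNetwork together with the new image data.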
The Math Behind Vanishing Gradients and Training Problems
The main obstacle when training very deep neural networks is the vanishing gradient problem. The gradients computed during backpropagation become smaller and smaller as they move toward the earlier layers of the network, so the first layers almost stop learning.
This is a consequence of the chain rule. During backpropagation, the gradient at each layer is multiplied by the gradients of all the layers after it, so it becomes a product of many factors. When most of those factors are small, the product becomes vanishingly small.
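In symbols (a standard formulation, not specific to the cited survey), if h_k denotes the activations of layer k and L the loss, the chain rule gives

    \frac{\partial L}{\partial h_1} = \frac{\partial L}{\partial h_N} \prod_{k=1}^{N-1} \frac{\partial h_{k+1}}{\partial h_k}

If the norms of most Jacobian factors are below 1, this product shrinks roughly geometrically with the depth N, which is why the earliest layers receive almost no gradient signal.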
If the network uses activation functions that can saturate, or even ordinary non-linearities with certain weight configurations, the gradients shrink very quickly and can approach zero exponentially; for example, the derivative of the logistic sigmoid never exceeds 0.25, so the activation derivatives alone contribute a factor of at most 0.25^N across N layers. When this happens, the early layers of the model cannot learn properly, and the whole training process slows dramatically or stops making progress.[4]
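This is also where the skip connections described above help mathematically (a standard argument, summarized in surveys such as the one cited above): for a residual block h_{k+1} = h_k + F(h_k), the local Jacobian is

    \frac{\partial h_{k+1}}{\partial h_k} = I + \frac{\partial F(h_k)}{\partial h_k}

Every factor in the backpropagation product therefore contains an identity term, so the gradient can reach the early layers along the shortcut path even when \partial F / \partial h_k is very small.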
References
1. MathWorks. Train Residual Network for Image Classification. MATLAB Documentation. Available at: https://www.mathworks.com/help/deeplearning/ug/train-residual-network-for-image-classification.html
2. DeepLabCut. What neural network should I use? (Trade-offs, speed performance, and considerations). GitHub Wiki. Available at: https://github.com/DeepLabCut/DeepLabCut/wiki/What-neural-network-should-I-use%3F-(Trade-offs,-speed-performance,-and-considerations)
3. MathWorks. Pretrained Convolutional Neural Networks. MATLAB Documentation. Available at: https://www.mathworks.com/help/deeplearning/ug/pretrained-convolutional-neural-networks.html
4. Xu, G., Wang, X., Wu, X., Leng, X. and Xu, Y., 2024. Development of skip connection in deep neural networks for computer vision and medical image analysis: A survey. arXiv preprint arXiv:2405.01725.