Zwicky-box analysis
Current version as of 14 April 2026, 13:49
Zwicky-Box Analysis for Models
We perform a Zwicky-box analysis of four CNN and Transformer models (ResNet-50, MobileNetV2, EfficientNet-B0, ViT-Base) across six image tasks. For each model we examine its key attributes: parameter count, FLOPs, input size, training-data requirements, inference speed, memory footprint, hardware target, explainability, noise robustness, and test accuracy.
We decompose the problem into key dimensions (model size, computational cost, speed, memory footprint, data requirements, transferability, explainability, noise handling, and segmentation output), each with several levels. We assign each model to its levels along these dimensions and evaluate it on tasks such as image restoration, image enhancement, denoising, supervised segmentation, conventional segmentation, and classification.
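The core of a Zwicky-box (morphological) analysis is enumerating every combination of levels across the chosen dimensions. A minimal Python sketch, using hypothetical dimension names and a reduced set of levels for illustration (the actual dimensions and levels are those of the morphological table in this analysis):

```python
from itertools import product

# Illustrative subset of the Zwicky-box dimensions; names and levels
# here are assumptions, not the full table from the analysis.
dimensions = {
    "model_size": ["small", "medium", "large"],
    "compute_cost": ["low", "high"],
    "data_need": ["modest", "huge"],
}

# Each cell of the Zwicky box is one combination of levels,
# i.e. the Cartesian product over all dimensions.
combinations = list(product(*dimensions.values()))

print(len(combinations))  # 3 * 2 * 2 = 12 cells
```

A concrete model (e.g. MobileNetV2 as small / low / modest) occupies exactly one cell of this box, and the analysis then asks which cells are feasible for each task.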
We find that the CNN models (ResNet-50 and EfficientNet-B0) deliver high accuracy and strong pixel-level outputs, but at a higher computational cost. MobileNetV2 is very compact (well suited to edge devices) but less accurate. ViT-Base is very large and data-hungry; it performs well only on classification, and only when trained on a very large dataset.
Below we present the morphological table, explain the model choice for each task, and provide a comparison table and a metric chart.
For pixel-level tasks (restoration, enhancement, or segmentation), CNNs (ResNet-50 or EfficientNet-B0) are the best choice. For mobile or fast real-time use, choose MobileNetV2 or a small EfficientNet variant. For classification only, with abundant training data, use ViT or a larger EfficientNet. The MATLAB Deep Learning Toolbox provides ready-to-use pretrained models such as resnet50, mobilenetv2, efficientnetb0, and visionTransformer.[1][2]
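The selection guidance above can be encoded as a small decision helper. This is a hypothetical Python sketch of the recommendation logic, not a MATLAB API; the task names and return strings are illustrative assumptions:

```python
# Hypothetical encoding of the task-to-model guidance from the analysis.
# Task names and recommendation strings are illustrative, not standardized.
def recommend_model(task, edge_device=False, large_dataset=False):
    if edge_device:
        # Compact model for mobile / real-time deployment.
        return "MobileNetV2"
    if task in ("restoration", "enhancement", "denoising", "segmentation"):
        # Pixel-level outputs favor the CNN backbones.
        return "ResNet-50 / EfficientNet-B0"
    if task == "classification" and large_dataset:
        # ViT pays off only with abundant training data.
        return "ViT-Base"
    # Default: a balanced CNN.
    return "EfficientNet-B0"

print(recommend_model("segmentation"))
print(recommend_model("classification", large_dataset=True))
print(recommend_model("classification", edge_device=True))
```

The helper mirrors the priority order of the text: hardware constraint first, then output type, then data availability.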
References
1. MathWorks. Pretrained Deep Neural Networks. Available at: https://www.mathworks.com/help/deeplearning/ug/pretrained-convolutional-neural-networks.html
2. MathWorks. visionTransformer - Pretrained vision transformer (ViT) neural network. Available at: https://www.mathworks.com/help/vision/ref/visiontransformer.html