Image Processing with MATLAB and AI
Version of 5 February 2026, 14:01

| Author: | Ajay Paul |
| Type: | Bachelor thesis |
| Start date: | TBD |
| Submission date: | TBD |
| Supervisor: | Prof. Dr.-Ing. Schneider |
| Second examiner: | Mirek Göbel |
Introduction
Signal processing has long been fundamental to the analysis and manipulation of data in domains such as telecommunications, radar, audio processing, and imaging. Traditionally, it has focused on analyzing, filtering, and interpreting signals in both the time and frequency domains. With the integration of AI, especially deep learning, the possibilities expand to tasks such as learned noise reduction, restoration, segmentation, and classification. Signal processing remains essential in many modern applications such as communications, audio and image processing, biomedical signal analysis, and control systems.
Task list
- Solve the image processing tasks with classic algorithms.
- Research AI solutions.
- Evaluate the solutions using a morphological box (Zwicky box).
- Solve the image processing tasks with AI algorithms.
- Compare the different approaches according to scientific criteria.
- Document the work scientifically.
Getting started
- You are invited to the image processing class in MATLAB Grader.
- Choose the examples from the categories in Table 1. Start with 1 and end with 6.
- Solve the tasks with classic algorithms.
- Solve the tasks with AI.
- Save your results in SVN.
- Use MATLAB R2025b for the algorithms.
Image Processing Tasks with classic algorithms
Table 1: Image processing categories and corresponding Grader lectures
| # | Category | Grader Lecture |
|---|---|---|
| 1 | Recovery and restoration of information | 5 |
| 2 | Image enhancement | 6 |
| 3 | Noise Reduction | 7 |
| 4 | Data based segmentation | 8 |
| 5 | Model based segmentation | 9 |
| 6 | Classification | 10 |
SVN Repository:
https://svn.hshl.de/svn/MATLAB_Vorkurs/trunk/Signalverarbeitung_mit_Kuenstlicher_Intelligenz
Requirements regarding the Scientific Methodology
- Scientific methodology (project plan, etc.), helpful article: Gantt Diagramm erstellen
- Weekly progress reports (informative), update the Meeting Minutes
- Project presentation in the wiki
- Daily backup of work results in SVN
- Student Projects with Prof. Schneider
- Anforderungen an eine wissenschaftliche Arbeit
→ back to the main article: Requirements for a scientific project
Convolutional Neural Network for Image Classification
A Convolutional Neural Network (CNN) is a class of deep neural networks, most commonly applied to analyzing visual imagery. This article describes a specific implementation of a CNN using the TensorFlow and Keras libraries to classify images from the CIFAR-10 dataset.
Overview
The model described herein is designed to classify low-resolution color images ($32 \times 32$ pixels) into one of ten distinct classes (e.g., airplanes, automobiles, birds, cats). The implementation utilizes a sequential architecture consisting of a convolutional base for feature extraction followed by a dense network for classification.
Dataset
The system is trained on the CIFAR-10 dataset, which consists of 60,000 $32 \times 32$ color images in 10 classes, with 6,000 images per class.
- Training set: 50,000 images
- Test set: 10,000 images
- Preprocessing: Pixel values are normalized to the range [0, 1] by dividing by 255.0 to accelerate convergence during gradient descent.
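The normalization step can be sketched in NumPy. The batch below is a random stand-in shaped like CIFAR-10 data; in practice the arrays would come from tf.keras.datasets.cifar10.load_data():

```python
import numpy as np

def normalize_images(images_uint8):
    """Scale 8-bit pixel values from [0, 255] to floats in [0.0, 1.0]."""
    return images_uint8.astype(np.float32) / 255.0

# Stand-in batch with the CIFAR-10 shape (N, 32, 32, 3); real data would
# come from tf.keras.datasets.cifar10.load_data().
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

normalized = normalize_images(batch)
assert normalized.dtype == np.float32
assert 0.0 <= normalized.min() and normalized.max() <= 1.0
```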
[[File:Dataset_Example.png|thumb|center|Dataset example used for CNN image classification]]
Network Architecture
The architecture follows a standard pattern: stacked Conv2D $\rightarrow$ MaxPooling blocks for feature extraction, followed by Flatten and Dense layers for classification.
| Layer Type | Parameters | Output Shape | Description |
|---|---|---|---|
| Input | - | (32, 32, 3) | RGB Image input. |
| Conv2D | 32 filters, $3\times3$ kernel, ReLU | (30, 30, 32) | Extracts low-level features (edges). |
| MaxPooling2D | $2\times2$ pool size | (15, 15, 32) | Downsamples spatial dimensions. |
| Conv2D | 64 filters, $3\times3$ kernel, ReLU | (13, 13, 64) | Extracts complex features. |
| MaxPooling2D | $2\times2$ pool size | (6, 6, 64) | Downsamples spatial dimensions. |
| Conv2D | 64 filters, $3\times3$ kernel, ReLU | (4, 4, 64) | Extracts high-level features. |
| Flatten | - | (1024) | Converts 3D tensor to 1D vector. |
| Dense | 64 units, ReLU | (64) | Fully connected layer. |
| Dense | 10 units | (10) | Output layer (Logits). |
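The output shapes in the table can be verified with a short sketch, assuming the Keras defaults for this pattern: 'valid' padding with stride 1 for Conv2D, and non-overlapping $2\times2$ max pooling:

```python
def conv_valid(h, w, k=3):
    # 'valid' convolution, stride 1: each spatial dimension shrinks by k - 1
    return h - k + 1, w - k + 1

def max_pool(h, w, p=2):
    # non-overlapping p x p pooling: integer division of the spatial size
    return h // p, w // p

h, w = 32, 32
h, w = conv_valid(h, w)   # Conv2D(32)    -> (30, 30)
h, w = max_pool(h, w)     # MaxPooling2D  -> (15, 15)
h, w = conv_valid(h, w)   # Conv2D(64)    -> (13, 13)
h, w = max_pool(h, w)     # MaxPooling2D  -> (6, 6)
h, w = conv_valid(h, w)   # Conv2D(64)    -> (4, 4)
flat = h * w * 64         # Flatten       -> 1024
assert (h, w, flat) == (4, 4, 1024)
```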
Mathematical Descriptions of Core Operations
Convolution
The core building block of the CNN. A filter (kernel) $K$ slides over the input image $I$, computing the dot product at each position. For a pixel at location $(i, j)$, the feature map value $S$ is calculated as:

$S(i, j) = (I * K)(i, j) = \sum_{m} \sum_{n} I(i+m, j+n)\, K(m, n)$
This operation preserves the spatial relationship between pixels while learning features.
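This sliding dot product can be sketched directly in NumPy. Note that, like most deep learning frameworks, the sketch computes a cross-correlation (no kernel flipping); the input and kernel values are arbitrary examples:

```python
import numpy as np

def conv2d_valid(I, K):
    """Valid cross-correlation as used in CNN layers:
    S(i, j) = sum_m sum_n I(i+m, j+n) * K(m, n)."""
    kh, kw = K.shape
    oh, ow = I.shape[0] - kh + 1, I.shape[1] - kw + 1
    S = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # dot product of the kernel with the image patch at (i, j)
            S[i, j] = np.sum(I[i:i + kh, j:j + kw] * K)
    return S

I = np.arange(16.0).reshape(4, 4)   # toy 4x4 "image"
K = np.ones((3, 3))                 # simple summing kernel
S = conv2d_valid(I, K)              # output shape (2, 2)
```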
Rectified Linear Unit (ReLU)
The activation function used in the hidden layers to introduce non-linearity. It outputs the input directly if it is positive; otherwise, it outputs zero:

$f(x) = \max(0, x)$
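A one-line NumPy sketch of this element-wise operation:

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise
    return np.maximum(0, x)

out = relu(np.array([-2.0, -0.5, 0.0, 1.5]))
# negative inputs are clipped to 0, positive inputs pass through unchanged
assert out.tolist() == [0.0, 0.0, 0.0, 1.5]
```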
Max Pooling
This operation reduces the dimensionality of the feature map by keeping only the maximum value within a specified window (typically $2\times2$). This reduces computation and helps make the model invariant to small translations.
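A minimal NumPy sketch of non-overlapping $2\times2$ max pooling on a single 2D feature map (even height and width assumed; the input values are an arbitrary example):

```python
import numpy as np

def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling on a 2D feature map."""
    h, w = x.shape
    # group pixels into 2x2 windows, then take the maximum of each window
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 1],
                 [0, 1, 5, 6],
                 [2, 2, 7, 8]])
pooled = max_pool_2x2(fmap)   # keeps only the maximum of each 2x2 window
assert pooled.tolist() == [[4, 2], [2, 8]]
```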
Softmax and Loss
The output layer produces "logits" (raw scores). For classification, the model optimizes the Sparse Categorical Crossentropy loss. During prediction, the raw logits $z$ are converted into probabilities using the Softmax function:

$\sigma(z)_j = \dfrac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \quad \text{for } j = 1, \ldots, K$
Where $K$ is the number of classes (10).
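A NumPy sketch of this conversion, with an arbitrary logit vector for the $K = 10$ classes:

```python
import numpy as np

def softmax(z):
    # subtracting the maximum logit is a standard numerical-stability
    # trick; it does not change the resulting probabilities
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5, 0.0, -0.5, 1.5, 0.3, -2.0])
probs = softmax(logits)
assert abs(probs.sum() - 1.0) < 1e-9      # probabilities sum to 1
assert probs.argmax() == logits.argmax()  # ordering of classes is preserved
```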
Implementation Details
Training Configuration
The model is compiled with the following hyperparameters:
- Optimizer: Adam (Adaptive Moment Estimation).
- Loss Function: Sparse Categorical Crossentropy (from_logits=True).
- Metrics: Accuracy.
- Epochs: 10 iterations over the entire dataset.
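For illustration, the configured loss can be written out for a single example as the negative log-softmax probability of the true class. This is a hedged NumPy sketch of the from_logits=True case with an arbitrary logit vector, not the Keras implementation itself:

```python
import numpy as np

def sparse_categorical_crossentropy(logits, label):
    """Cross-entropy for one example with an integer label, computed
    from raw logits (the from_logits=True case)."""
    z = logits - np.max(logits)                 # numerical stability
    log_probs = z - np.log(np.sum(np.exp(z)))   # log-softmax
    return -log_probs[label]

logits = np.array([0.2, 3.1, -0.5, 0.0, 1.2, -1.0, 0.4, 0.1, 2.0, -0.3])
loss = sparse_categorical_crossentropy(logits, label=1)
# the loss is smallest when the labelled class has the largest logit
assert loss > 0.0
```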
Performance
After training for 10 epochs, the model typically achieves:
- Training Accuracy: High (variable based on initialization).
- Test Accuracy: Approximately 70% – 75%.
- Overfitting: A divergence between training accuracy and validation accuracy suggests the model may memorize training data. Techniques such as Dropout or Data Augmentation are recommended to mitigate this.
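As an illustration of the first suggested countermeasure, here is a minimal inverted-dropout sketch in NumPy (Keras' Dropout layer behaves analogously at training time; the rate and shapes are arbitrary examples):

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout (training time): zero a fraction `rate` of the
    activations and scale the survivors by 1 / (1 - rate) so the
    expected activation is unchanged."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones(1000)
dropped = dropout(activations, rate=0.5, rng=rng)
# survivors are scaled to 2.0, the rest are zeroed; the mean stays near 1
assert set(dropped.tolist()) <= {0.0, 2.0}
assert abs(dropped.mean() - 1.0) < 0.2
```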
Inference on Custom Images
To test the model on high-quality external images, the input must be resized to match the network's input constraints ($32 \times 32$ pixels).
from tensorflow.keras.preprocessing import image

img = image.load_img(path, target_size=(32, 32))        # resize to the network input
img_array = image.img_to_array(img)[None, ...] / 255.0  # normalize and add batch dimension: (1, 32, 32, 3)
predictions = model.predict(img_array)                  # logits for the 10 classes
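The logits returned by the model can then be mapped to a class name. The list below follows the standard CIFAR-10 label order; the logits row is a hypothetical stand-in for an actual prediction:

```python
import numpy as np

CIFAR10_CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck']

# hypothetical stand-in for model.predict(img_array): one row of 10 logits
predictions = np.array([[0.1, 0.3, -0.2, 2.5, 0.0, 0.4, -1.0, 0.2, 0.1, 0.05]])
predicted_class = CIFAR10_CLASSES[int(np.argmax(predictions[0]))]
assert predicted_class == 'cat'   # index 3 has the largest logit
```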