WAIL-PREC-SBGD: Pattern Recognition using Score-Based System & Gradient Descent

Current C++ Implementation (classificator module)

The project now includes a concrete demo pipeline in the classificator folder that does two core tasks:

Build a neighbor graph from active points (spatial relationships).
Build chains of neighbors (connected components) and classify each chain with a score-based model trained with gradient descent.

Data flow

Point activation by saturation
- isActivePoint(...) converts RGB to HSV-style saturation and marks a point active when saturation is above a threshold.
Neighbor generation
- neighboorArray(...) treats each point as a graph node.
- Two points become neighbors when distance is within either point radius.
- Neighbor links are stored symmetrically.
Chain generation
- generateNeighborChains(...) runs BFS on the neighbor graph.
- Each connected component becomes one NeighborChain.
Feature extraction per chain
- avgSaturation: average saturation across points in the chain.
- compactness: inverse-normalized average pairwise distance.
- density: local graph connectivity.
- sizeNorm: normalized chain size.
Classification with score + gradient descent
- classifyNeighborChains(...) uses a linear score with sigmoid output for confidence.
- Weights are refined over epochs using gradient descent against a pseudo-target heuristic.
- Output includes:
  - confidence for each chain,
  - isFaceLike boolean per chain,
  - ClassificationResult summary (totalChains, faceLikeChains, bestConfidence).
Memory cleanup
- freeNeighborChains(...) releases chain allocations.

Public API added

Declared in classificator/classificator.h:

int generateNeighborChains(MemoryZone *zone, NeighborChain **chainsOut);
ClassificationResult classifyNeighborChains(NeighborChain *chains, int chainCount, float learningRate, int epochs);
void freeNeighborChains(NeighborChain *chains, int chainCount);

Run the demo test

From project root:

g++ .\classificator\classificator.cpp .\classificator\test_classificator.cpp -std=c++17 -o .\classificator\test_classificator.exe
.\classificator\test_classificator.exe

Notes

This is still a demo / educational implementation (small scale, simple heuristics).
The classifier currently uses pseudo-label heuristics instead of an externally labeled dataset.
The architecture is now ready to swap pseudo-targets with real training labels later.

Mathematical Formalization

1. Input Representation

Given an input image $I \in \mathbb{R}^{N \times M \times 3}$ where:

$N, M$ are image dimensions
Each pixel $p_{i,j} = (R, G, B)$ with $R, G, B \in [0, 255]$

Convert to HSV color space: $I_{HSV} \in \mathbb{R}^{N \times M \times 3}$ where each pixel has $(H, S, V)$

2. Saturation Shift Detection

Define saturation gradient: $$\nabla S(i,j) = \sqrt{\left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2}$$

Where: $$\frac{\partial S}{\partial x}(i,j) = S(i+1,j) - S(i,j)$$ $$\frac{\partial S}{\partial y}(i,j) = S(i,j+1) - S(i,j)$$

Activation zones detected when: $$|\nabla S(i,j)| > \theta_{shift}$$

Where $\theta_{shift}$ is threshold (e.g., 50-100 for significant shifts)

Define activation zone set: $$Z = {(x,y) \mid |\nabla S(x,y)| > \theta_{shift}}$$

3. Shape Extraction and Normalization

Extract boundary points forming shape contour: $$B = {b_1, b_2, ..., b_k}$$ where $b_i = (x_i, y_i) \in Z$

Compute shape centroid: $$c = (\bar{x}, \bar{y}) = \left(\frac{1}{k}\sum_{i=1}^k x_i, \frac{1}{k}\sum_{i=1}^k y_i\right)$$

Compute orientation angle $\phi$ using principal component analysis: $$\phi = \arctan\left(\frac{\lambda_1 - a}{b}\right)$$

Where $a, b, c$ are covariance matrix elements and $\lambda_1$ is largest eigenvalue.

Apply rotation transformation $R(\phi)$ to normalize: $$B' = R(-\phi) \cdot (B - c)$$

4. Score System

For each cell $(i,j)$ in normalized region, compute score vector $\mathbf{s}_{i,j}$:

4.1 Identity Score: $$s_{identity}(i,j) = \exp\left(-\frac{|I'(i,j) - T(i,j)|^2}{2\sigma^2}\right)$$

Where $T$ is target template, $\sigma$ is sensitivity parameter.

4.2 Neighborhood Score: $$s_{neighbor}(i,j) = \frac{1}{8}\sum_{(p,q) \in N(i,j)} \exp\left(-\frac{|I'(p,q) - T(p,q)|^2}{2\sigma^2}\right)$$

Where $N(i,j)$ is 8-neighborhood of $(i,j)$.

4.3 Overall Score: $$s_{overall}(i,j) = w_1 \cdot s_{identity}(i,j) + w_2 \cdot s_{neighbor}(i,j)$$

With weights $w_1 + w_2 = 1$ (typically $w_1 = 0.6, w_2 = 0.4$).

5. Recognition Decision

Compute global match score: $$S_{match} = \frac{1}{|R|}\sum_{(i,j) \in R} s_{overall}(i,j)$$

Where $R$ is the region of interest.

Classification: $$\text{Match} = \begin{cases} \text{True} & \text{if } S_{match} \geq 0.7 \ \text{False} & \text{otherwise} \end{cases}$$

6. Gradient Descent Optimization

To improve matching, optimize weights $\mathbf{w} = (w_1, w_2)$ using gradient descent:

Loss function: $$L(\mathbf{w}) = \frac{1}{n}\sum_{i=1}^n \left(y_i - S_{match}^{(i)}(\mathbf{w})\right)^2$$

Where $y_i \in {0, 1}$ are ground truth labels.

Update rule: $$\mathbf{w}_{t+1} = \mathbf{w}_t - \eta \nabla L(\mathbf{w}_t)$$

Where $\eta$ is learning rate and: $$\frac{\partial L}{\partial w_k} = \frac{2}{n}\sum_{i=1}^n \left(y_i - S_{match}^{(i)}\right) \cdot \frac{\partial S_{match}^{(i)}}{\partial w_k}$$

Algorithm Summary

Input: Image $I$, Target template $T$
Output: Match score $S_{match}$ and classification

Convert $I$ to HSV color space
Detect saturation shift zones $Z$
Extract shape boundary $B$ and normalize to $B'$
Compute score matrix $\mathbf{S}$ for all pixels
Calculate global match score $S_{match}$
Apply gradient descent to update parameters
Return classification result

Learning Resources: Understanding the Mathematics

1. Color Space Conversion (HSV)

What you need to know:

RGB represents colors as combinations of Red, Green, Blue (0-255 each)
HSV represents colors as Hue (color type 0-360°), Saturation (color intensity 0-1), Value (brightness 0-1)
HSV is better for detecting color changes because saturation and brightness are separate

Learn from:

Khan Academy - Color Models: https://www.khanacademy.org/computing/pixar/color
Video: "RGB to HSV Conversion Explained" - Search YouTube
Article: https://www.rapidtables.com/convert/color/rgb-to-hsv.html
Practice: Write a function to convert one RGB pixel to HSV by hand first

Key concept: Saturation tells us how "colorful" vs "gray" a pixel is - perfect for finding edges between different regions.

2. Gradient & Partial Derivatives

What you need to know:

$\frac{\partial S}{\partial x}$ means "how much does saturation change when we move right by 1 pixel"
It's just: S(next_pixel) - S(current_pixel)
The gradient $\nabla S$ combines x and y changes to show the direction and magnitude of biggest change

Learn from:

3Blue1Brown - Essence of Calculus: https://www.youtube.com/watch?v=WUvTyaaNkzM (Chapter 2 on derivatives)
Khan Academy - Partial Derivatives: https://www.khanacademy.org/math/multivariable-calculus/multivariable-derivatives
Image Processing: "Edge Detection Fundamentals" - search for Sobel operator tutorials

Key concept: Think of gradient as "which direction would water flow" - it points toward the steepest change.

Simple example:

Saturation values:  [0.2, 0.2, 0.8, 0.9]
Gradient:           [0,   0,   0.6, 0.1]  <- Big jump at index 2!

3. Set Notation & Set Builder

What you need to know:

$Z = {(x,y) \mid |\nabla S(x,y)| > \theta}$ reads as: "Z is the set of all points (x,y) such that gradient magnitude > threshold"
$\mid$ means "such that" or "where"
It's just a fancy way to write: for each (x,y): if gradient > threshold, add to Z

Learn from:

Khan Academy - Set Notation: https://www.khanacademy.org/math/statistics-probability/probability-library/basic-set-ops
Article: https://www.mathsisfun.com/sets/set-builder-notation.html

Key concept: This is just a mathematical "filter" - collect all points where condition is true.

4. Centroid & Mean

What you need to know:

Centroid = "center of mass" or average position of all points
$\bar{x} = \frac{1}{k}\sum_{i=1}^k x_i$ means: "add all x coordinates and divide by count"
It's literally just the average: (x1 + x2 + ... + xk) / k

Learn from:

Khan Academy - Mean/Average: https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data
Visual: Draw 5 dots on paper, find their average X and Y position - that's the centroid

Key concept: Used to find the "middle" of your detected shape so you can center it.

5. Principal Component Analysis (PCA) & Rotation

What you need to know:

PCA finds the main "direction" of your shape
The orientation angle tells you how tilted the shape is
You use this angle to rotate the shape to a standard orientation (like rotating a tilted face to be upright)

Learn from:

StatQuest - PCA: https://www.youtube.com/watch?v=FgakZw6K1QQ (BEST video explanation!)
Khan Academy - Linear Algebra: https://www.khanacademy.org/math/linear-algebra
Eigenvectors Explained: https://www.youtube.com/watch?v=PFDu9oVAE-g

Simplified approach for coding: You can skip full PCA and use simpler method:

Find two most distant points in your shape
The line between them gives you orientation angle
Use basic 2D rotation matrix to rotate points

2D Rotation Formula (easier to start with):

x_new = x*cos(θ) - y*sin(θ)
y_new = x*sin(θ) + y*cos(θ)

6. Gaussian (Bell Curve) for Similarity

What you need to know:

$\exp\left(-\frac{d^2}{2\sigma^2}\right)$ is the Gaussian/normal distribution
It outputs values close to 1 when distance $d$ is small (similar pixels)
It outputs values close to 0 when distance is large (different pixels)
$\sigma$ controls how "strict" the comparison is (smaller = more strict)

Learn from:

3Blue1Brown - Why π in Gaussian: https://www.youtube.com/watch?v=cy8r7WSuT1I
Khan Academy - Normal Distribution: https://www.khanacademy.org/math/statistics-probability/modeling-distributions-of-data
Interactive: https://www.mathsisfun.com/data/standard-normal-distribution.html

Key concept: This creates a smooth similarity score. Pixels that are almost identical get score ~0.99, slightly different get ~0.7, very different get ~0.01.

Simple example:

If d=0 (identical):     exp(-0/50) = 1.0    (perfect match!)
If d=5 (similar):       exp(-25/50) = 0.61  (pretty good)
If d=20 (different):    exp(-400/50) = 0.0003 (very different)

7. Euclidean Distance (Norm)

What you need to know:

$|A - B|$ means distance between points A and B
For pixels: $\sqrt{(R_1-R_2)^2 + (G_1-G_2)^2 + (B_1-B_2)^2}$
It's just the Pythagorean theorem in 3D (for RGB) or 2D (for positions)

Learn from:

Khan Academy - Distance Formula: https://www.khanacademy.org/math/geometry/hs-geo-analytic-geometry
Visual: Plot two points and measure straight-line distance

8. Gradient Descent Optimization

What you need to know:

Used to automatically find best weights $w_1$ and $w_2$
Loss function $L$ measures "how wrong are we?"
We adjust weights in direction that reduces error (go downhill on error curve)
Learning rate $\eta$ controls step size (0.01 = take small careful steps)

Learn from:

3Blue1Brown - Gradient Descent: https://www.youtube.com/watch?v=IHZwWFHWa-w (MUST WATCH!)
StatQuest - Gradient Descent: https://www.youtube.com/watch?v=sDv4f4s2SB8
Interactive Demo: https://playground.tensorflow.org/ (play with neural network training)

Key concept: Like walking downhill in fog. You check which direction is downward (gradient), take a small step that way, repeat until you reach the bottom.

Simple analogy:

Imagine you're trying to tune a radio (weights = dial position)
Loss = how much static you hear
Gradient = "turning left makes it worse, turning right makes it better"
You keep turning right (adjusting weights) until static (loss) is minimized

9. Summation Notation

What you need to know:

$\sum_{i=1}^n a_i$ means "add up all values from $a_1$ to $a_n$"
It's just a for-loop in math notation: total = 0; for i in 1..n: total += a[i]

Learn from:

Khan Academy - Summation: https://www.khanacademy.org/math/ap-calculus-ab/ab-integration-new/ab-6-3
Article: https://www.mathsisfun.com/algebra/sigma-notation.html

Implementation Strategy

Start Simple, Build Up:

Week 1: RGB to HSV conversion
- Test with single pixel first
- Then test with a small 3x3 image
Week 2: Gradient computation
- Start with 1D case (just x-direction)
- Extend to 2D
Week 3: Finding activation zones
- Create threshold and collect points
- Visualize them (print coordinates)
Week 4: Simple shape centering
- Skip fancy PCA, just compute centroid
- Translate points to center
Week 5: Score system (without optimization)
- Hard-code weights first (0.6, 0.4)
- Test on single pixel pairs
Week 6: Gradient descent
- Start with 1D optimization problem
- Extend to multi-parameter case

Recommended Tools for Learning

Desmos Calculator: https://www.desmos.com/calculator - Graph functions visually
Python Notebook: Use Jupyter to experiment with small examples
NumPy Tutorial: Learn array operations (similar to image operations)
Math Visualization: Use matplotlib to plot gradients, score functions

Quick Reference: Implementation Pseudocode

// 1. Saturation Shift Detection
function detectShifts(image):
    hsv_image = convertRGBtoHSV(image)
    for each pixel (i,j):
        grad_x = hsv[i+1][j].S - hsv[i][j].S
        grad_y = hsv[i][j+1].S - hsv[i][j].S
        magnitude = sqrt(grad_x² + grad_y²)
        if magnitude > threshold:
            add (i,j) to activation_zones
    return activation_zones

// 2. Shape Normalization
function normalizeShape(points):
    centroid = average of all points
    translate points by -centroid
    // Optional: compute orientation and rotate
    return normalized_points

// 3. Score Computation
function computeScore(pixel, template_pixel, sigma):
    distance = ||pixel - template_pixel||
    score = exp(-distance² / (2 * sigma²))
    return score

// 4. Recognition
function recognizePattern(image, template):
    scores = []
    for each pixel:
        identity_score = computeScore(pixel, template_pixel)
        neighbor_score = average of neighbor scores
        overall = 0.6 * identity_score + 0.4 * neighbor_score
        scores.append(overall)
    
    match_score = average(scores)
    return match_score >= 0.7

// 5. Gradient Descent (Advanced)
function optimizeWeights(training_data):
    w1 = 0.6, w2 = 0.4
    learning_rate = 0.01
    
    for iteration in 1..100:
        loss = 0
        gradient_w1 = 0, gradient_w2 = 0
        
        for each sample:
            predicted = computeMatchScore(sample, w1, w2)
            error = predicted - actual_label
            loss += error²
            
            gradient_w1 += error * (∂score/∂w1)
            gradient_w2 += error * (∂score/∂w2)
        
        w1 -= learning_rate * gradient_w1
        w2 -= learning_rate * gradient_w2
    
    return w1, w2

Questions to Ask Yourself While Coding

What are the inputs and outputs of this function?
What does this formula do in plain English?
Can I test this on a tiny example (2x2 image)?
What should happen at image boundaries?
What if the input is all zeros? All the same value?
How do I visualize the result to verify correctness?

Good luck! Take it step by step, and don't hesitate to implement simpler versions first before adding complexity.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
classificator		classificator
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
main.cpp		main.cpp
readme.md		readme.md

Folders and files

Latest commit

History

Repository files navigation

WAIL-PREC-SBGD: Pattern Recognition using Score-Based System & Gradient Descent

Current C++ Implementation (classificator module)

Data flow

Public API added

Run the demo test

Notes

Mathematical Formalization

1. Input Representation

2. Saturation Shift Detection

3. Shape Extraction and Normalization

4. Score System

5. Recognition Decision

6. Gradient Descent Optimization

Algorithm Summary

Learning Resources: Understanding the Mathematics

1. Color Space Conversion (HSV)

2. Gradient & Partial Derivatives

3. Set Notation & Set Builder

4. Centroid & Mean

5. Principal Component Analysis (PCA) & Rotation

6. Gaussian (Bell Curve) for Similarity

7. Euclidean Distance (Norm)

8. Gradient Descent Optimization

9. Summation Notation

Implementation Strategy

Start Simple, Build Up:

Recommended Tools for Learning

Quick Reference: Implementation Pseudocode

Questions to Ask Yourself While Coding

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages