The project now includes a concrete demo pipeline in the classificator folder that does two core tasks:
- Build a neighbor graph from active points (spatial relationships).
- Build chains of neighbors (connected components) and classify each chain with a score-based model trained with gradient descent.

- Point activation by saturation
  - `isActivePoint(...)` converts RGB to HSV-style saturation and marks a point active when saturation is above a threshold.
- Neighbor generation
  - `neighboorArray(...)` treats each point as a graph node.
  - Two points become neighbors when distance is within either point radius.
  - Neighbor links are stored symmetrically.
- Chain generation
  - `generateNeighborChains(...)` runs BFS on the neighbor graph.
  - Each connected component becomes one `NeighborChain`.
- Feature extraction per chain
  - `avgSaturation`: average saturation across points in the chain.
  - `compactness`: inverse-normalized average pairwise distance.
  - `density`: local graph connectivity.
  - `sizeNorm`: normalized chain size.
- Classification with score + gradient descent
  - `classifyNeighborChains(...)` uses a linear score with sigmoid output for confidence.
  - Weights are refined over epochs using gradient descent against a pseudo-target heuristic.
  - Output includes:
    - `confidence` for each chain,
    - `isFaceLike` boolean per chain,
    - `ClassificationResult` summary (`totalChains`, `faceLikeChains`, `bestConfidence`).
- Memory cleanup
  - `freeNeighborChains(...)` releases chain allocations.
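
The neighbor and chain steps above can be sketched in a few lines of Python for quick experimentation. This is not the project's C++ code: the `(x, y, radius)` point format and the names `build_neighbors` and `build_chains` are assumptions made for illustration, and "within either point radius" is read here as `distance <= max(radius_a, radius_b)`.

```python
from collections import deque
from math import dist

# Hypothetical point format for this sketch: (x, y, radius).
def build_neighbors(points):
    """Link two points when their distance is within either point's radius."""
    neighbors = [set() for _ in points]
    for a in range(len(points)):
        for b in range(a + 1, len(points)):
            xa, ya, ra = points[a]
            xb, yb, rb = points[b]
            if dist((xa, ya), (xb, yb)) <= max(ra, rb):
                neighbors[a].add(b)  # store the link in both directions
                neighbors[b].add(a)
    return neighbors

def build_chains(neighbors):
    """Return the connected components (chains) of the graph, found with BFS."""
    seen = [False] * len(neighbors)
    chains = []
    for start in range(len(neighbors)):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        chain = []
        while queue:
            node = queue.popleft()
            chain.append(node)
            for nxt in sorted(neighbors[node]):
                if not seen[nxt]:
                    seen[nxt] = True
                    queue.append(nxt)
        chains.append(chain)
    return chains

points = [(0, 0, 1.5), (1, 0, 0.5), (2, 0, 1.0), (10, 10, 1.0)]
chains = build_chains(build_neighbors(points))
print(chains)  # [[0, 1, 2], [3]]
```

Points 0 and 2 are not direct neighbors, but they land in the same chain because both link to point 1.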
Declared in `classificator/classificator.h`:
- `int generateNeighborChains(MemoryZone *zone, NeighborChain **chainsOut);`
- `ClassificationResult classifyNeighborChains(NeighborChain *chains, int chainCount, float learningRate, int epochs);`
- `void freeNeighborChains(NeighborChain *chains, int chainCount);`
From project root:
g++ .\classificator\classificator.cpp .\classificator\test_classificator.cpp -std=c++17 -o .\classificator\test_classificator.exe
.\classificator\test_classificator.exe

- This is still a demo / educational implementation (small scale, simple heuristics).
- The classifier currently uses pseudo-label heuristics instead of an externally labeled dataset.
- The architecture is now ready to swap pseudo-targets with real training labels later.
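
To make the "linear score with sigmoid output, refined by gradient descent" idea concrete, here is a minimal Python sketch. It is not the project's implementation: the feature rows are made up, and the pseudo-target rule (mean feature value above 0.5) is a hypothetical stand-in for whatever heuristic the classifier actually uses. Replacing `targets` with real labels is the only change needed to train on labeled data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, row):
    """Confidence = sigmoid of the linear score w . x + b."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)

def train(features, targets, learning_rate=0.5, epochs=500):
    """Batch gradient descent on the cross-entropy loss."""
    n = len(features[0])
    weights, bias = [0.0] * n, 0.0
    m = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for row, target in zip(features, targets):
            err = predict(weights, bias, row) - target  # dLoss/dScore
            for k in range(n):
                grad_w[k] += err * row[k]
            grad_b += err
        weights = [w - learning_rate * g / m for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / m
    return weights, bias

# Made-up rows: [avgSaturation, compactness, density, sizeNorm]
features = [
    [0.9, 0.8, 0.7, 0.6],
    [0.2, 0.1, 0.2, 0.1],
    [0.8, 0.7, 0.9, 0.5],
    [0.1, 0.3, 0.1, 0.2],
]
# Hypothetical pseudo-target heuristic (an assumption, not the project's rule).
targets = [1.0 if sum(row) / len(row) > 0.5 else 0.0 for row in features]

weights, bias = train(features, targets)
confidences = [predict(weights, bias, row) for row in features]
is_face_like = [c >= 0.5 for c in confidences]
```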
Given an input image $I$:
- $N, M$ are image dimensions
- Each pixel $p_{i,j} = (R, G, B)$ with $R, G, B \in [0, 255]$
Convert to HSV color space:
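Define saturation gradient:
$$\nabla S = \left(\frac{\partial S}{\partial x}, \frac{\partial S}{\partial y}\right)$$
Where:
- $S(x,y)$ is the saturation at pixel $(x,y)$
- $\frac{\partial S}{\partial x}$ and $\frac{\partial S}{\partial y}$ are the changes in saturation between adjacent pixels along each axis

Activation zones detected when:
$$|\nabla S(x,y)| > \theta$$
Where $\theta$ is the gradient-magnitude threshold.
Define activation zone set:
$$Z = \{(x,y) \mid |\nabla S(x,y)| > \theta\}$$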
Extract boundary points forming shape contour:
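Compute shape centroid over the $k$ extracted points $(x_i, y_i)$:
$$\bar{x} = \frac{1}{k}\sum_{i=1}^{k} x_i, \qquad \bar{y} = \frac{1}{k}\sum_{i=1}^{k} y_i$$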
Compute orientation angle
Where
Apply rotation transformation
For each cell
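4.1 Identity Score:
$$S_{identity} = \exp\left(-\frac{d^2}{2\sigma^2}\right)$$
Where $d$ is the distance between the cell and the corresponding template cell, and $\sigma$ controls how strict the comparison is.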
4.2 Neighborhood Score:
Where
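4.3 Overall Score:
$$S_{overall} = w_1 \cdot S_{identity} + w_2 \cdot S_{neighbor}$$
Where $S_{identity}$ and $S_{neighbor}$ are the scores from 4.1 and 4.2.
With weights $w_1 = 0.6$ and $w_2 = 0.4$ as initial values.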
Compute global match score:
Where
Classification:
$$\text{Match} = \begin{cases} \text{True} & \text{if } S_{match} \geq 0.7 \\ \text{False} & \text{otherwise} \end{cases}$$
To improve matching, optimize weights $w_1$ and $w_2$ with gradient descent.
Loss function:
Where
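Update rule:
$$w_k \leftarrow w_k - \eta \frac{\partial L}{\partial w_k}, \quad k \in \{1, 2\}$$
Where $\eta$ is the learning rate.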
Input: Image
Output: Match score
- Convert $I$ to HSV color space
- Detect saturation shift zones $Z$
- Extract shape boundary $B$ and normalize to $B'$
- Compute score matrix $\mathbf{S}$ for all pixels
- Calculate global match score $S_{match}$
- Apply gradient descent to update parameters
- Return classification result
What you need to know:
- RGB represents colors as combinations of Red, Green, Blue (0-255 each)
- HSV represents colors as Hue (color type 0-360°), Saturation (color intensity 0-1), Value (brightness 0-1)
- HSV is better for detecting color changes because saturation and brightness are separate
Learn from:
- Khan Academy - Color Models: https://www.khanacademy.org/computing/pixar/color
- Video: "RGB to HSV Conversion Explained" - Search YouTube
- Article: https://www.rapidtables.com/convert/color/rgb-to-hsv.html
- Practice: Write a function to convert one RGB pixel to HSV by hand first
Key concept: Saturation tells us how "colorful" vs "gray" a pixel is - perfect for finding edges between different regions.
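
As a starting point for the practice exercise, here is a small Python sketch that computes only the saturation channel. It uses the standard-library `colorsys` module and, next to it, the textbook formula $S = (\max - \min) / \max$ written by hand; the function names are illustrative.

```python
import colorsys

def saturation(r, g, b):
    """HSV saturation in [0, 1] for 8-bit RGB channels (0-255)."""
    _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return s

def saturation_by_hand(r, g, b):
    """Same value from S = (max - min) / max, with S = 0 for pure black."""
    hi, lo = max(r, g, b), min(r, g, b)
    return 0.0 if hi == 0 else (hi - lo) / hi

print(saturation(255, 0, 0))      # 1.0  (pure red is fully saturated)
print(saturation(128, 128, 128))  # 0.0  (gray has no saturation)
```

Comparing the two functions on a handful of pixels is an easy way to check a hand-written conversion.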
What you need to know:
- $\frac{\partial S}{\partial x}$ means "how much does saturation change when we move right by 1 pixel"
- It's just: `S(next_pixel) - S(current_pixel)`
- The gradient $\nabla S$ combines x and y changes to show the direction and magnitude of biggest change
Learn from:
- 3Blue1Brown - Essence of Calculus: https://www.youtube.com/watch?v=WUvTyaaNkzM (Chapter 2 on derivatives)
- Khan Academy - Partial Derivatives: https://www.khanacademy.org/math/multivariable-calculus/multivariable-derivatives
- Image Processing: "Edge Detection Fundamentals" - search for Sobel operator tutorials
Key concept: Think of the gradient as the uphill direction on a hill. It points toward the steepest increase (water would flow the opposite way), and its length tells you how steep the change is.
Simple example:
Saturation values: [0.2, 0.2, 0.8, 0.9]
Gradient: [0, 0.6, 0.1] <- Big jump from index 1 to index 2!
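
The same calculation as a short Python sketch, using the forward difference `S(next_pixel) - S(current_pixel)`. That gives one value per adjacent pair; some write-ups pad a leading 0 so the result has the same length as the input. The boundary rule in `gradient_magnitude` (treat a missing neighbor as zero change) is an assumption for this sketch, not a rule taken from the project.

```python
import math

def forward_diff(values):
    """1D gradient: S(next_pixel) - S(current_pixel) for each adjacent pair."""
    # round() only hides floating-point noise such as 0.6000000000000001
    return [round(b - a, 10) for a, b in zip(values, values[1:])]

def gradient_magnitude(sat, i, j):
    """2D forward-difference gradient magnitude at row i, column j.

    Assumption: a difference that would step outside the grid counts as 0.
    """
    gx = sat[i][j + 1] - sat[i][j] if j + 1 < len(sat[i]) else 0.0
    gy = sat[i + 1][j] - sat[i][j] if i + 1 < len(sat) else 0.0
    return math.hypot(gx, gy)

print(forward_diff([0.2, 0.2, 0.8, 0.9]))  # [0.0, 0.6, 0.1]
```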
What you need to know:
- $Z = \{(x,y) \mid |\nabla S(x,y)| > \theta\}$ reads as: "Z is the set of all points (x,y) such that gradient magnitude > threshold"
- $\mid$ means "such that" or "where"
- It's just a fancy way to write: `for each (x,y): if gradient > threshold, add to Z`
Learn from:
- Khan Academy - Set Notation: https://www.khanacademy.org/math/statistics-probability/probability-library/basic-set-ops
- Article: https://www.mathsisfun.com/sets/set-builder-notation.html
Key concept: This is just a mathematical "filter" - collect all points where condition is true.
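
Set-builder notation maps almost word for word onto a Python set comprehension. In this sketch, `magnitude` is a made-up grid of gradient magnitudes indexed as `[row][column]`, and the function name is illustrative.

```python
def activation_zones(magnitude, threshold):
    """Z = {(x, y) | magnitude at (x, y) > threshold}."""
    return {
        (x, y)
        for y, row in enumerate(magnitude)
        for x, value in enumerate(row)
        if value > threshold
    }

magnitude = [
    [0.0, 0.1, 0.7],
    [0.05, 0.6, 0.2],
]
zones = activation_zones(magnitude, threshold=0.5)
print(sorted(zones))  # [(1, 1), (2, 0)]
```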
What you need to know:
- Centroid = "center of mass" or average position of all points
- $\bar{x} = \frac{1}{k}\sum_{i=1}^k x_i$ means: "add all x coordinates and divide by count"
- It's literally just the average: `(x1 + x2 + ... + xk) / k`
Learn from:
- Khan Academy - Mean/Average: https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data
- Visual: Draw 5 dots on paper, find their average X and Y position - that's the centroid
Key concept: Used to find the "middle" of your detected shape so you can center it.
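
A minimal Python sketch of the centroid and the centering step, assuming points are plain `(x, y)` tuples (the function names are illustrative):

```python
def centroid(points):
    """Average x and average y of a list of (x, y) points."""
    k = len(points)
    return (sum(x for x, _ in points) / k, sum(y for _, y in points) / k)

def center(points):
    """Translate points so that their centroid moves to the origin."""
    cx, cy = centroid(points)
    return [(x - cx, y - cy) for x, y in points]

points = [(1, 1), (3, 1), (3, 5), (1, 5)]
print(centroid(points))  # (2.0, 3.0)
print(center(points))    # [(-1.0, -2.0), (1.0, -2.0), (1.0, 2.0), (-1.0, 2.0)]
```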
What you need to know:
- PCA finds the main "direction" of your shape
- The orientation angle tells you how tilted the shape is
- You use this angle to rotate the shape to a standard orientation (like rotating a tilted face to be upright)
Learn from:
- StatQuest - PCA: https://www.youtube.com/watch?v=FgakZw6K1QQ (BEST video explanation!)
- Khan Academy - Linear Algebra: https://www.khanacademy.org/math/linear-algebra
- Eigenvectors Explained: https://www.youtube.com/watch?v=PFDu9oVAE-g
Simplified approach for coding: you can skip full PCA and use a simpler method:
- Find two most distant points in your shape
- The line between them gives you orientation angle
- Use basic 2D rotation matrix to rotate points
2D Rotation Formula (easier to start with):
x_new = x*cos(θ) - y*sin(θ)
y_new = x*sin(θ) + y*cos(θ)
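
The simplified approach can be sketched in Python as follows. The function names are illustrative, the brute-force search over all pairs is only reasonable for small point sets, and the resulting angle is ambiguous by 180° because it depends on which end of the pair comes first.

```python
import math
from itertools import combinations

def orientation_angle(points):
    """Angle (radians) of the line joining the two most distant points."""
    (x1, y1), (x2, y2) = max(
        combinations(points, 2), key=lambda pair: math.dist(*pair)
    )
    return math.atan2(y2 - y1, x2 - x1)

def rotate(points, theta):
    """Apply the 2D rotation formula to every point."""
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (1.0, 1.2)]
theta = orientation_angle(points)  # about pi/4 for this diagonal shape
aligned = rotate(points, -theta)   # rotating by -theta lays the long axis along x
```

After the rotation, the farthest point `(2, 2)` ends up at roughly `(2.83, 0)`, so the shape's long axis lies on the x axis.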
What you need to know:
- $\exp\left(-\frac{d^2}{2\sigma^2}\right)$ is a Gaussian (bell-curve) function of the distance $d$
- It outputs values close to 1 when distance $d$ is small (similar pixels)
- It outputs values close to 0 when distance is large (different pixels)
- $\sigma$ controls how "strict" the comparison is (smaller = more strict)
Learn from:
- 3Blue1Brown - Why π in Gaussian: https://www.youtube.com/watch?v=cy8r7WSuT1I
- Khan Academy - Normal Distribution: https://www.khanacademy.org/math/statistics-probability/modeling-distributions-of-data
- Interactive: https://www.mathsisfun.com/data/standard-normal-distribution.html
Key concept: This creates a smooth similarity score. Pixels that are almost identical get score ~0.99, slightly different get ~0.7, very different get ~0.01.
Simple example (with $\sigma = 5$, so $2\sigma^2 = 50$):
If d=0 (identical): exp(-0/50) = 1.0 (perfect match!)
If d=5 (similar): exp(-25/50) = 0.61 (pretty good)
If d=20 (different): exp(-400/50) = 0.0003 (very different)
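
These numbers can be reproduced with a few lines of Python. The sketch assumes $\sigma = 5$, which is what the denominator of 50 in the example implies; the function name is illustrative.

```python
import math

def similarity(d, sigma):
    """Gaussian similarity: 1 for identical values, falling toward 0 as d grows."""
    return math.exp(-(d ** 2) / (2 * sigma ** 2))

for d in (0, 5, 20):
    print(d, round(similarity(d, sigma=5), 4))
# 0 1.0
# 5 0.6065
# 20 0.0003
```

Trying a smaller `sigma` (for example 2) shows how the score drops off faster, which is what "more strict" means here.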
What you need to know:
- $|A - B|$ means distance between points A and B
- For pixels: $\sqrt{(R_1-R_2)^2 + (G_1-G_2)^2 + (B_1-B_2)^2}$
- It's just the Pythagorean theorem in 3D (for RGB) or 2D (for positions)
Learn from:
- Khan Academy - Distance Formula: https://www.khanacademy.org/math/geometry/hs-geo-analytic-geometry
- Visual: Plot two points and measure straight-line distance
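
A short Python sketch of the same formula. It works for RGB triples and for 2D positions alike, since it only sums squared differences per component; the function name is illustrative, and the standard library's `math.dist` gives the same result.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean((10, 20, 30), (13, 24, 30)))  # 5.0  (RGB pixels)
print(euclidean((0, 0), (3, 4)))              # 5.0  (2D positions)
```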
What you need to know:
- Used to automatically find best weights $w_1$ and $w_2$
- Loss function $L$ measures "how wrong are we?"
- We adjust weights in direction that reduces error (go downhill on error curve)
- Learning rate $\eta$ controls step size (0.01 = take small careful steps)
Learn from:
- 3Blue1Brown - Gradient Descent: https://www.youtube.com/watch?v=IHZwWFHWa-w (MUST WATCH!)
- StatQuest - Gradient Descent: https://www.youtube.com/watch?v=sDv4f4s2SB8
- Interactive Demo: https://playground.tensorflow.org/ (play with neural network training)
Key concept: Like walking downhill in fog. You check which direction is downward (gradient), take a small step that way, repeat until you reach the bottom.
Simple analogy:
Imagine you're trying to tune a radio (weights = dial position)
Loss = how much static you hear
Gradient = "turning left makes it worse, turning right makes it better"
You keep turning right (adjusting weights) until static (loss) is minimized
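
Before applying this to the score weights, it helps to see gradient descent on a one-parameter toy problem. The sketch below minimizes the made-up loss $L(w) = (w - 3)^2$, whose minimum is at $w = 3$; the loss, the learning rate of 0.1, and the step count are all assumptions chosen so the example converges quickly.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient: w <- w - learning_rate * dL/dw."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w

# Toy loss L(w) = (w - 3)^2, so dL/dw = 2 * (w - 3).
w_best = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(round(w_best, 6))  # 3.0
```

Each step shrinks the distance to the minimum by a factor of 0.8 here. With a learning rate above 1.0 the same loop would overshoot further on every step and diverge, which is why the step size matters.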
What you need to know:
- $\sum_{i=1}^n a_i$ means "add up all values from $a_1$ to $a_n$"
- It's just a for-loop in math notation: `total = 0; for i in 1..n: total += a[i]`
Learn from:
- Khan Academy - Summation: https://www.khanacademy.org/math/ap-calculus-ab/ab-integration-new/ab-6-3
- Article: https://www.mathsisfun.com/algebra/sigma-notation.html
- Week 1: RGB to HSV conversion
  - Test with single pixel first
  - Then test with a small 3x3 image
- Week 2: Gradient computation
  - Start with 1D case (just x-direction)
  - Extend to 2D
- Week 3: Finding activation zones
  - Create threshold and collect points
  - Visualize them (print coordinates)
- Week 4: Simple shape centering
  - Skip fancy PCA, just compute centroid
  - Translate points to center
- Week 5: Score system (without optimization)
  - Hard-code weights first (0.6, 0.4)
  - Test on single pixel pairs
- Week 6: Gradient descent
  - Start with 1D optimization problem
  - Extend to multi-parameter case
- Desmos Calculator: https://www.desmos.com/calculator - Graph functions visually
- Python Notebook: Use Jupyter to experiment with small examples
- NumPy Tutorial: Learn array operations (similar to image operations)
- Math Visualization: Use matplotlib to plot gradients, score functions
// 1. Saturation Shift Detection
function detectShifts(image, threshold):
    hsv_image = convertRGBtoHSV(image)
    activation_zones = []
    // stop one short of the last row/column so i+1 and j+1 stay in bounds
    for each pixel (i,j) that has neighbors (i+1,j) and (i,j+1):
        grad_x = hsv_image[i+1][j].S - hsv_image[i][j].S
        grad_y = hsv_image[i][j+1].S - hsv_image[i][j].S
        magnitude = sqrt(grad_x² + grad_y²)
        if magnitude > threshold:
            add (i,j) to activation_zones
    return activation_zones

// 2. Shape Normalization
function normalizeShape(points):
    centroid = average of all points
    normalized_points = translate points by -centroid
    // Optional: compute orientation and rotate
    return normalized_points

// 3. Score Computation
function computeScore(pixel, template_pixel, sigma):
    distance = ||pixel - template_pixel||
    score = exp(-distance² / (2 * sigma²))
    return score

// 4. Recognition
function recognizePattern(image, template, sigma):
    scores = []
    for each pixel and its matching template_pixel:
        identity_score = computeScore(pixel, template_pixel, sigma)
        neighbor_score = average of neighbor scores
        overall = 0.6 * identity_score + 0.4 * neighbor_score
        scores.append(overall)
    match_score = average(scores)
    return match_score >= 0.7

// 5. Gradient Descent (Advanced)
function optimizeWeights(training_data):
    w1 = 0.6, w2 = 0.4
    learning_rate = 0.01
    for iteration in 1..100:
        loss = 0
        gradient_w1 = 0, gradient_w2 = 0
        for each sample in training_data:
            predicted = computeMatchScore(sample, w1, w2)
            error = predicted - actual_label
            loss += error²
            // the constant factor 2 from d(error²) is absorbed into learning_rate
            gradient_w1 += error * (∂score/∂w1)
            gradient_w2 += error * (∂score/∂w2)
        w1 -= learning_rate * gradient_w1
        w2 -= learning_rate * gradient_w2
    return w1, w2
- What are the inputs and outputs of this function?
- What does this formula do in plain English?
- Can I test this on a tiny example (2x2 image)?
- What should happen at image boundaries?
- What if the input is all zeros? All the same value?
- How do I visualize the result to verify correctness?
Good luck! Take it step by step, and don't hesitate to implement simpler versions first before adding complexity.