TODOs + Grammarly in report
albertopasqualetto committed Jul 21, 2024
1 parent bbc754d commit ea9e05b
Showing 15 changed files with 52 additions and 52 deletions.
Binary file modified Report/main.pdf
Binary file not shown.
9 changes: 4 additions & 5 deletions Report/main.tex
@@ -1,4 +1,3 @@
% File: 8BallPool video analysis
\documentclass[letterpaper,12pt]{article} % DO NOT CHANGE THIS

\usepackage{amsmath}
@@ -103,7 +102,7 @@
% nouns, adverbs, adjectives should be capitalized, including both words in hyphenated terms, while
% articles, conjunctions, and prepositions are lower case unless they
% directly follow a colon or long dash
\title{8BallPool video analysis}
\title{8-Ball Pool video analysis}
\author{
%Authors
Michele Sprocatti\textsuperscript{\rm 1}, %\equalcontrib,
@@ -123,13 +122,13 @@
\{michele.sprocatti\textsuperscript{\rm 1}, alberto.pasqualetto.2\textsuperscript{\rm 2}, michela.schibuola\textsuperscript{\rm 3}\}@studenti.unipd.it
\end{center}

% TODO write about restrictions in the input datasets size/aspect ratio

\input{section/introduction}

\input{section/workload}

\input{section/elements}
\input{section/executables}

\input{section/parts}

\input{section/results}

8 changes: 4 additions & 4 deletions Report/section/ballsDetection.tex
@@ -1,10 +1,10 @@
\subsection{Balls detection}
To detect balls, Michele proposed a multi-step preprocessing approach. Initially, the table region is isolated using an approach similar to the segmentation described before. Then the corners area is removed to prevent Hough Circle transform to find them as false positives. Subsequently k-means clustering was applied to the image with k=5 (the number of balls type plus the playing field). The resulting clusterized \texttt{Mat} is converted to gray-scale to be used as \texttt{HoughCircle} input. The gray-scale output colors were selected to be as different as possible from each other once the color space is changed.
To detect balls, Michele proposed a multi-step preprocessing approach. Initially, the table region is isolated using an approach similar to the segmentation described before. Then the corner-defined area is removed to prevent the Hough Circle transform from finding the corners as false positives. Subsequently, k-means clustering is applied to the image with k=5 (the number of ball types plus the playing field). The resulting clustered \texttt{Mat} is converted to gray-scale to be used as \texttt{HoughCircles} input. The gray-scale output colors were selected to be as different as possible from each other once the color space is changed.
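To make the pipeline concrete, here is a minimal sketch under the assumption that OpenCV is used directly on the already-isolated table region; the Hough thresholds and radius bounds are placeholders, not the project's actual parameters.

#include <opencv2/opencv.hpp>
#include <vector>

// Sketch only: quantize the table region with k-means (k = 5) and run HoughCircles
// on the gray-scale version of the quantized image. Corner removal and the table
// mask are assumed to have been applied upstream.
std::vector<cv::Vec3f> detectCandidateBalls(const cv::Mat &tableBgr) {
    cv::Mat samples = tableBgr.reshape(1, (int)tableBgr.total()); // N x 3 pixel matrix
    samples.convertTo(samples, CV_32F);

    cv::Mat labels, centers;
    cv::kmeans(samples, 5, labels,
               cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 10, 1.0),
               3, cv::KMEANS_PP_CENTERS, centers);

    // Rebuild a quantized BGR image from the cluster centers.
    cv::Mat quantized(tableBgr.size(), tableBgr.type());
    int idx = 0;
    for (int r = 0; r < quantized.rows; r++)
        for (int c = 0; c < quantized.cols; c++, idx++) {
            int lbl = labels.at<int>(idx);
            quantized.at<cv::Vec3b>(r, c) = cv::Vec3b(
                cv::saturate_cast<uchar>(centers.at<float>(lbl, 0)),
                cv::saturate_cast<uchar>(centers.at<float>(lbl, 1)),
                cv::saturate_cast<uchar>(centers.at<float>(lbl, 2)));
        }

    cv::Mat gray;
    cv::cvtColor(quantized, gray, cv::COLOR_BGR2GRAY);

    std::vector<cv::Vec3f> circles; // each circle is (x, y, radius)
    cv::HoughCircles(gray, circles, cv::HOUGH_GRADIENT, 1, gray.rows / 32.0,
                     100, 18, 5, 25); // placeholder thresholds and radius bounds
    return circles;
}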

Circle parameters, such as radius and center color, were analyzed to identify potential ball regions. By calculating the mean radius of in-table circles with center not selected by the color mask, a radius range was established. Circles within this radius range were considered for further analysis.
Circle parameters, such as radius and center color, are analyzed to identify potential ball regions. By calculating the mean radius of in-table circles with a center not selected by the color mask, a radius range is established. Circles within this radius range are then considered for further analysis.

Ball classification involved creating a circular mask, computing the gray-scale histogram, and excluding background pixels from the values of the histogram. Peak values in the histogram were used to differentiate between striped and solid balls, while HSV color space analysis is used to distinguish white and black balls.
Ball classification involves creating a circular mask, computing the gray-scale histogram, and excluding background pixels from the values of the histogram. Peak values in the histogram are used to differentiate between striped and solid balls, while HSV color space analysis is used to distinguish white and black balls.
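A sketch of that classification test follows, purely as an illustration: the circular mask and gray-scale histogram mirror the description above, while the bright-bin cut-off and the 0.25 ratio are invented placeholders (the project also excludes background pixels from the histogram, which is omitted here).

#include <opencv2/opencv.hpp>

// Sketch only: decide striped vs. solid from the amount of bright (near-white)
// pixels inside the detected circle.
bool looksStriped(const cv::Mat &grayFrame, cv::Point center, int radius) {
    cv::Mat mask = cv::Mat::zeros(grayFrame.size(), CV_8UC1);
    cv::circle(mask, center, radius, cv::Scalar(255), cv::FILLED);

    int histSize = 256;
    float range[] = {0, 256};
    const float *ranges[] = {range};
    int channels[] = {0};
    cv::Mat hist;
    cv::calcHist(&grayFrame, 1, channels, mask, hist, 1, &histSize, ranges);

    float bright = 0.f, total = 0.f;
    for (int b = 0; b < histSize; b++) {
        total += hist.at<float>(b);
        if (b >= 200) bright += hist.at<float>(b); // placeholder "white" bins
    }
    return total > 0.f && bright / total > 0.25f;  // placeholder ratio
}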

After finding the balls, the team identified an optimization opportunity. Since there's only one white ball and one black ball, Michele implemented non-maxima suppression for white and black balls independently, in order to improve performance.
After finding the balls, the team identified an optimization opportunity. Since an 8-ball game always has exactly one white ball and exactly one black ball, Michele implemented non-maxima suppression for the white and black balls independently in order to improve performance.
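One way to read this step, shown here only as a sketch with hypothetical names, is to keep the single best-scoring candidate for each of those two single-instance classes:

#include <opencv2/core.hpp>
#include <algorithm>
#include <vector>

// Hypothetical candidate type: score could be, e.g., the fraction of white
// (or dark) pixels inside the detected circle.
struct BallCandidate {
    cv::Point2f center;
    float radius;
    float score;
};

// Suppress everything except the strongest candidate; applied separately to the
// white-ball and black-ball candidate lists.
std::vector<BallCandidate> keepBest(const std::vector<BallCandidate> &candidates) {
    if (candidates.empty()) return {};
    auto best = std::max_element(candidates.begin(), candidates.end(),
        [](const BallCandidate &a, const BallCandidate &b) { return a.score < b.score; });
    return { *best };
}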

The result of the detection process is then used to segment the balls.
10 changes: 9 additions & 1 deletion Report/section/conclusions.tex
@@ -1,2 +1,10 @@
\section{Conclusions}
Our approach demonstrates consistent performance across the dataset. Notably, table detection achieves high accuracy. However, ball classification presents some challenges due to their varying sizes and colors that sometimes are similar to the one of the playing field, also solid and striped balls are difficult to distinguish because % TODO reflections and little visible white part
Our approach demonstrates consistent performance across the dataset. Notably, table detection achieves high accuracy.
However, ball classification presents some challenges due to the balls' varying sizes and colors, which are sometimes similar to that of the playing field. Solid and striped balls are also difficult to distinguish because \texttt{HoughCircles} finds a circle smaller than the ball diameter, which excludes too many of the ball's white pixels; this problem is hard to solve because footage with perspective requires a radius range to handle balls at different distances from the camera.

In other cases, a striped ball can be detected as white if the \texttt{HoughCircles} output is smaller than the real ball and includes many white pixels and few colored pixels; this problem has been mitigated by non-maximum suppression.

Solid balls look very similar to the black ball if there is a large shadow inside the circle found by \texttt{HoughCircles}; this problem has also been mitigated by non-maximum suppression.


Detection and segmentation parameters are optimized for the provided dataset. The algorithm may require adjustments for different datasets, especially for videos with different aspect ratios or resolutions.
5 changes: 2 additions & 3 deletions Report/section/executables.tex
@@ -1,10 +1,9 @@
\section{Executables}
The program contains 4 different executables:
\begin{itemize}
\item \texttt{8BallPool}: the main executable that, given a video file, it processes it and creates the output video with the superimposed minimap.
\item \texttt{8BallPool}: the main executable that, given a video file path from command line input, processes it and creates the output video with the superimposed minimap.
\item \texttt{TestAllClip}: it is the executable used to test the detection and segmentation in the first and last frame of all videos through AP and IoU by comparing them with the ground truth.
\item \texttt{ShowSegmentationColored}: is an helper which has been used to show the ground truth of the segmentation of a particular frame using human-readable colors and it was also used as a test for the code that computes the metrics because it computes the performance of the ground truth on itself.
\item \texttt{ShowSegmentationColored}: is a helper executable used to show the ground truth of the segmentation of a particular frame using human-readable colors; it was also used as a test for the code that computes the metrics, because it computes the performance of the ground truth on itself.
\item \texttt{ComputePerformance}: is used to compute the performance across the dataset, that is, the mAP and the mIoU.
\end{itemize}

% TODO cmd parameters
7 changes: 3 additions & 4 deletions Report/section/metrics.tex
@@ -1,7 +1,6 @@
\section{Metrics}
The \texttt{computePerformance} executable handles both mAP and mIoU calculations.

% TODO check if explanations are correct
\noindent\textbf{\emph{mAP (mean Average Precision)}}:
\begin{enumerate}
\item Predictions are performed for the first and last frames of each video.
@@ -12,9 +11,9 @@ \section{Metrics}

\noindent\textbf{\emph{mIoU (mean Intersection over Union)}}:
\begin{enumerate}
\item IoU is calculated for the first and the last frame for each video.
\item The average IoU is then computed for each object class across all 20 images (10 videos each one with 2 frame) in the dataset.
\item IoU is calculated for the first and the last frames of each video.
\item The average IoU is then computed for each object class across all 20 images (10 videos, each with 2 frames) in the dataset.
\item Finally, the mIoU is obtained by averaging the IoU values obtained in the last step.
\end{enumerate}
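In our own notation (not taken from the codebase), the computation above can be summarized as:
\begin{equation}
IoU_{c,i} = \frac{|P_{c,i} \cap G_{c,i}|}{|P_{c,i} \cup G_{c,i}|},
\qquad
mIoU = \frac{1}{|C|} \sum_{c \in C} \frac{1}{20} \sum_{i=1}^{20} IoU_{c,i}
\end{equation}
where $P_{c,i}$ and $G_{c,i}$ are the predicted and ground-truth pixel sets for class $c$ in image $i$, and $C$ is the set of object classes.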

The 8BallPool executable displays the performance metrics (AP and IoU) achieved by the method for the specific input video.
The \texttt{8BallPool} executable displays the performance metrics (AP and IoU) achieved by the method for the specific input video.
22 changes: 13 additions & 9 deletions Report/section/minimap.tex
@@ -7,8 +7,8 @@ \subsection{Minimap creation}
\end{itemize}

\subsubsection{Empty minimap image}
As a first step, an image of an empty billiard table has been selected, and its corner positions and dimensions have been stored in constant variables by testing different values. In particular Alberto had the idea of converting the image into a byte array and inserting it in a header file through ImageMagick (\url{https://imagemagick.org/}).
This step has been performed with the aim of creating a self-contained executable without the need of the png image dependency.
As a first step, an image of an empty billiard table has been selected, and its corner positions and dimensions have been stored in constant variables by testing different values. In particular, Alberto had the idea of converting the image into a byte array and inserting it in a header file through ImageMagick (\url{https://imagemagick.org/}).
This step has been performed with the aim of creating a self-contained executable without the need for the \texttt{.png} image dependency.
The byte array is then used to create a \texttt{Mat} object through the \texttt{imdecode} function.
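As a sketch of how such an embedded image can be decoded (the minimap_png symbol names are hypothetical, not taken from the project):

#include <opencv2/opencv.hpp>
#include <vector>

// Hypothetical symbols generated from the PNG, e.g. with `xxd -i` or ImageMagick.
extern const unsigned char minimap_png[];
extern const unsigned int minimap_png_len;

// Decode the embedded bytes into a BGR cv::Mat; no .png file is needed at run time.
cv::Mat loadEmptyMinimap() {
    std::vector<uchar> buffer(minimap_png, minimap_png + minimap_png_len);
    return cv::imdecode(buffer, cv::IMREAD_COLOR);
}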

\subsubsection{Computation of the transformation matrix}
@@ -20,7 +20,7 @@ \subsubsection{Check if the corners are in the required order}
To check this information, the “percentage of table” with respect to the pocket in a rectangle placed in the center of the edge (with dimensions proportional to the real table and pocket dimensions) has been computed for all the edges. This computation has been done in the table image previously transformed and cropped to the table dimensions; in this way, the center between two corners corresponds to the real one (otherwise, if the table has some perspective effect, the center between the two corners may not correspond to the real one). Then, the edges have been ordered by using this percentage. To understand how the corners were oriented, three cases have been considered:
\begin{itemize}
\item If the edges with "more pocket" are opposite edges, then they are the longest edges; This happens, for example, in Figure \ref{fig:game2_clip1_orientation}.
\item If the edge with "more pocket" is opposite to the one with "less pocket", then they are not the longest edges; This happen, for example, in Figure \ref{fig:game3_clip1_orientation} and Figure \ref{fig:game4_clip1_orientation}, when there is an occlusion or much noise in the center of the edge with "more pocket".
\item If the edge with "more pocket" is opposite to the one with "less pocket", then they are not the longest edges; This happens, for example, in Figure \ref{fig:game3_clip1_orientation} and Figure \ref{fig:game4_clip1_orientation}, when there is an occlusion or much noise in the center of the edge with "more pocket".
\item Otherwise, there is uncertainty, and then, probably, the one with "more pocket" is the longest edge.
\end{itemize}
If the table is not horizontal as expected (for example in Figure \ref{fig:game1_clip1_orientation}), then all the edges are rotated and the transformation matrix is re-computed.
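A sketch of the kind of measurement this relies on, with an assumed binary table mask (255 = cloth) and a placeholder window size, might look as follows:

#include <opencv2/opencv.hpp>

// Sketch only: fraction of cloth pixels in a small rectangle centred on one edge
// of the warped, cropped table mask. A low fraction suggests a pocket on that edge.
double tableFractionAtEdgeCenter(const cv::Mat &tableMask, cv::Point2f a, cv::Point2f b,
                                 cv::Size window) {
    cv::Point2f mid = (a + b) * 0.5f; // centre of the edge between corners a and b
    cv::Rect roi(cv::Point(cvRound(mid.x - window.width / 2),
                           cvRound(mid.y - window.height / 2)),
                 window);
    roi &= cv::Rect(0, 0, tableMask.cols, tableMask.rows); // clip to the image
    if (roi.area() == 0) return 0.0;
    return static_cast<double>(cv::countNonZero(tableMask(roi))) / roi.area();
}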
@@ -39,7 +39,7 @@ \subsubsection{Check if the corners are in the required order}
\caption{Transformation of the table to the minimap table size}
%\label{fig:game1_clip1_mask}
\end{subfigure}
\caption{game1\_clip1 first frame. The table is transformed in a wrong way, because the pockets are located in the shortest edges rather than the longest ones.}
\caption{game1\_clip1 first frame. The table is transformed in the wrong way because the pockets are located in the shortest edges rather than the longest ones.}
\label{fig:game1_clip1_orientation}
\end{figure}

@@ -54,7 +54,7 @@ \subsubsection{Check if the corners are in the required order}
%\label{fig:game2_clip1_mask}
\includegraphics[width=0.48\textwidth]{images/TableOrientation/g2_c1_mask.jpg}
}
\caption{game2\_clip1 first frame. The table is correctly transformed. In this case the pockets are lightly visible, but they allow to detect the correct orientation.}
\caption{game2\_clip1 first frame. The table is correctly transformed. In this case, the pockets are lightly visible, but they allow the detection of the correct orientation.}
\label{fig:game2_clip1_orientation}
\end{figure}

@@ -69,7 +69,7 @@ \subsubsection{Check if the corners are in the required order}
%\label{fig:game3_clip1_mask}
\includegraphics[width=0.48\textwidth]{images/TableOrientation/g3_c1_mask.jpg}
}
\caption{game3\_clip1 first frame. The table is correctly transformed. In this case, the center of one of the shortest edges has some noise due to the person playing the game; the result is correct, because in the opposite edge there is no noise.}
\caption{game3\_clip1 first frame. The table is correctly transformed. In this case, the center of one of the shortest edges has some noise due to the person playing the game; the result is correct because in the opposite edge there is no noise.}
\label{fig:game3_clip1_orientation}
\end{figure}

@@ -84,13 +84,15 @@ \subsubsection{Check if the corners are in the required order}
%\label{fig:game4_clip1_mask}
\includegraphics[width=0.48\textwidth]{images/TableOrientation/g4_c1_mask.jpg}
}
\caption{game4\_clip1 first frame. The table is correctly transformed. In this case, the center of one of the shortest edges has some noise due to the light of the table; the result is correct, because in the opposite edge there is no noise.}
\caption{game4\_clip1 first frame. The table is correctly transformed. In this case, the center of one of the shortest edges has some noise due to the light of the table; the result is correct because in the opposite edge there is no noise.}
\label{fig:game4_clip1_orientation}
\end{figure}


\subsubsection{Draw the minimap with tracking lines and balls}
Given the transformation matrix and the ball positions in the frame, it is possible to compute the positions of the balls in the minimap. This computation has been done in the \texttt{drawMinimap} method. Every time this method is called, the ball positions and the positions of the balls in the previous frame (if they have been computed by the tracker) are computed by using the \texttt{perspectiveTransform} method. For each ball in the frame, a line between the previous position and the current position is drawn on the minimap image, passed as a parameter by reference such that all the tracking lines are kept in a single image (Figure \ref{fig:game2_clip1_tracking}). Then this image is cloned into a copy, and the current balls are drawn on it. This image is then returned (Figure \ref{fig:game2_clip1_balls}). This implementation idea comes from Alberto.
Given the transformation matrix and the ball positions in the frame, it is possible to compute the positions of the balls in the minimap. This computation has been done in the \texttt{drawMinimap} method. Every time this method is called, the ball positions and the positions of the balls in the previous frame (if they have been computed by the tracker) are computed by using the

\noindent\texttt{perspectiveTransform} method. For each ball in the frame, a line between the previous position and the current position is drawn on the minimap image, passed as a parameter by reference such that all the tracking lines are kept in a single image (Figure \ref{fig:game2_clip1_tracking}). Then this image is cloned into a copy, and the current balls are drawn on it. This image is then returned (Figure \ref{fig:game2_clip1_balls}). This implementation idea comes from Alberto.
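A minimal sketch of that mapping, assuming the homography comes from \texttt{getPerspectiveTransform} between the detected table corners and the minimap corners:

#include <opencv2/opencv.hpp>
#include <vector>

// Sketch only: project ball centres from frame coordinates to minimap coordinates.
std::vector<cv::Point2f> toMinimap(const std::vector<cv::Point2f> &ballCenters,
                                   const cv::Mat &homography) {
    std::vector<cv::Point2f> minimapPoints;
    if (!ballCenters.empty())
        cv::perspectiveTransform(ballCenters, minimapPoints, homography);
    return minimapPoints;
}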

\begin{figure}[H]
\centering
@@ -110,4 +112,6 @@ \subsubsection{Draw the minimap with tracking lines and balls}
\label{fig:game2_clip1_balls_and_tracking}
\end{figure}

The ideas of using and the implementation of \texttt{getPerspectiveTransform} and \texttt{perspectiveTransform}, and how to check the orientation of the table were from Michela.
The implementation and the ideas of using \texttt{getPerspectiveTransform} and

\noindent\texttt{perspectiveTransform}, and how to check the orientation of the table were from Michela.
4 changes: 1 addition & 3 deletions Report/section/elements.tex → Report/section/parts.tex
@@ -1,6 +1,4 @@
\section{Elements of our project} % TODO change name
% TODO ThE FOLLOWING ARE AGAIN SECTIONS -> FIX
\input{section/executables}
\section{Parts of the project}
\input{section/tableDetection}
\input{section/tableSegmentation}
\input{section/ballsDetection}
2 changes: 1 addition & 1 deletion Report/section/radiusAttempt.tex
@@ -18,7 +18,7 @@ \subsubsection{Attempt to find the ball radius relative to the distance and pers
\item Otherwise it is a value between 0 and 1, which indicates the percentage of slope between the camera and the table; for example, if the value is 0.5, then the camera is about 45° from the table.
\end{itemize}

To compute the final interval, the minimum and maximum values are computed by subtracting and incrementing a value, which increases with the percentage of slope (more the slope, more the variance) by multiplying the percentage of slope with the mean radius previously computed, and a precision value is added due to some other variables in the images.
To compute the final interval, the minimum, and maximum values are computed by subtracting and incrementing a value, which increases with the percentage of slope (more the slope, more the variance) by multiplying the percentage of slope with the mean radius previously computed, and a precision value is added due to some other variables in the images.
\begin{equation}
min\_radius = mean\_radius - mean\_radius \times percentage\_slope - precision
\end{equation}