Commit
Merge branch 'main' of https://github.com/Sproc01/8BallPool
Sproc01 committed Jul 21, 2024
2 parents f3293c4 + 7192606 commit 37f99b0
Showing 20 changed files with 601 additions and 589 deletions.
Binary file modified Report/main.pdf
Binary file not shown.
4 changes: 3 additions & 1 deletion Report/main.tex
@@ -1,4 +1,4 @@
% File: 8BallPool report
% File: 8BallPool video analysis
\documentclass[letterpaper,12pt]{article} % DO NOT CHANGE THIS

\usepackage{amsmath}
@@ -123,6 +123,8 @@
\{michele.sprocatti\textsuperscript{\rm 1}, alberto.pasqualetto.2\textsuperscript{\rm 2}, michela.schibuola\textsuperscript{\rm 3}\}@studenti.unipd.it
\end{center}

% TODO write about restrictions in the input datasets size/aspect ratio

\input{section/introduction}

\input{section/workload}
10 changes: 7 additions & 3 deletions Report/section/ballsDetection.tex
@@ -1,6 +1,10 @@
\subsection{Balls detection}
To detect balls, Michele proposed a multi-step preprocessing approach. Initially, the table region is isolated using an approach similar to the segmentation described before. Then the corner areas are removed to prevent the Hough Circle Transform from detecting them as false positives. Subsequently, k-means clustering was applied to the image with k=5 (the number of ball types plus the playing field). The resulting clusterized \texttt{Mat} is converted to gray-scale to be used as \texttt{HoughCircle} input. The gray-scale output colors were selected to be as different as possible from each other after the color space conversion.

Circle parameters, such as radius and center color, were analyzed to identify potential ball regions. By calculating the mean radius of the in-table circles whose centers were not selected by the color mask, a radius range was established. Circles within this radius range were considered for further analysis.

Ball classification involved creating a circular mask, computing the gray-scale histogram, and excluding background pixels from the histogram values. Peak values in the histogram were used to differentiate between striped and solid balls, while HSV color space analysis was used to distinguish the white and black balls.

To detect balls, Michele proposed a multi-step preprocessing approach. Initially, the table region was isolated by constructing a polygon using its corners and a color-based mask is generated. Subsequently, pixels outside the table were nullified, and k-means clustering was applied to the image. The resulting clusters were converted to gray-scale for Hough Circle Transform application.
Circle parameters, such as radius and center color, were analyzed to identify potential ball regions. By calculating the mean radius of in-table circles with center not selected by the color mask, a radius range was established. Circles within this radius range were considered for further analysis. Ball classification involved creating a circular mask, computing the gray-scale histogram, and excluding background pixels from the values of the histogram. Peak values in the histogram were used to differentiate between striped and solid balls, while HSV color space analysis is used to distinguish white and black balls.
After finding the balls, the team identified an optimization opportunity. Since there is only one white ball and one black ball, Michele implemented non-maxima suppression for the white and black balls independently, in order to improve performance.
The result of the detection process is then used to segment the balls.

The result of the detection process is then used to segment the balls.
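
A minimal sketch of the detection steps just described, in OpenCV C++; the function name detectBallCandidates and every numeric parameter are hypothetical placeholders, not the project's tuned values:

#include <opencv2/opencv.hpp>
#include <vector>

// Hypothetical sketch of the detection steps: k-means color quantization
// with k = 5 (four ball types plus the playing field), gray-scale
// conversion, then the Hough Circle Transform. Assumes a BGR CV_8UC3 input
// in which non-table pixels have already been masked out.
std::vector<cv::Vec3f> detectBallCandidates(const cv::Mat& tableOnly)
{
    // Flatten the pixels to an N x 3 float matrix for cv::kmeans.
    cv::Mat samples = tableOnly.reshape(1, tableOnly.rows * tableOnly.cols);
    samples.convertTo(samples, CV_32F);

    cv::Mat labels, centers;
    cv::kmeans(samples, 5, labels,
               cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 10, 1.0),
               3, cv::KMEANS_PP_CENTERS, centers);

    // Rebuild the quantized image from the cluster centers.
    cv::Mat quantized(tableOnly.size(), tableOnly.type());
    for (int i = 0; i < samples.rows; i++)
    {
        const float* c = centers.ptr<float>(labels.at<int>(i));
        quantized.at<cv::Vec3b>(i / tableOnly.cols, i % tableOnly.cols) =
            cv::Vec3b((uchar)c[0], (uchar)c[1], (uchar)c[2]);
    }

    // Gray-scale conversion followed by circle detection; all Hough
    // parameters below are placeholders.
    cv::Mat gray;
    cv::cvtColor(quantized, gray, cv::COLOR_BGR2GRAY);
    std::vector<cv::Vec3f> circles;  // each circle: (x, y, radius)
    cv::HoughCircles(gray, circles, cv::HOUGH_GRADIENT, 1, 20, 100, 15, 5, 20);
    return circles;
}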
2 changes: 1 addition & 1 deletion Report/section/conclusions.tex
@@ -1,2 +1,2 @@
\section{Conclusions}
Our program demonstrates consistent performance across the dataset. Notably, table detection achieves high accuracy. However, ball classification presents some challenges due to their varying sizes and colors that can sometimes are similar to the one of the table.
Our approach demonstrates consistent performance across the dataset. Notably, table detection achieves high accuracy. However, ball classification presents some challenges due to the balls' varying sizes and colors, which can sometimes be similar to those of the playing field; moreover, solid and striped balls are difficult to distinguish because % TODO reflections and little visible white part
3 changes: 2 additions & 1 deletion Report/section/elements.tex
@@ -1,5 +1,6 @@
\section{Elements of our project}
\section{Elements of our project} % TODO change name

% TODO THE FOLLOWING ARE AGAIN SECTIONS -> FIX
\input{section/executables}
\input{section/tableDetection}
\input{section/tableSegmentation}
9 changes: 4 additions & 5 deletions Report/section/executables.tex
@@ -1,9 +1,8 @@
\section{Executables}
The program contains 4 different executables:
\begin{itemize}
\item \texttt{8BallPool}: the main executable that given a video file, it processes it and creates the output video.
\item \texttt{TestAllClip}: the executable that was used to test the detection and segmentation in all the first and last clip of all videos.
\item \texttt{ShowSegmentationColored}: the executable that was used to show the ground truth of the segmentation of a particular frame and it was also used
as test for the code that computes the metrics because it computes the performance of the ground truth it self.
\item \texttt{ComputePerfomance}: This executable was used to compute the performance across the dataset so the mAP and the mIoU.
\item \texttt{8BallPool}: the main executable that, given a video file, processes it and creates the output video with the superimposed minimap.
\item \texttt{TestAllClip}: the executable used to test the detection and segmentation on the first and last frames of all videos through AP and IoU, comparing them with the ground truth.
\item \texttt{ShowSegmentationColored}: a helper used to show the ground truth of the segmentation of a particular frame using human-readable colors; it was also used as a test for the code that computes the metrics, since it computes the performance of the ground truth on itself.
\item \texttt{ComputePerformance}: used to compute the performance across the dataset, namely the mAP and the mIoU.
\end{itemize}
38 changes: 19 additions & 19 deletions Report/section/metrics.tex
@@ -1,20 +1,20 @@
\section{Metrics}
The \textit{computePerformance} executable handles both mAP and mIoU calculations.
\begin{itemize}
\item mAP (mean Average Precision):
\begin{itemize}
\item Predictions are made for all first and last frames in a video.
\item Since we lack alternative confidence scores, we opted to calculate mAP using IoU as the measure of confidence, because it is a good indicator of how good the detection is.
\item For each object class (e.g., ball type), the Average Precision (AP) is calculated.
\item Then the final mAP is obtained by averaging the AP values across all classes.
\end{itemize}
\end{itemize}
\begin{itemize}
\item mIoU (mean Intersection over Union):
\begin{itemize}
\item IoU is calculated for the first and the last frame for each video.
\item The average IoU is then computed for each object class across all 20 images (10 videos each one with 2 frame).
\item Finally, the mIoU is obtained by averaging the IoU values obtained in the last step.
\end{itemize}
\end{itemize}
The 8BallPool executable displays the performance metrics (AP and IoU) achieved by the method for the specific input video.
The \texttt{computePerformance} executable handles both mAP and mIoU calculations.

% TODO check if explanations are correct
\noindent\textbf{\emph{mAP (mean Average Precision)}}:
\begin{enumerate}
\item Predictions are performed on the first and last frames of every video.
\item Since we lack alternative \textit{confidence scores}, we opted to calculate mAP using IoU as the measure of confidence, because it is a good indicator of detection quality.
\item For each object class (e.g., ball type), the Average Precision (AP) is calculated.
\item Then the final mAP is obtained by averaging the AP values across all classes.
\end{enumerate}

\noindent\textbf{\emph{mIoU (mean Intersection over Union)}}:
\begin{enumerate}
\item IoU is calculated for the first and the last frame for each video.
\item The average IoU is then computed for each object class across all 20 images (10 videos, each with 2 frames) in the dataset.
\item Finally, the mIoU is obtained by averaging the IoU values obtained in the last step.
\end{enumerate}

The \texttt{8BallPool} executable displays the performance metrics (AP and IoU) achieved by the method on the specific input video.
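
As a clarifying sketch (a hypothetical helper, not the project's code), the per-class IoU underlying the mIoU can be computed on single-channel masks whose pixel values are class ids:

#include <opencv2/opencv.hpp>

// Hypothetical sketch: IoU between the predicted and ground-truth
// segmentation masks for one class id (both masks single-channel,
// one class id per pixel).
double classIoU(const cv::Mat& predicted, const cv::Mat& groundTruth, uchar classId)
{
    CV_Assert(predicted.size() == groundTruth.size());
    cv::Mat predMask = (predicted == classId);   // 255 where the class is predicted
    cv::Mat gtMask   = (groundTruth == classId); // 255 where the class is annotated
    double inter = cv::countNonZero(predMask & gtMask);
    double uni   = cv::countNonZero(predMask | gtMask);
    return uni > 0.0 ? inter / uni : 1.0;        // class absent in both masks
}

The mIoU then averages this value per class across the 20 frames, and finally across the classes.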
41 changes: 20 additions & 21 deletions Report/section/minimap.tex
@@ -1,29 +1,29 @@
\subsection{Mini-map creation}

To create the mini-map are needed:
\subsection{Minimap creation}
To create the minimap, the following are needed:
\begin{itemize}
\item An image that contains an empty billiard table and some information about it;
\item The position of the balls in the current and previous frames;
\item A transformation matrix that computes the position of the balls in the mini-map.
\item A transformation matrix that computes the position of the balls in the minimap.
\end{itemize}

\subsubsection{Empty mini-map image}

As a first step, an image of an empty billiard table has been selected, and its corner positions and dimensions have been stored in constant variables by testing different values. In particular we decided to use an header version of the image to have a self-contained executable without the need of the image. This version has been created with an online tool (ImageMagick - \url{https://imagemagick.org/}).
\subsubsection{Empty minimap image}
As a first step, an image of an empty billiard table has been selected, and its corner positions and dimensions have been stored in constant variables by testing different values. In particular, Alberto had the idea of converting the image into a byte array and inserting it into a header file through ImageMagick (\url{https://imagemagick.org/}).
This step has been performed with the aim of creating a self-contained executable without the need for the PNG image as an external dependency.
The byte array is then used to create a \texttt{Mat} object through the \texttt{imdecode} function.
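
A minimal sketch of this decoding step; the header name minimap_image.h and the array identifiers are hypothetical stand-ins for the generated ones:

#include <opencv2/opencv.hpp>
#include <vector>
// "minimap_image.h" and the identifiers minimap_png / minimap_png_len are
// hypothetical names for the generated header described above.
#include "minimap_image.h"

cv::Mat loadMinimap()
{
    // Wrap the embedded bytes and decode them exactly as if the PNG
    // had been read from disk.
    std::vector<uchar> buffer(minimap_png, minimap_png + minimap_png_len);
    return cv::imdecode(buffer, cv::IMREAD_COLOR);
}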

\subsubsection{Computation of the transformation matrix}
The \texttt{computeTransformation} method has been written to compute the transformation matrix, which allows for the computation of the positions of the balls in the 2D table represented in the minimap. To do that, a relationship between the corners of the table in the frame and the corners of the table in the minimap has been established through the OpenCV \texttt{getPerspectiveTransform()} method, which “calculates a perspective transform from 4 pairs of the corresponding points” and returns a transformation matrix.
At first, it is assumed that the corners are given in clockwise order and that the first corner is followed by a long table edge. To verify this assumption, \texttt{checkHorizontalTable} has been written.

The \texttt{computeTransformation()} method has been written to compute the transformation matrix, which allows for the computation of the positions of the balls in the table of the mini-map. To do that, a relationship between the corners of the table in the frame and the corners of the table in the mini-map has been made. This relationship has been made by the OpenCV \texttt{getPerspectiveTransform()} method, which “calculates a perspective transform from four pairs of the corresponding points” and returns a transformation matrix. At first, it is supposed that the corners are given in clockwise order and that the first corner is followed by a long table edge. To check this information, \texttt{checkHorizontalTable()} has been written.
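
A minimal sketch of this mapping, assuming the four table corners in the frame and the four fixed minimap corners are already available (the function name is hypothetical):

#include <opencv2/opencv.hpp>
#include <vector>

// Hypothetical sketch: map the 4 table corners detected in the frame to
// the 4 fixed corners of the minimap image.
cv::Mat computeFrameToMinimap(const std::vector<cv::Point2f>& frameCorners,
                              const std::vector<cv::Point2f>& minimapCorners)
{
    CV_Assert(frameCorners.size() == 4 && minimapCorners.size() == 4);
    // Returns a 3x3 homography usable with cv::perspectiveTransform.
    return cv::getPerspectiveTransform(frameCorners, minimapCorners);
}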

\subsubsection{Check if the corners are in the order needed}

The \texttt{checkHorizontalTable()} method checks, using the image in input and the corners of the table in that image, if the corners are oriented such that the first corner is followed by a long table edge. To check this information, the “percentage of table” with respect to the pool in a rectangle placed in the center of the edge (with dimensions proportional to the real table and pool dimensions) has been computed for all the edges. This computation has been done in the table image previously transformed and cropped to the table dimensions; in this way, the center between two corners corresponds to the real one (otherwise, if the table has some perspective effect, the center between the two corners may not correspond to the real one). Then, the edges have been ordered by using this percentile. To understand how the corners were oriented, three cases have been considered:
\subsubsection{Check if the corners are in the required order}
The \texttt{checkHorizontalTable} method checks, using the input image and the corners of the table in that image, whether the corners are oriented such that the first corner is followed by a long table edge.
To check this, the “percentage of table” with respect to the pocket in a rectangle placed in the center of the edge (with dimensions proportional to the real table and pocket dimensions) has been computed for all the edges. This computation has been done on the table image previously transformed and cropped to the table dimensions; in this way, the center between two corners corresponds to the real one (otherwise, if the table has some perspective effect, the center between the two corners may not correspond to the real one). Then, the edges have been ordered by using this percentage. To understand how the corners were oriented, three cases have been considered:
\begin{itemize}
\item If the edges with "more pocket" are opposite edges, then they are the longest edges; this happens, for example, in Figure \ref{fig:game2_clip1_orientation}.
\item If the edge with "more pool" is opposite to the one with "less pool", then they are not the longest edges; This happen, for example, in Figure \ref{fig:game3_clip1_orientation} and Figure \ref{fig:game4_clip1_orientation}.
\item If the edge with "more pocket" is opposite to the one with "less pocket", then they are not the longest edges; this happens, for example, in Figure \ref{fig:game3_clip1_orientation} and Figure \ref{fig:game4_clip1_orientation}, when there is an occlusion or significant noise in the center of the edge with "more pocket".
\item Otherwise, there is uncertainty; in that case, the edge with "more pocket" is most likely the longest edge.
\end{itemize}
If the table is not horizontal as expected (for example in Figure \ref{fig:game1_clip1_orientation}), then all the edges have been rotated and the transformation matrix has been re-computed.
If the table is not horizontal as expected (for example in Figure \ref{fig:game1_clip1_orientation}), then all the edges are rotated and the transformation matrix is re-computed.
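
A minimal sketch of how such a per-edge percentage could be computed, under the assumption of an HSV color-range mask for the table cloth; the helper name and the color range are hypothetical, not the project's:

#include <opencv2/opencv.hpp>

// Hypothetical sketch of the per-edge test: the fraction of table-colored
// pixels inside a window centered on the edge midpoint of the rectified,
// cropped table image. An edge with a pocket at its center yields a lower
// percentage.
double tablePercentage(const cv::Mat& rectifiedTable, cv::Point edgeCenter,
                       cv::Size window, cv::Scalar lowHSV, cv::Scalar highHSV)
{
    cv::Rect roi(edgeCenter.x - window.width / 2,
                 edgeCenter.y - window.height / 2,
                 window.width, window.height);
    roi &= cv::Rect(0, 0, rectifiedTable.cols, rectifiedTable.rows); // clamp
    if (roi.area() == 0)
        return 0.0;
    cv::Mat hsv, mask;
    cv::cvtColor(rectifiedTable(roi), hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, lowHSV, highHSV, mask); // table-colored pixels -> 255
    return static_cast<double>(cv::countNonZero(mask)) / roi.area();
}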

\begin{figure}[H]
\centering
@@ -89,26 +89,25 @@ \subsubsection{Check if the corners are in the order needed}
\end{figure}


\subsubsection{Draw the mini-map with tracking lines and balls}

Given the transformation matrix and the ball positions in the frame, it is possible to compute the positions of the balls in the mini-map. This computation has been done in the \texttt{drawMinimap()} method. Every time this method is called, the ball positions and the positions of the balls in the previous frame (if they have been computed by the tracker) are computed by using the \texttt{perspectiveTransform()} method. For each ball in the frame, a line between the previous position and the current position is drawn on the mini-map image, passed as a parameter by reference such that all the tracking lines are kept in a single image (Figure \ref{fig:game2_clip1_tracking}). Then this image is cloned into a copy, and the current balls are drawn on it. This image is then returned (Figure \ref{fig:game2_clip1_balls}).\
\subsubsection{Draw the minimap with tracking lines and balls}
Given the transformation matrix and the ball positions in the frame, it is possible to compute the positions of the balls in the minimap. This computation has been done in the \texttt{drawMinimap} method. Every time this method is called, the current ball positions and the positions of the balls in the previous frame (if they have been computed by the tracker) are mapped onto the minimap by using the \texttt{perspectiveTransform} method. For each ball in the frame, a line between the previous position and the current position is drawn on the minimap image, which is passed as a parameter by reference so that all the tracking lines are kept in a single image (Figure \ref{fig:game2_clip1_tracking}). Then this image is cloned into a copy, and the current balls are drawn on it. This image is then returned (Figure \ref{fig:game2_clip1_balls}). This implementation idea comes from Alberto.
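
A minimal sketch of the projection and line-drawing steps, assuming the ball centers are kept as point vectors (the function name is hypothetical):

#include <opencv2/opencv.hpp>
#include <vector>

// Hypothetical sketch: ball centers in the frame are mapped onto the
// minimap with cv::perspectiveTransform, then one tracking line per ball
// is drawn from its previous position.
void drawTrackingLines(cv::Mat& minimap, const cv::Mat& transform,
                       const std::vector<cv::Point2f>& prevCenters,
                       const std::vector<cv::Point2f>& currCenters)
{
    if (prevCenters.empty() || currCenters.empty())
        return;
    std::vector<cv::Point2f> prevOnMap, currOnMap;
    cv::perspectiveTransform(prevCenters, prevOnMap, transform);
    cv::perspectiveTransform(currCenters, currOnMap, transform);
    for (size_t i = 0; i < prevOnMap.size() && i < currOnMap.size(); i++)
        cv::line(minimap, prevOnMap[i], currOnMap[i], cv::Scalar(0, 0, 0), 2);
}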

\begin{figure}[H]
\centering
\begin{subfigure}[b]{0.48\textwidth}
\centering
\includegraphics[width=\textwidth]{images/Minimap/g2_c1minimap_with_track.jpg}
\caption{Mini-map with tracking lines}
\caption{Minimap with tracking lines}
\label{fig:game2_clip1_tracking}
\end{subfigure}
\begin{subfigure}[b]{0.48\textwidth}
\centering
\includegraphics[width=\textwidth]{images/Minimap/g2_c1_minimap_with_balls.jpg}
\caption{Mini-map with tracking lines and balls}
\caption{Minimap with tracking lines and balls}
\label{fig:game2_clip1_balls}
\end{subfigure}
\caption{game2\_clip1. Mini-map of the last frame.}
\caption{game2\_clip1. Minimap of the last frame.}
\label{fig:game2_clip1_balls_and_tracking}
\end{figure}

The ideas of using \texttt{getPerspectiveTransform()} and \texttt{perspectiveTransform()}, and how to check the orientation of the table were from Michela; the ideas of using an header version of the mini-map image and of drawing the balls on a copy of the mini-map image, rather than the one that contains the tracking, were from Alberto.
The ideas of using \texttt{getPerspectiveTransform} and \texttt{perspectiveTransform}, their implementation, and the way to check the orientation of the table were from Michela.