This project implements the classic Six Degrees of Separation problem, also known as Connect the Stars, using IMDb data and the Breadth-First Search (BFS) algorithm. The application finds the shortest connection path between two actors, showing which movies connect them, directly or indirectly.
The core idea is simple but powerful: what is the minimum number of movies that connects two actors? Each connection represents one degree of separation. Actors are modeled as nodes in a graph, and an edge exists between two actors if they starred in the same movie.
The data is loaded from CSV files based on IMDb.
- people.csv stores actors with their IDs, names, and birth years
- movies.csv stores movie information such as title and year
- stars.csv represents the relationship between actors and movies
All data is stored in memory using dictionaries and sets for fast lookup.
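As a rough illustration of the in-memory layout, the sketch below builds the lookup tables from three CSV file objects. The column names (`id`, `name`, `birth`, `title`, `year`, `person_id`, `movie_id`), the function name `load_data`, and the sample rows are assumptions made for this example, not taken from the project's code.

```python
import csv
import io


def load_data(people_file, movies_file, stars_file):
    """Build in-memory lookup tables from three CSV file objects.

    Column names are assumed for illustration; adjust to the real files.
    """
    people = {}  # person_id -> {"name", "birth", "movies": set of movie_ids}
    movies = {}  # movie_id -> {"title", "year", "stars": set of person_ids}
    names = {}   # lowercased name -> set of person_ids (names can collide)

    for row in csv.DictReader(people_file):
        people[row["id"]] = {
            "name": row["name"],
            "birth": row["birth"],
            "movies": set(),
        }
        names.setdefault(row["name"].lower(), set()).add(row["id"])

    for row in csv.DictReader(movies_file):
        movies[row["id"]] = {
            "title": row["title"],
            "year": row["year"],
            "stars": set(),
        }

    for row in csv.DictReader(stars_file):
        # Skip links that point at an unknown person or movie.
        if row["person_id"] in people and row["movie_id"] in movies:
            people[row["person_id"]]["movies"].add(row["movie_id"])
            movies[row["movie_id"]]["stars"].add(row["person_id"])

    return people, movies, names


# Made-up sample data standing in for the real CSV files.
people_csv = io.StringIO("id,name,birth\n1,Actor A,1970\n2,Actor B,1980\n")
movies_csv = io.StringIO("id,title,year\n10,Movie X,2000\n")
stars_csv = io.StringIO("person_id,movie_id\n1,10\n2,10\n")

people, movies, names = load_data(people_csv, movies_csv, stars_csv)
print(sorted(movies["10"]["stars"]))
```

Keeping a set of movie IDs per person and a set of person IDs per movie lets the search find an actor's co-stars without scanning the whole dataset.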
- Python 3
- Dictionaries and sets
- Queue-based graph traversal
- Breadth-First Search (BFS)
Each actor is treated as a node in an unweighted graph. Two actors are connected if they appeared in the same movie. The algorithm starts from the source actor and explores neighbors level by level using BFS, ensuring the shortest path is found. Once the target actor is reached, the path is reconstructed by following parent pointers back to the source. The final result is a list of (movie_id, person_id) pairs representing the connection path.
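The search described above can be sketched as follows. This is a minimal version under stated assumptions, not necessarily the code in degrees.py: the function name `shortest_path`, the `neighbors` callback, and the toy `cast` data are hypothetical.

```python
from collections import deque


def shortest_path(source, target, neighbors):
    """Return a list of (movie_id, person_id) steps from source to target.

    `neighbors(person_id)` must yield (movie_id, co_star_id) pairs.
    Returns [] if source == target and None if no path exists.
    """
    if source == target:
        return []

    # Parent pointers: person_id -> (movie_id, previous person_id).
    parent = {source: None}
    frontier = deque([source])  # FIFO queue gives level-by-level order

    while frontier:
        person = frontier.popleft()
        for movie_id, co_star in neighbors(person):
            if co_star in parent:
                continue  # already reached by a path that is no longer
            parent[co_star] = (movie_id, person)
            if co_star == target:
                # Walk the parent pointers back to the source.
                path = []
                node = target
                while parent[node] is not None:
                    movie, previous = parent[node]
                    path.append((movie, node))
                    node = previous
                path.reverse()
                return path
            frontier.append(co_star)

    return None


# Toy graph (hypothetical IDs): movie_id -> set of person_ids.
cast = {"m1": {"a", "b"}, "m2": {"b", "c"}, "m3": {"d"}}


def neighbors(person):
    """Yield (movie_id, co_star_id) for everyone who shares a movie."""
    for movie_id, stars in sorted(cast.items()):
        if person in stars:
            for other in sorted(stars):
                if other != person:
                    yield movie_id, other


print(shortest_path("a", "c", neighbors))  # [('m1', 'b'), ('m2', 'c')]
```

Marking an actor as visited when it is first added to the queue, rather than when it is removed, avoids enqueueing the same actor twice and keeps the first (shortest) parent pointer.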
Clone the repository and navigate to the project directory:
git clone https://github.com/your-username/your-repository.git
cd your-repository
Expected folder structure:
.
├── degrees.py
├── util.py
├── large/
│ ├── people.csv
│ ├── movies.csv
│ └── stars.csv
└── small/
  ├── people.csv
  ├── movies.csv
  └── stars.csv
Run the program with:
python degrees.py large
Or use the smaller dataset:
python degrees.py small
You will be prompted to enter the names of two actors. If multiple actors share the same name, the program will ask you to choose the correct IMDb ID.
For example, the program prompts for two names:
Name: Viola Davis
Name: Ian McKellen
Output:
2 degrees of separation.
1: Viola Davis and James McAvoy starred in The Disappearance of Eleanor Rigby: Them
2: James McAvoy and Ian McKellen starred in X-Men: Days of Future Past
Each line shows the degree of separation, the connected actors, and the movie that links them.
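To show how a list of (movie_id, person_id) pairs maps onto these output lines, here is a small sketch. The function name `describe_path`, the flat ID-to-name dictionaries, and the placeholder IDs (`p1`, `m1`, and so on) are assumptions for illustration; they are not real IMDb IDs or the project's actual data structures.

```python
def describe_path(source_id, path, person_names, movie_titles):
    """Turn a path of (movie_id, person_id) steps into printable lines."""
    lines = [f"{len(path)} degrees of separation."]
    previous = source_id
    for step, (movie_id, person_id) in enumerate(path, start=1):
        lines.append(
            f"{step}: {person_names[previous]} and {person_names[person_id]} "
            f"starred in {movie_titles[movie_id]}"
        )
        previous = person_id  # the next step starts from this actor
    return lines


# Placeholder IDs mapped to the names from the example output above.
person_names = {"p1": "Viola Davis", "p2": "James McAvoy", "p3": "Ian McKellen"}
movie_titles = {
    "m1": "The Disappearance of Eleanor Rigby: Them",
    "m2": "X-Men: Days of Future Past",
}
path = [("m1", "p2"), ("m2", "p3")]

lines = describe_path("p1", path, person_names, movie_titles)
for line in lines:
    print(line)
```

Each step pairs the previous actor with the next one, so the source actor only needs to be passed in once.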
- Graph visualization
- Graphical user interface (GUI)
- Support for additional datasets
Inspired by CS50’s Introduction to Artificial Intelligence, with a focus on graph modeling and search algorithms.
@lucaasleal