lucaasleal/SixDegrees

Six Degrees of Separation — Connect the Stars

This project implements the classic Six Degrees of Separation problem, also known as Connect the Stars, using IMDb data and the Breadth-First Search (BFS) algorithm. The application finds the shortest connection path between two actors, showing which movies connect them, directly or indirectly.

🧠 Project Idea

The core idea is simple but powerful: what is the minimum number of movies that connects two actors? Each connection represents one degree of separation. Actors are modeled as nodes in a graph, and an edge exists between two actors if they starred in the same movie.
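A minimal sketch of this graph model, assuming the credits are available as `(person_id, movie_id)` pairs. The function name `build_costar_graph` and the IDs used below are hypothetical and are not taken from the project's code:

```python
from collections import defaultdict
from itertools import combinations


def build_costar_graph(credits):
    """Build an undirected actor graph from (person_id, movie_id) pairs."""
    cast = defaultdict(set)  # movie_id -> set of person_ids in that movie
    for person_id, movie_id in credits:
        cast[movie_id].add(person_id)

    graph = defaultdict(set)  # person_id -> set of co-star person_ids
    for members in cast.values():
        # Every pair of actors in the same movie shares an edge.
        for a, b in combinations(members, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical IDs: p1 and p2 share movie m1, p2 and p3 share movie m2.
credits = [("p1", "m1"), ("p2", "m1"), ("p2", "m2"), ("p3", "m2")]
graph = build_costar_graph(credits)
print(sorted(graph["p2"]))  # ['p1', 'p3']
```

Materializing every edge like this can be costly for large casts, so a program may prefer to compute neighbors on demand from the movie sets instead.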

📊 Data Structure

The data is loaded from CSV files derived from IMDb.

  • people.csv stores actors with their IDs, names, and birth years
  • movies.csv stores movie information such as title and year
  • stars.csv represents the relationship between actors and movies

All data is stored in memory using dictionaries and sets for fast lookup.
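One way this in-memory layout could look is sketched below. The dictionary names (`names`, `people`, `movies`), the CSV column names (`id`, `name`, `birth`, `title`, `year`, `person_id`, `movie_id`), and the `load_data` helper are assumptions modeled on the CS50 starter code, not something this README confirms:

```python
import csv
import os


def load_data(directory):
    """Load the three CSV files into dictionaries and sets (assumed layout)."""
    names = {}   # lowercased name -> set of person_ids sharing that name
    people = {}  # person_id -> {"name", "birth", "movies": set of movie_ids}
    movies = {}  # movie_id -> {"title", "year", "stars": set of person_ids}

    path = os.path.join(directory, "people.csv")
    with open(path, encoding="utf-8", newline="") as f:
        for row in csv.DictReader(f):
            people[row["id"]] = {
                "name": row["name"],
                "birth": row["birth"],
                "movies": set(),
            }
            names.setdefault(row["name"].lower(), set()).add(row["id"])

    path = os.path.join(directory, "movies.csv")
    with open(path, encoding="utf-8", newline="") as f:
        for row in csv.DictReader(f):
            movies[row["id"]] = {
                "title": row["title"],
                "year": row["year"],
                "stars": set(),
            }

    path = os.path.join(directory, "stars.csv")
    with open(path, encoding="utf-8", newline="") as f:
        for row in csv.DictReader(f):
            # Skip credits that reference an unknown person or movie.
            if row["person_id"] in people and row["movie_id"] in movies:
                people[row["person_id"]]["movies"].add(row["movie_id"])
                movies[row["movie_id"]]["stars"].add(row["person_id"])

    return names, people, movies
```

Keying `names` by the lowercased name keeps lookups case-insensitive, and storing a set of IDs per name leaves room for actors who share a name.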

⚙️ Technologies Used

  • Python 3
  • Dictionaries and sets
  • Queue-based graph traversal
  • Breadth-First Search (BFS)

🧩 How the Algorithm Works

Each actor is treated as a node in an unweighted graph. Two actors are connected if they appeared in the same movie. The algorithm starts from the source actor and explores neighbors level by level using BFS. Because BFS visits actors in order of increasing distance, the first path that reaches the target uses the fewest possible movies. Once the target actor is reached, the path is reconstructed by following parent pointers back to the source. The final result is a list of (movie_id, person_id) pairs representing the connection path.
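The steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in `degrees.py`: the `shortest_path` signature and the shape of the `people` and `movies` dictionaries (each person has a `"movies"` set, each movie has a `"stars"` set) are assumed, and the IDs are hypothetical.

```python
from collections import deque


def shortest_path(source, target, people, movies):
    """Return a list of (movie_id, person_id) steps, or None if unreachable."""
    if source == target:
        return []

    # parent[person_id] = (movie_id, previous_person_id) used to reach them.
    parent = {source: None}
    queue = deque([source])

    while queue:
        current = queue.popleft()
        for movie_id in people[current]["movies"]:
            for co_star in movies[movie_id]["stars"]:
                if co_star in parent:
                    continue  # already reached at an equal or shorter distance
                parent[co_star] = (movie_id, current)
                if co_star == target:
                    # Follow the parent pointers back to the source.
                    path = []
                    node = target
                    while parent[node] is not None:
                        movie, previous = parent[node]
                        path.append((movie, node))
                        node = previous
                    path.reverse()
                    return path
                queue.append(co_star)
    return None


# Hypothetical data: a and b share m1, b and c share m2, d has no credits.
people = {
    "a": {"movies": {"m1"}},
    "b": {"movies": {"m1", "m2"}},
    "c": {"movies": {"m2"}},
    "d": {"movies": set()},
}
movies = {"m1": {"stars": {"a", "b"}}, "m2": {"stars": {"b", "c"}}}
print(shortest_path("a", "c", people, movies))  # [('m1', 'b'), ('m2', 'c')]
```

Checking for the target when an actor is discovered, rather than when it is dequeued, avoids expanding one extra level of the graph.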

▶️ How to Run the Project

Clone the repository and navigate to the project directory:

git clone https://github.com/your-username/your-repository.git
cd your-repository

Expected folder structure:

.
├── degrees.py
├── util.py
├── large/
│   ├── people.csv
│   ├── movies.csv
│   └── stars.csv
└── small/
    ├── people.csv
    ├── movies.csv
    └── stars.csv

Run the program with:

python degrees.py large

Or use the smaller dataset:

python degrees.py small

You will be prompted to enter the names of two actors. If multiple actors share the same name, the program will ask you to choose the correct IMDb ID.
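The name lookup could work roughly like the sketch below. The helper name `person_id_for_name`, the prompt wording, the case-insensitive matching, and the `names`/`people` dictionary shapes are all assumptions for illustration; the `choose` parameter exists only so the prompt can be replaced in tests.

```python
def person_id_for_name(name, names, people, choose=input):
    """Resolve a name to one person_id, asking the user when it is ambiguous."""
    candidates = sorted(names.get(name.lower(), set()))
    if not candidates:
        return None  # nobody with that name
    if len(candidates) == 1:
        return candidates[0]

    # Several people share the name: list them and ask for an ID.
    print(f"Which '{name}'?")
    for person_id in candidates:
        person = people[person_id]
        print(f"ID: {person_id}, Name: {person['name']}, Birth: {person['birth']}")
    chosen = choose("Intended Person ID: ").strip()
    return chosen if chosen in candidates else None
```

Returning `None` for an unknown name or an invalid choice lets the caller decide how to report the error.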

📌 Example

Enter the actors' names

The program will prompt for two names:

Name: Viola Davis
Name: Ian McKellen

Output:

2 degrees of separation.
1: Viola Davis and James McAvoy starred in The Disappearance of Eleanor Rigby: Them
2: James McAvoy and Ian McKellen starred in X-Men: Days of Future Past

Each line shows the degree of separation, the connected actors, and the movie that links them.
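As a rough sketch of how a path of (movie_id, person_id) pairs could be turned into these lines: the `describe_path` helper, the dictionary shapes, and the IDs (`p1`, `m1`, and so on) are hypothetical placeholders, while the names and titles are taken from the example output above.

```python
def describe_path(source, path, people, movies):
    """Format a list of (movie_id, person_id) steps as human-readable lines."""
    lines = [f"{len(path)} degrees of separation."]
    previous = source
    for step, (movie_id, person_id) in enumerate(path, start=1):
        first = people[previous]["name"]
        second = people[person_id]["name"]
        title = movies[movie_id]["title"]
        lines.append(f"{step}: {first} and {second} starred in {title}")
        previous = person_id  # the next step starts from this actor
    return lines


# Placeholder IDs; names and titles come from the example in this README.
people = {
    "p1": {"name": "Viola Davis"},
    "p2": {"name": "James McAvoy"},
    "p3": {"name": "Ian McKellen"},
}
movies = {
    "m1": {"title": "The Disappearance of Eleanor Rigby: Them"},
    "m2": {"title": "X-Men: Days of Future Past"},
}
for line in describe_path("p1", [("m1", "p2"), ("m2", "p3")], people, movies):
    print(line)
```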

🚀 Possible Improvements (Feel free to contribute)

  • Graph visualization
  • Graphical user interface (GUI)
  • Support for additional datasets

📚 Inspiration

Inspired by CS50's Introduction to Artificial Intelligence with Python, with a focus on graph modeling and search algorithms.

👤 Author

@lucaasleal

About

This project is an implementation of the game Six Degrees of Kevin Bacon, now popularly known as Connect the Stars. Both games consist of building a connection between two Hollywood actors or actresses using search algorithms such as BFS or DFS.
