Harry Potter Character Network Visualisation

A Deep Dive into Magical Interconnections Using Six Degrees of Separation Theory

Class Project: DS 210 Programming for Data Science

Objective: This project applies the six degrees of separation idea to the Harry Potter character network, using graph theory and network analysis to measure how closely the characters are connected. It combines data processing, graph traversal algorithms, and interactive visualisations to reveal how interconnected the characters are.

Technologies Used

  1. Rust: Efficient graph traversal and BFS algorithm implementation.

  2. Python: Visualization using networkx and matplotlib.

  3. Data Transformation Tools: CSV manipulation for input/output.

Key Deliverables:

  1. Implementation of a graph-based analysis in Rust.

  2. Data cleaning and transformation of a raw JSON dataset into a CSV format.

  3. Visualisation of the character network using Python.

Dataset Section

Data Source:

Data Transformation:

  • Converted the JSON into a structured CSV file for easier manipulation.

  • Nodes represent characters, and edges represent their interactions.
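The conversion step can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the raw JSON schema is not shown in this write-up, so the "source" and "target" field names and the RAW_JSON sample are assumptions, as is the choice to treat interactions as undirected and drop repeated pairs.

```python
import csv
import io
import json

# Hypothetical raw records: the real dataset's schema is not shown here,
# so the "source"/"target" field names are assumptions for illustration.
RAW_JSON = """
[
  {"source": "Harry Potter", "target": "Hermione Granger"},
  {"source": "Hermione Granger", "target": "Harry Potter"},
  {"source": "Harry Potter", "target": "Ron Weasley"},
  {"source": "Voldemort", "target": "Bellatrix Lestrange"}
]
"""


def json_to_edge_rows(raw_json):
    """Return de-duplicated (char_1, char_2) rows from the raw JSON text."""
    seen = set()
    rows = []
    for record in json.loads(raw_json):
        a = record["source"].strip()
        b = record["target"].strip()
        key = frozenset((a, b))  # undirected: A-B is the same edge as B-A
        if a == b or key in seen:
            continue  # skip self-loops and repeated pairs
        seen.add(key)
        rows.append((a, b))
    return rows


def write_edges_csv(rows, handle):
    """Write the edge rows under a char_1, char_2 header."""
    writer = csv.writer(handle)
    writer.writerow(["char_1", "char_2"])
    writer.writerows(rows)


rows = json_to_edge_rows(RAW_JSON)
buffer = io.StringIO()  # stands in for open("edges.csv", "w", newline="")
write_edges_csv(rows, buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch free of side effects; swapping in a real file handle gives a CSV on disk.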

Snippet of Processed Data:

char_1, char_2
Harry Potter, Hermione Granger
Harry Potter, Ron Weasley
Voldemort, Bellatrix Lestrange
...


Implementation Section

Algorithm Implementation in Rust:
The graph analysis was implemented in Rust for efficient computation.

  • Breadth-First Search (BFS): Calculated degrees of separation from a randomly selected character to all others in the network.

  • Random Sampling Module: Randomly selects a starting character for the analysis.
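For readers who want to see the idea in code, here is a minimal Python sketch of the two steps above: pick a random starting character, then run BFS to get each character's degree of separation from it. It is an illustration only and not the project's Rust implementation; the function names and the small edge list are assumptions made for the example.

```python
import random
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (char_1, char_2) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def degrees_of_separation(adjacency, start):
    """Breadth-first search: map each reachable character to its distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in distances:  # first visit is the shortest path
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    return distances


# Toy edge list for illustration; the Harry Potter-Voldemort edge is an
# assumption added to keep the toy graph connected, not a dataset row.
edges = [
    ("Harry Potter", "Hermione Granger"),
    ("Harry Potter", "Ron Weasley"),
    ("Harry Potter", "Voldemort"),
    ("Voldemort", "Bellatrix Lestrange"),
]

adjacency = build_adjacency(edges)
start = random.choice(sorted(adjacency))  # random starting character
for name, degree in sorted(degrees_of_separation(adjacency, start).items()):
    print(f"{name}: {degree}")
```

Because BFS explores the graph level by level, the first time a character is reached is guaranteed to be along a shortest path, which is exactly the degree of separation.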

Key Rust Code Snippet:


Visualization Section

Interactive Network Visualization in Python:
To make the results interpretable, the character network was visualized using networkx and matplotlib. Filtering logic was applied to focus on relevant characters and reduce clutter.

Visualization Process:

  1. Loaded the Rust-generated output (output.txt).

  2. Filtered the network graph to include only key connections:

    • Starting character (e.g., Stan Shunpike).

    • Direct neighbors and, optionally, neighbors-of-neighbors.

  3. Rendered the graph with color-coded nodes based on degrees of separation.
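The filtering and colouring steps might look something like the sketch below, using networkx and matplotlib. The format of output.txt is not shown in this write-up, so the sketch starts from a small in-memory edge list instead of parsing that file; the edges, the function names, and the radius of 2 are all assumptions for illustration.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import networkx as nx


def filtered_subgraph(graph, start, radius=2):
    """Keep the start node plus everything within `radius` hops of it."""
    return nx.ego_graph(graph, start, radius=radius)


def draw_by_degree(subgraph, start, target):
    """Draw the subgraph with nodes coloured by distance from `start`."""
    distances = dict(nx.single_source_shortest_path_length(subgraph, start))
    colours = [distances[node] for node in subgraph.nodes]
    pos = nx.spring_layout(subgraph, seed=42)  # fixed seed: repeatable layout
    fig, ax = plt.subplots(figsize=(6, 4))
    nx.draw_networkx(subgraph, pos=pos, ax=ax, node_color=colours,
                     cmap=plt.cm.viridis, with_labels=True, font_size=8)
    ax.set_axis_off()
    fig.savefig(target, format="png")
    plt.close(fig)


# Toy edges for illustration only; these are not rows from the dataset.
edges = [
    ("Stan Shunpike", "Harry Potter"),
    ("Harry Potter", "Ron Weasley"),
    ("Ron Weasley", "Molly Weasley"),
]
graph = nx.Graph()
graph.add_edges_from(edges)

sub = filtered_subgraph(graph, "Stan Shunpike", radius=2)
image = io.BytesIO()  # pass a file name such as "network.png" to save to disk
draw_by_degree(sub, "Stan Shunpike", image)
print(sorted(sub.nodes))
```

Setting radius=1 keeps only direct neighbors, while radius=2 also keeps neighbors-of-neighbors, which matches the optional second level described above.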

Python Code Snippet:

Sample Visualisation:


Insights and Results

  1. Connectedness of the Network:

    • All characters are interconnected within a few degrees, which is consistent with the six degrees of separation idea holding in the Harry Potter universe.

    • E.g., Stan Shunpike has most characters within 2 degrees of separation.

  2. Performance:

    • Execution time of the algorithm: ~1600 microseconds (about 1.6 ms).

    • Efficient graph traversal with Rust's memory safety and speed.

  3. Visualization Impact:

    • Highlighted clear connections between central characters while filtering out noise.
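A connectedness claim such as "most characters within 2 degrees" can be checked by tallying the BFS distances. The sketch below shows one way to do that; the distances dictionary is made-up placeholder data rather than the project's results, and share_within is a hypothetical helper name.

```python
from collections import Counter


def share_within(distances, max_degree):
    """Fraction of other characters at most `max_degree` steps from the start."""
    others = [d for d in distances.values() if d > 0]  # exclude the start itself
    if not others:
        return 0.0
    return sum(1 for d in others if d <= max_degree) / len(others)


# Placeholder BFS result for illustration only (not the project's output).
distances = {"Start": 0, "A": 1, "B": 1, "C": 2, "D": 2, "E": 3}

histogram = Counter(d for d in distances.values() if d > 0)
for degree in sorted(histogram):
    print(f"degree {degree}: {histogram[degree]} character(s)")
print(f"within 2 degrees: {share_within(distances, 2):.0%}")
```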

Conclusion

This project highlights my ability to:

  1. Transform raw datasets into structured formats.

  2. Implement and optimize algorithms for real-world problems.

  3. Design intuitive visualizations to make complex data comprehensible.

For more detailed project information, visit my GitHub: