Exercise 5: Data Visualization with Matplotlib and Seaborn

This exercise walks through creating histograms, box plots, and scatter plots with Matplotlib and Seaborn. We'll visualize the distribution of ages in the Titanic dataset and explore relationships between different features.

Step 1: Import Necessary Libraries

  1. Import Matplotlib and Seaborn:
    import matplotlib.pyplot as plt
    import seaborn as sns

Step 2: Load the Titanic Dataset

  1. Load the dataset:
    import pandas as pd
    # The leading "!" runs a shell command; this works only in Jupyter/Colab notebooks
    !wget https://raw.githubusercontent.com/drshahizan/dataset/main/titanic/train.csv -O train.csv
    df = pd.read_csv('train.csv')
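
Before plotting, it can help to check how many values are missing in each column, since the histogram step drops missing ages. The sketch below is only an illustration: `sample` is a hypothetical stand-in with the same column names as `train.csv`, and `missing_report` is a helper name invented here, not part of the exercise.

```python
import pandas as pd

# Hypothetical stand-in rows using the same column names as train.csv;
# pass the real `df` loaded above instead of `sample` when running the exercise.
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1],
    "Age": [22.0, 38.0, None, 35.0, None],
    "Fare": [7.25, 71.28, 7.92, 8.05, 53.10],
})


def missing_report(frame):
    """Return the number of missing values in each column."""
    return frame.isna().sum()


report = missing_report(sample)
print(report)
```

Calling `missing_report(df)` on the loaded dataset should show which columns, such as `Age`, need `dropna()` or another treatment before plotting.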

Step 3: Create a Histogram for Age Distribution

  1. Histogram:
    plt.figure(figsize=(10, 6))
    sns.histplot(df['Age'].dropna(), kde=True)
    plt.title('Age Distribution')
    plt.xlabel('Age')
    plt.ylabel('Frequency')
    plt.show()
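
To see what the bars of a histogram represent, the counts can be computed directly. This is a minimal sketch using `numpy.histogram` on hypothetical ages; the bin edges `[0, 20, 40, 60]` are an assumption chosen for readability, whereas `sns.histplot` picks its bins automatically unless told otherwise.

```python
import numpy as np
import pandas as pd

# Hypothetical ages, including missing values like those in the Age column.
ages = pd.Series([22.0, 38.0, None, 35.0, 4.0, 58.0, None, 27.0])

# Drop missing values first, as the histogram step does.
clean = ages.dropna()

# Count how many ages fall into each bin; the edges are illustrative.
counts, edges = np.histogram(clean, bins=[0, 20, 40, 60])

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right}: {count}")
```

Each printed count corresponds to the height of one bar, and the counts add up to the number of non-missing ages.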

Step 4: Create a Box Plot for Age Distribution by Survived

  1. Box Plot:
    plt.figure(figsize=(10, 6))
    sns.boxplot(x='Survived', y='Age', data=df)
    plt.title('Age Distribution by Survival')
    plt.xlabel('Survived')
    plt.ylabel('Age')
    plt.show()
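
The box in a box plot spans the first to third quartile, with a line at the median. As a rough numerical companion to the plot, the same statistics can be read from `groupby(...).describe()`. The data below is a hypothetical stand-in; only the column names `Survived` and `Age` come from the exercise.

```python
import pandas as pd

# Hypothetical rows; replace `sample` with the real `df` when running the exercise.
sample = pd.DataFrame({
    "Survived": [0, 0, 0, 1, 1, 1],
    "Age": [20.0, 30.0, 40.0, 10.0, 25.0, None],
})

# describe() skips missing ages and reports count, quartiles, and more per group.
summary = sample.groupby("Survived")["Age"].describe()

# Keep only the values that a box plot draws as the box and the median line.
print(summary[["count", "25%", "50%", "75%"]])
```

Comparing the `50%` column across the two groups gives the same median comparison that the box plot shows visually.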

Step 5: Create a Scatter Plot for Age vs Fare

  1. Scatter Plot:
    plt.figure(figsize=(10, 6))
    sns.scatterplot(x='Age', y='Fare', data=df)
    plt.title('Age vs Fare')
    plt.xlabel('Age')
    plt.ylabel('Fare')
    plt.show()
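
A scatter plot suggests visually whether two features move together; a correlation coefficient puts a number on it. The following is a small sketch with hypothetical values, assuming Pearson correlation (the default of `Series.corr`) is an acceptable summary for `Age` and `Fare`.

```python
import pandas as pd

# Hypothetical rows; replace `sample` with the real `df` when running the exercise.
sample = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0, 54.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86],
})

# Rows where either value is missing are ignored by Series.corr.
complete_pairs = sample[["Age", "Fare"]].dropna().shape[0]
r = sample["Age"].corr(sample["Fare"])

print(f"Pearson r over {complete_pairs} complete pairs: {r:.2f}")
```

A value near 0 would indicate little linear relationship, while values near 1 or -1 would indicate a strong one; the scatter plot remains useful for spotting outliers that a single number hides.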

Step-by-Step Execution

  1. Import Necessary Libraries:

    import matplotlib.pyplot as plt
    import seaborn as sns
  2. Load the Titanic Dataset:

    import pandas as pd
    # Assumes train.csv was already downloaded in Step 2
    df = pd.read_csv('train.csv')
  3. Create a Histogram for Age Distribution:

    plt.figure(figsize=(10, 6))
    sns.histplot(df['Age'].dropna(), kde=True)
    plt.title('Age Distribution')
    plt.xlabel('Age')
    plt.ylabel('Frequency')
    plt.show()
  4. Create a Box Plot for Age Distribution by Survived:

    plt.figure(figsize=(10, 6))
    sns.boxplot(x='Survived', y='Age', data=df)
    plt.title('Age Distribution by Survival')
    plt.xlabel('Survived')
    plt.ylabel('Age')
    plt.show()
  5. Create a Scatter Plot for Age vs Fare:

    plt.figure(figsize=(10, 6))
    sns.scatterplot(x='Age', y='Fare', data=df)
    plt.title('Age vs Fare')
    plt.xlabel('Age')
    plt.ylabel('Fare')
    plt.show()
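
The three plots can also be placed side by side in one figure. The sketch below is one possible arrangement, not the exercise's own method: it uses plain Matplotlib calls (`hist`, `boxplot`, `scatter`) instead of Seaborn, runs on hypothetical data, and writes to an illustrative file name `titanic_overview.png`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; remove this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows; replace `sample` with the real `df` when running the exercise.
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Age": [22.0, 38.0, None, 35.0, 4.0, 58.0],
    "Fare": [7.25, 71.28, 7.92, 8.05, 16.70, 26.55],
})

fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Panel 1: histogram of non-missing ages.
axes[0].hist(sample["Age"].dropna(), bins=5)
axes[0].set(title="Age Distribution", xlabel="Age", ylabel="Frequency")

# Panel 2: one box per Survived value (groupby sorts the keys: 0, then 1).
groups = [group["Age"].dropna() for _, group in sample.groupby("Survived")]
axes[1].boxplot(groups)
axes[1].set_xticks([1, 2])
axes[1].set_xticklabels(["0", "1"])
axes[1].set(title="Age Distribution by Survival", xlabel="Survived", ylabel="Age")

# Panel 3: scatter of Age against Fare; points with a missing age are not drawn.
axes[2].scatter(sample["Age"], sample["Fare"])
axes[2].set(title="Age vs Fare", xlabel="Age", ylabel="Fare")

fig.tight_layout()
fig.savefig("titanic_overview.png")  # illustrative output file name
```

Seaborn's plotting functions also accept an `ax=` argument, so the same layout should work with `sns.histplot`, `sns.boxplot`, and `sns.scatterplot` if Seaborn styling is preferred.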

Contribution 🛠️

Please create an Issue for any improvements, suggestions or errors in the content.

You can also contact me on LinkedIn with any other queries or feedback.
