The steps below create histograms, box plots, and scatter plots using Matplotlib and Seaborn. We'll visualize the distribution of ages in the Titanic dataset and explore relationships between different features.
- Import Matplotlib and Seaborn:

      import matplotlib.pyplot as plt
      import seaborn as sns
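- Load the dataset:

      import pandas as pd
      # The leading "!" runs a shell command from a notebook cell (Jupyter/Colab);
      # in a terminal, run the wget command without it.
      !wget https://raw.githubusercontent.com/drshahizan/dataset/main/titanic/train.csv -O train.csv
      df = pd.read_csv('train.csv')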
- Histogram:

      plt.figure(figsize=(10, 6))
      sns.histplot(df['Age'].dropna(), kde=True)
      plt.title('Age Distribution')
      plt.xlabel('Age')
      plt.ylabel('Frequency')
      plt.show()
- Box Plot:

      plt.figure(figsize=(10, 6))
      sns.boxplot(x='Survived', y='Age', data=df)
      plt.title('Age Distribution by Survival')
      plt.xlabel('Survived')
      plt.ylabel('Age')
      plt.show()
- Scatter Plot:

      plt.figure(figsize=(10, 6))
      sns.scatterplot(x='Age', y='Fare', data=df)
      plt.title('Age vs Fare')
      plt.xlabel('Age')
      plt.ylabel('Fare')
      plt.show()
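Before reading the plots, it can help to look at the numbers behind them. The sketch below is only an illustration: it uses a small hand-made DataFrame (hypothetical rows, not the real Titanic data) with the same `Survived`, `Age`, and `Fare` column names, and computes the count of missing ages that the histogram drops, the per-group quartiles that a box plot draws, and the Age/Fare correlation that the scatter plot hints at.

```python
import pandas as pd

# Hypothetical stand-in rows; in the tutorial, df comes from pd.read_csv('train.csv').
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1],
    "Age": [22.0, 38.0, 26.0, 35.0, None, 4.0],
    "Fare": [7.25, 71.28, 7.92, 8.05, 8.46, 16.70],
})

# Number of rows the histogram skips because Age is missing.
missing_age = int(df["Age"].isna().sum())

# Quartiles per survival group: roughly the box edges and median line of the box plot.
age_quartiles = df.groupby("Survived")["Age"].quantile([0.25, 0.5, 0.75]).unstack()

# Pearson correlation between Age and Fare (rows with a missing value are skipped).
age_fare_corr = df["Age"].corr(df["Fare"])

print("Missing ages:", missing_age)
print(age_quartiles)
print("Age/Fare correlation:", age_fare_corr)
```

With the real dataset, replace the stand-in DataFrame with the `df` loaded from `train.csv`; the rest of the sketch stays the same.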
Please create an Issue for any improvements, suggestions, or errors in the content.
You can also contact me on LinkedIn with any other queries or feedback.