Skip to content

Latest commit

 

History

History
157 lines (115 loc) · 5.35 KB

seaborn.md

File metadata and controls

157 lines (115 loc) · 5.35 KB

Stars Badge Forks Badge Pull Requests Badge Issues Badge GitHub contributors Visitors

🌟 Hit star button to save this repo in your profile

Seaborn

Seaborn is a Python data visualization library based on Matplotlib, designed for creating informative and attractive statistical graphics. It's particularly well-suited for various aspects of Exploratory Data Analysis (EDA). Here are some common Seaborn syntax and functions suitable for EDA:

  1. Importing Seaborn:

    • Import Seaborn:

      import seaborn as sns
  2. Setting Aesthetic Parameters:

    • Set the style and color palette for plots:

      sns.set(style="whitegrid", palette="husl")
  3. Distribution Plots:

    • Create various distribution plots like histograms and kernel density estimates:

      sns.histplot(data, kde=True)
  4. Pair Plots:

    • Generate pair plots for visualizing relationships between multiple variables:

      sns.pairplot(df, hue='category_column')
  5. Scatter Plots:

    • Create scatter plots to explore relationships between two variables:

      sns.scatterplot(x='x_column', y='y_column', data=df)
  6. Box Plots:

    • Generate box plots to visualize data distributions and outliers:

      sns.boxplot(x='category_column', y='value_column', data=df)
  7. Violin Plots:

    • Create violin plots for a combination of box plots and kernel density estimation:

      sns.violinplot(x='category_column', y='value_column', data=df)
  8. Heatmaps:

    • Generate heatmaps to visualize correlation matrices:

      sns.heatmap(correlation_matrix, annot=True, cmap="YlGnBu")
  9. Count Plots:

    • Create count plots to visualize the distribution of categorical variables:

      sns.countplot(x='category_column', data=df)
  10. PairGrids:

    • Build custom PairGrids for more control over pair plots:

      g = sns.PairGrid(df, hue='category_column')
      g.map_upper(sns.scatterplot)
      g.map_diag(sns.histplot)
      g.map_lower(sns.kdeplot)
  11. Regression Plots:

    • Create regression plots to explore linear relationships between variables:

      sns.regplot(x='x_column', y='y_column', data=df)
  12. Customizing Plots:

    • Customize plots with titles, labels, and other attributes:

      plt.title('Plot Title')
      plt.xlabel('X-axis label')
      plt.ylabel('Y-axis label')

Example

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# URL of the Titanic dataset
url = 'https://raw.githubusercontent.com/drshahizan/dataset/main/titanic/train.csv'

# Load the dataset
titanic_data = pd.read_csv(url)

# Display the first 5 rows
print(titanic_data.head())

# Set the aesthetic style of the plots
sns.set_style('whitegrid')

# Plotting the distribution of passengers' ages
plt.figure(figsize=(10, 6))
sns.histplot(titanic_data['Age'].dropna(), bins=30, kde=True)
plt.title('Distribution of Passengers\' Ages')
plt.xlabel('Age')
plt.ylabel('Number of Passengers')
plt.show()

# Plotting the survival rate by gender
plt.figure(figsize=(10, 6))
sns.barplot(x='Sex', y='Survived', data=titanic_data, palette='pastel')
plt.title('Survival Rate by Gender')
plt.xlabel('Gender')
plt.ylabel('Survival Rate')
plt.show()

# Plotting the survival rate by class
plt.figure(figsize=(10, 6))
sns.barplot(x='Pclass', y='Survived', data=titanic_data, palette='muted')
plt.title('Survival Rate by Class')
plt.xlabel('Class')
plt.ylabel('Survival Rate')
plt.show()

Seaborn is known for its high-level abstractions and aesthetically pleasing visualizations. You can use these Seaborn syntax and functions to create informative and attractive plots for EDA, making it easier to explore and understand your data.

Contribution 🛠️

Please create an Issue for any improvements, suggestions or errors in the content.

You can also contact me using Linkedin for any other queries or feedback.

Visitors