Skip to content

SaiSriramKamineni/EDA-e-Commerce-Transaction-Data

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 

Repository files navigation

Error in user YAML: (<unknown>): mapping values are not allowed in this context at line 6 column 17
---

# 📊 **Exploratory Data Analysis (EDA) on e-Commerce Transaction Data** 🌐

## 🎯 **Project Overview**
This project involves conducting **Exploratory Data Analysis (EDA)** on a transactional dataset from a UK-based online retail company. The dataset contains valuable insights into customer purchasing behavior, including information such as product details, sales, countries, and customer activities. Our aim is to uncover important trends, patterns, and statistics through analysis and visualization.

**Dataset Size**: The dataset is **500,000 rows** long, containing extensive transactional data over a one-year period. This large dataset allows for comprehensive analysis of customer behavior and purchasing trends.

## 📚 **What is EDA?**
EDA stands for **Exploratory Data Analysis**. It is a crucial process in data analysis that involves:

- **Understanding the data** through statistical summary and visualization.
- **Identifying patterns** in the dataset, including trends, correlations, and anomalies.
- **Summarizing the dataset** using graphical tools like histograms, box plots, and scatter plots.

By using EDA, we gain insights that help drive business decisions, reveal hidden patterns, and guide further analysis.

## 🔍 **Dataset Overview**
The dataset used for this analysis includes transactional data from an online retail store. Key features in the dataset include:

- **InvoiceNo** (Invoice Number) 💳: Unique number for each transaction.
- **StockCode** (Product Code) 🏷️: Unique identifier for each product.
- **Description** (Product Name) 🛍️: Name of the product purchased.
- **Quantity** (Quantity Purchased) 🔢: Number of units purchased in each transaction.
- **InvoiceDate** (Transaction Date) 📅: Timestamp when the transaction occurred.
- **UnitPrice** (Price per Unit) 💰: Price of a single unit of the product.
- **CustomerID** (Customer ID) 👤: Unique identifier for each customer.
- **Country** (Customer Country) 🌍: The country from which the customer made the purchase.

## 🧮 **Key Insights from the EDA**
In this analysis, we will uncover several key findings about the dataset:

1. **Top Countries by Number of Orders** 🇬🇧🌎
   - Discover which countries place the highest number of orders.

2. **Top Countries by Total Spending** 💸
   - Identify the countries where the most money was spent on purchases.

3. **Monthly Sales Trends** 📅💵
   - Analyze which months had the highest and lowest sales.

4. **Peak Purchase Hours** 🕒
   - Investigate the hours of the day when the most transactions occurred.

5. **Sales Trends Over Time** 📈
   - Discover periods of sales growth and decline, including:
     - **Sales Increase Days** 📈
     - **Sales Decrease Days** 📉

6. **Unusual Gaps in Transactions** ❌
   - Identify dates where there were no transactions.

## 🛠️ **Tools & Techniques Used**
- **Python** 🐍: For data manipulation and analysis.
- **Pandas** 🧑‍💻: Data analysis library to handle the dataset.
- **Matplotlib & Seaborn** 📊: Libraries used to create interactive and visually appealing plots.
- **Jupyter Notebooks** 💻: To run and document the analysis in an interactive environment.

## 💡 **How the Analysis was Conducted**
### 1. **Data Cleaning** 🧹
   - Handling missing values and incorrect data types.
   - Removing duplicates and irrelevant rows.
   
### 2. **Data Exploration** 🔍
   - Descriptive statistics to summarize the data.
   - Data grouping and aggregation to find key metrics like total sales per country.

### 3. **Visualization** 📊
   - Plotting graphs and charts to represent:
     - Number of orders by country.
     - Total spending by country.
     - Monthly sales trends.
     - Hourly purchase patterns.

### 4. **Pattern Recognition** 📈
   - Identifying trends like peak purchasing times, high-sales months, and days of sales increase.

## 📦 **Project Files**
- **Data Files** 📁: Contains the transactional data (500,000 rows).
- **Analysis Notebook** 📓: The Jupyter notebook where the analysis and visualizations are documented.
- **Visualizations** 📸: Charts and graphs showcasing the findings.

## 📈 **Interactive Insights & Visualizations**
All visualizations are interactive, allowing users to zoom, hover, and explore data in greater detail. We’ve made it easy to understand the complex patterns with visually appealing graphs and stats. Here's a sneak peek of the insights you can explore:
- 🌍 **Top Countries**: You can hover over each country to see the number of orders and spending details.
- 📅 **Monthly Trends**: Interactive time-series plots that help you track changes in sales over time.
- ⏰ **Purchase Hour Analysis**: Visuals to identify the hours of the day with the highest transactions.

## 🌟 **Conclusion**
This EDA project provides actionable insights that can help the online retail company understand customer behavior better. By analyzing purchasing patterns, peak times, and the countries with the highest sales, the company can make data-driven decisions for future marketing strategies and inventory management. 

---

🔗 Let’s Connect: Feel free to explore the project and get in touch if you have any questions!
Happy Analyzing! 🎉


About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages