fix

sak-codes · sak-codes · commit 0a6a3e89b6af · 2025-02-23T21:26:37.000-08:00
diff --git a/docs/posts/blogs/blog1.md b/docs/posts/blogs/blog1.md
@@ -76,4 +76,4 @@ Satellite imagery, IoT sensors, and big data analytics are helping predict natur
 The power of data science lies in its ability to solve problems, create opportunities, and transform the world. Whether you’re a student, a professional, or an enthusiast, now is the time to embrace data science and contribute to this evolving field.  
 
 
-Thank you for reading! 🚀✨  
+Thank you for reading! 
diff --git a/docs/posts/blogs/blog2.md b/docs/posts/blogs/blog2.md
@@ -1,4 +1,4 @@
-# Data Science in Entertainment: Lights, Camera, Algorithms! 🎥  
+# Data Science in Entertainment: Lights, Camera, Algorithms!
 
 ## Introduction: When Data Meets Showbiz  
 
@@ -7,7 +7,7 @@ Have you ever wondered how Netflix knows exactly what you want to watch next, or
 Let’s dive into the world where data science meets the spotlight, and explore how algorithms are transforming the entertainment industry.
 
 
-## 1. Personalized Recommendations: Your Digital BFF 🎯  
+## 1. Personalized Recommendations
 
 ### A. The Netflix Effect 
 
@@ -21,7 +21,7 @@ Fun Fact: Over 80% of Netflix views are driven by recommendations!
 
 
 
-### B. Spotify’s Musical Genius 🎵
+### B. Spotify’s Musical Genius
 
 ![Spotify Data Science](../../images/blogs/ds_spotify.png)
 
@@ -33,7 +33,7 @@ Real-World Impact: Spotify’s recommendation system boosts user engagement by o
 
 
 
-## 2. Box Office Predictions: Data Behind the Blockbusters 🎥  
+## 2. Box Office Predictions: Data Behind the Blockbusters
 
 Before the first ticket is sold, data science predicts whether a movie will be a flop or a blockbuster.  
 
@@ -46,7 +46,7 @@ Case Study: Predicting Marvel’s *Avengers: Endgame* would surpass $1 billion i
 
 
 
-## 3. Gaming Analytics: Leveling Up the Experience 🎮  
+## 3. Gaming Analytics: Leveling Up the Experience 
 
 ![Games](../../images/blogs/ds_games.png)
 
@@ -76,7 +76,7 @@ Fun Fact: Epic Games processes over 2 petabytes of data daily to improve gamepla
 3. Sentiment-Driven Content: Movies and songs dynamically changing based on your mood, tracked through wearables or devices.  
 
 
-## 5. Challenges: The Dark Side of the Spotlight 🌑  
+## 5. Challenges: The Dark Side of the Spotlight  
 
 1. Privacy Concerns: With so much personal data being collected, maintaining user trust is critical.  
 2. Algorithmic Bias: Ensuring diverse and inclusive recommendations instead of reinforcing existing biases.  
@@ -88,5 +88,3 @@ Fun Fact: Epic Games processes over 2 petabytes of data daily to improve gamepla
 Data science has turned entertainment into an experience that’s personal, predictive, and magical. From the playlists that understand your mood to the games that adapt to your skills, it’s an era where creativity and computation come together.  
 
 The future of entertainment is not just about watching or listening; it’s about living the experience – and data science is leading the way.  
-
-
diff --git a/docs/posts/open_source/gsoc.md b/docs/posts/open_source/gsoc.md
@@ -1,4 +1,4 @@
-# My Google Summer of Code Journey – Extending Data Structures, Algorithms, and the C++ Backend 🚀
+# My Google Summer of Code Journey – Extending Data Structures, Algorithms, and the C++ Backend
 
 ## Introduction: A Summer of Growth, Code, and Algorithms
 
@@ -8,7 +8,7 @@ I’m Sakshi Oza, a Master's student passionate about data-science, open-source
 If you’ve ever wondered how Python libraries can achieve C++-level performance while implementing advanced algorithms, this blog is for you. I’ll share my GSoC journey, including challenges, solutions, and learnings.
 
 
-## About the Project 📚
+## About the Project
 
 The project focused on enhancing a Python-based data structures library, `pydatastructs` by:  
 1. Extending existing data structures and algorithms.  
@@ -18,7 +18,7 @@ The project focused on enhancing a Python-based data structures library, `pydata
 This combination bridges the gap between Python’s developer-friendliness and C++’s computational efficiency.
 
 
-## Breaking It Down: My GSoC Timeline ⏳
+## Breaking It Down: My GSoC Timeline
 
 ### 🔹 Community Bonding Period
 Before coding began, I focused on:  
@@ -57,7 +57,7 @@ I explored the gaps in the current implementation and set milestones for my cont
 In Phase 2, I focused on backend optimization by implementing a C++ backend for performance-critical algorithms. Why? Python is fantastic for development, but when it comes to heavy computation, C++ shines. By combining the two, we achieve the best of both worlds.
 
 
-### Key Contributions in this Phase 🛠️
+### Key Contributions in this Phase
 
 #### 1. Sorting Algorithms  
 - Added bubble_sort, selection_sort, and insertion_sort with C++ backend support.  
@@ -82,7 +82,7 @@ Introduced a lazy segment tree to handle range-based queries and updates efficie
 Segment trees are invaluable for applications like interval management and range sum/count queries.
 
 
-## Challenges and Learnings 🤓
+## Challenges and Learnings
 
 ### 1. Network Flow Complexity  
 Implementing Edmond-Karp and Dinic’s algorithms required a deep understanding of graph theory, BFS, and DFS.  
@@ -97,7 +97,7 @@ I explored Cython and other backend options to ensure seamless interoperability
 - Wrote comprehensive test cases for every addition to ensure robustness.
 
 
-## Memes from the Journey 🎭
+## Memes from the Journey
 
 1. When Network Flow Started Making Sense  
 *“When theory meets implementation, and it finally clicks.”*
@@ -109,15 +109,15 @@ I explored Cython and other backend options to ensure seamless interoperability
 *“Months of hard work, countless commits, and it all comes down to one button: Merge PR.”*
 
 
-## Impact: Why This Matters 🌍
+## Impact: Why This Matters
 
 The contributions I made during GSoC 2023 will:  
 1. Enhance library performance through optimized algorithms.  
 2. Expand library utility with new data structures and methods.  
 3. Improve scalability with a C++ backend for heavy computations.
 
 
-## Gratitude 🙏
+## Gratitude
 
 I’m incredibly grateful to my mentors:  
 - Gagandeep Singh  
@@ -127,11 +127,11 @@ I’m incredibly grateful to my mentors:
 Their guidance, patience, and feedback were invaluable throughout this project. I also want to thank the GSoC community for fostering such a collaborative and supportive environment.
 
 
-## Conclusion: A Summer to Remember 🌟
+## Conclusion: A Summer to Remember
 
 GSoC 2023 was a transformative experience. From grappling with network flows to integrating C++ backends, I grew as a developer and problem solver. This project has not only strengthened my technical skills but also deepened my appreciation for open-source contributions.
 
 I’m excited to continue my open-source journey, contribute more, and keep growing as a developer.
 
-Thank you for reading! 💻✨  
+Thank you for reading!  
 “Keep coding, keep learning, and let’s build something amazing together!” 
diff --git a/docs/posts/research/project1.md b/docs/posts/research/project1.md
@@ -1,15 +1,15 @@
 
-# 🚀 Predicting Cardiovascular Risk Using Machine Learning 🩺  
+# Predicting Cardiovascular Risk Using Machine Learning  
 
 
-## Introduction: A Fight Against the Silent Killer 💔  
+## Introduction: A Fight Against the Silent Killer
 
 Cardiovascular diseases (CVDs) are the leading cause of death worldwide, accounting for millions of lives lost annually. Identifying individuals at risk early can help implement preventive strategies and save lives. The challenge? Finding the right tools to predict this risk effectively.  
 
 In this project, I used machine learning techniques—Principal Component Analysis (PCA), K-Means Clustering, and LASSO Logistic Regression—to identify high-risk individuals based on health data from the NHANES dataset.  
 
 
-## Data Overview 📊
+## Data Overview
 
 I used the National Health and Nutrition Examination Survey (NHANES) dataset, which includes key demographic, physiological, and dietary features:  
 
@@ -27,7 +27,7 @@ Participants were classified as High Risk (1) if they met at least one of the fo
 Otherwise, they were labeled as Low Risk (0).  
 
 
-## Methodology 🛠️
+## Methodology
 
 I applied three key machine learning techniques to analyze and predict CVD risk:  
 
@@ -52,7 +52,7 @@ Result:
 - PC2 explained 20.4% of variance, focusing on Systolic BP and Diastolic BP.  
 
 
-### 2. K-Means Clustering 🤖 
+### 2. K-Means Clustering
 K-Means was applied to PCA components to identify subgroups within the data. The objective was to minimize within-cluster variance:
   
 $$
@@ -77,7 +77,7 @@ Insights:
 - Cluster 3: Middle-aged individuals with moderate BMI → Medium Risk  
 
 
-### 3. LASSO Logistic Regression 📉
+### 3. LASSO Logistic Regression
 LASSO regression shrinks insignificant predictors to zero, focusing only on the most influential features. The loss function includes a penalty term:  
 
 $$
@@ -104,7 +104,7 @@ The model achieved excellent predictive accuracy:
 - Specificity: 93.6%  
 
 
-## Results and Discussion 📈  
+## Results and Discussion  
 
 ### Key Takeaways 
 
@@ -117,21 +117,21 @@ The model achieved excellent predictive accuracy:
 3. The LASSO model’s high accuracy proves its reliability for predicting CVD risk.  
 
 
-## Visual Results 📊
+## Visual Results
 
 1. PCA Biplot: Visualize variable contributions to CVD risk.  
 2. K-Means Cluster Plot: Show the distinct clusters based on PCA components.  
 3. LASSO Coefficient Table: Highlight the importance of each predictor.  
 4. ROC Curve: Demonstrates the model’s high predictive performance.  
 
 
-## Challenges Faced 💡
+## Challenges Faced
 1. Multicollinearity: Addressed using PCA for dimensionality reduction.  
 2. Optimal Clustering: Achieved using the Elbow Method.  
 3. Model Tuning: Finding the best regularization parameter $\lambda$ for LASSO.
 
 
-## Conclusion: Insights for Public Health 🌍 
+## Conclusion: Insights for Public Health 
 
 This study demonstrates the power of machine learning in predicting cardiovascular risk. By combining PCA, K-Means Clustering, and LASSO Regression, I:  
 - identified key predictors of CVD risk: BMI, Systolic BP, and Age.  
@@ -140,12 +140,12 @@ This study demonstrates the power of machine learning in predicting cardiovascul
 These findings can guide public health strategies to focus resources on high-risk individuals and promote preventive healthcare.  
 
 
-## Future Directions 🚀
+## Future Directions
 
 1. Use longitudinal data to monitor CVD risk over time.  
 2. Explore advanced models like Deep Learning for complex interactions.  
 3. Apply this framework to global datasets for broader impact.
 
 
-## Thank You! 💻✨  
+## Thank You!
 *"Let’s use data to solve real-world problems and create a healthier world!"*  
diff --git a/docs/posts/research/project2.md b/docs/posts/research/project2.md
@@ -1,6 +1,6 @@
-# 🧪 Analyzing the Impact of Occupational Radiation Exposure on Cancer Mortality  
+# Analyzing the Impact of Occupational Radiation Exposure on Cancer Mortality  
 
-## 1. Introduction: The Shadow of Radiation ☢️  
+## 1. Introduction: The Shadow of Radiation  
 
 Occupational radiation exposure is a significant public health concern, particularly for workers in industries like nuclear power, metal processing, and energy production. Long-term exposure to radiation has been linked to increased cancer risks, making this an important area of occupational health research.  
 
@@ -13,9 +13,9 @@ This retrospective study investigates the impact of occupational radiation expos
 
 
 
-## 3. Methods and Materials 🛠️  
+## 3. Methods and Materials
 
-### A. Data Sources 📊  
+### A. Data Sources
 
 Two datasets were analyzed:  
 
@@ -33,15 +33,15 @@ Note: The study focuses on white male workers due to demographic consistency, re
 
 
 
-### B. Workflow of the Study 🔄  
+### B. Workflow of the Study 
 
 The data analysis process followed a systematic workflow as shown below:  
 
 ![flowchart](../../images/project2/flowchart.png)
 
 
 
-### C. Data Preprocessing 🧹  
+### C. Data Preprocessing 
 
 1. Cancer Classification:  
    Workers were categorized into two groups based on ICD-8 codes:  
@@ -57,7 +57,7 @@ Table 1: Example Rows from Merged Dataset
 
 
 
-### D. Statistical Analysis 🧮  
+### D. Statistical Analysis 
 
 1. Descriptive Statistics:  
    Calculated measures like mean, standard deviation, skewness, and kurtosis for photon dose levels.  
@@ -71,7 +71,7 @@ Table 1: Example Rows from Merged Dataset
 
 
 
-## 4. Results 📈  
+## 4. Results
 
 ### A. Descriptive Analysis  
 
@@ -113,26 +113,26 @@ Result: The test showed a statistically significant difference in photon dose le
 
 
 
-## 5. Discussion: Key Insights 🧐  
+## 5. Discussion: Key Insights 
 
 1. Workers who died from cancer-related causes had significantly higher radiation doses than those who died from other causes.  
 2. The skewed distribution of photon doses highlights the importance of non-parametric tests in exposure data analysis.  
 3. These findings align with the hypothesis that long-term radiation exposure increases the risk of cancer mortality.  
 
 
 
-## 6. Conclusion: What This Means for Public Health 🌍  
+## 6. Conclusion: What This Means for Public Health 
 
 Our analysis suggests a strong association between occupational radiation exposure and cancer mortality among FMPC workers. These results underscore the importance of:  
 1. Dose Monitoring: Implementing strict monitoring protocols for radiation levels.  
 2. Safety Regulations: Ensuring safety standards to minimize exposure in occupational settings.  
 3. Future Research: Conducting studies with larger, more diverse datasets to validate these findings. 
 
-## 7. References 📚  
+## 7. References
 
 1. Cragle, D. L., Watkins, J. P., Ingle, J. N., et al. *Mortality Among White Male Workers at a Uranium Processing Plant.* (1996).  
 2. CEDR (1994). *Fernald Retrospective Cancer Mortality Study.*  
 
 
-Thank you for reading! 🚀  
+Thank you for reading!
 *“Science is not only about understanding the world; it’s about protecting it.”*  
diff --git a/docs/posts/research/publication.md b/docs/posts/research/publication.md
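
Note on the change: every hunk shown above makes the same kind of edit, dropping emoji from headings and sign-off lines while leaving the rest of the text alone. A minimal sketch of how that cleanup could be scripted follows. It is not taken from the commit, which may well have been edited by hand. The function name `strip_emoji` and the chosen Unicode ranges are assumptions for illustration; the ranges cover the pictographs seen in this diff but are not a full definition of emoji.

```python
import re

# Assumption: these ranges cover the symbols removed in this diff
# (pictographs, misc technical symbols such as the hourglass, misc
# symbols and dingbats, and the emoji variation selector U+FE0F).
# They are illustrative, not an exhaustive emoji definition.
EMOJI_RUN = re.compile(
    "[ \t]*"                      # spaces or tabs directly before the emoji
    "[\U0001F300-\U0001FAFF"      # pictographs, emoticons, transport, etc.
    "\u2300-\u23FF"               # miscellaneous technical (hourglass)
    "\u2600-\u27BF"               # miscellaneous symbols and dingbats
    "\uFE0F]+"                    # variation selector-16
)


def strip_emoji(line: str) -> str:
    """Remove emoji runs and the whitespace immediately before them.

    Whitespace after an emoji run is kept, so a trailing two-space
    Markdown hard line break survives.
    """
    return EMOJI_RUN.sub("", line)


if __name__ == "__main__":
    for sample in ("## Gratitude \U0001F64F",
                   "Thank you for reading! \U0001F680\u2728  "):
        print(repr(strip_emoji(sample)))
```

Unlike the commit, this sketch would also strip the marker in `### 🔹 Community Bonding Period`, which the diff leaves untouched, so the output would need review before committing.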
