HarvardX PH125.9x - Data Science: Capstone

In this project, I explore WordBank, an open database of children's vocabulary development and growth. The database contains data from 75,000+ children in 25+ languages. I use machine learning algorithms (regression trees, random forests, and linear regression) to investigate potential relationships between demographic and linguistic variables (the predictors) and vocabulary growth, as measured by productive vocabulary (the outcome measure) on the MacArthur-Bates Communicative Development Inventories. All analyses are exploratory, and no hypotheses or predictions are made. I first curate the WordBank dataset, then move to descriptive analyses and visualizations, and finally apply the machine learning algorithms.

This repository contains:

  • PDF report (knitted from Rmd)
  • Rmd script
  • R script
  • Reference list (BibTeX)

For more information or questions, please e-mail me at [email protected]