Projects
Selected academic and independent work.
Understanding Misinformation on Social Media
Data Science for Social Good • Python
- Processed and analyzed 10,700 COVID-19 social media posts
- Applied tokenization, stopword filtering, and TF-IDF vectorization
- Built a Random Forest classifier achieving 94% accuracy
- Used SHAP and permutation importance to interpret model decisions
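The preprocessing named above (tokenization, stopword filtering, TF-IDF weighting) can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the stopword list, the function names `tokenize` and `tfidf`, and the sample posts are all made up for the example, and the weighting uses the textbook form tf × ln(N / df), which may differ from whatever library the project used.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real pipeline would likely use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "for"}


def tokenize(text):
    """Lowercase the text, keep alphabetic runs, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = ln(number of documents / documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty posts
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up example posts, not taken from the project's dataset.
posts = [
    "The vaccine is safe and effective",
    "Masks are effective in crowded rooms",
    "The vaccine contains a microchip",
]
vectors = tfidf(posts)
```

Terms that appear in fewer posts get a larger weight, so in the third sample post "microchip" outweighs "vaccine". Vectors like these would then be the input to a classifier such as the Random Forest described above.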
Predicting the Success of Netflix Movies
Statistical Learning • R
- Cleaned and engineered features from a dataset of 32,540 movies
- Defined a profit-based success metric for classification
- Built Bayesian classifiers and ensemble models
- Achieved an average cross-validated prediction accuracy of 79%
Rally
Independent Project • Product & Data
- Developed the original Suggestioneer concept, winner of the YEP competition
- Rebuilt the project as Rally, a social recommendation engine
- Focused on scalable design, iteration, and data-informed product decisions