Topic Modeling Amazon Reviews

Adapted from Biel 2011. I found Professor Julian McAuley’s work at UCSD while searching for academic work on identifying the ontology and utility of products on Amazon. Professor McAuley and his students have done impressive work inferring networks of substitutable and complementary items. They constructed a browsable graph of related products and discovered topics […]

Regularized Logistic Regression Intuition

In this notebook we’ll manually implement regularized logistic regression to build intuition about the algorithm’s underlying math and to demonstrate how regularization can address overfitting or underfitting. We’ll then implement logistic regression more practically using the ubiquitous scikit-learn package. The post assumes the reader is familiar with the concepts of optimization, cross-validation, […]

Quick Look: Facebook’s Kaggle Competition

Following Friday’s news of yhat’s ggplot port (which I hope they promptly rename to avoid search engine conflation with other variants), I thought it’d be fun to explore the large Stack Overflow dataset Facebook provided (9.7 GB) for their latest Kaggle competition. I discovered that the ggplot port is off to a great start and will only […]