Mahalanobis Distance and Outliers

I wrote a short article on Absolute Deviation Around the Median a few months ago after a conversation with Ryon about robust parameter estimators. I am excited to see a wet lab scientist take a big interest in ways to challenge the usual bench stats paradigm of t-tests, means, and z-scores. When researching KNN and SOFM, […]

Gradient Boosting: Analysis of LendingClub’s Data

I originally posted this article last year. I am reposting it as a primer for my upcoming article where I take a second look at LendingClub, but this time with the Python data stack. An old 5.75% CD of mine recently matured and seeing that those interest rates are gone forever, I figured I’d […]

Predicting Dichotomous Outcomes I

We are trying to predict a dependent dichotomous variable (male/female, yes/no, like/dislike, etc.) with independent “predictor” variables. Let’s say we want to determine whether or not an employee will quit based on the percentage of their tenure spent traveling. We assemble the data from HR and erroneously employ simple linear regression to model the relationship, a […]