Tuesday, November 3, 2015

Week 2

Week 2 felt like being back in a college classroom.  Calculus in lecture!  It had been a long time since I had done any calculus, but I was surprised how easy it was to follow the derivations of formulas from first principles.  Actually doing the calculus myself would take a little more ramp-up, but at least it was not completely foreign.

As with programming, there are many students without a strong math background, so I don't know how lecture makes sense to them.  Certainly no spoonfeeding in this bootcamp.  You can get through it without a technical background, but you will get more out of it if you come in with strong technical chops.

Topics this week were probability, experiment design, classical statistics (e.g. frequentist hypothesis testing) and Bayesian statistics.  In my physics education, probability and statistics were never formally taught.  They were topics you were supposed to figure out on your own, but having a good foundation would have made statistical mechanics so much less painful.

edX offers an MIT Probability course which is the best treatment available online:  https://courses.edx.org/courses/MITx/6.041x_1/1T2015/info.  Definitely watch the solutions videos, especially on counting.  I thought I had learned to count in kindergarten, but apparently I was mistaken.

In addition to Bayesian statistics, bootstrapping and multi-armed bandits were highlights.  Bootstrapping is nothing short of magic.  Bayesian statistics is much more satisfying than frequentist statistics, if for no other reason than not having to use phrases like, "do not fail to reject the null hypothesis".
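For anyone wondering what the bootstrapping magic looks like in practice, here is a minimal sketch of a percentile bootstrap confidence interval for a mean.  It is not the bootcamp's exercise code: the function name `bootstrap_ci`, its parameters and the sample values are all made up for illustration, and it sticks to the Python standard library.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement (same size as the data) and recompute the statistic.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    # Read the interval off the empirical percentiles of the resampled statistic.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up sample, purely for demonstration.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 3.0]
low, high = bootstrap_ci(sample)
print(f"sample mean = {statistics.mean(sample):.2f}, 95% CI ~ ({low:.2f}, {high:.2f})")
```

The appeal is that no formula for the standard error is needed: swapping `statistics.mean` for `statistics.median` gives an interval for the median with no other changes.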

Multi-armed bandits for A/B testing make a lot of intuitive sense.  But after running some Monte Carlo simulations on different bandit algorithms, I found that performance better than chance is not guaranteed unless the differences in clickthrough rates between A and B are fairly large (e.g. 2X).
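Here is a rough sketch of the kind of Monte Carlo check described above.  It assumes an epsilon-greedy bandit, which is only one of several possible algorithms and not necessarily one of those I ran; the function names, clickthrough rates, visitor counts and epsilon are all invented for illustration.

```python
import random


def epsilon_greedy_run(true_rates, n_visitors, epsilon, rng):
    """Simulate one test; return the index of the arm with the best observed rate."""
    n_arms = len(true_rates)
    shows = [0] * n_arms
    clicks = [0] * n_arms
    for _ in range(n_visitors):
        if rng.random() < epsilon or 0 in shows:
            # Explore: pick an arm at random (also used until every arm has been shown).
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best observed clickthrough rate so far.
            arm = max(range(n_arms), key=lambda i: clicks[i] / shows[i])
        shows[arm] += 1
        clicks[arm] += rng.random() < true_rates[arm]  # simulated click (True counts as 1)
    return max(range(n_arms), key=lambda i: clicks[i] / shows[i] if shows[i] else 0.0)


def fraction_correct(true_rates, n_visitors=1000, epsilon=0.1, n_runs=200, seed=0):
    """Monte Carlo estimate of how often the bandit ends up favoring the truly best arm."""
    rng = random.Random(seed)
    best = max(range(len(true_rates)), key=lambda i: true_rates[i])
    wins = sum(
        epsilon_greedy_run(true_rates, n_visitors, epsilon, rng) == best
        for _ in range(n_runs)
    )
    return wins / n_runs


# Hypothetical clickthrough rates: a small gap versus a 2X gap.
for rates in [(0.050, 0.055), (0.050, 0.100)]:
    print(rates, fraction_correct(rates))
```

Comparing the two printed fractions against 0.5 (a coin flip between two arms) is a quick way to see how much the gap between A and B matters for a given number of visitors.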

Along with the stats, the hands-on exercises are making us all stronger in Python, NumPy, pandas and Matplotlib.  Feels good to be able to start wielding the data scientist toolkit with more confidence.
