Confidence Intervals

I’m writing this blog post atop a hill overlooking the amazing Pacific Ocean, watching deep blue waves crash onto the sand below and shatter into white foam. I love Southern California!

I have been browsing through data charts from Google’s Public Data Explorer this afternoon, thanks to one of my coworkers who stumbled upon it last week. It is quite a nice resource, hosting links and visualizations for many public databases, from the U.S. Bureau of Labor Statistics to the World Development Indicators to IFs forecasts. There are lots of good stats and data visualizations built with Google’s chart tools. I highly recommend it and hope to pass this site on to all my blog readers.

This week, my statistics class covered multiple regression and confidence intervals. It is pretty amazing how in business we often just use point estimates from sample statistics, when confidence intervals would convey how much uncertainty surrounds the numbers we report. But as the professor pointed out in the lecture, confidence intervals can sometimes be embarrassingly wide due to the nature of sample statistics. I plotted scatterplots with confidence intervals, compared a few regression models, and learned to examine which types of models provide more probabilistically accurate data analysis. I’m definitely interested in applying this in my future data analysis work.
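The class does all of this in R, but the basic idea of a confidence interval is easy to sketch in a few lines of Python (the sample numbers below are made up): report a point estimate plus or minus a margin of error built from the standard error, rather than the point estimate alone.

```python
import math
import statistics

def mean_ci(sample, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    n = len(sample)
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    return (m - z * se, m + z * se)

# A made-up sample of measurements.
data = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9]
low, high = mean_ci(data)
print(f"point estimate: {statistics.mean(data):.2f}")
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

The interval, not the single number, is what tells you how seriously to take the estimate: a wide interval from a small sample is the “embarrassingly large” case the professor warned about.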

Statistical Significance

My online stats class last week reviewed regression, Null Hypothesis Significance Testing (NHST), and the flaws and remedies of using the NHST method to determine statistical significance. Seeing the Type I and Type II error table definitely brought back memories of the p-value knowledge I learned in my college research class. I remember at the time I had to calculate t-statistics and p-values for several research projects. But the R program introduced in this online stats class totally blew my mind. It is much more powerful and flexible for performing statistical analysis than any program I used back in school. Though R is most popular in academia, I hear it is being used in industry more and more. In this online class, we have been taught to tinker with R to make scatterplots, run regressions, and do other fun things.

Andrew Conway, the Princeton professor who teaches my online stats class, mentioned that if students take away just one lesson from this whole course, he hopes that it is the correct notion of the P-value. What is P? It is the probability of observing data this extreme, or more extreme, given that the null hypothesis is true. The significance threshold, by contrast, is an arbitrary cutoff chosen to define statistical significance: P<.05 is considered standard in academia, while the professional world may take a more liberal threshold of up to P<.1 for research and AB tests conducted in business.
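That definition can be made concrete with a little simulation, here a Python sketch using a made-up example: suppose we observed 60 heads in 100 coin flips. Simulate the null hypothesis (a fair coin) many times and ask how often a result at least that extreme turns up.

```python
import random

random.seed(0)

observed_heads = 60            # hypothetical observation: 60 heads in 100 flips
n_flips, n_sims = 100, 20_000

# Under the null (fair coin), count simulations at least as extreme
# as what we observed, in either direction (two-sided).
extreme = 0
for _ in range(n_sims):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    if abs(heads - 50) >= abs(observed_heads - 50):
        extreme += 1

p_value = extreme / n_sims
print(f"simulated two-sided p-value: {p_value:.3f}")
```

The simulated p-value lands around .06: not significant at the academic .05 cutoff, but it would pass the looser .1 threshold sometimes used in business, which is exactly why the cutoff itself is the arbitrary part.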

In my digital product development job, we constantly need to check the statistical significance of our product tests and AB tests. Oftentimes we see results clearly trending one way but without reaching statistical significance. This is one of the most confounding situations in business, as I’m sure it is in academia and other settings as well. While it is very tempting to act on results that lack statistical significance, it is extremely important to recognize the randomness embedded in them and refrain from jumping to premature conclusions based on these numbers.
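As a rough Python sketch of what “trending but not significant” looks like, here is a pooled two-proportion z-test on made-up A/B numbers, where variant B converts visibly better yet the p-value stays well above .05:

```python
import math

def two_proportion_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates (pooled z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF via the error function.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical test: B converts at 5.6% vs A's 5.0% on 4,000 users each.
p = two_proportion_pvalue(conv_a=200, n_a=4000, conv_b=225, n_b=4000)
print(f"p-value: {p:.3f}")
```

A p-value around .2 means a difference this large would show up fairly often under pure chance, so declaring B the winner here would be exactly the premature conclusion to avoid.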

It is also helpful to recognize the inherent flaws of the NHST method, like how sample size alone can determine whether a result reaches statistical significance. In a business environment, we have developed other methods alongside NHST, like diving deep into the consumer behavior funnel, to gain a more comprehensive picture. By taking a statistical perspective instead of relying on pure gut feelings, being mindful of the method’s flaws, and gathering more direct user behavior data, we can shrink the blind spots in our knowledge and paint a more holistic picture for our decision-making process.
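Here is a small Python illustration of that sample-size flaw, again with made-up conversion numbers: the exact same one-point lift is nowhere near significant with 1,000 users per arm, yet overwhelmingly significant with 100,000.

```python
import math

def pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test p-value for a difference in proportions."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# The same 5% -> 6% lift at two different sample sizes:
small = pvalue(conv_a=50, n_a=1_000, conv_b=60, n_b=1_000)
large = pvalue(conv_a=5_000, n_a=100_000, conv_b=6_000, n_b=100_000)
print(f"n=1,000 per arm:   p = {small:.3f}")
print(f"n=100,000 per arm: p = {large:.3g}")
```

Significance here says as much about how many users we collected as about whether the lift matters to the business, which is why pairing NHST with funnel analysis and effect-size judgment is so useful.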

Probabilistic Thinking

Last week, I finished Nate Silver’s book The Signal and the Noise. In the final chapter, he used research and anecdotes from the US counter-terrorism field to make the point that when forecasting future events, we have to be mindful of the “unknown unknowns”. The signals missed, or not assigned enough importance, ahead of historical attacks like WWII’s Pearl Harbor and 9/11 were unknown unknowns: the events were beyond the scope of imagination, so no probability was ever assigned to them. Had national security intelligence considered these events possible, the many useful signals wouldn’t have passed by unattended; instead they would have been used to update the probability of an attack, perhaps leading to different outcomes. Adopting probabilistic thinking, and recognizing that there are unknown unknowns, is helpful in every field of forecasting.

Homeland Memes

Silver’s take also reminds me of the portrayal of counter-terrorism in the hit drama Homeland (spoilers ahead). The show’s protagonist, CIA analyst Carrie Mathison, is often portrayed as a loose cannon who pursues leads that counter the conventional wisdom or are regarded as impossible by her peers. For example, she is the only person in the agency who even suspects that returning war hero Sgt. Brody could have been turned by terrorists, based simply on a piece of intel she gathered in the field. Carrie refuses to discard her far-fetched theory and eventually builds enough evidence to demonstrate a realistic probability that Brody really has been turned. For drama’s sake, Homeland often shows Carrie as overly emotional due to her bipolar disorder, and characterizes her findings as “gut feelings” by quickly panning over the huge amount of data she has collected and the maps on her wall linking together seemingly random events. But as an analyst with 14 years of rich field experience and so much data in hand, her knowledge of the subject exceeds that of her peers, which shrinks the unknown unknowns in her knowledge scope and helps her perceive unlikely events as possible.

In statistics, measures of central tendency get the most attention, but the variance in each data set, and the outliers, the whales, are just as important and shouldn’t be ignored. These factors should be properly considered and weighed in their context, and doing so helps reduce the number of unknown unknowns for us.
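A tiny Python example, with made-up per-customer spend, of why the whales matter: a single outlier drags the mean far away from the typical value, while the median barely moves, and the spread is what flags that something unusual is in the data.

```python
import statistics

# Hypothetical per-customer spend with one "whale" at the end.
spend = [20, 25, 22, 30, 18, 27, 24, 21, 26, 5000]

print(f"mean:   {statistics.mean(spend):.1f}")
print(f"median: {statistics.median(spend):.1f}")
print(f"stdev:  {statistics.stdev(spend):.1f}")
```

Dropping the whale as noise, or reporting only the mean, would both hide a real customer who deserves a closer look in context.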