The Power of Statistics
--
Statistics is a branch of mathematics that is often overlooked. When the word “statistics” comes to mind, the average person may think of an MLB player’s batting average or an NBA player’s field goal percentage. In reality, statistics offers much more than the percentage of shots a player made over a season. It has a power that few other branches of mathematics share: the power to draw well-supported conclusions from data.
The branch of statistics that allows us to make estimates and decisions is called inferential statistics. The two main methods used in inferential statistics are confidence intervals and hypothesis tests. Given sample data from a population of interest, we can use these methods to draw conclusions about that population!
Confidence Intervals
Let’s say that a large public university wants to estimate the average SAT score of all of its admitted students. Because the school is large, the administrators do not have time to ask every student for their score. Instead, they randomly select 100 students and calculate the sample mean, which is then used to approximate the mean SAT score for the entire university.
The sample mean is only a single value, so the chance that it exactly equals the population mean is extremely small. In addition, if we randomly selected another 100 students and calculated their sample mean, the two sample means would very likely differ. The confidence interval addresses this by using the sample data to create a range of values that is likely to contain the actual population mean.
There are different formulas for confidence intervals depending on the sample size, whether one or two samples are being compared, and whether the samples contain proportions or means. For this particular example, the upper bound is found by adding a margin to the sample mean $\bar{x}$: the critical value $z$ for the chosen confidence level (about 1.96 for 95% confidence) times the sample standard deviation $s$ divided by the square root of the sample size $n$. The lower bound is found by subtracting that same margin from the sample mean. The resulting confidence interval equation is then:
$$\bar{x} \pm z \cdot \frac{s}{\sqrt{n}}$$
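To make the calculation concrete (sample mean plus or minus z times the sample standard deviation over the square root of the sample size), here is a minimal Python sketch. The function name `z_confidence_interval` and the simulated scores are assumptions made purely for illustration; they are not real admissions data, and the 95% confidence level is just a common default.

```python
import math
import random
import statistics
from statistics import NormalDist


def z_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds for the population mean.

    Uses the large-sample formula: mean +/- z * s / sqrt(n).
    """
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    # Two-sided critical value, e.g. about 1.96 when confidence is 0.95
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / math.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical data: 100 simulated SAT scores, clamped to the 400-1600 range.
rng = random.Random(0)
scores = [min(1600, max(400, rng.gauss(1200, 150))) for _ in range(100)]

lower, upper = z_confidence_interval(scores)
print(f"sample mean: {statistics.mean(scores):.1f}")
print(f"95% confidence interval: ({lower:.1f}, {upper:.1f})")
```

Changing the seed passed to `random.Random` draws a different set of 100 students, which gives a different sample mean and a slightly different interval. That variation from sample to sample is exactly what the interval is meant to account for.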
As mentioned before, this effectively creates a range of values in which the actual population mean is likely…