As I was watching this game, I wondered:
"Which team has the better offense?"
To answer this question, I collected the yards gained on every play by the two teams, the Steelers and the Cardinals, in this Super Bowl. I created a data file with two variables, Yards and Team, and entered the data into R.
Using R, I created the following boxplot.
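The exact call is not shown here, so the following is only a minimal sketch of how such a display could be drawn with base R's `boxplot()`, assuming a data frame with the two variables Yards and Team. The data frame name `plays` and all the yardage values are made-up placeholders, not the actual play-by-play from the game.

```r
# Hypothetical yards gained per play (placeholder values, NOT the real game data)
steelers  <- c(-2, 0, 1, 2, 3, 3, 4, 5, 6, 7, 9, 12, 38)
cardinals <- c(-1, 0, 2, 3, 4, 4, 5, 6, 8, 10, 11, 45)

# One row per play, with the two variables described in the text
plays <- data.frame(
  Yards = c(steelers, cardinals),
  Team  = factor(rep(c("Steelers", "Cardinals"),
                     times = c(length(steelers), length(cardinals))))
)

# Parallel boxplots of Yards, one box per Team, drawn horizontally.
# boxplot() invisibly returns the summaries it plotted, so keep them in bp.
bp <- boxplot(Yards ~ Team, data = plays, horizontal = TRUE,
              xlab = "Yards gained", ylab = "")

bp$names  # the group labels, in the order the boxes are drawn
bp$stats  # one column per team: low whisker, hinge, median, hinge, high whisker
```

The formula `Yards ~ Team` asks for one box per level of Team, which puts the two batches on a common scale for comparison.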
What do we see in this boxplot display?
- Both batches of yards gained look a bit right-skewed.
- There are four outliers for the Steelers and two for the Cardinals. These correspond to big plays that gained a lot of yards.
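To see where flagged points like these come from, here is a small sketch of the rule R's boxplot uses by default: a value is marked as an outlier if it lies more than 1.5 times the spread between the hinges beyond either hinge. The helper name `flag_outliers` and the yardage values are hypothetical, so the counts below will not match the four and two outliers seen in the real display.

```r
# Hypothetical yards gained per play (placeholder values, NOT the real game data)
steelers  <- c(-2, 0, 1, 2, 3, 3, 4, 5, 6, 7, 9, 12, 38)
cardinals <- c(-1, 0, 2, 3, 4, 4, 5, 6, 8, 10, 11, 45)

# Return the values lying beyond the 1.5-step fences.
# (flag_outliers is an illustrative helper, not a built-in R function.)
flag_outliers <- function(x) {
  h    <- fivenum(x)            # min, lower hinge, median, upper hinge, max
  step <- 1.5 * (h[4] - h[2])   # 1.5 times the spread between the hinges
  x[x < h[2] - step | x > h[4] + step]
}

flag_outliers(steelers)
flag_outliers(cardinals)

# The built-in boxplot.stats() applies the same rule and reports the flagged values
boxplot.stats(steelers)$out
boxplot.stats(cardinals)$out
```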
Why did I draw this boxplot display? Several comments:
- We wish to make a comparison between the two batches.
- What is a comparison? Well, we could say that one batch tends to be larger than the second batch. For this example, if it were true, then I would say that the Cardinals gained more yards (per play) than the Steelers.
- But that isn't really saying much. When I say "make a comparison", I mean that I want to say one batch is larger or smaller than the second batch by a particular amount.
- When can we say
"batch 1 is 10 larger than batch 2"?
- As you'll read in the notes, we can only make this type of comparison ("batch 1 is 10 larger than batch 2") when the two batches have similar spreads.
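The check described in these comments can be sketched in a few lines of R: measure each batch's spread, see whether the spreads are similar, and only then summarize the comparison as a single difference in medians. The helper `fourth_spread` and the yardage values are hypothetical placeholders, and how close the spreads must be to count as "similar" is a judgment call that this sketch leaves to the reader.

```r
# Hypothetical yards gained per play (placeholder values, NOT the real game data)
steelers  <- c(-2, 0, 1, 2, 3, 3, 4, 5, 6, 7, 9, 12, 38)
cardinals <- c(-1, 0, 2, 3, 4, 4, 5, 6, 8, 10, 11, 45)

# Spread of a batch: distance between the hinges (the width of the box).
# (fourth_spread is an illustrative helper, not a built-in R function.)
fourth_spread <- function(x) {
  h <- fivenum(x)
  h[4] - h[2]
}

spreads <- c(Steelers  = fourth_spread(steelers),
             Cardinals = fourth_spread(cardinals))
medians <- c(Steelers  = median(steelers),
             Cardinals = median(cardinals))

spreads                       # are the two boxes about the same width?
max(spreads) / min(spreads)   # ratio of the larger spread to the smaller

# If the spreads look similar, a single number summarizes the comparison:
# how much larger one batch is than the other, measured at the medians.
median_diff <- unname(medians["Cardinals"] - medians["Steelers"])
median_diff
```

A ratio near 1 means the boxes have about the same width, which is the situation where a statement of the form "batch 1 is 10 larger than batch 2" makes sense.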
Returning to my football example, it doesn't appear that the two teams were very different with respect to yards gained per play. Both teams had approximately the same median.