Sunday, September 14, 2008

Bins in a histogram and looking ahead

Hi EDA folks:

I finished grading your Fathom assignment on the number of bins.  Generally, you all did well on this, but there are a couple of things I should mention.

1.  The moral of this assignment is that as you have more data (bigger n), you should use a smaller bin width and therefore more bins.  It seemed that the histograms you judged best by eye were similar to the ones chosen by the "optimal rule" formula.

2.  I think the rule wasn't that effective for constructing a histogram of the Old Faithful data.  With only a small number of bins, you couldn't see any structure within each of the two humps.

3.  If you lost points, it was probably due to some confusion in your calculations or an answer that wasn't the best one for a question -- like the one about the histogram of the Old Faithful data.  If you don't know why you lost points, just email me.
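To make point 1 concrete, here is a rough sketch in Python (for illustration only; it is not the Fathom work from the assignment). I'm assuming a Freedman-Diaconis style rule, bin width = 2 * IQR * n^(-1/3), as a stand-in for the "optimal rule"; the formula used in class may differ. The function names and the simulated normal batches are made up for this example.

```python
import math
import random
import statistics


def fd_bin_width(data):
    """Freedman-Diaconis bin width: 2 * IQR * n^(-1/3).

    Assumption: this stands in for the course's "optimal rule".
    """
    q1, _, q3 = statistics.quantiles(data, n=4)  # lower quartile, median, upper quartile
    return 2 * (q3 - q1) * len(data) ** (-1 / 3)


def number_of_bins(data):
    """Number of bins of the suggested width needed to cover the range of the data."""
    return math.ceil((max(data) - min(data)) / fd_bin_width(data))


# Simulated (made-up) normal batches of increasing size.
random.seed(1)
results = {}
for n in (50, 500, 5000):
    batch = [random.gauss(0, 1) for _ in range(n)]
    results[n] = (fd_bin_width(batch), number_of_bins(batch))
    print(n, round(results[n][0], 3), results[n][1])
```

As n grows, the suggested bin width shrinks and the number of bins goes up, which is the pattern you should have seen in your own histograms.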

Looking ahead, the next assignment is on EFFECTIVE COMPARISON.  You'll learn a specific method for equalizing spreads between batches.  Although you might understand the mechanics of the method (spread vs. level plot, reexpressing, etc.), it is important not to lose sight of what we are trying to accomplish:  we want to make a reasonable comparison between groups.

So, when you do your homework this week, don't forget to think about the BIG PICTURE.  Conclude your work by making a comparison.
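If it helps to see the spread vs. level idea in miniature, here is a rough Python sketch (again for illustration only; it is not the R code in the "Chapter 3 work" file). It follows the usual recipe as I understand it: for each batch, plot log(fourth-spread) against log(median), fit a line with slope b, and reexpress with power p = 1 - b, where p = 0 means take logs. The batches and function names below are made up for the example.

```python
import math
import statistics


def level_and_spread(batch):
    """Return (log10 median, log10 fourth-spread) for one batch of positive values."""
    q1, med, q3 = statistics.quantiles(batch, n=4)
    return math.log10(med), math.log10(q3 - q1)


def suggested_power(batches):
    """Least-squares slope b of log spread on log level, returned as p = 1 - b."""
    points = [level_and_spread(b) for b in batches]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x in xs))
    return 1 - slope


# Made-up batches in which the spread grows in proportion to the level.
base = [3, 4, 5, 6, 7, 8, 9, 10]
batches = [[scale * v for v in base] for scale in (1, 10, 100)]
power = suggested_power(batches)
print(round(power, 3))
```

Here the spread grows in proportion to the level, so the slope is about 1 and the suggested power is about 0, which points to a log reexpression. The plot and the power are only tools, though; the job isn't done until you compare the reexpressed batches.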

Last, we'll be using some new R commands.  Don't forget to look at the "Chapter 3 work" file that illustrates the use of these new commands.
