Tuesday, September 16, 2008

Working with subgroups in R

Since we are comparing groups in EDA, I thought I would give some guidance on how to subset data in R.

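Suppose we want to construct stemplots for the areas of the islands in each continent in the homework. Here is some R work for constructing a stemplot of the areas of the islands in the Arctic Ocean. The key command is "subset": subset(Area, Ocean=="Arctic") keeps only the values of Area for which the condition Ocean=="Arctic" is true.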

By the way, I don't think there is a simple way of constructing parallel stemplots in R.

> data(island.areas)
> names(island.areas)
[1] "Ocean" "Name"  "Area" 
> attach(island.areas)
> arctic.areas=subset(Area,Ocean=="Arctic")
> arctic.areas
 [1]  16671 195928  27038   6194  21331  75767  16274  12872   9570  15913  83896
[12]   8000  35000   2800  23940
> library(aplpack)
> stem.leaf(arctic.areas)
1 | 2: represents 12000
 leaf unit: 1000
            n: 15
   1    0* | 2
   4    0. | 689
   5    1* | 2
  (3)   1. | 566
   7    2* | 13
   5    2. | 7
        3* | 
   4    3. | 5
HI: 75767 83896 195928
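
If you want to do the same thing for every group at once, here is a rough sketch of one way to go about it. It is not the homework solution: I am assuming you may not have island.areas loaded, so it uses the built-in chickwts data (columns weight and feed) as a stand-in, and it uses base R's stem() rather than stem.leaf() so no extra package is needed. The object names (casein.weights, weights.by.feed) are just made up for illustration.

```r
# Stand-in data (an assumption for this sketch): chickwts ships with R
# and has a numeric column "weight" and a grouping factor "feed".
data(chickwts)

# Subset one group without attach(): name the data frame explicitly.
casein.weights <- subset(chickwts, feed == "casein")$weight

# The same subset written with logical indexing.
casein.weights2 <- chickwts$weight[chickwts$feed == "casein"]

# split() cuts the measurements into a list with one vector per group.
weights.by.feed <- split(chickwts$weight, chickwts$feed)

# Print one stemplot per group, one after another (not side by side).
for (g in names(weights.by.feed)) {
  cat("\nFeed:", g, "\n")
  stem(weights.by.feed[[g]])
}
```

With the island data, the analogous call would presumably be split(island.areas$Area, island.areas$Ocean). The stemplots come out one after another, so the scales may differ from group to group and you have to take that into account when comparing them.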
