Introduction to R

January 7, 2013

1) How do you read a CSV file in R?

data<-read.csv(filename,header=TRUE)
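As a minimal sketch of the call above, the snippet below writes a tiny CSV to a temporary file and reads it back. The file contents and the names csv_path and sample_data are made up for illustration; they are not from the original data set.

```r
# Write a small CSV to a temporary file so the example is self-contained.
# The column names and values here are hypothetical.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Ozone,Temp", "41,67", "NA,72", "12,74"), csv_path)

# header = TRUE treats the first line as column names
# (this is already the default for read.csv).
sample_data <- read.csv(csv_path, header = TRUE)

# The literal string NA in the file is read as a missing value.
str(sample_data)
```

read.csv returns a data frame, so everything in the later questions (head, colMeans, subset) applies to its result directly.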

2) How do you display the first n rows of the data?

head(data,n)

The default value of n is 6.

3) How do you display the last n rows of the data?

tail(data,n)

As with head, the default value of n is 6.
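A short sketch of head and tail in use. The column names in this post match R's built-in airquality data set, so it is used here as a stand-in for data; that choice is an assumption, and any data frame would work the same way.

```r
# Assumption: airquality (shipped with R) stands in for the CSV used in this post.
data <- airquality

first_six   <- head(data)      # n is omitted, so the default of 6 applies
first_three <- head(data, 3)   # first 3 rows
last_three  <- tail(data, 3)   # last 3 rows

print(first_three)
print(last_three)
```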

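4) How do you count the missing values in each column of the data set?

colSums(is.na(data))

is.na(data) marks every missing entry as TRUE, and colSums then counts the TRUE values in each column. Calling colSums(data) on its own would sum the values instead of counting the missing ones.

Other functions that can be used for this purpose are sapply and apply.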
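The three approaches mentioned here can be compared side by side. This is a sketch that again assumes the built-in airquality data set as a stand-in for data; the variable names are illustrative.

```r
# Assumption: airquality (shipped with R) stands in for the CSV used in this post.
data <- airquality

# is.na() gives a logical matrix; colSums() counts the TRUEs per column.
na_by_colsums <- colSums(is.na(data))

# sapply() visits each column of the data frame in turn.
na_by_sapply <- sapply(data, function(col) sum(is.na(col)))

# apply() with MARGIN = 2 also works column by column,
# after converting the data frame to a matrix.
na_by_apply <- apply(data, 2, function(col) sum(is.na(col)))

print(na_by_colsums)
```

All three return a named vector with one count per column. colSums(is.na(data)) is the shortest, while sapply avoids the matrix conversion that apply performs.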

5) How do you calculate the mean of a column, ignoring the missing values?

colMeans(data,na.rm=TRUE)
     Ozone    Solar.R       Wind       Temp      Month        Day 
 42.129310 185.931507   9.957516  77.882353   6.993464  15.803922 

Without na.rm=TRUE, any column that contains missing values has a mean of NA:

colMeans(data)
    Ozone   Solar.R      Wind      Temp     Month       Day 
       NA        NA  9.957516 77.882353  6.993464 15.803922 

To get the mean of a single column, select that column first:

colMeans(data["Ozone"],na.rm=TRUE)
   Ozone 
42.12931 
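For a single column, mean on the column vector is a common alternative to colMeans. The sketch below assumes the built-in airquality data set as a stand-in for data; the variable names are illustrative.

```r
# Assumption: airquality (shipped with R) stands in for the CSV used in this post.
data <- airquality

# Without na.rm, one missing value is enough to make the result NA.
ozone_mean_raw <- mean(data$Ozone)

# na.rm = TRUE drops the missing values before averaging.
ozone_mean <- mean(data$Ozone, na.rm = TRUE)

# The same call applied to every column, equivalent to colMeans(data, na.rm = TRUE).
all_means <- sapply(data, mean, na.rm = TRUE)

print(ozone_mean_raw)
print(ozone_mean)
```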

6) Extract the subset of rows of the data frame where Ozone values are above 31 and Temp values are above 90. What is the mean of Solar.R in this subset?

colMeans(subset(data,(Ozone>31 & Temp>90)))
  Ozone Solar.R    Wind    Temp   Month     Day 
   89.5   212.8     5.6    93.4     8.2    14.5 

The mean of Solar.R in this subset is 212.8.
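The same filter can be written step by step, which makes it easier to pull out just the Solar.R mean. This sketch assumes the built-in airquality data set as a stand-in for data; hot_days and solar_mean are illustrative names.

```r
# Assumption: airquality (shipped with R) stands in for the CSV used in this post.
data <- airquality

# subset() keeps rows where the condition is TRUE; rows where the
# condition evaluates to NA (e.g. missing Ozone) are dropped.
hot_days <- subset(data, Ozone > 31 & Temp > 90)

# Mean of one column of the subset.
solar_mean <- mean(hot_days$Solar.R)

# Equivalent logical indexing; which() discards the NA comparisons.
hot_days_idx <- data[which(data$Ozone > 31 & data$Temp > 90), ]

print(solar_mean)
```

Note that a single & is used here: it compares the two conditions element by element, whereas && is meant for single TRUE/FALSE values.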

Additional info on subset is available on its help page (?subset).

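7) How do you find the mean temperature in month n?

colMeans(subset(data,Month==n))

For n = 6, the output is:

    Ozone   Solar.R      Wind      Temp     Month       Day 
       NA 190.16667  10.26667  79.10000   6.00000  15.50000 

The Temp entry, 79.1, is the mean temperature for that month. Ozone is NA because the column has missing values in this month and na.rm=TRUE was not passed.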
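To get only the temperature, or the mean for every month at once, something like the following sketch could be used. It assumes the built-in airquality data set as a stand-in for data and n = 6 as the example month; the variable names are illustrative.

```r
# Assumption: airquality (shipped with R) stands in for the CSV used in this post.
data <- airquality
n <- 6  # example month

# Mean of Temp for the rows where Month equals n.
temp_mean_n <- mean(data$Temp[data$Month == n])

# Mean of Temp for each month in one call; the result is named by month.
temp_by_month <- tapply(data$Temp, data$Month, mean)

print(temp_mean_n)
print(temp_by_month)
```

tapply groups the first argument by the second and applies the function to each group, so it saves repeating the subset call once per month.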

Additional Resources:
1) Filling in NAs with column medians in R

2) The apply function and its variants
