An Introduction to R: Examples for Actuaries
Nigel De Silva (nigel.desilva@thomasmiller.com)
2.5 Getting data into R
There are various ways of getting data into R. For small datasets, the command line might be the most efficient method:
> x <- c(7.82,8.00,7.95)
> x
[1] 7.82 8.00 7.95
> x <- scan() # A blank line indicates the end of the data
1: 7.82
2: 8.00
3: 7.95
4:
Read 3 items
> x
[1] 7.82 8.00 7.95
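The interactive call to scan above cannot be rerun from a script. As a minimal sketch, the same values can be supplied through the text argument of scan, which parses a character string instead of waiting for keyboard input. The name x_text is chosen here purely for illustration.

```r
# Parse whitespace-separated numbers from a string rather than the keyboard
x_text <- scan(text = "7.82 8.00 7.95")
x_text
```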
For larger datasets, it is better to import the data from a file. R can read data stored in text (ASCII) files. It can also read files in other formats (Excel, SAS, SPSS, etc), and access SQL-type databases, but the functions needed for this are not in the package base. These functionalities are very useful for a more advanced use of R, but are beyond the scope of this introduction. You may wish to read this link for further information on connecting to databases.
Two useful functions for reading data from files are scan and read.table. There are some standard variations of the read.table function, which are summarised in the help file.
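To illustrate how the variants relate to read.table, the sketch below writes a small temporary file and reads it back in three ways. The file names and the objects weld_a, weld_b and weld_c are assumptions made for this example only; the three rows are copied from the weld data used in this section.

```r
# Hypothetical file: a temporary CSV holding the first three rows of weld
csv_file <- tempfile(fileext = ".csv")
writeLines(c("x,y", "7.82,3.4", "8.00,3.5", "7.95,3.3"), csv_file)

# read.table has to be told about the header row and the separator
weld_a <- read.table(csv_file, header = TRUE, sep = ",")

# read.csv uses those settings by default for comma separated files
weld_b <- read.csv(csv_file)

# The same data saved with tabs between fields, read with read.delim
tab_file <- tempfile(fileext = ".txt")
write.table(weld_a, tab_file, sep = "\t", row.names = FALSE, quote = FALSE)
weld_c <- read.delim(tab_file)

# Compare the three results
all.equal(weld_a, weld_b)
all.equal(weld_a, weld_c)
```

For a simple numeric file like this one, the variants differ only in their default arguments, so the choice is mostly a matter of convenience.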
I generally use the read.csv variant. Comma separated values (or csv) files can be created from any spreadsheet or database application. An example of this is included below.
Note: A character preceded by a backslash ("\") is treated as a special character in R strings. For example, "\n" designates a new line, "\t" designates a tab, etc. For a backslash in a string to be interpreted literally, it has to be preceded by another backslash. Therefore, a correct Windows path string requires the "\\" seen in the command below.
> weld <- read.csv("C:\\R\\weld.csv")
> weld
x y
1 7.82 3.4
2 8.00 3.5
3 7.95 3.3
4 8.07 3.9
5 8.08 3.9
6 8.01 4.1
7 8.33 4.6
8 8.34 4.3
9 8.32 4.5
10 8.64 4.9
11 8.61 4.9
12 8.57 5.1
13 9.01 5.5
14 8.97 5.5
15 9.05 5.6
16 9.23 5.9
17 9.24 5.8
18 9.24 6.1
19 9.61 6.3
20 9.60 6.4
21 9.61 6.2
> class(weld)
[1] "data.frame"
The data has been read into a data frame. The rows have been automatically named by their number. However, descriptive names could be used instead.
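As a small sketch of descriptive row names, the example below rebuilds the first three rows of the data by hand and assigns labels with rownames. The object weld3 and the labels themselves are hypothetical and do not come from the original file.

```r
# First three rows of the weld data, entered directly
weld3 <- data.frame(x = c(7.82, 8.00, 7.95), y = c(3.4, 3.5, 3.3))

# Replace the automatic row numbers with descriptive (hypothetical) labels
rownames(weld3) <- c("sample_a", "sample_b", "sample_c")

# Rows can now be selected by name
weld3["sample_b", ]
```

read.csv also accepts a row.names argument, so names held in a column of the file can be applied at import time.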
R tries to determine the class of each data column. Sometimes this requires fine tuning via the colClasses argument to read.csv. Further details are available in the help file. Alternatively, conversions can be done afterwards, via functions such as as.character.
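The following sketch shows both approaches on a small temporary file: setting colClasses at import, and converting a column afterwards. The file and the object names weld_guess and weld_typed are assumptions for illustration; reading x as character is chosen only to make the effect visible.

```r
# Hypothetical file: a temporary CSV holding the first three rows of weld
tmp <- tempfile(fileext = ".csv")
writeLines(c("x,y", "7.82,3.4", "8.00,3.5", "7.95,3.3"), tmp)

# Default behaviour: R guesses the class of each column
weld_guess <- read.csv(tmp)
sapply(weld_guess, class)

# Force x to be read as character and y as numeric
weld_typed <- read.csv(tmp, colClasses = c("character", "numeric"))
sapply(weld_typed, class)

# Alternatively, convert a column after the data has been imported
weld_guess$x <- as.character(weld_guess$x)
class(weld_guess$x)
```

Note that the two routes are not always equivalent: with colClasses the text is kept exactly as it appears in the file, whereas a conversion after import works on values that R has already interpreted as numbers.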
Add-on packages (e.g. foreign or gdata) allow the user to read data from various file formats native to other packages, such as Excel, SAS, Stata, etc.