Tested on R versions 3.0.X through 3.3.1
Last update: 15 August 2016
Elements of Module #3 - Data Management in R - include:
Data come in many different scales: nominal, ordinal, and interval/ratio. In R, data of each scale are stored as a particular class, such as integer or numeric (floating point), or character, among others. Analyses in R depend on knowing how the scale of your data relates to its R data class, and converting data from one class to another is often necessary before an analysis can proceed.
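As a minimal sketch of checking and changing class, the example below uses small made-up vectors (the object names and values are assumptions for illustration, not data from this Module). It inspects each class with class() and converts digits stored as character to numeric with as.numeric().

```r
# Hypothetical example values; none come from the Module's data sets
count  <- c(3L, 7L, 12L)        # integer
height <- c(1.5, 2.25, 3.0)     # numeric (floating point)
code   <- c("12", "7", "30")    # character that happens to hold digits

# Inspect the class of each object
class(count)
class(height)
class(code)

# Arithmetic on character data fails, so convert to numeric first
code_num <- as.numeric(code)
class(code_num)
mean(code_num)
```

Checking class() before an analysis is a cheap way to catch numbers that were read in as text.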
Data management also ensures proper assignment of data class, such as factor (e.g., sex) with levels (male, female), or character (e.g., spp126 as a code for Juniperus monosperma), representing logical interpretative groupings of the data. Again, the assignment of proper data class is fundamental to analyses in R.
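The distinction between factor and character can be sketched as follows. The values are hypothetical, and spp065 is a made-up code added only so the vector holds more than one value.

```r
# Hypothetical observations of sex, stored first as plain character
sex <- c("male", "female", "female", "male")

# Assign the factor class with an explicit level order
sex_f <- factor(sex, levels = c("male", "female"))
class(sex_f)
levels(sex_f)
table(sex_f)     # counts per level

# Species codes kept as character; spp065 is an invented code
spp <- c("spp126", "spp126", "spp065")
class(spp)
```

Setting levels explicitly, rather than relying on the default alphabetical order, keeps the grouping order predictable in tables and plots.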
Understanding the data input source (e.g., MS Excel, extraction from a GIS), and how each external data source handles data scale and class (e.g., how it codes missing values), also affects R analyses.
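One way a source's missing-value convention can affect class is sketched below. The inline text stands in for an exported file, and the -9999 code and column names are assumptions for illustration; a real source may use a different convention.

```r
# Hypothetical export in which missing values are coded as -9999 or left blank
raw <- "site,elev,cover
A,1520,35
B,-9999,42
C,1810,"

# Tell R which strings mean "missing" so they become NA on import
dat <- read.csv(text = raw,
                na.strings = c("NA", "-9999", ""),
                stringsAsFactors = FALSE)

sapply(dat, class)      # class assigned to each column
colSums(is.na(dat))     # number of missing values per column
```

Without the na.strings argument, -9999 would be read as a valid elevation and silently distort any summary of that column.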
A common data management issue, for example, is how best to import and standardize data for analysis in R from different sources. In addition, it is often necessary to “reshape” data, such as transposing rows and columns.
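A minimal sketch of both tasks, assuming two hypothetical sources that record the same variables under different column names: the names are standardized, the rows are combined with rbind(), and the result is transposed with t().

```r
# Two hypothetical sources with the same variables but different column names
src1 <- data.frame(Site = c("A", "B"), Elev_m = c(1520, 1810),
                   stringsAsFactors = FALSE)
src2 <- data.frame(site = "C", elevation = 2100,
                   stringsAsFactors = FALSE)

# Standardize column names, then stack the rows
names(src1) <- c("site", "elev")
names(src2) <- c("site", "elev")
both <- rbind(src1, src2)

# Transpose: sites move from rows to columns
m <- as.matrix(both["elev"])
rownames(m) <- both$site
t(m)
```

Note that t() returns a matrix, so it is best applied to columns of a single class; transposing mixed numeric and character columns would coerce everything to character.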
Aspects of data management are explored in the eight elements of this Module.