

Types of Variables

For all types of variables (vectors), you may use the c() command to "concatenate" elements into a vector, the : operator to generate a sequence of integer values, the seq() command to generate a sequence of non-integer values, or the rep() function to repeat a value to a specified length. In addition, you may use the <- operator to save variables (or any other objects) to the workspace. For example:

> logic <- c(TRUE, FALSE, TRUE, TRUE, TRUE) # Creates `logic' (5 T/F values).
> var1 <- 10:20                             # All integers between 10 and 20.  
> var2 <- seq(from = 5, to = 10, by = 0.5)  # Sequence from 5 to 10 by 
                                            #  intervals of 0.5. 
> var3 <- rep(NA, length.out = 20)          # 20 `NA' values.
> var4 <- c(rep(1, 15), rep(0, 15))         # 15 `1's followed by 15 `0's.
For the seq() command, you may alternatively specify length.out (which R lets you abbreviate to length) instead of by to create a variable with a specific number of evenly spaced elements.
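
As a minimal sketch of this alternative, the lines below request a fixed number of evenly spaced values rather than a step size. The names var5 and var6 are hypothetical and are used only for this illustration.

```r
# var5 and var6 are illustrative names, not objects from the examples above.
var5 <- seq(from = 5, to = 10, length.out = 11)  # 11 evenly spaced values from 5 to 10.
var6 <- seq(from = 5, to = 10, by = 0.5)         # The same sequence, given by step size.

length(var5)            # 11
all.equal(var5, var6)   # TRUE
```

Specifying the number of elements is convenient when the count matters more than the spacing, for example when building a grid of values for a plot.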

  1. Numeric variables are real numbers and the default variable class for most dataset values. You can perform any type of math or logical operation on numeric values. If var1 and var2 are numeric variables, we can compute
    > var3 <- log(var2) - 2*var1      # Create `var3' using math operations.
    
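    Inf (infinity), -Inf (negative infinity), NA (missing value), and NaN (not a number) are special numeric values. Most math operations on them do not produce an ordinary number but propagate the special value: for example, NA + 1 returns NA and Inf - Inf returns NaN. Comparisons involving Inf and -Inf work as usual, whereas comparisons involving NA or NaN return NA. Use as.numeric() to transform variables into numeric variables. Integers are a special class of numeric variable.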

  2. Logical variables contain values of either TRUE or FALSE. R supports the following logical operators: ==, exactly equals; >, greater than; <, less than; >=, greater than or equals; <=, less than or equals; and !=, not equals. The = symbol is not a logical operator. Refer to Section [*] for more detail on logical operators. If var1 and var2 both have n observations, commands such as
    > var3 <- var1 < var2
    > var3 <- var1 == var2
    
    create n TRUE/FALSE observations such that the ith observation in var3 evaluates whether the logical statement is true for the ith value of var1 with respect to the ith value of var2. Logical variables should usually be converted to integer values prior to analysis; use the as.integer() command.

  3. Character variables are sets of text strings. Note that text strings are always enclosed in quotes to denote that the string is a value, not an object in the workspace or an argument for a function (neither of which takes quotes). Variables of class character are not normally used in data analysis, but serve as descriptive fields. If a character variable is used in a statistical operation, it must first be transformed into a factor variable.

  4. Factor variables may contain values consisting of either integers or character strings. Use factor() or as.factor() to convert character or integer variables into factor variables. Factor variables separate unique values into levels. These levels may be either ordered or unordered. In practice, this means that including an unordered factor variable among the explanatory variables is equivalent to creating a dummy variable for each level except one, which serves as the baseline. In addition, some models (ordinal logit, ordinal probit, and multinomial logit) require that the dependent variable be a factor variable.
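
The four classes above can be sketched together in a few lines. This is only an illustration: every object name below (x, a, b, less, party, party.f, rating) is hypothetical and does not come from the examples above, and model.matrix() is used only to make the dummy-variable coding visible.

```r
# All object names below are hypothetical, chosen only for illustration.

# Numeric: special values propagate through arithmetic.
x <- c(1, NA, Inf, -Inf, NaN)
x + 1                    # 2 NA Inf -Inf NaN
is.na(x)                 # FALSE TRUE FALSE FALSE TRUE (NaN also counts as missing)
mean(x[is.finite(x)])    # 1, after dropping the special values

# Logical: element-by-element comparison, then conversion to 0/1.
a <- c(1, 5, 3)
b <- c(2, 5, 1)
less <- a < b            # TRUE FALSE FALSE
as.integer(less)         # 1 0 0

# Character to factor: unique values become levels.
party <- c("left", "right", "center", "left")
party.f <- factor(party)
class(party.f)           # "factor"
nlevels(party.f)         # 3

# An unordered factor in a model is coded as an intercept plus
# one dummy column for each level other than the baseline.
ncol(model.matrix(~ party.f))   # 3

# Ordered factor with an explicit level order.
rating <- factor(c("low", "high", "medium"),
                 levels = c("low", "medium", "high"), ordered = TRUE)
rating[1] < rating[2]    # TRUE, because "low" precedes "high"
```

Passing levels explicitly, as in the last example, avoids relying on the default ordering of levels, which is sorted by value rather than by meaning.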



Gary King 2011-11-29