In this tutorial, you will learn how to do reverse coding in R. Specifically, we will use 1) base R to reverse some variables (e.g., questionnaire items) and 2) we will use the Psych package to do reverse scoring.
Table of Contents
- Reverse scoring in R
- Two Steps of Reverse Coding in R:
- What to Do After Reverse Scoring
- R and Reproducible Code
Reverse scoring in R
Many instruments (i.e., questionnaires) contain items that are phrased so that a strong agreement indicates something negative (e.g., “When there is music in the room, I find it hard to concentrate on reading”). These items need to be reversed so that the data will be correct later for statistical analysis.
For more information on reverse scoring, please see my earlier post: Reverse scoring in Python. Since I was more familiar with Python compared to R, and I had no clue on how to do this in SPSS, I wrote a Python script. The Python script used a function that used Pandas DataFrame and it reversed the scores nicely and quickly.
However, in R was pretty much as simple to do reverse scoring as in Python. In the following script, a data frame is generated with column names (i.e., columnNames,’Q1′ to ‘Q6’) and some data is generated using replicate and sample (100 responses, on the 6 questions).
After that, you will find two methods, that are pretty much the same. More specifically, the methods to reverse code, only differ in how the columns are selected. In the first reverse scoring method, items are selected based on the index of the column. The second recode method, variables are selected based on the column names (might be preferable if you know the names of columns but not the indices).
Importing Data in R
In R, there are numerous ways to load data into a dataframe object. In the example below, we are creating a fake dataframe but, of course, most of the time we have the data stored on a computer hard drive.
- Read the How to read a Excel (xlsx) file in R blog post to learn about reading, and writing, .xlsx files in R.
If we collaborate with other researchers that work with SPSS we may want to read a sav file in R. Other options to import data, might be to read a SAS file. Now, both of these tasks can be done with the Haven package.
Sometimes, of course, we have a dataframe with a lot of variables and we may want to remove a column in R after we have imported the data.
Two Steps of Reverse Coding in R:
How do you reverse scores in R? In the following section, the two steps for reversing scores are put forward.
#Generating data for the DataFrame scores <- as.data.frame(replicate(6, replicate(100, sample(1:5,1)))) columnNames = c('Q1', 'Q2', 'Q3', 'Q4', 'Q5', 'Q6') colnames(scores) <- columnNamesCode language: R (r)
1. Create a Character Vector
First, create a vector in R, containing the column names, of the items/variables you want to be reversed:
# Generate column names for the DataFrame (Question 1; Q1, and so on) columnsToReverse <- c('Q2', 'Q3', 'Q4')Code language: R (r)
2. Reverse the Scores based using the Character Vector
In the second step, we use the column names (i.e., the character vector) to do the actual reversing of the scores in R.
# Reversing scores in columns 'Q2', 'Q3', and 'Q4' reversed.scores[,columnsToReverse] <- 6-scores[,columnsToReverse] # Alternative if we want to use the column indices: reversed.scores <- scores reversed.scores[,c(2,3,4)] <- 6-scores[,c(2,3,4)]Code language: R (r)
Reverse Coding in R using the Psych Package
In this section, of the reverse coding tutorial, you will learn how to use the psych package when reversing the items.
Install the pacakge
First, you need to install the r-package called “psych” and then you are going to use the reverse.code function to switch the coding of some items.
install.packages("psych")Code language: R (r)
In the next section, you will learn how to do reverse coding in R using the reverse.code function from the Psych package.
Reverse-Code Variables in R using reverse.code
Finally, you can use the psych package to reverse the items. First, you import the package, create a numeric vector (keys), and then we use the reverse.code function. We created the numeric vector because of the reverse.code function, uses these indexes to know which columns to reverse (1st, 3rd, and 4th).
require(psych) #Reversing scores in columns 'Q2', 'Q3', and 'Q4' keys <- c(1, -1, -1, -1, 1, 1) new <- reverse.code(keys, df) df[1:3,] new[1:3,]Code language: R (r)
What to Do After Reverse Scoring
Now, you have reversed your questionnaire data using R. The next step might be to clean the data. This may, in some cases, include removing unwanted data (see the blog post how to remove a column in R, for more information). After variables that you don’t need is removed it may be time to carry out some descriptive statistics and create some scatter plots to examine the relationships between some of your variables.
If you are interested in learning about other, useful, functions you can check out how to use the %in% operator in R and how to use the r functions replicate and repeat. Furthermore, if you need to create a sequence of numbers in R you can use the seq() function.
R and Reproducible Code
Although, the code in this post can be found in this R Jupyter Notebook, this may not be the most optimal way to share the R script including analysis and visualization. This is, because, with time R and it’s packages get updated, sometimes a function may change the name, and so on. Thus. it may take time for another researcher to, fully, reproduce the R computational environment that we used in a study.
Luckily, there are tools such as Binder and Code Ocean to help us with this. See the tutorial on how to use Binder, R, and RStudio for reproducible research. That way, other researchers can run the R script, such as how we reversed the scores here, and (hopefully) get the exact same results.
In this post, you have learned how to reverse scores in R using base function as well as the Psych package. The Psych package has it’s advantages because the function (reverse.score) comes with some arguments that you might find useful. Moreover, the package also contains more interesting functions (e.g. for factor analysis) that you might find useful.