In this brief tutorial, you will learn how to add a column to a dataframe in R. More specifically, you will learn 1) to add a column using base R (i.e., by using the $-operator and brackets), 2) to add a column using the add_column() function (i.e., from tibble), 3) to add multiple columns, and 4) to add columns from one dataframe to another.
Note, when adding a column with tibble we are also going to use the %>% operator, which dplyr makes available (it originates in the magrittr package). Besides enabling us to add columns, dplyr and tibble have plenty of useful functions that, for example, make it easy to remove a column by name from the R dataframe.
First, before reading an example data set from an Excel file, you are going to get the answer to a couple of questions. Second, we will have a look at the prerequisites to follow this tutorial. Third, we will have a look at how to add a new column to a dataframe using first base R and then tibble and the add_column() function. In that section, using dplyr and add_column(), we will also have a quick look at how we can add an empty column and how to append a column based on other columns. Furthermore, we are going to learn, in the last two sections, how to insert multiple columns into a dataframe using tibble.
To follow this tutorial, in which we will carry out a simple data manipulation task in R, you only need to install dplyr and tibble if you want to use the add_column() and mutate() functions as well as the %>% operator. However, if you want to read the example data, you will also need to install the readxl package, which provides the read_excel() function.
It may be worth noting that all the mentioned packages are part of the Tidyverse. The Tidyverse comes packed with a lot of tools that can be used for cleaning data and visualizing data (e.g., to create a scatter plot in R with ggplot2).
How do I add a column to a DataFrame in R?
To add a new column to a dataframe in R you can use the $-operator. For example, to add the column “NewColumn”, you can do like this: dataf$NewColumn <- Values. This will add your new variable to your dataset.
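The answer above can be tried out on a small toy dataframe. This is only a minimal sketch: the names `toy`, `id`, and `score` are made up for illustration and are not part of the example data used in this tutorial.

```r
# Toy dataframe; the object and column names are hypothetical
toy <- data.frame(id = 1:3, score = c(10, 20, 30))

# Add a new column with the $-operator (one value per row)
toy$NewColumn <- c("a", "b", "c")

# The new column is appended after the existing columns
names(toy)
```

The vector on the right-hand side supplies one value for each row of the dataframe.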
In the next section, we are going to use the read_excel() function from the readxl package. After this, we are going to use R to add a column to the created dataframe.
Here’s how to read a .xlsx file in R:
# Import readxl (read_excel() lives in readxl, not readr)
library(readxl)
# Read data from .xlsx file
dataf <- read_excel('./SimData/add_column.xlsx')
In the code chunk above, we imported the file add_column.xlsx. As the path shows, this file was downloaded to a folder called SimData in the working directory. Before adding any columns, it is a good idea to have a quick look at the structure of the data.
Before going to the next section it may be worth pointing out that it is possible to import data from other formats. For example, you can see a couple of tutorials covering how to read data from SPSS, Stata, and SAS:
- How to Read and Write Stata (.dta) Files in R with Haven
- Reading SAS Files in R
- How to Read & Write SPSS Files in R Statistical Environment
Now that we have some example data to practice with, we can move on to the next section, in which we will learn how to add a new column to a dataframe in base R.
Two Methods to Add a Column to a Dataframe using Base R
First, we will use the $-operator and assign a new variable to our dataset. Second, we will use brackets ("[ ]") to do the same.
1) Add a Column Using the $-Operator
Here’s how to add a new column to a dataframe using the $-operator in R:
# add column to dataframe
dataf$Added_Column <- "Value"
Note how we used the $-operator to create the new column in the dataframe. What we added to the dataframe was a single character string (i.e., the same word for every row). R repeats this value, which produces a character vector as long as the number of rows. Here are the first 6 rows of the dataframe with the added column:
If we, on the other hand, tried to assign a vector that is not of the same length as the dataframe, it would fail. We would get an error similar to "Error: Assigned data `c(2, 1)` must be compatible with existing data."
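The difference between a single value and a vector of the wrong length can be sketched as follows. The example uses a base R data.frame with made-up names (`toy`, `label`, `bad`), so the exact wording of the error differs from the tibble message quoted above, but the assignment fails in both cases.

```r
# Toy dataframe with 3 rows; the names are hypothetical
toy <- data.frame(id = 1:3)

# A single value is repeated for every row
toy$label <- "Value"

# A length-2 vector does not fit 3 rows, so the assignment fails;
# tryCatch() captures the error instead of stopping the script
result <- tryCatch({
  toy$bad <- c(2, 1)
  "no error"
}, error = function(e) "error")

result
```

Wrapping the assignment in tryCatch() is only done here so that the snippet runs from top to bottom; in an interactive session you would simply see the error message.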
If we would like to add a sequence of numbers, we can use the seq() function:
# add column to dataframe
dataf$Seq_Col <- seq(1, 10, length.out = dim(dataf)[1])
Notice how we also used the dim() function and selected the first element (the number of rows) to create a sequence with the same length as the number of rows. In the next section, we will learn how to add a new column using brackets.
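To make the role of the row count explicit, here is a small sketch on a toy dataframe (the names `toy`, `n_rows`, and `Row_Number` are assumptions for illustration). It also shows nrow() and seq_len(), two base R functions that are often used for the same purpose.

```r
# Toy dataframe with 4 rows; the names are hypothetical
toy <- data.frame(id = c("a", "b", "c", "d"))

# dim() returns c(rows, columns); the first element is the row count
n_rows <- dim(toy)[1]

# Evenly spaced sequence from 1 to 10 with one value per row
toy$Seq_Col <- seq(1, 10, length.out = n_rows)

# nrow() is a shorthand for dim()[1]; seq_len() gives 1, 2, ..., n
toy$Row_Number <- seq_len(nrow(toy))

toy
```

With 4 rows, the evenly spaced sequence is 1, 4, 7, 10, whereas seq_len() simply numbers the rows.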
2) Add a Column Using Brackets
Here’s how to append a column to a dataframe in R using brackets (“[ ]”):
# Adding a new column
dataf["Added_Column"] <- "Value"
Using the brackets will give us the same result as using the $-operator. However, it is sometimes easier to use the brackets instead of $. For example, when we have column names containing whitespace, brackets may be the way to go. Also, when selecting multiple columns you have to use brackets and not $. In the next section, we are going to create a new column by using tibble and the add_column() function.
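The two situations mentioned above, column names with whitespace and selecting several columns, can be sketched like this. The dataframe and its column names are hypothetical.

```r
# Toy dataframe; the names are hypothetical
toy <- data.frame(id = 1:3)

# A column name containing a space works with single brackets
toy["Added Column"] <- "Value"

# Double brackets assign a single column in the same way
toy[["Another Column"]] <- c(1.5, 2.5, 3.5)

# Selecting several columns at once requires brackets
subset_cols <- toy[c("id", "Added Column")]

names(subset_cols)
```

With $, a name containing a space would have to be wrapped in backticks, which is why brackets are often the more convenient choice here.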
How to Add a Column to a dataframe in R using the add_column() Function
Here’s how to add a column to a dataframe in R:
# Load the packages that provide %>% and add_column()
library(dplyr)
library(tibble)
# Append column using Tibble:
dataf <- dataf %>%
  add_column(Add_Column = "Value")
In the example above, we added a new column at “the end” of the dataframe. Note that we can use dplyr to remove columns by name, which was done to produce the following output:
Finally, if we want to, we can add a column and create a copy of our old dataframe at the same time. To do this, change the code so that the “dataf” on the left-hand side is something else, e.g. “dataf2”. Now that we have added a column to the dataframe, it might be time for other data manipulation tasks. For example, we may now want to remove duplicate rows from the R dataframe or transpose the dataframe.
Example 1: Add a New Column After Another Column
If we want to append a column at a specific position we can use the .after argument:
# R add column after another column
dataf <- dataf %>%
  add_column(Column_After = "After", .after = "A")
As you probably understand, doing this will add the new column after the column "A". In the next example, we are going to append a column before a specified column.
Example 2: Add a Column Before Another Column
Here’s how to add a column to the dataframe before another column:
# R add column before another column
dataf <- dataf %>%
  add_column(Column_Before = "Before", .before = "Cost")
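The effect of the two positioning arguments is easiest to see on a small tibble. This is a sketch with made-up data; only the column names "A" and "Cost" are borrowed from the examples above.

```r
library(tibble)

# Toy tibble; the values are made up for illustration
toy <- tibble(A = 1:3, Cost = c(9.5, 7.0, 3.2))

# Insert one column right after "A" ...
toy <- add_column(toy, Column_After = "After", .after = "A")

# ... and another one right before "Cost"
toy <- add_column(toy, Column_Before = "Before", .before = "Cost")

# Resulting column order: A, Column_After, Column_Before, Cost
names(toy)
```

Both arguments also accept a column position (a number) instead of a name.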
In the next example, we are going to use add_column() to add an empty column to the dataframe.
Example 3: Add an Empty Column to the Dataframe
Here’s what we would do if we wanted to add an empty column in R:
# Empty
dataf <- dataf %>%
  add_column(Empty_Column = NA)
Note that we just added NA (the missing value indicator) as the empty column. Here’s the output, with the empty column added to the dataframe:
If we would rather have empty strings, we just replace the NA with "", for example. However, this would create a character column, which may not be considered empty. In the next example, we are going to add a column to a dataframe based on other columns.
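A plain NA creates a logical column. If the empty column is meant to hold text or numbers later on, base R offers typed missing values. The sketch below uses a toy dataframe with hypothetical names to show the difference.

```r
# Toy dataframe; the names are hypothetical
toy <- data.frame(id = 1:3)

# Plain NA gives a logical column
toy$Empty_Logical <- NA

# Typed missing values give character and numeric columns
toy$Empty_Character <- NA_character_
toy$Empty_Numeric <- NA_real_

# Inspect the type of each column
sapply(toy, class)
```

All three columns contain only missing values; they differ only in the type of data they are prepared to hold.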
Example 4: Add a Column Based on Other Columns
Here’s how to use R to add a column to a dataframe based on other columns:
# Append column conditionally
dataf <- dataf %>%
  add_column(C = if_else(.$A == .$B, TRUE, FALSE))
In the code chunk above, we added something to the add_column() function: the if_else() function (from dplyr). We did this because we wanted to add a value in the column based on the values in other columns. Furthermore, we used .$ to refer to the dataframe being piped in, so that the two columns can be compared (using ==). If the values in these two columns are the same, we add TRUE on that row, and FALSE otherwise. Here’s the new column added:
Note, you can also work with the mutate() function (also from dplyr) to add columns based on conditions. See this tutorial for more information about adding columns on the basis of other columns.
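As a rough sketch of the mutate() alternative, the snippet below compares two columns of a toy tibble. The data and the column name `Label` are assumptions for illustration. Inside mutate(), columns can be referred to by name, so .$ is not needed.

```r
library(dplyr)

# Toy tibble; the values are made up for illustration
toy <- tibble(A = c(1, 2, 3), B = c(1, 5, 3))

toy <- toy %>%
  mutate(
    # The comparison itself already returns TRUE or FALSE
    C = A == B,
    # if_else() is useful when other values than TRUE/FALSE are wanted
    Label = if_else(A == B, "same", "different")
  )

toy
```

Since A == B already yields a logical vector, wrapping it in if_else(..., TRUE, FALSE) is optional.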
In the next section, we will have a look at how to work with the mutate() function to compute, and add, a new variable to the dataset.
Compute and Add a New Variable to a Dataframe in R with mutate()
Here’s how to compute and add a new variable (i.e., column) to a dataframe in R:
# insert new column with mutate
dataf <- dataf %>%
  rowwise() %>%
  mutate(DepressionIndex = mean(c_across(Depr1:Depr5))) %>%
  ungroup()
Notice how, in the example code above, we calculated a new variable called “depression index”, which is the mean of the 5 columns named Depr1 to Depr5 for each row. We used the mean() function to calculate the mean and the c_across() function to collect the values from these columns. Because c_across() is designed to work row by row, we first called rowwise(); without it, the mean would be computed over all values in the five columns at once, and every row would get the same number. Finally, ungroup() removes the row-wise grouping again.
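The row-by-row calculation can be checked on a tiny example where the means are easy to verify by hand. This is a sketch with made-up data and only three Depr columns; rowMeans() from base R is shown as an alternative that gives the same numbers.

```r
library(dplyr)

# Toy tibble; the values are made up for illustration
toy <- tibble(Depr1 = c(1, 2), Depr2 = c(3, 4), Depr3 = c(5, 6))

# Row-wise mean with dplyr: rowwise() makes mean() work per row
by_row <- toy %>%
  rowwise() %>%
  mutate(DepressionIndex = mean(c_across(Depr1:Depr3))) %>%
  ungroup()

# Base R alternative: rowMeans() over the selected columns
toy$DepressionIndex <- rowMeans(toy[c("Depr1", "Depr2", "Depr3")])

by_row$DepressionIndex
```

The first row has the mean (1 + 3 + 5) / 3 = 3 and the second row (2 + 4 + 6) / 3 = 4. For plain means, rowMeans() is usually faster on larger data.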
Note, now that you have added new columns to the dataframe, you may also want to rename factor levels in R with e.g. dplyr. In the next section, however, we will add multiple columns to a dataframe.
How to Add Multiple Columns to the Dataframe in R
Here’s how you would insert multiple columns into the dataframe using the add_column() function:
# Add multiple columns
dataf <- dataf %>%
  add_column(New_Column1 = "1st Column Added",
             New_Column2 = "2nd Column Added")
In the example code above, we passed two name-value pairs (New_Column1 and New_Column2) to the add_column() function to append two columns to the dataframe. Here are the first 6 rows of the dataframe with the added columns:
Note, if you want to add multiple columns, you just add one argument, as we did above, for each column you want to insert. It is, again, important that each vector has the same length as the number of rows in the dataframe (or is a single value, which is repeated for every row). Otherwise, we will end up with an error.
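When the new columns should hold different values on each row, full-length vectors can be passed instead of single values. The sketch below uses a toy tibble and two hypothetical vectors, `a` and `b`, each with one value per row.

```r
library(tibble)

# Toy tibble with 3 rows; the names are hypothetical
toy <- tibble(id = 1:3)

# Two vectors with one value per row
a <- c("x", "y", "z")
b <- c(0.1, 0.2, 0.3)

# One argument per new column
toy <- add_column(toy, New_Column1 = a, New_Column2 = b)

toy
```

If `a` or `b` had, say, two elements instead of three, add_column() would stop with an error rather than silently recycling the values.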
Add Columns from One Dataframe to Another Dataframe
In this section, you will learn how to add columns from one dataframe to another. Here’s how you append e.g. two columns from one dataframe to another:
# Read data from the .xlsx files:
dataf <- read_excel('./SimData/add_column.xlsx')
dataf2 <- read_excel('./SimData/add_column2.xlsx')
# Add the columns from the second dataframe to the first
dataf3 <- cbind(dataf, dataf2[c("Anx1", "Anx2", "Anx3")])
In the example above, we used the cbind() function together with brackets to select which columns we wanted to add. Note that dplyr has the bind_cols() function, which can be used in a similar fashion. Now that you have put together your data sets, you can create dummy variables in R with e.g. the fastDummies package or calculate descriptive statistics.
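A minimal sketch of the bind_cols() variant is shown below. The two toy tibbles (`first` and `second`) and their values are made up; only the column names Anx1 and Anx2 are borrowed from the example above.

```r
library(dplyr)

# Two toy tibbles with the same number of rows (hypothetical data)
first <- tibble(id = 1:3, A = c(2, 4, 6))
second <- tibble(Anx1 = c(1, 0, 1), Anx2 = c(0, 0, 1), Other = c("p", "q", "r"))

# Pick the columns to add, then bind them to the first tibble
combined <- bind_cols(first, second %>% select(Anx1, Anx2))

names(combined)
```

Both cbind() and bind_cols() match rows by position, not by an identifier, so the two dataframes need to have the same number of rows in the same order.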
In this post, you have learned how to add a column to a dataframe in R. Specifically, you have learned how to use the base functions available, as well as the add_column() function from Tibble. Furthermore, you have learned how to use the mutate() function from dplyr to append a column. Finally, you have also learned how to add multiple columns and how to add columns from one dataframe to another.
I hope you learned something valuable. If you did, please share the tutorial on your social media accounts, add a link to it in your projects, or just leave a comment below! Finally, suggestions and corrections are welcomed, also as comments below.
Other R Tutorials
Here you will find some additional resources that you may find useful. The first three are especially interesting if you work with datetime objects (e.g., time series data):
- How to Extract Year from Date in R with Examples with e.g. lubridate (Tidyverse)
- Learn How to Extract Day from Datetime in R with Examples with e.g. lubridate (Tidyverse)
- How to Extract Time from Datetime in R – with Examples
If you are interested in other useful functions and/or operators these two posts might be useful:
- How to use %in% in R: 7 Example Uses of the Operator
- How to use the Repeat and Replicate functions in R