In this brief tutorial, you will learn how to add a column to a dataframe in R. More specifically, you will learn to 1) add a column using base R (i.e., by using the $-operator and brackets), 2) add a column using the add_column() function (i.e., from tibble), 3) add multiple columns, and 4) add columns from one dataframe to another.


Note that when adding a column with tibble we are also going to use the %>% operator, which dplyr makes available (it originates in the magrittr package). Both dplyr and tibble have plenty of useful functions that, apart from enabling us to add columns, make other tasks easy, such as removing a column by name from the R dataframe (e.g., using the select() function from dplyr).

Outline

First, we will have a look at the prerequisites for following this tutorial. Second, before reading an example data set from an Excel file, you are going to get a quick answer to the question of how to add a column to a dataframe. Third, we will have a look at how to add a new column to a dataframe using first base R and then tibble and the add_column() function. In that section, we will also have a quick look at how we can add an empty column and how to append a column based on other columns. After that, we will compute a new variable with mutate(). Finally, in the last two sections, we are going to learn how to insert multiple columns into a dataframe and how to add columns from one dataframe to another.

Prerequisites 

To follow this tutorial, in which we will carry out a simple data manipulation task in R, you only need to install dplyr and tibble if you want to use the add_column() and mutate() functions as well as the %>% operator. However, if you want to read the example data, you will also need to install the readxl package, which provides the read_excel() function.

It may be worth noting that all the mentioned packages are part of the Tidyverse. The Tidyverse comes packed with a lot of tools that can be used for cleaning and visualizing data (e.g., to create a scatter plot in R with ggplot2).

How do I add a column to a DataFrame in R?

To add a new column to a dataframe in R you can use the $-operator. For example, to add the column “NewColumn”, you can write: dataf$NewColumn <- Values. This adds the new variable to your dataset.

In the next section, we are going to use the read_excel() function from the readxl package. After this, we are going to use R to add a column to the created dataframe.

Example Data

Here’s how to read a .xlsx file in R:

# Import readxl (read_excel() lives in readxl, not readr)
library(readxl)

# Read data from .xlsx file
dataf <- read_excel('./SimData/add_column.xlsx')

In the code chunk above, we imported the file add_column.xlsx. This file was downloaded to the SimData folder, located in the same directory as the script. We can obtain some information about the structure of the data using the str() function, i.e., by running str(dataf).
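If you do not have the Excel file at hand, a small stand-in dataframe is enough to follow along. The sketch below is an assumption, not the tutorial's actual data: only the column names A, B, Cost, and Depr1 to Depr5 appear in this tutorial, while the name demo, the values, and the number of rows are made up for illustration.

```r
# Stand-in for the example data (all values are invented for illustration)
demo <- data.frame(
  A     = c(1, 2, 3, 4, 5, 6),
  B     = c(1, 5, 3, 2, 5, 0),
  Cost  = c(10.5, 20.0, 7.25, 3.0, 8.8, 15.1),
  Depr1 = c(1, 2, 5, 3, 4, 2),
  Depr2 = c(2, 2, 4, 3, 1, 5),
  Depr3 = c(3, 2, 3, 3, 2, 4),
  Depr4 = c(4, 2, 2, 3, 5, 1),
  Depr5 = c(5, 2, 1, 3, 3, 3)
)

# Show the column names, their types, and the first few values
str(demo)
```

The same str() call works on a dataframe created with read_excel(), although the printed class will then be a tibble rather than a plain data.frame.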


Before going to the next section it may be worth pointing out that it is possible to import data from other formats. For example, you can see a couple of tutorials covering how to read data from SPSS, Stata, and SAS:

Now that we have some example data to practice with, we can move on to the next section, in which we will learn how to add a new column to a dataframe in base R.

Two Methods to Add a Column to a Dataframe using Base R

First, we will use the $-operator and assign a new variable to our dataset. Second, we will use brackets ("[ ]") to do the same.

1) Add a Column Using the $-Operator 

Here’s how to add a new column to a dataframe using the $-operator in R:

# add column to dataframe
dataf$Added_Column <- "Value"

Note how we used the $ operator to create the new column in the dataframe. What we added to the dataframe was a single character value (i.e., the same word for every row). R repeats this value to produce a character vector as long as the number of rows. The first 6 rows of the dataframe, with the added column, can be displayed with head(dataf).


If we, on the other hand, tried to assign a vector whose length is neither one nor the number of rows in the dataframe, it would fail. We would get an error similar to "Error: Assigned data `c(2, 1)` must be compatible with existing data."
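The length rule can be checked without stopping the script by wrapping the assignment in tryCatch(). This is a minimal sketch on a hypothetical 5-row base data.frame called demo (not the tutorial's data). The error text differs between tibbles, such as the one read_excel() returns, and base data.frames, and a base data.frame will silently repeat a shorter vector when its length divides the number of rows evenly, which is why 5 rows and 2 values are used here.

```r
# Hypothetical 5-row dataframe (not the tutorial's data)
demo <- data.frame(id = 1:5)

# Try to assign a 2-element vector to a 5-row dataframe and capture the error
result <- tryCatch(
  {
    demo$Bad_Column <- c(2, 1)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(result)

# A single value, in contrast, is repeated for every row
demo$Ok_Column <- "Value"
print(demo)
```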


If we would like to add a sequence of numbers, we can use the seq() function and the length.out argument:

# add column to dataframe
dataf$Seq_Col <- seq(1, 10, length.out = dim(dataf)[1])

Notice how we also used the dim() function and selected the first element (the number of rows) to create a sequence with the same length as the number of rows. In the next section, we will learn how to add a new column using brackets. 
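As a side note to the seq() example, nrow() returns the same number as dim()[1] and may read more clearly. The following sketch uses a hypothetical 6-row dataframe named demo, and it also adds a plain row counter with seq_len(), which is an addition for illustration and not part of the example above.

```r
# Hypothetical 6-row dataframe (not the tutorial's data)
demo <- data.frame(id = 1:6)

# Evenly spaced values from 1 to 10, one per row
demo$Seq_Col <- seq(1, 10, length.out = nrow(demo))

# A simple row counter: 1, 2, ..., number of rows
demo$Row_Number <- seq_len(nrow(demo))

print(demo)
```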

2) Add a Column Using Brackets

Here’s how to append a column to a dataframe in R using brackets (“[]”):

# Adding a new column
dataf["Added_Column"] <- "Value"

Using the brackets will give us the same result as using the $-operator. However, it is sometimes easier to use the brackets instead of $. For example, when we have column names containing whitespace, brackets may be the way to go. Also, when selecting multiple columns you have to use brackets and not $. In the next section, we are going to create a new column by using tibble and the add_column() function.
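To illustrate the two bracket cases just mentioned, here is a short sketch on a hypothetical dataframe named demo. The column names "New Column", X, and Y are made up for the example.

```r
# Hypothetical 3-row dataframe (not the tutorial's data)
demo <- data.frame(id = 1:3)

# Brackets accept a column name that contains whitespace
demo["New Column"] <- "Value"

# Brackets can also create several columns in one assignment
demo[c("X", "Y")] <- NA

print(names(demo))
```

With the $-operator, a name containing whitespace would have to be wrapped in backticks, and only one column can be created per assignment.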

How to Add a Column to a dataframe in R using the add_column() Function

Here’s how to add a column to a dataframe in R:

# Load the packages that provide %>% and add_column()
library(dplyr)
library(tibble)

# Append column using tibble:
dataf <- dataf %>%
  add_column(Add_Column = "Value")

In the example above, we added a new column at “the end” of the dataframe. Note that we can use dplyr to remove columns by name, which is how the columns added in the previous sections were dropped before running this example.


Finally, if we want to keep the old dataframe unchanged, we can store the result under a new name: change the code so that the “dataf” on the left-hand side of the assignment is something else, e.g., “dataf2”. Now that we have added a column to the dataframe, it might be time for other data manipulation tasks. For example, we may now want to remove duplicate rows from the dataframe or transpose it.

Example 1: Add a New Column After Another Column

If we want to append a column at a specific position we can use the .after argument:

# R add column after another column
dataf <- dataf %>%
  add_column(Column_After = "After", .after = "A")

As you probably understand, doing this will add the new column after the column "A". In the next example, we are going to append a column before a specified column.

Example 2: Add a Column Before Another Column

Here’s how to add a column to the dataframe before another column:

# R add column before another column
dataf <- dataf %>%
  add_column(Column_Before = "Before", .before = "Cost")
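The effect of the two position arguments is easiest to see by looking at the column order afterwards. The sketch below assumes a hypothetical three-column tibble named demo, with columns A, B, and Cost and invented values, and it calls add_column() directly instead of through the pipe.

```r
library(tibble)

# Hypothetical tibble with three columns (values are invented)
demo <- tibble(A = 1:3, B = c(1L, 5L, 3L), Cost = c(10, 20, 30))

# Insert one column right after "A" ...
demo <- add_column(demo, Column_After = "After", .after = "A")

# ... and another one right before "Cost"
demo <- add_column(demo, Column_Before = "Before", .before = "Cost")

# Inspect the resulting column order
print(names(demo))
```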

In the next example, we are going to use add_column() to add an empty column to the dataframe. 

Example 3: Add an Empty Column to the Dataframe

Here’s how to add an empty column in R:

# Empty column
dataf <- dataf %>%
  add_column(Empty_Column = NA)

Note that we just added NA (the missing value indicator) as the empty column, so every row of Empty_Column contains NA.

If we would rather have an empty string in each row, we just replace the NA with "", for example. However, this would create a character column, which may not be considered empty. In the next example, we are going to add a column to a dataframe based on other columns.
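Before moving on, it may be useful to know that a plain NA creates a logical column. If the empty column is meant to hold text or numbers later, the typed missing values NA_character_ and NA_real_ can be used instead. This is a sketch on a hypothetical tibble named demo; the column names are made up.

```r
library(tibble)

# Hypothetical 3-row tibble (not the tutorial's data)
demo <- tibble(id = 1:3)

# Three "empty" columns that differ only in their type
demo <- add_column(
  demo,
  Empty_Logical   = NA,
  Empty_Character = NA_character_,
  Empty_Numeric   = NA_real_
)

# Show the class of every column
print(sapply(demo, class))
```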

Example 4: Add a Column Based on Other Columns

Here’s how to use R to add a column to a dataframe based on other columns:

# Append column conditionally
dataf <- dataf %>%
  add_column(C = if_else(.$A == .$B, TRUE, FALSE))

In the code chunk above, we added something to the add_column() function: the if_else() function. We did this because we wanted to add a value in the column based on the value in another column. Furthermore, we used .$ so that the two columns are taken from the dataframe passed through the pipe (the dot) and compared using ==. If the values in these two columns are the same, we add TRUE on the specific row; otherwise, we add FALSE.


Note, you can also work with the mutate() function (also from dplyr) to add columns based on conditions. See this tutorial for more information about adding columns on the basis of other columns.
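A minimal sketch of the mutate() variant could look as follows. It assumes a hypothetical dataframe named demo with the columns A and B and invented values. Inside mutate(), columns can be referred to by name, so the .$ prefix is not needed, and because the comparison A == B already yields TRUE or FALSE, if_else() is left out here.

```r
library(dplyr)

# Hypothetical dataframe with two columns to compare (values are invented)
demo <- data.frame(A = c(1, 2, 3), B = c(1, 5, 3))

# C is TRUE where A equals B and FALSE otherwise
demo <- mutate(demo, C = A == B)

print(demo)
```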

In the next section, we will have a look at how to work with the mutate() function to compute, and add, a new variable to the dataset.

Compute and Add a New Variable to a Dataframe in R with mutate()

Here’s how to compute and add a new variable (i.e., column) to a dataframe in R:

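# insert new column with mutate
dataf <- dataf %>%
  rowwise() %>%
  mutate(DepressionIndex = mean(c_across(Depr1:Depr5))) %>%
  ungroup()

head(dataf)

Notice how we, in the example code above, calculated a new variable called “depression index”, which is the mean of the 5 columns named Depr1 to Depr5. We used the mean() function to calculate the mean, and the c_across() function to collect the values of these columns. Note that c_across() is designed to be used together with rowwise(): without it, the mean would be taken over all values in the five columns at once, and every row would get the same number. Finally, ungroup() removes the row-wise grouping again, and head() is called separately so that the full dataframe, and not only its first 6 rows, is kept.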
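For comparison, a per-row mean can also be computed in base R with rowMeans(). This is an alternative to the dplyr code above, not the method used in this tutorial, and the dataframe demo with its values is hypothetical; only the column names Depr1 to Depr5 are taken from the example.

```r
# Hypothetical data with the five Depr columns (values are invented)
demo <- data.frame(
  Depr1 = c(1, 2, 5),
  Depr2 = c(2, 2, 4),
  Depr3 = c(3, 2, 3),
  Depr4 = c(4, 2, 2),
  Depr5 = c(5, 2, 1)
)

# Names of the columns that go into the index
depr_columns <- paste0("Depr", 1:5)

# One mean per row, computed across the selected columns
demo$DepressionIndex <- rowMeans(demo[depr_columns])

print(demo)
```

rowMeans() also has an na.rm argument, which may be handy when some of the items contain missing values.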


Now that you have added new columns to the dataframe, you may also want to rename factor levels in R with, e.g., dplyr. In the next section, however, we will add multiple columns to a dataframe.

How to Add Multiple Columns to the Dataframe in R

Here’s how you would insert multiple columns, to the dataframe, using the add_column() function:

# Add multiple columns
dataf <- dataf %>%
  add_column(New_Column1 = "1st Column Added",
             New_Column2 = "2nd Column Added")

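In the example code above, we passed two name-value pairs (New_Column1 and New_Column2) to the add_column() function to append two columns to the dataframe at once. Because each value is a single string, it is repeated for every row. The first 6 rows of the dataframe, with the added columns, can be displayed with head(dataf).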


Note, if you want to add multiple columns, you just add one argument, as we did above, for each column you want to insert. It is, again, important that each vector is either of length one or of the same length as the number of rows in the dataframe. Otherwise, we will end up with an error.
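When the new columns should hold different values on each row, full-length vectors can be passed in the same way. The sketch below uses a hypothetical 3-row tibble named demo and two made-up vectors, first_values and second_values, each with one element per row.

```r
library(tibble)

# Hypothetical 3-row tibble (not the tutorial's data)
demo <- tibble(id = 1:3)

# Two vectors with one value per row (names and values are invented)
first_values  <- c("a", "b", "c")
second_values <- c(10, 20, 30)

# One argument per new column
demo <- add_column(demo,
                   New_Column1 = first_values,
                   New_Column2 = second_values)

print(demo)
```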

Add Columns from One Dataframe to Another Dataframe

In this section, you will learn how to add columns from one dataframe to another. Here’s how you append, e.g., three columns from one dataframe to another:

# Read data from the .xlsx files:
dataf <- read_excel('./SimData/add_column.xlsx')
dataf2 <- read_excel('./SimData/add_column2.xlsx')

# Add the columns from the second dataframe to the first
dataf3 <- cbind(dataf, dataf2[c("Anx1", "Anx2", "Anx3")])

In the example above, we used the cbind() function together with selecting which columns we wanted to add. Note that dplyr has the bind_cols() function, which can be used in a similar fashion. Now that you have put together your data sets, you can create dummy variables in R with, e.g., the fastDummies package or calculate descriptive statistics.
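A sketch of the bind_cols() variant is shown below. The two dataframes, first and second, and their values are hypothetical; only the column names Anx1 and Anx2 are borrowed from the example above. Keep in mind that both cbind() and bind_cols() combine rows by position, so the two dataframes need to have the same number of rows in the same order.

```r
library(dplyr)

# Two hypothetical dataframes with the same number of rows
first  <- data.frame(id = 1:3, A = c(1, 2, 3))
second <- data.frame(Anx1  = c(4, 5, 6),
                     Anx2  = c(7, 8, 9),
                     Other = c("x", "y", "z"))

# Add only the selected columns from the second dataframe to the first
combined <- bind_cols(first, second[c("Anx1", "Anx2")])

print(combined)
```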

Conclusion

In this post, you have learned how to add a column to a dataframe in R. Specifically, you have learned how to use the base functions available, as well as the add_column() function from tibble. Furthermore, you have learned how to use the mutate() function from dplyr to append a column. Finally, you have also learned how to add multiple columns and how to add columns from one dataframe to another.

I hope you learned something valuable. If you did, please share the tutorial on your social media accounts, add a link to it in your projects, or just leave a comment below! Finally, suggestions and corrections are welcomed, also as comments below.

Other R Tutorials

Here you will find some additional resources that you may find useful. The first three are especially interesting if you work with datetime objects (e.g., time series data):

If you are interested in other useful functions and/or operators these two posts might be useful:
