In this guide you will learn how to concatenate two columns in R. In fact, you will learn how to merge multiple columns in R using base R (e.g., using the paste function) and Tidyverse (e.g. using str_c() and unite()). In the final section of this post, you will learn which function is the best to use when combining columns. 

merge two columns
  • Save
Combine columns

If you have some experience using dataframe (or in this case tibble) objects in R and you’re ready to learn how to combine data found in them, then this tutorial will help you do precisely that. 

Knowing how to do this may prove useful when you have a dataframe containing information, in two columns, and you want to combine these two columns into one using R. For example, you might have a column containing first names and last names. In this case, you may want to concatenate these two columns into one e.g. called Names. 

three methods to concatenate two columns in R
  • Save
Three methods to merge columns in R

You can follow along with the examples in this tutorial using the interactive Jupyter Notebook found towards the end of the tutorial. Here’s the example data that we use to learn how to combine two, or more, columns to one variable.

Outline

In this post, you will learn, by example, how to concatenate two columns in R. As you will see, we will use R’s $ operator to select the columns we want to combine. The outline of the post is as follows. First, you will learn what you need to have to follow the tutorial. Second, you will get a quick answer on how to merge two columns. After this, you will learn a couple of examples using 1) paste() and 2) str_c() and, 3) unite(). In the final section, of this concatenating in R tutorial, you will learn which method I prefer and why. That is, you will get my opinion on why I like the unite() function. In the next section, you will learn about the requirements of this post.

Requirements

If you prefer to use base R you don’t need more than a working R installation. However, if you are going to use either str_() or unite() you need to have at least one of the packages stringr or tidyr. It is worth pointing out, here, that both of these packages are part of the Tidyverse package. This package contains multiple useful R packages that can be used for reading data, visualizing data (e.g., scatter plots with ggplot2), extracting year from date in R, adding new columns, among other things. Installing an R package is simple, here’s how you install Tidyverse:

install.packages("tidyverse")
Code language: R (r)

Note, if you want to install stringr or tidyr just exchange “tidyverse” for e.g. “stringr”.  In the next section, you will get a quick answer, without any details, on how to concatenate two columns in R. 

How do I concatenate two columns in R?

To concatenate two columns you can use the <code>paste()</code> function. For example, if you want to combine the two columns A and B in the dataframe df you can use the following code: <code>df[‘AB’] <- paste(df$A, df$B)</code>. Note, however, that using <code>paste</code> will result in whitespace between the values in the new column.

Before we are going to have a more detailed look at how to use paste() to combine two columns, we are going to load an example dataset. 

Reading Example Data from a .xlsx File

Here’s how to read a .xlsx file in R using the readxl package:

# Importing Example Data: library('readxl') dataf <- read_excel("combine_columns_in_R.xlsx")
Code language: R (r)

Now, we can have a look at the structure of the imported data using the str() function:

We will also have a quick look at the first five rows using the head() function:

Now, in the images above we can see that there are 5 variables and 7 observations. That is, there are 5 columns and 7 rows, in the tibble. Moreover, we can see the types of the variables and we can, of course, also use the column names. In the next section, we are going to start by concatenating the month and year columns using the paste() function.

Concatenate Two Columns in R

Here’s one of the simplest way to combine two columns in R using the paste(): function:

dataf$MY <- paste(dataf$Month, dataf$Year)
Code language: R (r)

In the code above, we used $ in R to 1) create a new column but, as well, selecting the two columns we wanted to combine into one. Here’s the tibble with the new column, named MY:

In the next example, we will merge two columns and adding a hyphen (“-”), as well. For more useful operators, and how to use them, see for example the post “How to use %in% in R: 7 Example Uses of the Operator“.

Concatenate Two Columns with – as a separator in R

Now, to add “-” (hyphen) between the values we want to combine we add a third parameter to the paste() function:

dataf$MY <- paste(dataf$Month, "-", dataf$Year)
Code language: R (r)

In the code example above, we used the sep parameter and set it as “-”. As you can see, in the image below, we have whitespaces between the two values (i.e. “Month” and “Year”).

Now, using R’s paste() function we can add another parameter: the sep parameter. Here’s a code example combining the two columns, adding the “-” without the whitespaces:

dataf$MY <- paste(dataf$Month, dataf$Year, sep= "-")
Code language: R (r)

Notice, that instead of pasting the hyphen we used it as a separator. Before moving on to the next example, it is worth pointing out that if we don’t want to add whitespaces we can use the paste0() function instead. This way, we don’t need the sep parameter. In the next example, we are going to have a look at how to combine multiple columns (i.e., three or more) in R. 

Combine Multiple Columns in R

As you may have understood, combining more than 2 columns is as simple as adding a parameter to the paste() function. Here’s how we combine three columns in R:

dataf$DMY <- paste(dataf$Date, dataf$Month, dataf$Year)
Code language: R (r)

That was also pretty simple. It is worth, mentioning, that if you use the sep parameter, in a case as above, you will end up with whatever character you chose between each value from each column. For example, if we were to add the sep argument to the code above and put underscore (“_”) as a separator here’s how the resulting tibble would look like:

Now, you may understand that using the sep parameter enables you to use almost any character to separate your combined values. In the next section, we will have a look at the str_c() function from the stringr package.

Concatenate Two Columns in R with the str_c() Function (stringr)

Combining two columns with the str_c() function is super simple. Here’s how to merge the columns “Snake” and “Size” using the str_c() function:

library(stringr) dataf$SnakeNSize <- str_c(dataf$Snake," ", dataf$Size)
Code language: PHP (php)

Notice that we added something in between the two columns we wanted to concatenate? When working with this function, we need to do this, or else we end up with nothing separating the two values that we are combining. As previously mentioned, the stringr package is part of the Tidyverse packages which also includes packages such as tidyr and the unite() function. In the next section, we are going to merge two columns in R using the unite() function as well. 

Merge Columns in R with the unite() Function (tidyr)

Here’s how we concatenate two, or more, columns using the unite() function:

library(tidyverse) # or library(tidyr) dataf <- dataf %>% unite("DM", Date:Month)
Code language: R (r)

Notice something in the code above. First, we used a new operator (i.e., %>%). Among a lot of things, this enables us to use unite() without the $ operator to select the columns. As you can see, in the code example above, we used two parameters. First, we name the new column we want to add (“DM”), second we select all the columns from “Date” to “Month” and combine them into the new column. Here’s the resulting dataframe/tibble:

Now, as you can see in the image above, both columns that we combined have disappeared. If we want to keep the original columns after we have concatenated them we can set the remove parameter to FALSE. Here’s a code chunk that you can use, instead, to not remove the columns:

dataf <- dataf %>% unite("DM", Date:Month, remove = FALSE)
Code language: R (r)

Finally, did you notice how we have an underscore as a separator? If we want to change to another separator we can use the sep parameter. This is exactly what we will do in the next example:

Concatenate two Columns in R using “-” as a separator

Here’s how we use the unite() function together with the sep parameter to change the separator to “-” (hyphen):

dataf <- dataf %>% unite("DM", Date:Month, sep= "-", remove = FALSE)
Code language: R (r)

That was as simple as the previous example, right? In the next section, you will learn which function I prefer to use and why.

Which Function is Best for Concatenating Columns in R?

Naturally, this section will contain my opinion. I have not done any optimization testing (e.g., I don’t know which function is the fastest when it comes to combining columns in R). That said, although all of the functions used in this post are simple to use I prefer the unite() function. Why? Well, together with the piping operator I think it makes the column very readable. It is, as well, very handy to use unite() if you are going to concatenate multiple columns in R. As you may have noticed, in the examples above, we can use “:” when combining columns. This means that we can merge multiple columns from the first column (i.e., left of the column sign) to the last column (i.e., right of the “:”). This is pretty neat and will definitely save some space in your code and make it easier to read! 

how to concatenate two columns in R
  • Save
concatenate two columns in R

Another neat thing is that we add the new column name as a parameter and we, automatically, get rid of the columns combined (if we don’t need them, later, of course). Finally, we can also set the na.rm parameter to TRUE if we want missing values to be removed before combining values. Here’s a Jupyter Notebook with all the code in this post.

Conclusion

In this post, you have learned how to concatenate two (or more) columns in R using three different functions. First, we used the paste() function from base R. Using this function, we combined two and three columns, changed the separator from whitespaces to hyphen (“-”). Second, we used the str_() function to merge columns. Third, we used the unite() function. Of course, it is possible (we saw some example of that) to change the separator using the two last functions as well. To conclude, the unite() function seems to be the handiest function to use to concatenate columns in R.  

Hope you learned something! If you did, please leave a comment below, share on your social media, include a link to the post on your projects (e.g., blog posts, articles, reports), or become a Patreon: 

Finally, if you have any suggestions, other comments, or there is something you wish me to cover: don’t hesitate to contact me. 

R Resources

  • Save
Share via
Copy link
Powered by Social Snap