# Convert Multiple Columns to Numeric in R with dplyr


In this post, you will learn how to convert multiple columns to numeric in R. We explore the efficiency and readability of using dplyr, a powerful R data manipulation package. The mutate family of functions within dplyr is convenient when converting columns, offering a streamlined approach.

Real-world data, for example in Psychology, often requires careful column conversion. Consider survey responses that were initially stored as characters, sometimes with missing values, and that need to be transformed into a numeric format for meaningful analysis. Such data manipulation tasks are critical for accurate statistical insights and form the backbone of data preprocessing in psychological research.

Whether dealing with questionnaire data, experimental results, or any diverse datasets common in Psychology, mastering the art of converting multiple columns to numeric in R empowers analysts to derive richer insights from their data.

## Outline

The structure of this post is as follows. Before learning how to convert multiple columns to numeric in R, we will set the stage with a brief look at the prerequisites to follow this post. Following that, we will explore the essential functions in Base R for converting data types, providing a foundational understanding. The post then takes a deep dive into the powerful dplyr package, emphasizing its efficiency and clarity in the column conversion process. So, let us progress step by step, ensuring a comprehensive grasp of the critical data manipulation task of changing multiple columns to numeric in R.

## Prerequisites

Before converting multiple columns to numeric in R, you need a solid understanding of fundamental R concepts and data types. Familiarity with the structure of data frames and basic knowledge of R programming will be beneficial.

Additionally, ensure you have the dplyr (or tidyverse) package installed, as we will leverage its powerful functions for efficient data manipulation. To guarantee a seamless experience, check your R version and update R to the latest release if needed. This ensures compatibility and access to the latest features for effective column conversion.
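As a minimal sketch of these checks, the snippet below reports whether dplyr is available and which R version is running. The variable names (`has_dplyr`, `r_version`) are illustrative only, and the install command is shown in a message rather than executed.

```r
# Check whether dplyr is installed without attaching it
has_dplyr <- requireNamespace("dplyr", quietly = TRUE)

if (!has_dplyr) {
  # Installation is left to the reader so the script has no side effects
  message('dplyr is not installed; run install.packages("dplyr")')
}

# Version string of the running R session
r_version <- R.version.string
print(r_version)
```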

## Base R: Converting Multiple Columns to Numeric

In Base R, several functions can convert multiple columns to numeric types. The `apply()` function, in combination with `as.numeric()`, allows for a versatile approach, offering flexibility in its application. Additionally, the `lapply()` function is handy when dealing with multiple columns simultaneously. To solidify our understanding, let us get into a few practical examples.

Consider a scenario where a Psychology dataset contains columns with numeric information stored as characters. The objective is to transform these columns into a numeric format for accurate analysis.
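Before turning to `lapply()`, here is a minimal sketch of the `apply()` route mentioned above. The data frame `scores` and its column names are made up for illustration. Because `apply()` works on a matrix and returns one, the result is wrapped in `as.data.frame()` to get a data frame back.

```r
# Hypothetical questionnaire items stored as text
scores <- data.frame(
  Item1 = c("1", "2", "3"),
  Item2 = c("4", "5", "6"),
  stringsAsFactors = FALSE
)

# MARGIN = 2 applies as.numeric() column by column; the result is a numeric matrix
numeric_matrix <- apply(scores, 2, as.numeric)

# Convert the matrix back to a data frame (column names are preserved)
scores_numeric <- as.data.frame(numeric_matrix)
str(scores_numeric)
```

This works well when every column holds text. For mixed data frames, `lapply()` is usually the safer choice, since `apply()` first coerces the whole data frame to a single matrix type.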

### Example 1: Base R to Change Multiple Columns to Numeric

Here is an example using `as.numeric()` and `lapply()` to convert multiple columns to numeric:

```r
# Create a sample dataframe
example_data <- data.frame(
  Col1 = c("1", "2", "3"),
  Col2 = c("4", "5", "6"),
  Col3 = c("7", "8", "9")
)

# Convert multiple columns to numeric using as.numeric()
example_data[] <- lapply(example_data, as.numeric)
```

In the code chunk above, we used `lapply()` to apply `as.numeric()` to every column. Assigning to `example_data[]` replaces the contents of the columns while keeping the data frame structure; without `[]`, `example_data` would become a plain list.
However, note that this method converts every column, which can give surprising results: logical columns become 0/1, and factor columns are replaced by their internal integer level codes rather than the values shown in their labels.
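The pitfall can be seen in a small sketch. The data frame `mixed` and its columns are invented for illustration; the point is only what `as.numeric()` does to logical and factor columns.

```r
# Hypothetical data with a character, a logical, and a factor column
mixed <- data.frame(
  Score  = c("10", "20", "30"),
  Passed = c(TRUE, FALSE, TRUE),
  Group  = factor(c("30", "10", "20")),
  stringsAsFactors = FALSE
)

converted <- mixed
converted[] <- lapply(mixed, as.numeric)

# Score  -> 10 20 30  (the text is parsed as numbers)
# Passed -> 1 0 1     (TRUE/FALSE become 1/0)
# Group  -> 3 1 2     (positions in the sorted levels "10", "20", "30",
#                      not the labelled values 30 10 20)
print(converted)
```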

In the following example, we will learn how to change the data type of specific columns using base R.

### Example 2: Base R to Change Specific Columns to Numeric

Here is another example using `as.numeric()` and `lapply()` to convert multiple, but specific columns to numeric:

```r
# Create a sample dataframe
example_data_specific <- data.frame(
  ColX = c("1", "2", "3"),
  ColY = c("4", "5", "6"),
  ColZ = c("7", "8", "9")
)

# Convert specific columns to numeric using as.numeric()
example_data_specific[, c("ColX", "ColY", "ColZ")] <- lapply(
  example_data_specific[, c("ColX", "ColY", "ColZ")],
  as.numeric
)
```

In the code chunk above, we used `lapply()` to convert specific columns to numeric in R. Again, we used `[]`, but this time to select the specified columns by name; `lapply()` then applies the conversion function to each of them.

However, this approach can be cumbersome with many columns, leading us to explore a more dynamic method in the following example.

### Example 3: Transform all Character Columns to Numeric in R

Here is an example using `lapply()` to convert all character columns to numeric:

```r
# Create a sample dataframe
example_data_all <- data.frame(
  ColA = c("1", "2", "3"),
  ColB = c("4", "5", "6"),
  ColC = c("7", "8", "9"),
  ColD = c(1, 2, 3),
  ColE = factor(c("1", "2", "3"))
)

# Identify character columns
char_cols <- sapply(example_data_all, is.character)

# Convert all character columns to numeric using lapply()
example_data_all[char_cols] <- lapply(example_data_all[char_cols], as.numeric)
```

In the code snippet above, we used `sapply()` to identify character columns in the dataframe. The result is a logical vector (`char_cols`) indicating which columns contain character data. Subsequently, we applied `lapply()` to convert only the identified character columns to numeric, avoiding unnecessary conversion of non-character columns. This is more dynamic than listing column names by hand inside `[]`, and it leaves the numeric column (`ColD`) and the factor column (`ColE`) untouched. In the next section, we will quickly look at the dplyr package.
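A slightly stricter variant of the same idea uses `vapply()`, which always returns a logical vector of the declared type. The sketch below assumes a made-up data frame `survey`; `stringsAsFactors = FALSE` is set explicitly so the text columns stay character in older R versions as well.

```r
# Hypothetical survey data with text, numeric, and factor columns
survey <- data.frame(
  Age       = c("23", "31", "45"),
  Rt        = c("512.5", "430.1", "610.0"),
  Trial     = c(1, 2, 3),
  Condition = factor(c("a", "b", "a")),
  stringsAsFactors = FALSE
)

# vapply() guarantees one logical value per column
char_cols <- vapply(survey, is.character, logical(1))

# Convert only the character columns
survey[char_cols] <- lapply(survey[char_cols], as.numeric)

# Inspect the resulting class of every column
col_classes <- vapply(survey, function(x) class(x)[1], character(1))
print(col_classes)
```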

## dplyr Overview

In data manipulation in R, the dplyr package is a powerful tool. With its expressive syntax and efficient functions, dplyr simplifies complex operations. When manipulating data, we can use the `select()` and `mutate()` family of functions to create clean and readable scripts. Importantly, the `%>%` operator (pipe) enhances the flow of operations, allowing for seamless chaining of commands. As we learn to convert multiple columns to numeric, dplyr’s capabilities become evident. Functions like `mutate_if()` offer an elegant solution to the challenges faced in base R (see above), allowing us to efficiently transform only the desired columns, such as character columns, with precision and clarity. The following section will look at examples of using dplyr to transform multiple columns to numeric in R.
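To make the chaining idea concrete, here is a small sketch that pipes a data frame through `select()` and `mutate()`. The data frame `responses` and its columns are hypothetical and only serve to show the flow of the pipe.

```r
library(dplyr)

# Hypothetical questionnaire responses stored as text
responses <- data.frame(
  id    = 1:3,
  item1 = c("4", "2", "5"),
  item2 = c("3", "3", "1"),
  notes = c("ok", "late", "ok"),
  stringsAsFactors = FALSE
)

item_scores <- responses %>%
  select(id, item1, item2) %>%      # keep only the columns of interest
  mutate(
    item1 = as.numeric(item1),      # convert each item explicitly
    item2 = as.numeric(item2)
  )

str(item_scores)
```

Writing out each column works for two items but does not scale, which is where the helper functions covered in the following examples come in.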

## Convert Multiple Columns to Numeric in R with dplyr

In the vast landscape of data manipulation tools in R, dplyr’s arsenal stands out. Key functions like `mutate_all()`, `across()`, `mutate_if()`, and `select()` offer precise control over column conversions. Note that since dplyr 1.0.0 the scoped verbs `mutate_all()` and `mutate_if()` are superseded by `across()`; they still work, but `across()` is the recommended form for new code. This section explores how these functions streamline converting multiple columns to numeric, enhancing clarity and efficiency. In the first example, we will convert all columns to numeric using `mutate_all()`.

### Example 1: Converting All Columns to Numeric with dplyr

Using `mutate_all()`, we effortlessly convert all columns to numeric, ensuring consistency in data types. Here is an example:

```r
library(dplyr)

# Create a sample dataset
cognitive_data <- data.frame(
  Score1 = c("5", "4", "3"),
  Score2 = c("2", "3", "4"),
  Score3 = c("1", "2", "3")
)

# Convert all columns to numeric
cognitive_data <- cognitive_data %>%
  mutate_all(as.numeric)
```

In the code snippet above, we used dplyr’s `mutate_all()` to convert all columns to numeric. This function applies `as.numeric` to every column, similar to the approach with `lapply()` but more concise, ensuring all scores are in numeric format. In the following example, we will use another function from the mutate family: `mutate_if()`.
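For readers on dplyr 1.0.0 or later, the same conversion can be sketched with `across()` and `everything()`. The data frame `ratings` is made up for illustration, and the comparison at the end simply checks that both spellings give the same result.

```r
library(dplyr)

# Hypothetical ratings stored as text
ratings <- data.frame(
  Item1 = c("5", "4", "3"),
  Item2 = c("2", "3", "4"),
  stringsAsFactors = FALSE
)

# Scoped verb, as used in the example above
ratings_scoped <- ratings %>%
  mutate_all(as.numeric)

# across() form: everything() selects all columns
ratings_across <- ratings %>%
  mutate(across(everything(), as.numeric))

# Both approaches should yield the same data frame
same_result <- isTRUE(all.equal(ratings_scoped, ratings_across))
print(same_result)
```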

### Example 2: Selective Conversion From Character to Numeric with mutate_if()

Using `mutate_if()` and `is.character`, we can selectively convert only the character columns, leaving others unchanged:

```r
library(dplyr)

# Create a hearing science dataset
hearing_data <- data.frame(
  Freq1 = c("440", "520", "630"),
  Freq2 = c("75", "84", "91"),
  Type = factor(c("pure_tone", "white_noise", "pure_tone"))
)

# Convert character columns to numeric
hearing_data <- hearing_data %>%
  mutate_if(is.character, as.numeric)
```

In the code snippet above, we used `mutate_if()` to achieve a similar outcome to the base R example, where `sapply()` was used to identify character columns and `lapply()` to convert them. Here, we targeted columns identified by `is.character` and applied `as.numeric` to ensure a consistent numeric format for the hearing data; the factor column `Type` is left unchanged. Again, this approach is more readable than the base R method.
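The `across()` counterpart of this pattern combines `across()` with the `where()` selection helper (dplyr 1.0.0 or later). The sketch below uses an invented data frame `tones`; `stringsAsFactors = FALSE` is set so the text columns are character regardless of R version.

```r
library(dplyr)

# Hypothetical audiometry data
tones <- data.frame(
  Freq  = c("440", "520", "630"),
  Level = c("75", "84", "91"),
  Type  = factor(c("pure_tone", "white_noise", "pure_tone")),
  stringsAsFactors = FALSE
)

# where(is.character) selects only the character columns
tones_numeric <- tones %>%
  mutate(across(where(is.character), as.numeric))

str(tones_numeric)
```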

### Example 3: Using across() for Selective Transformation of Columns to Numeric

The `across()` function allows for more targeted operations. We can specify the columns to be transformed, providing flexibility and precision. Here is an example:

```r
library(dplyr)

# Create Example Data
psych_data_specific <- data.frame(
  Score_A = as.character(c(1, 2, 3)),
  Score_B = as.character(c(4, 5, 6)),
  Score_C = as.character(c(7, 8, 9)),
  Numeric_Score = c(1, 2, 3)
)

# Convert only the columns whose names start with "Score"
psych_data_specific <- psych_data_specific %>%
  mutate(across(starts_with("Score"), as.numeric))
```

In the code snippet above, we used `mutate(across())` to convert the columns whose names start with “Score” to numeric format. `Numeric_Score` is not selected, because its name starts with “Numeric”. Similar to our base R example where we selected columns explicitly, dplyr’s `across()` also accepts a range of adjacent columns, such as `Score_A:Score_C`. This showcases the flexibility and clarity that dplyr brings to column selection and transformation.
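Two other ways of selecting columns inside `across()` can be sketched as follows: a range of adjacent columns, and a character vector of names passed through `all_of()`. The data frame `scores` and the vector `target_cols` are hypothetical.

```r
library(dplyr)

# Hypothetical scores stored as text, plus one column that is already numeric
scores <- data.frame(
  Score_A = c("1", "2", "3"),
  Score_B = c("4", "5", "6"),
  Score_C = c("7", "8", "9"),
  Numeric_Score = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Select a range of adjacent columns
by_range <- scores %>%
  mutate(across(Score_A:Score_C, as.numeric))

# Select columns listed in a character vector
target_cols <- c("Score_A", "Score_C")
by_name <- scores %>%
  mutate(across(all_of(target_cols), as.numeric))

str(by_name)
```

In `by_name`, `Score_B` stays character because it is not listed in `target_cols`.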

### Example 4: Convert Factors to Numeric in R

Again using `mutate_if()`, but this time together with `is.factor`, we can transform factors to numeric in R:

```r
library(dplyr)

# Example 4: Converting Factors to Numeric in R with dplyr
psych_data_factors <- data.frame(
  Student_ID = 1:3,
  Exam_1 = factor(c("A", "B", "C")),
  Exam_2 = factor(c("B", "A", "C")),
  Exam_3 = factor(c("C", "A", "B"))
)

# as.numeric() on a factor returns its integer level codes
psych_data_factors <- psych_data_factors %>%
  mutate_if(is.factor, as.numeric)
```

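In the code chunk above, we used the `mutate_if()` function to convert all factor columns to numeric. Note that `as.numeric()` applied to a factor returns the internal integer level codes (here A = 1, B = 2, C = 3, because the levels are sorted alphabetically), not the labels themselves. This approach is similar to our previous demonstration (i.e., Example 2), showcasing the consistency of using dplyr functions for data manipulation tasks.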
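When factor labels are themselves numbers, converting via the level codes gives the wrong values. A common remedy, sketched here with a made-up data frame `exam`, is to convert to character first and then to numeric.

```r
library(dplyr)

# Hypothetical exam points that were read in as a factor
exam <- data.frame(
  Student_ID = 1:3,
  Points = factor(c("90", "75", "82"))
)

# Level codes: positions in the sorted levels "75", "82", "90"
codes <- exam %>%
  mutate(across(where(is.factor), as.numeric))

# Labelled values: go through character first
values <- exam %>%
  mutate(across(where(is.factor), ~ as.numeric(as.character(.x))))

print(codes$Points)   # 3 1 2
print(values$Points)  # 90 75 82
```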

## Convert Multiple Columns with Nulls to Numeric in R

Handling missing values, represented as `NA` (Not Available) in R and often called nulls in other tools, is a crucial aspect of data preprocessing. They can arise due to missing data or undefined observations in a dataset. In this section, we will explore how to convert multiple columns with nulls to numeric using both base R and the dplyr package.

### Example 1: Converting Multiple Columns with Nulls with Base R

Here is how we can convert multiple columns with nulls using Base R:

```r
# Create a sample dataframe with nulls
null_data <- data.frame(
  Col1 = c(1, 2, NA),
  Col2 = c("3", NA, 5),
  Col3 = c(6, 7, 8)
)

# Convert columns with nulls to numeric
null_data[] <- lapply(null_data, as.numeric)
```

In this code snippet, we used `lapply()` to convert all columns, including those with missing values, to numeric in base R. This is essentially the same as Example 1 in the base R section, but with mixed values in the columns: `c("3", NA, 5)` is stored as a character vector, and after conversion the `NA` entries remain `NA`.
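Missing values can also be created by the conversion itself: text that cannot be parsed as a number becomes `NA`, and R emits a warning. The sketch below, using a hypothetical data frame `raw`, suppresses that warning deliberately and then flags which entries were present in the raw data but failed to convert.

```r
# Hypothetical raw survey data with one unparseable entry and one true NA
raw <- data.frame(
  Age   = c("23", "n/a", "45"),
  Score = c("3.5", NA, "4"),
  stringsAsFactors = FALSE
)

converted <- raw
# Unparseable text becomes NA; the coercion warning is silenced on purpose
converted[] <- lapply(raw, function(x) suppressWarnings(as.numeric(x)))

# TRUE where a value existed in the raw data but is NA after conversion
failed <- is.na(converted) & !is.na(raw)

# Number of failed conversions per column
print(colSums(failed))
```

Here only the `"n/a"` entry in `Age` is flagged; the genuine `NA` in `Score` is not counted as a failure.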

### Example 2: Transforming Multiple Columns with NAs with dplyr

Again, we can use `mutate_all()` if we want to convert all columns, including the ones with nulls, using dplyr:

```r
library(dplyr)

# Create a sample dataframe with nulls
null_data_dplyr <- data.frame(
  Col1 = c(1, 2, NA),
  Col2 = c("3", NA, 5),
  Col3 = c(6, 7, 8)
)

# Convert columns with nulls to numeric using dplyr
null_data_dplyr <- null_data_dplyr %>%
  mutate_all(as.numeric)
```

In the code chunk above, we used the `%>%` pipe operator and the `mutate_all()` function to convert all columns, including those with missing values, to numeric. The approach is the same as in the earlier dplyr section: `NA` entries stay `NA`, and every other value is converted. Again, this is an example of the concise syntax of dplyr operations.
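After converting, it is often useful to check how many missing values each column contains. One way to sketch this in dplyr is with `summarise()` and `across()`; the data frame `survey_na` below is made up for illustration.

```r
library(dplyr)

# Hypothetical data with missing values
survey_na <- data.frame(
  Col1 = c(1, 2, NA),
  Col2 = c("3", NA, "5"),
  Col3 = c(6, 7, 8),
  stringsAsFactors = FALSE
)

# Convert every column to numeric
survey_numeric <- survey_na %>%
  mutate(across(everything(), as.numeric))

# Count NA values per column
na_counts <- survey_numeric %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

print(na_counts)
```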

## Comparing Base R and dplyr for Converting Columns to Numeric

When it comes to converting columns to numeric in R, both base R and dplyr offer distinct advantages. Base R, part of the core R language, provides simplicity and independence from additional packages. This can benefit users seeking a lightweight solution without relying on external dependencies. On the other hand, dplyr excels in versatility and readability. It goes beyond column conversion, offering a powerful toolkit for various data manipulation tasks. With dplyr, tasks like renaming columns, renaming factor levels, and adding columns to a dataframe become straightforward. While base R may be preferable for minimalistic tasks, dplyr is a comprehensive choice for users engaged in broader data preprocessing and manipulation activities.
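As a quick side-by-side sketch, the two approaches can be run on the same input and compared. The data frame `raw_scores` is hypothetical; the check at the end only confirms that both routes produce equal results for this simple case.

```r
library(dplyr)

# Hypothetical input with all columns stored as text
raw_scores <- data.frame(
  A = c("1", "2"),
  B = c("3", "4"),
  stringsAsFactors = FALSE
)

# Base R: no extra packages needed
base_result <- raw_scores
base_result[] <- lapply(raw_scores, as.numeric)

# dplyr: reads as a pipeline and combines easily with other verbs
dplyr_result <- raw_scores %>%
  mutate(across(everything(), as.numeric))

results_match <- isTRUE(all.equal(base_result, dplyr_result))
print(results_match)
```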

## Conclusion

In conclusion, this post has equipped you with the knowledge and skills to proficiently convert multiple columns to numeric in R using both base functions and the versatile dplyr package. The simplicity of base R provides a solid foundation for straightforward tasks, while dplyr’s efficiency and extensive functionality make it a powerful tool for broader data manipulation. As you work on your data analysis projects, consider the specific needs of your task to choose the most suitable approach. Remember, whether you opt for the simplicity of base R or the efficiency of dplyr, the goal is to streamline your workflow and enhance your data analysis capabilities.
