How to Remove/Delete a Row in R - Rows with NA, Conditions, Duplicated

In this tutorial, you will learn 1) how to delete a row in R and 2) how to remove multiple rows in R. As is often the case in R, there are many different options when we need to delete a row. For example, we can use the subset() function to drop a row based on a condition. If we prefer to work with the Tidyverse packages, we can use the filter() function from dplyr to remove (or select) rows based on values in a column (that is, conditionally, in the same way as with subset()). Furthermore, we can also use the slice() function from dplyr to remove rows based on the index.

In this post, we are going to remove rows in the R programming environment in the following ways:

Delete a row/case based on the index
- With Base R
- With slice() from dplyr

Drop multiple rows
Delete a row based on a condition
- Using the subset() function
- With the filter() function
Delete rows with missing values (NA)
Remove duplicate rows

Notice that in most of the examples in this post, we will work with dataframes (or tibbles). However, where the same method also applies to a matrix, we will show that as well. That is, you will learn how to delete a row from both a matrix and a dataframe. The next section covers what you need to know to follow this tutorial.

Requirements

You do not need much to follow this tutorial. If you want to use the example dataset, you need to download it and make sure that the working directory in R is set to the folder where the data file is located (or you need to know the path to the file). Setting the working directory in R can be done using the setwd() function.

Moreover, if you want to use dplyr or tidyr, you need to install either the whole Tidyverse or just the dplyr (or tidyr) package. Installing R packages can be done using the install.packages() function. Installing Tidyverse is done using this command:

install.packages("tidyverse")

Before installing packages, check your R version and update R if needed. Worth noting here is that the Tidyverse comes with a lot of great packages. For example, with dplyr you can select a column in R, and tibble (as well as dplyr) lets you add a column to your data, among other things. That is, it is well worth installing the Tidyverse.

Before we continue to the examples of deleting a row in R, we will quickly answer some frequently asked questions. These may have brought you here and might be all you need:

How do you delete a row in R?

To delete a row in R, you can use the - operator together with the row index. For example, if you want to remove the first row from a dataframe in R, you can use the following code: dataFrame <- dataFrame[-1, ]. This code will remove the first row from the dataframe. Of course, if you need to drop multiple rows, you can combine their indexes using the c() function: dataFrame <- dataFrame[-c(1, 2, 3), ]

How do I remove a row from a matrix in R?

Removing a row from a matrix can be done in the same way you delete a row from a dataframe: mtx1 <- mtx[-1,]

How to remove rows with NA in R?

To remove rows with NA in R, we can use the na.omit() and drop_na() (tidyr) functions. For example, na.omit(YourDataframe) will drop all rows that contain NA.

In the next section, we will read the data from an Excel file in R using the readxl package.

Example Data to Practice Delete Cases in R

Before we delete a row from a dataframe in R, we need some data to practice on. The data is stored in an Excel file; you must download it to your computer before reading it (see the Requirements section). Here is how you can read an xlsx file in R using the readxl package:

library(readxl)
dataf <- read_excel("example_sheets2.xlsx")
head(dataf)

The head() function prints the first six rows of the dataframe. If we want to know a bit more about the data, we can also call str(dataf).

The output of str() shows that the data has ten rows and five columns (this information can be obtained using the dim() function as well). Moreover, we can see the data type of the different variables (columns). In the next section, we will delete a case (i.e., a single row) in R using its index.
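If you do not have the Excel file at hand, a small stand-in dataframe is enough to follow along. The sketch below is an assumption, not the original data: only the column names Name, Mean, and Correct appear in this post, while the columns Id and Group and every value are made up for illustration.

```r
# Hypothetical stand-in for example_sheets2.xlsx (ten rows, five columns).
# Only Name, Mean, and Correct are column names used in this post;
# Id, Group, and all values are invented for illustration.
dataf <- data.frame(
  Id      = 1:10,
  Name    = c("Steve", "Pete", "Anna", "Lisa", "John",
              "Maria", "Erik", "Sara", "Tom", "Eva"),
  Group   = rep(c("A", "B"), times = 5),
  Mean    = c(0.52, 0.61, 99, 0.47, 0.70, 0.55, 0.66, 99, 0.49, 0.58),
  Correct = c(18, 20, 17, 99, 22, 19, 21, 16, 99, 20),
  stringsAsFactors = FALSE
)

dim(dataf)   # number of rows and columns
str(dataf)   # column types and a preview of the values
head(dataf)  # first six rows
```

Note that read_excel() returns a tibble, whereas this stand-in is a plain data.frame; the row-deletion code in this post works on both.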

How to Delete a Row in R by its index

Here is how you delete a row in base R using its index:

# Delete case
dataf[-2, ]

As you can see in the output, we get a nine-by-five dataframe and have successfully deleted the second row. However, if we want the change to be permanent, we must assign the result to a new dataframe (or overwrite the old one). Here is how we would do this:

dataf <- dataf[-2, ]
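One detail worth knowing when you delete rows by index from a plain base R data.frame is that the remaining rows keep their original row names, so the printed row labels no longer match the row positions. A minimal sketch, using a small made-up dataframe (toy and its values are hypothetical):

```r
# Hypothetical toy data, not the example file from this post
toy <- data.frame(Name = c("A", "B", "C", "D"),
                  Score = c(10, 20, 30, 40),
                  stringsAsFactors = FALSE)

toy_dropped <- toy[-2, ]   # delete the second row
rownames(toy_dropped)      # "1" "3" "4": the old labels are kept

rownames(toy_dropped) <- NULL  # reset the labels to 1, 2, 3
rownames(toy_dropped)
```

Tibbles, such as the one returned by read_excel(), do not keep custom row labels, so this mainly matters for plain data.frame objects.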

In the next section, we will delete a row from a matrix. For more details on removing specific rows see:

Remove Specific Row in R: How to & Examples with dplyr

R: Remove Rows with Certain Values using dplyr

How to Remove a Row from a Matrix in R

As previously mentioned, we will see how to drop rows from a matrix whenever the same method we used for a dataframe is applicable. First, here is how to create a matrix in R:

mtx <- matrix(seq(1, 9), nrow = 3,
              ncol = 3)

In the code chunk above, we created a matrix using the matrix() function. We also used the seq() function to create a sequence of numbers in R. Next, we can drop the first row using the same method as in the previous example:

# Drop case from matrix
mtx[-1, ]

Again, if we want to work with our data without the deleted row, we must assign this to a new (or overwrite the old) matrix. Note that it is also possible to convert a matrix to a dataframe in R. In the following example, we will use the slice() function from dplyr to delete a row by its index.
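A caveat when deleting rows from a matrix: if only one row remains, base R simplifies the result to a plain vector unless you pass drop = FALSE. The sketch below recreates the three-by-three matrix from this section to show the difference.

```r
mtx <- matrix(seq(1, 9), nrow = 3, ncol = 3)

one_row_vec <- mtx[-c(1, 3), ]                # simplified to a vector
one_row_mtx <- mtx[-c(1, 3), , drop = FALSE]  # stays a 1 x 3 matrix

is.matrix(one_row_vec)  # FALSE
dim(one_row_mtx)        # 1 3
```

Using drop = FALSE is a reasonable default whenever later code expects a matrix, for example when it calls nrow() or indexes with two subscripts.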

How to Drop a Row using dplyr’s slice() function in R in two Steps

In this section, we are ready to use other functions to delete rows in R. Here are the two simple steps using the slice() function:

1. Loading dplyr

The first step is, of course, to load the package. This is done using the library() function:

library(dplyr)

2. Using the slice() function

We are now ready to remove a row using its index. Here is how we can do it using the slice() function:

slice(dataf, -1)

Notice how we used the dataframe as the first parameter, followed by the “-” sign and the index of the row we wanted to delete. In this example, we deleted the first row. Before continuing to the next example, it is worth pointing out that the slice() function cannot be used on a matrix; it works on dataframes and tibbles.
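To make the negative index and the matrix limitation concrete, here is a small sketch on made-up data (toy and its values are hypothetical). It also shows that n(), which dplyr evaluates to the number of rows, can be used to drop the last row, and that converting a matrix with as.data.frame() is one way to make slice() applicable.

```r
library(dplyr)

# Hypothetical toy data, not the example file from this post
toy <- data.frame(Name = c("A", "B", "C", "D"),
                  Score = c(10, 20, 30, 40),
                  stringsAsFactors = FALSE)

without_first <- slice(toy, -1)    # drop the first row
without_last  <- slice(toy, -n())  # drop the last row

# slice() has no method for matrices, so this call fails ...
mtx <- matrix(seq(1, 9), nrow = 3, ncol = 3)
failed <- inherits(try(slice(mtx, -1), silent = TRUE), "try-error")
failed  # TRUE

# ... but it works after converting the matrix to a dataframe
from_matrix <- slice(as.data.frame(mtx), -1)
```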

In the following two examples, we will learn how to remove multiple rows using base R and the slice() function.


How to Delete Rows using base R

In this example, we will delete multiple rows by their indexes. Of course, this is done similarly to deleting a single row. However, we need to use the c() function. Here is how we delete four rows by their index in R:

dataf[-c(1, 3, 5, 7), ]

In the code chunk above, we used the brackets (“[]”) again. Within the brackets, we used the c() function, to which we passed the indexes of the rows we wanted to delete from the dataframe. Notice that we again used the “-” in front of the c() function. If you are working with a matrix, it is, again, possible to apply the same method to remove rows from the matrix:

# The example matrix has only three rows, so we drop rows 1 and 3
mtx[-c(1, 3), ]

Again, remember to assign the result to a variable (new or existing); otherwise, the rows are only removed in the printed output and your data is left unchanged.
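A pitfall with negative indexes, shown here as a hedged sketch on made-up data: if the vector of row numbers to delete happens to be empty (for example, because which() found no match), then indexing with the negated vector returns zero rows instead of leaving the data untouched. Indexing with a logical vector avoids this.

```r
# Hypothetical toy data, not the example file from this post
toy <- data.frame(Name = c("A", "B", "C"),
                  Score = c(10, 20, 30),
                  stringsAsFactors = FALSE)

idx <- which(toy$Score > 100)  # no row matches, so idx is integer(0)

nrow(toy[-idx, ])              # 0: every row is gone

# Safer: keep the rows whose position is NOT in idx
keep <- !(seq_len(nrow(toy)) %in% idx)
nrow(toy[keep, ])              # 3: nothing was deleted
```

This only matters when the indexes are computed rather than typed by hand, but in that case it is an easy mistake to miss.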

In the following example, we will use the slice() function (dplyr) to delete rows from a dataframe in R.

How to Delete Rows using the slice() function in R

In the previous example, we learned two steps to drop a row using the slice() function. Of course, the same two steps are valid when deleting multiple rows with dplyr. The code is much like the previous example. That is, we use the c() function inside the slice() function to drop multiple rows from the dataframe/tibble:

slice(dataf, -c(1, 3, 5))

Of course, we need to assign the result to a variable to keep the dataframe with the rows deleted. In the next two examples, we will get into more advanced methods of deleting a row in R. First, we will learn how to drop rows based on a condition using the subset() function. Second, we will use the filter() function from the dplyr package. Of course, we will also look at examples of deleting rows using multiple conditions.

How to Delete a Row in R based on a Condition

Sometimes we need to delete a row in our data based on a value in, for example, a column. For instance, we may know that a participant wants to be excluded from the data analysis. Another example could be that we want to do our analysis on a subset of our data. In any case, here is how we can conditionally delete a row using the subset() function:

subset(dataf, Name != "Steve")

In the code chunk above, we used the subset() function with the dataframe as the first parameter. Next is our condition. In this example, we are deleting the row where the name is “Steve”. This may seem a bit backward, because we are actually selecting all the rows where the name is not “Steve”. The result is the dataframe without the row for Steve.

That is it! We deleted a case (i.e., Steve) from our dataframe. Of course, most of the time, we may want to delete rows based on multiple conditions. Do not worry; we will look at this, but first, we will use the filter() function from dplyr to accomplish the same result as above.
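subset() is not the only base R way to drop rows by condition; logical indexing with brackets does the same, with one difference worth knowing: subset() silently drops rows where the condition evaluates to NA, whereas bracket indexing returns a row filled with NA for them. A small sketch on made-up data (toy and its values are hypothetical):

```r
# Hypothetical toy data with a missing name in the third row
toy <- data.frame(Name = c("Steve", "Pete", NA),
                  Score = c(10, 20, 30),
                  stringsAsFactors = FALSE)

by_subset  <- subset(toy, Name != "Steve")  # the NA row is dropped as well
by_bracket <- toy[toy$Name != "Steve", ]    # the NA row becomes an all-NA row

nrow(by_subset)   # 1
nrow(by_bracket)  # 2

# which() ignores NA, so this keeps only the rows where the test is TRUE
by_which <- toy[which(toy$Name != "Steve"), ]
```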

Dropping a Row based on a Value in a Column using filter() from dplyr

Here is how we remove a row based on a condition using the filter() function:

filter(dataf, Name != "Pete")

In the above example code, we deleted the row with “Pete” in the “Name” column. Again, we did this by selecting all rows except for this one.

Of course, we most likely want to remove a row (or rows) based on multiple conditions, and we will learn how to do this next. First, we will do it with the subset() function. Second, we will use the filter() function from the dplyr package.

Deleting Rows based on Conditions in R

Now, as previously mentioned, we might want to remove a row (or rows) based on several conditions. Here is an example of how to use multiple conditions with the subset() function:

subset(dataf, Mean != 99 & Correct != 99)

In the example above, we used the and operator within the subset() function. As before, the dataframe (i.e., dataf) is the first parameter, and the second is where we add our conditions. As you can see in the code chunk above, & is the and-operator. Here we selected all rows that do not contain the value 99 in either the Mean or the Correct column. In the following example, we are going to see how we can use the filter() function from the package dplyr to carry out the same task.
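It helps to spell out what the & means in terms of deleted rows: a row is kept only if both tests are TRUE, so a row is deleted as soon as one of the two columns contains 99. If you instead want to delete only the rows where both columns are 99, negate the combined condition. A sketch with made-up values (the toy data is hypothetical; only the column names Mean and Correct come from this post):

```r
# Hypothetical toy data
toy <- data.frame(Mean = c(1, 99, 3, 99), Correct = c(5, 6, 99, 99))

# Deletes rows where Mean is 99 OR Correct is 99 (rows 2, 3, and 4)
either <- subset(toy, Mean != 99 & Correct != 99)

# Deletes only rows where Mean is 99 AND Correct is 99 (row 4)
both <- subset(toy, !(Mean == 99 & Correct == 99))

nrow(either)  # 1
nrow(both)    # 3
```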

Delete Rows based on Conditions using the filter() Function

Dropping rows based on multiple conditions can, of course, also be done in a very similar way using the filter() function:

filter(dataf, Mean != 99 & Correct != 99)

In the code chunk above, we just changed the subset() function to the filter() function. However, working with dplyr and the Tidyverse packages, we can make use of the %>% operator to pipe the data like this:

dataf <- dataf %>%
         filter(Mean != 99 & Correct != 99)
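The same pipe pattern extends to dropping rows that match any of several values. The sketch below is one way to do it, assuming made-up data (toy and its values are hypothetical): !Name %in% c(...) keeps every row whose name is not in the list, and separating conditions with a comma inside filter() combines them like &.

```r
library(dplyr)

# Hypothetical toy data, not the example file from this post
toy <- data.frame(Name = c("Steve", "Pete", "Anna", "Lisa"),
                  Mean = c(1, 2, 99, 4),
                  stringsAsFactors = FALSE)

toy_kept <- toy %>%
  filter(!Name %in% c("Steve", "Pete"),  # drop these two participants
         Mean != 99)                     # and drop rows where Mean is 99

toy_kept
```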


In the next section, you will learn how to delete rows with missing values. First, however, we need to add some missing values to the current dataframe.


How to Remove All rows With Missing Values (NA) in R

Sometimes we need to remove the missing values from our data. In R, we can delete rows with missing values using the na.omit() function. First, however, we are going to add some missing values to our practice dataframe:

dataf_na <- dataf
dataf_na$Correct[c(4, 7)] <- NA
dataf_na$Mean[c(2, 3)] <- NA

In the code chunk above, we added missing values to rows 4 and 7 in the Correct column and rows 2 and 3 in the Mean column. Next, we can use the na.omit() function to delete the rows that contain missing values. Here is how we remove the rows with NA from our dataframe:

# Store the result under a new name so that dataf_na keeps its NA values
dataf_omitted <- na.omit(dataf_na)
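na.omit() removes a row if any of its columns has a missing value. If only some columns matter for your analysis, complete.cases() applied to those columns is a base R alternative. A sketch on made-up data (toy and its values are hypothetical; only the column names mirror this post):

```r
# Hypothetical toy data with one NA in Mean and one NA in Correct
toy <- data.frame(Name = c("A", "B", "C", "D"),
                  Mean = c(1, NA, 3, 4),
                  Correct = c(5, 6, NA, 8),
                  stringsAsFactors = FALSE)

all_complete  <- na.omit(toy)                      # drops rows 2 and 3
mean_complete <- toy[complete.cases(toy$Mean), ]   # drops only row 2

nrow(all_complete)   # 2
nrow(mean_complete)  # 3
```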

Now that you know how to remove rows with NA in base R, we will look at how to do the same with the tidyr package before moving on to deleting duplicated rows.

Remove rows with NA in R using tidyr

In this example, we are going to remove all rows with NA in the R dataframe using the tidyr package:

library(tidyr)
dataf_no_na <- dataf_na %>% drop_na()
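drop_na() also accepts column names, in which case only missing values in those columns cause a row to be dropped. A minimal sketch on made-up data (toy and its values are hypothetical; only the column names mirror this post):

```r
library(tidyr)

# Hypothetical toy data with one NA in Mean and one NA in Correct
toy <- data.frame(Name = c("A", "B", "C", "D"),
                  Mean = c(1, NA, 3, 4),
                  Correct = c(5, 6, NA, 8),
                  stringsAsFactors = FALSE)

any_na_dropped     <- drop_na(toy)           # drops rows 2 and 3
correct_na_dropped <- drop_na(toy, Correct)  # drops only row 3

nrow(any_na_dropped)      # 2
nrow(correct_na_dropped)  # 3
```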

How to Remove Duplicate Rows in R

In this final example, we are going to remove duplicate rows in R:

dataf2 <- rbind(dataf, dataf[1, ])
dataf2[!duplicated(dataf2), ]

As you can see in the code chunk above, we first appended a copy of the first row to the dataframe, so that it contains a duplicated row, and then used the duplicated() function. Note how we used the ! before the duplicated() function. We used this so that we keep only the rows that are not duplicates. Leaving it out would give us only the duplicated row. See the post about removing duplicates in R for more information about dropping duplicated rows from your data.
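duplicated() can also be applied to a single column, which is useful when rows count as duplicates as soon as one identifier repeats. The sketch below uses made-up data (toy and its values are hypothetical) and also shows unique() as a shorthand and the fromLast argument for keeping the last occurrence instead of the first.

```r
# Hypothetical toy data: row 3 is an exact copy of row 1
toy <- data.frame(Name = c("A", "B", "A", "A"),
                  Score = c(10, 20, 10, 30),
                  stringsAsFactors = FALSE)

no_exact_dups <- toy[!duplicated(toy), ]       # drops row 3
same_result   <- unique(toy)                   # shorthand for the line above

first_per_name <- toy[!duplicated(toy$Name), ]                   # rows 1 and 2
last_per_name  <- toy[!duplicated(toy$Name, fromLast = TRUE), ]  # rows 2 and 4
```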

Summary

In this post, you have learned how to delete a row in R using both base functions and functions from the package dplyr. Moreover, you have learned how to carry out this task on both a dataframe and a matrix in R. Specifically, you have learned how to drop a row by its index and how to delete multiple rows using row indexes. In both cases, the deletion was done using base R and dplyr (slice()). As we sometimes want to delete a row based on a condition (or two), you have also learned how to drop a row based on a condition using the subset() and the filter() functions. The latter, again, is a dplyr function. Finally, we went through two more examples: we first deleted rows with missing values and then duplicated rows.
