In this tutorial, you will learn 1) how to delete a row in R, and 2) how to remove multiple rows in R. Because R is a versatile programming language, there are several options when we need to delete a row. For example, we can use the subset() function if we want to drop a row based on a condition. If we prefer to work with the Tidyverse packages, we can use the filter() function from dplyr to remove (or select) rows based on values in a column (that is, conditionally, just as with subset()). Furthermore, we can use the slice() function from dplyr to remove rows based on their index.

In this post, we are going to remove rows in the R programming environment in the following ways:

  • Delete a row based on the index 
    • With Base R
    • With slice() from dplyr
  • Drop multiple rows 
  • Delete a row based on a condition
    • Using the subset() function
    • With the filter() function
  • Delete rows with missing values 
  • Remove duplicate rows

Note that in most of the examples in this post we will work with dataframes (or tibbles). However, where applicable, the same method will be used on a matrix as well. That is, you will learn how to delete a row from both a matrix and a dataframe. In the next section, we will cover what you need for this tutorial.

Requirements

You do not need much to follow this tutorial. Of course, if you want to use the example dataset, you need to download it and make sure that R's working directory is set to the folder where the data file is located (or you need to know the path to the file). Setting the working directory in R can be done using the setwd() function.

Moreover, if you want to use dplyr you either need to install Tidyverse or just the dplyr package. Installing R packages can be done using the install.packages() function. Installing Tidyverse is done using this command:

install.packages("tidyverse")

Worth noting here is that Tidyverse comes with a lot of great packages. For example, with dplyr you can select a column in R, and tibble (together with dplyr) lets you add a column to the data, among other things. That is, it is well worth installing Tidyverse.

Now, before we continue to the examples on how to delete a row in R, we will quickly answer some frequently asked questions. These may have brought you here and might be all you need:

How do you delete a row in R?

To delete a row in R you can use the - operator together with the row index. For example, if you want to remove the first row from a dataframe in R you can use the following code: dataFrame <- dataFrame[-1, ]. This code will remove the first row from the dataframe. Of course, if you need to drop multiple rows, you can combine their indexes using the c() function: dataFrame <- dataFrame[-c(1, 2, 3), ]

How do I remove a row from a matrix in R?

Removing a row from a matrix can be done in the same way you delete a row from a dataframe: mtx1 <- mtx[-1,]

In the next section, we are going to read the data from an Excel file in R using the readxl package. 

Example Data

Before we delete a row from a dataframe in R we need some data to practice on. The data is stored in an Excel file and you need to download it to your computer before you read it (see the Requirements section). Here is how you can read an xlsx file in R using the readxl package:

library(readxl)
dataf <- read_excel("example_sheets2.xlsx")
head(dataf)

The head() call prints the first 6 rows of the dataframe. If we want to know a bit more about the data we can also use the str() function, for example str(dataf).

The str() output shows that the data has 10 rows and 5 columns (this information can be obtained using the dim() function as well). Moreover, it shows the data type of the different variables (columns). In the next section, we will delete a single row in R using its index.
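If you do not have the Excel file at hand, a small stand-in dataframe is enough to follow along. The sketch below is an assumption, not the tutorial's dataset: only the column names Name, Mean, and Correct appear in the examples, so the ID and Group columns and all values are made up for illustration.

```r
# Hypothetical stand-in for example_sheets2.xlsx: 10 rows, 5 columns.
# Only Name, Mean, and Correct are used in the examples; ID, Group,
# and every value here are invented for illustration.
dataf <- data.frame(
  ID      = 1:10,
  Name    = c("Steve", "Pete", "Anna", "Lisa", "John",
              "Maria", "Tom", "Eva", "Karl", "Nina"),
  Group   = rep(c("A", "B"), times = 5),
  Mean    = c(12.5, 99, 10.1, 11.3, 9.8, 13.0, 10.7, 99, 11.9, 12.2),
  Correct = c(8, 7, 99, 9, 6, 10, 8, 7, 9, 99)
)

head(dataf)  # first 6 rows
str(dataf)   # column types and a preview of the values
dim(dataf)   # number of rows, then number of columns
```

Note that read_excel() returns a tibble, whereas data.frame() returns a plain dataframe; the row-deletion methods in this post work on both.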

How to Delete a Row in R by its index

Here is how you delete a row in base R using its index:

dataf[-2,]

As you can see in the output, we get a dataframe with 9 rows and 5 columns, so the second row has been successfully deleted. However, if we want the change to be permanent we need to assign the result to a new dataframe (or overwrite the old one). Here is how we would do this:

dataf <- dataf[-2,]
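The difference between printing the result and assigning it can be checked with nrow(). This is a minimal sketch on a made-up five-row dataframe (the values are assumptions, not the example file):

```r
# Made-up data for illustration only
dataf <- data.frame(
  Name = c("Steve", "Pete", "Anna", "Lisa", "John"),
  Mean = c(12.5, 99, 10.1, 11.3, 9.8)
)

dataf[-2, ]                 # prints the data without row 2 ...
rows_before <- nrow(dataf)  # ... but dataf itself still has all its rows

dataf <- dataf[-2, ]        # overwrite to make the deletion permanent
rows_after <- nrow(dataf)   # one row fewer than before
```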

In the next section, we will delete a row from a matrix.

How to Remove a Row from a Matrix in R

As previously mentioned, we are going to see how we can drop rows from a matrix whenever the same method that we used for a dataframe is applicable. First, here is how to create a matrix in R:

mtx <- matrix(seq(1,9), nrow = 3, ncol = 3)

In the code chunk above, we created a matrix using the matrix() function. Also, we used seq to create a sequence of numbers in R. Next, we can drop the first row using the same method as in the previous example:

mtx[-1, ]

Again, if we want to work with our data without the deleted row we need to assign this to a new (or overwrite the old) matrix. Note, it is also possible to convert a matrix to a dataframe in R. In the next example, we are going to use the slice() function from dplyr to delete a row by its index. 

How to Drop a Row using dplyr’s slice() function in R in two Steps

In this section, we are ready to use other functions to delete rows in R. Here are the two simple steps using the slice() function:

1. Loading dplyr

The first step is, of course, to load the package. This is done using the library() function:

library(dplyr)

2. Using the slice() function

We are now ready to remove a row using its index. Here is how we can do it using the slice() function:

slice(dataf, -1)

Notice how we used the dataframe as the first parameter and then the “-” sign followed by the index of the row we wanted to delete. In this example, we deleted the first row. Before continuing to the next example, it is worth pointing out that the slice() function cannot be used on a matrix.
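A minimal sketch of both points, assuming dplyr is installed and using made-up data: a negative index drops a row, and a matrix has to be converted (here with as.data.frame()) before slice() accepts it.

```r
library(dplyr)

# Made-up data for illustration only
dataf <- data.frame(
  Name = c("Steve", "Pete", "Anna"),
  Mean = c(12.5, 99, 10.1)
)

# A negative index drops that row; a positive one would keep only that row
without_first <- slice(dataf, -1)

# slice() does not accept a matrix, so convert it to a dataframe first
mtx <- matrix(seq(1, 9), nrow = 3, ncol = 3)
mtx_df <- as.data.frame(mtx)
mtx_without_first <- slice(mtx_df, -1)
```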

In the next two examples, we are going to learn how to remove multiple rows using base R and the slice() function.

How to Delete Rows using base R

In this example, we are going to delete multiple rows by their indexes. Of course, this is done in a similar way as deleting a single row. However, we need to use the c() function. Here is how we delete 4 rows by their index in R:

dataf[-c(1, 3, 5, 7),]

In the code chunk above, we used the brackets (“[]“) again. Within the brackets, we furthermore used the c() function. In this function, we added the indexes of the rows that we wanted to delete from the dataframe. Notice that we, again, used the “-” in front of the c(). If you are working with a matrix it is, again, possible to apply the exact same method to remove rows from the matrix:

mtx[-c(1, 3), ]
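One detail to watch for with matrices: when the deletion leaves a single row, base R simplifies the result to a plain vector unless drop = FALSE is added. A minimal sketch with the mtx matrix from earlier:

```r
mtx <- matrix(seq(1, 9), nrow = 3, ncol = 3)

two_rows    <- mtx[-1, ]                      # still a 2 x 3 matrix
one_row     <- mtx[-c(1, 3), ]                # simplified to a plain vector
one_row_mtx <- mtx[-c(1, 3), , drop = FALSE]  # stays a 1 x 3 matrix

is.matrix(one_row)  # check whether the result is still a matrix
dim(one_row_mtx)    # dimensions of the result kept as a matrix
```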

Again, remember to assign the result to a variable (new or existing); otherwise the rows are only removed in the printed output and your data stays unchanged.

In the next example, we are going to use the slice() function (dplyr) to delete rows from a dataframe in R. 

How to Delete Rows using the slice() function in R

In the previous example, we learned two steps to drop a row using the slice() function. Of course, the same two steps are valid when deleting multiple rows with dplyr. In fact, we will work a lot like the previous example. That is, we use the c() function inside the slice() function to drop multiple rows from the dataframe/tibble:

slice(dataf, -c(1, 3, 5))

Of course, we need to assign the result to a variable to keep the dataframe with the rows deleted. In the next two examples, we are going to get into more advanced methods of deleting a row in R. First, we will learn how to drop rows based on a condition using the subset() function. Second, we will do the same using the filter() function from the dplyr package. Of course, we will also look at examples deleting rows using multiple conditions.

How to Delete a Row in R based on a Condition

Sometimes we need to delete a row in our data based on a value in, for example, a column. For instance, we may know that a certain participant wants to be excluded from the data analysis. Another example could be that we want to do our analysis on a subset of our data. In any case, here is how we can conditionally delete a row using the subset() function:

subset(dataf, Name != "Steve")

In the code chunk above, we used the subset() function with the dataframe as the first parameter. Next is our condition. In this example, we are deleting the row where the name is “Steve”. Now, this may seem a bit backward: what we are actually doing is selecting all the rows where the name is not “Steve”. The result is the dataframe without that row.
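For comparison, the same row can be dropped with plain bracket indexing. The sketch below uses made-up data (the values are assumptions) and also shows one difference worth knowing: subset() drops rows where the condition evaluates to NA, whereas bracket indexing returns an all-NA row for them.

```r
# Made-up data for illustration only; one Name is missing on purpose
dataf <- data.frame(
  Name = c("Steve", "Pete", NA, "Lisa"),
  Mean = c(12.5, 99, 10.1, 11.3)
)

kept_subset  <- subset(dataf, Name != "Steve")  # an NA condition counts as FALSE
kept_bracket <- dataf[dataf$Name != "Steve", ]  # an NA condition yields an NA row

nrow(kept_subset)   # rows kept by subset()
nrow(kept_bracket)  # rows returned by bracket indexing, including the NA row
```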

Of course, most of the time we may want to delete rows based on multiple conditions. Do not worry, we will have a look at this but first, we are going to use the filter() function from dplyr to accomplish the same result as above.

Dropping a Row based on a Value in a Column using filter() from dplyr

Here is how we remove a row based on a condition using the filter() function:

filter(dataf, Name != "Pete")

In the above example code, we deleted the row with “Pete” in the “Name” column. Again, what we actually did was select all other rows except for this row.

Of course, we will most likely want to remove a row (or rows) based on multiple conditions, and we will now learn how to do this. First, we do it with the subset() function. Second, we use the filter() function from the dplyr package.

Deleting Rows based on Conditions in R

Now, as previously mentioned many times we might want to remove a row (or rows) based on many conditions. Here is an example code on how to use multiple conditions with the subset() function:

subset(dataf, Mean != 99 & Correct != 99)

In the example above, we used the and-operator (&) within the subset() function. As before, the dataframe (i.e., dataf) is the first parameter, and the second is where we add our conditions. Here we kept only the rows in which neither Mean nor Correct equals 99; put differently, a row is deleted as soon as either column contains 99. In the next example, we are going to see how we can use the filter() function from the package dplyr to carry out the same task.
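Because this logic is easy to get backwards, here is a small sketch on made-up data (the values are assumptions, with 99 standing in for a code to exclude). Requiring both columns to differ from 99 is the same as removing every row in which either column equals 99; to remove only the rows where both equal 99, the conditions are joined with | instead.

```r
# Made-up data for illustration only
dataf <- data.frame(
  Name    = c("Steve", "Pete", "Anna", "Lisa"),
  Mean    = c(12.5, 99, 10.1, 99),
  Correct = c(8, 7, 99, 99)
)

# Keeps rows where neither column is 99 (drops Pete, Anna, and Lisa)
neither_99 <- subset(dataf, Mean != 99 & Correct != 99)

# Equivalent form: negate "either column is 99"
same_result <- subset(dataf, !(Mean == 99 | Correct == 99))

# Drops only the rows where both columns are 99 (drops Lisa only)
not_both_99 <- subset(dataf, Mean != 99 | Correct != 99)
```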

Delete Rows based on Conditions using the filter() Function

Dropping rows based on  multiple conditions can, of course, also be done in a very similar way using the filter() function:

filter(dataf, Mean != 99 & Correct != 99)

In the code chunk above, we basically just changed the subset() function to the filter() function. However, working with dplyr and the Tidyverse packages we can make use of the %>% operator to pipe the data like this:

dataf <- dataf %>% filter(Mean != 99 & Correct != 99)
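Piping becomes more useful once several steps are chained. Here is a minimal sketch, assuming dplyr is installed and using made-up data. filter() also accepts several conditions separated by commas, which it combines with and, so the call below is equivalent to joining them with &.

```r
library(dplyr)

# Made-up data for illustration only
dataf <- data.frame(
  Name    = c("Steve", "Pete", "Anna", "Lisa"),
  Mean    = c(12.5, 99, 10.1, 11.3),
  Correct = c(8, 7, 99, 9)
)

cleaned <- dataf %>%
  filter(Mean != 99, Correct != 99) %>%  # comma-separated conditions are ANDed
  slice(-1)                              # then drop the first remaining row
```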


In the next section, you will learn how to delete rows with missing values. First, however, we need to add some missing values to the current dataframe. 

How to Remove All rows With Missing Values (NA) in R

Sometimes we need to remove the missing values from our data. In R, we can delete rows with missing values using the na.omit() function. First, however, we are going to add some missing values to our practice dataframe:

dataf_na <- dataf
dataf_na$Correct[c(4, 7)] <- NA
dataf_na$Mean[c(2, 3)] <- NA

In the code chunk above, we added missing values to rows 4 and 7 in the Correct column and to rows 2 and 3 in the Mean column. Next, we can use the na.omit() function to delete the rows that contain missing values. Here is how we remove the rows with NA from our dataframe:

dataf_na <- na.omit(dataf_na)
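To see what na.omit() removed, it helps to compare row counts and to know the base R alternative complete.cases(). The following sketch uses made-up data (an assumption, not the example file) with the same NA positions as above:

```r
# Made-up data for illustration only
dataf <- data.frame(
  Name    = c("Steve", "Pete", "Anna", "Lisa", "John",
              "Maria", "Tom", "Eva", "Karl", "Nina"),
  Mean    = c(12.5, 11.0, 10.1, 11.3, 9.8, 13.0, 10.7, 12.8, 11.9, 12.2),
  Correct = c(8, 7, 9, 9, 6, 10, 8, 7, 9, 8)
)

dataf_na <- dataf
dataf_na$Correct[c(4, 7)] <- NA
dataf_na$Mean[c(2, 3)] <- NA

no_na      <- na.omit(dataf_na)                     # drops rows 2, 3, 4, and 7
no_na_base <- dataf_na[complete.cases(dataf_na), ]  # same rows, via a logical index

# Restrict the check to one column: only rows with a missing Mean are dropped
mean_present <- dataf_na[!is.na(dataf_na$Mean), ]
```

The last line is useful when missing values in some columns are acceptable and only one column matters for the analysis.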

In the next section, we are going to have a quick look at how we can delete duplicated rows.

How to Remove Duplicate Rows in R

In this final example, we are going to remove duplicate rows in R:

dataf2 <- rbind(dataf, dataf[1,])
dataf2[!duplicated(dataf2), ]

As you can see in the code chunk above, we first created a duplicate row by appending a copy of the first row with rbind() and then used the duplicated() function. Note how we used the ! prior to the duplicated() function. We used this so that we keep only the rows that are not flagged as duplicates. Leaving it out would give us only the duplicated row instead. See the post about removing duplicates in R for more information about dropping duplicated rows from your data.
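The behaviour of duplicated() is easiest to see on a tiny example. In this sketch with made-up data, duplicated() flags only the second and later copies of a row, so negating it keeps the first occurrence; unique() gives the same rows in one call.

```r
# Made-up data for illustration only
dataf <- data.frame(
  Name = c("Steve", "Pete", "Anna"),
  Mean = c(12.5, 99, 10.1)
)

dataf2 <- rbind(dataf, dataf[1, ])  # append a copy of the first row

dup_flags <- duplicated(dataf2)     # TRUE only for the appended copy
deduped   <- dataf2[!dup_flags, ]   # keeps the first occurrence of each row
deduped2  <- unique(dataf2)         # same rows in a single call
```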

Summary

In this post, you have learned how to delete a row in R using both base functions and functions from the package dplyr. Moreover, you have learned how to carry out this task on both a dataframe and a matrix in R. Specifically, you have learned how to drop a row by its index and how to delete multiple rows using row indexes. In both cases, the deletion was done using base R as well as dplyr (slice()). As we sometimes want to delete a row based on a condition (or two), you have also learned how to drop a row based on a condition using the subset() and the filter() functions. The latter is, again, a dplyr function. Finally, we went through two more examples in which we first deleted rows with missing values and then rows with duplicated values.
