In this short R tutorial, you will learn how to extract days from datetime in R. The tutorial contains a couple of examples with R code and comments about the code. Extracting days from datetime is something you may need to do when you are working with a dataset and want to carry out a time series analysis that only includes the day, for example. It is, of course, possible to use similar methods to extract the time from datetime as well.
As R is a versatile statistical programming language, there are many ways to carry out this extraction task. For instance, we can extract days with the format() function. First, however, we need to convert the vector, or the column in a dataframe, to the POSIXct class with as.POSIXct().
Another option, which this post also covers, is to use a package such as lubridate. As previously mentioned, the examples involve extracting days from datetime stored in a vector and in a dataframe.
First, however, you will find out what you need to follow this R tutorial.
To follow this post, you obviously need a working R installation. Furthermore, if you want to work with lubridate, you need to install this package (or the tidyverse packages, as it’s part of this bundle). Installing lubridate is, of course, optional, but the tidyverse packages are very handy. For instance, they enable you to easily delete a column in R, calculate descriptive statistics, read and write xlsx files, add an empty column, and create dummy variables.
As you may already be aware, installing R packages is quite easy. First, open up R (or RStudio) and type
install.packages("lubridate"). As previously mentioned, lubridate is part of the tidyverse packages. This means that typing
install.packages("tidyverse") will install lubridate, among other nice R packages.
Now that you know how to install lubridate, or tidyverse, we are going to continue with the examples.
Example 1: Extract Day from a Vector Containing Dates
In the first example, we are going to get the day from a vector (c()) containing datetime. First, we will convert the vector using the as.POSIXct() function, and then we will use the format() function. Converting the vector first makes it easier to extract time, day, or year.
Here’s the general syntax for extracting the day from a vector containing datetime:
format(YourDates, format = "%d")
Obviously, YourDates should be a vector of class POSIXct containing the dates you want to extract the days from.
1) How to Convert a Vector to POSIXct Class:
First, if we have data in a vector (c()), we need to convert it so that we can use format(). Here’s how to use the as.POSIXct() function:
dates <- c("02/03/2014 10:41:00", "01/04/2015 13:40:00", "01/06/2016 09:40:00",
           "01/06/2017 09:41:00", "01/03/2018 02:40:00", "02/11/2019 03:40:00")
dates <- as.POSIXct(dates, format = "%m/%d/%Y %H:%M:%S")
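Briefly explained, we first used the vector as input to the as.POSIXct() function. Next, we used the format argument to describe how the dates are written. You may have noticed that the argument contains a number of different letters.
The letters are conversion specifications: %m is the month (01-12), %d is the day of the month (01-31), %Y is the four-digit year, %H is the hour (00-23), %M is the minute, and %S is the second. Note that you can change the order of the letters, and the separators between them, to match the format of your own data.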
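To make the letters in the format argument concrete, here is a minimal sketch that pulls each component out of a single datetime. The object names x and parts are only illustrative, and tz = "UTC" is an assumption added here so that the result does not depend on the time zone of your machine.

```r
# One datetime written as month/day/year; tz = "UTC" is an assumption
# made for this sketch so the output is the same on every machine.
x <- as.POSIXct("02/03/2014 10:41:00",
                format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Each conversion specification returns one component as a character string.
parts <- c(
  month  = format(x, "%m"),
  day    = format(x, "%d"),
  year   = format(x, "%Y"),
  hour   = format(x, "%H"),
  minute = format(x, "%M"),
  second = format(x, "%S")
)
print(parts)
```

Note that every component comes back as text (e.g., "03" rather than 3), because format() always returns character strings.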
2) Extracting Day from Datetime
Now, we are ready to use format() to split the days from the datetime. Here’s how to extract days:
format(dates, format = "%d")
As previously mentioned, the format of the date and time can be changed by rearranging the letters.
We can, of course, also pass the vector to as.POSIXct() directly. Here’s a working code example for extracting the day:
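As a small sketch of what rearranging the letters can look like, the example below formats the same datetimes in three different ways. The shortened vector, the object names (day_first, year_only, time_only), and tz = "UTC" are assumptions made for this illustration.

```r
# Two of the datetimes used in this post; tz = "UTC" is an assumption
# so that the result does not depend on the local time zone.
dates <- as.POSIXct(c("02/03/2014 10:41:00", "02/11/2019 03:40:00"),
                    format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Same letters, different order and separators
day_first <- format(dates, format = "%d-%m-%Y")

# A single letter extracts a single component
year_only <- format(dates, format = "%Y")

# Only the time part (hours and minutes)
time_only <- format(dates, format = "%H:%M")

print(day_first)
print(year_only)
print(time_only)
```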
dates <- as.POSIXct(c("02/03/2014 10:41:00", "01/04/2015 13:40:00", "01/06/2016 09:40:00",
                      "01/06/2017 09:41:00", "01/03/2018 02:40:00", "02/11/2019 03:40:00"),
                    format = "%m/%d/%Y %H:%M:%S")
format(dates, format = "%d")
That was how to get the day from a vector containing datetimes. You can also use a similar approach if you need to extract year from date in R. In the next example, we are going to work with a dataframe. Mostly, when working with real data, we read our data from a file. Therefore, the next section will cover how to read a file (.csv) and how to extract the day from the column containing dates and timestamps.
Example 2: How to Extract Day from a Column in a Dataframe
Most of the time, we read our data from a file (e.g., .csv or .xlsx). Therefore, this example is concerned with importing data from a CSV file. Here’s how to extract days from a column and add them to a new column:
library(httr)
library(readr)

# Download the CSV file to a temporary file; its path is stored in tf
GET('https://opendata.umea.se/explore/dataset/luftdata-vastra-esplanaden/download/?format=csv&timezone=Europe/Stockholm&lang=en&use_labels_for_header=true&csv_separator=%2C',
    write_disk(tf <- tempfile(fileext = ".csv")))

# Read the file and add the day as a new column
df <- read_csv(tf)
df$Day <- format(df$Time, format = "%d")
head(df[4:length(names(df))])
First, as we are getting the file from the web, we used GET() to temporarily store the file on the hard drive. As you may notice, we store the path to the temporary file in tf. Second, we read the file with read_csv().
That was it, now you have extracted days from the column Time and added them to the new column called “Day”.
Note, you need to have the readr package installed if you want to use read_csv(). Alternatively, you can use the read.csv() function to import the data from the CSV file.
As it’s possible to split day from datetime using other R packages, the next examples cover how to use lubridate to get days from datetime.
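If you go with read.csv(), keep in mind that it reads a datetime column as plain text, so the column has to be converted with as.POSIXct() before format() can extract the day. Here is a minimal sketch of that workflow. The small inline dataset (csv_text), the column names Time and Value, the object name df_demo, and tz = "UTC" are all made up for this illustration and are not taken from the file used above.

```r
# A tiny made-up CSV, supplied as text so the example runs without a file
csv_text <- "Time,Value
2021-03-05 10:41:00,1.2
2021-03-17 13:40:00,3.4
2021-04-01 09:40:00,5.6"

# read.csv() imports the Time column as character strings
df_demo <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Convert the column to POSIXct (tz = "UTC" is an assumption for this sketch)
df_demo$Time <- as.POSIXct(df_demo$Time,
                           format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Extract the day and add it as a new column
df_demo$Day <- format(df_demo$Time, format = "%d")
head(df_demo)
```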
Example 3: Separating Day from datetime in a Vector
Here’s how to split day from a vector containing datetime using as.POSIXct() and lubridate’s day():
library(lubridate)

dates <- c("02/03/2014 10:41:00", "01/04/2015 13:40:00", "01/06/2016 09:40:00",
           "01/06/2017 09:41:00", "01/03/2018 02:40:00", "02/11/2019 03:40:00")
dates <- as.POSIXct(dates, format = "%m/%d/%Y %H:%M:%S")

# day() returns the day of the month for each datetime
days <- day(dates)
days
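One practical difference between the two approaches is the type of the result: day() gives you numbers, whereas format() gives you character strings. The following sketch compares the two on a shortened version of the vector. The object names days_lubridate and days_format, as well as tz = "UTC", are assumptions made for this illustration.

```r
library(lubridate)

# tz = "UTC" is an assumption so the result does not depend on the local time zone
dates <- as.POSIXct(c("02/03/2014 10:41:00", "01/04/2015 13:40:00",
                      "02/11/2019 03:40:00"),
                    format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

days_lubridate <- day(dates)                  # numeric day of the month
days_format <- format(dates, format = "%d")   # character, with leading zeros

print(days_lubridate)
print(days_format)

# Converting the character result gives the same numbers as day()
all(days_lubridate == as.integer(days_format))
```

If you plan to do arithmetic or sorting on the days, the numeric result is usually more convenient. The character result is handy for labels.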
Now, here’s how to create a dataframe and separate day from datetime:
library(lubridate)

# dmy_hms() parses the strings as day/month/year hour:minute:second
dates <- dmy_hms(c("02/03/2014 10:41:00", "01/04/2015 13:40:00", "01/06/2016 09:40:00",
                   "01/06/2017 09:41:00", "01/03/2018 02:40:00", "02/11/2019 03:40:00"))
df_dates <- data.frame(date = format(dates, format = "%Y-%m-%d"),
                       Day = format(dates, format = "%d"))
In the example above, we created a dataframe with a date column and added the extracted day as a second column called Day.
Note that you can, of course, also use lubridate to extract time, day, or year from a column in a dataframe (as in Example 2). If you need to, you can also create a sequence in R containing numbers.
In this short R tutorial, you have learned how to extract day from datetime using as.POSIXct(), format(), and lubridate (i.e., day()). First, you learned how to convert a vector containing datetime to the POSIXct class. This was done to enable the use of format() to extract the day. Second, you learned how to import data, extract the day from a column containing datetime, and add it to a new column. Finally, you learned how to do the same using lubridate and format(), as well as how to create a new dataframe.
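Using lubridate on a dataframe column can be sketched as follows. The dataframe df_demo and its column names are made up for this illustration, and the strings are assumed to be written as month/day/year, which is why mdy_hms() is used here (it parses to UTC by default).

```r
library(lubridate)

# A small made-up dataframe with datetimes stored as text
df_demo <- data.frame(
  Time = c("02/03/2014 10:41:00", "01/04/2015 13:40:00", "02/11/2019 03:40:00"),
  stringsAsFactors = FALSE
)

# Parse the text column (assumed to be month/day/year) into datetimes
df_demo$Time <- mdy_hms(df_demo$Time)

# Add the day of the month as a new column
df_demo$Day <- day(df_demo$Time)
head(df_demo)
```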