One of the great things about programming is that you can automate boring tasks. For instance, as a student I often got schedules in the form of Word documents.

I prefer to have all my activities in a digital calendar, and I used to manually enter every class, seminar, and so on from a course into my calendar. One day I got tired of this and thought that I could probably automate the task using Python.

After some searching around on the Internet, I found the Python packages python-docx and iCalendar. In this post, I will show you how to use these two packages to create an iCalendar file that can be loaded into many of the available calendars (e.g., Google Calendar, Outlook Calendar).

Installing python-docx & iCalendar

Before learning how to scrape a table from a Word document, we need to install the Python packages python-docx and icalendar. Both of these packages can be installed using conda or pip.

How to install python-docx and icalendar using pip:

Here’s how to install the Python packages (i.e., python-docx, icalendar) with pip:

```bash
# Installing python-docx and iCalendar
pip install python-docx icalendar
```

Note that, if needed, pip can also install a specific version of a Python package. For example, to install an older version of python-docx you just type: pip install python-docx==Specific.Version (replacing “Specific.Version” with the version you want to install, of course).
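To check which versions ended up installed, something like the following sketch can help. It only uses the standard library (importlib.metadata, available from Python 3.8); the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing.

    installed_version is a hypothetical helper, not part of pip or conda.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report on the two packages used in this post
for name in ("python-docx", "icalendar"):
    print(name, installed_version(name) or "not installed")
```

If either package is reported as "not installed", run the install command again in the same environment you use to run your scripts.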

How to install python-docx and icalendar using conda:

Here’s how to install the packages with conda:

```bash
# Using conda to install the Python packages:
conda install -c conda-forge python-docx icalendar
```

In the example code, I used a table from a Word document containing 4 columns. It is a pretty simple example: the first column stores the date, the second the time, the third the room (location), and the last the activity of the event (e.g., lecture).


Extracting a table from a Word Document

In the first code chunk below, we start by importing the needed modules. Apart from Document from python-docx, and Calendar and Event from iCalendar, we are going to use datetime from the datetime module. datetime is used to store the date in a format that the iCalendar package “likes”.

```python
from datetime import datetime

from docx import Document
from icalendar import Calendar, Event

# Opening the Word document
document = Document('course_schedule.docx')

# Fetching the first table:
table = document.tables[0]

# Creating a list for the rows and a placeholder for the column names
data = []
keys = {}

# Looping through each row in the Word table:
for i, row in enumerate(table.rows):
    # Getting text from the cells
    text = (cell.text for cell in row.cells)

    # Getting the column names from the first row:
    if i == 0:
        keys = tuple(text)
        continue

    # Creating a dictionary that maps column names to cell texts
    row_data = dict(zip(keys, text))
```
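As a small aside, here is a sketch of what dict(zip(keys, text)) produces for a single row. The cell texts are made up for illustration (the room name “A101” is an assumption); the column names are the ones used in the rest of the code.

```python
# Hypothetical cell texts standing in for the header row and one data row
header_cells = ["Date", "Time", "Room", "Activity"]
row_cells = ["5/4", "9-12", "A101", "Lecture"]

# The header row becomes the keys, each later row becomes the values
keys = tuple(header_cells)
row_data = dict(zip(keys, row_cells))

print(row_data)
# {'Date': '5/4', 'Time': '9-12', 'Room': 'A101', 'Activity': 'Lecture'}
```

Because every row is paired with the same header, each event can later be accessed by column name (e.g., row_data['Room']) rather than by position.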

In the next chunk of code (inside the same for loop over table.rows) we split the date and time. We do this because of how the date and time are formatted in the example (“5/4” and “9-12”).

As previously mentioned, the date needs to be formatted in a certain way (e.g., using Python’s datetime). In the table from the Word document, some of the events are deadlines and, thus, have no set time. Therefore, we need to check whether we have a time to use. If not, we set the time to 17:00-17:01.

There is probably a better way to do this but this will do for now. The last line adds each event (as a Python dictionary) to our list containing all data.

```python
    # (still inside the for loop over table.rows)
    # Extracting date and time:
    e_date = row_data['Date'].split('/')  # day/month, e.g. "5/4"
    e_time = row_data['Time'].split('-')  # start-end hour, e.g. "9-12"

    if len(e_time) > 1:
        row_data['dtstart'] = datetime(2017, int(e_date[1]), int(e_date[0]),
                                       int(e_time[0]), 0, 0)
        row_data['dtend'] = datetime(2017, int(e_date[1]), int(e_date[0]),
                                     int(e_time[1]), 0, 0)
    else:
        # No time given (e.g., a deadline): use 17:00-17:01
        row_data['dtstart'] = datetime(2017, int(e_date[1]), int(e_date[0]),
                                       17, 0, 0)
        row_data['dtend'] = datetime(2017, int(e_date[1]), int(e_date[0]),
                                     17, 1, 0)

    # Adding the event to our list
    data.append(row_data)
```
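If you want to try the date and time handling on its own, without a Word document, the same logic can be wrapped in a small function. This is only a sketch: the name parse_event_times and the year parameter are made up for this example, and it assumes day/month dates and whole-hour times as in the example table.

```python
from datetime import datetime


def parse_event_times(date_text, time_text, year=2017):
    """Turn texts such as '5/4' and '9-12' into (start, end) datetimes.

    Hypothetical helper: assumes day/month dates and whole-hour times.
    Rows without a time range get the 17:00-17:01 placeholder.
    """
    day, month = (int(part) for part in date_text.split('/'))
    hours = time_text.split('-')

    if len(hours) > 1:
        start = datetime(year, month, day, int(hours[0]), 0, 0)
        end = datetime(year, month, day, int(hours[1]), 0, 0)
    else:
        start = datetime(year, month, day, 17, 0, 0)
        end = datetime(year, month, day, 17, 1, 0)
    return start, end


# A lecture on 5 April from 9 to 12
print(parse_event_times('5/4', '9-12'))

# A deadline on 12 May with an empty time cell
print(parse_event_times('12/5', ''))
```

Keeping the parsing in one place also makes it easier to change the year or the placeholder time later.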

How to use Python to create an iCalendar File

Now that we have a list of dictionaries containing our lectures/seminars (one dictionary for each event) we can use iCalendar to create the calendar file (.ics). This file, in turn, can be loaded into, for example, your Outlook calendar.

First, we create the calendar object and then continue by looping through our list of dictionaries. In the loop we create an event and add the information. In the example here we use the activity as both summary and description, but we could have had a summary of the activity and a more detailed description if we’d liked.

The crucial parts are ‘dtstart’ and ‘dtend’. These are the starting time and ending time of the event (e.g., a lecture). We continue to add the location (e.g., the room of the event) and add the event to our calendar. Finally, we create a file (‘course_schedule.ics’), write the calendar to the file, and close the file.

```python
cal = Calendar()

for row in data:
    # One calendar event per row in the schedule
    event = Event()
    event.add('summary', row['Activity'])
    event.add('dtstart', row['dtstart'])
    event.add('dtend', row['dtend'])
    event.add('description', row['Activity'])
    event.add('location', row['Room'])
    cal.add_component(event)

# Writing the calendar to a file (to_ical() returns bytes, hence 'wb')
f = open('course_schedule.ics', 'wb')
f.write(cal.to_ical())
f.close()
```
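To get a feeling for what ends up inside an .ics file, here is a hand-written, simplified sketch of how a single event looks as iCalendar text. It does not use the icalendar package, and the function format_event is made up for this example. A real file also wraps the events in a VCALENDAR block and needs escaping of special characters, which the package takes care of, so treat this as an illustration only.

```python
from datetime import datetime


def format_event(summary, location, start, end):
    """Format one event as simplified VEVENT lines.

    Hypothetical helper for illustration: no text escaping and no line
    folding, both of which a real iCalendar writer has to handle.
    """
    stamp = '%Y%m%dT%H%M%S'  # e.g. 20170405T090000
    return [
        'BEGIN:VEVENT',
        'SUMMARY:' + summary,
        'DTSTART:' + start.strftime(stamp),
        'DTEND:' + end.strftime(stamp),
        'LOCATION:' + location,
        'END:VEVENT',
    ]


lines = format_event('Lecture', 'A101',
                     datetime(2017, 4, 5, 9, 0, 0),
                     datetime(2017, 4, 5, 12, 0, 0))

# iCalendar content lines are separated by CRLF
print('\r\n'.join(lines))
```

Opening course_schedule.ics in a text editor should show blocks of a similar shape, one per event.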

Now we have our iCalendar file (course_schedule.ics) and can load it into our calendar software. I typically use Lightning (a calendar add-on for Thunderbird).

How to Open an iCalendar (.ics) File in Thunderbird

To open the iCalendar file we created using Python, go to File, Open, and Calendar File. Finally, select your iCalendar file:

After you have done that, your new schedule should be loaded into Lightning. Your schedule will be loaded as a separate calendar. As you can see in the image below, your lectures and computer labs will show up.

In this post, we learned how to use Python (python-docx) to extract a schedule from a table in a Word document (.docx). We used the data to create an iCalendar file that we can load into many calendar applications (e.g., Google Calendar, Lightning).
