In this Pandas tutorial we will learn how to work with Pandas dataframes. More specifically, we will learn how to read and write Excel (i.e., xlsx) and CSV files using Pandas.

We will also learn how to add a column to Pandas dataframe object, and how to remove a column. Finally, we will learn how to subset and group our dataframe. If you are not familiar with installing Python packages I have recorded a YouTube video explaining how to install Pandas. There’s also a playlist with videos towards the end of the post with videos of all topics covered in this post.

Inspired by my post for the JEPS Bulletin (Python programming in Psychology), where I try to show how Python can be used from collecting to analyzing and visualizing data, I have started to learn more data exploring techniques for Psychology experiments (e.g., response time and accuracy). Here are some methods, using Python, for visualization of distributed data that I have learned; kernel density estimation, cumulative distribution functions, delta plots, and conditional accuracy functions. These graphing methods let you explore your data in a way just looking at averages will not (e.g., Balota & Yap, 2011).

Required Python packages

I used the following Python packages; Pandas for data storing/manipulation, NumPy for some calculations, Seaborn for most of the plotting, and Matplotlib for some tweaking of the plots. Any script using these functions should import them:

In this post, I will describe the existing free Python applications and libraries for creating experiments. So far, I have only used PsychoPy but I plan to test most of them. At least the ones that seem to still be maintained. All applications and libraries are open-source which makes it possible to download the source code and add your own stuff to it.