Python is gaining popularity in many fields of science. As a result, there are now many applications and libraries aimed specifically at psychological research.
For instance, there are Python packages for collecting data, doing basic statistics, and analyzing brain imaging data. In this post, I have collected some useful Python packages for researchers within the field of Psychology and Neuroscience. I have used and tested some of them, but others I have yet to try.
Table of Contents
- Experiment building applications/libraries
- Data analysis in Python
Experiment building applications/libraries
Expyriment is a Python library that makes programming Psychology experiments a lot easier than using plain Python. It contains classes and methods for creating fixation crosses and visual stimuli, collecting responses, etc. (see my video how-to, Expyriment Tutorial: Creating a Flanker Task using Python, on YouTube if you want to learn more).
Modular Psychophysics is a collection of tools that aims to implement a modular approach to Psychophysics. It enables us to write experiments in different languages. As far as I understand, you can use both MATLAB and R to control your experiments. That is, the timeline of the experiment can be carried out in another language (e.g., MATLAB).
However, it seems that the experiments themselves are created using Python. Your experiments can be run both locally and over networks. I have yet to test this out.
OpenSesame is a Python application. Using OpenSesame, one can create Psychology experiments. It has a graphical user interface (GUI) that allows users to drag and drop objects on a timeline. More advanced experimental designs can be implemented using inline Python scripts.
PsychoPy is also a Python application for creating Psychology experiments. It comes packed with a GUI, but the API can also be used for writing Python scripts. I have written a bit more thoroughly about PsychoPy: PsychoPy. In the latest version of PsychoPy, we can now also build online experiments!
I have written extensively on Expyriment, PsychoPy, OpenSesame, and some other libraries for creating experiments in my post Python apps and libraries for creating experiments.
Data analysis in Python
Psychology and Neuroscience
PsyUtils “The psyutils package is a collection of utility functions useful for generating visual stimuli and analysing the results of psychophysical experiments. It is a work in progress, and changes as I work. It includes various helper functions for dealing with images, including creating and applying various filters in the frequency domain.”
Psignifit is a toolbox that allows you to fit psychometric functions. Further, hypotheses about psychometric data can be tested. Psignifit allows for full Bayesian analysis of psychometric functions, including Bayesian model selection and goodness-of-fit evaluation, among other great things.
PyGaze is a Python library for eye-tracking data & experiments. It works as a wrapper around many other Python packages (e.g., PsychoPy, Tobii SDK). PyGaze can also, through plugins, be used from within OpenSesame.
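To give a feel for what fitting a psychometric function means, here is a minimal sketch. Note that it does not use Psignifit's own API: as a stand-in, it fits a logistic function by least squares with SciPy's `curve_fit`. The stimulus intensities, response proportions, and the function name `logistic` are all made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def logistic(x, threshold, slope):
    """Logistic psychometric function (an assumed form, not Psignifit's)."""
    return 1.0 / (1.0 + np.exp(-slope * (x - threshold)))


# Hypothetical data: stimulus intensity and proportion of "yes" responses
intensity = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
p_yes = np.array([0.02, 0.08, 0.25, 0.52, 0.78, 0.93, 0.98])

# Least-squares fit; p0 holds rough starting guesses for threshold and slope
params, _ = curve_fit(logistic, intensity, p_yes, p0=[3.0, 1.0])
threshold_hat, slope_hat = params

print(f"threshold = {threshold_hat:.2f}, slope = {slope_hat:.2f}")
```

A dedicated toolbox such as Psignifit goes further than this sketch, for example by modelling the number of trials behind each proportion and by providing Bayesian inference.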
General Recognition Theory
General Recognition Theory (GRT) is a fork of a MATLAB toolbox. GRT is “a multi-dimensional version of signal detection theory” (see link for more information).
MNE for EEG in Python
MNE is a library designed for processing electroencephalography (EEG) and magnetoencephalography (MEG) data. Collected data can be preprocessed and denoised. Time-frequency analysis and statistical testing can be carried out. MNE can also be used to apply some machine learning algorithms. Although MNE is mainly focused on EEG and MEG data, some of the statistical tests in this library can probably be used to analyze behavioral data (e.g., ANOVA).
Kabuki is a Python library for the effortless creation of hierarchical Bayesian models. It is built on the PyMC library. Using Kabuki, you get formatted summary statistics, posterior plots, and much more. There is also a function for generating data from a formulated model. It seems that there is an intention to add more commonly used statistical tests (e.g., Bayesian ANOVA) in the future.
PyMC3. Make sure you use PyMC3, as it’s the latest version of PyMC. See Probabilistic Programming in Python (Bayesian Data Analysis) for a great tutorial on how to carry out Bayesian statistics using Python and PyMC3.
NIPY: “Welcome to NIPY. We are a community of practice devoted to the use of the Python programming language in the analysis of neuroimaging data”. Here different packages for brain imaging data can be found.
General Python Packages
Although many of the above libraries can probably be used within other research fields, there are also more general libraries for pure statistics & visualization.
Descriptive and parametric statistics in Python
PyMVPA is a Python library for MultiVariate Pattern Analysis. It enables statistical learning analyses of big data. This may be suitable for brain imaging data.
Pandas is a Python library for fast, flexible and expressive data structures. Researchers and analysts with an R background will find Pandas data frame objects very similar to R's. Data can be manipulated and summarised, and some descriptive analysis can be carried out (e.g., see Descriptive Statistics Using Python for some examples using Pandas).
Recently, I’ve also written a Pandas DataFrame Tutorial. Make sure to check it out.
- Check this YouTube Video to learn how to install Pandas. Installing Python packages is quite easy using pip or conda.
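As a quick illustration of the kind of manipulation and descriptive analysis Pandas makes possible, here is a small sketch. The data are made up (hypothetical response times in milliseconds from a two-condition task), and the column names are my own choice.

```python
import pandas as pd

# Hypothetical response-time data (ms); values and column names are invented
df = pd.DataFrame({
    "subject": [1, 1, 2, 2, 3, 3],
    "condition": ["congruent", "incongruent"] * 3,
    "rt": [450.0, 520.0, 430.0, 510.0, 470.0, 560.0],
})

# Mean, standard deviation, and number of observations per condition
summary = df.groupby("condition")["rt"].agg(["mean", "std", "count"])
print(summary)
```

Grouping by a column and aggregating is probably the Pandas pattern you will use most often when summarising behavioral data.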
Statsmodels is a Python library for data exploration, estimation of statistical models, and statistical tests. An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics is available for different types of data and for each estimator.
Scikit-learn is an excellent Python package if you want to learn how to do machine learning. For instance, you can run random forest and extremely randomized trees (extra trees) analyses with Python and scikit-learn.
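Here is a minimal sketch of what that can look like. The data set is synthetic (generated with `make_classification`), and the settings, such as 100 trees and a 25% test split, are arbitrary choices for illustration rather than recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic classification data standing in for a real data set
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "extra trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out test set
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.2f}")
```

Both estimators share the same `fit` and `score` interface, which makes it easy to swap one model for another.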
Among many other methods, regression, generalized linear models, and non-parametric tests can be carried out using Statsmodels.
Here’s a Statsmodels tutorial for carrying out analysis of variance:
Pyvttbl is a library for creating Pivot tables. One can further process data and carry out statistical computations using Pyvttbl. Sadly, it seems like it is not updated anymore and is not compatible with other packages (e.g., Pandas).
If you are interested in how to carry out repeated measures ANOVAs in Python, Pyvttbl is a package that enables this kind of analysis (e.g., see Repeated Measures ANOVA using Python and Two-way ANOVA for Repeated Measures using Python).
Data Visualisation in Python
There are many Python libraries for visualization of data. Below are the ones I have worked with. Note that Pandas and Statsmodels also provide methods for plotting data.
All three libraries are compatible with Pandas which makes data manipulation and visualization very easy.
Matplotlib is a package for creating two-dimensional plots.
Seaborn is a library based on Matplotlib. Using seaborn you can create ready-to-publish graphics (e.g., see the Figure above for a boxplot on some response time data). I have also used Seaborn to visualize response time distributions. See this Python and Seaborn Data Visualization Tutorial. If you need to create a scatter plot with Seaborn in Python, make sure to check that post out.
Make sure to watch Python Data Visualization Tutorial: How to Create a Scatter Plot on YouTube and you will learn how to create scatter plots using Pandas and Seaborn. If you plan to use your plots elsewhere (e.g., in a scientific publication), you can use Matplotlib (savefig) to save Seaborn plots to PNG, PDF, and EPS, among other file types.
Ggplot is a visualization library based on the R package Ggplot2. That is, if you are familiar with R and Ggplot2, transitioning to Python and the package Ggplot will be easy.
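Saving a Seaborn plot with savefig can be sketched as follows. The response times and file names are made up, and I select the non-interactive "Agg" backend only so that the script also runs on a machine without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened

import pandas as pd
import seaborn as sns

# Hypothetical response times (ms) for two conditions
df = pd.DataFrame({
    "condition": ["congruent"] * 4 + ["incongruent"] * 4,
    "rt": [450.0, 430.0, 470.0, 440.0, 520.0, 510.0, 560.0, 530.0],
})

# Seaborn returns a Matplotlib Axes object, so the figure it belongs to
# can be saved with Matplotlib's savefig
ax = sns.boxplot(data=df, x="condition", y="rt")
ax.set(xlabel="Condition", ylabel="Response time (ms)")

fig = ax.get_figure()
fig.savefig("rt_boxplot.png", dpi=300, bbox_inches="tight")  # raster image
fig.savefig("rt_boxplot.pdf", bbox_inches="tight")  # vector format
```

The file extension decides the output format, so changing the name to end in .eps would give you an EPS file instead.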
Many of the libraries for analysis and visualization can be installed separately and, more or less, individually. I do, however, recommend that you install a scientific Python distribution. This way you will get all you need (and much more) with one click (e.g., Pandas, Matplotlib, NumPy, Statsmodels, Seaborn). I suggest you have a look at the distributions Anaconda or Python(x, y). Note that installing Python(x, y) will give you the Python IDE Spyder.
The last Python package for Psychology I am going to list is PsychoPy_ext. Although PsychoPy_ext may be considered a library for building experiments, it seems that the general aim behind the package is to make reproducible research easier. That is, analysis and plotting of the experiments can be carried out. I also think it is interesting that there seems to be a way to autorun experiments (a neat function that I know E-Prime has, for instance).
That was it. If you happen to know any other Python libraries or applications with a focus on Psychology (e.g., psychophysics) or on statistical methods that can be useful for psychologists, drop a line!