Month: February 2016

The current post will focus on how to carry out between-subjects ANOVA using Python. As mentioned in an earlier post (Repeated measures ANOVA with Python) ANOVAs are commonly used in Psychology. We start with some brief introduction on theory of ANOVA. If you are more interested in the four methods to carry out one-way ANOVA with Python click here.

In this post we will learn how to carry out ANOVA using SciPy, calculating it “by hand” in Python, using Statsmodels, and Pyvttbl.

Update: the Python package Pyvttbl is not maintained since a couple of year but there’s a new package called Pingouin. As a bonus, how to use this package is added in the end of the post.

Introduction to ANOVA

ANOVA is a means of comparing the ratio of systematic variance to unsystematic variance in an experimental study. Variance in the ANOVA is partitioned in to total variance, variance due to groups, and variance due to individual differences.

A common method in experimental psychology is within-subjects designs. One way to analysis the data collected using within-subjects designs are using repeated measures ANOVA. I recently wrote a post on how to conduct a repeated measures ANOVA using Python and rpy2. I wrote that post since the great Python package statsmodels do not include repeated measures ANOVA. However, the approach using rpy2 requires R statistical environment installed. Recently, I found a python library called pyvttbl whith which you can do within-subjects ANOVAs. Pyvttbl enables you to create multidimensional pivot tables, process data and carry out statistical tests. Using the method anova on pyvttbl’s DataFrame we can carry out repeated measures ANOVA using only Python.

Note, pyvttbl is no longer maintained and you should see new post using Pingouin for carrying out repeated Measures ANOVA: Repeated Measures ANOVA in R and Python using afex & pingouin