Previously, I showed how to analyze data from within-subjects designs using rpy2 (i.e., R from within Python) and Pyvttbl. In this post I extend that approach to a factorial ANOVA using Python (i.e., Pyvttbl). We are going to carry out a two-way ANOVA, but the same method will let you analyze any factorial design. I start by importing the Python libraries that are going to be used.

```
import numpy as np
import pyvttbl as pt
from collections import namedtuple
```

Numpy is going to be used to simulate data. I create a data set with one factor of two levels (P) and a second factor of three levels (Q). As in many of my examples, the dependent variable is response time (rt). The variable ‘values’ is a list of lists holding the population means we assume: one inner list per level of Q, with one mean per level of P. I was a bit lazy when coming up with the data, so I named the independent variables ‘iv1’ and ‘iv2’. However, you could think of iv1 as two different memory tasks: verbal and spatial memory. iv2 could be different levels of distraction (no distraction, synthetic sounds, and speech, for instance).

## Simulate data

```
N = 20         # number of subjects
P = [1, 2]     # levels of iv1
Q = [1, 2, 3]  # levels of iv2

# Population means: one row per level of Q, one column per level of P
values = [[998, 511], [1119, 620], [1300, 790]]

# Every subject contributes one observation to each of the len(P)*len(Q) cells
sub_id = [i+1 for i in range(N)]*(len(P)*len(Q))
mus = np.concatenate([np.repeat(value, N) for value in values]).tolist()
rt = np.random.normal(mus, scale=112.0, size=N*len(P)*len(Q)).tolist()
iv1 = np.concatenate([np.array([p]*N) for p in P]*len(Q)).tolist()
iv2 = np.concatenate([np.array([q]*(N*len(P))) for q in Q]).tolist()

Sub = namedtuple('Sub', ['Sub_id', 'rt', 'iv1', 'iv2'])
df = pt.DataFrame()

# Insert the data row by row (long format: one row per observation)
for idx in range(len(sub_id)):
    df.insert(Sub(sub_id[idx], rt[idx], iv1[idx], iv2[idx])._asdict())
```
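Before running the analysis it can be worth checking that the simulated design is balanced. The following is a minimal sketch that rebuilds the same design columns with plain Python lists (no Numpy or Pyvttbl needed) and counts the observations per cell. The names `cell_counts` and `cell_mu` are my own and only for illustration.

```python
from collections import Counter

N = 20
P = [1, 2]
Q = [1, 2, 3]
values = [[998, 511], [1119, 620], [1300, 790]]

# Same column layout as in the simulation, built with list comprehensions
sub_id = [i + 1 for i in range(N)] * (len(P) * len(Q))
iv1 = [p for _ in Q for p in P for _ in range(N)]
iv2 = [q for q in Q for _ in range(N * len(P))]
mus = [mu for row in values for mu in row for _ in range(N)]

# Number of observations in each (iv1, iv2) cell
cell_counts = Counter(zip(iv1, iv2))

# Population mean(s) assigned to each cell; a correct layout gives exactly one
cell_mu = {}
for a, b, mu in zip(iv1, iv2, mus):
    cell_mu.setdefault((a, b), set()).add(mu)

# Each subject should appear exactly once in every cell
subject_cells = Counter(zip(sub_id, iv1, iv2))

print(sorted(cell_counts.items()))
print(sorted(cell_mu.items()))
```

Every cell should contain N observations and a single population mean, and every subject should occur once per cell, which is what a fully within-subjects design requires.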

I start with a boxplot using the box_plot method from Pyvttbl. As far as I can see, there is not much room for customizing the plot. We get this plot, and it is really not that beautiful.

```
df.box_plot('rt', factors=['iv1', 'iv2'])
```

Running the two-way ANOVA is simple: the first argument is the dependent variable, the second the subject identifier, and then the within-subject factors. In two previous posts I showed how to carry out one-way and two-way ANOVA for independent measures. One could, of course, combine these techniques to do a split-plot/mixed ANOVA by adding the argument ‘bfactors’ for the between-subject factor(s).

```
aov = df.anova('rt', sub='Sub_id', wfactors=['iv1', 'iv2'])
print(aov)
```
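For intuition about what a main-effect F value in a repeated measures ANOVA represents, here is a minimal sketch of the textbook F ratio (sphericity assumed) for one within-subjects factor, in plain Python. The function name `within_factor_f` and the toy numbers are made up for illustration, and this is not Pyvttbl's implementation.

```python
def within_factor_f(y, reps):
    """F ratio for one within-subjects factor, sphericity assumed.

    y[s][i] is the mean response of subject s at level i of the factor,
    already averaged over the `reps` levels of the other factor.
    """
    n = len(y)       # number of subjects
    k = len(y[0])    # number of levels of the factor
    grand = sum(sum(row) for row in y) / float(n * k)
    level_means = [sum(row[i] for row in y) / float(n) for i in range(k)]
    subj_means = [sum(row) / float(k) for row in y]

    # Variation between the level means
    ss_effect = n * reps * sum((m - grand) ** 2 for m in level_means)
    # Factor-by-subject interaction, used as the error term
    ss_error = reps * sum(
        (y[s][i] - subj_means[s] - level_means[i] + grand) ** 2
        for s in range(n) for i in range(k)
    )
    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_effect / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


# Toy data: 3 subjects, 2 levels, each value a mean over 2 levels of the other factor
toy = [[2.0, 5.0], [3.0, 4.0], [4.0, 6.0]]
print(within_factor_f(toy, reps=2))
```

With 20 subjects, the same degrees-of-freedom formulas give 1 and 19 for a two-level factor and 2 and 38 for a three-level factor.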

The output one gets from this is an ANOVA table. It contains all the metrics needed, plus some more: F-statistic, p-value, mean squares, standard errors, confidence intervals, and effect size (i.e., generalized eta-squared, the et2_G column) for both factors and the interaction. It also reports corrected degrees of freedom and mean squares (e.g., Greenhouse-Geisser corrected). The output is at the end of the post. It is a bit hard to read. If you know any other way to do a repeated measures ANOVA using Python, please let me know. Also, if you happen to know how to create nicer plots with Pyvttbl, I would like to know how! Please leave a comment.
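The corrected rows in the table scale the sphericity-assumed degrees of freedom by an epsilon value. As a minimal sketch, assuming the ‘Box’ rows use the lower-bound epsilon 1/(k-1) for a factor with k levels (the reported 0.500 for the three-level factor is consistent with that), the arithmetic looks like this. The helper names are hypothetical and not part of Pyvttbl.

```python
def lower_bound_epsilon(levels):
    """Smallest possible epsilon for a within-subjects factor with `levels` levels."""
    return 1.0 / (levels - 1)


def corrected_df(epsilon, levels, n_subjects):
    """Scale the sphericity-assumed effect and error df by epsilon."""
    df_effect = levels - 1
    df_error = (levels - 1) * (n_subjects - 1)
    return epsilon * df_effect, epsilon * df_error


eps = lower_bound_epsilon(3)
print(eps)                        # 0.5
print(corrected_df(eps, 3, 20))   # (1.0, 19.0)
```

For a three-level factor and 20 subjects this gives 1 and 19 degrees of freedom, matching the ‘Box’ rows for iv2. The Greenhouse-Geisser rows follow the same scaling with an epsilon estimated from the data.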

Update (2017-07-03): If your installed version of Numpy is newer than 1.11.x, you will run into a Float and NoneType error. One quick solution is to downgrade Numpy to 1.11.x. I created a post, Step-by-step guide for solving the Pyvttbl Float and NoneType error, in which I show how to install Numpy 1.11.x in a virtual environment. This way, you can run your ANOVAs without having to uninstall your current Numpy.
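A quick way to see whether you are affected is to compare the installed version against the 1.11.x series. This is only a sketch: `numpy_too_new` is a hypothetical helper, and the cutoff is taken from the update note above rather than from anything Pyvttbl documents.

```python
def numpy_too_new(version, last_ok=(1, 11)):
    """Return True if `version` (e.g. "1.13.1") is newer than the 1.11.x series."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > last_ok


print(numpy_too_new("1.11.3"))  # False
print(numpy_too_new("1.13.1"))  # True
```

In a real session you would pass in `np.__version__` after `import numpy as np`.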

## Output ANOVA table

```
rt ~ iv1 * iv2

TESTS OF WITHIN SUBJECTS EFFECTS

Measure: rt
Source                            Type III      eps      df         MS           F        Sig.      et2_G   Obs.     SE     95% CI    lambda    Obs.
                                     SS                                                                                                         Power
=======================================================================================================================================================
iv1           Sphericity Assumed   4419957.211       -        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1
              Greenhouse-Geisser   4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1
              Huynh-Feldt          4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1
              Box                  4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv1)    Sphericity Assumed    258996.722       -       19     13631.406
              Greenhouse-Geisser    258996.722       1       19     13631.406
              Huynh-Feldt           258996.722       1       19     13631.406
              Box                   258996.722       1       19     13631.406
-------------------------------------------------------------------------------------------------------------------------------------------------------
iv2           Sphericity Assumed   5257766.564       -        2   2628883.282   206.008   4.023e-21   3.920     40   18.448   36.158    433.701       1
              Greenhouse-Geisser   5257766.564   0.550    1.101   4777252.692   206.008   1.320e-12   3.920     40   18.448   36.158    433.701       1
              Huynh-Feldt          5257766.564   0.550    1.101   4777252.692   206.008   1.320e-12   3.920     40   18.448   36.158    433.701       1
              Box                  5257766.564   0.500        1   5257766.564   206.008   1.192e-11   3.920     40   18.448   36.158    433.701       1
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv2)    Sphericity Assumed    484921.251       -       38     12761.086
              Greenhouse-Geisser    484921.251   0.550   20.911     23189.668
              Huynh-Feldt           484921.251   0.550   20.911     23189.668
              Box                   484921.251   0.500       19     25522.171
-------------------------------------------------------------------------------------------------------------------------------------------------------
iv1 *         Sphericity Assumed   1622027.598       -        2    811013.799    83.220   1.304e-14   1.209     20   22.799   44.687     87.600   1.000
iv2           Greenhouse-Geisser   1622027.598   0.545    1.091   1486817.582    83.220   6.085e-09   1.209     20   22.799   44.687     87.600   1.000
              Huynh-Feldt          1622027.598   0.545    1.091   1486817.582    83.220   6.085e-09   1.209     20   22.799   44.687     87.600   1.000
              Box                  1622027.598   0.500        1   1622027.598    83.220   2.262e-08   1.209     20   22.799   44.687     87.600   1.000
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv1 *   Sphericity Assumed    370327.311       -       38      9745.456
iv2)          Greenhouse-Geisser    370327.311   0.545   20.728     17866.175
              Huynh-Feldt           370327.311   0.545   20.728     17866.175
              Box                   370327.311   0.500       19     19490.911

TABLES OF ESTIMATED MARGINAL MEANS

Estimated Marginal Means for iv1
iv1    Mean     Std. Error   95% Lower Bound   95% Upper Bound
==============================================================
1     983.755       43.162           899.157          1068.354
2     599.917       21.432           557.909           641.925

Estimated Marginal Means for iv2
iv2     Mean     Std. Error   95% Lower Bound   95% Upper Bound
===============================================================
1      525.025       19.324           487.150           562.899
2      814.197       49.416           717.342           911.053
3     1036.286       43.789           950.459          1122.114

Estimated Marginal Means for iv1 * iv2
iv1   iv2     Mean     Std. Error   95% Lower Bound   95% Upper Bound
=====================================================================
1     1      553.522       24.212           506.066           600.978
1     2     1103.488       28.411          1047.804          1159.173
1     3     1294.256       19.773          1255.501          1333.011
2     1      496.528       29.346           439.009           554.047
2     2      524.906       20.207           485.301           564.512
2     3      778.317       21.815           735.560           821.073

```

1. Veronica

Hi there. Thanks for your excellent blog. I’m trying to run a two-way repeated measures ANOVA using pyvttbl as you explain. I use Python 2.7 and installed pyvttbl via pip. I was able to import pyvttbl and create the dataframe just fine. However, when I run the test, I get this error: TypeError: unsupported operand type(s) for +: ‘float’ and ‘NoneType’. Can you help? Thanks in advance!

• Hey Veronica,

Have you solved the problem? When I wrote this post, this did not happen. However, I tried to run the script again and got the same problem. I am not sure what is going on here, but I will try to find out in case you have not solved it.

Please let me know if and how you solved the problem.

Erik

• Thanks for leaving your comment here, Damien!

I will update the posts later and refer to your solution.

Erik

2. Shengjie

Hi Erik,
I met the same problem when I ran my analysis. In my study, the design is a 2x3x3 repeated measure ANOVA. And my code is straightforward, import pyvttbl as pt
df = pt.DataFrame()