Previously I have shown how to analyze data collected with within-subjects designs using rpy2 (i.e., R from within Python) and Pyvttbl. In this post I will extend that to a factorial ANOVA using Python (i.e., Pyvttbl). We are going to carry out a two-way ANOVA, but the same method will enable you to analyze any factorial design. I start by importing the Python libraries that are going to be used.

import numpy as np
import pyvttbl as pt
from collections import namedtuple

Numpy is going to be used to simulate data. I create a data set with one factor of two levels (P) and a second factor of three levels (Q). As in many of my examples, the dependent variable is response time (rt), and a list of lists (the variable ‘values’) holds the population means we assume: one inner list per level of Q, with one mean per level of P. I was a bit lazy when coming up with the data, so I named the independent variables ‘iv1’ and ‘iv2’. However, you could think of iv1 as two different memory tasks: verbal and spatial memory. iv2 could be different levels of distraction (no distraction, synthetic sounds, and speech, for instance).

Simulate data

N = 20        # number of subjects
P = [1,2]     # levels of iv1
Q = [1,2,3]   # levels of iv2

# Population means: one inner list per level of Q, one value per level of P
values = [[998,511], [1119,620], [1300,790]]

# Each subject appears once in every one of the len(P)*len(Q) cells
sub_id = [i+1 for i in range(N)]*(len(P)*len(Q))
mus = np.concatenate([np.repeat(value, N) for value in values]).tolist()
rt = np.random.normal(mus, scale=112.0, size=N*len(P)*len(Q)).tolist()
iv1 = np.concatenate([np.array([p]*N) for p in P]*len(Q)).tolist()
iv2 = np.concatenate([np.array([q]*(N*len(P))) for q in Q]).tolist()

# Insert one row per observation into a Pyvttbl DataFrame
Sub = namedtuple('Sub', ['Sub_id', 'rt','iv1', 'iv2'])
df = pt.DataFrame()

for idx in range(len(sub_id)):
    df.insert(Sub(sub_id[idx],rt[idx], iv1[idx],iv2[idx])._asdict())
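Before running the ANOVA it can be worth checking that the long-format layout is what we think it is. The block below is a minimal sketch of such a check, not part of the original analysis: it rebuilds the same design with the standard library only (random.gauss instead of Numpy, which is my substitution), and the names rows, obs_per_subject, and cells are made up for illustration. It counts the observations per subject and prints the mean of each cell.

```python
import random
from collections import Counter, defaultdict
from statistics import mean

random.seed(1)  # fixed seed so the check is repeatable

N = 20
P = [1, 2]
Q = [1, 2, 3]
values = [[998, 511], [1119, 620], [1300, 790]]  # values[q][p], as in the post

# One row per subject and cell, in the same order as the simulated lists:
# iv2 varies slowest, then iv1, then the subject.
rows = []
for q_idx, q in enumerate(Q):
    for p_idx, p in enumerate(P):
        for sub in range(1, N + 1):
            rt = random.gauss(values[q_idx][p_idx], 112.0)
            rows.append({'Sub_id': sub, 'rt': rt, 'iv1': p, 'iv2': q})

# Every subject should contribute exactly one observation to each cell.
obs_per_subject = Counter(row['Sub_id'] for row in rows)

# Group the response times by (iv1, iv2) cell.
cells = defaultdict(list)
for row in rows:
    cells[(row['iv1'], row['iv2'])].append(row['rt'])

for (p, q), rts in sorted(cells.items()):
    print('iv1=%d iv2=%d n=%d mean=%.1f' % (p, q, len(rts), mean(rts)))
```

Each cell should contain N observations, and each cell mean should land reasonably close to the population mean it was simulated from.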

I start with a boxplot using the method box_plot from Pyvttbl. As far as I can see, there is not much room for customizing the plot. We get this plot, and it is really not that beautiful.

df.box_plot('rt', factors=['iv1', 'iv2'])
[Figure: Boxplot created with Pyvttbl, before we do our Python two-way ANOVA]

Two-way ANOVA for within-subjects design in Python

Running the two-way ANOVA is simple: the first argument is the dependent variable, the second is the subject identifier, and the third lists the within-subject factors. In two previous posts I showed how to carry out one-way and two-way ANOVA for independent measures. One could, of course, combine these techniques to do a split-plot/mixed ANOVA by adding the argument ‘bfactors’ for the between-subject factor(s).

aov = df.anova('rt', sub='Sub_id', wfactors=['iv1', 'iv2'])
print(aov)
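To make it clearer what the sphericity-assumed rows of such a table contain, here is a rough sketch that computes the sums of squares and F-ratios of a two-way within-subjects design by hand. This is not Pyvttbl's code, and I make no claim about how Pyvttbl does it internally; it is the textbook partition for a balanced design with one observation per subject and cell. The function name rm_anova_2way and the nested-list layout y[subject][iv1 level][iv2 level] are my own choices, and it uses only the standard library (Python 3).

```python
import random
from statistics import mean


def rm_anova_2way(y):
    """Two-way repeated-measures ANOVA, sphericity assumed.

    y[s][i][j] is the score of subject s at level i of factor A and
    level j of factor B. Returns {effect: (SS, df, SS_err, df_err, F)}.
    """
    n, a, b = len(y), len(y[0]), len(y[0][0])
    S, A, B = range(n), range(a), range(b)

    # Grand mean, marginal means, and two-way means.
    grand = mean(y[s][i][j] for s in S for i in A for j in B)
    m_s = [mean(y[s][i][j] for i in A for j in B) for s in S]
    m_a = [mean(y[s][i][j] for s in S for j in B) for i in A]
    m_b = [mean(y[s][i][j] for s in S for i in A) for j in B]
    m_ab = [[mean(y[s][i][j] for s in S) for j in B] for i in A]
    m_as = [[mean(y[s][i]) for i in A] for s in S]
    m_bs = [[mean(y[s][i][j] for i in A) for j in B] for s in S]

    # Effect sums of squares.
    ss_a = n * b * sum((m_a[i] - grand) ** 2 for i in A)
    ss_b = n * a * sum((m_b[j] - grand) ** 2 for j in B)
    ss_ab = n * sum((m_ab[i][j] - m_a[i] - m_b[j] + grand) ** 2
                    for i in A for j in B)

    # Each effect is tested against its own effect-by-subject interaction.
    err_a = b * sum((m_as[s][i] - m_a[i] - m_s[s] + grand) ** 2
                    for s in S for i in A)
    err_b = a * sum((m_bs[s][j] - m_b[j] - m_s[s] + grand) ** 2
                    for s in S for j in B)
    err_ab = sum((y[s][i][j] - m_ab[i][j] - m_as[s][i] - m_bs[s][j]
                  + m_a[i] + m_b[j] + m_s[s] - grand) ** 2
                 for s in S for i in A for j in B)

    def row(ss, df, ss_err):
        df_err = df * (n - 1)
        return ss, df, ss_err, df_err, (ss / df) / (ss_err / df_err)

    return {'A': row(ss_a, a - 1, err_a),
            'B': row(ss_b, b - 1, err_b),
            'A*B': row(ss_ab, (a - 1) * (b - 1), err_ab)}


# Simulated data with the population means used in this post.
random.seed(1)
cell_means = [[998, 1119, 1300], [511, 620, 790]]  # [iv1 level][iv2 level]
y = [[[random.gauss(mu, 112.0) for mu in means] for means in cell_means]
     for _ in range(20)]

table = rm_anova_2way(y)
for effect in ('A', 'B', 'A*B'):
    ss, df, ss_err, df_err, f = table[effect]
    print('%-3s SS=%.1f df=%d  error SS=%.1f df=%d  F=%.2f'
          % (effect, ss, df, ss_err, df_err, f))
```

With 20 subjects, two levels of iv1, and three levels of iv2, the degrees of freedom come out as 1 and 19 for iv1, and 2 and 38 for both iv2 and the interaction. The numbers will differ from any particular Pyvttbl run because the data are random.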

The output from this is an ANOVA table. It contains all the metrics needed, plus some more, for both factors and the interaction: F-statistic, p-value, mean square errors, confidence intervals, and effect size (i.e., eta-squared). It also reports corrected degrees of freedom and mean square errors (e.g., Greenhouse-Geisser corrected). The output is at the end of the post. It is a bit hard to read. If you know any other way to do a repeated-measures ANOVA using Python, please let me know. Also, if you happen to know how to create nicer plots with Pyvttbl, I would like to know that too! Please leave a comment.
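The corrected rows can look mysterious at first, so here is a small sketch of the arithmetic behind them, using the iv2 numbers from the output table at the end of the post. The helper names corrected_df and lower_bound_eps are hypothetical, and I am assuming that the 'Box' rows use the lower-bound epsilon, which is what the printed value of 0.500 for a three-level factor suggests. The F-statistic itself is unchanged by the correction; only the degrees of freedom (and with them the mean squares and the p-value) change.

```python
def corrected_df(eps, df_effect, df_error):
    """Scale both degrees of freedom by a sphericity correction factor eps."""
    return eps * df_effect, eps * df_error


def lower_bound_eps(df_effect):
    """Smallest possible epsilon: 1 / (uncorrected effect df)."""
    return 1.0 / df_effect


# iv2 in the output table: df = 2 and 38, Greenhouse-Geisser eps printed as 0.550
gg_effect, gg_error = corrected_df(0.550, 2, 38)
print(gg_effect, gg_error)

# Lower bound for a 3-level factor: eps = 1 / 2
box_effect, box_error = corrected_df(lower_bound_eps(2), 2, 38)
print(box_effect, box_error)
```

The table prints 1.101 and 20.911 for the Greenhouse-Geisser rows; the small difference from the values computed here comes from epsilon being rounded to three decimals in the output.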

Update (2017-07-03): If your installed version of Numpy is greater than 1.11.x, you will run into a Float and NoneType error. One quick solution is to downgrade Numpy to 1.11.x. I created a post, Step-by-step guide for solving the Pyvttbl Float and NoneType error, in which I show how to install Numpy 1.11.x in a virtual environment. This way, you can run your ANOVAs without having to uninstall your current Numpy.
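If you are unsure whether your setup is affected, a quick check of the version string can tell you. The following is only a sketch: the helper newer_than_1_11 is a made-up name, and it looks at nothing but the major and minor parts of the version.

```python
def newer_than_1_11(version):
    """Return True if a Numpy version string is newer than the 1.11.x series."""
    major, minor = (int(part) for part in version.split('.')[:2])
    return (major, minor) > (1, 11)


# In practice you would pass numpy.__version__ here.
for version in ('1.11.3', '1.13.1'):
    print(version, newer_than_1_11(version))
```

If the function returns True for your installed version, the downgrade described above applies.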

Output ANOVA table

rt ~ iv1 * iv2

TESTS OF WITHIN SUBJECTS EFFECTS

Measure: rt
  Source                            Type III      eps      df         MS           F        Sig.      et2_G   Obs.     SE     95% CI    lambda    Obs.  
                                       SS                                                                                                         Power 
=======================================================================================================================================================
iv1           Sphericity Assumed   4419957.211       -        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1 
              Greenhouse-Geisser   4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1 
              Huynh-Feldt          4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1 
              Box                  4419957.211       1        1   4419957.211   324.248   2.128e-13   3.295     60   16.096   31.548   1023.941       1 
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv1)    Sphericity Assumed    258996.722       -       19     13631.406                                                                           
              Greenhouse-Geisser    258996.722       1       19     13631.406                                                                           
              Huynh-Feldt           258996.722       1       19     13631.406                                                                           
              Box                   258996.722       1       19     13631.406                                                                           
-------------------------------------------------------------------------------------------------------------------------------------------------------
iv2           Sphericity Assumed   5257766.564       -        2   2628883.282   206.008   4.023e-21   3.920     40   18.448   36.158    433.701       1 
              Greenhouse-Geisser   5257766.564   0.550    1.101   4777252.692   206.008   1.320e-12   3.920     40   18.448   36.158    433.701       1 
              Huynh-Feldt          5257766.564   0.550    1.101   4777252.692   206.008   1.320e-12   3.920     40   18.448   36.158    433.701       1 
              Box                  5257766.564   0.500        1   5257766.564   206.008   1.192e-11   3.920     40   18.448   36.158    433.701       1 
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv2)    Sphericity Assumed    484921.251       -       38     12761.086                                                                           
              Greenhouse-Geisser    484921.251   0.550   20.911     23189.668                                                                           
              Huynh-Feldt           484921.251   0.550   20.911     23189.668                                                                           
              Box                   484921.251   0.500       19     25522.171                                                                           
-------------------------------------------------------------------------------------------------------------------------------------------------------
iv1 *         Sphericity Assumed   1622027.598       -        2    811013.799    83.220   1.304e-14   1.209     20   22.799   44.687     87.600   1.000 
iv2           Greenhouse-Geisser   1622027.598   0.545    1.091   1486817.582    83.220   6.085e-09   1.209     20   22.799   44.687     87.600   1.000 
              Huynh-Feldt          1622027.598   0.545    1.091   1486817.582    83.220   6.085e-09   1.209     20   22.799   44.687     87.600   1.000 
              Box                  1622027.598   0.500        1   1622027.598    83.220   2.262e-08   1.209     20   22.799   44.687     87.600   1.000 
-------------------------------------------------------------------------------------------------------------------------------------------------------
Error(iv1 *   Sphericity Assumed    370327.311       -       38      9745.456                                                                           
iv2)          Greenhouse-Geisser    370327.311   0.545   20.728     17866.175                                                                           
              Huynh-Feldt           370327.311   0.545   20.728     17866.175                                                                           
              Box                   370327.311   0.500       19     19490.911                                                                           

TABLES OF ESTIMATED MARGINAL MEANS

Estimated Marginal Means for iv1
iv1    Mean     Std. Error   95% Lower Bound   95% Upper Bound 
==============================================================
1     983.755       43.162           899.157          1068.354 
2     599.917       21.432           557.909           641.925 

Estimated Marginal Means for iv2
iv2     Mean     Std. Error   95% Lower Bound   95% Upper Bound 
===============================================================
1      525.025       19.324           487.150           562.899 
2      814.197       49.416           717.342           911.053 
3     1036.286       43.789           950.459          1122.114 

Estimated Marginal Means for iv1 * iv2
iv1   iv2     Mean     Std. Error   95% Lower Bound   95% Upper Bound 
=====================================================================
1     1      553.522       24.212           506.066           600.978 
1     2     1103.488       28.411          1047.804          1159.173 
1     3     1294.256       19.773          1255.501          1333.011 
2     1      496.528       29.346           439.009           554.047 
2     2      524.906       20.207           485.301           564.512 
2     3      778.317       21.815           735.560           821.073 

