R from Python – an rpy2 tutorial

rpy2 Tutorial

Recently I found the Python module rpy2. This module offers a Python interface to R. That is, rpy2 enables us to use the power of R in Python!

Obviously, rpy2 requires that you have both R (version 3.2+) and Python (version 2.7 or 3.3) installed. There are pre-compiled binaries available for Linux and Windows (unsupported and unofficial, however). In this short tutorial, I will show you how to carry out a repeated measures ANOVA (rmANOVA) using the R packages ‘afex’ and ‘emmeans’, Python, and rpy2. The post is now updated and you will find a YouTube video going through the rpy2 examples found in this blog post. You will also find another YouTube video in which you will learn two methods to show R plots inline in Jupyter Notebooks. Make sure you check them out!

How to Install rpy2

We start the tutorial by installing rpy2. I installed rpy2 on Ubuntu 14.04 using pip:

sudo pip install rpy2

If you are a Windows user and/or have the Anaconda Python distribution installed, here’s how you can install rpy2:

How to Install Rpy2
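
Once the installation has finished, it can be worth confirming that the Python interpreter you are running can actually find rpy2. The snippet below is only a minimal sketch: the helper name rpy2_available is my own and not part of rpy2, and it only checks that the package is visible to Python, not that R itself works.

```python
import importlib.util


def rpy2_available():
    """Return True if the rpy2 package can be found by this interpreter."""
    return importlib.util.find_spec("rpy2") is not None


if rpy2_available():
    import rpy2
    # Fall back to "unknown" in case the version attribute is missing.
    print("rpy2 version:", getattr(rpy2, "__version__", "unknown"))
else:
    print("rpy2 is not installed for this Python interpreter")
```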

rpy2 Example: Calling R from Python

When we have a working installation of rpy2, we continue the tutorial by importing the modules that we are going to use. In this rpy2 example we are going to use ‘afex’ to do the within-subject ANOVA and ‘emmeans’ to do the follow-up analysis.

import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector

We also need to check whether the needed packages are installed. In the example below we call R from Python and use the R package utils to install any R packages that are missing. The code is now updated thanks to comments on my YouTube channel (the variable have_packages has been removed; thanks, Sergey).

packageNames = ('afex', 'emmeans')
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)

packnames_to_install = [x for x in packageNames if not rpackages.isinstalled(x)]

if len(packnames_to_install) > 0:
    utils.install_packages(StrVector(packnames_to_install))
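
The filtering step above can also be wrapped in a small function, which makes it easy to try out without touching R. This is a sketch only: missing_r_packages and the stand-in predicate are names I made up for illustration, and by default the function falls back to rpy2's own isinstalled check.

```python
def missing_r_packages(names, is_installed=None):
    """Return the R package names for which is_installed(name) is false."""
    if is_installed is None:
        # Default to rpy2's own check; imported lazily so that the function
        # can be exercised with a stand-in predicate when R is not available.
        from rpy2.robjects.packages import isinstalled
        is_installed = isinstalled
    return [name for name in names if not is_installed(name)]


# Stand-in predicate for illustration only: pretend that just 'afex' is installed.
pretend_installed = {'afex'}
to_install = missing_r_packages(('afex', 'emmeans'), pretend_installed.__contains__)
print(to_install)  # ['emmeans']
```

With a working rpy2 installation, calling missing_r_packages(packageNames) without the second argument should give the same list as packnames_to_install above.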

For this tutorial we borrow a data set from the package psych. In this case, we use the R function read.table to get the data.

# The R code is passed as a triple-quoted string so that it can span two lines.
data = robjects.r('''read.table(file =
       "http://personality-project.org/r/datasets/R.appendix3.data", header = TRUE)''')

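Since robjects.r evaluates a string of R code, the read.table call is just text that Python hands over to R. The sketch below builds that text in a helper so that the URL is easy to swap; read_table_call and load_r_data are hypothetical names of my own, and the helper assumes the URL contains no double quotes.

```python
def read_table_call(url, header=True):
    """Build the R expression that reads a whitespace-delimited file."""
    r_bool = "TRUE" if header else "FALSE"
    return 'read.table(file = "{}", header = {})'.format(url, r_bool)


def load_r_data(url):
    """Evaluate the read.table call in R and return the resulting R object."""
    # Imported here so that read_table_call can be used without R installed.
    import rpy2.robjects as robjects
    return robjects.r(read_table_call(url))


call = read_table_call("http://personality-project.org/r/datasets/R.appendix3.data")
print(call)
```
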
Repeated measures ANOVA using rpy2

In this part of the rpy2 tutorial we will carry out the actual analysis. In the example below we are actually using R in Python! More specifically, we import the R package needed to carry out our ANOVA for a within-subjects design. When this is done, we use the function aov_ez to conduct the analysis.

afex = rpackages.importr('afex') 
model = afex.aov_ez('Subject', 'Recall', data, within='Valence')
print(model)

The last line above prints the results. A main effect of Valence was found.

   Effect         df  MSE          F ges p.value
1 Valence 1.15, 4.60 9.34 189.11 *** .93  < .0001
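
The fractional degrees of freedom in the table suggest that a sphericity correction has been applied (as far as I know, afex uses the Greenhouse-Geisser correction by default). As a rough check, and assuming a design with 3 levels of Valence and 5 subjects, which the post does not state, the uncorrected degrees of freedom would be 2 and 8. The sketch below derives epsilon from the first reported value and shows that the same epsilon reproduces the second one.

```python
def corrected_df(n_levels, n_subjects, epsilon):
    """Scale the within-subject ANOVA degrees of freedom by epsilon."""
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_subjects - 1)
    return epsilon * df_effect, epsilon * df_error


# Assumed design (not stated in the post): 3 levels of Valence, 5 subjects.
n_levels, n_subjects = 3, 5
# Epsilon implied by the reported effect df of 1.15.
epsilon = 1.15 / (n_levels - 1)
df1, df2 = corrected_df(n_levels, n_subjects, epsilon)
print(round(df1, 2), round(df2, 2))  # 1.15 4.6
```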

Follow-up analysis

Typically we are interested in following up the main effect, and with rpy2, we can do that using the R package ‘emmeans’. First, we need to import the package, and then we do pairwise contrasts and adjust for familywise error using the Holm-Bonferroni correction.

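# robject_translations renames an R object whose Python name would otherwise clash.
emmeans = rpackages.importr('emmeans',
               robject_translations = {"recover.data.call": "recover_data_call1"})
# Pairwise contrasts between the levels of Valence, with Holm adjustment.
pairwise = emmeans.emmeans(model, "Valence", contr="pairwise", adjust="holm")
print(pairwise)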
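
To get a feeling for what the Holm-Bonferroni adjustment does to the p-values, here is a minimal pure-Python sketch of the step-down procedure. It is not how emmeans computes things internally, the function name holm_adjust is my own, and the raw p-values are made up for illustration.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise contrasts (not from the post).
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print([round(p, 4) for p in adjusted])  # [0.03, 0.06, 0.06]
```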

That was easy, right? Now we have learned how to use R in Python! Although rpy2 is relatively easy to use, I don’t think it will replace learning R. That is, you will have to know some R to call R from Python. However, if you are a Python programmer and want to use available R scripts, it might be useful, and hopefully this rpy2 tutorial has made it somewhat easier for you! Notably, I am not aware of any Python implementations of rmANOVA (except, maybe, for the linear mixed-effects approach). In fact, that is why I learned how to use rpy2 in the first place: to use Python, together with R, to conduct the analysis. The above code examples can be found in this Jupyter Notebook.


Rpy2 Video Tutorial: Displaying R plots inline in Jupyter Notebooks

In this video we will learn how to display R plots in Jupyter Notebooks. In these two rpy2 examples we are creating a barplot (using R graphics) and a scatterplot (using ggplot2)!



Update: In this rpy2 tutorial you learned how to do a repeated measures ANOVA with Python and R. I have now found a Python package that supports ANOVA for within-subjects designs natively in Python; see my tutorial Repeated Measures ANOVA using Python.

4 Comments

  1. Sam

    Hi, thanks for posting this. I think you may need ‘rpackages.importr’ in place of ‘importr’ when importing the afex and lsmeans packages. That’s what worked for me anyway.

    • Hey Sam,

      Thanks for the comment and correction,

      I will have a look and will update the code later,

      Erik

    • als

      Thanks Sam – yes, good example Erik as I stumble through learning this, but the suggestion above by Sam is what worked.

      • Oh. I’ll update the post. I’ve been busy finishing up my Ph.D. thesis so this blog has not been updated that much lately.
