Previously I have shown how to analyze data collected with within-subjects designs using rpy2 (i.e., R from within Python) and Pyvttbl. In this post I will extend that approach to a factorial ANOVA using Python (i.e., Pyvttbl). We are going to carry out a two-way ANOVA, but the same method will enable you to analyze any factorial design. I start by importing the Python libraries that are going to be used.
In an earlier post I showed four different techniques that enable one-way analysis of variance (ANOVA) using Python. In this post we are going to learn how to do two-way ANOVA for independent measures using Python.
An important advantage of the two-way ANOVA is that it is more efficient than the one-way ANOVA. There are two assignable sources of variation (supp and dose in our example), which helps to reduce error variation and thereby makes the design more efficient. Two-way (factorial) ANOVA can be used, for instance, to compare the means of populations that differ in two ways. It can also be used to analyse the mean responses in an experiment with two factors. Unlike one-way ANOVA, it enables us to test the effect of two factors at the same time. One can also test for independence of the factors (i.e., whether they interact), provided there is more than one observation in each cell. The only restriction is that the number of observations in each cell has to be equal (there is no such restriction in the case of one-way ANOVA).
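To make the two sources of variation concrete, here is a minimal sketch of how the sums of squares in a balanced two-way design can be computed by hand with NumPy. The data are synthetic and made up purely for illustration (the cell means, the noise level, and the 2 x 3 x 4 layout are my assumptions, not values from a real supp and dose data set), and this is not necessarily how any particular ANOVA library does it internally.

```python
import numpy as np

# Synthetic, balanced data (assumption, for illustration only):
# 2 levels of supp x 3 levels of dose, 4 observations per cell.
rng = np.random.default_rng(0)
cell_means = np.array([[8.0, 17.0, 26.0],
                       [13.0, 23.0, 26.0]])
y = cell_means[:, :, None] + rng.normal(0, 3, size=(2, 3, 4))

a, b, n = y.shape                 # levels of supp, levels of dose, cell size
grand = y.mean()                  # grand mean
mean_a = y.mean(axis=(1, 2))      # marginal means for supp
mean_b = y.mean(axis=(0, 2))      # marginal means for dose
mean_ab = y.mean(axis=2)          # cell means

# Sums of squares for the two main effects, the interaction, and error
ss_a = b * n * ((mean_a - grand) ** 2).sum()
ss_b = a * n * ((mean_b - grand) ** 2).sum()
ss_ab = n * ((mean_ab - mean_a[:, None] - mean_b[None, :] + grand) ** 2).sum()
ss_error = ((y - mean_ab[:, :, None]) ** 2).sum()
ss_total = ((y - grand) ** 2).sum()

# Degrees of freedom
df_a, df_b = a - 1, b - 1
df_ab = (a - 1) * (b - 1)
df_error = a * b * (n - 1)

# F ratios: each effect's mean square divided by the error mean square
ms_error = ss_error / df_error
f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error

print(f"supp:      SS={ss_a:.2f}, df={df_a}, F={f_a:.2f}")
print(f"dose:      SS={ss_b:.2f}, df={df_b}, F={f_b:.2f}")
print(f"supp:dose: SS={ss_ab:.2f}, df={df_ab}, F={f_ab:.2f}")
print(f"error:     SS={ss_error:.2f}, df={df_error}")
```

Because the design is balanced, the four sums of squares add up to the total sum of squares. The p-values would come from comparing each F ratio against the F distribution with the corresponding degrees of freedom, which is left out here.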
The current post will focus on how to carry out between-subjects ANOVA using Python. As mentioned in an earlier post (Repeated measures ANOVA with Python), ANOVAs are commonly used in Psychology.
We start with a brief introduction to the theory of ANOVA. If you are more interested in the four methods to carry out one-way ANOVA with Python click here. ANOVA is a means of comparing the ratio of systematic variance to unsystematic variance in an experimental study. Variance in the ANOVA is partitioned into total variance, variance due to groups, and variance due to individual differences.
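The partitioning described above can be sketched in a few lines of plain Python. The three groups and their scores are made up for illustration (the group names are hypothetical), and the sketch only computes the sums of squares and the F ratio, not the p-value.

```python
from statistics import mean

# Made-up scores for three independent groups (illustration only)
groups = {
    "control": [4.0, 5.0, 6.0],
    "treatment_a": [6.0, 7.0, 8.0],
    "treatment_b": [9.0, 10.0, 11.0],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Total variation: every score against the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# Systematic variation: group means against the grand mean
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# Unsystematic variation: scores against their own group mean
ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

# F is the ratio of systematic to unsystematic variance
f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(f"SS total={ss_total:.2f}, between={ss_between:.2f}, within={ss_within:.2f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

The between-groups and within-groups sums of squares add up to the total, which is exactly the partition that the F ratio is built on.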
Within-subjects designs are common in experimental psychology. One way to analyze the data collected using within-subjects designs is repeated measures ANOVA. I recently wrote a post on how to conduct a repeated measures ANOVA using Python and rpy2. I wrote that post since the great Python package statsmodels does not include repeated measures ANOVA. However, the approach using rpy2 requires the R statistical environment to be installed. Recently, I found a Python library called pyvttbl with which you can do within-subjects ANOVAs. Pyvttbl enables you to create multidimensional pivot tables, process data, and carry out statistical tests. Using the method anova on pyvttbl’s DataFrame we can carry out repeated measures ANOVA using only Python.
Descriptive Statistics
After data collection, most Psychology researchers use different ways to summarise the data. In this tutorial we will learn how to do descriptive…
In this post we will learn how to reverse a pandas dataframe. We start by swapping the first column with the last column and continue by reversing the column order completely. After we have learned how to do that, we continue by reversing the order of the rows. That is, a pandas dataframe can be reversed such that the last column becomes the first or such that the last row becomes the first.
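As a quick sketch of these steps, the snippet below uses a small made-up dataframe (the column names and values are placeholders, not data from the post). It shows one common way to do each step with slicing; other approaches exist.

```python
import pandas as pd

# Small made-up dataframe (column names are placeholders)
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [4, 5, 6],
    "c": [7, 8, 9],
    "d": [10, 11, 12],
})

# 1. Swap the first column with the last column
cols = list(df.columns)
cols[0], cols[-1] = cols[-1], cols[0]
swapped = df[cols]

# 2. Reverse the column order completely
reversed_cols = df[df.columns[::-1]]

# 3. Reverse the row order (the original index labels are kept)
reversed_rows = df.iloc[::-1]

# Optionally renumber the rows from 0 after reversing
reversed_rows_reset = df.iloc[::-1].reset_index(drop=True)

print(swapped)
print(reversed_cols)
print(reversed_rows)
```

Note that `iloc[::-1]` keeps the original index labels, so the first row of the result is labelled 2 here. Use `reset_index(drop=True)` if you want a fresh 0-based index.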
Spyder is the best Python IDE that I have tested so far for doing data analysis, but also for plain programming. In this post I will start by briefly describing the IDE. After that description, the text continues with a discussion of my favorite features. You will also find out how to install Spyder on Ubuntu 14.04, and at the end of the post you will find a comparison of Rodeo (a newer, more RStudio-like IDE) and Spyder.
When I started programming in Python I used IDLE, which is the IDE that you get with your installation of Python (e.g., on Windows computers). I actually used IDLE for some time, until I started to learn R and found the RStudio IDE. I thought that RStudio was great (and it still is!). However, after learning R and RStudio I started to look for a better Python IDE.
The aim of this post is to show you why you, as a psychology student or researcher (or any other kind of researcher or student), should learn to program. The post is structured as follows. First I discuss why you should learn programming and then give some examples of when programming skills are useful. I continue by suggesting two programming languages that I think all Psychology students and researchers should learn.
Good resources for learning R as a Psychologist are hard to find. By that I mean that there are so many great sites and blogs around the internet for learning R that it may be hard to find learning resources that target Psychology researchers.
Recently I wrote about 4 good R books targeted at Psychology students and researchers (i.e., R books for Psychologists). There are, of course, other good resources for Psychological researchers who want to learn R programming.
Therefore, this post will list some of the best blogs and sites to learn R. The post will be divided into two categories: general and Psychology-focused R sites and blogs. For those who are not familiar with R, I will start with a brief introduction to what R is (if you know R already, click here to skip to the links).
What is R?
R is a free and open source programming language and environment. Data analysis in R is carried out by writing scripts and functions. Finally, R is a complete, interactive, and object-oriented language.
In the R statistical environment you are able to carry out a variety of statistical and graphical techniques. For instance, linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, and many more can be carried out using both frequentist and Bayesian paradigms.
One of the main things that I like about R is the broad and helpful community. This also means that there are many good resources for learning the language.