In this tutorial, we will learn the basics of installing, using, and updating packages in Python. First, we will learn how to install Python packages, then how to use them, and finally how to update them when needed. More specifically, we are going to learn how to install and upgrade packages using pip, conda, and Anaconda Navigator.
Python programming related stuff
In the posts in this category, you will find Python scripts. Python is said to be one of the easiest programming languages to learn. Learning one language will also make it easier to learn another, more advanced one. As a Bachelor's student in the cognitive science programme, I got to take Python courses. However, it was not until my Ph.D. years that I realized how useful it was to know some programming.
For a psychology researcher, Python might be ideal since it is relatively easy to learn and there is a huge Python community to get help from. In this category you can find, for instance, posts on how to build experiments using free and open-source tools such as PsychoPy, OpenSesame, Expyriment, and PyEPL.
In this post, we are going to learn how to read Stata (.dta) files in Python.
As previously described (in the read .sav files in Python post), Python is a general-purpose language that can also be used for data analysis and data visualization. One example of data visualization can be found in this post.
One potential downside, however, is that Python is not really user-friendly for data storage. This has, of course, led to our data often being stored using Excel, SPSS, SAS, or similar software. See, for instance, the posts about reading .sav and SAS files in Python.
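As a minimal sketch of reading a .dta file, the example below uses `pandas.read_stata`. To keep it self-contained, it first writes a small, made-up dataframe to a temporary Stata file with `DataFrame.to_stata`; the column names and file name are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical example data; the column names are made up for illustration.
df = pd.DataFrame({"group": ["a", "b", "a"], "score": [1.5, 2.0, 3.5]})

with tempfile.TemporaryDirectory() as tmp:
    dta_path = Path(tmp) / "example.dta"
    # Write a small .dta file so the example has something to read.
    df.to_stata(dta_path, write_index=False)
    # Read the Stata file back into a Pandas dataframe.
    stata_df = pd.read_stata(dta_path)

print(stata_df.head())
```

With a real dataset, only the `pd.read_stata` call is needed, pointed at the path of the existing .dta file.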
When a program becomes very long and complex, it is convenient to divide it into subroutines, each of which implements a specific task. However, subroutines cannot be executed independently, but only at the request of the main program, which is responsible for coordinating the use of subroutines.
In this post, we introduce a generalization of the concept of subroutines, known as coroutines: just like subroutines, coroutines compute a single computational step, but unlike subroutines, there is no main program to coordinate the results. The coroutines link themselves together to form a pipeline without any supervising function responsible for calling them in a particular order.
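One common way to express this idea in Python is with generator-based coroutines that receive values through `send()`. The sketch below is an illustration under that assumption, not necessarily the approach taken in the post; the stage names (`double`, `add_one`, `collect`) and the `feed` helper are hypothetical.

```python
from functools import wraps


def coroutine(func):
    """Prime a generator-based coroutine so it is ready to receive values."""
    @wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first `yield`
        return gen
    return start


@coroutine
def double(target):
    """Receive a value, double it, and pass it on to the next stage."""
    while True:
        value = yield
        target.send(value * 2)


@coroutine
def add_one(target):
    """Receive a value, add one, and pass it on to the next stage."""
    while True:
        value = yield
        target.send(value + 1)


@coroutine
def collect(results):
    """Final stage (sink): store every value it receives."""
    while True:
        value = yield
        results.append(value)


def feed(values, target):
    """Push each value into the first stage of the pipeline."""
    for value in values:
        target.send(value)
    target.close()


results = []
# Each stage only knows the stage that comes after it.
pipeline = double(add_one(collect(results)))
feed([1, 2, 3], pipeline)
print(results)  # [3, 5, 7]
```

Each stage performs one step and hands its result straight to the next stage, so no central function has to call the stages in order once the pipeline is wired together.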
In this tutorial, we will learn how to upgrade pip. This may be handy if we, for instance, are working with old Python environments, which in turn may mean that we need to upgrade pip to the latest version.
In this post, we will use pip, conda (Anaconda's package manager), and Anaconda Navigator to upgrade pip.
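As a small sketch, the pip upgrade command can be built from Python so that it always targets the interpreter that is currently running. The helper name `pip_upgrade_command` is hypothetical, and the example only prints the command rather than running it.

```python
import sys


def pip_upgrade_command(package="pip"):
    """Return the command that upgrades `package` using the current interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


command = pip_upgrade_command()
print(" ".join(command))
# To actually run the upgrade, pass the list to subprocess.run(command, check=True).
```

In a terminal, this corresponds to `python -m pip install --upgrade pip`. In a conda environment, `conda update pip` is the usual counterpart.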
In this post, we are going to learn how to read SAS (.sas7bdat) files in Python.
As previously described (in the read .sav files in Python post), Python is a general-purpose language that can also be used for data analysis and data visualization.
One potential downside, however, is that Python is not really user-friendly for data storage. This has, of course, led to our data often being stored using Excel, SPSS, SAS, or similar software. See, for instance, the posts about reading .sav, .dta, and .xlsx files in Python.
In this post, we are going to work with Pandas iloc and loc. More specifically, we are going to learn slicing and indexing through iloc and loc examples.
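A minimal sketch of reading a .sas7bdat file with `pandas.read_sas` might look as follows. The helper name `load_sas`, the file name in the usage comment, and the `utf-8` encoding are assumptions; the right encoding depends on how the file was created.

```python
from pathlib import Path

import pandas as pd


def load_sas(path, encoding="utf-8"):
    """Read a .sas7bdat file into a Pandas dataframe.

    The default encoding is an assumption; adjust it to match the file.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(path)
    return pd.read_sas(path, format="sas7bdat", encoding=encoding)


# Hypothetical usage, assuming a file called "airline.sas7bdat" exists:
# sas_df = load_sas("airline.sas7bdat")
# print(sas_df.head())
```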
Once we have a dataset loaded as a Pandas dataframe, we often want to start accessing specific parts of the data based on some criteria. For instance, if our dataset contains the result of an experiment comparing different experimental groups, we may want to calculate descriptive statistics for each experimental group separately.
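To make the difference concrete, here is a small sketch with a made-up dataset (the column names `group` and `rt` and the subject labels are assumptions). `iloc` selects by integer position, `loc` selects by label or boolean mask, and `groupby` gives descriptive statistics per group.

```python
import pandas as pd

# Hypothetical experiment data: response times for two groups.
df = pd.DataFrame(
    {
        "group": ["control", "control", "treatment", "treatment"],
        "rt": [512.0, 498.0, 455.0, 471.0],
    },
    index=["s1", "s2", "s3", "s4"],
)

# iloc: position-based slicing; the end position is excluded.
first_two = df.iloc[0:2]

# loc: label-based slicing; the end label is included.
s2_to_s3 = df.loc["s2":"s3"]

# loc with a boolean mask and a column label.
treatment_rt = df.loc[df["group"] == "treatment", "rt"]

# Descriptive statistics for each experimental group separately.
group_means = df.groupby("group")["rt"].mean()
print(df.groupby("group")["rt"].describe())
```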
In this post, we are going to learn 1) how to read SPSS (.sav) files in Python, and 2) how to write to SPSS (.sav) files using Python.
Python is a great general-purpose language that is also well suited to statistical analysis and data visualization. However, Python is not really user-friendly when it comes to data storage. Thus, our data will often be archived using Excel, SPSS, or similar software.
For example, learn how to import data from other file types, such as Excel, SPSS, and Stata in the following two posts:
In previous posts, we learned how to use Python to detect group differences on a single dependent variable. However, there may be situations in which we are interested in several dependent variables. In these situations, the simple ANOVA model is inadequate.
One way to examine multiple dependent variables using Python would, of course, be to carry out multiple ANOVAs. That is, one ANOVA for each of these dependent variables. However, the more tests we conduct on the same data, the more we inflate the family-wise error rate (that is, the greater the chance of making at least one Type I error).
This is where MANOVA comes in handy. MANOVA, or Multivariate Analysis of Variance, is an extension of Analysis of Variance (ANOVA). With MANOVA, however, we have two or more dependent variables.
MANOVA and ANOVA are similar when it comes to some of the assumptions. That is, the data should have:
- normally distributed dependent variables
- equal covariance matrices
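The inflation can be illustrated with a short calculation. Assuming the tests are independent (a simplification, since tests on the same data usually are not), the family-wise error rate for m tests at level alpha is 1 - (1 - alpha)^m. The function name below is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 3, 5):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests} test(s): {rate:.3f}")
```

With three dependent variables tested separately at alpha = 0.05, the chance of at least one false positive is already roughly 14% under this assumption.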
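As an informal first look at the equal covariance assumption, the covariance matrix of the dependent variables can be computed per group with Pandas. This is only a descriptive sketch with made-up data (the names `group`, `dv1`, and `dv2` are assumptions); it does not replace a formal test such as Box's M.

```python
import pandas as pd

# Hypothetical data: two groups measured on two dependent variables.
df = pd.DataFrame(
    {
        "group": ["a"] * 4 + ["b"] * 4,
        "dv1": [1.0, 2.0, 3.0, 4.0, 2.0, 4.0, 6.0, 8.0],
        "dv2": [2.0, 1.0, 4.0, 3.0, 1.0, 3.0, 2.0, 4.0],
    }
)

# One covariance matrix of the dependent variables per group.
group_covs = df.groupby("group")[["dv1", "dv2"]].cov()
print(group_covs)
```

Large differences between the per-group matrices suggest that the assumption deserves a closer look.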
In this post, we are going to learn how to simplify our data preprocessing work using the Python package Pyjanitor. More specifically, we are going to learn how to:
- Add a column to a Pandas dataframe
- Remove missing values
- Remove an empty column
- Clean up column names
That is, we are going to learn how to clean Pandas dataframes using Pyjanitor. For each Python data manipulation example, we are also going to see how to carry it out using only Pandas functionality.
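The four steps listed above can be sketched using only Pandas functionality, as follows. The dataframe and its column names are made up for illustration, and the Pyjanitor versions are not shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical messy data with inconsistent column names and an empty column.
df = pd.DataFrame(
    {
        "Subject ID": [1, 2, 3],
        "Response Time": [0.52, np.nan, 0.47],
        "Empty Col": [np.nan, np.nan, np.nan],
    }
)

# Clean up column names: lowercase with underscores instead of spaces.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)

# Remove an empty column (one where every value is missing).
df = df.dropna(axis="columns", how="all")

# Add a column with a constant value.
df = df.assign(condition="practice")

# Remove missing values (rows with at least one missing value).
clean = df.dropna()
print(clean)
```

The empty column is dropped before removing rows with missing values. Otherwise `dropna()` would discard every row, since each row has a missing value in that column.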