In this tutorial we will learn how to use Pandas sample to randomly select rows and columns from a Pandas dataframe. There are some reasons for randomly sample our data; for instance, we may have a very large dataset and want to build our models on a smaller sample of the data. Other examples are when carrying out bootstrapping or cross-validation. Here we will learn how to; select rows at random, set a random seed, sample by group, using weights, and conditions, among other useful things.
In this tutorial we will learn how to work with Excel files and Python. It will provide an overview of how to use Pandas to…
Learn three data manipulation techniques with Pandas in this guest post by Harish Garg, a software developer and data analyst, and the author of Mastering Exploratory Analysis with pandas.
Modifying a Pandas DataFrame Using the inplace Parameter
In this section, you’ll learn how to modify a DataFrame using the inplace parameter. You’ll first read a real dataset into Pandas. You’ll then see how the inplace parameter impacts a method execution’s end result. You’ll also execute methods with and without the inplace parameter to demonstrate the effect of inplace.
In this brief Python data analysis tutorial we will learn how to carry out a repeated measures ANOVA using Statsmodels. More specifically, we will learn how to use the AnovaRM class from statsmodels anova module.
To follow this guide you will need to have Python, Statsmodels, Pandas, and their dependencies installed. One easy way to get these Python packages installed is to install a Python distribution such as Anaconda (see this YouTube Video on how to install Anaconda). However, if you already have Python installed you can of course use Pip.
In this Pandas tutorial we will learn how to work with Pandas dataframes. More specifically, we will learn how to read and write Excel (i.e., xlsx) and CSV files using Pandas.
We will also learn how to add a column to Pandas dataframe object, and how to remove a column. Finally, we will learn how to subset and group our dataframe.
If you are not familiar with installing Python packages I have recorded a YouTube video explaining how to install Pandas. There’s also a playlist with videos towards the end of the post with videos of all topics covered in this post.
In this short post I will show you a quick fix for the error “unsupported operand type(s) for +: ‘float’ and ‘NoneType’” with Pyvttbl. In earlier posts I have showed how to carry out ANOVA using Pyvttbl (among other packages. See posts 1, 2, 3, and 4 for ANOVA using pyvttbl).
However, Pyvttbl is not compatible with Python versions greater 1.11 (e.g., 1.12.0, that I am running). This may, of course, be due to that Pyvttbl have not been updated in quite some time.
My solution to this problem involves setting up a Python virtual environment (the set up of the virtual environment it is based on the Hitchikers Guide to Python). You will learn how to set up the virtual environment in Linux and Windows.
In this post, PyCharm vs Spyder will be compared. If you have followed my blog you may have noticed that a lot of focus have been put on how to learn programming (particularly in Python). I have also written about Integrated Development Environments (IDEs). I think that an IDE may, in fact, be very useful when learning how to code. Of course, when it comes to Python IDEs it may be hard to choose the best one (e.g., PyCharm vs Spyder?).
Spyder is one of my long-time favorite IDEs, and I am mainly using Spyder when I have to write code in Windows environments. However, in one of my blog posts PyCharm was suggested in one comment (see the comments on this post: Why Spyder is the Best Python IDE for Science) that I should test PyCharm.
After testing out PyCharm I started to like this IDE. In this post you will find my views on the two IDEs. E.g., I intend to answer the question; which is the best Python IDE; PyCharm or Spyder?
In this post you are going to learn how to create a simple experiment using the free experiment building software OpenSesame. As I have previously written about, OpenSesame, is an application, based on Python, for creating Psychology, Neuroscience, and Economics experiments. It offers a nice and easy to use interface. In this interface you can drag-and-drop different objects. This means that you don’t have to know any Python programming at all to create an experiment. If you need to know how to use images as stimuli you can see this OpenSesame Tutorial.
In this short post we are going to revisit the topic on how to carry out summary/descriptive statistics in Python. In the previous post, I used Pandas (but also SciPy and Numpy, see Descriptive Statistics Using Python) but now we are only going to use Numpy. The descriptive statistics we are going to calculate are the central tendency (in this case only the mean), standard deviation, percentiles (25 and 75), min, and max.
One of the great things with programming is that you can automate things that is boring. For instance, as a student I often got schedules in the form of Word documents.
I prefer to have all my activities in a digital calendar and used to manually enter every class, seminar, and so on from a course into my calendar. One day I got tired and thought that I could probably do automate this task using Python.
After some searching around on the Internet I found the Python packages python-docx and iCalendar. In this post I will show you how to use these to packages to create an iCalender file that can be loaded in to a lot of available calendars (e.g., Google Calendar, Outlook Calendar).