
Author: Erik Marsja

PhD in Psychology, Linköping University. Main interest is experimental and cognitive psychology. Enjoy programming in Python and R.

How to Get the Column Names from a Pandas Dataframe – Print and List

In this short post, we will learn 6 methods to get the column names from a Pandas dataframe. One of the nice things about Pandas dataframes is that each column has a name (i.e., the variables in the dataset). We can use these names to access specific columns without having to know their column numbers.

To access the column names of a Pandas dataframe, we can use the columns attribute. For example, if our dataframe is called df, we just type print(df.columns) to get all the columns of the Pandas dataframe.
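As a minimal sketch of the idea above, the following creates a small dataframe and reads its column names. The data and the column names (participant, rt) are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical example data; the column names are made up for illustration.
df = pd.DataFrame({"participant": [1, 2, 3], "rt": [0.52, 0.61, 0.48]})

# columns is an attribute (an Index object), so it takes no parentheses.
print(df.columns)

# Convert the Index to a plain Python list of names.
column_names = df.columns.tolist()
print(column_names)
```

Converting to a list is handy when the names are needed outside of Pandas, for example when looping over them.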

How to Plot a Histogram with Pandas in 3 Simple Steps

In this post, we are going to learn how to plot histograms with Pandas in Python. Specifically, we are going to learn 3 simple steps to make a histogram with Pandas. Now, plotting a histogram is a good way to explore the distribution of our data.

Note, at the end of this post there’s a YouTube tutorial explaining the simple steps to plot a histogram with Pandas.

Prerequisites

First of all, and quite obviously, we need to have Python 3.x and Pandas installed to be able to create a histogram with Pandas. Python and Pandas will already be installed if we use a scientific Python distribution, such as Anaconda or ActivePython. Alternatively, Pandas can be installed, like many Python packages, using pip: pip install pandas.
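With Pandas installed, a histogram can be sketched roughly as follows. Note that Pandas relies on Matplotlib for plotting, so that package is assumed to be installed as well. The simulated data, the column name score, and the choice of 20 bins are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd

# Hypothetical data: 200 simulated values (not from the original post).
rng = np.random.default_rng(42)
df = pd.DataFrame({"score": rng.normal(loc=100, scale=15, size=200)})

# Plot the distribution of one column with 20 bins.
ax = df["score"].plot.hist(bins=20)
ax.set_xlabel("score")
```

In an interactive session (e.g., a Jupyter notebook) the backend line can be dropped, and the plot is shown directly.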

How to Read and Write Stata (.dta) Files in R with Haven

In this post, we are going to learn how to read Stata (.dta) files in the R statistical environment. Specifically, we will learn 1) how to import .dta files in R using Haven, and 2) how to write dataframes to a .dta file.

Data Import in R: Reading Stata Files

Now, R is, as we all know, a superb statistical programming environment. When it comes to importing and storing data, we can store our data in the native .rda format. However, we may have a collaborator who uses other statistical software (e.g., Stata) and who stores their data in a different format (e.g., .dta files).

Now, this is when R shows us its brilliance; as R users we can load data from a range of file formats, e.g., SAS (.sas7bdat), Stata (.dta), Excel (e.g., .xlsx), and CSV (.csv). On this site there are other tutorials on how to import data from some of these formats:

Before we go on and learn how to read Stata files in R, we will answer the questions:

Rename Files in Python: A Guide with Examples using os.rename()

In this post, we are going to work with Python 3 to rename files. Specifically, we will use the Python module os to rename a file and rename multiple files.

First, we will rename a single file in 4 easy steps. After that, we will learn how to rename multiple files using Python 3. Being able to rename multiple files with Python can come in handy. For example, if we have a bunch of data files (e.g., .csv files) with long or strange names, we may want to rename them to make working with them easier later in our projects (e.g., when loading the CSV files into a Pandas dataframe).
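As a rough sketch of both cases, the example below renames one file and then several files in a loop with os.rename(). The file names and the data_ naming scheme are hypothetical, and everything happens in a temporary directory so that no real files are touched.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched.
workdir = tempfile.mkdtemp()

# Create a few hypothetical data files with awkward names.
for i in range(3):
    with open(os.path.join(workdir, f"Data File (final) {i}.csv"), "w") as f:
        f.write("a,b\n1,2\n")

# Rename a single file: os.rename(source, destination).
os.rename(os.path.join(workdir, "Data File (final) 0.csv"),
          os.path.join(workdir, "data_0.csv"))

# Rename the remaining files in a loop.
for name in sorted(os.listdir(workdir)):
    if name.startswith("Data File"):
        number = name.rsplit(" ", 1)[1]  # keeps the trailing part, e.g. "1.csv"
        os.rename(os.path.join(workdir, name),
                  os.path.join(workdir, f"data_{number}"))

renamed = sorted(os.listdir(workdir))
print(renamed)
```

Building full paths with os.path.join() avoids depending on the current working directory.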

How to Save a Seaborn Plot as a File (e.g., PNG, PDF, EPS, TIFF)

In this short post, we will learn how to save Seaborn plots to a range of different file formats. More specifically, we will learn how to use the plt.savefig function to save plots made with Seaborn to:

  1. Portable Network Graphics (PNG)
  2. Portable Document Format (PDF)
  3. Encapsulated Postscript (EPS)
  4. Tagged Image File Format (TIFF)
  5. Scalable Vector Graphics (SVG)

Pandas drop_duplicates(): How to Drop Duplicated Rows

In this post, we will learn how to use Pandas drop_duplicates() to remove duplicate records and combinations of columns from a Pandas dataframe. That is, we will delete duplicate data and only keep the unique values.

This Pandas tutorial will cover the following: what’s needed to follow the tutorial, importing Pandas, and how to create a dataframe from a dictionary. After this, we will get into how to use Pandas drop_duplicates() to drop duplicate rows and duplicate columns.
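A minimal sketch of these steps, assuming a small made-up dictionary (the names, groups, and scores are hypothetical): the dataframe is created from the dictionary, fully duplicated rows are dropped, and then duplicates are dropped based on a single column.

```python
import pandas as pd

# Hypothetical data; the first and third rows are identical.
data = {"name": ["Ann", "Bob", "Ann", "Cat"],
        "group": ["a", "b", "a", "a"],
        "score": [1, 2, 1, 3]}
df = pd.DataFrame(data)

# Drop rows that are duplicated across all columns (keeps the first occurrence).
unique_rows = df.drop_duplicates()

# Drop rows that repeat a value in the "group" column only.
unique_groups = df.drop_duplicates(subset=["group"], keep="first")

print(unique_rows)
print(unique_groups)
```

By default drop_duplicates() returns a new dataframe and leaves df unchanged.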

How to Remove a Column in R using dplyr (by name and index)

In this tutorial on removing a column in R, we are going to work with dplyr to delete a column. Here, we are going to learn how to remove columns in R using the select() function. Specifically, we are going to remove columns by name and by index.

Finally, we will also learn how to remove columns from R dataframes whose names start with a letter or word, end with a letter or word, or contain a certain character (like the underscore).

How to Read a File in Python, Write to, and Append, to a File

In this tutorial, we are going to learn how to read a file in Python 3. After we have learned how to open a file in Python, we are going to learn how to write to the file and save it again. In previous posts, we have learned how to open a range of different files using Python. For instance, we have learned how to open JSON, CSV, Excel, and HTML files using Pandas and the json library. Here, however, we are going to open plain text files (.txt) in Python.
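The three operations can be sketched with the built-in open() function and its "w" (write), "a" (append), and "r" (read) modes. The file name example.txt and its contents are made up for illustration, and the file is created in a temporary directory.

```python
import os
import tempfile

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Write: creates the file (or overwrites an existing one).
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")

# Append: adds to the end of the file without removing existing content.
with open(path, "a", encoding="utf-8") as f:
    f.write("second line\n")

# Read: load the whole file and split it into lines.
with open(path, "r", encoding="utf-8") as f:
    lines = f.read().splitlines()

print(lines)
```

Using with ensures the file is closed after each step, even if an error occurs.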