Category: R

R statistical programming related stuff

How to Read and Write Stata (.dta) Files in R with Haven

In this post, we are going to learn how to read Stata (.dta) files in the R statistical environment. Specifically, we will learn 1) how to import .dta files in R using Haven, and 2) how to write dataframes to a .dta file.
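As a minimal sketch of the two steps, assuming the haven package is installed, the following writes a small dataframe to a temporary .dta file and reads it back. The dataframe and its column names are made up for illustration and do not come from the post.

```r
library(haven)

# Hypothetical example data (values are made up for illustration)
df <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# Write the dataframe to a temporary Stata (.dta) file
path <- tempfile(fileext = ".dta")
write_dta(df, path)

# Read the .dta file back into R (haven returns a tibble)
df_stata <- read_dta(path)
print(df_stata)
```

With a real collaborator's file, the temporary path would simply be replaced by the path to that .dta file.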

Data Import in R: Reading Stata Files

Now, R is, as we all know, a superb statistical programming environment. When it comes to importing and storing data, we can store our data in the native .rda format. However, a collaborator may use other statistical software (e.g., Stata) and/or store their data in a different format (e.g., .dta files).

Now, this is when R shows us its brilliance: as R users, we can load data from a range of file formats, e.g., SAS (.sas7bdat), Stata (.dta), Excel (e.g., .xlsx), and CSV (.csv). On this site there are other tutorials on how to import data from some of these formats:

Before we go on and learn how to read Stata files in R, we will answer the questions:

How to Remove a Column in R using dplyr (by name and index)

In this tutorial on removing a column in R, we are going to work with dplyr to delete a column. Here, we are going to learn how to remove columns in R using the select() function. Specifically, we are going to remove columns by name and by index.

Finally, we will also learn how to remove columns from R dataframes whose names start with a letter or word, end with a letter or word, or contain a character (like the underscore).
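The approaches listed above can be sketched as follows, assuming the dplyr package is installed. The dataframe and column names are hypothetical; the helpers starts_with(), ends_with(), and contains() are dplyr's selection helpers.

```r
library(dplyr)

# Hypothetical example data (column names chosen for illustration)
df <- data.frame(
  id      = 1:3,
  age     = c(23, 35, 41),
  q_1     = c(1, 0, 1),
  q_2     = c(0, 0, 1),
  group_x = c("a", "b", "a")
)

no_age   <- select(df, -age)               # remove a column by name
no_first <- select(df, -1)                 # remove a column by index
no_q     <- select(df, -starts_with("q"))  # names starting with "q"
no_x     <- select(df, -ends_with("x"))    # names ending with "x"
no_under <- select(df, -contains("_"))     # names containing an underscore

print(names(no_under))
```

Each call returns a new dataframe and leaves df itself unchanged.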

Learn How to Calculate Descriptive Statistics in R the Easy Way

In this post, we will learn how to carry out descriptive statistics in R. After we have learned how to do this, we will learn how to create a nice LaTeX table and how to save the summary statistics to a .csv file.

Why Descriptive Statistics?

Carrying out descriptive statistics, also known as summary statistics, is a very good starting point for most statistical analyses. It is, furthermore, a very good way to summarize and communicate information about the data we have collected.
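To make this concrete, here is a minimal sketch in base R that computes a few summary statistics per column and saves them to a .csv file. The data and variable names are made up, and the LaTeX table step is left out because it needs an additional package.

```r
# Hypothetical example data (values are made up for illustration)
df <- data.frame(
  rt  = c(512, 430, 610, 498, 555),
  acc = c(0.91, 0.85, 0.97, 0.88, 0.93)
)

# One row per variable, with mean, standard deviation, minimum, and maximum
desc <- data.frame(
  variable  = names(df),
  mean      = sapply(df, mean),
  sd        = sapply(df, sd),
  min       = sapply(df, min),
  max       = sapply(df, max),
  row.names = NULL
)
print(desc)

# Save the summary statistics to a temporary .csv file
out <- tempfile(fileext = ".csv")
write.csv(desc, out, row.names = FALSE)
```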

How to Read & Write SPSS Files in R Statistical Environment

In this post, we are going to learn 1) how to read SPSS (.sav) files in R, and 2) how to write to SPSS (.sav) files using R. More specifically, we are going to work with the following two R packages, haven (from the Tidyverse) and foreign, to:

  • Read a .sav file into an R dataframe
  • Write an R dataframe to a .sav file
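A minimal sketch of both steps using haven only (assuming the package is installed) could look like this. The dataframe is hypothetical, and the foreign package is not shown here.

```r
library(haven)

# Hypothetical example data (values are made up for illustration)
df <- data.frame(id = 1:4, rating = c(3, 5, 2, 4))

# Write the dataframe to a temporary SPSS (.sav) file
path <- tempfile(fileext = ".sav")
write_sav(df, path)

# Read the .sav file back into an R dataframe (a tibble)
df_spss <- read_sav(path)
print(df_spss)
```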

How to Import Data: Reading SAS Files in R

In this post, we are going to learn how to read SAS (.sas7bdat) files in R. More specifically, we are going to use the packages haven and sas7bdat. Furthermore, we will also learn how to load .sas7bdat files into R using RStudio.

If you are interested in other methods on how to import data in R:

How to Make a Scatter Plot in R with Ggplot2

In this post, we will learn how to make scatter plots using R and the package ggplot2.

More specifically, we will learn how to make scatter plots, change the size of the dots, change the markers, the colors, and change the number of ticks. 

Furthermore, we will learn how to plot a trend line, add text, and plot a distribution on a scatter plot, among other things. In the final section of this scatter plot tutorial, we will learn how to save plots in high resolution.
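As a rough sketch of several of these steps, assuming ggplot2 is installed, the example below uses R's built-in mtcars data (a choice made here for illustration, not necessarily the data used in the post). It sets the dot size, marker shape, and color, adds a linear trend line, requests more axis ticks, and saves the plot at 300 dpi.

```r
library(ggplot2)

# Scatter plot of car weight against fuel consumption (built-in mtcars data)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 3, shape = 17, colour = "steelblue") +  # size, marker, color
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) + # linear trend line
  scale_x_continuous(n.breaks = 8) +                        # request more x-axis ticks
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# Save the plot in high resolution to a temporary file
out <- tempfile(fileext = ".png")
ggsave(out, plot = p, width = 6, height = 4, dpi = 300)
```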

How to Use Binder and R for Reproducible Research

In a previous post, we learned how to use Binder and Python for reproducible research. Now, we are going to learn how to create a Binder for our data analysis in R, so it can be fully reproduced by other researchers. More specifically, in this post, we will learn how to use Binder for reproducible research.

Many researchers upload their code for data analysis and visualization using Git hosting services (e.g., GitHub, GitLab).

No doubt, uploading your R scripts is great. However, we also need to make sure that we share the complete computational environment so that our code can be re-run and so that others can reproduce the results. That is, to make sure that other researchers really can reproduce our code, we need a way to capture the versions of the R packages we used when publishing our research.

Repeated Measures ANOVA in R and Python using afex & pingouin

In this post, we will learn how to carry out repeated measures Analysis of Variance (ANOVA) in R and Python. To be specific, we will use the R package afex and the Python package pingouin to carry out one-way and two-way ANOVA for within-subjects designs. The structure of the following data analysis tutorial is as follows: a brief introduction to (repeated measures) ANOVA, then carrying out within-subjects ANOVA in R using afex and in Python using pingouin. In the end, there will be a comparison of the results and the pros and cons of using R or Python for data analysis (i.e., ANOVA).
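For the R side, a one-way within-subjects ANOVA with afex might be sketched as below, assuming the afex package is installed. The simulated data and the column names (id, cond, rt) are hypothetical and only serve to show the long format that aov_ez() expects; the Python side is not shown.

```r
library(afex)

# Simulated long-format data: 10 participants, each measured in 3 conditions
set.seed(1)
d <- data.frame(
  id   = factor(rep(1:10, each = 3)),
  cond = factor(rep(c("a", "b", "c"), times = 10)),
  rt   = rnorm(30, mean = 500, sd = 50)
)

# One-way repeated measures ANOVA with "cond" as the within-subjects factor
fit <- aov_ez(id = "id", dv = "rt", data = d, within = "cond")
print(fit)
```

A two-way within-subjects design would pass both factor names to within, e.g. within = c("cond", "block") for a second, equally hypothetical factor.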