Pandas: Drop Columns By Name in DataFrames

Programming, Python / Erik Marsja

This post explains how to use pandas to drop columns by name from one or multiple DataFrames. We demonstrate dropping single and multiple columns, as well as how to conditionally remove columns based on name patterns. These techniques are essential for data cleaning and preparation in Python using Pandas.
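
As a rough illustration of the three cases the excerpt mentions, here is a minimal sketch. The DataFrame and its column names (`id`, `score_pre`, `score_post`, `notes`) are made up for this example and do not come from the post itself.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "score_pre": [10, 12, 9],
    "score_post": [14, 15, 11],
    "notes": ["a", "b", "c"],
})

# Drop a single column by name (returns a new DataFrame).
single = df.drop(columns="notes")

# Drop several columns by passing a list of names.
multiple = df.drop(columns=["id", "notes"])

# Drop columns whose names match a pattern, here everything starting with "score_".
pattern = df.loc[:, ~df.columns.str.startswith("score_")]

print(list(single.columns))
print(list(multiple.columns))
print(list(pattern.columns))
```

`drop` leaves `df` unchanged unless the result is assigned back, which is usually the safer habit while cleaning data.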

How to Rename Columns in data.table in R (With Examples)

Programming, R / Erik Marsja

In this post, we will learn how to rename columns in data.table in R. Renaming columns is a common task when cleaning and organizing data. Whether you want to rename a single column or multiple columns, data.table provides fast and efficient ways to get it done. We will look at different approaches, including renaming

Replace NA in data.table: Replacing with 0 and Other Values

Programming, R / Erik Marsja

In this post, we explore two methods for replacing NA values in data.table using R. First, we replace NAs with zero, and second, we use the mean of non-missing values. These methods are great for preparing data in large datasets with minimal memory usage.

How to Use data.table to Fill NA with the Previous Value in R

Programming, R / Erik Marsja

In this post, we explore filling missing values with data.table in R and compare its speed to dplyr. We found that data.table outperforms dplyr in terms of efficiency, especially when working with large datasets, making it a valuable tool for data manipulation tasks.

How to Find First Non-NA Value in data.table

Programming, R / Erik Marsja

In this post, we explore how to find the first non-NA value in data.table, both for grouped and ungrouped data. We use practical examples, including a psychology research experiment, to demonstrate the process. This technique helps handle missing values in datasets and is useful for filling in missing data based on valid entries.

How to Extract GPS Coordinates from a Photo: The USAID Mystery

Programming, Python / Erik Marsja

Where did the USAID nutrition pack go? Using Python, we tracked down its exact location by extracting GPS coordinates from a photo. In this post, you’ll learn how to pull location data from images and plot it on a map—step by step. A small mystery solved with code and a curious eye.
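
One step that typically comes up when pulling location data from a photo is converting the EXIF degrees/minutes/seconds values into decimal degrees that a map can use. The sketch below shows only that conversion, using the standard library. The function name `dms_to_decimal` and the sample coordinates are hypothetical and are not taken from the post.

```python
def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter to decimal degrees.

    EXIF stores latitude and longitude as three numbers together with a
    reference letter: "N"/"S" for latitude and "E"/"W" for longitude.
    """
    degrees, minutes, seconds = (float(value) for value in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -decimal if ref in ("S", "W") else decimal


# Hypothetical sample values, not the coordinates from the post.
lat = dms_to_decimal((59, 19, 48.0), "N")
lon = dms_to_decimal((18, 4, 12.0), "E")
print(round(lat, 4), round(lon, 4))
```

Reading the raw values out of an image file requires an image library and is left out here to keep the sketch self-contained.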

How to Make a Heatmap in R

Programming, R / Erik Marsja

In this post, we used R and ggplot2 to visualize correlations among BFI personality traits. We cleaned the data, computed the correlation matrix, and created a polished heatmap without grid lines. This approach provides a clear and visually appealing way to interpret relationships between personality dimensions in psychological data.

Two-Sample Z Test in R: Short Guide to Proportions and Means

Programming, R / Erik Marsja

Learn how to perform a two-sample Z test in R to compare proportions and means between two groups. This short guide walks through examples using real numbers and shows both built-in functions and manual calculations. A useful starting point if you’re working with hypothesis testing in R and want clear, quick results.

data.table Count Rows by Group

Programming, R / Erik Marsja

In this post, we explore how to use data.table to count rows by group in R. We cover using the .N special symbol and demonstrate how to group by one or more columns. This technique is quick and effective, making it a valuable tool for working with large datasets.

How to Get Number of Rows in R Using data.table

Programming, R / Erik Marsja

In this post, we explore how to count rows in R using nrow() for both data.frames and data.tables. We compare the performance of each method using a large dataset and discuss which one is quicker. Discover the speed differences and learn more about counting rows in R!

Author name: Erik Marsja