Pandas for Time Series | by KahEm Chu | Jul, 2023


Data Processing in Python

This article explains the pandas methods for working with time series. Let’s deal with time series like a pro.


Since I joined the workforce as a data scientist, most of the data I deal with has been time series. There are many definitions of a time series; generally, it is defined as a set of data points collected over a period of time. Or, speaking in a Pythonic way, it refers to a dataset with a datetime index and at least one column of numerical values.

It could be the price of a stock over the past few months, the sales of a hypermarket over the past few weeks, or even a patient’s blood sugar level records collected over several months.
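To make the Pythonic definition concrete, here is a minimal sketch of such a dataset: a DataFrame with a datetime index and one numerical column. The column name `glucose` and the four values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical readings: the column name and values are illustrative only.
index = pd.date_range("2020-07-01", periods=4, freq="D")
series_df = pd.DataFrame({"glucose": [5.4, 6.1, 5.8, 7.0]}, index=index)

# A time series in the pandas sense: a DatetimeIndex plus a numerical column.
print(type(series_df.index).__name__)
print(series_df.dtypes["glucose"])
```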

In this article, I will show how to apply pandas to a time series dataset, with an example of generated blood sugar level records.

With that, this article is structured as follows:

  1. DateTime Format Manipulation: changing the datetime series into the desired format
  2. Converting DateTime to a Particular Period: converting each data point to a specific time period
  3. Filtering DateTime Series Based on Condition: filtering data points based on a selected time period
  4. Time Shift: shifting data points down by a specific number of periods
  5. Resampling Time Series: grouping data points based on a specified time period
  6. Line Chart
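As a rough preview, the first five topics above correspond to pandas calls along these lines. This is only a sketch on a tiny made-up series (the `glucose` column and its values are my own assumptions), not the dataset used in the rest of the article.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one reading every 4 hours over three days.
idx = pd.date_range("2020-07-01", periods=18, freq=pd.Timedelta(hours=4))
df = pd.DataFrame({"glucose": np.arange(18, dtype=float)}, index=idx)

# 1. Format manipulation: render timestamps as strings in a chosen format.
formatted = idx.strftime("%d-%m-%Y %H:%M")

# 2. Conversion to a period: map each timestamp to its calendar month.
periods = idx.to_period("M")

# 3. Filtering on a condition: keep only the rows from the 2nd of the month.
day_two = df[df.index.day == 2]

# 4. Time shift: move every value down by one row (one 4-hour step).
shifted = df["glucose"].shift(1)

# 5. Resampling: group the 4-hourly readings into daily means.
daily_mean = df["glucose"].resample("D").mean()

print(daily_mean)
```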

Let’s get started!

As usual, the first step in any analysis with Python is importing the necessary libraries.

import pandas as pd
import random
import numpy as np
from datetime import datetime

Then, let’s generate a blood sugar level records dataset for this demo.

def create_demo_data():

    random.seed(365)
    np.random.seed(365)
    number_of_data_rows = 2160

    # generate list of date
    dates = pd.bdate_range(datetime(2020, 7, 1), freq='4H', periods=number_of_data_rows).tolist()

    # create a dictionary with the…
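The listing above is cut off, so the rest of the original function is not available here. For readers who want something runnable, the following is a hypothetical stand-in, not the original code: it keeps the 2,160 rows, the 4-hour spacing and the 1 July 2020 start, but the function name `create_demo_data_sketch`, the column name `blood_sugar_level`, and the uniform 4.0 to 10.0 value range are all my assumptions. It also uses `pd.date_range` with a `Timedelta` frequency in place of the `pd.bdate_range(..., freq='4H')` call.

```python
from datetime import datetime

import numpy as np
import pandas as pd


def create_demo_data_sketch(number_of_data_rows=2160, seed=365):
    """Hypothetical stand-in for the truncated function above."""
    rng = np.random.default_rng(seed)

    # One timestamp every 4 hours, starting on 1 July 2020.
    dates = pd.date_range(
        datetime(2020, 7, 1),
        periods=number_of_data_rows,
        freq=pd.Timedelta(hours=4),
    )

    # Assumed column name and value range, chosen only for illustration.
    levels = rng.uniform(4.0, 10.0, size=number_of_data_rows).round(1)
    return pd.DataFrame({"blood_sugar_level": levels}, index=dates)


demo_df = create_demo_data_sketch()
print(demo_df.head())
```

Putting the dates in the index rather than in a column means the datetime-based operations in pandas can be applied to the frame directly.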


