Load Forecasting with NumPyro Load (in MW) Time Series

Table of Contents

1. NYISO Data

We can pull real-time load data from the following URL. The example URL pulls data for May 2022.

http://mis.nyiso.com/public/csv/pal/20220501pal_csv.zip

Let's create a function that will pull data for a month and process it forecast. It has to:

  • Download the file
  • unzip it
import urllib.request
import zipfile
import pandas as pd

def get_data(month):    # in yyyymm format
    url = "http://mis.nyiso.com/public/csv/pal/" + str(month) + "01pal_csv.zip"
# download the zip file
    filename, headers = urllib.request.urlretrieve(url)

# extract the csv files from the zip
    with zipfile.ZipFile(filename, "r") as zip_ref:
        csv_filenames = [f.filename for f in zip_ref.filelist if f.filename.endswith(".csv")]
        dfs = []
        for csv_filename in csv_filenames:
            # read csv into pandas dataframe
            with zip_ref.open(csv_filename) as csv_file:
                df = pd.read_csv(csv_file)
                dfs.append(df)

    # concatenate the dataframes into a single dataframe
    df_concatenated = pd.concat(dfs)
    df_concatenated.set_index('Time Stamp', drop=True, inplace=True)

    return(df_concatenated)
mar23 = get_data("202303")
mar23
                    Time Zone    Name   PTID       Load
Time Stamp                                             
03/01/2023 00:00:00       EST  CAPITL  61757  1226.9684
03/01/2023 00:00:00       EST  CENTRL  61754  1722.6571
03/01/2023 00:00:00       EST  DUNWOD  61760   596.3434
03/01/2023 00:00:00       EST  GENESE  61753  1008.5693
03/01/2023 00:00:00       EST  HUD VL  61758  1013.8084
...                       ...     ...    ...        ...
03/31/2023 23:55:00       EDT  MHK VL  61756   713.8065
03/31/2023 23:55:00       EDT  MILLWD  61759   279.8333
03/31/2023 23:55:00       EDT  N.Y.C.  61761  4579.5044
03/31/2023 23:55:00       EDT   NORTH  61755   658.6137
03/31/2023 23:55:00       EDT    WEST  61752  1477.0037

[100881 rows x 4 columns]

Below are functions to filter by Name and visualize:

def one_location_time_series(df, name):
    return(df[df['Name'] == name]['Load'])

one_location_time_series(mar23, 'NORTH')
Time Stamp
03/01/2023 00:00:00    705.3594
03/01/2023 00:05:00    706.4240
03/01/2023 00:10:00    699.5104
03/01/2023 00:15:00    686.3651
03/01/2023 00:20:00    702.8407
                         ...   
03/31/2023 23:42:23    659.7048
03/31/2023 23:43:47    662.0710
03/31/2023 23:45:00    660.0018
03/31/2023 23:50:00    653.6038
03/31/2023 23:55:00    658.6137
Name: Load, Length: 9171, dtype: float64
def elec_plot(df, name):
    one_location_time_series(df, name)["Load"].plot.line()
    plt.savefig('line_plot.png')
    plt.close()
elec_plot(mar23, 'NORTH')

line_plot.png

Author: Matt Brigida, Ph.D.

Created: 2023-05-16 Tue 13:22

Validate