power-outage-analysis

Portfolio Analysis report for EECS 398

Power Outage Exploration Report

Step 1: Introduction

This report analyzes a dataset of major power outages in the U.S. from 2000 to 2016, focusing on the annual frequency of such outages and how that frequency trends over time. The central question is:

Can we predict the trend in the frequency of major outages over time?

This question is critical for mitigating the socioeconomic impact of outages caused by natural disasters or operational failures, which disrupt lives and economic activities.

Dataset Details:


Step 2: Data Cleaning and Exploratory Data Analysis

Data Cleaning Steps

Data cleaning was a critical step to ensure the accuracy and reliability of the analysis. Below are the steps taken, explained with reference to the data-generating process and to how each step affected the downstream analyses:
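As a minimal sketch of the kind of cleaning described above: the column names below mirror the raw outage dataset's conventions (`YEAR`, `OUTAGE.DURATION`, `U.S._STATE`), but the rows and the specific steps shown are illustrative assumptions, since the report does not enumerate them here.

```python
import numpy as np
import pandas as pd

# Hypothetical slice of the raw outage data (illustrative values only).
raw = pd.DataFrame({
    "YEAR": ["2011", "2012", "2013"],          # years arrive as strings
    "OUTAGE.DURATION": [120.0, 0.0, np.nan],   # 0 likely means "not recorded"
    "U.S._STATE": ["Washington", "Washington", "Washington"],
})

clean = raw.copy()
# Convert year to an integer so it can be grouped and plotted numerically.
clean["YEAR"] = clean["YEAR"].astype(int)
# Treat zero durations as missing rather than as true zero-length outages.
clean["OUTAGE.DURATION"] = clean["OUTAGE.DURATION"].replace(0.0, np.nan)
print(clean.dtypes)
```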

Univariate of Outage Cause in Washington State

Bivariate Scatter of Intentional Attack vs Year in Washington State

Aggregated Table of Intentional Attack vs Year in Washington State

| YEAR | Intentional Attack Count |
| ---- | ------------------------ |
| 2011 | 29 |
| 2012 | 23 |
| 2013 | 4 |
| 2014 | 2 |
| 2015 | 1 |
| 2016 | 5 |
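A table like the one above can be produced with a filter-then-group aggregation. The `CAUSE.CATEGORY` column name follows the outage dataset's conventions, but the rows in this sketch are illustrative, not the real Washington State data.

```python
import pandas as pd

# Hypothetical subset of outage rows for Washington State.
wa = pd.DataFrame({
    "YEAR": [2011, 2011, 2012, 2013, 2013, 2014],
    "CAUSE.CATEGORY": [
        "intentional attack", "severe weather", "intentional attack",
        "intentional attack", "severe weather", "intentional attack",
    ],
})

# Keep only intentional attacks, then count occurrences per year.
attacks = (
    wa[wa["CAUSE.CATEGORY"] == "intentional attack"]
    .groupby("YEAR")
    .size()
    .rename("Intentional Attack Count")
)
print(attacks)
```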

Step 3: Prediction Problem

The prediction problem is a regression task, aiming to predict the annual frequency of major outages.
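One way to frame this regression task is to fit annual counts against the year itself. The sketch below reuses the intentional-attack counts from the table above purely as example inputs; the actual model in the report predicts the frequency of all major outages.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative annual counts (Washington State intentional attacks, from
# the aggregated table); the real target is the total outage frequency.
years = np.array([2011, 2012, 2013, 2014, 2015, 2016]).reshape(-1, 1)
counts = np.array([29, 23, 4, 2, 1, 5])

# Fit a simple linear trend of count on year.
model = LinearRegression().fit(years, counts)
print(f"slope = {model.coef_[0]:.2f} outages/year")
```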


Step 4: Model and Features

Model Description:

Feature Types:

Encodings:

The model used numerical data directly (no categorical encodings were necessary).

Performance:


Step 5: Refinements and Improvements

Feature Additions:

Modeling Approach:

Hyperparameters:

Comparison:
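Hyperparameter refinement for a model like this is commonly done with cross-validated grid search. The estimator, grid values, and synthetic data below are illustrative assumptions; the report's actual grid is not listed here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic year/count data with a downward trend, for demonstration only.
rng = np.random.default_rng(0)
X = rng.uniform(2000, 2016, size=(60, 1))
y = 30 - 1.5 * (X[:, 0] - 2000) + rng.normal(0, 2, size=60)

# Search a small hypothetical grid with 3-fold cross-validation.
grid = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 4, None]},
    cv=3,
    scoring="neg_root_mean_squared_error",
)
grid.fit(X, y)
print(grid.best_params_)
```

Comparing `grid.best_score_` for the refined model against the baseline's cross-validated score is one standard way to justify the refinements.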


Conclusion

The refined models demonstrate the importance of addressing data anomalies and regional characteristics in predictive modeling. These insights emphasize actionable strategies for managing power outages, highlighting the value of customized, data-driven decision-making.