Novel Dataset to Apply Machine Learning to Energy Disaggregation


Researchers from Seoul National University analyzed a newly collected dataset containing data sampled at 10 Hz from 58 houses

Nonintrusive load monitoring (NILM) disaggregates individual appliance usage from aggregate electricity data, without extra per-appliance measurements. It analyzes changes in the voltage and current going into a house to deduce which appliances are in use and how much energy each one consumes. It can reveal which types of appliances people own as well as their behavioral patterns. Disaggregated energy consumption can be used to give consumers feedback that helps them change their energy consumption behavior. Now, a team of researchers from Seoul National University has assessed basic classification and regression algorithms to understand the data requirements for energy disaggregation.

The team collected data from 58 households using sensing devices with a 10 Hz sampling rate. Using the newly collected dataset, the team assessed the sampling rate of sensor data and the number of households that need to be included in a dataset to conduct reliable NILM research. The ENERTALK dataset was collected through a commercial energy IoT platform called ‘ENERTALK’, a general IoT platform for collecting, storing, and analyzing data; NILM is one of the platform's analysis functions. The actual usage time of an appliance varied widely from household to household. The study focuses on three appliances: TVs, washers, and rice cookers. It considered the two most basic modeling frameworks, a binary classification framework and a power usage regression framework. A vanilla CNN (Convolutional Neural Network) and an LSTM (Long Short-Term Memory) network were used as the basic benchmarks.
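To make the two modeling frameworks concrete, here is a minimal sketch of how a window of aggregate power could be paired with a binary on/off label and a power usage regression target. It is an illustration only, not the researchers' pipeline: the function name `make_windows`, the 2-second window, and the 15 W on-threshold are assumptions chosen for the example, and the signals are synthetic.

```python
from statistics import mean

SAMPLE_RATE_HZ = 10  # sampling rate reported for the dataset


def make_windows(aggregate, appliance, window_s=2, on_threshold_w=15.0):
    """Split paired power signals into non-overlapping windows.

    Returns a list of (aggregate_window, is_on, mean_appliance_watts).
    window_s and on_threshold_w are illustrative assumptions.
    """
    size = window_s * SAMPLE_RATE_HZ
    examples = []
    for start in range(0, len(aggregate) - size + 1, size):
        agg_win = aggregate[start:start + size]
        app_win = appliance[start:start + size]
        target_w = mean(app_win)            # regression target
        is_on = target_w > on_threshold_w   # binary classification label
        examples.append((agg_win, is_on, target_w))
    return examples


# Synthetic signals: 100 W base load, 80 W appliance on from second 2 to 4.
appliance = [0.0] * 20 + [80.0] * 20 + [0.0] * 20
aggregate = [100.0 + a for a in appliance]
examples = make_windows(aggregate, appliance)
```

A classifier would be trained to predict `is_on` from the aggregate window, while a regressor would predict the mean appliance power from the same input.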

The team found that, in the case of the ENERTALK dataset, at least tens of distinct houses need to be included in the training dataset to keep NILM performance from deteriorating. Machine learning with feature engineering performed poorly in the case of TVs. However, a vanilla deep neural network (DNN) and a CNN with hyperparameter optimization (HPO) performed well. In the case of washers, the Random Forest algorithm with feature engineering performed the best. For the Random Forest algorithm, all three appliances (TVs, washers, and rice cookers) showed a monotonic improvement as the number of houses increased. The research was published in the MDPI journal Energies on May 5, 2019.
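One way to study how performance changes with the number of training houses is to hold out a fixed set of test houses and train on progressively larger subsets of the rest. The sketch below shows that idea only; the helper `nested_training_sets`, the choice of 8 held-out houses, and the subset sizes are hypothetical and are not taken from the study.

```python
import random


def nested_training_sets(house_ids, test_houses, sizes, seed=0):
    """Return {n: list of n training houses}, excluding the test houses.

    Subsets are nested (each larger set contains the smaller ones), so a
    change in score reflects added houses rather than a different draw.
    """
    test = set(test_houses)
    candidates = sorted(h for h in house_ids if h not in test)
    if max(sizes) > len(candidates):
        raise ValueError("requested more training houses than available")
    rng = random.Random(seed)  # fixed seed for a repeatable ordering
    rng.shuffle(candidates)
    return {n: candidates[:n] for n in sorted(sizes)}


houses = list(range(58))   # 58 households, as in the dataset
test_houses = houses[:8]   # assumption: 8 houses held out for testing
splits = nested_training_sets(houses, test_houses, sizes=[1, 5, 10, 20, 50])
```

A model would then be trained once per entry in `splits` and scored on the same held-out houses, which keeps the comparison across training-set sizes consistent.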


About Author

Curt Reaves started working for Plains Gazette in 2016. Curt grew up in a small town in northern Iowa. He studied chemistry in college, graduated, and married his wife one month later. He has been a proud Texan for the past 5 years. Curt covers politics and the economy. Previously he wrote for the Washington City Paper, The Hill newspaper, Slate Magazine, and