WaveNILM: A Causal Neural Network for Power Disaggregation
Causality deserves more attention in Deep Learning
Photo by Alessandro Bianchi on Unsplash
Introduction
Power disaggregation is one of the key processes in building better grid infrastructure for ever-increasing energy demand. Disaggregation means separating individual appliances' consumption values from the aggregate total power consumption. Deep Neural Networks are being applied to complex problems across almost every industry and vertical, and they have made their mark in the energy industry too. We have seen LSTMs [1], 1D-CNNs, Transformers [2], and other time-series algorithms applied to popular use cases like load forecasting, EV integration strategies [3], demand response [4, 5], NILM (Non-Intrusive Load Monitoring), P2P trading, etc. This article focuses on NILM, which stands out as one of the important processes in developing the Smart Grid infrastructure of the future.
Background
NILM is quite a seasoned research area, first proposed in 1992. Fast forward two decades and there is a wealth of exciting research [6, 7, 8] in this field, with Deep Learning taking the majority of the spotlight among the various ways of approaching and solving this problem. The domain involves time-series data, which naturally invites sequence-modeling architectures like RNNs, LSTMs, GRUs, and of course the latest arrival, Transformers. However, as typically applied to NILM, these models are non-causal: they require future values to produce the disaggregation at the current time step t.
TCNs (Temporal Convolutional Networks) then emerged and challenged LSTMs and GRUs on various sequence-modeling tasks, offering longer effective memory in addition to being causal. The hidden layers in a TCN are the same length as the input layer, with zero padding to ensure the sequence length stays constant throughout. A TCN prevents information leakage from the future to the past by using causal convolution layers, which at the current time step t convolve only over the current and past time steps (t, t-1, ..., t-n). Dilated convolutions with increasing dilation factors give an exponentially large receptive field, letting the predictions factor in the very distant past. A minimal sketch of such a stack is shown below.
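To make this concrete, here is a minimal sketch of a causal dilated convolution stack in Keras (the framework used by the reference WaveNILM implementation [13]). The filter counts, depth, and dilation schedule here are illustrative assumptions, not values from the paper:

```python
import tensorflow as tf

# A TCN-style stack of causal dilated convolutions. padding="causal"
# left-pads each layer so the output at time t never sees inputs from
# t+1 onward; the dilation rate doubles per layer, so the receptive
# field grows exponentially with depth.
x = tf.keras.Input(shape=(1440, 1))        # one day of 1-minute samples
h = x
for d in (1, 2, 4, 8, 16, 32):
    h = tf.keras.layers.Conv1D(filters=32, kernel_size=2,
                               dilation_rate=d, padding="causal",
                               activation="relu")(h)
tcn = tf.keras.Model(x, h)
# Receptive field: 1 + (2 - 1) * (1 + 2 + 4 + 8 + 16 + 32) = 64 past samples
```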
Standard Convolution vs Dilated Convolution for Temporal Data. Image Credits — WaveNILM paper
Idea & Architecture
WaveNILM [10] builds on top of the causal CNN architecture of WaveNet [9], with a few improvements in the process of building a better and more efficient architecture for real-time disaggregation, much needed in the age of smart and dynamic grids.
This is what a block looks like in the WaveNILM architecture. Image Credits — WaveNILM paper
The above illustration depicts WaveNILM's gated dilation method: the output from the Conv/Dense layers is split into two paths, a Sigmoid (gate) path and a ReLU (regressor) path, and the two outputs are then concatenated to form the overall output of the block. Several such blocks are stacked in a residual manner, but the nature of the skip connection is slightly different from a standard residual network: instead of skipping just one block, each block's skip connection bypasses all the remaining blocks and rejoins the network at the final few downstream layers. A minimal sketch of one block follows.
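Below is a minimal Keras sketch of one gated block, following the description above (sigmoid and ReLU paths over causal dilated convolutions, concatenated). The 1x1 projection that keeps the residual addition shape-compatible is a simplifying assumption of this sketch, as are all layer sizes; see the reference implementation [13] for the exact structure:

```python
import tensorflow as tf
from tensorflow.keras import layers

def gated_block(x, filters, dilation):
    """One gated dilated block: sigmoid 'gate' path + ReLU 'regressor' path.

    Assumes x already has `filters` channels so the residual add is valid.
    """
    gate = layers.Conv1D(filters, 2, dilation_rate=dilation,
                         padding="causal", activation="sigmoid")(x)
    reg = layers.Conv1D(filters, 2, dilation_rate=dilation,
                        padding="causal", activation="relu")(x)
    joint = layers.Concatenate()([gate, reg])   # combined block output
    # 1x1 conv projects back to `filters` channels so the residual
    # addition below is shape-compatible (a simplification in this sketch)
    proj = layers.Conv1D(filters, 1)(joint)
    residual = layers.Add()([x, proj])          # fed to the next block
    return residual, joint                      # `joint` is the skip output
```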
The input layer is a 512-unit time-distributed dense layer, followed by the stack of gated blocks. All of the skip connections are collected and concatenated to form a new layer that acts as an output mask. Because the architecture accepts multiple input signals, this mask is multiplied element-wise with the input signal corresponding to the quantity we want to disaggregate. The sketch below wires these pieces together.
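Here is a sketch of that overall assembly, reusing `gated_block` from the previous snippet. The choice of channel 0 as the signal to disaggregate, the six dilation levels, and all sizes are assumptions for illustration only:

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(1440, 4))              # e.g. I, P, Q, S
h = layers.TimeDistributed(layers.Dense(512))(inputs)  # 512-unit input layer
skips = []
for d in (1, 2, 4, 8, 16, 32):
    h, skip = gated_block(h, filters=512, dilation=d)
    skips.append(skip)
mask = layers.Concatenate()(skips)                     # collected skip outputs
mask = layers.TimeDistributed(layers.Dense(1))(mask)   # output mask, 1 channel
target = layers.Lambda(lambda t: t[..., :1])(inputs)   # signal to disaggregate
output = layers.Multiply()([mask, target])             # masked disaggregation
model = tf.keras.Model(inputs, output)
```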
Results & Benchmarks
Single vs Multiple input experiments. Image Credits — WaveNILM paper
Let's quickly discuss deferrable loads, one of the important load-signature types to disaggregate for key benefits. Deferrable loads are large loads whose operation can be shifted to off-peak hours; examples include HVAC, heat pumps, dryers, ovens, and dishwashers. Successful disaggregation of these loads can be instrumental in preventing blackouts and load shedding across the grid.
Data from AMPds2 [11] was first scaled to [0, 1], batched, and trained on for around 500 epochs using 1-day (1440-sample) windows; a rough sketch of this preprocessing follows.
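This is a minimal sketch of that preprocessing, assuming `raw` is a hypothetical (n_samples, n_channels) array already loaded from AMPds2 (the loading itself is omitted):

```python
import numpy as np

def preprocess(raw, window=1440):
    """Min-max scale each channel to [0, 1] and slice into 1-day windows."""
    lo, hi = raw.min(axis=0), raw.max(axis=0)
    scaled = (raw - lo) / (hi - lo + 1e-9)     # per-channel min-max scaling
    n = len(scaled) // window                  # number of whole 1-day windows
    return scaled[: n * window].reshape(n, window, -1)
```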
Since WaveNILM is a regression network, we have to focus on the error rate rather than a categorical output. The estimated accuracy defined below is one of the most widely adopted metrics for this scenario:

$$\text{Est. Acc.} = 1 - \frac{\sum_{t=1}^{T}\sum_{k=1}^{K}\left|\hat{s}_k(t) - s_k(t)\right|}{2\sum_{t=1}^{T}\sum_{k=1}^{K} s_k(t)}$$
where ŝ_k(t) is the predicted power of appliance k at time t, s_k(t) is the ground truth, and T and K are the total number of time steps and appliances, respectively. WaveNILM was studied with various input signals: current (I), active power (P), reactive power (Q), and apparent power (S). When multiple input signals were chosen, either P or I was selected as the output signal. A direct implementation of the metric is shown below.
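As a quick reference, here is the metric written out in NumPy; `pred` and `true` are hypothetical (T, K) arrays of per-appliance power:

```python
import numpy as np

def estimated_accuracy(pred, true):
    """Estimated accuracy: 1 minus total absolute estimation error,
    normalized by twice the total ground-truth consumption."""
    return 1.0 - np.abs(pred - true).sum() / (2.0 * true.sum())
```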
Coming to comparisons, WaveNILM was benchmarked against the SSHMM [12] architecture on both noisy and denoised input signals. The tables below show the performance of WaveNILM vs. SSHMM for single and multiple inputs.
Denoised results on deferrable loads. Image Credits — WaveNILM paper
Noisy case results on deferrable loads. Image Credits — WaveNILM paper
My Thoughts
In my opinion, WaveNILM is a good approach, and a step toward causality-based neural networks in general. Causality is one step ahead of correlation in understanding how a model behaves when the data departs from its basic assumptions. Deep learning with causation is the need of the hour for better modeling and robustness: knowing the cause-and-effect structure can make AI applications more explainable and smarter.
Another aspect is the dataset: only a single dataset was inspected and used. I was expecting the use of popular datasets like REDD and Pecan Street, which would have opened up the benchmark competition and yielded more realistic performance numbers.
Conclusion
WaveNILM presents a simple and effective causal network for solving the real-time disaggregation problem. Its main advantage is the ability to add or remove input signals for a better model fit. There is also a focus on the types of power involved, and on how infrastructure that safely captures these readings will, going forward, do much to enable AI systems that serve both the consumer and the grid operator.
References
[1] Parallel LSTM Architectures for Non-Intrusive Load Monitoring in Smart Homes: https://ieeexplore.ieee.org/document/9308592
[2] ELECTRIcity: An Efficient Transformer for Non-Intrusive Load Monitoring: https://pubmed.ncbi.nlm.nih.gov/35458907/
[3] Non-intrusive Electric-vehicle Load Disaggregation Algorithm for a Data-driven EV Integration Strategy: https://ieeexplore.ieee.org/document/9794150
[4] Deep Learning for Intelligent Demand Response and Smart Grids: A Comprehensive Survey: https://arxiv.org/pdf/2101.08013.pdf
[5] Prediction of Electric Energy Consumption for Demand Response using Deep Learning: https://ieeexplore.ieee.org/document/9862353
[6] Analyzing Load Profiles of Energy Consumption to Infer Household Characteristics Using Smart Meters: https://www.mdpi.com/1996-1073/12/5/773
[7] Low-Frequency Non-Intrusive Load Monitoring of Electric Vehicles in Houses with Solar Generation: Generalisability and Transferability: https://www.mdpi.com/1996-1073/15/6/2200
[8] Sequence-to-point learning with neural networks for non-intrusive load monitoring: https://arxiv.org/pdf/1612.09106.pdf
[9] WaveNet: A Generative Model for Raw Audio: https://arxiv.org/pdf/1609.03499v2.pdf
[10] WaveNILM: A Causal Neural Network for Power Disaggregation from the Complex Power Signal: https://arxiv.org/pdf/1902.08736.pdf
[11] AMPds2 dataset: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/FIE0S4
[12] Exploiting HMM Sparsity to Perform Online Real-Time Nonintrusive Load Monitoring: https://ieeexplore.ieee.org/document/7317784
[13] WaveNILM implementation: https://github.com/picagrad/WaveNILM