I defended my dissertation (PhD thesis) at the Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava, in July 2018. My supervisor was associate professor Mária Lucká. The thesis dealt with improving the accuracy of electricity load forecasting through cluster analysis of consumers (or prosumers) and time series representations (see the extended abstract of my thesis: Improving Forecasting Accuracy Through the Influence of Time Series Representations and Clustering). I focused on three interesting areas of data mining:
- Time series analysis
- Clustering
- Forecasting and regression
Time series analysis covers research into existing time series representations and proposals of new ones, specifically efficient dimensionality reduction of electricity consumption time series that serve as input to a clustering algorithm. I developed my own R package, TSrepr, which implements various representation methods and is available in my GitHub repository: github.com/PetoLau/TSrepr.
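To make the idea of dimensionality reduction concrete, here is a minimal sketch in Python of Piecewise Aggregate Approximation (PAA), a well-known generic representation that replaces each equal-width window of a series with its mean. It is only an illustration: the function name `paa` and the load values are assumptions made up for this example, and it does not reproduce the TSrepr API or the specific representations proposed in the thesis.

```python
from statistics import mean


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: mean of each equal-width window.

    Illustrative helper (hypothetical name), not part of TSrepr.
    """
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    width = n // n_segments
    return [mean(series[i:i + width]) for i in range(0, n, width)]


# Hypothetical daily load profile: 24 hourly readings in kW (made-up values).
daily_load = [0.3, 0.3, 0.2, 0.2, 0.3, 0.5, 0.9, 1.2, 1.0, 0.8, 0.7, 0.7,
              0.8, 0.7, 0.6, 0.7, 0.9, 1.4, 1.8, 1.7, 1.3, 0.9, 0.6, 0.4]

# Reduce 24 values to 6 values, one per 4-hour block.
reduced = paa(daily_load, 6)
```

The reduced vector is what a clustering algorithm would receive instead of the raw series; shorter inputs make distance computations cheaper and smooth out some of the noise.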
The clustering task is to group consumers into more predictable (forecastable) groups. The challenge is to develop an algorithm that adapts to the behavior of multiple data streams of electricity load. Clustering results are then used in statistical time series analysis and regression methods to improve the forecasting accuracy of aggregate (global) or individual (end-consumer) electricity load. They can also be used for smart grid monitoring, anomaly (outlier) detection, and extraction of typical patterns of electricity consumption.
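As a rough illustration of grouping consumers by their load profiles, the sketch below runs plain k-means (Lloyd's algorithm) on a handful of reduced profiles. Everything in it is an assumption for the sake of the example: the profiles are made-up numbers, the initial centroids are fixed by hand to keep the result deterministic, and k-means stands in for whatever clustering method is actually used; it is not the adaptive algorithm developed in the thesis.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, n_iter=10):
    """Plain Lloyd's k-means with explicitly supplied initial centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical reduced load profiles of five consumers (made-up values).
profiles = [
    [0.2, 0.9, 0.6, 1.5],  # evening-peak consumers
    [0.3, 1.0, 0.7, 1.6],
    [0.2, 0.8, 0.5, 1.4],
    [1.2, 1.5, 1.4, 0.3],  # daytime consumers
    [1.1, 1.6, 1.5, 0.2],
]

labels, centroids = kmeans(profiles, [profiles[0], profiles[3]])
```

Each resulting group can then be aggregated and forecast separately, and the centroids can be read as typical consumption patterns.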
The forecasting and regression part focuses on methods that benefit the most from clustering of consumers. These methods have to incorporate seasonality and trend into the model, and they have to adapt when concept drift appears. A promising approach here is ensemble learning, which combines multiple forecasts from various forecasting and regression methods.
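The idea of combining forecasts can be sketched with a very small example: three simple base forecasters produce one-step-ahead predictions, and the ensemble takes their median. The base methods (seasonal naive, moving average, last value), the median combination, and the data are all assumptions chosen for brevity; they are stand-ins, not the methods evaluated in the thesis.

```python
from statistics import median


def seasonal_naive(history, season):
    """Predict the value observed one full season ago."""
    return history[-season]


def moving_average(history, window):
    """Predict the mean of the last `window` observations."""
    return sum(history[-window:]) / window


def last_value(history):
    """Predict that the next value equals the most recent one."""
    return history[-1]


def ensemble_forecast(history, season):
    """Combine the base forecasts with the median (a simple, robust choice)."""
    forecasts = [
        seasonal_naive(history, season),
        moving_average(history, season),
        last_value(history),
    ]
    return median(forecasts)


# Hypothetical load series with a seasonal period of 4 (made-up values).
history = [10, 20, 30, 20, 12, 22, 32, 24]
prediction = ensemble_forecast(history, season=4)
```

The median ignores a single badly wrong base forecast, which is one reason simple combinations like this are a common baseline before trying weighted or learned ensembles.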
The papers and talks I presented at conferences, workshops, and meetups are listed below.
Time Series Data Mining - from PhD to Startup
Where: Belgrade, Serbia
Time Series Representations for Better Data Mining
Where: Budapest, Hungary
New Clustering-based Forecasting Method for Disaggregated End-consumer Electricity Load Using Smart Grid Data
Where: Poprad, Slovakia
Is Unsupervised Ensemble Learning Useful for Aggregated or Clustered Load Forecasting?
Workshop: ECML-PKDD NFMCP’2017
Where: Skopje, Macedonia
Using Clustering Of Electricity Consumers To Produce More Accurate Predictions
Where: Bratislava, Slovakia
Adaptive Time Series Forecasting of Energy Consumption using Optimized Cluster Analysis
Workshop: ICDM DaMEMO’2016
Where: Barcelona, Spain
Comparison of Representations of Time Series for Clustering Smart Meter Data
Conference: WCECS ICMLDA’2016
Where: San Francisco, California, U.S.A.