Forecasting Electricity Consumption Using Machine Learning Methods
Abstract
Forecasting in the electric power industry allows generating companies and consumers to reduce the financial losses caused by a mismatch between planned and actual volumes of electricity purchased on the wholesale electricity and capacity market. To implement forecasting, we reviewed comparable systems, namely GMDH Shell, SAS Energy Forecasting, and Alyuda Load Forecasting, and surveyed the scientific literature to select forecasting methods. As a result, Facebook Prophet, SARIMA, LSTM, XGBoost, and Random Forest were chosen. The forecasts were built on consumption data from metering points and weather data from OpenWeatherMap. Forecasting was implemented both for a single metering point and for a group of metering points; for a group, the forecast is produced for a cluster of metering points. The methods were then optimized in four ways: the sizes of the training and test samples were varied; date-based features (day of the week, month number, etc.) were added; hyperparameters were selected by grid search to increase forecast accuracy; and the dimensionality of the dataset was reduced to shorten training time. The resulting mean absolute percentage error (MAPE) for Facebook Prophet, SARIMA, LSTM, XGBoost, and Random Forest was 20.72%, 17.54%, 12.40%, 7.18%, and 9.71%, respectively. XGBoost and Random Forest showed the best results and can be included in software for electricity consumers.
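As an illustration of two steps named in the abstract, the sketch below derives date-based features from a timestamp and computes the mean absolute percentage error. It is a minimal example under stated assumptions, not the implementation used in the study: the function names `date_features` and `mape`, the exact feature set (including `hour` and `is_weekend`), and all numeric values are hypothetical.

```python
from datetime import datetime, timedelta


def date_features(ts: datetime) -> dict:
    """Calendar features derived from a timestamp (hypothetical feature set)."""
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),           # Monday = 0 ... Sunday = 6
        "month": ts.month,                     # 1 ... 12
        "is_weekend": int(ts.weekday() >= 5),  # Saturday or Sunday
    }


def mape(actual, predicted) -> float:
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero; a zero would make the ratio undefined.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative values only, not data from the study.
start = datetime(2021, 1, 1, 0)  # a Friday
rows = [date_features(start + timedelta(hours=h)) for h in range(0, 72, 24)]
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
error = mape(actual, predicted)
```

In a setup like this, the feature dictionaries would be joined with the weather variables before being passed to a tree-based model, and the same `mape` function could serve as the scoring criterion when comparing hyperparameter combinations in a grid search.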