Short-term photovoltaic power prediction using combined K-SVD-OMP and KELM method


LI Jun, ZHENG Danyang

(School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)

Abstract:
For photovoltaic power prediction, a sparse representation modeling method using feature extraction techniques is proposed. Firstly, the factors affecting the photovoltaic power output are taken as the input data of the model. Next, dictionary learning based on the K-means singular value decomposition (K-SVD) algorithm and the orthogonal matching pursuit (OMP) algorithm is used to obtain the sparse coding of the input data with respect to the learned dictionary. Then, to build the global prediction model, the sparse coding vectors are used as the input of a kernel extreme learning machine (KELM). Finally, to verify the effectiveness of the combined K-SVD-OMP and KELM method, the proposed method is applied to an instance of photovoltaic power prediction. Compared with KELM, SVM and ELM under the same conditions, experimental results show that the combined sparse representation methods all achieve better prediction results, among which the combined K-SVD-OMP and KELM method attains the best prediction accuracy and modeling precision.

Key words: photovoltaic power prediction; sparse representation; K-means singular value decomposition algorithm (K-SVD); kernel extreme learning machine (KELM)

Due to the random, fluctuating, and intermittent nature of solar energy, the output power of photovoltaic (PV) power plants is difficult to control, which harms the stable operation of the power grid. Ultra-short-term and short-term power prediction of PV generation can help ensure the regular operation and reliability of the power system, and has attracted increasing attention from researchers[1-4].

Short-term PV power prediction methods are generally classified into physical methods[5], classical statistical methods[6-8], and artificial intelligence methods[9]. Artificial intelligence methods, such as neural networks[10-14] and support vector machines (SVM)[15-17], have a remarkable ability to reason and learn in uncertain and imprecise environments, and have been successfully applied in the field of PV prediction with good results. For example, Hossain et al.[10] proposed a prediction algorithm for PV power using a long short-term memory (LSTM) neural network. Sun et al.[15] established a short-term step-wise temperature prediction model for PV modules based on SVM, and the results indicate that, other things being equal, the step-wise prediction model has better accuracy than the direct prediction model. Ref.[16] further presents a comparison of the extreme learning machine (ELM) and SVM for PV power estimation. Lin et al.[17] introduced an inertia weighting strategy and the Cauchy mutation operator to improve the moth-flame optimization algorithm for SVM prediction of PV power generation. In Ref.[18], weather types were classified into abnormal days (weather changed suddenly) and normal days, and a combined prediction method based on ensemble empirical mode decomposition (EEMD) and SVM was proposed to tackle the problem of short-term forecasting of the hourly output of a PV system one day ahead. In addition, an EEMD-GRNN (generalized regression neural network) prediction method was further proposed by Zhang et al.[19], and its prediction error can also meet the requirements.

ELM is a fast learning algorithm based on single-hidden-layer feedforward neural networks (SLFNs). Its network structure is determined by a randomly chosen number of hidden-layer nodes, the input-layer weights and hidden-layer node parameters are also given randomly, and training is performed with the regularized least-squares algorithm, so only the output-layer weights of the network need to be adjusted; the network can therefore be trained extremely fast. ELM was successfully applied to short-term photovoltaic power prediction and achieved high prediction accuracy in Ref.[20]. However, the ELM network is sensitive to its hidden nodes. To overcome this problem, the kernel extreme learning machine (KELM) does not need to consider the number of hidden-layer nodes and instead uses a kernel function to represent the unknown nonlinear feature mapping of the hidden layer. KELM can therefore achieve good performance.

In addition, principal component analysis (PCA) or kernel principal component analysis (KPCA) is an important means of data preprocessing, which can effectively extract the features hidden in the data. Lv et al.[21] used the PCA algorithm and a neural network to preprocess the input data, and the results showed that PCA could effectively improve the accuracy of the prediction results. Similarly, sparse representation can be used in the field of prediction as a primary means of feature extraction. As an advanced machine learning method, sparse representation has been widely used in the field of pattern recognition. The sparse representation method contains two parts:
dictionary learning and sparse coding. Compared with PCA, sparse representation can yield a set of overcomplete basis vectors.

In view of the advantages of sparse representation algorithms in feature extraction, combined with the KELM method, we propose a new global prediction model for short-term prediction of PV power based on the K-means singular value decomposition and orthogonal matching pursuit (K-SVD-OMP) sparse representation algorithm. To verify its effectiveness, the method is applied to a PV benchmark power prediction instance provided by the Global Energy Forecasting Competition 2014 (GEFCOM2014), and compared with existing prediction methods such as SVM and KELM, as well as with sparse representation algorithms without dictionary learning, to evaluate the effectiveness of the combined K-SVD-OMP method.

Sparse representation expresses the original signal by a linear combination of an overcomplete set of basis functions, where the set of basis functions is called a dictionary and the basis functions are called atoms. Sparse coding builds on the idea of a large dictionary of candidate basis vectors: the basis vectors of an overcomplete dictionary are linearly combined to represent the original signal in a compact and efficient way. The weights of most of the dictionary basis vectors tend to zero, and the basis vectors with non-zero weights are selected to represent the original signal. That is, the sparse representation of a signal vector $x \in \mathbb{R}^m$ is expressed as

$x = \Phi \alpha,$ (1)

where $\Phi = \{\varphi_1, \varphi_2, \ldots, \varphi_K\}$, and each basis vector $\varphi_k$ is an element of the dictionary $\Phi$, called an atom. For Eq.(1), the coefficient vector $\alpha$ can be obtained by solving the $\ell_0$-norm optimization problem

$\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad x = \Phi\alpha.$ (2)

If the number of non-zero coefficients in $\alpha$ is restricted, Eq.(2) can be formulated as an $M$-sparse optimization problem

$\hat{\alpha} = \arg\min_{\alpha} \|x - \Phi\alpha\|_2^2 \quad \text{s.t.} \quad \|\alpha\|_0 \le M,$ (3)

where $M$ is the maximum number of non-zero coefficients in $\alpha$.

Applying the Lagrange multiplier, Eq.(2) is converted to

$\hat{\alpha} = \arg\min_{\alpha} \{ \|x - \Phi\alpha\|_2^2 + \lambda \|\alpha\|_0 \},$ (4)

where $\lambda > 0$ acts as a compromise between the reconstruction error and the sparsity.

Eqs.(2)-(4) are essentially NP-hard problems, which are usually solved with greedy algorithms or relaxation algorithms to obtain the sparse representation of the signal. For example, the prior distribution of $\alpha$ can be assumed to obey the Laplacian distribution[22]

$p(\alpha) \propto \exp(-\beta \|\alpha\|_1),$ (5)

under which the maximum a posteriori estimate of $\alpha$ with Gaussian observation noise of variance $\sigma^2$ becomes

$\hat{\alpha} = \arg\min_{\alpha} \{ \|x - \Phi\alpha\|_2^2 + 2\sigma^2\beta \|\alpha\|_1 \}.$ (6)

Let $\lambda_1 = 2\sigma^2\beta$; the above equation can be transformed into

$\hat{\alpha} = \arg\min_{\alpha} \{ \|x - \Phi\alpha\|_2^2 + \lambda_1 \|\alpha\|_1 \}.$ (7)

If an additional constraint that the weights sum to one is imposed, the above equation can be transformed into

$\hat{\alpha} = \arg\min_{\alpha} \{ \|x - \Phi\alpha\|_2^2 + \lambda_1 \|\alpha\|_1 \} \quad \text{s.t.} \quad \textstyle\sum_{k=1}^{K} \alpha_k = 1.$ (8)
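As a concrete illustration of the $\ell_1$-relaxed problem in Eq.(7), the following sketch (synthetic dictionary and signal; all names hypothetical) recovers a sparse coefficient vector with scikit-learn's Lasso solver, whose `alpha` parameter plays the role of $\lambda_1$ up to scikit-learn's internal $1/(2m)$ scaling of the residual term:

```python
# A minimal sketch of l1 sparse coding (Eq.(7)); dictionary and signal are synthetic.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
m, K = 64, 256                         # signal dimension, number of atoms (K > m: overcomplete)
Phi = rng.standard_normal((m, K))
Phi /= np.linalg.norm(Phi, axis=0)     # l2-normalize the atoms

alpha_true = np.zeros(K)
alpha_true[rng.choice(K, size=4, replace=False)] = rng.standard_normal(4)
x = Phi @ alpha_true                   # a signal with an exactly 4-sparse representation

lasso = Lasso(alpha=1e-3, fit_intercept=False, max_iter=10000)
lasso.fit(Phi, x)                      # treats the m signal entries as samples, atoms as features
print("recovered non-zeros:", np.count_nonzero(np.abs(lasso.coef_) > 1e-6))
```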

2.1 Dictionary selection

In sparse representation, dictionary learning transforms samples into an appropriate sparse representation, which simplifies the learning task and reduces model complexity. Different dictionaries can therefore be chosen for different types of data, for example, the discrete cosine transform (DCT) dictionary, data dictionaries, structured dictionaries, etc. For a better sparse representation of the signals considered here, we select the K-means singular value decomposition (K-SVD) dictionary, which offers better performance. This method obtains the most suitable and compact dictionary for the training data without requiring a separate model training process. The dictionary is then used by the OMP-based sparse coding algorithm to obtain the optimal sparse coding vectors.
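For reference, an overcomplete DCT dictionary of the kind mentioned above can be generated directly from sampled cosine atoms; the sketch below (sizes are illustrative, not those used in the paper) builds an $m \times K$ dictionary with $\ell_2$-normalized columns:

```python
# A minimal sketch of an overcomplete DCT dictionary construction.
import numpy as np

def dct_dictionary(m: int, K: int) -> np.ndarray:
    """Return an m x K overcomplete DCT dictionary with l2-normalized columns."""
    n = np.arange(m)[:, None]    # sample index within an atom
    k = np.arange(K)[None, :]    # atom (frequency) index
    Phi = np.cos(np.pi * (2 * n + 1) * k / (2 * K))
    return Phi / np.linalg.norm(Phi, axis=0)

Phi = dct_dictionary(m=32, K=128)   # 4x overcomplete
```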

2.2 K-SVD-OMP algorithm

As a classical dictionary training algorithm, K-SVD is a generalization of k-means based on sparse representation, which explicitly includes two phases: dictionary updating and sparse encoding. In each iteration, the K-SVD algorithm uses the SVD technique and alternating optimization to compute the representation of the input data over the current dictionary and then update the atoms of the dictionary so that they better fit the data. If each coefficient vector is restricted to be a standard basis vector $e_k$, the objective function of the k-means algorithm is expressed as

$\min_{\Phi, A} \|X - \Phi A\|_F^2 \quad \text{s.t.} \quad \forall i,\ \alpha_i = e_k \ \text{for some } k,$ (9)

where $X = [x_1, \ldots, x_N]$ is the data matrix and $A = [\alpha_1, \ldots, \alpha_N]$ is the coefficient matrix.

The K-SVD algorithm essentially solves the optimization problem

$\min_{\Phi, A} \|X - \Phi A\|_F^2 \quad \text{s.t.} \quad \forall i,\ \|\alpha_i\|_0 \le M.$ (10)

In the sparse encoding stage, assuming that the dictionary $\Phi$ is fixed, the optimization problem of Eq.(10) is transformed into searching for the sparse coefficient matrix $A$ corresponding to the dictionary matrix $\Phi$. Eq.(10) then decomposes into the per-sample problems

$\min_{\alpha_i} \|x_i - \Phi \alpha_i\|_2^2 \quad \text{s.t.} \quad \|\alpha_i\|_0 \le M, \quad i = 1, \ldots, N.$ (11)

Eq.(11) is solved in substantially the same way as Eq.(3), and both can be handled by the same family of sparse encoding algorithms.

The dictionary update phase processes the atoms (i.e., columns) of the dictionary sequentially, keeping all columns fixed except the $k$th column $\varphi_k$. Isolating the contribution of $\varphi_k$ and its coefficient row $\alpha_T^k$ (the $k$th row of $A$), Eq.(10) can be rewritten as

$\|X - \Phi A\|_F^2 = \left\| X - \sum_{j \ne k} \varphi_j \alpha_T^j - \varphi_k \alpha_T^k \right\|_F^2 = \|E_k - \varphi_k \alpha_T^k\|_F^2,$ (12)

where $E_k = X - \sum_{j \ne k} \varphi_j \alpha_T^j$ is the representation error when atom $\varphi_k$ is removed. Define $\omega_k$ as the index set of the data $x_i$ that use the dictionary atom $\varphi_k$,

$\omega_k = \{ i \mid 1 \le i \le N,\ \alpha_T^k(i) \ne 0 \}.$

Restricting $E_k$ and $\alpha_T^k$ to the columns in $\omega_k$ (denoted $E_k^R$ and $\alpha_R^k$), the objective function of Eq.(12) is equivalent to

$\min_{\varphi_k, \alpha_R^k} \|E_k^R - \varphi_k \alpha_R^k\|_F^2,$ (13)

which is solved by a rank-1 SVD of $E_k^R$: the updated atom $\varphi_k$ is the first left singular vector, and $\alpha_R^k$ is the first right singular vector scaled by the largest singular value.

In summary, the optimization of Eq.(10) aims to obtain the best dictionary $\Phi$ for representing the dataset $X$.

The K-SVD algorithm is implemented in the following steps:

Step 1) Initialize the dictionary: set $J = 1$ and initialize the columns of $\Phi^{(0)} \in \mathbb{R}^{m \times K}$ with $\ell_2$ normalization.

Step 2) Sparse coding phase: use the sparse coding algorithm to approximate, for each $x_i$, the corresponding sparse vector $\alpha_i$, i.e., solve $\min_{\alpha_i} \|x_i - \Phi^{(J-1)} \alpha_i\|_2^2$ s.t. $\|\alpha_i\|_0 \le M$.

Step 3) Dictionary update phase:
update each column $\varphi_k$ ($k = 1, 2, \ldots, K$) of the dictionary $\Phi^{(J-1)}$ in turn. For each $k$, compute the error matrix $E_k$ by $E_k = X - \sum_{j \ne k} \varphi_j \alpha_T^j$, restrict it to the columns in $\omega_k$, and obtain the updated $\varphi_k$ and $\alpha_R^k$ from the rank-1 SVD of $E_k^R$ as in Eq.(13).

Step 4) Set $J = J + 1$ and return to Step 2 until the convergence or stopping condition of Eq.(10) is satisfied.
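The steps above translate almost directly into code. The following simplified sketch (synthetic initialization assuming $N \ge K$; scikit-learn's `orthogonal_mp` stands in for the sparse coding phase, and convergence checks are omitted) performs the per-column rank-1 SVD update of Eq.(13):

```python
# A simplified K-SVD sketch: OMP sparse coding + per-column SVD dictionary update.
import numpy as np
from sklearn.linear_model import orthogonal_mp

def ksvd(X, K, M, n_iter=10, seed=0):
    """X: m x N data matrix; K: number of atoms (K <= N assumed); M: sparsity level."""
    rng = np.random.default_rng(seed)
    m, N = X.shape
    Phi = X[:, rng.choice(N, size=K, replace=False)].astype(float)  # init atoms from data
    Phi /= np.linalg.norm(Phi, axis=0)
    for _ in range(n_iter):
        # Sparse coding phase (Eq.(11)), solved column-wise by OMP
        A = orthogonal_mp(Phi, X, n_nonzero_coefs=M)
        # Dictionary update phase: rank-1 SVD update of each atom (Eq.(13))
        for k in range(K):
            omega = np.nonzero(A[k, :])[0]          # samples that use atom k
            if omega.size == 0:
                continue
            A[k, omega] = 0.0                       # remove atom k's contribution
            E_k = X[:, omega] - Phi @ A[:, omega]   # restricted error matrix E_k^R
            U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
            Phi[:, k] = U[:, 0]                     # updated atom: first left singular vector
            A[k, omega] = s[0] * Vt[0, :]           # updated coefficients on the support
    return Phi, A
```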

2.3 OMP algorithm

The OMP algorithm is an effective greedy solution strategy for the sparsity-constrained sparse coding problem of Eq.(3). When solving for the sparse coding vector, the OMP algorithm selects, at each step, the atom with the highest correlation to the current residual. After an atom is selected, the signal is projected orthogonally onto the span of the selected atoms, the residual is recomputed, and the process is repeated. As a result, the OMP algorithm converges quickly for a given accuracy requirement.

To obtain the optimal estimate of the sparse encoding vector, the optimization problem of Eq.(3) can be rewritten as

$\hat{\alpha} = \arg\min_{\alpha} \|x - \Phi\alpha\|_2 \quad \text{s.t.} \quad \|\alpha\|_0 \le M.$ (14)

To solve Eq.(14), the OMP algorithm is implemented in the following steps.

Step 1) Initialize the residual vector $r_0 = x$, the index set $\Lambda_0 = \varnothing$, and the iteration counter $t = 1$.

Step 2) Find the index $\chi_t$ of the dictionary column $\varphi_i$ that has the maximum inner product with the current residual, i.e., $\chi_t = \arg\max_i |\langle r_{t-1}, \varphi_i \rangle|$.

Step 3) Augment the index set $\Lambda_t = \Lambda_{t-1} \cup \{\chi_t\}$ and update the selected-atom matrix $\Phi_t = [\Phi_{t-1}, \varphi_{\chi_t}]$, noting that $\Phi_0$ is the empty matrix.

Step 4) Solve the least-squares problem to obtain the new data representation, i.e., $\alpha_t = \arg\min_{\alpha} \|x - \Phi_t \alpha\|_2$.

Step 5) Update the residual $r_t = x - \Phi_t \alpha_t$.

Step 6) Set $t = t + 1$; if $t \le M$, return to Step 2.
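A direct numpy transcription of Steps 1)-6) might look as follows (a sketch, not the authors' implementation; the dictionary columns are assumed $\ell_2$-normalized):

```python
# A minimal OMP sketch following Steps 1)-6).
import numpy as np

def omp(Phi: np.ndarray, x: np.ndarray, M: int) -> np.ndarray:
    """Phi: m x K dictionary with l2-normalized columns; returns a K-dim sparse code."""
    K = Phi.shape[1]
    r = x.astype(float)               # Step 1: residual r_0 = x
    support = []                      # Step 1: empty index set
    alpha = np.zeros(K)
    a = np.zeros(0)
    for _ in range(M):                # Step 6: loop while t <= M
        chi = int(np.argmax(np.abs(Phi.T @ r)))        # Step 2: most correlated atom
        support.append(chi)                            # Step 3: augment index set
        Phi_t = Phi[:, support]
        a, *_ = np.linalg.lstsq(Phi_t, x, rcond=None)  # Step 4: least-squares fit
        r = x - Phi_t @ a                              # Step 5: update residual
    alpha[support] = a
    return alpha
```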

This section considers how to combine the sparse representation algorithm with KELM to form a global prediction model. Firstly, the factors affecting the PV power output and the PV power to be predicted are arranged into input-output data pairs with time delay. Secondly, the training data are sparsely decomposed and mapped into the sparse domain: each sparse coding vector is the sparse representation of a training input over the basis vectors of the dictionary used for sparse decomposition, i.e., a hidden-mode representation of the input data. Finally, the sparse coding vectors are paired with the corresponding target outputs and a KELM global regression model is trained.

The model to be predicted is established as

$\hat{y}(t) = f(x(t)),$ (15)

where $x(t)$ is the time-delayed input vector collecting the influencing factors and historical PV power, and $f(\cdot)$ is the regression model to be learned.

For a standard single-hidden-layer feedforward neural network (SLFN) with node activation function $g(\cdot)$, $L$ hidden-layer nodes, and a single output, the network output is expressed as

$f(x) = \sum_{i=1}^{L} \theta_i g(w_i \cdot x + b_i) = h(x)\theta,$ (16)

where $w_i$ and $b_i$ are the input weights and bias of the $i$th hidden node, $\theta = [\theta_1, \ldots, \theta_L]^{\mathrm{T}}$ is the output weight vector, and $h(x)$ is the hidden-layer output (feature mapping) row vector.

Unlike a conventional SLFN, the activation function parameters of ELM are fixed during training: the initial parameters of the activation functions can be generated from uniformly distributed random numbers, which converts the training of ELM into the estimation problem of finding the optimal weight vector $\theta$. Theoretically, it can be shown that ELM has the universal approximation property and is a general function approximator.

The optimal weight problem of ELM is solved via the $\ell_2$-regularized optimization problem

$\min_{\theta} \|H\theta - Y\|^2 + \frac{1}{C}\|\theta\|^2,$ (17)

whose solution is $\theta = H^{\mathrm{T}} \left( \frac{I}{C} + H H^{\mathrm{T}} \right)^{-1} Y$, where $H = [h(x_1)^{\mathrm{T}}, \ldots, h(x_N)^{\mathrm{T}}]^{\mathrm{T}}$ is the hidden-layer output matrix, $Y$ is the target output vector, and $C > 0$ is the regularization parameter.
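To make this training procedure concrete, a minimal numpy sketch is given below (synthetic settings; all names hypothetical). It fixes a random sigmoid hidden layer and solves Eq.(17) for the output weights; the form $(I/C + H^{\mathrm{T}} H)^{-1} H^{\mathrm{T}} y$ used in the code is algebraically equivalent to the solution above by the push-through identity:

```python
# A minimal ELM sketch: random fixed hidden layer + regularized least-squares output weights.
import numpy as np

def elm_fit(X, y, L=50, C=16.0, seed=0):
    """Fix a random sigmoid hidden layer, then solve Eq.(17) for the output weights."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, (X.shape[1], L))   # random input weights (never trained)
    b = rng.uniform(-1.0, 1.0, L)                 # random hidden biases (never trained)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # hidden-layer output matrix H
    theta = np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ y)
    return W, b, theta

def elm_predict(X, W, b, theta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ theta
```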

For the implicit hidden-layer feature mapping $h(x)$, we define a kernel matrix as

$\Omega = H H^{\mathrm{T}}: \quad \Omega_{i,j} = h(x_i) \cdot h(x_j) = K(x_i, x_j).$ (18)

The output of KELM can then be written entirely in terms of the kernel function as

$f(x) = [K(x, x_1), \ldots, K(x, x_N)] \left( \frac{I}{C} + \Omega \right)^{-1} Y.$ (19)

In Eq.(19), the Gaussian kernel function is usually used, i.e.

$K(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{\zeta^2} \right),$ (20)

where $\zeta$ is the width parameter of the function, which sets the radial action range of the kernel. The penalty parameter $C$ and $\zeta$ are selected by the cross-validation method.
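Under Eqs.(18)-(20), training KELM reduces to a single linear solve and prediction to a kernel evaluation against the training set. A minimal sketch (synthetic settings; parameter values illustrative):

```python
# A minimal KELM sketch implementing Eqs.(18)-(20) with a Gaussian kernel.
import numpy as np

def gaussian_kernel(A, B, zeta):
    """Pairwise Gaussian kernel matrix between the rows of A and B (Eq.(20))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / zeta**2)

def kelm_fit(X, y, C, zeta):
    """Solve beta = (I/C + Omega)^(-1) y, with Omega the training kernel matrix (Eq.(18))."""
    Omega = gaussian_kernel(X, X, zeta)
    return np.linalg.solve(np.eye(len(X)) / C + Omega, y)

def kelm_predict(X_new, X_train, beta, zeta):
    """Eq.(19): f(x) = [K(x, x_1), ..., K(x, x_N)] beta."""
    return gaussian_kernel(X_new, X_train, zeta) @ beta
```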

The implementation of the global prediction method based on the K-SVD-OMP algorithm is as follows:

Step 1) Construct the time-delayed input-output training pairs $(x_i, y_i)$ according to Eq.(15).

Step 2) Learn the dictionary $\Phi$ from the training inputs using the K-SVD algorithm of Section 2.2.

Step 3) The obtained dictionary $\Phi$ is used as the data dictionary or used to learn a new compact dictionary. Sparse coding is then performed with the OMP algorithm, and the sparse coding vector corresponding to each $x_i$ is calculated.

Step 4) Using $\alpha_i$ as the input and the corresponding target output $y_i$, a global prediction model $f(\cdot)$ is built according to Eq.(19).
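Putting the steps together, one possible end-to-end rendering is sketched below. It is not the authors' code: scikit-learn's `DictionaryLearning` (which fits dictionaries by LARS or coordinate descent rather than K-SVD proper) stands in for the dictionary learning stage, with OMP as the sparse coding transform, and `KernelRidge` with `alpha = 1/C` serves as an algebraic equivalent of KELM with a Gaussian kernel:

```python
# A sketch of the global prediction pipeline: sparse coding features -> kernel regression.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X_train = rng.standard_normal((200, 12))   # e.g. 12 NWF input variables (synthetic here)
y_train = rng.standard_normal(200)         # PV power targets (synthetic here)

# Steps 2-3: learn a dictionary and compute the sparse coding vectors via OMP
coder = DictionaryLearning(n_components=24, transform_algorithm="omp",
                           transform_n_nonzero_coefs=4, random_state=0)
A_train = coder.fit_transform(X_train)

# Step 4: global regression model on the sparse codes (gamma = 1/zeta^2; values illustrative)
model = KernelRidge(alpha=1.0 / 100.0, kernel="rbf", gamma=1.0 / 25.0)
model.fit(A_train, y_train)
y_hat = model.predict(coder.transform(X_train))
```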

4.1 Data processing

In this section, a PV power prediction experiment is carried out to verify the effectiveness of the method; the model is built from experimental data containing PV power values and other influencing factors, as formulated in Eq.(15).

The PV power datasets[23-25] provided by the organizer of the Global Energy Forecasting Competition 2014 (GEFCOM2014) are taken as an example; they are based on three adjacent solar power stations in an Australian region. The training datasets consist of PV power data from April 2012 to March 2013. The test datasets, used to examine the influence of different dictionaries and different sparse encoding algorithms on the prediction results, consist of data from April 2014 to June 2015. The data are sampled hourly. Besides the K-SVD dictionary learned from the time-delayed input data vectors, the experiment also considers the DCT dictionary, together with the least absolute shrinkage and selection operator (LASSO) algorithm as a non-dictionary-learning sparse coding method. Therefore, sparse representation modeling algorithms such as DCT-LASSO, DCT-OMP, and K-SVD-LASSO are used for comparison. The SVM method is implemented using the LIBSVM software.

Fig.1 gives a schematic diagram of the training and test datasets composing the 15 tasks. Each task consists of a training dataset and a test dataset. After the test of each task is completed, the actual PV power data of that test month are merged into the training dataset of the next task, and so on, yielding 15 tasks.

Fig.1 Training and test datasets of each task
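The rolling construction of the tasks can be expressed as an expanding-window split; the sketch below (column layout and month labels hypothetical) shows the idea with pandas:

```python
# A sketch of the expanding-window task construction (month labels hypothetical).
import pandas as pd

def make_tasks(df: pd.DataFrame, first_test_month: str, n_tasks: int = 15):
    """df: hourly records with a DatetimeIndex; yields (train, test) per task."""
    start = pd.Period(first_test_month, freq="M")
    for i in range(n_tasks):
        month = start + i
        train = df[df.index < month.to_timestamp()]          # everything before the test month
        test = df[(df.index >= month.to_timestamp())
                  & (df.index < (month + 1).to_timestamp())] # one calendar test month
        yield train, test  # after scoring, the test month simply joins the next training set
```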

For a particular zone $z$, $z \in \{1, 2, 3\}$, assume the current time is hour $h$ of day $d$, where $h = 0, 1, \ldots, 23$. Using data from the European Centre for Medium-Range Weather Forecasts (ECMWF) as the data source, and the 12 factors issued 24 hours in advance by the numerical weather forecast (NWF) as input variables, the PV power value at hour $h$ of day $d+1$ can be predicted. That is, the values issued 24 h in advance are available to each prediction model, so 24 prediction models are needed to complete the daily prediction. Considering that the PV power output at night is zero, only 16 prediction models are actually required. The 12 input factors of the prediction model and their physical meanings are shown in Table 1.

Table 1 Specific interpretation of the 12 variables

4.2 Experiment results and discussion

In the experiment, the root mean square error (RMSE) and the mean absolute error (MAE) are used to evaluate the performance of the proposed model. For the K-SVD-OMP-KELM method, a Gaussian kernel with $\zeta = 5$ and a regularization factor of $\eta = 10^2$ are selected. Fig.2 compares the prediction accuracy of the different sparse representations under different values of the sparsity $M$; the best prediction accuracy is achieved when $M$ equals 4.

Fig.2 Comparison of MAE of different prediction methods based on sparse representation algorithm
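For reference, the two indicators are computed as $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_i (\hat{y}_i - y_i)^2}$ and $\mathrm{MAE} = \frac{1}{n}\sum_i |\hat{y}_i - y_i|$; a minimal implementation:

```python
# RMSE and MAE on prediction vectors; y and y_hat are array-likes of equal length.
import numpy as np

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((np.asarray(y_hat) - np.asarray(y)) ** 2)))

def mae(y, y_hat):
    return float(np.mean(np.abs(np.asarray(y_hat) - np.asarray(y))))
```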

To further measure the prediction effectiveness of the proposed method, the prediction results of the K-SVD-OMP method are compared with those of the sparse representation modeling methods without dictionary learning and with the single KELM, SVM, and ELM methods. The DCT dictionary is used in the sparse representation methods without dictionary learning; the same Gaussian kernel function is selected for both the single KELM and the SVM, with penalty parameter $C = 16$ and insensitive loss $\varepsilon = 0.1$ for the SVM; the ELM uses $L = 50$ hidden-layer nodes with a sigmoid activation function.

First, Task 1 and Task 15 are selected as examples to analyze the proposed method. Tables 2 and 3 show the MAE and RMSE values for these tasks. According to Tables 2 and 3, the different sparse modeling methods all give good prediction accuracy. Although the MAE value of the K-SVD-OMP model combined with SVM is higher than that of the DCT-LASSO algorithm, its RMSE value is still better. Combining the K-SVD-OMP algorithm with SVM, ELM, and KELM respectively, it can be clearly seen that the accuracy of the prediction results is improved more effectively, and the K-SVD-OMP algorithm combined with KELM attains the highest accuracy among them. In addition, the evaluation indicators of the K-SVD-OMP-KELM method are also better than those of the GRNN, EEMD-SVM, and EEMD-GRNN methods.

Table 2 Results of different prediction methods (Task 1)

Table 3 Results of different prediction methods (Task 15)

Fig.3 gives the predicted and actual values of Zone 1 for Task 7 using K-SVD-OMP combined with KELM and the other sparse representation modeling methods, showing the predicted results for the first ten days of January 2015. Fig.4 then compares the corresponding prediction errors of the different methods. Figs.3 and 4 further demonstrate that the sparse representation method based on K-SVD-OMP combined with KELM achieves better prediction results.

Fig.3 Prediction results using different methods for Zone 1 in Task 7

Fig.4 Prediction errors using different methods for Zone 1 in Task 7

Fig.5 compares the absolute percentage error (APE) box plots of Zone 2 in Task 7 using the different sparse representation methods and the single SVM and KELM methods; it can be seen that the global modelling approach of K-SVD-OMP combined with KELM achieves better prediction results under the APE metric. Fig.6 further gives the prediction results of the K-SVD-OMP-KELM method for the first three days of April 2013. The K-SVD-OMP-KELM model fits the actual output values well over these three days, particularly around 24 h and 48 h ahead, showing a good prediction effect.

Fig.5 Comparison of APE box plots using different methods

Fig.6 Prediction results of Zone 3 in Task 1 using K-SVD-OMP-KELM method

Aiming at the practical PV power prediction problem, a sparse representation modeling method based on the K-SVD-OMP algorithm and the ELM with kernels is proposed. In the dictionary learning stage, the method uses the K-SVD algorithm to update the initial dictionary column by column. Compared with non-dictionary-learning algorithms that directly use the DCT dictionary, it can obtain a more compact dictionary set while satisfying the sparsity requirement, further reduces the reconstruction error, and offers a certain degree of adaptability; the OMP algorithm is used in the sparse coding stage.

The proposed sparse representation method based on the K-SVD-OMP dictionary learning algorithm and the ELM with kernels is then applied to an instance of PV power prediction and compared with other methods under the same conditions. The process of combining sparse representation with KELM is similar to the representation learning process in deep learning networks and can therefore be regarded as a feature preprocessing method. The experimental results show that, compared with the existing single SVM or ELM prediction methods, the proposed method further improves the prediction accuracy of the model and has a better ability in feature representation and prediction.
