Global Energy Interconnection
Volume 4, Issue 6, Dec 2021, Pages 596-607
Fault diagnosis of electric transformers based on infrared image processing and semi-supervised learning
Abstract
It is crucial to maintain the safe and stable operation of distribution transformers, which constitute a key part of power systems. In the event of transformer failure, the fault type must be diagnosed in a timely and accurate manner. To this end, a transformer fault diagnosis method based on infrared image processing and semi-supervised learning is proposed herein. First, we perform feature extraction on the collected infrared-image data to extract temperature, texture, and shape features as the model reference vectors. Then, a generative adversarial network (GAN) is constructed to generate synthetic samples for the minority subset of labeled samples. Unlike conventional supervised learning methods, the proposed method can also learn from unlabeled sample data. Subsequently, a semi-supervised graph model is trained on the entire dataset, i.e., both labeled and unlabeled data. Finally, we test the proposed model on an actual dataset collected from a Chinese electricity provider. The experimental results show that the use of feature extraction, sample generation, and a semi-supervised learning model can improve the accuracy of transformer fault classification. This verifies the effectiveness of the proposed method.
0 Introduction
The distribution transformer is the key equipment that ensures the stable and reliable operation of a power system. However, factors such as disrepair and external environmental conditions can cause faults or transformer dysfunction [1-3]. The common fault types include equipment discharge faults and overheating faults [4-7]. In the event of a fault, it is necessary to analyze the fault type in a timely and accurate manner to ensure that the transformer can be repaired and rapidly restored to normal operation. Therefore, research directed at improving the accuracy of fault type diagnosis of transformers is highly important [8-11].
Existing methods for fault diagnosis and condition assessment of transformers include the application of the IEC three-ratio method [12], Rogers ratio method [13], Duval triangle [6], and uncoded ratio [14] to analyze the chromatographic data of transformer oil. In addition, researchers have used infrared images to perform fault diagnosis and status assessment of equipment. If the equipment remains in a faulty operating state for a long time, the temperature of the faulty area and its surroundings will increase. Infrared images can reflect the contour shape and texture characteristics of the device to a certain extent [16-19]. From the perspective of image acquisition, the detection process is relatively straightforward, the detection time is short, and the process is not subject to electromagnetic interference. Thus, it is unnecessary to shut down the equipment [20]. However, once the collected infrared images are fed to a central monitoring system, the system operators require substantial time to handle these data. Moreover, depending entirely on the knowledge and experience of the staff may result in omission and/or misdiagnosis of fault phenomena [21-22].
At present, with the continuous development of big data and artificial intelligence technology, an increasing number of intelligent analysis systems are being proposed. These can mine the inherent information in collected data samples and improve diagnosis accuracy [23-26]. A method to apply infrared technology to the fault diagnosis of substation equipment was proposed in Reference [27]. Reference [28] put forward a method to automatically detect the oil level in electrical power transformers in substations based on infrared images. The edge points of the transformer oil conservator are identified by applying an edge detection process to an infrared image of the conservator. Then, its location and shape can be obtained by an iterative ellipse fitting approach. Reference [29] used infrared thermography data to diagnose the fault type of electrical equipment. In that study, the K-means algorithm was used to cluster the infrared images, and a support vector machine was used as a classifier to estimate the fault type. Reference [30] used infrared thermal imaging technology for the fault diagnosis of power equipment. Reference [31] used a video surveillance system to monitor the operating condition of electric equipment based on infrared theory, a temperature measurement model, and a temperature modification model. Reference [32] proposed a real-time and off-line method to monitor temperature variations and analyze the fault region of electrical equipment using infrared thermograms. Reference [33] located heating faults using infrared thermal imaging technology. Reference [34] proposed an autodiagnosis system for electrical distribution panels using a matter-element model.
Labeled sample datasets often involve sample imbalance issues, in which samples with one type of label account for a large majority of the data. During model training, more attention is paid to these majority samples, whereas the information in the remaining samples is overlooked. This reduces the generalization capability of the model and thereby causes overfitting problems. To address this, Reference [35] proposed an adaptive over-sampling method for imbalanced datasets to improve the diagnostic performance for power transformers. In that study, an adaptive synthetic minority over-sampling technique was used in the data pre-processing stage to generate new data. Based on this enriched dataset, several classification methods were used to validate the effectiveness of the oversampling method. Reference [36] used data preprocessing and gradient boosting methods for fault diagnosis of an oil-immersed transformer; the method identifies and replaces outliers to obtain denoised samples. The high dimensionality of infrared image data makes this data expansion work more challenging.
Considering this, a transformer fault diagnosis model based on image processing and semi-supervised learning is proposed in this paper. We first extract the temperature, texture, and shape features from the collected infrared image data as the feature parameters of the model. Second, the GAN algorithm is used to generate samples for the labeled sample dataset. Then, the labeled and unlabeled data are used to construct a graph-based semi-supervised learning network. Finally, the model is tested on actual data. The experimental results show that the method proposed in this paper has high accuracy. Compared with conventional methods, the contributions of this paper are as follows: 1) it extracts key information parameters from the original infrared image; 2) it reduces the imbalance between classes of labeled samples; and 3) it constructs a semi-supervised learning model, which can fully use the unlabeled data in the database to further improve the accuracy of transformer fault diagnosis.
The remainder of this paper is organized as follows: Section 1 introduces the feature extraction of infrared images. Section 2 describes the graph-based semi-supervised learning method. Section 3 presents the GAN for sample synthesis. Section 4 outlines the transformer fault diagnosis process. Case studies are presented in Section 5, and the conclusions are drawn in Section 6.
1 Feature Extraction of Infrared Images
Infrared images can accurately and effectively reflect the thermal characteristics of transformers. Infrared images are presently being used widely to evaluate the operating status of power equipment. However, effective information cannot be obtained by relying only on the originally acquired infrared image. It is necessary to extract key information from the original image. Based on the extracted feature information, the diagnosis model can be designed and analyzed more effectively, and the accuracy of the model results can be improved further. This study focuses on the extraction of the following features from an infrared image of a transformer: temperature features, texture features, and shape features.
1.1 Temperature Features of Infrared Images
Substation equipment failure is generally accompanied by drastic variations in the equipment temperature. For example, when the equipment undergoes an overheating fault, the temperature of the fault location and surrounding area increases significantly. Therefore, we selected three fundamental temperature features as the most intuitive feature information: the maximum regional temperature, average regional temperature, and variance of the regional temperature distribution. Although the temperature information can directly reflect the temperature variation characteristics of the device, it can be affected by environmental conditions (e.g., the temperature of the external environment, light intensity, and wind speed) during the acquisition of the infrared image. The recorded temperature information exhibits deviations; hence, it cannot accurately reflect the actual variations in the temperature distribution of the detected equipment. Therefore, to reduce the interference caused by variations in the external environment and limit the measurement error of the infrared thermal imager of the detection device, additional texture and shape features were incorporated in this study as characteristic parameters for the fault type analysis of transformers.
1.2 Texture Features of Infrared Images
After the integrated information on the gray value and gradient of the infrared image is extracted, we calculate the co-occurrence matrix of the two values. Then, we process it to obtain the final extracted texture information. This texture feature can be more sensitive to the boundary information of the image and reflect the roughness and uniformity of the image. This is conducive to the subsequent decision analysis. The specific process of texture feature extraction is as follows:
Let g(i, j) and f(i, j) denote the gray value and gradient, respectively, at pixel (i, j) of the transformer infrared image. The size of the image is M × M. First, these values are normalized and discretized to the ranges [0, N_g − 1] and [0, N_f − 1], respectively:
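A plausible reconstruction of these normalization formulas, assuming the standard gray-gradient co-occurrence construction (the exact discretization is not given explicitly here), is

$$ G(i,j) = \operatorname{round}\!\left[(N_g-1)\,\frac{g(i,j)-g_{\min}}{g_{\max}-g_{\min}}\right],\qquad F(i,j) = \operatorname{round}\!\left[(N_f-1)\,\frac{f(i,j)-f_{\min}}{f_{\max}-f_{\min}}\right] $$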
where g_max, g_min, f_max, and f_min represent the maximum gray value, minimum gray value, maximum gradient value, and minimum gradient value, respectively, of the image. The value at (i, j) of the gray-gradient co-occurrence matrix H is the number of pixels for which G(m, n) = i and F(m, n) = j. The normalization yields the following:
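Assuming each entry is divided by the total number of pixels, the normalized matrix takes the form

$$ \hat{H}(i,j) = \frac{H(i,j)}{\sum_{i=0}^{N_g-1}\sum_{j=0}^{N_f-1} H(i,j)} = \frac{H(i,j)}{M^{2}} $$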
On this basis, we can calculate the non-uniformity of the gray distribution, U1, and that of the gradient distribution, U2, as the texture features of the infrared images:
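The standard definitions of these two non-uniformity measures over the normalized gray-gradient co-occurrence matrix are assumed here:

$$ U_1 = \sum_{i=0}^{N_g-1}\Biggl[\sum_{j=0}^{N_f-1}\hat{H}(i,j)\Biggr]^{2},\qquad U_2 = \sum_{j=0}^{N_f-1}\Biggl[\sum_{i=0}^{N_g-1}\hat{H}(i,j)\Biggr]^{2} $$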
1.3 Shape Features of Infrared Images
This study calculates the Hu moment features of the infrared images and linearly combines them to extract moment features that are invariant to translation and rotation. The specific process is as follows:
The (p+q)-th-order moment of a two-dimensional discrete image can be expressed as
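The standard form of this raw moment for a discrete image is assumed here:

$$ m_{pq} = \sum_{x=1}^{M}\sum_{y=1}^{M} x^{p} y^{q}\, s(x,y) $$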
where s(x, y) is the pixel value at (x, y). The zero-order moment of the image, m00, is the sum of all pixel values. The centroid (x̄, ȳ) can be determined using the first-order moments of the image:
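Following the standard relation between the zero- and first-order moments,

$$ \bar{x} = \frac{m_{10}}{m_{00}},\qquad \bar{y} = \frac{m_{01}}{m_{00}} $$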
The second-order moment of the image contains the direction and size information. It is also called the moment of inertia. The third-order moment reflects the degree of distortion of the image projection, and the fourth-order moment reflects the projection kurtosis. The central moment of the image is defined as
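In the standard form assumed here,

$$ \mu_{pq} = \sum_{x=1}^{M}\sum_{y=1}^{M} (x-\bar{x})^{p}(y-\bar{y})^{q}\, s(x,y) $$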
After normalization, we obtain
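the normalized central moments, assuming the usual normalization,

$$ \eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\,\rho}},\qquad \rho = \frac{p+q}{2}+1 $$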
We can calculate the Hu moments on the basis of the normalized central moments. These are expressed as follows:
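The seven Hu invariants follow their standard definitions; the first three are reproduced here for reference:

$$ \phi_1 = \eta_{20}+\eta_{02} $$
$$ \phi_2 = (\eta_{20}-\eta_{02})^{2} + 4\eta_{11}^{2} $$
$$ \phi_3 = (\eta_{30}-3\eta_{12})^{2} + (3\eta_{21}-\eta_{03})^{2} $$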
2 Graph-based Semi-supervised Learning Method
A collected sample set mainly includes two categories: labeled samples and unlabeled samples. The proportion of unlabeled sample data is generally higher owing to factors such as the cost of data labeling. Conventional supervised learning models use only the labeled sample data for network training, so the information contained in the large amount of unlabeled data is omitted. This results in a low generalization capability of the trained model. This study addresses this problem using a graph learning model for semi-supervised learning, which enables the full use of the unlabeled sample data for model training [20].
Graph learning maps a sample set to a corresponding structure graph. Each sample element in the set corresponds to a node in the constructed graph, and an edge connects the nodes corresponding to two samples, with the weight of the edge proportional to the similarity of the original samples. Suppose the node corresponding to a labeled sample is colored, with the color type corresponding to the category of the label, whereas the node corresponding to an unlabeled sample has no color. Then, semi-supervised learning can be considered as the spreading of the colors in the graph along the paths of the edges. Because the nodes and edges in the graph can be expressed in the form of matrices, matrix operations can be used to derive the graph network [37-39].
Let D_l = {(x_1, y_1), (x_2, y_2), …, (x_l, y_l)} and D_u = {x_{l+1}, x_{l+2}, …, x_{l+u}} denote the labeled dataset and unlabeled dataset, respectively. Here, the lengths of the datasets are l and u, respectively; moreover, l < u and l + u = m, where m is the length of the total dataset. A graph G = (V, E) can be constructed based on the dataset D = D_l ∪ D_u. Here, V = {x_1, …, x_l, x_{l+1}, …, x_{l+u}} is the node set, and E is the edge set. The affinity matrix can be defined as
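Assuming the Gaussian kernel implied by the following sentence,

$$ W_{ij} = \begin{cases} \exp\!\left(-\dfrac{\lVert x_i-x_j\rVert^{2}}{2\sigma^{2}}\right), & i\neq j,\\[6pt] 0, & i=j, \end{cases} $$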
where σ is the bandwidth of the Gaussian function.
Suppose F is the mapping function learned from the graph G = (V, E). Then, F can be used for classification. Furthermore, y_i = sign(F(x_i)), with y_i ∈ {−1, 1}. The semi-supervised learning model needs to be constructed based on a fundamental assumption: similar sample inputs would have similar corresponding output values. An energy function can be defined based on this assumption:
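A standard graph-Laplacian quadratic form, consistent with the symbols defined next, is assumed here:

$$ E(F) = \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} W_{ij}\bigl(f(x_i)-f(x_j)\bigr)^{2} = F^{\mathsf T}(D-W)F $$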
where the classification results of the labeled and unlabeled data are F_l = (f(x_1); f(x_2); …; f(x_l)) and F_u = (f(x_{l+1}); f(x_{l+2}); …; f(x_{l+u})), respectively. The diagonal matrix is D = diag(d_1, d_2, …, d_m), where each diagonal element d_i is the sum of the elements of the i-th row of the matrix W.
When the energy function attains its minimum value, the classification result for the labeled samples is F(x_i) = y_i, and that for the unlabeled samples satisfies ΔF_u = 0. Here, Δ is the Laplacian matrix, Δ = D − W. D and W are divided into blocks, with the l-th row and l-th column as the dividing lines. This can be expressed as
Then, Eq.(18) can be rewritten as
Let P = D^{-1}W. We can obtain
Then, Eq. (21) can be rewritten as
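In the standard harmonic-function formulation, with P partitioned in the same way as D and W, the unlabeled part takes the closed form below; this is offered as a plausible form rather than an exact reproduction:

$$ F_u = (I - P_{uu})^{-1} P_{ul}\, F_l $$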
We can use the data of D_l to obtain F_l = (y_1; y_2; …; y_l). Then, we can obtain F_u to classify the unlabeled data.
For the multi-label classification problem, let Y denote the label set. Accordingly, a graph G = (V, E) can be constructed. In addition, a new non-negative label matrix J must be constructed, in which the label of sample x_i corresponds to the i-th row of the matrix J. The classification rule is
For i ∈ [1, m] and j ∈ [1, |Y|], the matrix J needs to be normalized. This is given as
The matrix W is used to construct the label transfer matrix S:
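The symmetrically normalized form used in standard label propagation is assumed:

$$ S = D^{-1/2}\, W\, D^{-1/2} $$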
Then, the iteration formula can be expressed as
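Assuming the standard label-propagation scheme,

$$ J(t+1) = \beta\, S\, J(t) + (1-\beta)\, J(0) $$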
where β ∈ (0, 1) is a model hyper-parameter. After Eq. (26) converges, we can obtain
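the converged solution in this standard formulation,

$$ J^{*} = \lim_{t\to\infty} J(t) = (1-\beta)\,(I-\beta S)^{-1} J(0) $$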
The above iterative process is essentially equivalent to solving an optimization problem:
where μ > 0 is the regularization parameter of the function. When μ = (1 − β)/β (the standard relationship in this formulation), the optimal solution of Eq. (28) is the convergent solution of Eq. (27).
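As an illustration of the graph-based semi-supervised procedure described in this section, the following minimal sketch constructs the Gaussian affinity graph and runs the β-weighted propagation iteration. The feature matrix, label encoding, and default parameter values are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

def propagate_labels(features, labels, beta=0.7, sigma=1.0, n_iter=1000):
    """Graph-based semi-supervised classification by label propagation.

    features : (m, d) array of extracted infrared-image features
    labels   : (m,) array of class indices; -1 marks unlabeled samples
    beta     : propagation hyper-parameter in (0, 1)
    sigma    : bandwidth of the Gaussian kernel
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    m = len(features)

    # Gaussian affinity matrix W with zero diagonal.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetrically normalized transfer matrix S = D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    S = D_inv_sqrt @ W @ D_inv_sqrt

    # Initial label matrix J(0): one-hot rows for labeled samples, zeros otherwise.
    classes = np.unique(labels[labels >= 0])
    J = np.zeros((m, len(classes)))
    for k, c in enumerate(classes):
        J[labels == c, k] = 1.0
    J0 = J.copy()

    # Iterate J(t+1) = beta * S * J(t) + (1 - beta) * J(0) until convergence.
    for _ in range(n_iter):
        J_next = beta * (S @ J) + (1.0 - beta) * J0
        if np.abs(J_next - J).max() < 1e-6:
            J = J_next
            break
        J = J_next

    # Assign each sample the class with the largest propagated score.
    return classes[np.argmax(J, axis=1)]
```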
3 Sample Synthesis Based on Generative Adversarial Network
3.1 Fundamental principles
The generative adversarial network (GAN) was inspired by binary zero-sum game theory. It is divided into two adversarial parts: the generator and the discriminator. The function of the generator is to learn the characteristics of the actual data and generate synthetic data based on them. The function of the discriminator is to distinguish the source of the input sample, i.e., to correctly assess whether the input sample is from the actual sample set or the synthetic sample set. As the training of the GAN progresses, the capabilities of the generator and discriminator improve gradually [40].
A structural diagram of the GAN is shown in Fig. 1. The input of the generator G is a random noise vector z; the noise is generally Gaussian or uniformly distributed, and the output is the synthesized sample data G(z). The inputs of the discriminator D are the real data x and the generated data G(z), and its objective is to perform binary classification on them: if the input is judged to come from a real sample, it outputs one; otherwise, it outputs zero. The output of the discriminator is then backpropagated to optimize the parameters and thereby improve the capability of G (so that the distribution of G(z) is as close as feasible to the distribution of the actual data p_data). Simultaneously, the discriminator improves its classification performance. The two continue to optimize through this adversarial training. When the discriminator cannot determine which dataset an input sample originated from, it can be considered that the generator has learned the distribution characteristics of the actual data [41].
Fig.1 Flowchart of GAN
3.2 Generative Adversarial Network
The GAN is a generative model. According to probability and statistics theory, a generative model usually includes a probability distribution function P_G(x, θ) with a parameter θ and can be trained to approximate the probability distribution of the real data, P_data(x), to the extent feasible. Then, new synthetic data can be generated by sampling from this probability distribution [42-44].
If the probability distribution function P_G(x, θ) is restricted further (e.g., it is assumed to conform to a Gaussian distribution), the maximum likelihood method is generally used to optimize θ. Assume that a set of real samples exists, and that these samples conform to the probability distribution P_data(x) and are mutually independent. Let p_data(x) and p_G(x, θ) be the probability density functions of P_data(x) and P_G(x, θ), respectively. Then, the likelihood function of this sample set is
Take the logarithm of both sides of the likelihood function:
The objective of the generative model is to determine a maximum likelihood estimate that maximizes the value of the likelihood function:
Because the samples are drawn from the real data distribution P_data(x), Eq. (31) can be rewritten as
By transforming the above formula into an integral operation, we obtain
The concept of KL divergence is introduced below. It is an index used to measure the degree of difference between two probability distributions. The KL divergence is zero when the two probability distributions are identical; a higher KL divergence implies a larger difference between the two probability distribution functions. The following is the expression for the KL divergence:
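The standard expression is assumed here:

$$ D_{\mathrm{KL}}\bigl(P_{\mathrm{data}}\,\|\,P_G\bigr) = \int p_{\mathrm{data}}(x)\,\log\frac{p_{\mathrm{data}}(x)}{p_G(x,\theta)}\,\mathrm{d}x $$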
The objectives of the discriminator are to maximize the output value D(x) for samples x drawn from the true probability distribution P_data(x) and to minimize the output value D(G(z)) for the pseudo data generated by the generator G(z). To express and calculate the objective function of the GAN model conveniently, logarithms of D(x) and D(G(z)) are taken, imitating the log-likelihood operation described above. Because both D(x) and D(G(z)) lie in the range from zero to one, the maximization of log(D(x)) and the minimization of log(D(G(z))) can be expressed in the same formula by replacing log(D(G(z))) with log(1 − D(G(z))). Finally, the expectations of log(D(x)) and log(1 − D(G(z))) are taken to obtain the discriminator's objective function.
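Combining the generator and discriminator objectives yields the familiar minimax value function of the standard GAN formulation, which is assumed to be the objective used here:

$$ \min_{G}\max_{D} V(D,G) = \mathbb{E}_{x\sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr] + \mathbb{E}_{z\sim p_{z}(z)}\bigl[\log\bigl(1-D(G(z))\bigr)\bigr] $$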
3.3 Training steps
The optimization of a GAN is a minimax problem. In the actual training process, the generator and discriminator models have their respective loss functions, and the two are trained alternately. The detailed steps for training a GAN are given below; a code sketch follows the list:
(1) Given the probability distribution P_data(x) of the actual data and the prior distribution P_z(z), set the hyper-parameters n and η. In addition, initialize the parameters φ and θ of the discriminator and generator, respectively.
(2) Extract real samples {x_1, x_2, …, x_n} from P_data(x), and collect random noise samples {z_1, z_2, …, z_n} from the prior distribution. Then, input the noise samples into the generator to obtain the generated data {G(z_1), G(z_2), …, G(z_n)}. Use the gradient ascent method to update the parameter φ of the discriminator. That is,
(3) Repeat Step 2 k times.
(4) Extract random noise samples {z_1, z_2, …, z_n} from the prior distribution, and update the parameter θ of the generator using the gradient descent method:
(5) Repeat Steps 2, 3, and 4 until the network converges or attains the set maximum number of iterations.
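A minimal sketch of the alternating training loop in Steps (1)-(5), written with PyTorch for concreteness. The `Generator` and `Discriminator` modules, the tensor `real_data`, and the default hyper-parameter values are illustrative assumptions; the discriminator is assumed to output a sigmoid probability of shape (n, 1).

```python
import torch
import torch.nn as nn

def train_gan(G, D, real_data, noise_dim=16, n=64, eta=2e-4, k=1, max_iter=5000):
    """Alternating GAN training corresponding to Steps (2)-(5).

    G, D      : generator and discriminator nn.Module instances (placeholders)
    real_data : (N, d) tensor of minority-class feature vectors
    n         : mini-batch size; eta : learning rate; k : D updates per G update
    """
    bce = nn.BCELoss()  # binary cross-entropy: -[y log D + (1 - y) log(1 - D)]
    opt_D = torch.optim.Adam(D.parameters(), lr=eta)
    opt_G = torch.optim.Adam(G.parameters(), lr=eta)

    for _ in range(max_iter):
        # Steps (2)-(3): update the discriminator k times. Gradient ascent on
        # log D(x) + log(1 - D(G(z))) is implemented as descent on the BCE loss.
        for _ in range(k):
            idx = torch.randint(0, real_data.size(0), (n,))
            x = real_data[idx]
            z = torch.randn(n, noise_dim)
            fake = G(z).detach()  # block gradients from flowing into G
            loss_D = bce(D(x), torch.ones(n, 1)) + bce(D(fake), torch.zeros(n, 1))
            opt_D.zero_grad()
            loss_D.backward()
            opt_D.step()

        # Step (4): update the generator. The common non-saturating surrogate
        # (maximize log D(G(z))) is used in place of minimizing log(1 - D(G(z))).
        z = torch.randn(n, noise_dim)
        loss_G = bce(D(G(z)), torch.ones(n, 1))
        opt_G.zero_grad()
        loss_G.backward()
        opt_G.step()
    return G
```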
4 Transformer Fault Diagnosis Process
4.1 Fault Diagnosis Process
Image processing and graph-based semi-supervised learning methods are used to realize the fault diagnosis of transformers. The specific process is shown in Fig. 2. First, the infrared image data of the transformers are collected as a sample case library, and the temperature, texture, and shape features of each infrared image are extracted as the characteristic features. The sample case library is divided into two categories: the labeled dataset and the unlabeled dataset. For the labeled dataset, the imbalance ratio of the samples is first calculated; it is equal to the number of samples in the majority subset divided by the number of samples in the minority subset. If the imbalance ratio is higher than the set threshold, the labeled dataset is considered to have a sample imbalance issue, and it is necessary to use the GAN to generate samples for the minority subset. The imbalance threshold in this study is set to three. Then, the labeled dataset and unlabeled dataset are used to construct a graph network, train it, and finally output the fault type of the device. This study mainly considers two types of substation equipment failures: equipment defect failures and overheating failures.
Fig.2 Fault diagnosis process
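The imbalance check described above can be illustrated with a short sketch; the threshold of three matches the value used in this study, while the function and variable names are illustrative.

```python
from collections import Counter

def check_imbalance(labels, threshold=3.0):
    """Return (needs_gan_augmentation, imbalance_ratio) for a labeled dataset.

    The imbalance ratio is the size of the majority subset divided by the size
    of the minority subset; GAN-based sample generation is triggered when the
    ratio exceeds the threshold (set to 3 in this study).
    """
    counts = Counter(labels)  # e.g. {"overheating": 141, "defect": 41}
    ratio = max(counts.values()) / min(counts.values())
    return ratio > threshold, ratio
```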
4.2 GAN Structure
The structure of the GAN is shown in Fig. 3. The generator model is based mainly on an LSTM; it includes an embedding layer, an LSTM network layer, a fully connected layer, and a softmax layer. The discriminator model is based mainly on a CNN; it includes an embedding layer, a two-dimensional convolutional layer, a one-dimensional max-pooling layer, a fully connected layer, and a softmax layer. The input features of the GAN model are the features extracted from the infrared images, i.e., the temperature, texture, and shape features.
5 Case Studies
5.1 Evaluation Metrics
The performance of the model in terms of fault classification is evaluated by constructing a confusion matrix, as shown in Table 1. For classification problems with imbalanced data, the classification performance of the model cannot be evaluated completely based only on the classification accuracy (ACC). Therefore, based on the confusion matrix, we use two additional evaluation indicators: recall (REC) and precision (PRE), defined after Table 1.
Fig.3 Structure of GAN
Table 1 Confusion matrix
                   Estimated label: 1     Estimated label: 0
Actual label: 1    True Positive (TP)     False Negative (FN)
Actual label: 0    False Positive (FP)    True Negative (TN)
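In terms of the confusion-matrix entries above, the three indicators take their usual forms:

$$ \mathrm{ACC} = \frac{TP+TN}{TP+TN+FP+FN},\qquad \mathrm{REC} = \frac{TP}{TP+FN},\qquad \mathrm{PRE} = \frac{TP}{TP+FP} $$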
5.2 Field test results
The infrared images of transformers were collected from a Chinese electric company. The numbers of unlabeled and labeled samples are 359 and 182, respectively. Of the labeled samples, 41 correspond to equipment defect faults and 141 to overheating faults. The imbalance ratio of the labeled samples is 3.43, which is higher than the set imbalance threshold; therefore, samples of the minority labeled class need to be generated using the proposed GAN. First, the temperature, texture, and shape features are extracted from the infrared images as the characteristic features. Then, the GAN model is used to generate data for the labeled samples of the defect faults. Subsequently, these characteristic features of both the labeled and unlabeled samples are fed as model inputs to train the semi-supervised graph classification model. Finally, the well-trained classification model is applied in a field test. Fig. 4 illustrates the tested infrared image of the transformer. Based on the results of the algorithm presented in this paper, it is concluded that the transformer has an overheating fault, and this information is sent to the field staff in a timely manner. An inspection by the staff revealed that the transformer was damaged at the point marked by a black box in the figure; the damage caused the transformer to overheat. This observation also verifies the effectiveness of the method proposed in this paper.
Fig.4 Field test results
5.3 Hyper-parameter Selection
We first analyze the effect of the selection of the hyper-parameter β. Let β ∈ {0.1, 0.3, 0.5, 0.7, 0.9}. The ACC, REC, and PRE classification results for the two types of faults are shown in Fig. 5. It is evident that the classification performance of the proposed model is the highest when β = 0.7, which determines the setting of this hyper-parameter.
Fig.5 Effects of different hyper-parameter values
5.4 Effects of Feature Extraction
In this subsection, we test the effectiveness of feature extraction on the original data. The following methods are designed for comparison. Method 1 uses the original image information as the input of the model; note that because the dimensionality of the input vector is high in this case, no sample expansion method is used. Methods 2-4 each extract only one feature of the infrared image: the temperature feature, texture feature, and shape feature, respectively. The classification results for the two types of faults are shown in Table 2.
Table 2 Effects of different feature extraction methods
Fault type         Method     ACC (%)  REC (%)  PRE (%)
Equipment defect   Method 1   52.2     53.1     54.7
                   Method 2   62.4     64.8     63.6
                   Method 3   68.5     70.2     71.1
                   Method 4   74.5     72.6     76.8
                   Proposed   82.2     84.7     83.1
Overheating        Method 1   57.5     60.1     58.8
                   Method 2   65.5     68.2     67.7
                   Method 3   72.3     77.2     74.1
                   Method 4   78.8     75.6     80.1
                   Proposed   86.2     84.8     83.5
The results in Table 2 reveal that effective feature extraction can significantly improve the model's capability to classify transformer faults. Although the original input vector has a high dimensionality, the model cannot learn effectively from it owing to the limited number of samples, which results in a decrease in classification accuracy. The performance of the model can be improved moderately by using simple feature extraction; however, the use of only one type of feature quantity is biased and cannot fully reflect the input information. It should be noted that, among the three feature quantities, the improvement corresponding to the temperature feature is the lowest and that corresponding to the shape feature is the highest. This indicates that the temperature feature is subject to interference from the external measurement environment and is also affected by the accuracy of the measurement tool, both of which ultimately affect its reliability. In contrast, the Hu moments used as shape features are more robust. Combining the advantages of the above features, the method in this study uses all three types of information simultaneously, so that the infrared image features can be extracted fully and accurately. This, in turn, significantly improves the classification accuracy of the model.
5.5 Effects of Semi-supervised Graph Model
We test the effect of using the graph network for semi-supervised learning to classify the infrared images of substation equipment. The following two methods are designed for comparison. A support vector machine (SVM) is used in Method 1 for fault classification; the SVM model uses the radial basis function as the kernel function, the penalty factor is set to 0.2, and the parameter of the kernel function is set to 10^4. Method 2 uses a multi-layer neural network (DNN); the number of network layers is set to five, the learning rate is set to 0.001, and the learning period is set to 3000. Because Methods 1 and 2 both involve supervised learning, they use only the labeled sample dataset for model training. The classification results for the two faults are shown in Table 3.
Table 3 Effects of different classification models
Fault type         Method     ACC (%)  REC (%)  PRE (%)
Equipment defect   Method 1   65.1     67.5     64.5
                   Method 2   71.5     73.2     72.1
                   Proposed   82.2     84.7     83.1
Overheating        Method 1   70.2     72.2     70.9
                   Method 2   74.4     73.3     71.0
                   Proposed   86.2     84.8     83.5
Table 3 shows that the classification accuracy of the method proposed in this paper is significantly higher than that of the other two methods. This is because Methods 1 and 2 are supervised learning methods; effective classification therefore requires a large number of labeled samples for model training. However, in practice, it is highly difficult to obtain such labeled samples. Training on a limited set of labeled samples results in a poor classification performance and causes over-fitting problems, which lead to low generalization capability. In contrast, the graph learning method proposed in this paper is a semi-supervised learning framework. It can learn the mapping relationship from labeled samples and effectively mine information from a large number of unlabeled samples. This significantly improves the classification capability of the model and ensures that it has a good generalization capability. Therefore, it is more suitable for practical applications, particularly for the fault classification of transformers, because labeled fault samples are difficult to obtain.
5.6 Effects of GAN
In this section, we verify the effectiveness of generating minority samples through the GAN. Two comparison methods are designed to better assess the model classification performance. The first method uses only the originally collected database for analysis, i.e., no sample generation method is used. The second method applies the model with oversampling. The transformer fault classification results for the different methods are shown in Table 4.
Table 4 Effects of different sample generation methods
Fault type         Method     ACC (%)  REC (%)  PRE (%)
Equipment defect   Method 1   61.1     64.2     60.5
                   Method 2   71.5     72.0     74.1
                   Proposed   82.2     84.7     83.1
Overheating        Method 1   64.4     65.3     61.1
                   Method 2   68.5     71.1     67.7
                   Proposed   86.2     84.8     83.5
Table 4 shows that the strategy of sample synthesis can improve the accuracy of model classification. Although the classification performance can be improved by applying the oversampling strategy, the improvement is relatively small. This is because the oversampling method does not add new data samples and only repeats the original samples; this strategy results in overfitting of the trained model and a reduction in its generalization capability. In contrast, this study uses the GAN to generate new samples. The sample generation process considers the spatial distribution of the minority sample data, which ensures the randomness and effectiveness of the generated samples. This, in turn, improves the generalization capability of the model and enhances its capability to classify infrared images of transformers.
5.7 Effects of Number of Samples
Because both labeled and unlabeled data are applied to train the model, it is worthwhile to investigate the effect of the proportion of these samples on the model performance. Herein, we set the ratio of labeled data to unlabeled data to a value from {1/3, 1/2, 1, 2, 3}. The transformer fault classification results for the different ratios are shown in Table 5.
Table 5 ACC of model with different proportions of samples
Data proportion     1/3    1/2    1      2      3
Equipment defect    78.3   80.5   82.2   83.5   84.1
Overheating         79.2   81.5   82.9   85.2   86.0
Table 5 shows that the ACC of the model increases (which indicates an improvement in the model performance) with the increase in the proportion of labeled samples. This is because the labeled samples contain more information for model training. Note that the proposed model performs better than the compared methods for all five cases.
6 Conclusions
A fault diagnosis model for transformers is proposed based on image processing, graph-based semi-supervised learning, and a GAN. First, the key feature parameters, including the temperature, texture, and shape features, are extracted from the infrared images. Then, the GAN algorithm is applied to the labeled sample data to generate samples for the minority class. Finally, a graph-based semi-supervised network is constructed and trained on all the data, including the labeled and unlabeled samples. The experimental results show that the method presented in this paper achieves a high accuracy of fault classification for distribution transformers. The advantages of the proposed method are as follows: (1) the key feature parameters are extracted from the original infrared images to ensure the validity and universality of the input information; (2) the problem of imbalance between classes of labeled samples is reduced, and the model's performance is improved; and (3) the information of the unlabeled samples is fully used, which reduces the model's dependence on labeled sample data and makes it more suitable for the actual fault diagnosis of transformers.
7 Acknowledgements
This work was supported by the China Southern Power Grid Co., Ltd. science and technology project (Research on the theory, technology and application of stereoscopic disaster defense for power distribution network in large city, GZHKJXM20180060) and the National Natural Science Foundation of China (No. 51477100).
Declaration of Competing Interest
We declare that we have no conflict of interest.