Global Energy Interconnection
Volume 1, Issue 5, Dec 2018, Pages 618-626
A quantitative analysis model of grid cyber physical systems
Abstract
Conventional power systems are being developed into grid cyber physical systems (GCPS) with widespread application of communication, computer, and control technologies. In this article, we propose a quantitative analysis method for a GCPS. Based on this, we discuss the relationship between cyberspace and physical space, especially the computational similarity within the GCPS both in undirected and directed bipartite networks. We then propose a model for evaluating the fusion of the three most important factors: information, communication, and security. We then present the concept of the fusion evaluation cubic for the GCPS quantitative analysis model. Through these models, we can determine whether a more realistic state of the GCPS can be found by enhancing the fusion between cyberspace and physical space. Finally, we conclude that the degree of fusion between the two spaces is very important, not only considering the performance of the whole business process, but also considering security.
1 Introduction
The Global Energy Internet provides a basis for open and interconnected interactions between businesses, as well as key technical requirements for the security and stability of each business system [1]. The power system, the physical equipment, and its monitoring system (which is connected with the relevant information processing and control communication systems) constitute the power cyber physical system of each business link. Conventional power systems are developing into grid cyber physical systems (GCPS) with widespread application of communication, computer, and control technologies [2].
GCPS involves three primary key technologies: information, communication, and security [3,4]. These three key technologies jointly build the basic platform for interaction between cyberspace and physical space. In the global energy interconnection environment, GCPS will encounter complications in dealing with complex information processing, communication security, and information security problems [4,5]. Data networks in GCPS not only store monitoring data, detection data, control data, and so on, but also keep the measuring instruments connected across the whole network. To address these needs, we can construct a data network infrastructure not only to exchange valuable data, but also to mine both the data itself and the global data in order to inform business decisions. The GCPS communication network connects physical devices and data resources, enabling communication among devices, between devices and data resources, and among data resources. Current infrastructures between physical devices are not compatible, so the reliability of communication among them varies. Diversity in the physical infrastructure introduces numerous risks; more complicated security measures are required to ensure the safety of the whole system [6], as opposed to today’s local defense measures. Security will be a systematic project, both in physical space and cyberspace.
In the current implementation of intelligent substations, we can see that data accuracy, data availability, and the effectiveness of state determination depend upon the effectiveness of terminal-aware equipment. In evaluating the state of the sensors, errors in data and disturbances of the equipment state are not distinguishable, and the errors in analysis are enlarged once the data is uploaded to the processing node [7]. At this time, optimization evaluation and decision algorithms are insufficient to solve the problem. In the connected global energy environment, a small error in the perception of equipment will be amplified by data iteration and transmission, compromising the availability of the data and the credibility of analysis based on it. The number and type of connected terminals will be greater than at present, leading to a significant reduction in communication efficiency and affecting the operational efficiency of services with low-latency requirements. As the scale of the network increases exponentially, its reliability and accuracy are difficult to guarantee. At the same time, sensors’ erroneous and abnormal data are included in data analysis, resulting in unreliable analysis results. How to intelligently identify data across the widely interconnected terminals to ensure efficient filtering of weak data, lightweight data fusion analysis, confidential transmission of high-quality data, and quality improvement and fusion of data on the terminal side is a major challenge GCPS is facing today [8].
Fusion consists of three factors: information, communication, and security. In fusing physical space and cyberspace, these three areas interact with each other, each playing a specific role in the GCPS. If we only consider overall GCPS performance, we should seek to enhance the fusion of all three factors, especially the information and communication technologies. However, taking security into consideration, information risks will spread to the physical space and can cause industrial disasters through the tightly coupled fusion of the three factors. Thus, in this article, we mainly discuss the degree of fusion within the GCPS, and propose some practical models for quantitative analysis of the fusion process.
2 Background
2.1 Density-based clustering (DENCLUE algorithm)
Density-based clustering is a traditional statistical approach to the grouping problem in massive data. As a major technology of data mining, clustering analysis is the most common and basic method used to study classification [9,10]. The basic idea of clustering is to assume that there are different degrees of similarity between the physical or abstract object samples in a set. According to the similarity between them, the sample objects are divided into groups or clusters. The objects with high similarity are clustered together, and the objects in different groups have a high degree of dissimilarity. The greater the similarity (homogeneity) within a group and the greater the difference between groups, the better the clustering is.
Clustering analysis provides a valuable tool for searching for data objects from a given set and dividing the objects into meaningful or useful groups (clusters). If the goal is to divide into meaningful groups, the clusters should capture the natural structure of the data. In many applications, all objects in a cluster are often treated or analyzed as a single object. In a sense, clustering analysis is only the starting point for other purposes (such as data aggregation). Assume x1, …, xn are variables of the f function:
where K() is the kernel and h is the bandwidth [11]. K must satisfy two basic properties:
Usually, we set the part of the equation of the
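As a rough illustration of the kernel density estimate that DENCLUE builds on, the following minimal sketch evaluates the standard estimator f(x) = 1/(n h^d) Σ K((x − xi)/h). The Gaussian kernel, the function names `gaussian_kernel` and `kde`, and the sample data are assumptions made for this example; the Gaussian kernel is simply one common choice that is non-negative and integrates to one.

```python
import numpy as np

def gaussian_kernel(u):
    """Gaussian kernel (assumed choice): non-negative, integrates to 1 over R^d."""
    d = u.shape[-1]
    return np.exp(-0.5 * np.sum(u * u, axis=-1)) / (2.0 * np.pi) ** (d / 2.0)

def kde(x, samples, h):
    """Kernel density estimate f(x) = 1/(n h^d) * sum_i K((x - x_i) / h)."""
    samples = np.asarray(samples, dtype=float)   # shape (n, d)
    x = np.asarray(x, dtype=float)               # shape (d,)
    n, d = samples.shape
    return gaussian_kernel((x - samples) / h).sum() / (n * h ** d)

# Hypothetical 1-D readings: three close together, one far away.
samples = np.array([[0.0], [0.1], [0.2], [5.0]])
dense = kde([0.1], samples, h=0.5)    # density inside the tight group
sparse = kde([2.5], samples, h=0.5)   # density in the gap between groups
```

Points where the estimated density is high are candidates for cluster centers, while low-density regions separate clusters. The bandwidth h controls how far each sample's influence reaches.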
2.2 Similarity computation algorithms in recommender systems
Among the points in cyberspace (we take each device as a point), each point has some relationship with some other point (or points); a change in one point will affect the related points. The relationship among the points may be a linear or nonlinear correlation. As this relationship is very important, we can analyze it using orthogonal matching pursuit (OMP) algorithms for linear correlation analysis [12,13], and empirical mode decomposition (EMD) algorithms for nonlinear problems [15,16].
2.2.1 Orthogonal matching pursuit (OMP) [14]
Considering one point s and its neighbors X, find a subset of k neighbor points such that |s − f(x1, …, xk)| is minimized. When f is a linear function, f(x1, …, xk) is a linear regression.
The aim is to compute a linear combination of the k points that makes the result most similar to the statistical readings of point s. We compute by first finding xi with the maximum correlation r with s:
Then we compute
Replacing s with the residual res and repeating the procedure, we get:
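The greedy selection described above can be sketched as follows. This is a minimal version of textbook OMP, assuming the readings of the neighbor points are stored as columns of a matrix `X` and the readings of point s as a vector `s`. The function name `omp` and the synthetic data are illustrative assumptions, not the exact procedure of [14].

```python
import numpy as np

def omp(X, s, k):
    """Greedy OMP: pick k columns of X that best explain s linearly.

    X: (n_samples, n_neighbors) readings of the neighbor points.
    s: (n_samples,) readings of the target point.
    Returns (selected column indices, coefficients, residual).
    """
    X = np.asarray(X, dtype=float)
    s = np.asarray(s, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0                  # guard against all-zero columns
    res = s.copy()
    selected = []
    coef = np.zeros(0)
    for _ in range(k):
        # Correlation of every neighbor with the current residual.
        corr = np.abs(X.T @ res) / norms
        if selected:
            corr[selected] = -np.inf         # never pick a neighbor twice
        selected.append(int(np.argmax(corr)))
        # Re-fit all selected neighbors jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(X[:, selected], s, rcond=None)
        res = s - X[:, selected] @ coef
    return selected, coef, res

# Synthetic example: s depends linearly on neighbors 1 and 4 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
s = 2.0 * X[:, 1] - 3.0 * X[:, 4]
selected, coef, res = omp(X, s, k=2)
```

The norm of the returned residual plays the role of res_k: a large value suggests that a linear combination of k neighbors does not explain the point well.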
2.2.2 Empirical mode decomposition (EMD) [16]
The main principle of nonlinear regression is to compute the nonlinear basis based on linear regression algorithms. In other words, in the regression equation we should construct a function for each coefficient of xi. Each ai here is a function, and resk represents the residual of the signal s.
In the EMD algorithm, we employ the basic OMP linear regression algorithm to compute the nonlinear basis. First, we can get the nonlinear basis for each point by EMD. Then, we can compute the respective coefficient ai and get the nonlinear equation. Taking a1 for example, we get the first basis via EMD, denoting the basis as g1. Then, we compute the minimum value:
We find that the computation procedure is the same as OMP, so we employ the OMP algorithm to get the value. The other bases, gi, can be computed in the same way. In the end, we can get the nonlinear regression as follows:
where gi indicates the ith basis and resk represents the residual of the signal s.
2.3 User–item based recommender algorithms
In a bipartite network, the relationship between the two parties is of particular interest. Usually, we can use the links between the two parties and the weight of each link to represent the quantitative value of the relationship between all points of each side. There are two typical algorithms to solve this problem: heat conduction and mass diffusion.
2.3.1 Heat conduction method [17,18]
To keep the situation simple, we take the user–item bipartite network as an example. Users choose the items, and the items reflect the recommendations for the users’ side. The heat conduction method considers the value delivered to each point through its connections. After summing all the values, the point’s net value is found by accounting for the point’s degree. We can denote the procedure as follows:
where valuej represents the value of point j delivered to point i.
2.3.2 Mass diffusion method [18,19]
Again taking the user–item network as an example, the mass diffusion method differs from heat conduction in that it only takes the net value from the other side into account. We depict the method as follows:
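To make the difference between the two methods concrete, here is a minimal sketch of the formulations as they are commonly stated in the recommender-system literature; they may differ in detail from the equations used in this article. The adjacency matrix `A`, the function names, and the item values are illustrative assumptions.

```python
import numpy as np

def heat_conduction(A, values):
    """Receiver averages what it gets: v_i = (1 / k_i) * sum_j A[i, j] * values[j]."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)            # degree k_i of each receiving point
    deg[deg == 0] = 1.0            # isolated points simply receive 0
    return (A @ np.asarray(values, dtype=float)) / deg

def mass_diffusion(A, values):
    """Sender splits its value evenly: v_i = sum_j A[i, j] * values[j] / k_j."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=0)            # degree k_j of each sending point
    deg[deg == 0] = 1.0
    return A @ (np.asarray(values, dtype=float) / deg)

# Rows: 2 users, columns: 3 items; A[i, j] = 1 if user i chose item j.
A = np.array([[1, 1, 0],
              [0, 1, 1]])
item_values = np.array([1.0, 2.0, 3.0])
user_heat = heat_conduction(A, item_values)
user_mass = mass_diffusion(A, item_values)
```

In this formulation heat conduction normalizes by the degree of the receiving point, whereas mass diffusion normalizes by the degree of the sending point, so the total value is conserved when it diffuses from one side to the other.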
3 Analysis model
3.1 The overall structure of the quantitative analysis model for GCPS
In a GCPS, the relationship between cyberspace and physical space is very important. Taking security for example, a minor mistake or an attack in the information space may cause a chain reaction in the physical space, leading to disasters. In 2015, hackers attacked the Ukrainian industrial control information systems and caused a major power failure. Conversely, the physical space is reflected in the information space: minor problems with a physical device will cause fluctuations in the monitoring data. If the fluctuation is out of control, the data can cause inaccurate statistical results, leading to poor decisions. Thus, the relationship between the two sides is very important; they do not operate independently, but they are also not fully coupled. Given these concerns, we can depict the structure of the GCPS as follows:
Fig.1 GCPS qualitative analysis model
This structure can be divided into three parts: cyberspace, denoted by C(c1, …, cn); physical space, denoted by P(p1, …, pn); and the cyber–physical relationship layer, denoted by RCP. In cyberspace, the devices are represented as points, and they may have relationships between them. If ci and cj are related, we create an edge lcij between them. We can describe the physical space similarly.
The relationship layer RCP, in contrast, reflects the connections between points belonging to different sides. We denote these relationships by Lcipj. The figure above shows the bipartite network of the GCPS model, which is used for the quantitative analysis below.
3.2 Clustering algorithm for the GCPS structure
For each side, we employ the DENCLUE algorithm to classify the points. Since the points are not absolutely independent, some devices will have relationships with other points. In their original state, the point–point links can be depicted as cause-and-effect relationships, time series relationships, adjacent relationships, and so on. We then divide the points into four types based on the transmission of information: non-communicating, transmitting-only, receiving-only, and both transmitting and receiving. Finally, we can classify them according to the points they send information to and receive information from.
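The four transmission types can be derived directly from the directed links between points. The sketch below assumes the links are available as (sender, receiver) pairs; the function name `classify_points` and the label strings are illustrative assumptions.

```python
def classify_points(points, links):
    """Label each point by its role in information transmission.

    points: iterable of point identifiers.
    links:  iterable of (sender, receiver) pairs.
    """
    links = list(links)
    senders = {src for src, _ in links}
    receivers = {dst for _, dst in links}
    labels = {}
    for p in points:
        if p in senders and p in receivers:
            labels[p] = "transmitting-and-receiving"
        elif p in senders:
            labels[p] = "transmitting-only"
        elif p in receivers:
            labels[p] = "receiving-only"
        else:
            labels[p] = "non-communicating"
    return labels

# Hypothetical devices: a sends to b, b forwards to c, d is isolated.
labels = classify_points(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
```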
Assuming that the information is data (non-data information can be used with the network connections to classify the related points), we can employ the DENCLUE algorithm as follows:
stopping when k>0. Through this procedure, we can decide whether the point should be classified into the group by the results of
Using the steps above, we can get the preliminary groups of points for each side. However, the data sent between devices is not efficiently used; to make better use of this data, we first normalize it. The most common method is to map the data using the following linear normalization equation:
However, linear normalization cannot deal with a situation where the device has a disturbance in data. Thus, we use another normalization method, sigmoid normalization, as follows:
This method amplifies the middle of the data range and thus can accommodate abnormal devices. The normalized data is then processed using OMP and EMD from section 2. First, the OMP algorithm is used to find the k most related points of each point, and then we analyze the residual resk. A threshold must be set in advance to decide whether to employ the EMD nonlinear regression algorithm: if resk > threshold, we refine the regression using EMD. By applying the clustering procedure in this way, we can construct the GCPS model.
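The two normalization schemes can be compared with a short sketch. It assumes the usual min–max form for linear normalization and a logistic function of the centered, scaled reading for sigmoid normalization. The `center` and `scale` parameters and the sample readings are assumptions introduced for illustration, and the exact parameterization used in this article may differ.

```python
import numpy as np

def minmax_normalize(x):
    """Linear normalization: map readings onto [0, 1] by their range."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def sigmoid_normalize(x, center=None, scale=None):
    """Sigmoid normalization: logistic function of (x - center) / scale.

    center and scale default to the mean and standard deviation (assumed defaults).
    """
    x = np.asarray(x, dtype=float)
    center = x.mean() if center is None else center
    scale = x.std() if scale is None else scale
    if scale == 0:
        return np.full_like(x, 0.5)
    return 1.0 / (1.0 + np.exp(-(x - center) / scale))

# Hypothetical readings: four normal values and one disturbed device.
readings = np.array([10.0, 11.0, 12.0, 13.0, 500.0])
lin = minmax_normalize(readings)
sig = sigmoid_normalize(readings, center=12.0, scale=1.0)
```

With min–max scaling, the single disturbed reading compresses the four normal readings into a tiny interval near zero. With the sigmoid centered on the typical reading, the normal readings stay well separated and the disturbed reading saturates near one.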
Fig.2 Clustered points of the cyber–physical relationship
As shown in Fig. 2, cyberspace points are denoted by C(C1, …, CN) and physical space points are denoted by P(P1, …, PN); the relationships between the clusters are shown by lines. To simplify, we map the relationships as:
Taking the physical point Pj for example, we represent the normalized value as Ipj, and the degree as Dpj. The same can be done for the cyberspace point; the normalized value of Ci is Ici and the degree of Ci is denoted as DCi. Thus, we can define the quantitative analysis model of the undirected bipartite network.
Fig.3 Simplified bipartite GCPS network
4 A quantitative analysis model for GCPS
4.1 Quantitative analysis model for the undirected bipartite network
As above, we constructed an undirected bipartite network based on the GCPS structure. In the bipartite network, the links between points on each side can deliver the value of each point. Taking point Pj for example: the value Ipj is divided into several parts depending on the degree Dpj, and each part of the value is delivered through the links with Pj. Thus, we can see the tendency of the value to influence the connected neighbors, and the degree of the point’s influence is the quantitative indicator of how strongly the two points influence each other in the GCPS model. That means we can use the influence of each pair of points as the quantitative indicator of the GCPS model.
The problem then becomes simple, since the only real difficulty is computing the quantitative value of each pair. In the simplified model from Fig. 3, we can use the heat conduction and mass diffusion methods (from section 2) to compute the values as follows:
Heat conduction method:
Taking the pair of points (Ci, Pj) as an example, ValueCi can be calculated as follows:
We can also get the value of Pj:
Mass diffusion method:
Similarly, taking the pair of points (Ci, Pj), the value of each point can be calculated as follows:
Through the value computation procedure, we can get the values of each side:
The value of each point represents the trend of influence the point has upon the other side.
From the analysis above, we can see that the undirected bipartite network can be quantitatively calculated for both sides, so the quantitative procedure is somewhat like that of a directed network. However, we want to take the real directed links in the GCPS model into consideration in order to represent the important evaluation factors for the quantitative model, resulting in the following refined quantitative model.
4.2 Quantitative analysis model for GCPS considering the directed bipartite network
In the previous section, we presented an undirected bipartite network quantitative model for GCPS. However, the real situation is more complicated [20]. Usually, the sensors in a GCPS collect information and send it to the host (or servers), and actuators in the system are given control instructions to execute the command. We can group this process into two parts: information gathering (sensing) and command executing (controlling).
To reflect the procedure from Fig. 3, sensing and controlling are represented by {Pj→Ci} | (1≤i≤N, 1≤j≤M) and {Ci→Pj} | (1≤i≤N, 1≤j≤M), respectively. Based on this, we can construct the directed bipartite network from Fig. 3. Assume that the directed links between the points of each side differ in how they deliver values. Taking (Ci, Pj) for example, the links between the two points can be given as Ci→Pj and Pj→Ci. These two links differ from each other in their ability to deliver values, so we can assign a function to represent the ability of each link: F(Ci, Pj) and F(Pj, Ci). The computation procedure for values also changes as follows:
Heat conduction method:
Mass diffusion method:
The function of the link between each pair of points can be determined by the specific GCPS model. However, in the directed network, we can add three factors to evaluate the GCPS fusion degrees: information, communication, and security. All three factors are directed and thus can form the following evaluation model.
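As an illustration of the directed value computation in this subsection, the sketch below applies a heat-conduction-style update in which each directed link carries its own delivery factor. Representing the link function F as a matrix of per-link factors, the function name `directed_heat_conduction`, and all numeric values are assumptions made for this example; the actual form of F depends on the specific GCPS model.

```python
import numpy as np

def directed_heat_conduction(F, values):
    """v_i = (1 / k_i) * sum_j F[i, j] * values[j].

    F[i, j] > 0 is the assumed delivery factor of the directed link j -> i,
    and F[i, j] == 0 means there is no such link. k_i is the in-degree of i.
    """
    F = np.asarray(F, dtype=float)
    deg = np.count_nonzero(F, axis=1).astype(float)
    deg[deg == 0] = 1.0            # points without incoming links receive 0
    return (F @ np.asarray(values, dtype=float)) / deg

# Sensing links P_j -> C_i (rows: cyber points, columns: physical points).
F_sense = np.array([[0.9, 0.5],
                    [0.0, 1.0]])
# Controlling links C_i -> P_j (rows: physical points, columns: cyber points).
F_ctrl = np.array([[0.2, 0.0],
                   [0.6, 0.3]])

value_c = directed_heat_conduction(F_sense, [2.0, 4.0])  # values received by C
value_p = directed_heat_conduction(F_ctrl, [1.0, 3.0])   # values received by P
```

Because the sensing and controlling matrices are different, the value a cyber point receives from a physical point need not match the value it delivers back, which is the asymmetry the directed model is meant to capture.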
5 Discussion
In the preceding two sections, we discussed the value computation procedures in which we find the impact of each point on the other side. If we get the weight of the point on each side, we can easily get the overall impact of one side on the other side. The weight here depends on the real situation of the GCPS. The overall impact can be calculated as follows:
For cyberspace, suppose each point Ci has a weight, 1≤i≤N; the overall impact quantitative value can be calculated by:
The overall impact of the physical space can also be calculated by:
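A minimal sketch of the overall impact calculation follows, assuming the overall impact of one side is the weighted sum of its point values. The function name `overall_impact` and the sample values and weights are illustrative assumptions.

```python
import math

def overall_impact(values, weights):
    """Weighted sum of point values: sum_i w_i * value_i (assumed form)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return math.fsum(w * v for w, v in zip(weights, values))

# Hypothetical point values and weights for the cyberspace side.
cyber_values = [1.9, 4.0]
cyber_weights = [0.25, 0.75]
cyber_impact = overall_impact(cyber_values, cyber_weights)
```

The same function can be applied to the physical side with its own values and weights, which gives two scalars that can be compared directly.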
Having the values for the two sides is not enough; the three factors for evaluating fusion degree cannot be reflected by the impact alone because, in reality, the three factors do not always have the same importance. In our evaluation model, we not only consider the fusion of cyberspace and physical space, but also consider the fusion of the information, communication, and security factors.
For the information evaluation factor, we should consider the point’s degree and information entropy; for the communication factor, we should take the communication latency into consideration; and for the security factor, we should consider the risk-spreading mechanisms in the whole system. The model is complicated because we must consider every factor at the same time, and each factor by itself is complex as well. To simplify the problem, we assume that in reality, the importance of each factor has already been modelled and has a weight indicating its respective importance. The weights of the three factors are as follows:
where αI, βC, γS indicate the importance (weight) of the information, communication, and security factors, respectively. For each point of the two sides, the value delivered to the adjacent point should consider these three factors. In other words, the real value in the directed network should consider the impact of the factors. Taking the pair (Ci, Pj) from the heat conduction method for example, the value computation procedure would be:
These three values can be represented as a point in a three-dimensional coordinate system. Thus, we construct the main evaluation model for the fusion of GCPS as follows:
The three factors (information, communication, and security) can be the three dimensions of the coordinate system, and based on this, we can construct the quantitative fusion analysis unit cubic of the GCPS models (QFAUC-GCPS). As shown in Fig. 4, the point VCi is the impact of Ci on its adjacent neighbors.
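One way to picture a point in the unit cubic is sketched below. It assumes, purely for illustration, that each coordinate is the point's delivered value scaled by the corresponding factor weight; the names `FusionPoint`, `fusion_point`, and `in_unit_cube`, and the sample numbers, are hypothetical and do not come from the model's equations.

```python
from typing import NamedTuple

class FusionPoint(NamedTuple):
    information: float
    communication: float
    security: float

def fusion_point(value, alpha_i, beta_c, gamma_s):
    """Scale one delivered value by each factor weight (illustrative assumption)."""
    return FusionPoint(alpha_i * value, beta_c * value, gamma_s * value)

def in_unit_cube(p):
    """True if every coordinate lies in [0, 1]."""
    return all(0.0 <= c <= 1.0 for c in p)

# Hypothetical normalized value of C_i and factor weights.
v_ci = fusion_point(0.8, alpha_i=0.5, beta_c=0.3, gamma_s=0.2)
```

Plotting such points for all cyber and physical nodes gives two clouds inside the cube, whose extent along each axis indicates which factor and which space dominate.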
Based on the above model, we can evaluate the importance of the three factors (information, communication, and security) for the grid cyber-physical system. To verify the effectiveness of the proposed method, we take the monitoring and control of voltage stability in an IEEE 14-node system as an example of a cyber–physical process, as follows:
Fig.4 QFAUC-GCPS unit cubic
As shown in Fig. 5, the physical side is an IEEE 14-node system, where the advanced application function is RTVSMAC with a control cycle of predetermined intervals. Each substation in the power grid is connected in parallel with two capacitors and reactors. During system operation, the voltage stability and reliability index of each node is determined based on the state information. When the flow approaches or exceeds the threshold, the dispatch and control center takes corresponding steps (e.g., adjusting the switching state of loads, capacitors, or reactors) according to the severity until the current and voltage values of the node are within acceptable limits.
In this case, the cyberspace impact area is more important than the physical space impact area. So, we set the unit cubic analysis model as follows:
Fig.5 Cyber–physical process of an IEEE 14-node system
Fig.6 Quantitative areas of the cyber and physical spaces
In Fig. 6, the values of the points create two regions: the cyberspace impact region and the physical space impact region. The cyberspace impact region is more important than the physical space impact region; therefore, in our cube, the distribution of information nodes is much larger than the distribution of physical nodes. But in some other scenarios, the physical space impact area is more important than the cyberspace impact area, so the distribution of nodes in the cube reflects that. In the unit cubic figure, we can easily judge which one is more merged with the other and can propose a fusion of the GCPS indicators to evaluate the performance or security of the model.
6 Conclusion
In this paper, we discussed the problem of quantitatively analyzing a GCPS. We proposed that the fusion of cyberspace and physical space should be balanced to some degree, neither a full fusion nor isolation of the two spaces. We proposed a quantitative model based on bipartite network theories. Based on that primitive model, we further developed a directed-network evaluation model. After getting the values of each point in the network, we integrated the weighting of the points and calculated the overall value to evaluate the fusion of the GCPS model. However, that work did not include the three most important factors: information, communication, and security. So, we continued developing the model by proposing a quantitative fusion analysis unit cubic for the mass diffusion GCPS model. Through this cubic, we can judge which space is more likely to dominate the model. Also, we can find which point will most strongly influence adjacent points.
Acknowledgements
This work is supported by The National Key Research and Development Program of China (Title: Basic Theories and Methods of Analysis and Control of the Cyber Physical Systems for Power Grid (Basic Research Class 2017YFB0903000)) and the State Grid Science and Technology Project (Title: Research on Architecture and Several Key Technologies for Grid Cyber Physical System, No. SGRIXTKJ[2016]454).
References
[1] Voropai N, Podkovalnikov S, Osintsev K (2018) From interconnections of local electric power systems to Global Energy Interconnection. Global Energy Interconnection, 1(1):4-10
[2] Kumar N, Singh M, Zeadally S et al (2017) Cloud-Assisted Context-Aware Vehicular Cyber-Physical System for PHEVs in Smart Grid. IEEE Systems Journal, 11(1):140-151
[3] Zhang Y, Zhang J, Zhang S et al (2018) Architecture and roadmap of standard system for Global Energy Interconnection. Global Energy Interconnection, 1(3):225-235
[4] Li B, Lu R, Wang W et al (2017) Distributed host-based collaborative detection for data injection attacks in smart grid cyber-physical system. Journal of Parallel & Distributed Computing, pp:32-41
[5] Maasoumy M (2013) Controlling Energy-Efficient Buildings in the Context of Smart Grid: A Cyber Physical System Approach. Health Policy, 96(3):239-244
[6] Choudhari A, Ramaprasad H, Paul T et al (2013) Stability of a Cyber-physical Smart Grid System Using Cooperating Invariants. In: Cyber-Physical Systems (ICCPS), 2013 ACM/IEEE International Conference on Cyber-Physical Systems, pp:240-240
[7] Sun Y, Guan X, Liu T et al (2013) A cyber-physical monitoring system for attack detection in smart grid. In: IEEE Conference on Computer Communications Workshops. IEEE, 2013, pp:33-34
[8] Wang P, Ashok A, Govindarasu M (2015) Cyber-physical risk assessment for smart grid System Protection Scheme. In: Power & Energy Society General Meeting. IEEE, 2015, pp:1-5
[9] He J, Pan W (2010) A DENCLUE based approach to neuro-fuzzy system modeling. In: International Conference on Advanced Computer Control. IEEE, 2010, pp:42-46
[10] Idrissi A, Rehioui H, Laghrissi A et al (2015) An improved DENCLUE algorithm for data clustering. In: IEEE 2015 International Conference on Information and Communication Technology and Accessibility, ICTA. IEEE
[11] Hinneburg A, Gabriel HH (2007) DENCLUE 2.0: fast clustering based on kernel density estimation. In: Advances in Intelligent Data Analysis VII, International Symposium on Intelligent Data Analysis, IDA 2007, Ljubljana, Slovenia, September 6-8, 2007, Proceedings. DBLP, 2007, pp:70-80
[12] Liu E (2010) Thresholding Orthogonal Multi Matching Pursuit. Measurements, 2010
[13] Xie Z (2009) Iterative Orthogonal Matching Pursuit and Sparse Solution. Microelectronics & Computer, 26(10):53-56
[14] Huang F, Zhu Y (2013) Improving Orthogonal Matching Pursuit Algorithm Based on Local Property. Journal of Qingdao University of Science & Technology, 2013
[15] Huang N, Shen Z, Long S et al (1998) The Empirical Mode Decomposition and the Hilbert Spectrum for Nonlinear and Nonstationary Time Series Analysis. Proceedings Mathematical Physical and Engineering Sciences, 454(1971):903-995
[16] Zhan L, Li C (2016) A Comparative Study of Empirical Mode Decomposition-Based Filtering for Impact Signal. Entropy, 19(1):13
[17] Zhou T, Ren J, Medo M et al (2007) Bipartite network projection and personal recommendation. Phys Rev E, 76:046115
[18] Zhang Y, Blattner, Yu YK (2007) Heat conduction process on community networks as a recommendation model. Phys Rev Lett, 99:154301
[19] Wu T, Liu S, Ni M (2018) Model design and structure research for integration system of energy, information and transportation networks based on ANP-fuzzy comprehensive evaluation. Global Energy Interconnection, 1(2):137-144