the sequential algorithm. Figure 6(b) shows the absolute error between the rotor angles computed by the sequential algorithm and the PRA. The maximum absolute rotor angle error is only 0.2° and the error follows the same contour as the rotor angle variation. The small maximum error and the matching contour demonstrate the accuracy of the PRA.
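The kind of error comparison described above can be illustrated with a minimal parareal sketch in Python. It is not the implementation used in this work: the test equation y' = -2y, the forward Euler coarse propagator, the RK4 fine propagator, and all function names are assumptions chosen for illustration, and the fine propagators run in a serial loop here rather than on a GPU.

```python
def rk4_step(f, t, y, h):
    """One classical Runge-Kutta (RK4) step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def fine(f, t0, y0, dt, m):
    """Fine propagator: m RK4 sub-steps across one coarse interval dt."""
    h = dt / m
    y = y0
    for i in range(m):
        y = rk4_step(f, t0 + i * h, y, h)
    return y


def coarse(f, t0, y0, dt):
    """Coarse propagator: a single forward Euler step (assumed choice)."""
    return y0 + dt * f(t0, y0)


def sequential(f, y0, t_end, n, m):
    """Reference solution: fine propagator applied interval after interval."""
    dt = t_end / n
    u = [y0]
    for i in range(n):
        u.append(fine(f, i * dt, u[i], dt, m))
    return u


def parareal(f, y0, t_end, n, m, iterations):
    """Parareal over n coarse intervals with the given number of iterations."""
    dt = t_end / n
    # Initial guess from a sequential coarse sweep.
    u = [y0]
    for i in range(n):
        u.append(coarse(f, i * dt, u[i], dt))
    for _ in range(iterations):
        # Independent per interval; this is the part a GPU would run in parallel.
        f_old = [fine(f, i * dt, u[i], dt, m) for i in range(n)]
        g_old = [coarse(f, i * dt, u[i], dt) for i in range(n)]
        # Sequential predictor-corrector sweep.
        new = [y0]
        for i in range(n):
            new.append(coarse(f, i * dt, new[i], dt) + f_old[i] - g_old[i])
        u = new
    return u


def max_abs_error(a, b):
    """Maximum absolute difference between two solution sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


def decay(t, y):
    """Illustrative test problem y' = -2y (not a generator model)."""
    return -2.0 * y


reference = sequential(decay, 1.0, 2.0, 10, 100)
err_coarse_only = max_abs_error(parareal(decay, 1.0, 2.0, 10, 100, 0), reference)
err_converged = max_abs_error(parareal(decay, 1.0, 2.0, 10, 100, 10), reference)
print(err_coarse_only, err_converged)
```

With as many iterations as coarse intervals, parareal reproduces the sequential fine solution up to rounding; the practical benefit comes from stopping after far fewer iterations.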
Figure 6. Rotor angle variation using the sequential algorithm and the PRA for the classical generator model. (a) Rotor angle variation with sequential and PRA simulations; (b) Rotor angle variation error with PRA simulations.
Figure 7. Time-domain simulation comparison of rotor angle for the unstable system. (a) Time-domain solution: sequential vs. PRA for the unstable system; (b) Absolute error between sequential and PRA for the unstable system.
The second set of simulations was performed with a fault clearing time of 1.0 s, which is larger than the critical clearing time of 0.42 s. The critical clearing time is the time before which a fault has to be cleared for the power system to transition to a stable steady state; it depends on both the fault and the system steady state. Figure 7(a) shows the time-domain solution of the rotor angle with the sequential algorithm and the PRA. As expected, the rotor angle keeps increasing because the system is unstable. The rotor angle computed using the PRA follows the angle from the sequential algorithm, demonstrating the suitability of the PRA even when the system is unstable. Figure 7(b) presents the absolute error between the sequential and PRA solutions. The magnitude of the error increases while following the contour of the rotor angle variation. This is because the numerical solutions computed by the coarse propagator are numerically unstable (they grow numerically). Furthermore, these numerically unstable values are used as the initial conditions for the fine propagators, which amplifies the numerical instability. The numerical instability degrades the performance of the predictor-corrector stage of the PRA, resulting in an increasing error. This behavior is expected as the system is unstable.
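The role of the critical clearing time can be illustrated with a small sketch of the classical swing equation for a SMIB system. All parameter values below (inertia constant, mechanical power, maximum electrical power, and a fault that drops the electrical power to zero) are hypothetical and are not those of the test system used in this paper; damping is neglected.

```python
import math

# Hypothetical SMIB parameters (per unit); not the values of the paper's test system.
H = 5.0          # inertia constant, s
F_NOM = 60.0     # nominal frequency, Hz
P_MECH = 0.8     # mechanical input power
P_MAX = 2.0      # peak electrical power before and after the fault
OMEGA_S = 2 * math.pi * F_NOM


def accel(delta, faulted):
    """Rotor acceleration from the undamped swing equation."""
    p_elec = 0.0 if faulted else P_MAX * math.sin(delta)
    return OMEGA_S / (2 * H) * (P_MECH - p_elec)


def simulate(t_clear, t_end=3.0, dt=1e-3):
    """RK4 integration of (angle, speed deviation); the fault is applied at t = 0."""
    delta = math.asin(P_MECH / P_MAX)   # pre-fault equilibrium angle, rad
    omega = 0.0                         # speed deviation, rad/s
    trajectory = [delta]
    for step in range(int(round(t_end / dt))):
        faulted = step * dt < t_clear
        k1d, k1w = omega, accel(delta, faulted)
        k2d, k2w = omega + dt / 2 * k1w, accel(delta + dt / 2 * k1d, faulted)
        k3d, k3w = omega + dt / 2 * k2w, accel(delta + dt / 2 * k2d, faulted)
        k4d, k4w = omega + dt * k3w, accel(delta + dt * k3d, faulted)
        delta += dt / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        omega += dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        trajectory.append(delta)
    return trajectory


stable = simulate(t_clear=0.1)     # cleared early: the angle oscillates and stays bounded
unstable = simulate(t_clear=0.5)   # cleared late: the angle keeps increasing
print(max(stable), unstable[-1])
```

For these hypothetical values the equal-area criterion puts the critical clearing time at roughly 0.28 s, so the two clearing times fall on either side of it.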
5.2. Simulations Using the Detailed Generator Model
The ODEs of the detailed generator model, which incorporates saliency and transient reactances, are typically stiffer than the ODEs of the classical model. Due to the stiffness, the maximum time step that could be used without the solution diverging, for both the sequential simulation and the coarse propagator, was found to be 70 ms compared to 98 ms for the classical model. Therefore, the simulation parameters, i.e., the coarse and fine propagator time steps, the fault location and type, and the fault duration, are identical to those of the simulations using the classical model. The simulations with the detailed model are carried out over a longer period of time to study the effect of saliency and damping.
Figure 8(a) presents the rotor angle variations with the sequential and PRA simulations. The rotor angle solutions computed using the PRA are similar to those of the traditional sequential method. The rotor angle swing is also damped and settles to a new system steady state. Figure 8(b) presents the absolute rotor angle error between the sequential and PRA simulations. The rotor angle computation with the detailed model involves the sequential solution of four ODEs at each time step and the application of the predictor-corrector to all four ODEs at the end of each fine propagator iteration. The numerical values of the solutions of the four ODEs are large during the initial phase of the fault. These large values cascade through the four ODEs within a fine propagator iteration, and because of this cascading effect the numerical error is initially large, as shown in Figure 8(b). After the fault is cleared, damping reduces the rotor angle swing and, correspondingly, the numerical values, resulting in a smaller numerical error between the sequential and PRA simulations.
Figure 9(a) shows the rotor angle variation with time when the fault clearing time is 1.3 s. This clearing time is larger than the critical clearing time of 0.77 s, and therefore the system is unstable. The rotor angle variation computed using the PRA follows that computed using the sequential algorithm. Figure 9(b) shows the absolute error between the PRA solution and the sequential solution. The absolute error is larger because the error cascades through the four ODEs.
5.3. Performance Analysis
The performance of PRA is analyzed using the execution time speedup achieved with respect to the traditional sequential algorithm. The speedup is given by Equation (21).
Figure 8. Time-domain simulation comparison of rotor angle for detailed model. (a) Time-domain Solution: Sequential v/s PRA; (b) Absolute error between sequential and PRA.
Figure 9. Time-domain simulation comparison of Rotor angle for unstable system. (a) Time-domain solution: Sequential v/s PRA for unstable system; (b) Absolute error between sequential and PRA for unstable system.
In Equation (21), the speedup is the ratio of the computation time of the sequential algorithm to the execution time of the PRA.
The PRA time is referred to as the execution time because it is the sum of five time components, as shown in Equation (22):
the computation time of the coarse propagator on the host;
the memory transfer latency from the host to the GPU;
the computation time of the fine propagators on the GPU;
the memory transfer latency from the GPU to the host;
the computation time of the predictor-corrector on the host.
N is the number of iterations.
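The speedup and the execution time breakdown can be written as a short Python helper. This is only a sketch: the function and argument names are hypothetical, the timings in the usage lines are placeholders rather than measured values, and the assumption that each of the N iterations incurs all five components once is made for illustration only.

```python
def pra_execution_time(t_coarse, t_host_to_gpu, t_fine, t_gpu_to_host,
                       t_pred_corr, n_iterations):
    """Total PRA execution time in seconds.

    Assumption of this sketch: each iteration pays all five components once.
    """
    per_iteration = (t_coarse + t_host_to_gpu + t_fine
                     + t_gpu_to_host + t_pred_corr)
    return n_iterations * per_iteration


def speedup(t_sequential, t_pra):
    """Ratio of sequential computation time to PRA execution time."""
    return t_sequential / t_pra


# Placeholder timings in seconds (not measurements from this paper).
t_pra = pra_execution_time(0.5, 0.25, 2.0, 0.25, 1.0, n_iterations=2)
print(t_pra)                 # 8.0
print(speedup(40.0, t_pra))  # 5.0
```

Keeping the components separate makes it easy to see which term dominates the execution time for a given configuration.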
The coarse propagator computation time depends on the coarse propagator time step and on the fixed time interval T over which the ODEs are solved. For a fixed T, the coarse propagator computation time increases as the coarse propagator time step decreases. Both memory transfer latencies also depend on the coarse propagator time step. For a given T, the number of fine propagators corresponding to a coarse propagator time step equals the number of coarse time steps in T, i.e., T divided by the coarse propagator time step.
By varying the coarse propagator time step, the number of threads executing in parallel on the GPU cores is varied, and by varying the fine propagator time step, the computational load of each thread is varied.
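The relationship between the two time steps and the GPU workload can be sketched as follows. The simulation time of 3.06 s is the value used for the classical model; the coarse and fine time steps and the function name are hypothetical choices for illustration, and one GPU thread per fine propagator is assumed.

```python
def gpu_workload(t_total, dt_coarse, dt_fine):
    """Return (number of fine propagators, fine steps per propagator).

    One fine propagator, i.e. one GPU thread, is assumed per coarse interval.
    """
    n_fine_propagators = round(t_total / dt_coarse)
    steps_per_thread = round(dt_coarse / dt_fine)
    return n_fine_propagators, steps_per_thread


# Hypothetical time steps; T = 3.06 s as in the classical-model runs.
print(gpu_workload(3.06, 0.03, 0.0001))    # (102, 300)
print(gpu_workload(3.06, 0.015, 0.0001))   # (204, 150)
```

Halving the coarse time step doubles the number of parallel threads while halving the number of fine steps each thread integrates.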
The speedup achieved using the PRA is demonstrated through a number of simulations with varying coarse and fine propagator time steps. Table 3 presents the execution times of both the sequential algorithm and the PRA when computing the solutions of the ODEs of the classical model, along with the speedups. The simulation time T was set to 3.06 s to account for the first-swing stability with the classical model of the generator. The execution time of the PRA executing on the GPU is significantly less than the sequential algorithm computation time, resulting in a speedup of 25×. The PRA on the GPU provides better performance when the fine propagator computational load is large, i.e., when the fine propagator time step is smaller.
Table 3. PRA execution time and speedup with classical model.
Figure 10 shows the variation of the speedup for the classical model. Since the speedup increases linearly, the parallel scalability of the PRA has a strong scaling efficiency. The strong scaling efficiency is due to the fine propagator computation time being significantly larger than the sum of the remaining four time components in Equation (22).
Table 4 presents the execution times of both the sequential algorithm and the PRA when computing the solutions of the ODEs of the detailed model, along with the speedups. The simulation time T was set to 26.1 s to account for the long-term dynamic stability with the detailed model of a generator. Table 4 shows that the PRA execution time is significantly smaller than the sequential computation time, resulting in a speedup of 31×. It is important to emphasize that with the detailed model, the number of ODEs solved sequentially at each time step is twice that with the classical model.
Figure 10. Variation of speedup for the classical model.
Table 4. Execution time and speedup for detailed model.
Figure 11 shows the variation of the speedup for the detailed model. The speedup does not increase linearly but flattens out, indicating that the parallel scalability has a weak scaling efficiency. The weak scaling is due to the sum of the coarse propagator computation time and the two memory transfer latencies being larger than the fine propagator computation time. This sum is large because four ODEs are solved at each time step and because the larger coarse propagator solutions must be transferred from the host to the GPU and back. The performance of the PRA with the detailed model is memory bound.
Therefore, from Figure 11, it is evident that the performance of the PRA decreases as higher-order models with a larger number of differential equations are implemented to study dynamic stability. However, the execution time of the PRA is still significantly smaller than the computation time of the sequential algorithm, demonstrating the suitability of the PRA for near real-time transient stability analysis.
Figure 11. Variation of speedup for the detailed model.
TSA performed using the time-domain solution approach is a compute-intensive problem and is typically conducted offline by utilities. In this paper, the use of the PRA on GPUs to solve the ODEs of two synchronous generator models of a SMIB test system for TSA has been demonstrated successfully. The PRA was evaluated for accuracy with both stable and unstable cases of the test system. The absolute error between the ODE solutions of the PRA and the sequential algorithm is small, with a maximum of 0.2° for the stable case with the classical model, demonstrating the accuracy of the PRA. The PRA speedup achieved using GPUs demonstrated that the numerical integration computation time can be significantly reduced in comparison to traditional sequential numerical integration. However, the PRA is an iterative algorithm, and its performance can suffer from the significant amount of memory transfer between the host and the device for systems with higher-order generator models. In future work, methods to reduce the memory transfers between the host and the device will be explored, and the PRA will be tested with higher-order generator models for large power systems.
This work was supported in part by the Department of Energy under grant DESC0012671.
Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.
Copyright © 2020 by authors and Scientific Research Publishing Inc.
This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.