Implementation and optimization of partial element equivalent circuit-based solver

Abstract: The Partial Element Equivalent Circuit (PEEC) method is an integral-equation-based full-wave approach for solving combined circuit and electromagnetic problems in the time and frequency domains. With PEEC, an electromagnetic problem is transferred to the circuit domain and then solved using circuit theory, which makes the method highly flexible for combined electromagnetic and circuit modeling. The method can therefore be applied to different classes of problems, for example power electronics systems and antenna simulation, both to ensure that the system functions correctly and to comply with electromagnetic compatibility (EMC) regulations. Other methods, such as the Finite Element Method and the Finite Difference Time Domain method, are also used for electromagnetic analysis, and an optimal computer implementation is needed to handle such problems within a reasonable time and at a given accuracy.
This work presents the development and optimization of PEEC-based software on different computing platforms. The aim of the acceleration is to solve problems with the PEEC method as fast as possible, with optimum memory usage, on a regular computer system. The PEEC-based solver was developed for desktop machines using the GMM++ linear algebra package. This implementation was optimized by improving the code, using more efficient libraries, and adapting the program to run on powerful machines. Another part of the optimization was to implement smart algorithms such as non-uniform meshing and on-the-fly calculation of the partial elements. More recently, the code has been enhanced to take advantage of multicore hardware by replacing the old library with the Intel Math Kernel Library (MKL), so that it uses the several processors found in a typical computer system.
To solve very large problems, which may for example need several hundred gigabytes of memory, the code was also ported to parallel computer systems.
The parallel PEEC solver is compatible with the distributed memory architecture and is therefore scalable: a set of computing units collaborate to solve a problem. By allocating a sufficient number of processors and amount of memory, the load of the solution is distributed over the different computing units. One of the challenges of this kind of distributed computation is to distribute the data in a balanced manner while minimizing data movement between nodes, which slows down execution. Balanced distribution of the data was ensured by using the Basic Linear Algebra Subprograms (BLAS) and Scalable Linear Algebra Package (ScaLAPACK) libraries to handle the linear algebra calculations in the parallel solver. With these tools, the matrix elements are distributed according to the block-cyclic decomposition scheme, which guarantees that data is uniformly assigned to each computing node. Several test cases were run to benchmark the computer implementation. Owing to the applied optimization techniques, the sequential solver needs less memory and computes the solution remarkably faster than before. As an example, high-frequency simulations can now be run with the optimized code in a shorter time and with less memory, because non-uniform meshing produces a very light mesh. The benchmark results of the parallel solver agreed very well with physical measurements and showed an acceptable speed-up factor as the number of processors and the size of the problems grew. The robustness of the parallel solver was verified by stressing the code with the largest test case, a problem with more than 250 000 unknowns. Further acceleration would focus on smarter algorithms and numerical methods such as the Fast Multipole Method (FMM), iterative solvers in place of direct solvers, and QR decomposition.
An important issue to consider is the approximation that such techniques introduce into the results, which can appear as numerical instability and loss of accuracy.
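
To illustrate the block-cyclic decomposition mentioned above, here is a minimal Python sketch of the standard index mapping. It is not the solver's own code: the function names, the 2 x 3 process grid, the block size, and the matrix size are assumptions chosen for illustration, and the first block is assumed to start on process 0. In practice the distribution is handled by ScaLAPACK itself.

```python
from collections import Counter


def block_cyclic_owner(i, nb, p):
    """Map global index i to (owning process, local index).

    nb is the block size and p the number of processes along this
    dimension; the first block is assumed to live on process 0.
    """
    block = i // nb                      # which block the index falls in
    owner = block % p                    # blocks are dealt out round-robin
    local = (block // p) * nb + i % nb   # position inside the owner's storage
    return owner, local


def distribution_counts(n, nb, prow, pcol):
    """Count how many entries of an n x n matrix each process holds
    on a prow x pcol process grid (hypothetical helper for this sketch)."""
    counts = Counter()
    for i in range(n):
        r, _ = block_cyclic_owner(i, nb, prow)
        for j in range(n):
            c, _ = block_cyclic_owner(j, nb, pcol)
            counts[(r, c)] += 1
    return counts


# Illustrative sizes only: a 12 x 12 matrix, 2 x 2 blocks, 2 x 3 grid.
counts = distribution_counts(n=12, nb=2, prow=2, pcol=3)
for proc in sorted(counts):
    print(proc, counts[proc])
```

With these sizes the matrix dimension is a multiple of the block size times each grid dimension, so every process receives the same number of entries. For other sizes the counts can differ by part of a block row or block column, which is why the scheme balances the load well for large matrices.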