An application of the NSGA-II algorithm in Pareto joint inversion of 2D magnetic and gravity data

Joint inversion is a widely used geophysical method that allows model parameters to be obtained from the observed data. Pareto inversion results are a set of solutions that include the Pareto front, which consists of non-dominated solutions. All solutions from the Pareto front are considered the most feasible models from which a particular one can be chosen as the final solution. In this paper, it is shown that models represented by points on the Pareto front do not reflect the shape of the real model. In this contribution, a collective approach is proposed to interpret the geometry of models retrieved in inversion. Instead of choosing single solutions from the Pareto front, all obtained solutions were combined in one “heat map”, which is a plot representing the frequency of points belonging to all returned objects from the solution set. The conducted experiment showed that this approach limits the problem of equivalence and is a promising way of representing the geometry of the model that was retrieved in the inversion process.


INTRODUCTION
The inversion problem is generally understood to recreate the parameters of a physical system or the causes of a phenomenon based on indirect measurements and observations. Inversion is used in many different fields and is an indispensable tool in situations where it is impossible or unfavourable to obtain data directly about the system under study.
In geophysical terms, inversion is the reconstruction of the parameters of the M model on the basis of measurement data d. The relationship between the observations (measurement data) and the recovered parameters of the model is described by the formula d = G(M) T , where: d -vector of measurement data with the dimension l, d = (d 1 , ..., d l ) T , M -vector of model parameters with dimension n, M = (m 1 , ..., m n ) T , G -operator of the relationship between model parameters and measurement data (Tarantola 1987). In order to find the extreme of the objective function, various types of optimization algorithms can be used: local, global, heuristic or stochastic. In this article, the optimization approach was used to solve the inverse problem.
In the case of a multiobjective function, where more than one component is present, two key problems arise. Firstly, the returned summed results of different methods are very difficult to compare and require appropriate, usually arbitrary weighting. This operation can be very laborious and significantly affects the results obtained for different data sets. The second problem is the correct uncertainty analysis of the solution for two or more combined objective functions. The elimination of the weighting of functions seems to be the key issue in the preparation of a fully functional and efficient method of solving the problem of joint inversion.
Data from only one geophysical method used in inversion is often insufficient to retrieve information about the geological formation, therefore combining several methods increases the quality and usefulness of the models obtained (Vozoff & Jupp 1975). In order to ensure the independent operation of each of the objective functions and to avoid the troublesome modification of them (transfer to the same domain in the case of information loss), the Pareto approach turns out to be helpful (Miernik et al. 2016). The Pareto methodology enables the simultaneous optimization of many different objective functions within one domain without the need for scaling or weighting (Kung et al. 1975). The newly found solution is only accepted if at least one component of the solution vector is better than in the previous step and no other component has deteriorated (Zaefferer et al. 2013).

THEORETICAL BACKGROUND Joint inversion
Joint inversion is used to interpret two or more geophysical data sets. The idea of joint inversion in geophysics was introduced as late as 1975 (Vozoff & Jupp 1975), about a hundred years after its initial conception. Initially, attempts were focused on similar methods based on the same physical field (Amatyakul et al. 2017), from which one target function was combined; however, information from observations based on different fields is more desirable. Unfortunately, joining misfits coming from different methods has significant drawbacks. Firstly, in regular optimization operations only one target function is required, which implies that misfits from particular methods have to be scaled. The second problem concerns the correct analysis of uncertainty, something which can be challenging for joint objectives with assigned weights. A potential strategy for solving such issues is the Pareto approach (Kozlovskaya et al. 2007), which allows the synchronous optimization of many miscellaneous functions within one domain omitting a priori assumptions, e.g. weighting (Kung et al. 1975). Any new solution can only be approved if it is not worse than the previous one. The final result of one inversion process is one point represented in the 2D coordinate system, where each axis represents the value of the misfit of one target function. The process of inversion conducted in this manner is repeatedly executed and the results are collected into a set in which the best non-dominated solutions are composed into the Pareto front. The definition of dominance in an N-dimensional solution space (here in the context of minimization) states that solution A(a 1 , a 2 , …, a n ) dominates solution B(b 1 , b 2 , ..., b n ), which is expressed as: A  B, when: ∀i : a i ≤ b i , i ∈ 1, …, N and: ∃j: a j < b j , j ∈ 1, …, N.
Any solution that is accepted as better is better in at least one dimension and not worse in all others (Zaefferer et al. 2013).
The Pareto approach was successfully applied in solving multi-optimization problems (MOP) using both local (Jaszkiewicz 2018) and global optimization algorithms (Bai et al. 2019, Mustaffa et al. 2019 in recent research.

Gravimetry and magnetometry forward solvers
Gravimetry and magnetometric methods are commonly used for near-surface investigations. Both methods suffer from the equivalence problem and are characterized by low resolution. The solution of the inversion problem is generally not unequivocal as many different models can correspond to the same observations, which is known as the equivalence or ambiguity problem (Vozoff & Jupp 1975). The model obtained in the inversion process is not necessarily the same as the real model which is searched for. Many different variants of the model geometry can fit the same set of recovered parameters, which unfortunately reduces the credibility and uniqueness of the solutions. It is necessary to provide a priori information, e.g. on the geological structure of the studied formation or borehole data, supporting the decision-making process of selecting the final solution (Cervelli et al. 2001, Socco 2008. Gravimetry and magnetometric methods are applicable if the physical properties (density and magnetic susceptibility) of the anomalous body and the background differ enough (Hinze et al. 2013). The forward solvers for gravity (Miernik et al. 2016) and magnetic methods  are based on the superposition rule and calculate the sum of gravity and magnetic effects in every node in a computation grid.
Each cell is considered a rectangle, to which some parameters of magnetic susceptibility and density are assigned (Danek et al. 2019).

Non-dominated sorting genetic algorithm II (NSGA-II)
The NSGA-II algorithm, introduced in 2002 ( Deb et al. 2002) as a multicriteria genetic algorithm, was an improvement on NSGA, the original version that was proposed in 1994 (Deb & Snirvas 1994). NSGA was criticized for its computational complexity, which was due to iterative sorting, elitism absence and also the necessity of determining a sharing parameter (Deb et al. 2002). Elitism in this context is understood as the distinction of individuals of population in terms of quality. Individuals which are found as the best in the process of sorting are marked and remembered, creating the so-called elite. NSGA-II was supposed to eliminate the problems of NSGA by the introduction of quick sorting of non-dominated results, estimation of density, and a mechanism comparing crowding distance.
The scheme of the NSGA-II is presented in Figure 1 and is as follows (Deb et al. 2002): -Step1 Random initialization of population P t in solution space. - Step 2 Creation of offspring Q t of a number the same as the number of population P t in the crossing and mutation process. -Step 3 Sorting and ranking the members of population R t = P t ∪ Q t creates the Pareto sets F 1 to F n . Set F 1 contains the best population's members, F 2 to F n (from best to worst). In the scope of one set, all solutions are of the same quality. -Step 4 The creation of a new population P t+1 of a number equal to P t . To the population P t+1 , solutions from set F 1 first are added, next from F 2 , and so on. If any whole set that is added to the population makes the size of population P t+1 > P t , like F 3 in Figure 1, then some of the solutions from this set are rejected after the application of crowding distance sorting, in which solutions that are far from the others are preferred in the process of selection in order to create the differential solution set and avoid cumulation.
Return to step 2 until the maximum number of generations is reached.
The final solution set is the highest-ranked Pareto set F i from the last population.  (Deb et al. 2002), where: P t -initial population; Q t -the offspring made in the crossing and mutation process done on P t ; R t -a population consists of P t and Q t ; F 1 -F n -Pareto sets created by sorting and ranking the members of population R 1 , where F 1 contains the best population's members, F 2 to F n (from best to worst). In the scope of one set, all solutions are of the same quality The example of genetic algorithm operationrandom initialization, crossing over and mutation is shown in Figure 2.
After the initialization of population P t and the creation of its offspring Q t , NSGA-II orders the resultant population R t into non-dominated Pareto fronts F 1 to F n using the process of non-dominated sorting, which covers the principle of elitism; this assumes that when an individual finds the best solution, the algorithm marks and remembers it (the elite). Unlike NSGA, NSGA-II includes the distinction of individuals, assigning them to fronts F 1 to F n according to their quality. Subsequently, the crowding distance sorting is applied in order to maintain diversity and avoid cumulations of final solutions. The main reason for choosing NSGA-II was the fact that it is more efficient at producing Pareto fronts. NSGA-II found its application both in geophysical investigations (Schwarzbach et al. 2005, Currenti et al. 2014, Ayani et al. 2020) and in other fields (Jeyadevi et al. 2011, Wang et al. 2017, Martínez et al. 2020. It is still developing (Fang et al. 2018) and has even been used as the basis to create some new algorithms (Liu et al. 2020).

IMPLEMENTATION
MARIA was originally written for inversion in gravimetry (GV) and magnetotellurics (MT) in C language (Miernik et al. 2016). For parametrization, the Sharp Boundary Interfaces approach was used (Smith et al. 1999). Later, a magnetometric (MG) forward solver was implemented that replaced the MT module . A case study on synthetic data for MARIA 2.0 was performed (Danek et al. 2019). The next step was the integration of the GV and MG forward solvers with the R environment (R is a free software environment as well as a programming language which can be used for statistical computing and graphics). The functions that solve forward problems were separated from MARIA and transformed into stand-alone programs that could read data from a file and have a vector of 16 elements and misfits as their parameters as a starting model.
Subsequently, the main() functions (normally main() is the function which is called at the start of the program, being its entry point) of these programs were renamed and the thus prepared functions were compiled into dynamic libraries.
Next, a wrappers.R file was created. This is a file with code written in R and its role was to load the dynamic libraries and to create R functions based on functions written in C code. It allowed these functions to be loaded into the R environment and the GV and MG forward solvers to be used in combination with packages that implement optimization algorithms as well as many other tools.
The next step was to use the nsga2 function from the mco package for multicriteria optimization. nsga2 is a function built into the R environment, which solves the genetic algorithm NSGA-II. Mco is the package which is built in the R environment. It contains a set of functions which can be used for multicriteria optimization. An input model with geometry, density and susceptibility parameters and vectors with upper and lower boundaries of variables' variability during the inversion process were introduced into the nsga2 function.
Computations were performed for 100 generations and parallelized with the parSapply function from the parallel package for 63 cores, launching 16 populations for each core, summing up the results into one solution set with 1008 points.

EXPERIMENTS
The application of NSGA-II was tested on a synthetic magmatic intrusion model. This model illustrates a mini laccolite which is built of igneous rocks and background as sedimentary rocks. In Figure 3 the model of the disturbing body from which synthetic data were generated is shown. The blue area shows the disturbing body composed of igneous rocks, while the red area shows the background composed by sedimentary rocks. The geophysical parameters for the synthetic model are presented in Table 1.
The properties of the magnetic field assumed in examined model are shown in Table 2. These values are typical for northern Europe. In order to calculate synthetic gravimetric and magnetic data, 200 evenly distributed points were placed along the profile. To simulate real experiment data, white Gaussian noise was added with a different signal-to-noise ratio for each method (10 dB for magnetometry and 50 dB for gravimetry). The data contaminated by the noise is shown in Figures 4 and 5 as a blue dashed line. The solid red line shows the magnetic and gravity effect of the initial model. In the input model for joint inversion (see Fig. 6), geometry and geophysical parameters were perturbed. This simple initial model was used to check if the geometry of the "real" model can be reconstructed well from a geometry which is very different from the real one.
During inversion, all vertices of the model could change their coordinates, but the number of vertices were unchanged. The parameters sought, i.e. density and magnetic susceptibility for disturbing body and surrounding rocks could change their value in the range typical for the occurrence of rocks in nature (Hinze et al. 2013).  Table 2 Properties of magnetic field

RESULTS AND DISCUSSION
In this example, the NSGA-II engine was applied in parallel using 63 populations (one per core) of 16 individuals per population, with a limit of 1,000 generations. Figure 7 shows the magnified set of all accepted solutions obtained in the inversion process (63 populations × 16 individuals = 1,008 points). All solutions belonging to the Pareto front (F 1 ) are marked with red dots; the extended front is marked with green dots, and the other ones are marked with blue dots.
To obtain a more trustworthy comparison between the best solutions and the whole set of points, an extended front was taken into consideration (not only the Pareto front). To obtain the thickened front, the "onion-peeling" method was used: first, a Pareto front was defined from the whole set of solutions; next, all points belonging to the Pareto front that was found were eliminated. The procedure was repeated until the desired number of points on the extended front was reached.
In Figure 7 x-axis represents the fitting error for magnetic method and y-axis for gravity. These blue dot solutions represent Pareto sets F 2 , ..., F k , where k is the highest index of the accepted Pareto set from the final population obtained by NSGA-II. In order to extend the solution space dimensionality and enlarge its interpretational potential, we decided to take into account all of the obtained solutions.

Fig. 7. A set of solutions found under magnification with Pareto front (red) and thickened front obtained with an "onion-peeling" method (red and green)
As expected, the best solution from the Pareto front suffers from problems typical of the GV and MG methods, such as equivalence and resolution decrease with depth. Even the best of the obtained Fig. 6. Geometry of the starting model models of the body's geometry (see Fig. 8) represented by the optimal solution from the Pareto front (a dot that is closest to the origin of coordinates) with an almost perfect curve fit (see Figs. 9 and 10) is significantly different from the real one (marked with black contour). The geophysical properties for this model are presented in Table 3. All solution models from the Pareto front are presented in Figure 11 in black. The model on the left shown in red represents the real model.
The presented problems are why we decided to use a single solution map that combines all models obtained during inversion, instead of choosing single models from the set of obtained solutions. This map is a 2D histogram (presented in the form of a heat map) in which every bin represents how many times certain grid points in the raster belong to the disturbing body for all accepted models. The color variation is caused by the different intensity of a given cell's membership in each of the models.
In this approach, we used all of the obtained Pareto sets F 1 , ..., F k (see Fig. 12). The occurrence of points in the raster was calculated for 1,008 models in total.
As expected, in the case of potential fields, single solutions with an almost perfect curve fitness do not fully depict the geometry as many so-called equivalent models produce the same response. The proposed method, which is based on the statistical averaging of all models, makes it possible to highlight the dominating pattern of the obtained solutions. Obviously, the upper edge of the disturbing body is recovered with better precision than the bottom edge (see Fig. 13).
To determine whether limiting the amount of data taken into consideration can sharpen the obtained result, only the points close to the original Pareto front were used and aggregated into the heatmap (see Fig. 14). It is clearly visible that these results are quite similar to the heatmap for the whole set of solutions, and yet they do not depict the geometry of the starting model any better.

CONCLUSIONS
The role of the potential method inversion process is not to provide the exact geometry of the expected model, but rather information about its parameters and general shape. However, its geometry can still play a role in assessing the correctness of the retrieving model and in choosing between several feasible models (Menke 1989). Generally, defining the final model is not a trivial issue as many different models can fit the same set of data, and this is the problem of equivalency.
In this contribution, a simple synthetic model was used as a proof of concept. The geometry of the models of the Pareto front was verified with the synthetic model, thus proving that models represented by points on the Pareto front do not depict the expected geometry. Therefore, a new collective approach that takes into account all solutions from the solution set was proposed. A 2D histogram representing the number of the occurrences of the model in the raster for all obtained models was introduced. The experiment showed that this method could be a promising way to portray the models' geometry, which is robust and independent of additional information about the expected model. It was shown that even the best single model obtained with use of the presented algorithm does not provide sufficient information on the model geometry in question. Through the aggregation of the information held by the set of solutions, the proposed collective approach allows the shape of the body to be reconstructed with a better resolution.