# «OPTIMIZATION AND MONITORING OF GEOLOGICAL CARBON STORAGE OPERATIONS A DISSERTATION SUBMITTED TO THE PROGRAM IN EARTH, ENERGY AND ENVIRONMENTAL ...»

5.1. LEAKY AQUIFER MODEL sites in multi-leak cases have a 25% chance of being a leak and a 75% chance of being a non-leak. An example realization of a cap rock with multiple leaks generated as described here is shown in Figure 5.3. The example contains three leaks (shown as ﬁlled blocks) and ﬁve non-leaks (shown as empty dashed blocks).

** Figure 5.3: Eight randomly selected leakage sites.**

Filled blocks indicate leaks, while outlined blocks are non-leaks As in previous examples, we simulate 30 years of CO2 injection into the storage formation, followed by 470 years of equilibration. For history matching purposes, we need only simulate the ﬁrst 10 years of injection, because our goal is to provide early detection of leaks from pressure data. A typical run time for these simulations is one minute. When evaluating the eﬃcacy of our history matching procedure, we simulate over 500 years in order to compare the location and saturation of leaked CO2 between the true and predicted models. The run time for these simulations is about ﬁve minutes. The initial pressure at the top of the storage formation is the same as noted in Chapter 2 (17 MPa), and the hydrostatic gradient continues into

## 98 CHAPTER 5. LEAKAGE DETECTION FROM HISTORY MATCHING

5.2 Leakage analysis In this section we will analyze the eﬀect of varying the leakage location, leakage permeability, total number of leaks, and aquifer geology, on the proportion of injected CO2 that escapes the storage formation. To accomplish this, we generate large numbers of leakage scenarios for the three-region model speciﬁed in Section 5.1. We ﬁrst consider cases with a single leak, and then expand this to more general cases with up to eight leaks. No history matching is performed at this stage.

5.2.1 Single-leak cases We generate 2000 realizations of single-leak cases with diﬀerent aquifer geologies, leakage locations (il, jl ) and leakage permeabilities kz,l (we reiterate that kz,l refers to the grid block permeability). The geological realizations for the storage formation and overlying aquifer are generated using SGeMS (Remy et al., 2009). The position and permeability of each leak is determined randomly, by sampling the leakage variables η from a uniform distribution, with upper and lower bounds for kz,l as deﬁned in min Equation 5.1 with η3 = ln(0.005). A ﬂow simulation is performed for each realization and we record the fraction of injected CO2 that escapes the storage formation as a function of time. Leaked CO2 can be in any form (e.g., free-gas phase or dissolved in brine).

** Figure 5.4 shows the fraction of leaked CO2 after 30 years and 500 years for each realization as a function of the permeability of the leak kz,l.**

The lines indicate median, P10 and P90 results. In both plots an increase in the leakage fraction is observed in median and P90 results as kz,l is increased to around 2 md. After this point, the median and P90 curves plateau at about 1% and 8% leakage after 30 years, and 2.5% and 15% leakage after 500 years (the slight decrease in the median after kz,l = 2 md is presumably a statistical eﬀect). This indicates that for leakage block permeabilities above 2 md, the leakage volume will depend mostly on other factors, such as the speciﬁc geological heterogeneity or the proximity of the leak to accumulated CO2,

5.2. LEAKAGE ANALYSIS rather than on the actual permeability of the grid block representing the leak. Of particular interest in both plots is the signiﬁcant increase in leaked CO2 fraction from the median to the P90 curves. For example, the median leaked CO2 fraction after 500 years plateaus at around 2.5%, whereas the P90 curve plateaus at around 15% leakage after 500 years. The apparent potential for substantial leakage motivates the need to locate and characterize large leaks that may require remediation eﬀorts.

** Figure 5.5 shows the fraction of leaked CO2 after 30 years and 500 years for each realization.**

The location of the leak corresponds with the location of the point. As expected, the most signiﬁcant leaks are in the central region, where CO2 is likely to accumulate (above the wells in our model). The diﬀerences in leakage fraction for leaks located in the same area are due to diﬀerences in aquifer geology and kz,l for the leak.

5.2.2 Multi-leak cases In our analysis of multi-leak cases, we again generate 2000 realizations with geologies in the storage formation and overlying aquifer constructed using SGeMS. We now generate up to eight leaks, as described in Section 5.1, with η sampled from a uniform distribution. An example of one such realization is shown in Figure 5.3. Figure 5.6 shows the leakage fraction of CO2 after 30 years and 500 years, versus the number of leaks, for the 2000 realizations. The key statistics of P90, median and P10 are also plotted (when meaningful) to show how the results change with the number of leaks.

In Figure 5.6, the results for cases with a single leak correspond to the results shown in Figure 5.4. When additional leaks are added, it can be seen that all three statistics (median, P10 and P90) generally increase, as expected (the slight decrease in median leakage for the ﬁve-leak case is presumably a statistical eﬀect). It is interesting, however that the P90 leakage after 500 years for the ﬁve-leak case is only 55% higher than that for the single-leak case (0.17 compared with 0.11). These leakage fractions are nonetheless quite high given that the average mobile CO2 fraction after 500 years is around 0.3, for cases with no leakage.

## 100 CHAPTER 5. LEAKAGE DETECTION FROM HISTORY MATCHING

0.1 0.05 0.1 0.05 Figure 5.5: Location of leaks with color indicating the fraction of leaked CO2 (singleleak cases)

## 102 CHAPTER 5. LEAKAGE DETECTION FROM HISTORY MATCHING

Figure 5.6: Fraction of leaked CO2 for diﬀerent numbers of leaks

5.3. HISTORY MATCHING LEAKAGE

5.3 History matching leakage 5.3.1 Problem setup The history matching problem to be solved for leakage detection can be expressed in the form

where ξ 1 ∈ Rℓ1 and ξ 2 ∈ Rℓ2 are the K-L parameters characterizing the porosity of the storage formation and overlying aquifer respectively, η, deﬁned earlier, describes the location and grid block permeability of L leaks in the cap rock, m denotes the porosity and permeability ﬁeld for the entire model, f and dobs are the modeled and observed pressure data at monitoring and CO2 injection wells, and W denotes the weighting matrix. We will determine optimal data weight parameters in W, with the goal of improving the eﬃcacy of the history matching algorithm for leakage detection, using a method similar to that described in Chapter 4.

Separate K-L representations are used for the storage formation and overlying ¯ ¯ aquifer. Speciﬁcally, we write m1 = Φ1 ξ 1 + m1 and m2 = Φ2 ξ 2 + m2 for the two regions (we will continue to use m to denote the geology of the entire three-region model). The matrices Φ1 and Φ2 are constructed by performing SVDs of the data matrices (sets of realizations) for the two regions, as described in Section 4.3.2. We use the Ns = 2000 realizations that were generated in the previous section to construct the data matrices. Because these realizations honor hard data in grid blocks containing sensor wells or CO2 injection wells, the K-L models also honor these hard data.

The model permeability in the storage formation and overlying aquifer is calculated from porosity (which is determined from the K-L parameters) using Equation 2.12 with a = 4.5 and b = 2. The vertical permeability is set to kz = 0.1kx, and we take ky = kx. For the history matching examples considered in this chapter, we will use ℓ1 = 50 to characterize the porosities of the 5000 blocks in the storage formation and ℓ2 = 10 to characterize the porosities of the 2500 blocks in the overlying aquifer (recall that ℓ designates the number of K-L variables). This number of variables was found to be eﬀective for our history matching applications. We assign

## 104 CHAPTER 5. LEAKAGE DETECTION FROM HISTORY MATCHING

fewer variables to characterize the overlying aquifer because it contains fewer grid blocks and because we are more interested in ﬁnding accurate leak locations than in determining the precise ﬂow path of leaked CO2. Extensive testing indicated that the misﬁt function in Equation 5.2 is most sensitive to the position of the leak rather than the individual K-L variables describing the geology. In some cases, reasonable history matches were obtained using no K-L parameters (i.e., by ﬁxing the porosity ﬁeld to the prior mean and varying only the position and permeability of the leakage blocks), though we found that including the K-L representation generally improved history matching performance.The observed and modeled data dobs and f (m(ξ 1, ξ 2, η)) consist of sensor pressure and BHP data sampled every 50 days for the ﬁrst 10 years of CO2 injection. The observed data in this work are determined by running a ﬂow simulation for a particular synthetic ‘true’ model. Various ‘true’ models with diﬀerent levels of leakage will be considered. These true models were selected from the realizations simulated in the previous section. We will assess the performance of history matching for these true models for two data collection scenarios: a ‘data-rich’ scenario (nine monitoring wells completed in every layer of both the storage formation and overlying aquifer) and a ‘data-scarce’ scenario (four monitoring wells, only completed in the overlying aquifer).

The four monitoring wells used in the data-scarce scenario are the corner wells shown in red in Figure 5.1. We will analyze both scenarios, since in practice it may not be feasible to drill nine monitoring wells into the storage formation. In an example presented by Sun & Nicot (2012), they were able to determine the position of a leaky wellbore, based on pressure data at only two observation wells. Thus it seems reasonable to expect that we will be able to locate leaks in single-leak cases even for the data-scarce scenario. The situation is less clear for multi-leak examples, however.

In Chapter 4, the misﬁt function depended only on the continuous K-L variables ξ. The misﬁt function used in this chapter, however, depends on the leakage characterization variables η, as well as on two sets of K-L variables (ξ 1 and ξ 2 ). This minimization problem is therefore more complicated, and presumably less smooth than that in Chapter 4, since η contains grid-block location variables. We found the PSO algorithm to outperform GPS for this case, so PSO will be used to minimize the ˆ misﬁt S in the history matching examples in this chapter.

5.3. HISTORY MATCHING LEAKAGE

**5.3.2 Measures of history matching eﬃcacy**

The primary goal of history matching, for the examples in this chapter, is to accurately determine the location and severity (i.e., leakage fraction) of leaks in the cap rock.

Since we are using synthetic ‘true’ models, we can compare our history matched results to results from the true model, which enables us to assess the eﬃcacy of the method. In order to accomplish this, we must introduce a quantitative measure of the ‘closeness’ of one geologic model mα to another mβ, in terms of the location of a leak and how much CO2 ﬂows through it over time.

where Λ is the closeness measure, γ1 and γ2 are weighting factors, (il, jl ) refers to the grid block indices of the leakage site, FCO2 (m, t) is the fraction of injected CO2 that has escaped through the leak in model m at time t, and T is the simulation time frame. In order to assess the history matching procedure, T must be much larger than the 10-year data collection time frame (e.g., we use T = 500 years). This means that in order to compare the closeness of two models, our simulations must run for a much longer period of time than the simulations used in the history matching process.

We reiterate that the run time for the 10-year models is typically one minute, while 500-year simulations run in around ﬁve minutes.

The closeness measure Λ, deﬁned in Equation 5.3, is useful for comparing cases with single leaks, though it is diﬃcult to generalize this measure to compare cases with multiple leaks. For this reason, we propose a diﬀerent measure of model closeness, which is based on the saturation distribution of the CO2 that has leaked into the overlying aquifer. This quantity can be viewed as a proxy for a measure of the similarity in the location and severity of multiple leaks, because the amount and distribution of CO2 in the overlying aquifer clearly depends on these parameters. We

## 106 CHAPTER 5. LEAKAGE DETECTION FROM HISTORY MATCHING

where wSF, wOA and wBHP are tunable data weight parameters for sensor pressures in the storage formation, sensor pressures in the overlying aquifer, and bottomhole pressures at the injection wells. Here σSF, σOA and σBHP are anticipated posterior standard deviations for the three data types, NSF, NOA and NBHP are the number of observations for each of the three data types (we include these variables so the misﬁt

5.3. HISTORY MATCHING LEAKAGE function is not biased by any particular data type – in Chapter 4 these were absorbed into the θ estimates), and INSF, INOA and INBHP are identity matrices of appropriate dimension. In the data-rich scenario, we use sensor data from nine monitoring wells that contain sensors in the eight vertical grid blocks in the storage formation and four vertical grid blocks in the overlying aquifer, along with BHP data from the four CO2 injection wells. Consequently, we have NSF /NBHP = 18 and NOA /NBHP = 9. In the data-scarce scenario, we use sensor data from four monitoring wells with sensors in the overlying aquifer only, so NOA /NBHP = 4 and NSF is not relevant (i.e., we set wSF = 0 in Equation 5.5). Note that we can multiply Equation 5.2 by NBHP without aﬀecting the optimal solution, so it is these ratios that are important.