Deep learning for predicting porosity in ultra-deep fractured vuggy reservoirs from the Shunbei oilfield in Tarim Basin, China

Deng, Ziyan; Zhou, Dongsheng; Dong, Hezheng; Huang, Xiaowei; Wei, Shiping; Kang, Zhijiang

doi:10.1038/s41598-024-81051-4

Download PDF

Article
Open access
Published: 28 November 2024

Deep learning for predicting porosity in ultra-deep fractured vuggy reservoirs from the Shunbei oilfield in Tarim Basin, China

Ziyan Deng^1,2,3,
Dongsheng Zhou^1,2,
Hezheng Dong^1,2,
Xiaowei Huang^1,2,
Shiping Wei^1,2 &
…
Zhijiang Kang³

Scientific Reports volume 14, Article number: 29605 (2024) Cite this article

1626 Accesses
Metrics details

Subjects

Abstract

Deep and ultra-deep carbonate reservoirs in China, which account for 34% of the country’s oil and gas reserves, pose significant challenges for porosity prediction due to their complex geological features, including extensive burial depth, weak seismic signals, and high heterogeneity. To address these challenges, this study develops an advanced deep learning approach specifically designed for ultra-deep, fault-controlled, fractured-vuggy reservoirs in the Tarim Basin. The study utilizes a three-dimensional seismic dataset and applies Principal Component Analysis (PCA) to select five key features from eight seismic attributes. Additionally, seismic phase-controlled constraints are incorporated into the model. Using deep learning technology, a porosity prediction model for ultra-deep carbonate reservoirs has been constructed. Validation using blind wells from the Shunbei oilfield shows that this approach achieves a 76% reduction in Mean Square Error (MSE) compared to traditional impedance inversion techniques, highlighting its high predictive accuracy. Through SHapley Additive exPlanations (SHAP) analysis, the attributes LAMBDA_AAGFIL and PHASE_ANT are identified as the most influential, highlighting their importance in representing karst cave and fracture structures within the reservoir. These findings underscore the innovation and substantial improvement of the proposed method over conventional techniques, offering a robust and high-precision approach for porosity prediction in ultra-deep carbonate reservoirs.

Permeability modelling in a highly heterogeneous tight carbonate reservoir using comparative evaluating learning-based and fitting-based approaches

Article Open access 03 May 2024

An improved method for predicting the permeability of Chang 7 shale oil reservoirs based on lithology and hydraulic flow units

Article Open access 17 March 2025

Seismic characterization of carbonate stringers using machine learning techniques: an example from the western flank of South Oman Salt Basin

Article Open access 28 December 2024

Introduction

China holds abundant supplies of deep and ultra-deep carbonate oil and gas resources, with total reserves amounting to 67.1 billion tons of oil equivalent, representing 34% of the nation’s total oil and gas resources. These resources have become a key focus for expanding reserves and boosting production, marking a critical growth area for the country’s oil and gas industry. Ultra-deep fault-controlled fractured-vuggy reservoirs are a newly discovered type of ultra-deep reservoir, buried at depths exceeding 7000 m^1,2,3. Porosity is a critical parameter for evaluating reservoir reserves and optimizing development strategies. Predicting porosity in complex reservoirs remains challenging due to factors such as deep burial, weak seismic signals, and significant heterogeneity. Fractured-vuggy reservoirs, such as those in the Shunbei oilfield, exhibit complexities similar to those observed in the Asmari reservoir, where fracture density plays a crucial role in oil production and reservoir management⁴. Current prediction methods often fall short of capturing the intricate characteristics of these reservoirs.

Porosity prediction methods can be classified into three main types based on the technical approach: physical model-based methods, seismic inversion, and data-driven approaches⁵. Physical model-based methods are the most traditional, grounded in established physical principles. These methods rely on well log data and rock physics models to estimate porosity based on rock and fluid properties. Early methods, such as the Archie formula, used well log parameters like resistivity and acoustic velocity to calculate porosity⁶, while the Kozeny-Carman formula linked permeability to porosity, and the Wyllie-Rose formula determined porosity using acoustic velocity⁷. With advances in experimental techniques, laboratory-based rock physics analysis has become an essential tool, allowing more accurate characterization of porosity in complex reservoirs by measuring core samples and applying rock physics theories like the Gassmann equation. To address geological complexity, statistical rock physics analysis has been introduced, analyzing statistical relationships in well log and core data to build porosity models, making it particularly suitable for heterogeneous reservoirs. While physical model-based methods are highly reliable, their ability to characterize reservoir complexity is somewhat limited, as they heavily depend on laboratory data and prior knowledge^8,9,10.

Seismic inversion is a core technology in porosity prediction, allowing for the estimation of reservoir properties by converting seismic-derived elastic attributes (such as impedance) into porosity using rock physics models or statistical relationships derived from laboratory or well log data^11,12,13. In a study of the Asmari formation in Iran’s Hendijan oil field, Abdolahi et al. demonstrated the effectiveness of combining seismic inversion with well log data to improve porosity prediction accuracy and reservoir quality delineation¹⁴. However, in complex reservoirs, especially deep carbonate reservoirs, it remains challenging to establish a reliable relationship between inverted seismic attributes and porosity, as the geological complexity is often too great for traditional methods to handle effectively.

Compared to physical model-based and seismic inversion methods, data-driven approaches do not rely on explicit physical laws or geological assumptions. These methods include traditional geostatistics and modern artificial intelligence (AI) techniques. Geostatistical methods rely on statistical properties and spatial correlations in data to build models, often integrating both hard and soft data, such as seismic and well log data. Kriging, an early interpolation technique, uses well log data to perform spatial prediction between wells by establishing a variogram for precise interpolation¹⁵. Cokriging goes a step further by jointly interpolating seismic and well log data, enhancing prediction accuracy¹⁶. These methods are largely deterministic, while stochastic inversion methods, like Sequential Gaussian Simulation (SGS) and Multiple Gaussian Simulation (PGS)^17,18,19,provide multiple realizations, making them particularly adaptable for modeling heterogeneous reservoirs. In the case of the Savak formation, SGS was used to generate various possible reservoir attribute distributions, assisting in predicting hydrocarbon distribution and quality¹⁸.

In addition to geostatistical methods, AI techniques have seen extensive application in reservoir characterization, particularly in integrating seismic data (pre-stack, post-stack, and inverted attributes) with well log data, significantly enhancing reservoir property description^20,21,22. For instance, Support Vector Regression (SVR) combined with seismic attributes can effectively estimate porosity and water saturation²³, while Artificial Neural Networks (ANN) capture complex, nonlinear relationships between seismic attributes and porosity or permeability. Studies indicate that combining ANN with the Gamma test to select optimal seismic and well log attributes significantly enhances the prediction accuracy of permeability, porosity, and intrinsic attenuation in carbonate aquifers and sandstone-shale reservoirs^24,25. A study in the Upper Assam Basin of India exemplified this approach, successfully applying absolute and relative acoustic impedance (AAI and RAI) to stratigraphically delineate reservoirs and estimate porosity, shale volume, and water saturation, improving overall reservoir property estimation²⁶. Likewise, XGBoost, integrated with PySpark, has achieved favorable results in porosity prediction for complex carbonate reservoirs, leveraging large datasets to capture nonlinear relationships and offering new solutions for complex reservoir characterization and porosity prediction²⁷.

However, existing deep learning methods face significant challenges when applied to ultra-deep carbonate reservoirs. Due to the complex pore-fracture structure and high heterogeneity of these reservoirs, seismic signals are often weak, with high noise levels, making it difficult for traditional methods to extract clear reservoir property features. This study addresses these challenges by proposing a porosity prediction model based on seismic inversion and deep learning. By combining high-resolution well log data with 3D seismic data, we selected critical seismic attributes (such as wave impedance, coherence, and phase tracking) and constructed a training dataset. The model leverages deep learning to extract multi-scale subsurface information and incorporates phase-controlled constraints to capture the nonlinear features of porosity. Bayesian optimization further enhances the model’s performance, achieving high-precision porosity predictions in complex reservoirs.

This paper is structured as follows: “Geological characteristics of ultra-deep fault-controlled fractured-vuggy reservoirs” details the geological characteristics of the study area, “Methods and data set” describes the methods and dataset, “Results and discussions” presents the results and discusses prediction accuracy, and “Conclusions” offers the conclusions and implications for future research.

Geological characteristics of ultra-deep fault-controlled fractured-vuggy reservoirs

The Shunbei No.4 strike-slip fault zone is a typical ultra-deep, fault-controlled, fracture- vuggy oil and gas reservoir located in the Shunbei oil field of the Tarim Basin, China. Structurally, it lies in the Shuntuoguole low uplift, a saddle-shaped belt between the Tabei and Tazhong uplifts and the Awati and Manjiaer depressions (Fig. 1). The reservoir is controlled by large strike-slip faults and exhibits a tabular structure with a thickness reaching up to 1041 m. It is primarily developed in the middle and lower Ordovician Yijianfang and Yingshan Formations, with burial depths exceeding 7200 m. The main rock types are sparry and micrite limestone, with an average porosity of approximately 3% and permeability ranging from 0.706 to 4760 × 10^− 3 millidarcies²⁸. The matrix generally has negligible storage and permeability.

The internal geological structure of the fault zone, as shown in Fig. 1, is complex and influenced by differential activity within the large strike-slip fault zone. It features segmented reservoir formation, vertical penetration, and intermittent distributions of tensional-shear and compressional-shear zones. The fractured-vuggy reservoir mainly resulted from multiple episodes of strike-slip fault activity, with minor fluid reworking in later stages. The upper part of the reservoir is sealed by dense carbonate rocks on the sides and a thick mudstone caprock on top, forming a fault-controlled fracture-cave trap. The reservoir space mainly consists of cavities formed by fractures, pores between breccias, and structural fractures. The irregular distribution of these cavities, holes, and fractures leads to significant local variations in seismic wave impedance, leading to scattering and interference of seismic waves during propagation. This results in complex and ambiguous signals in the seismic data, a low signal-to-noise ratio, and significant challenges in porosity prediction.

Methods and data set

Data collection and preparation

In this study, we collected logging porosity data from six wells in the northern part of the study area (see Fig. 1), and combined these with 3D seismic data from geostatistical inversion to create a comprehensive dataset.

Logging data and seismic attribute selection

Logging data: Logging porosity data were collected from six wells with a measurement accuracy of 0.125 m. This high accuracy was crucial given the relatively lower accuracy (25 m) of seismic sampling in the ultra-deep (> 7000 m) target area. Logging interpretation porosity was used as the target value for seismic porosity prediction due to its superior precision.

Seismic attributes: we selected seismic attributes based on the specific geological characteristics of the reservoir, prioritizing those that comprehensively capture its features. The attribute selection process begins with geological feature analysis, identifying key structures like faults, cracks, and cavities. Initial filtering of seismic attributes was informed by previous studies and seismic interpretation experience, resulting in the selection of four categories of attributes that effectively represent these geological features (Fig. 2). Subsequently, PCA and correlation assessments were conducted for further refinement, with the overall process illustrated in Fig. 3.

Impedance: Reflects changes in porosity within the formation as the product of medium density and wave velocity. It exhibits a clear mirror-symmetry relationship with the porosity curve, due to variations primarily caused by pores or fractures.

Ant tracking: Provides geological information about small and medium-sized fractures, indicating cracks or continuous holes in matrix rocks. It captures minor changes in subsurface rock structures and helps identify fractures of different scales that are related to porosity.

Coherence: Highlights geological features such as faults and fault-controlled fractured-vuggy bodies by analyzing the similarity of adjacent seismic trace signals. This attribute helps identify lateral heterogeneity within strata.

Tensor: Reflect chaotic reflection structures in seismic data, converting them into two- or three-dimensional image attributes. These attributes describe the discontinuous reflection characteristics caused by faults and small fractured-vuggy bodies.

Data processing and integration

To align seismic attribute data with logging porosity data:

Time-depth conversion: Seismic attribute volumes were converted from time to depth.

Depth matching and correction: Depth matching and correction of logging porosity data were performed, using logging interpretation porosity as the target value for seismic predictions. Seismic attribute data were densified and resampled to ensure alignment with logging data at the same depths, generating a training dataset that included both seismic attributes and porosity values.

Data cleaning

We processed the data for missing values and outliers. If few values were missing or had little impact on the analysis, rows or columns containing them were deleted directly; if many values were missing, interpolation methods (such as the mean, median, or mode) were used to fill them in. The outliers were either deleted or replaced as appropriate.

Standardization

The dataset was standardized to ensure consistent statistical properties and eliminate dimensional differences between features, enhancing the model’s convergence rate.

Model construction and application

A phase-constrained deep learning model was developed. Parameters such as the number of neurons and network layers were optimized using a Bayesian optimization algorithm. The optimal model was then applied to the seismic volume of the entire region to calculate the porosity distribution.

Deep learning model establishment and optimization

Model structure

A deep learning model is built with two components: feature extraction and regression. The feature extraction component identifies characteristics related to porosity from multi-dimensional seismic attributes, while the regression component maps these extracted features to porosity values. With multi-dimensional seismic data as input, the model processes the data through these two stages to predict the porosity, denoted as $\hat{y}$.

$$\hat y = {\text{fc}}2({\text{fc}}1({\text{flatten}}({\text{pool}}1({\text{relu1}}({\text{conv}}1({\text{seismic\_data}}))))))$$

(1)

The convolution layer, conv1, slides the convolution kernel over the input seismic data to extract local features.

$${\text{conv}}1[i] = {\text{ReLu}}\left( {\sum\limits_{k = 0}^{K - 1} {{\text{seismic\_data}}[i + k]} \cdot {\text{weight}}[k] + {\text{bias}}} \right)$$

(2)

$$\:ReLU\left(x\right)={\text{max}}(0,x)$$

(3)

where $\:seismic\_data\left[i\right]$ is the i-th element of the input data, k denotes the sliding position of the convolution kernel, $\:weight\left[k\right]$ denotes the weight of the convolution kernel, and $\:bias$ denotes a bias term.

Pooling operation, pool1, takes the maximum or average over each sliding window. The mathematical expression for the pooling operation is as follows:

$${\text{pool}}1[i] = \max ({\text{conv}}1[i],{\text{conv}}1[i + 1])$$

(4)

The fully connected stratigraphy is expressed as follows:

$${\text{fc}}1[i] = \sigma \left( {\sum\limits_{j = 0}^{M - 1} {{\text{flattened}}[j] \cdot {\text{weight}}[j,i] + {\text{bias}}[i]} } \right)$$

(5)

$${\text{fc}}2[i] = \sum\limits_{j = 0}^{N - 1} {{\text{fc}}1[j] \cdot {\text{weight}}[j,i] + {\text{bias}}[i]}$$

(6)

where fc1[i] represents the output of the ith neuron of the fully connected layer, $\:\sigma\:$ denotes the activation function, $\:flattened\left[j\right]$ denotes the output of the input layer or the upper layer of neurons, the value of the jth element after flattening processing, $\:weight[j,i]$ denotes the weight connecting the jth input to the ith neuron, and $\:bias\left[i\right]$ denotes the bias term representing the ith neuron.)

Model training

The deep learning model is trained by minimizing the loss function, which is set to $L_{\text{total}}$. The root mean square error $L_{\text{regression}}$ between the predicted porosity value $\hat{y}$and the actual porosity value y is calculated to determine the model’s prediction error. Additionally, a seismic facies loss term ${L_{{{nearest\_neighbor}}}}$ is introduced to constrain the predicted porosity values, ensuring they remain consistent with the seismic facies data. A parameter $\lambda$ is introduced to adjust the relative weight between the two loss terms. The model parameters are updated via backpropagation using the gradient descent algorithm to minimize the total loss value ${L_{{\text{total}}}}$. This approach improves the model’s performance, particularly in predicting high-porosity values, by maintaining geological consistency during the training process.

$${L_{{\text{total}}}} = {L_{{\text{regression}}}} + \lambda {L_{{{nearest\_neighbor}}}}$$

(7)

$$L_{{\text{regression}}} = \frac{1}{N}\sum\limits_{i = 1}^N {{{({{\hat y}_i} - {y_i})}^2}}$$

(8)

$${L_{nearest\_neighbor}} = \frac{1}{N}\sum\limits_{i = 1}^N {d({{\hat y}_i},{\text{center}}({\Theta _i}))}$$

(9)

where $L_{regression}$ denotes the loss function for the regression task, $L_{nearest neighbor}$denotes the seismic phase loss term, ${\hat{Y}}_{i}$ refers to the predicted value of porosity for the ith sample, ${\Theta _i}$ refers to the corresponding true seismic facies label, center (${\Theta _i}$) denotes the central value of the porosity corresponding to seismic phase ${\Theta _i}$, and $d({\hat Y_i},{\text{center}}({\Theta _i}))$denotes the distance measure between the predicted value ${\hat{Y}}_{i}$ and the porosity center value ${\Theta _i}$ of the true seismic facies.

Model optimization

To optimize the deep learning model, we use a Bayesian optimization algorithm to automatically search for the optimal hyperparameter combination. This process adjusts key elements such as the network structure (number of layers, neurons), training iterations, and loss function parameters. The objective is to minimize prediction error by dynamically selecting the best parameters based on the model’s performance in previous iterations.

Bayesian optimization explores the hyperparameter space and refines the model by updating the parameters after each iteration²⁹. The final set of parameters is the one that achieves the lowest prediction error, ensuring optimal model performance.

$${x^*} = \arg \min f(x),\,\,x \in X$$

(10)

where x represents a set of hyperparameter combinations, X represents the parameter combination space, and $x^{\ast}$ denotes a set of x in X, such that $f(x)$ achieves the optimal solution.

SHAP value analysis

The SHAP method, based on Shapley values from game theory, explains model predictions by quantifying the impact of each feature³⁰. SHAP analysis assesses how each feature influences the prediction while accounting for uncertainty in feature values. It calculates the SHAP value for each feature by iteratively adding and removing features, thereby providing a clear understanding of how each input affects the model’s output.

Validation method

The validation method in this study involved using several blind wells that were excluded from the model training process. The deep learning model, developed using seismic attributes and logging data from other wells, was applied to predict porosity in these blind wells. The predicted porosity values were then compared to the actual logging porosity. The prediction accuracy was assessed using MSE, Coefficient of Determination (R²), and Mean Absolute Error (MAE), calculated as follows:

$$MSE = \frac{1}{n}\sum\limits_{i = 1}^n {{{({{\hat y}_i} - {y_i})}^2}}$$

(11)

$${R^2} = 1 - \frac{{{{\sum\nolimits_{i = 1}^n {({y_i} - {{\hat y}_i})} }^2}}}{{{{\sum\nolimits_{i = 1}^n {({y_i} - \bar y)} }^2}}}$$

(12)

$$MAE = \frac{1}{n}\sum\limits_{i = 1}^n {\left| {{{\hat y}_i} - {y_i}} \right|}$$

(13)

where ${\hat{y}}_{i}$represents the predicted porosity, ${{y}}_{i}$represents the actual porosity, and n represents the number of samples. A lower MSE, MAE and a higher R² indicates higher prediction accuracy. This method ensures the model’s reliability in predicting porosity under different geological conditions, especially in ultra-deep fault-controlled fractured-vuggy reservoirs.

Results and discussions

Results of data analysis

Based on previous studies and seismic interpretation experience, we identified four categories of seismic attributes that effectively characterize the reservoir body of this oilfield. From these categories, we selected eight key seismic attributes: TENSOR_ANT, XGNLTD, PHASE_ANT, EIGENVALUE_2ND, EIGENVALUE_3RD, LAMBDA_ANT, LAMBDA_AAGFIL, and IMP. Specifically, IMP falls under the Impedance category; XGNLTD and LAMBDA_AAGFIL are categorized as Coherence attributes; TENSOR_ANT, PHASE_ANT, and LAMBDA_ANT are part of the Ant Tracking category; and EIGENVALUE_2ND and EIGENVALUE_3RD belong to the Tensor category. By collecting seismic and logging porosity data and performing well-to-seismic matching, we obtained a dataset comprising 10,261 data points. After preprocessing steps such as standardization and imputing missing values, the final training dataset was prepared. The statistical properties of the dataset is summarized in Table 1. Porosity values range from 0.003 to 30.594%, with an average of 0.854%, a median of 0.752%, and a standard deviation of 0.912%.

Table 1 Basic statistics used for model training.

Full size table

Certain porosity values are identified as outliers, which can be attributed to the geological features of the reservoir. Ultra-deep fractured-vuggy reservoirs often feature complex fracture networks and karst structures, including caves and vugs, leading to localized areas with very high porosity. In contrast, regions with denser rock or fewer fractures may show unusually low porosity. These natural variations in rock formations are typical of such reservoirs and account for the extreme porosity values observed.

To further explore the relationship between seismic attributes and logging porosity, we calculated the Pearson correlation coefficient, which measures the linear relationship between two variables. The Pearson correlation coefficient ranges from − 1 to 1, where 1 indicates a perfect positive linear relationship, − 1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship. A value below 0.2 suggests a weak correlation³¹. As illustrated in Fig. 4, the correlation between individual seismic attributes and porosity ranges from 0.0049 to 0.26, indicating weak linear relationships. These results underscore the need for applying deep learning techniques to model the nonlinear relationships between multiple seismic attributes and porosity.

Next, we utilized PCA to filter the seismic attributes. This method calculates the loadings for each feature, which indicate how much each feature contributes to the principal components. By examining the absolute values of these loadings, we identified the features with the largest values as important attributes, ensuring that they effectively capture the variation in the data.

The loading values for the various seismic attributes are displayed in Fig. 5. In our selection process, we considered both the completeness of the seismic attribute categories and the contribution of features highlighted by PCA. Ultimately, we chose LAMBDA_ANT, PHASE_ANT, IMP, EIGENVALUE_3RD and LAMBDA_AAGFIL as the predictive feature attributes.

Results of evaluation and validation

Model performance

In this study, 80% of the dataset was used to train the deep learning model, while the remaining 20% was reserved for testing its predictive performance. The R² and MSE for the training set were 0.973 and 0.08, respectively, while for the test set, these values were 0.899 and 0.16. These results suggest that the model fits the training data well and also shows good generalization capability on unseen data.

Figure 6 compares the porosity prediction results from both the unconstrained and facies-constrained deep learning models. The unconstrained model performs well for low-porosity values (< 5%), but its accuracy diminishes for higher porosities due to an imbalance in the training data, with a predominance of low-porosity samples.

Traditional deep learning models often face this issue, as they tend to focus on the more abundant low-porosity samples, resulting in poor performance for the less common high-porosity values. This limitation arises as the model primarily learns from the abundant low-porosity data, leading to underfitting in the high-porosity regions.

To address this issue, the facies-constrained deep learning model was developed. By incorporating seismic facies as constraints during training, the model is guided to better capture the underlying geological features, resulting in more accurate predictions across the full porosity range, especially in high-porosity regions. As Fig. 6 shows, the facies-constrained model improves accuracy by balancing predictions for both low- and high-porosity samples, ensuring more robust performance.

Verification and comparison of predictive accuracy

To validate the model’s performance, we selected two blind logging wells that were excluded from the model training process. The predicted porosity values from the deep learning model were compared to the actual logging porosity values from these blind wells. To quantify the model’s accuracy, we calculated the MSE between predicted and true porosity values, as shown in Table 2.

Table 2 Comparison of the MSE, MAE, and R² for porosity prediction between the two methods across the two wells.

Full size table

Table 2 shows that the MSE of the facies-constrained deep learning model is 76% lower than that of the traditional wave impedance inversion method. This significant improvement underscores the effectiveness of deep learning for predicting porosity, particularly in highly heterogeneous, ultra-deep reservoirs. Additionally, R² values indicate that the deep learning model explains a greater proportion of variance in the actual porosity values compared to the traditional method. Furthermore, the MAE demonstrates that predictions from the deep learning model are closer to the actual values, confirming its superior predictive performance across all wells assessed.

Comparing prediction results from the two blind wells clearly reveals that the deep learning model outperforms the traditional method in both low-porosity and high-porosity regions (Fig. 7). Notably, in high-porosity regions, the deep learning model provides significantly more accurate predictions, demonstrating its superior ability to handle limited high-porosity samples.

SHAP analysis and feature importance

By analyzing the SHAP values, we can quantify the percentage contribution of each seismic attribute to the model’s predictions. As illustrated in Fig. 8, LAMBDA_AAGFIL and PHASE_ANT are the most influential attributes in porosity prediction. LAMBDA_AAGFIL reflects the reservoir body structure, while PHASE_ANT captures the fracture structure. Both are critical in ultra-deep fractured-vuggy reservoirs. These findings validate the model’s ability to capture key geological features and provide insights into the relative importance of seismic attributes.

In terms of practical applications, understanding the influence of specific seismic attributes allows for more targeted data collection and model refinement in future reservoir prediction efforts. Future applications could leverage these key attributes to refine porosity prediction models, particularly in complex reservoirs with significant structural heterogeneity. For instance, integrating additional geological data, such as lithofacies or fault intensity, could enhance model accuracy.

Porosity distribution

The final porosity distribution map of the study area highlights high-porosity regions, represented by red and orange areas predominantly concentrated in the middle and upper sections (Fig. 9). These regions correspond to areas around Wells 1 and 2, the most productive wells in the study area. The high-porosity areas align with key geological features, such as bedding structures and fracture zones, validating the model’s ability to capture geological influences on porosity distribution.

Notably, ultra-deep fault-controlled fractured-vuggy reservoirs exhibit a porosity distribution distinct from that of stratified reservoirs. In these reservoirs, porosity is often governed by the distribution of faults and fractures rather than by layered geological structures. This unique, highly heterogeneous, and non-layered distribution pattern is well-represented in the model, with high-porosity regions concentrated along fault and fracture zones, while areas with minimal fault influence show lower porosity. This distribution aligns with the known geological characteristics of ultra-deep fractured-vuggy reservoirs, validating the model’s ability to reflect the complex geological structures present in the study area.

The smooth color gradient and continuous porosity distribution suggest that the deep learning model effectively captures trends and variations in formation characteristics and porosity, providing highly reliable prediction results. The identification of high-porosity areas, often potential reservoir sites, underscores the model’s practical utility for exploration and development planning.

Limitations and future research

Although the deep learning model shows significant improvements over traditional methods, it still has some limitations. First, the model relies solely on seismic attributes, excluding other geological factors such as fluid saturation, mineral composition, and others. These factors, which can influence reservoir properties, are not directly accounted for in the model, potentially limiting its ability to capture small-scale variations.

Additionally, while the facies-constrained model improves predictions in high-porosity regions, imbalances in the training data may still affect accuracy. High-porosity samples are often underrepresented, which might lead to less accurate predictions in these critical areas.

Future work could focus on incorporating additional data, such as lithology and fluid distribution, to improve the model’s predictive capabilities. Hybrid models that combine deep learning with traditional geophysical methods could also enhance performance across different reservoir types. Additionally, developing better techniques for balancing training data could further improve the model’s accuracy in rare but important high-porosity zones.

Conclusions

In this study, we developed a deep learning-based porosity prediction method that integrates multiple key seismic attributes and incorporates seismic facies constraints to address the challenges of predicting porosity in ultra-deep, fault-controlled fractured-vuggy reservoirs. The method achieved a 76% reduction in MSE during blind well predictions, significantly enhancing predictive accuracy. The key findings are as follows:

1.
We established a nonlinear relationship model between seismic attributes and porosity using deep learning, effectively leveraging detailed subsurface structural information within the seismic data. This capability allows for more accurate capture of porosity distribution under complex geological conditions.
2.
The introduction of seismic facies constraints significantly improved the model’s performance in predicting high-porosity areas, addressing the shortcomings of traditional methods in these scenarios.
3.
SHAP analysis revealed the critical role of seismic attributes related to fracture-cave and fault structures, particularly identifying LAMBDA_AAGFIL and PHASE_ANT as the most influential attributes for porosity prediction. This underscores the importance of these geological features in reservoir characterization.

This approach holds significant potential for application in other complex reservoir environments, especially those with high heterogeneity and intricate geological structures. Future research could explore the integration of additional geological constraints and the development of hybrid models that combine deep learning with traditional geophysical methods, further enhancing prediction accuracy and adaptability.

Data availability

The datasets presented in this article are not readily available because Restricted to specific individuals, researchers, or organizations only. Requests to access the datasets should be directed to 2543778006@qq.com.

References

Kang, Z. et al. Intelligent prediction technology for inter-well connectivity paths in deep fractured-vuggy reservoirs. Pet. Nat. Gas Geol. 44, 1290–1299 (2023).
Google Scholar
Li, Y., Kang, Z., Xue, Z. & Zheng, S. Theory and practice of carbonate reservoir development in China. Petroleum Explor. Dev. 45, 669–678 (2018).
Article CAS Google Scholar
Ma, Y. et al. Exploration and development practice and theoretical and technological progress of ultra-deep carbonate oil and gas fields in the Shunbei area of the Tarim Basin. Petroleum Explor. Dev. 49, 1–17 (2022).
Article Google Scholar
Safari, M. R., Taheri, K., Hashemi, H. & Hadadi, A. Structural smoothing on mixed instantaneous phase energy for automatic fault and horizon picking: case study on F3 North Sea. J. Petroleum Explor. Prod. Technol. 13, 775–785 (2023).
Article CAS Google Scholar
Han, H. et al. Overview of seismic prediction technology for reservoir properties. Geophys. Progress. 36, 595–610 (2021).
Google Scholar
Archie, G. E. The electrical resistivity log as an aid in determining some reservoir characteristics. Trans. AIME. 146, 54–62 (1942).
Article Google Scholar
Wyllie, M. R. J., Gregory, A. R. & Gardner, L. W. Elastic wave velocities in heterogeneous and porous media. Geophysics 21, 41–70 (1956).
Article ADS Google Scholar
Carman, P. C. Fluid flow through granular beds. Trans. Inst. Chem. Eng. Lond. 15, 150–156 (1937).
CAS Google Scholar
Miah, M. I. Porosity assessment of gas reservoir using wireline log data: a case study of bokabil formation. Bangladesh Procedia Eng. 90, 663–668 (2014).
Article Google Scholar
Taheri, K., Alizadeh, H., Askari, R., Kadkhodaie, A. & Hosseini, S. Quantifying fracture density within the Asmari reservoir: an integrated analysis of borehole images, cores, and mud loss data to assess fracture-induced effects on oil production in the Southwestern Iranian Region. Carbonates Evaporites. 39, 8 (2024).
Article Google Scholar
Wu, D., Wang, Z., Wang, F. & Lin, X. Predicting sandstone porosity using seismic wave velocity. Pet. Nat. Gas Geol. 6, 290–293 (1995).
Google Scholar
Zou, G., Peng, S., Zhang, H., Zhang, K. & Shi, S. Prediction of porosity in mining areas using seismic wave impedance inversion method. J. China Coal Soc. 34, 1507–1511 (2009).
Google Scholar
Leisi, A. & Shad Manaman, N. Three-dimensional shear wave velocity prediction by integrating post-stack seismic attributes and well logs: application on asmari formation in Iran. J. Petrol. Explor. Prod. Technol. 14, 2399–2411 (2024).
Article Google Scholar
Abdolahi, A., Chehrazi, A., Kadkhodaie, A. & Babasafari, A. A. Seismic inversion as a reliable technique to anticipating of porosity and facies delineation, a case study on Asmari formation in Hendijan field, southwest part of Iran. J. Petrol. Explor. Prod. Technol. 12, 3091–3104 (2022).
Article Google Scholar
Journel, A. G. & Journel, A. G. Fundamentals of Geostatistics in Five Lessonsvol. 8 (American Geophysical Union, 1989).
Wackernagel, H. Multivariate Geostatistics: An Introduction with Applications (Springer Science & Business Media, 2003).
Rahimi, M. & Riahi, M. A. Static reservoir modeling using geostatistics method: a case study of the Sarvak formation in an offshore oilfield. Carbonates Evaporites. 35, 62 (2020).
Article CAS Google Scholar
Rezaei, M., Emami Niri, M., Asghari, O., Talesh Hosseini, S. & Emery, X. Seismic Data Integration Workflow in Pluri-Gaussian Simulation: application to a heterogeneous Carbonate Reservoir in Southwestern Iran. Nat. Resour. Res. 32, 1147–1175 (2023).
Article Google Scholar
GhojehBeyglou, M. Geostatistical modeling of porosity and evaluating the local and global distribution. J. Petroleum Explor. Prod. Technol. 11, 4227–4241 (2021).
Article Google Scholar
Goralski, M. A. & Tan, T. K. Artificial intelligence and sustainable development. Int. J. Manage. Educ. 18, 100330 (2020).
Google Scholar
Li, Y., Lian, P., Xue, Z. & Dai, C. Application status and prospects of big data and artificial intelligence in oilfield development. J. China Univ. Petrol. 44, (2020).
Estimating total organic carbon of potential source rocks in the Espírito Santo Basin, SE Brazil, using XGBoost. Mar. Pet. Geol. 162, 106765 (2024).
Estimation of Reservoir Porosity and Water Saturation Based on Seismic Attributes Using Support Vector Regression Approach. J. Appl. Geophys. 107, 93–101 (2014).
Article Google Scholar
Saljooghi, B. S. & Hezarkhani, A. Comparison of WAVENET and ANN for predicting the porosity obtained from well log data. J. Petrol. Sci. Eng. 123, 172–182 (2014).
Article CAS Google Scholar
Wei, G., Han, H., Liu, H., Li, M. & Yuan, S. Facies-controlled porosity prediction of sandstone reservoirs based on semi-supervised gaussian mixture model and gradient boosting tree. Oil Geophys. Explor. 58, 46–55 (2023).
Google Scholar
Leisi, A., Ahsan, M. R. & Saberi, M. R. Petrophysical parameters estimation of a reservoir using integration of wells and seismic data: a sandstone case study. Earth Sci. Inf. 16, 637–652 (2023).
Article ADS Google Scholar
Bione, F. R. et al. Estimating total organic carbon of potential source rocks in the Espírito Santo Basin, SE Brazil, using XGBoost. Mar. Pet. Geol. 162, 106765 (2024).
Article CAS Google Scholar
Zhang, Y., Li, H., Chen, X., Bu, X. & Han, J. Integrated geological and engineering practices and achievements of ultra-deep fault-controlled fractured-vuggy reservoirs in the Shunbei area of the Tarim Basin. Oil Gas Geol. 43, 1466–1480 (2022).
Google Scholar
Pelikan, M. & Pelikan, M. Hierarchical Bayesian Optimization Algorithm (Springer, 2005).
Parsa, A. B., Movahedi, A., Taghipour, H., Derrible, S. & Mohammadian, A. K. Toward safer highways, application of XGBoost and SHAP for real-time accident detection and feature analysis. Accid. Anal. Prev. 136, 105405 (2020).
Article PubMed Google Scholar
Cohen, I. et al. Pearson correlation coefficient. Noise Reduct. Speech Process. 1–4 (2009).

Download references

Acknowledgements

This research was financially supported by the National Natural Science Foundation of China under Grant Nos. 42272359 and 42303021. We are extremely grateful to the editor and the four anonymous reviewers for their valuable and constructive suggestions.

Author information

Authors and Affiliations

Key Laboratory of Polar Geology and Marine Mineral Resources (China University of Geosciences, Beijing), Ministry of Education, Beijing, 100083, China
Ziyan Deng, Dongsheng Zhou, Hezheng Dong, Xiaowei Huang & Shiping Wei
School of Ocean Sciences, China University of Geosciences (Beijing), Beijing, 100083, China
Ziyan Deng, Dongsheng Zhou, Hezheng Dong, Xiaowei Huang & Shiping Wei
Petroleum Exploration and Development Research Institute, SINOPEC, Beijing, 100083, China
Ziyan Deng & Zhijiang Kang

Authors

Ziyan Deng
View author publications
You can also search for this author inPubMed Google Scholar
Dongsheng Zhou
View author publications
You can also search for this author inPubMed Google Scholar
Hezheng Dong
View author publications
You can also search for this author inPubMed Google Scholar
Xiaowei Huang
View author publications
You can also search for this author inPubMed Google Scholar
Shiping Wei
View author publications
You can also search for this author inPubMed Google Scholar
Zhijiang Kang
View author publications
You can also search for this author inPubMed Google Scholar

Contributions

D.Z.Y.: Software, Investigation, prepared Figs. 1, 2, 3, 4, 5, 6, 7, 8and 9, and Writing–original draft; Z.D.S.: Conceptualization, Methodology, prepared Figs. 1, and 3, Writing–original draft, Writing–review and Supervision; D.H.Z.: Investigation, Data curation, and Editing; H.X.W: Investigation; W.S.P.: Investigation, Data and Curation; K.Z.J.: Resources, Conceptualization, Methodology, and Supervision.

Corresponding authors

Correspondence to Dongsheng Zhou or Zhijiang Kang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Deng, Z., Zhou, D., Dong, H. et al. Deep learning for predicting porosity in ultra-deep fractured vuggy reservoirs from the Shunbei oilfield in Tarim Basin, China. Sci Rep 14, 29605 (2024). https://doi.org/10.1038/s41598-024-81051-4

Download citation

Received: 08 August 2024
Accepted: 25 November 2024
Published: 28 November 2024
DOI: https://doi.org/10.1038/s41598-024-81051-4