Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Causality research based on phase space reconstruction

  • Lei Hu,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Resources, Software, Visualization, Writing – original draft

    Affiliation School of Mathematics and Computer Science Institute, Northwest Minzu University, Lanzhou, China

  • Zhuoma Sunu,

    Roles Writing – review & editing

    Affiliation School of Mathematics and Computer Science Institute, Northwest Minzu University, Lanzhou, China

  • Hongke She,

    Roles Methodology

    Affiliation School of Mathematics and Computer Science Institute, Northwest Minzu University, Lanzhou, China

  • Binghuai Fan,

    Roles Visualization

    Affiliation School of Mathematics and Computer Science Institute, Northwest Minzu University, Lanzhou, China

  • Jingru Ma,

    Roles Data curation

    Affiliation School of Mathematics and Computer Science Institute, Northwest Minzu University, Lanzhou, China

  • Chaojiu Da

    Roles Software, Writing – review & editing

    jtdcj@163.com

    Affiliations School of Mathematics and Computer Science Institute, Northwest Minzu University, Lanzhou, China, Institute of Applied Mathematics and Astronomical Calendar, Northwest Minzu University, Lanzhou, China

Abstract

Based on phase space reconstruction theory, the root mean square error is used as a quantitative criterion for identifying the appropriate embedding dimension and time step and selecting the optimal configuration for these factors. The phase space is then reconstructed, and the convergent cross-mapping algorithm is applied to analyse the causality between time series. The causality among the variables in the Lorenz equation is first discussed, and the response of this causality to the integration step of numerical solutions to the Lorenz equation is analyzed. We conclude that changes in the integration step do not alter the causality but will affect its strength. Variables X and Y drive each other, whereas variable Z drives variables X and Y in a unidirectional manner. Second, meteorological data from 1948–2022 are used to analyse the effect of the Southern Hemisphere annular mode on the East Asian summer monsoon index and surface air temperature driving capacity. From a dynamic perspective, it is concluded that the Southern Hemisphere annular mode is the driving factor affecting the East Asian summer monsoon index and surface air temperature. Based on ideal test results and the observation data, the collaborative selection of the embedding dimension and time step is more reliable in terms of determining causality. This provides the ability to determine causality between climate indices and theoretically guarantees the selection of climate predictors.

1. Introduction

Atmospheric motion is an extremely challenging topic because of its multiscale nature, complex dynamics, and thermodynamic effects. In the end of the 19th century and beginning of the 20th century, Bjerknes established a system of atmospheric dynamics equations, providing a milestone in human understanding of atmospheric motion [1]. The highly nonlinear nature and complexity of atmospheric dynamics means that analytical solution to these equations could not be obtained, leaving only numerical solutions. There are several models for predicting atmospheric motion, including dynamic analysis models [2,3], statistical models [4], and dynamic-statistical models [57]. The key to the establishment of statistical models is the selection of predictors, which essentially describe the driving causality between various elements of atmospheric motion [8]. Therefore, the study of causality is an important issue in atmospheric science. Causality is applicable in various disciplines [911], reflecting the relationships that stimulate and are stimulated between things. These relationships can reveal the inherent connections among quantities in the process of development and change. Against the backdrop of global warming, the inference of causality in the climate system can be used to project the risks associated with climate change, particularly in attributing the cause of extreme weather events.

Since the 1970s, attribution research has made significant progress in both theory and application based on a considerable amount of observational data [1215]. In atmospheric science, meteorological research institutions in various countries now have access to large amounts of observation and model output data following innovations in observation technology, improved storage conditions, faster calculation methods, and easier means of communication. The mining of information from these data and the extraction of reliable predictors are urgent scientific problems. Granger first proposed a causality test method for strongly coupled systems in 1969 [16]. The premise of this causal test is that the time series must be stationary; otherwise, false attribution conclusions may appear. Since then, many scholars have proposed various attribution theories and methods [17,18], such as causal network learning [19,20]. In 1981, Takens proposed the theory of phase space reconstruction [12], which aims to reconstruct a phase space that is diffeomorphic to the solution space of the original differential equations. This method employs the time series of a solution component from a system of differential equations, selects the appropriate embedding dimension and time step, and uses this time series as a predictor for other solution components of the differential equation group. Based on Takens’ theory, a lot of attribution research has been done. Some scholars have focused on parameter determination; for example, Fraser and Swinney proposed the mutual information method [21], and Li et al. discussed the theory of determining the attractor dimension by reconstructing the phase space with a one-dimensional time series [22]. In terms of the application of attribution theory, Feng et al. developed the dynamical correlation factor exponent, which enables us to study the dynamic structural features of the climate system [23]. Takens’ theory is also widely used in biological mathematics, mainly in the construction of biological models. Sugihara et al. introduced the technique of convergent cross-mapping (CCM), which complements Granger’s causality [24]. This causality in weakly coupled nonlinear systems can be tested, allowing reliable predictors to be obtained and producing more accurate prediction conclusions [8,2527].

Over the past few years, significant progress has been made in using dynamic causality to analyze drivers of the climate system. Wang et al. used the CCM algorithm to investigate the causal relationship between Northern Hemisphere annular modes and winter surface air temperature (SAT) over Northeast Asian [28,29]. Tsonis et al. using CCM to explore the dynamic relationship between cosmic rays and global temperature [30], while Chen et al. used the CCM algorithm to analyze meteorological drivers affecting PM2.5 concentrations in China [31]. These studies have contributed to our understanding of the complex dynamics of climate systems.

In the previous CCM causal detection process, the embedded dimension is often determined separately, and the time delay is usually obtained by experience, without considering the common influence of parameters in detail [2831]. In this paper, a collaborative parameter determination method is proposed to improve the testing ability of causality strength. The remainder of the article is structured in the following manner. Section 2 contains Basic theory and CCM algorithm. Section 3 contains Parameter selection method in ideal type test. Section 4 contains the driving capability of the Southern Hemisphere annular mode (SAM) index on East Asian summer SAT and the East Asian summer monsoon index (EASMI) will be explored. Section 5 contains conclusion and prospect.

2. Methods and data

2.1. Takens’ theorem [12]

Let M be a compact d-dimensional manifold. Given a pair of transformations (φ,y), where φ: MM is a smooth diffeomorphism and y: MR is a smooth function, we define a mapping as follows: (1)

This is an embedding map from M to the space R2d+1. Moreover, the smoothness of the mapping is guaranteed by the fact that it has at least a second-order continuous derivative, as stated in the theorem.

Takens’ theorem establishes the minimum dimension required to guarantee the existence of a diffeomorphism. In many cases, it is unnecessary to extend the dynamics to such high dimensions. However, accurately determining the minimum dimension in actual data experiments can be challenging, highlighting the importance of identifying the optimal embedding dimension for the success of the experiment.

2.2. Convergent cross-mapping algorithm [24]

Consider two time series X and Y of length L from the same system. We obtain (2) (3) where , and similarly for yi, represent the values of the time series X, Y at time ti. Suppose that the embedding dimension is E (according to Takens’ theory, E must be greater than 2d+1) and the time step is τ, where both E and τ are natural numbers. To discuss the causality between X and Y, we define the embedding vectors of variables X and Y at time tj as MX(tj) and MY(tj), respectively. The sets of embedding vectors of MX(tj) and MY(tj) constitute the phase spaces MX and MY, respectively. For example, consider X: (4) (5) Where {j}⊂{i}. We define as the prediction of the time series X(t) by applying the cross-mapping through the phase space MY, and similarly for .

At time tj, we search for the E+1 nearest neighbours of MY(tj) in the embedded vector set MY, sorted in ascending order of distance as MY(tj1), MY(tj2), …, MY(tj(E+1)), where . Assuming that the mapping of these E+1 nearest neighbours on MX isMX(tj1), MX(tj2), …, MX(tj(E+1), we can obtain the predicted value of at time tj by the cross-mapping, which is given by: (6)

Where wjk is the distance weight of vector MY in the set MY(tj) to the kth nearest neighbour MY(tjk), i.e., , , and (.,.) is the Euclidean distance. Obviously, (7)

2.3. Data

Data for monthly SAM index and EASMI data provided by Professor Li (http://lijianping.cn/dct/page/65577). The monthly SAT data at 1000 hPa are provided by the National Centers for Environmental Prediction, with a horizontal resolution of 2.5°×2.5° (https://www.psl. noaa.gov/data/gridded/data.ncep.reanalysis.html). The research area covers the East Asia region (4°N–53°N, 73°E–150°E), and the time span of our study ranges from January 1948 to December 2022. We consider the summer season to be the period from June to August.

3. Numerical results for Lorenz system

3.1. Data Solution for embedding dimension and time step

The Lorenz equation can be written as (8)

When σ = 10, and β = 28, chaos will occur and a chaotic attractor will be generated [32]. Taking the initial values as x(0) = 3, y(0) = 5 and z(0) = 9 and setting the integration interval and integration step to be [0,500] and Δt = 0.1, respectively, we select the numerical solution of [350,500] as the test sequence, where X is the numerical solution of x, and Y and Z are obtained in a similar manner. The root mean square error (RMSE) is constructed as (9) where A and B are any two of the time series X, Y, and Z. A(tj)is the numerical solution of variable A at time tj. When E and τ are given, is the predicted value of variable A at time tj. The time step τ represents τ times the integration step, i.e., the sampling interval of τ Δt time units. Eq (9) gives the RMSE as a function of E and τ.

Fig 1 shows the RMSE in the interval (E,τ)∈[1,12]×[0,12] by taking A as the time series Y and B as the time series X. To obtain a smooth image, cubic convolution interpolation has been performed based on lattice data, where Fig 1A is R(E,τ) and 1b and 1c show R(τ) (with E = 3) and R(E) (with τ = 1), respectively. When the embedding dimension E is greater than two, the correlation coefficients between the predicted values and the numerical solutions for each parameter configuration pass the 95% significance test. Fig 1A indicates that the smallest prediction error occurs when E = 3 and τ = 1, i.e., the red point A (because E and τ can only be positive integers). Moreover, when the embedding dimension E is given, the prediction error first decreases and then increases with increasing time step τ. When τ is fixed and the embedding dimension E is increased, the prediction error suddenly decreases and then slowly increases. That is, we again observe an initial decrease followed by an increase; Fig 1B and 1C exhibit similar trends. Based on these results, The following conclusions can be derived: (1) The reconstruction of the phase space of variable X provides geometrical information about variable Y; (2) When the time step exceeds a certain threshold, the prediction error tends to increase as the time step becomes larger, indicating that the reconstructed phase space cannot fully recover historical information. This suggests that more historical data may not necessarily provide more effective information, and could even introduce invalid information. Therefore, the requirement for collaborative selection of the best parameters is that the parameters are positive integers; the prediction error is small and passes the 95% significance test. Thus, for the following Lorenz equation test, we set E = 3 and τ = 1. This is consistent with the selection of parameters obtained by previous research methods [29].

thumbnail
Fig 1. RMSE changes with the number of embedding dimensions and time steps.

(a) R(E,τ), (E,τ)∈[1,12]×[0,12]; (b) R(τ), E = 3, τ∈[0,12]; (c) R(E), τ = 1, E∈[1,12].

https://doi.org/10.1371/journal.pone.0313990.g001

In this paper, we consider collaborative parameter determination to improve the ability of causal detection. The simple algorithm is as follows:

3.2. Detection of causality

The correlation coefficient between the time series A(tj) A(tj) and is given by (10) where represents the mean value of A(tj) and represents the mean value of . If the limit converges, i.e., (11) then variable A is considered the driving factor of B, denoted by AB, showing that there is unidirectional causality between the two. The strength of causality is represented by the absolute value of the limit u, with stronger causality indicated by values closer to ±1. A lack of causality or weak causality is indicated by values further away from ±1. Convergence in this context means that ρ(L) approaches a certain value as L increases.

There are two methods for extending time series in numerical simulations of differential equations. One is to adjust the size of the integration step while maintaining the constant integration interval to increase the value of L. The other is to fix the integration step and increase the integration interval to make L larger. In observation data, the time series length can be extended by decreasing the observation interval or by collecting additional data.

Fig 2 displays the correlation coefficient curve given by Eq (10) in numerical simulations of the Lorenz Eq (8), where the abscissa represents the time series length, the ordinate ρ represents the correlation coefficient between the predicted value and the numerical solution. The predictions of variables Y and Z by the reconstructed phase space MX are depicted by red solid lines and dashed lines, respectively. The predictions of variables X and Z by phase space MY are shown by solid and dashed blue lines, respectively. According to Takens’ theorem, the reconstructed phase space of variables X and Y is equivalent to the original phase space. However, for variable Z, the detection process is successful when Z is used as the predicted value, but fails when Z is used as the predicted field due to the rotational symmetry of the phase space reconstructed by Z. E. Deyle presented some forms of Takens’ theorem extension, but further research is still needed for univariate causal detection [33]. Thus, Fig 2 demonstrates the driving capability for detection success. These four correlation coefficient curves converge as the length L of the time series increases. The convergence values are all close to or above 0.99, indicating that Y and Z are the driving factors of X, and that X and Z are the driving factors of Y, with extremely strong causality, namely YX, ZX and XY, ZY. However, the detection of XZ, YZ fails due to the property of rotational symmetry in the reconstructed phase space of variable Z [33].

To assess the effect of the time step on causality, we set the embedding dimension to E = 3 and analyze causality in different time steps. Fig 3A shows the curve of the correlation coefficient between Y and with different time steps. The results show that causality still exists as the time step increases, but the intensity of causality decreases. Fig 3B shows the causality between variables X and with different time steps, which is similar to Fig 3A. Combined with Fig 1, it can be concluded that if the time step is too large, the prediction error may increase, the correlation will decrease, and the reconstructed phase space will not fully recover the system information. This will result in the loss of valid information and the inclusion of invalid information. However, based on Takens’ theorem, without considering the impact of computational cost, causality can still be detected through CCM by appropriately increasing the embedding dimension and time step. Thus, we have the following: (1) The CCM detection process is not random, i.e., appropriately increasing the embedding dimension and time step will not destroy causality, as shown by Fig 3; (2) To determine the strength of the causality, it is necessary to select the best embedding dimension and time step; (3) The study found that while the delay must fall within the system’s pseudo-period range of 0 to 1/4, increasing the delay appropriately does not alter the driving force, but rather its magnitude. Table 1 reveals a significant correlation between delay and data density. Importantly, increasing the delay appropriately when data density is higher does not undermine the test. In these cases, the reconstructed phase space remains diffeomorphic to the original system, with no loss of information due to excessive delay. This highlights the importance of considering the delay corresponding to different data densities when detecting causality in complex systems, and emphasizes the need to further explore the relationship between delay and data density.

thumbnail
Fig 3. Intensity of causality.

(a) Strength of the causal relationship between Y and X; (b) strength of the causal relationship between X and Y.

https://doi.org/10.1371/journal.pone.0313990.g003

thumbnail
Table 1. Forecast error with different integration steps.

https://doi.org/10.1371/journal.pone.0313990.t002

Large amounts of data can be obtained from each national climate observation station across different observation periods, with the data density representing the different observation periods. Now, we theoretically discuss the impact of data density on the causality relationships in atmospheric observational data. The Lorenz equation is taken as the study object, with the initial conditions being x(0) = 3, y(0) = 5, and z(0) = 9, the integration interval being [0,500], and the integration step sizes successively set to 0.01, 0.02,…, 0.1. The numerical solution over the integration interval [350,500] is chosen as the test sequence. For example, when Δt = 0.01, there are 15,000 numerical solutions for the test data. A smaller integration step gives a smoother time series and a greater data density. The optimal embedding dimension and time step are calculated from Eq (9). Fig 4 shows the prediction error of variable X on . As the integration step increases (i.e., the data density decreases), the prediction error increases, but the causality remains unchanged. This shows that a smaller integration step size and smoother data result in a smaller prediction error. Therefore, in detecting the causality of observation data, it is essential to select data that are as dense as possible.

To compare the reliability and advantages of collaborative parameter determination, the initial field of the Lorenz Eq (8) remains unchanged while a numerical solution with an integration step of 0.01 and integration interval [350,500] is selected as the experimental sequence. Take the variables X, Y as an example. There are three schemes for parameter determination. One is E = 3 obtained according to Simplex Projection algorithm and τ = 1 obtained empirically [29]; the other is E = 6 and τ = 6 determined collaboratively in this paper; the third is to construct a control test and choose E = 6 and τ = 1 according to experience. As shown in Fig 5, when E = 6 and τ = 6, the convergence effect is the best with the largest convergence value, followed by the control test. Conversely, E = 3 and τ = 1 result in the worst convergence with the smallest convergence value. This is because when discrete data approaches continuous infinity, the selection of tau will affect the ability of the shadow manifold to recover information. If tau is too small and the dimension is too small, information overlap occurs, hence E and τ needs to be determined collaboratively under different density data. This also verifies the advantages of collaborative selection [12].

thumbnail
Fig 5. Convergence for different parameter configurations.

https://doi.org/10.1371/journal.pone.0313990.g005

4. Causality detection of SAM and some summer climate indices in East Asia

4.1. Data analysis

The Southern Hemisphere annular mode is the dominant mode of variability in atmospheric circulation in the mid-to-high latitudes of the Southern Hemisphere [34]. However, the effects of SAM on climate are not limited to the Southern Hemisphere, and studies of the global climate often incorporate SAM [35,36]. Li et al. investigated the Northern Hemisphere climate by studying the SAM phase and climate index [37].

Although it is widely known that observed data inherently contain noise, this paper does not specifically focus on this aspect. However, in this study, the second-best parameter configuration was selected to minimize the impact of noise on the test results. In this section, we preliminarily analyze the correlation between summer SAM and EASMI and between summer SAM and the East Asian summer 1000hPa SAT from 1948–2022. Fig 6 shows the time series and the linear trend of each climate index from 1948–2022. The seasonal time series curves of SAM (solid line marked in blue) and EASM (solid line marked in orange) in summer from 1948–2022 are shown in Fig 6A. The correlation coefficient between variables is -0.36, and the correlation coefficient passes 95% significance test, indicating that the two have a good correlation. The regression analysis of SAM and EASM is carried out respectively, and it is found that SAM and EASM have obvious linear trend, and their linear trend coefficients are 0.67 and -0.29. Fig 6A The blue and orange bands show the confidence intervals with 95% significance for SAM and EASM regression analysis. Fig 6B shows the seasonal time series curves of summer SAM and East Asian summer 1000hPa SAT (solid line marked with green). There is also a significant correlation between SAM and SAT, with a coefficient value of 0.66. In addition, the linear trend coefficient of SAT is 0.80, and the green band is the regression analysis of SAT, passing the confidence interval at 95% significance level. The correlation coefficient between SAM and SAT was 0.66 from 1948–2022, and the correlation between SAM and SAT was negative from 1953–1963. In the mid-to-late 1970s, the correlation was positive again. There are obvious oscillations in local time periods. Traditional statistical analysis may yield different causality results in different time periods. In addition, correlation cannot establish causality in the physical sense. This section explores the causality between SAM and the East Asian summer climate based on Takens’ theory and the CCM algorithm. We also discuss the influence of different data densities on causality. Provide theoretical support for literature [37] as much as possible.

thumbnail
Fig 6. Season by season time series of SAM, EASM and 1000hPa SAT from 1948–2022 (blue: SAM; Orange: EASM; Green: SAT).

(a)Time series and linear trend of SAM and EASM in summer; (b)Time series and linear trend of SAM and 1000hPa SAT in East Asian summer.

https://doi.org/10.1371/journal.pone.0313990.g006

4.2. Causality between summer SAT and SAM in East Asia

To investigate the causality between summer SAM and SAT in East Asia, the optimal embedding dimension and time step for reconstructing the phase space are first determined, and the optimal parameter configuration is jointly selected. Fig 7 shows a contour map of RMSE given by Eq (9). For phase space reconstruction of the summer SAM time series, the optimal parameter configuration for minimizing the prediction error and passing the 95% significance test is E = 6 and τ = 1, as demonstrated by the red point A in Fig 7A. However, for phase space reconstruction of the SAT season-by-season time series, the parameter configuration that produces the smallest prediction error while still passing the 95% significance test is E = 7 and τ = 1, as shown by the red point A in Fig 7B. As control experiments, the parameter configurations with the next-smallest prediction errors are indicated by the blue points B in Fig 7A (E = 7, τ = 1) and 7b (E = 8, τ = 1).

thumbnail
Fig 7. Determination of embedding dimension and time step.

(a) RMSE of variable SAT under different parameters; (b) RMSE of variable SAM under different parameters.

https://doi.org/10.1371/journal.pone.0313990.g007

The parameter configurations represented by points A and B in Fig 7A were used to investigate the unidirectional driving force from the East Asian summer 1000hPa SAT to summer SAM. The results are shown in Fig 8A. As the length of the time series increases, the correlation coefficient does not converge to a stable value when E = 6 and τ = 1 (red solid line). In order to reduce the influence of randomness in the test results, the embedding dimension was increased to E = 7 and the time step was held at τ = 1, as shown by the blue solid line. As the length of the time series increases, the correlation coefficient curve still does not converge. It can be inferred that, under the available data, there is no unidirectional causality between East Asian summer 1000hPa SAT and summer SAM, but the predictive power cannot be dismissed. The prediction errors for both cases are small, only around 0.2, as shown in Fig 7A. Using the parameter configurations represented by points A and B in Fig 7B, the causality between summer SAM and East Asian summer 1000hPa SAT was investigated. Fig 8B shows that both causality curves exhibit a significant upward trend in the correlation coefficient as the length of the time series increases. With E = 7 andτ = 1, the prediction error is 1.31, as shown by the red solid line in Fig 8B. As the length of the time series increases, the correlation coefficient gradually converges to ρ = 0.65, suggesting that the strength of causality is u = 0.65. It can be inferred that summer SAM is the driving factor of 1000hPa SAT in the East Asian summer, with strong causality. With E = 8 and τ = 1, the summer SAM prediction error is 1.32, and the strength of causality is ρ = 0.63. The blue solid line in Fig 8B is maintained, which is completely consistent with Takens’ theory that a one-dimensional time series can always find an E-dimensional embedding phase space with unchanged topological meaning, satisfying E≥2d+1 (where d is the singular attractor dimension). When the embedding dimensions changes, the causality is not disrupted, but its strength changes.

thumbnail
Fig 8. Causality curves.

(a) Unidirectional causality of summer 1000hPa SAT in East Asia on summer SAM; (b) unidirectional causality of summer SAM on summer 1000hPa SAT in East Asia.

https://doi.org/10.1371/journal.pone.0313990.g008

We now examine the changes in causality as the density of observations increases. Monthly data were used to investigate the unidirectional causality between summer monthly SAM and summer 1000hPa SAT in East Asia. The optimal embedding dimension and time step parameter configurations and suboptimal parameter configurations are E = 14, τ = 1 and E = 13, τ = 2, respectively, as shown in Fig 9. Causality still exists, and the causal strength is about 0.47, which is slightly lower than the causal strength of 0.64 in Fig 8B.

In conclusion, our study confirms the existence of unidirectional causality between summer SAM and 1000hPa SAT in the East Asian summer. However, based on the available data, this causality is not strongly associated with data density, which contradicts the ideal test research findings. This may be the result of systematic errors (noise) in the observed data [38]. Furthermore, our results show that there is no increase in causal strength after encryption of the observed data.

4.3. Causality between summer SAM and EAMSI

A similar discussion of causality between summer SAM and EASMI reveals that selecting the optimal parameter configuration as E = 13 and τ = 2 results in the failure of the unidirectional causality test of EASMI on summer SAM (indicated by the red solid line in Fig 10A). Even when selecting the second-best parameter configuration, E = 12 and τ = 1, the test still fails (indicated by the blue solid line in Fig 10A), which suggests the absence of unidirectional causality between EASMI and summer SAM. However, when testing the unidirectional causality between summer SAM and EASMI, the optimal parameter configuration of E = 11, τ = 1 and the second-best configuration of E = 13, τ = 1 both pass the significance test. Fig 9B shows that there is stable causality between SAM and EASMI, and the causal strength does not change significantly with changes in the embedding dimension. These results suggest the existence of unidirectional causality, where the summer SAM influences the motion trajectory of the EASMI dynamic system and serves as a reliable predictor. In summary, our research indicates that there is only one unidirectional causality between summer SAM and EASMI, which aligns with the physical mechanism of SAM propagating towards the northern hemisphere [37,3942].

thumbnail
Fig 10. Causality curves.

(a) Unidirectional causality of EAMSI on summer SAM; (b) unidirectional causality of summer SAM on EAMSI.

https://doi.org/10.1371/journal.pone.0313990.g010

5. Conclusions and discussion

This paper has described the use of Takens’ theory and the CCM algorithm to detect causality in nonlinear climate systems. This analysis explains the intrinsic connections between elements of the climate system and explores the driving mechanisms of the atmospheric system. Through causality tests on both ideal models and atmospheric observation data, this study has demonstrated the feasibility of this detection method. The specific conclusions are as follows:

  1. A method has been proposed for determining the optimal parameter configuration by selecting the optimal embedding dimension and time step in phase space reconstruction.
  2. Causality between variables X, Y, and Z in the Lorenz equation was inferred, where Y and Z are the driving factors of X, X and Z are the driving factors of Y, with extremely strong causality. There also has a causality from X and Y to Z, although its strength is lower.
  3. Using atmospheric observation data, the driving capability of SAM on EASMI and SAT was discussed. The results indicate that, during the summer season, SAM may affect EASMI and SAT through propagation across the equator. Specifically, SAM has a driving effect on both EASMI and SAT in East Asia, and this driving effect is unidirectional.

Causality detection indicates the existence of a causal driver, whereas failure to detect causality may be the result of limitations in existing data and methods, rather than the absence of causality. This is evident from the ideal experiment, where XZ,YZ causality was found to exist but was not detected. Further research is needed to identify sufficient conditions for the absence of causal drivers, to evaluate the impact of time series observation errors on causality detection, and to detect causality in highly nonlinear systems.

Supporting information

S1 File. For observational data in this paper, please refer to the supplementary material.

https://doi.org/10.1371/journal.pone.0313990.s001

(XLSX)

Acknowledgments

We thank Stuart Jenkinson, PhD, from Liwen Bianji (Edanz) (www.liwenbianji.cn/) for editing the English text of a draft of this manuscript.

References

  1. 1. Bjerknes V. Das Problem der Wettervorhersage, betrachtet vom Standpunkte der Mechanik und der Physik. Meteorologische Zeitschrift. 1904;21:1–7.
  2. 2. Zheng ZH, Ren HL, Huang JP. Analogue correction of errors based on seasonal climatic predictable components and numerical experiments. Acta Physica Sinica. 2009;58(10):7359–7367.
  3. 3. Wang H, Zheng ZH, Yu HP, Huang JP, Ji MX. Characteristics of forecast errors in the National Climate Center atmospheric general circulation model in winter. Acta Physica Sinica. 2014;63(09):467–477.
  4. 4. Gong ZQ, Feng GL. The research of durative characteristics of dry/wet series of China during the past 1000 years. Acta Physica Sinica. 2008;57(6):3920–3931.
  5. 5. Feng GL, Zhao JH, Zhi R, Gong ZQ, Zheng ZH, Yang J, et al. Recent progress on the objective and quantifiable forecast of summer precipitation based on dynamical statistical method. Journal of Applied Meteorological Science. 2013;24(06):656–665.
  6. 6. Zuo JQ, Li WJ, Sun CH, Xu L, Ren HL. Impact of the North Atlantic sea surface temperature tripole on the East Asian summer monsoon. Advances in Atmospheric Sciences. 2013;30(04):1173–1186.
  7. 7. Liu Y, Ren HL, Zhang PQ, Jia XL, Liu XW, Sun LH. Improve the prediction of summer precipitation in South China by a new approach with the Tibetan Plateau snow and the applicable experiment in 2014. Chinese Journal of Atmospheric Sciences. 2017;41(2):313–320.
  8. 8. Runge J, Bathiany S, Bollt E, Camps-Valls G, Coumou D, Deyle ER, et al. Inferring causation from time series in Earth system sciences. Nature communications. 2019;10(1):2553. pmid:31201306
  9. 9. Kamiński M, Ding MZ, Truccolo WA, Bressler SL. Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biological Cybernetics. 2001;85(2):145–157. pmid:11508777
  10. 10. Friston KJ, Harrison L, Penny W. Dynamic causal modelling. NeuroImage. 2003;19(4):1273–1302. pmid:12948688
  11. 11. Meinshausen N, Hauser A, Mooij JM, Peters J, Versteeg P, Bühlmann P. Methods for causal inference from gene perturbation experiments and validation. Proceedings of the National Academy of Sciences. 2016;113(27):7361–7368. pmid:27382150
  12. 12. Takens F. Dynamical Systems and Turbulence. Berlin, Heidelberg: Springer Berlin Heidelberg; 1981.pp. 366–381.
  13. 13. Liang XS. Information flow and causality as rigorous notions ab initio. Physical Review E. 2016;94(5–1):052201. pmid:27967120
  14. 14. Zhao YB, Liang XS. Charney’s Model—the Renowned Prototype of Baroclinic Instability—Is Barotropically Unstable As Well. Advances in Atmospheric Sciences. 2019;36:733–752.
  15. 15. Liang XS. Normalized multivariate time series causality analysis and causal graph reconstruction. Entropy. 2021;23(6):679. pmid:34071323
  16. 16. Granger C. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: journal of the Econometric Society. 1969;37(3):424–438.
  17. 17. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology. 1974;66(5):688–701.
  18. 18. Runge J. Causal network reconstruction from time series: From theoretical assumptions to practical estimation. Chaos: An Interdisciplinary Journal of Nonlinear Science. 2018;28(7):075310.
  19. 19. Runge J, Petoukhov V, Kurths J. Quantifying the Strength and Delay of Climatic Interactions: The Ambiguities of Cross Correlation and a Novel Measure Based on Graphical Models. Journal of Climate. 2014;27(2):720–739.
  20. 20. Kretschmer M, Coumou D, Donges JF, Runge J. Using causal effect networks to analyze different Arctic drivers of midlatitude winter circulation. Journal of climate. 2016;29(11):4069–4081.
  21. 21. Fraser AM, Swinney HL. Independent coordinates for strange attractors from mutual information. Physical review A. 1986;33(2):1134–1140. pmid:9896728
  22. 22. Li JP, Chou JF. Some problems existed in estimating fractal dimension of attractor with one-dimensional time series. Acta Meteorologica Sinica. 1996;54(3):312–323.
  23. 23. Gong ZQ, Feng GL. Analysis of similarity of several proxy series based on nonlinear analysis method. Acta Physica Sinica. 2007;56(06):3619–3629.
  24. 24. Sugihara G, May R, Ye H, Hsieh CH, Deyle ER, Fogarty M, et al. Detecting Causality in Complex Ecosystems. Science. 2012;338(6106):496–500. pmid:22997134
  25. 25. Deyle ER, Fogarty M, Hsieh C-h, Kaufman L, MacCall AD, Munch SB, et al. Predicting climate effects on Pacific sardine. Proceedings of the National Academy of Sciences. 2013;110(16):6430–6435. pmid:23536299
  26. 26. McGowan JA, Deyle ER, Ye H, Carter ML, Perretti CT, Seger KD, et al. Predicting coastal algal blooms in southern California. Ecology. 2017;98(5):1419–1433. pmid:28295286
  27. 27. Mønster D, Fusaroli R, Tylén K, Roepstorff A, Sherson JF. Causal inference from noisy time-series data—Testing the Convergent Cross-Mapping algorithm in the presence of noise and external influence. Future Generation Computer Systems. 2017;73:52–62.
  28. 28. Zhang NN, Wang GL, Tsonis AA. Dynamical evidence for causality between Northern Hemisphere annular mode and winter surface air temperature over Northeast Asia. Climate Dynamics. 2019;52(5):3175–3182.
  29. 29. Zhang LY, Wang GL, Tan GR, Wu Y. Experimental study on prediction of nonlinear system based on causality test. Acta Physica Sinica. 2022;71(8):080502.
  30. 30. Tsonis AA, Deyle ER, May RM, Sugihara G, Swanson K, Verbeten JD, et al. Dynamical evidence for causality between galactic cosmic rays and interannual variation in global temperature. Proceedings of the National Academy of Sciences. 2015;112(11):3253–3256. pmid:25733877
  31. 31. Chen ZY, Xie XM, Cai J, Chen DL, Gao BB, He B, et al. Understanding meteorological influences on PM 2.5 concentrations across China: a temporal and spatial perspective. Atmospheric Chemistry and Physics. 2018;18(8):5343–5358.
  32. 32. Lorenz EN. Deterministic nonperiodic flow. Journal of atmospheric sciences. 1963;20(2):130–141.
  33. 33. Deyle ER, Sugihara G. Generalized Theorems for Nonlinear State Space Reconstruction. PLOS ONE. 2011;6(3):e18295. pmid:21483839
  34. 34. Thompson D, Wallace JM. Annular Modes in the Extratropical Circulation. Part I: Month-to-Month Variability*. Journal of Climate. 2000;13(5):1000–1016.
  35. 35. Zheng F, Li JP, Clark RT, Nnamchi HC. Simulation and Projection of the Southern Hemisphere Annular Mode in CMIP5 Models. Journal of Climate. 2013;26(24):9860–9879.
  36. 36. Zheng F, Li JP, Ding RQ. Influence of the preceding austral summer Southern Hemisphere annular mode on the amplitude of ENSO decay. Advances in Atmospheric Sciences. 2017;34(11):1358–1379.
  37. 37. Zheng F, Li JP, Liu T. Some advances in studies of the climatic impacts of the Southern Hemisphere annular mode. Journal of Meteorological Research. 2014;28(5):820–835.
  38. 38. Liu YJ, Ma KY, Lin ZS. The estimation on climate noise of monthly precipitation in china. Journal of Applied Meteorological Science 2000;11(2):165–172.
  39. 39. Xue F, Wang HJ, He JH. Interannual Variability of Mascarene High and Australian High and Their Influences on East Asian Summer Monsoon. Journal of the Meteorological Society of Japanserii. 2004;82(4):1173–1186.
  40. 40. Xue F. Influence of the southern circulation on East Asian summer monsoon. Climate and Environmental Research. 2005;10(3):123–130.
  41. 41. Liu T, Li JP, Li YJ, Zhao S, Zheng F, Zheng JY, et al. Influence of the May Southern annular mode on the South China Sea summer monsoon. Climate Dynamics. 2018;51(11):4095–4107.
  42. 42. Dou J, Wu ZW, Li JP. The strengthened relationship between the Yangtze River Valley summer rainfall and the Southern Hemisphere annular mode in recent decades. Climate Dynamics. 2020;54(3):1607–1624.