Abstract
Studying the propagation prediction of low-frequency (LF) radio waves is very significant for supporting applications in fixed and mobile long-distance communication, remote navigation, and timing service. Therefore, to enhance the predicting accuracy of LF sky wave propagation, we proposed an improved method based on the machine learning method. Firstly, we employed a machine learning method to create a prediction model for the critical frequency of the low ionospheric E layer (foE), which significantly affects LF sky wave propagation. Secondly, we enhanced the method for predicting LF sky wave propagation based on the model of low ionospheric parameters. By comparing the measured data from East Asia and predicted data based on the wave-hop theory, the proposed method achieved a 6.16% improvement in LF sky wave field strength.
Similar content being viewed by others
Introduction
The Low-frequency (LF) is defined as the frequency range between 30 kHz and 300 kHz. Thanks to their low signal attenuation, strong penetration ability, and resilience against both human-made and nuclear interference, LF radio waves are extensively utilized in navigation, control communications, and emergency command1. LF radio waves propagate in two ways: ground waves and sky waves, and the latter can be reflected by the ionosphere, enabling long-distance propagation. In order to achieve an effective application and high-quality communication, LF radio wave propagation prediction has been a research hotspot in the past, remains so in the present, and will continue to be in the future.
In 1938, Booker extended the magneto-ionic theory2 to address a series of issues related to the oblique incidence of LF sky waves on the ionosphere3. In 1962, Budden published Radio Waves in the Ionosphere, systematically providing the mathematical basis for the propagation of LF sky waves in the ionosphere4. Based on previous theories, Ferguson in 1985 proposed a model for the statistical treatment of the daytime ionosphere, which could be used for predicting LF sky wave propagation5. In 2003, Hayakawa’s research on LF propagation signals in the sub-ionosphere revealed earthquakes-related ionospheric disturbances6. Berenger in 2006 summarized the application of the Finite Difference Time Domain (FDTD) algorithm in LF propagation7. In recent years, Antunes de Sá used machine learning models to estimate lightning distance based on LF lightning signal data, which promoted the development of remote sensing of the ionosphere8. Gao et al. predicted the ionospheric reflection coefficient using ionosphere models, comparing and validating the impact of different lower ionospheric models on LF sky wave propagation9. In 2023, Hu et al. investigated the prediction of LF sky wave propagation under complex random ionospheric conditions, comparing the advantages of different FDTD algorithms10. In addition, as an international standard, Recommendation ITU-R P.684 is a widely accepted method for predicting LF sky wave propagation11. It is not difficult to see that the ionosphere, as the main part of LF sky wave propagation channel, has great research potential.
To further enhance the prediction accuracy of LF sky wave propagation, we first propose a model for predicting the critical frequency of the low ionospheric E layer (foE) in East Asia based on statistical machine learning (SML) and spatial cubic spline interpolation (SCSI) methods. Based on this model, we improved the LF sky wave propagation prediction method based on wave-hop theory which can be used to calculate the field strength at frequencies below about 150 kHz. The structure of this paper is arranged as follows: Section “Method description” presents the current status of predictions for low ionospheric parameters, explains the overall method to constructing the model in this paper, and provides a flowchart for the entire modeling process. Section “Modeling of ionospheric parameters” details the entire process of establishing the foE prediction model based on SML and SCSI, including the principles, data, modeling process, and prediction results, demonstrating the model’s advantages in foE prediction and laying the foundation for field strength prediction. In Section “Improved prediction of sky wave propagation”, we enhance the LF sky wave propagation prediction method, providing the predicted results of relevant parameters and field strength, and proving the model’s advantages in field strength prediction. Finally, we summarize the whole work and analyze the future development direction.
Method description
For LF propagation, the low ionosphere plays a crucial role in the propagation of LF sky waves12,13. In particular, the ionospheric parameters of the E layer are essential for accurate prediction of LF sky waves. Through the foE at the reflection position, we can determine the ionospheric reflection height, and then deduce the reflection coefficient, focusing factor, and other parameters. Therefore, accurately predicting foE has become very important in supporting the LF sky wave propagation prediction. The most commonly used model for foE prediction is the International Reference Ionosphere (IRI) model, which is the empirical standard model for predicting many ionospheric parameters. Currently, there are several versions such as IRI-2020, IRI-2016, and IRI-201214. In 2007, Yue et al. developed an empirical model based on the data given by the ionospheric observation station in Wuhan and compared it with the IRI model, which is of great significance for the prediction of foE in East Asia15. In 2011, Chen et al. summarized some variation patterns of foE in the past 20 years based on the measured data of stations in different locations, which helped develop regional empirical models16. In 2013, based on the historical measured data of foE, Somoye et al. predicted the variation pattern of foE in Singapore, Slough, and other places by using standard deviation quotient and average values17. In the same year, Wongcharoen et al. studied the influence of solar activity index and ionospheric index on foE based on the measured data of foE over Chumphon, Thailand, which has guiding significance for the prediction of foE in this region18. Also in 2013, Abe et al. studied the correlation between foE changes and solar activities and obtained the pattern that foE increases with the increase of solar intensity, etc., which is conducive to modeling the foE19. As an international standard, ITU-R P.1239 provides the calculation method of monthly median values of foE20.
However, at present, many scholars are focusing on the study of ionospheric parameter such as total electron content (TEC)21,22,23. Due to the importance of foE in LF propagation, accurate prediction of foE urgently needs to keep up with the development of the times. Therefore, we first established a higher accuracy model of foE, and then improved the propagation prediction method of field strength and relative parameters in this paper. As shown in Fig. 1, the total research process includes.
-
(1)
The first part focuses on the prediction of foE. This is accomplished through the use of SML and SCSI. The SML method involves training models on extensive datasets, which include various influencing factors such as solar activity, and historical ionospheric data. By analyzing these factors, the models based on SML can learn patterns and relationships that enable them to predict future foE with high accuracy. This method is advantageous because it allows for continuous improvement as more data becomes available, making the predictions increasingly reliable over time. Building upon the time prediction using SML, we further employ the SCSI for spatial reconstruction. The SCSI is a mathematical method used to create smooth curves through a series of data points, refining and smoothing input data to achieve higher accuracy.
-
(2)
The second part is dedicated to predicting the LF sky wave field strength. Building upon the foE prediction earlier, we can improve the ionospheric reflection height based on the wave-hop theory, thereby enhancing the prediction accuracy of LF sky wave field strength. Wave-hop theory can provide a framework for understanding the propagation of LF sky wave in the ionosphere.
By fusing these two parts, we can more accurately predict LF sky wave propagation. Subsequent sections will elaborate on the detailed process of model establishment as delineated in this paper.
Modeling of ionospheric parameters
Modeling idea
The SML uses data and statistical techniques to enhance system performance. One of its key advantages lies in its ease of understanding and explanation. Applied SML is a process of extracting data features, summarizing data models, discovering internal knowledge, and finally applying these to data24,25. Like other processes that employ machine learning methods to solve problems, it’s crucial to pay attention to the four key aspects: data, model, algorithm, and strategy. Figure 2 shows the four aspects of modeling in this paper.
To be specific, the application of SML methods to build prediction models should be expanded from the following four directions:
-
(1)
Modeling data: The application of machine learning to problem-solving involves the identification of data features, which need to be consistent and clear across the dataset. Typically, a portion of the acquired dataset is used for model training, while the remainder is allocated for model evaluation. The quality of the data directly influences the quality of the model.
-
(2)
Model selection: This is the key to using machine learning to solve problems. We need to fully understand the purpose of modeling, the principle and the characteristics of the data, and other factors. Only in this way can we choose the appropriate modelset. If the selected model is not suitable for the problem, it will cause poor modeling accuracy and high modeling costs. Principal Component Analysis (PCA) is a widely used machine learning modeling method. It constructs new features on the basis of the original data features to achieve data dimensionality reduction. The PCA method transforms the original features into a combination of linearly independent basis functions. In the early twentieth century, Person proposed the application of PCA method to establish empirical models26. Over half a century later, Dvinskikh introduced the PCA method into the establishment of empirical models of ionospheric parameters27, creating a precedent in this field. Since then, the PCA method has increasingly been utilized in the establishment of various ionospheric models.
-
(3)
Model training: Based on the selected model, the appropriate algorithm is used to analyze the training dataset. This involves identifying statistical patterns between the input and output variables, solving for the correlation coefficient, and determining the specific form of the model expression. The Least Squares (LS) algorithm is a most commonly used algorithm in the field of function fitting28, which is closest to the real case, and its purpose is to minimize the sum of squares of errors.
-
(4)
Model evaluation: Select appropriate statistics as criteria for evaluating the quality of the model. The smaller the difference between the predicted values based on the deterministic model and the data used for model evaluation, the better the model performance.
Taking into account the preceding descriptions, in order to obtain the predicted values of the monthly median foE of each station in November 2013 (comprising the monthly median hourly data for each month), the following details the process of establishing the foE prediction model based on the SML:
Step 1 Obtain foE measured data of four ionospheric observation stations in East Asia (Fig. 3 shows the location information of the four stations and the transmitter station) from 1998 to 2012 and the solar activity index data of the corresponding time.
Step 2 Based on the PCA29, choose the empirical orthogonal function set as the model set to analyze the time variation characteristics of a specific station, and establish the mapping relationship between foE (MHz) and the solar activity index. It is defined as follows:
where m is the month from 1 to 12, X is the solar activity index chosen later, k from 1 to 4 can represent the time variation characteristics of 12, 6, 3, and 1 month30, respectively. j is used to represent the variation characteristics of the solar activity index, which can be represented by functions of order 1, 2, 3 or higher. The optimal number is denoted as K and J. μ and η represent the correlation coefficients
Step 3 Determine the solar activity index X, select a set of solar activity index variation characteristics j and time variation characteristics k.
Step 4 Use the data from 1998 to 2009 as the training dataset for model training. Perform regression analysis using the LS algorithm based on the provided values of j and k. Calculate the correlation coefficients and establish the specific mapping relationship according to the following procedure:
where foE′ is the observation statistic value of foE, Z is the maximum observation count, T is transpose of the matrix, and B is defined as follows:
Step 5 Evaluate the model using the data from 2010 to 2012 by employing the Relative-Root-Mean-Square-Error (RRMSE) as the criterion for assessment. Suppose n1 is the total number of foE from 2010 to 2012, foEOBS and foEPRE represent the measured foE data and predicted values respectively, it is defined as follows:
Step 6 Traverse k and j until the RRMSE between the model predicted values and the measured data is minimum. Identify the optimal values for k and j, denoted as K and J, respectively. Subsequently, establish the specific mapping relationship for the station based on these determined values.
Step 7 Repeat steps 2–6, analyze other stations and determine the corresponding mapping relationships.
Step 8 Predict the monthly median foE in November 2013. Then calculate the Root-Mean-Square Error (RMSE) between the predictions and the measured monthly median values of foE in November 2013. Suppose n2 is the total number of measured monthly median values of foE in November 2013, it is defined as follows:
Modeling data
Following the modeling steps outlined in the above section, this section employs the Kokubunji station as an example (with other stations are similar) to illustrate the data required for modeling. All data come from official institutions and have been manually scaled by experienced specialists, ensuring their quality and reliability.
Firstly, we collect foE measured data31 from the Kokubunji station spanning from 1998 to 2012 with a sampling interval of 1 h. Then, the data are preprocessed to obtain the corresponding monthly median values of foE. However, as shown in Fig. 4a, there are instances of missing data. It is worth noting that during nighttime, foE values are relatively small, and the vertical incidence sounders used by the station have a measurement lower limit close to 1 MHz, making stable measurements challenging during this period. To ensure data reliability, we have excluded foE measured data below 1 MHz.
Secondly, considering the significant impact of solar activity on the ionosphere, wherein the SSN commonly denotes solar activity intensity, particularly within the framework of the IRI model for predicting foE, this paper incorporates SSN as a solar activity index influencing foE variation32,33. As shown in Fig. 4b, we collect SSN data34 from 1998 to 2012, and obtain the monthly median SSN data after preprocessing. Similar to the measured data of foE, there are also instances of missing SSN data.
Due to the small proportion of missing data in the total, it will not affect training based on SML. Meanwhile, the missing data mainly exists after 2010 and is not used for training.
Modeling process and results
Based on the Eq. (1), taking the Kokubunji station as an example, the model is iterated through different j and k to train the model and evaluate the possible model sets respectively. Figure 5 shows the RRMSE obtained from the evaluation of models under various combinations of j and k.
As shown in Fig. 5, it’s evident that the increase of j and k not only increases the computational complexity but RRMSE may not decrease. Although the computational complexity of j = 1 and k = 1 is the lowest, the corresponding RRMSE is obviously greater than that of other combinations of j and k. For practical engineering applications, although the RRMSE reaches its minimum when j = 2 and k = 2, the increase in computational complexity doesn’t significantly enhance accuracy compared to when j = 1 and k = 2. Consequently, J = 1 and K = 2 is chosen. Therefore, the mapping relationship between foE and SSN at the Kokubunji station can be expressed as follows:
Once the model is trained, the coefficients μk,j and ηk,j can be obtained through regression analysis utilizing the LS method. By applying absolute values and logarithmic transformations to the coefficients, Fig. 6 illustrates the distribution of the correlation coefficients of the foE prediction model for the Kokubunji station with Japan Standard Time (JST). From Fig. 6, it is evident that most coefficients are relatively small but not zero. Notably, μ0,0 has significant values and, like all coefficients except μ0,1, exhibits a trend of being higher during the day and lower at night. Substituting k = 0 and j = 0 into Eq. (6), μ0,0 serves as a constant term representing a baseline for foE prediction. Its prominent values indicate that this baseline plays a substantial role. Additionally, μ2,0 at JST = 0 differs significantly from other times. Larger coefficients for higher-order terms indicate the model’s ability to effectively capture local features.
Based on the above steps, the specific form of the foE prediction model for the Kokubunji station can be obtained and the corresponding prediction model for each station can be obtained by further analysis. Figure 7 illustrates the variations of predicted monthly median foE values with JST at Kokubunji, Okinawa, Wakkanai, and Yamagawa stations predicted by the ITU model and the PRO model (the model proposed in this paper) in November 2013. Nighttime foE values are typically small, making them difficult to observe at ionospheric observation stations, thus impacting the accuracy of models relying on measured data. On this basis, referring to the ITU model and the actual situation at the four stations, a foE nighttime steady-state value close to the ITU model is selected in the PRO model. It can be seen that the variation pattern of foE predicted by both models is largely consistent within a 24-h period, with foE remaining relatively stable at night, increasing post-sunrise, peaking at noon, and decreasing into the night again. The ITU model tends to predict slightly smaller foE values at night compared to this paper’s model, while daytime predictions align more closely between the two models. As previously mentioned, the measurement lower limit of vertical incidence sounders is 1 MHz. It can be observed that around JST = 17–18, the measured values and the predictions from the PRO model exceed 1 MHz, while the ITU model’s predictions fall below this threshold, showing significant discrepancies. This difference can be attributed to two factors: first, the ITU model is derived from global ionospheric observation data, giving it a global and steady-state characteristic; second, during sunset, the ionosphere tends to experience fluctuations, resulting in variations in foE. The PRO model, based on measured data, effectively captures these fluctuations (such as in Fig. 7c when JST = 17), providing more accurate predictions compared to the ITU model. Based on the existing measured data, the RMSE of the two models at the four stations can be calculated. Using Eq. (7), the proportion of RMSE improvement of the PRO model relative to the ITU model can be determined. Statistics analysis shows that the PRO model achieves approximately 66.16% (Kokubunji), 66.38% (Okinawa), 73.21% (Wakkanai), and 3.97% (Yamagawa) higher accuracy in foE prediction compared to the ITU model at the four stations, respectively. Compared to the other three stations, the improvement at the Yamagawa station is less significant. This is because the foE values themselves are quite low, and the limited measured data make the overall improvement more sensitive to minor prediction errors. Specifically, at this station, the prediction errors of the PRO model are slightly larger than those of the ITU model at JST = 11 and 12, leading to a smaller improvement ratio. However, overall, the PRO model demonstrates highly accurate predictions. Moreover, the yellow bars, representing errors between predicted values and measured data based on the PRO model, are generally lower than the green bars, indicating smaller errors compared to the ITU model. Overall, the PRO model significantly improves foE prediction accuracy compared to the ITU model.
Comparison of foE predictions of the two models at four stations. Black square: measurements (identified as Mea); Red dot: Predictions of the ITU model (identified as ITU); Blue triangle: Predictions of the PRO model (identified as PRO); Yellow bar: Error between measurements and ITU predictions (identified as |Mea-ITU|). Green bar: Error between measurements and PRO predictions (identified as |Mea-PRO|).
The SML method can obtain the predicted values of foE at specific observed stations. To obtain foE at any location, spatial reconstruction can be performed using interpolation methods. The commonly used interpolation methods include distance-weighted interpolation, SCSI, spherical harmonic functions, Kriging, etc.35 Among these, the SCSI method is particularly noteworthy. It approximates sample points by defining cubic polynomials in a piecewise manner, making it widely applicable in mathematical modeling. This method excels in constructing smooth curves, boasting high accuracy and stability36, hence its selection in this paper.
Here, we take the predicted foE values at four stations in November 2013 as sample values, and the foE of the target location can be reconstructed by the SCSI method. The SCSI is usually used for the analysis of one-dimensional data. However, in this paper, foE is predicted based on two-dimensional data based on longitude and latitude, so it is necessary to reduce the dimension of the data. Specifically, the Kriging method is based on Euclidean distance37, whereas this paper uses Manhattan distance38 for data dimensionality reduction39 to complement the cubic spline interpolation. Using the intersection of the 0° magnetic longitude line and the 0° magnetic latitude line as the origin, eastward and northward are considered positive directions. the specific definition is as follows:
where Mi is the Manhattan distance which can represent the location information of the i-th station, n + 1 is the number of ionospheric observation stations, and μi and ηi represent the magnetic latitude and magnetic longitude of the i-th station, respectively. Hence, the data point (μi, ηi, foEi) can be denoted as (Mi, foEi). Ionospheric observation stations at different locations can be regarded as different nodes on a curve, which can be divided into different intervals. Accordingly, in the interval [Mi, Mi+1], a cubic polynomial can be constructed and defined as follows:
where M represents the location information of the target location, and ai, bi, ci, di are four correlation coefficients40. The first and second derivatives of Eq. (9) are defined as follows:
Based on the interpolation condition, all known sample points must pass through the curve, so there is:
and then according to the continuity condition41, there are:
set \(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{f}_{o} E_{i}^{\prime \prime } (M_{i} )\) as Ui, there are:
By substituting Eqs. (12), (15), (16), and (17) into Eq. (13), a system of equations with a single unknown, U, is obtained, which can be handled and written as follows:
where K is the coefficient matrix of the equation and T stands for transpose of the matrix. Then we can solve for U based on the boundary conditions41. Based on the Not-A-Knot end conditions, K can be written as:
Based on the above steps, Ui can be solved and then we can obtain the coefficients of the interpolation function. According to the interval, we can select the corresponding SCSI function and get the corresponding foE.
To verify the effectiveness of the SCSI method, Table 1 compares the RRMSE between the foE predicted values and measured data over 24 h in November 2013 at four stations, based on reconstructions using the SCSI and Kriging methods, which is the traditional interpolation method of ionospheric parameters42. It can be seen that the RRMSE calculated using the SCSI interpolation method is consistently lower than those obtained using the Kriging interpolation method. Therefore, it is proved that the SCSI method can achieve more accurate spatial predictions in small areas, such as East Asia.
Figure 8a,b show the variation patterns of foE at JST = 12 in November 2013 in East Asia based on the ITU model and the PRO model, respectively. Darker colors denote smaller foE values. While the ITU model shows a decreasing trend in foE from south to north, the PRO model exhibits a decreasing trend from southwest to northeast. This difference mainly stems from the data sources used: the ITU model is built on data from 55 globally distributed ionospheric observation stations, whereas the PRO model relies on data from four ionospheric observation stations in East Asia, leading to location-related discrepancies. Figure 8c shows the error variation patterns of the two models. In the eastern part of the region, the predicted values of foE based on the PRO model are smaller than that of foE based on the ITU model, while the reverse is true in the western part of the region.
Improved prediction of sky wave propagation
Improved reflection height
With foE of the reflection points and wave-hop theory, we can calculate its corresponding ionospheric reflection height, which is defined as follows:
where G is a function related to foE11, f is the frequency.
Figure 9a,b show the variation patterns of ionospheric reflection height Rh at JST=12 in November 2013 in East Asia based on the ITU and PRO models. As shown in Fig. 9a,b, Rh based on the ITU model exhibits an increasing trend from south to north, whereas that based on the PRO model demonstrates an increasing trend from southwest to northeast. Consequently, Fig. 9c shows the error distribution between the two models. In the western part of the region, the Rh predicted values based on the PRO model is smaller than those based on the ITU model, whereas the opposite is true in the eastern part.
Enhanced wave-hop method
Having the ionospheric reflection height Rh, various parameters necessary for calculating the LF sky wave field strength can be calculated. Here, the calculation method of LF sky wave field strength is first given as11:
where s is the sky wave hop number, and pt is the transmitting power (kW). The other parameters are directly or indirectly affected by the ionospheric reflection height (Rh), which is described as follows:
-
(1)
φ is the elevation angle, which is defined as
$$\phi = \arctan \left\{ {\cot \left( {\frac{D/s}{{2r}}} \right) - \frac{{r\csc \left( {\frac{D/s}{{2r}}} \right)}}{{(r + R_{h} )}}} \right\}\quad {\text{rad}}$$(24)where D is the great circle distance (km) between the transmitter and receiver, and r is the effective Earth radius (km).
-
(2)
P is the sky wave propagation path length (km), which is defined as:
$$P = 2sr\sin \left( {\frac{D/s}{{2r}}} \right)\sec \left( {\varphi + \frac{D/s}{{2r}}} \right)\quad {\text{km}}$$(25) -
(3)
F is the ionospheric focusing factor43, which can be defined in different areas as follows:
$$F = \left\{ \begin{gathered} {\text{Lighting}}\;{\text{area}} \hfill \\ \left( {\frac{{r + R_{h} }}{r}} \right)\left[ {\frac{{2\sin \left( {\frac{\beta }{2s}} \right)}}{\sin \theta }} \right]^{\frac{1}{2}} \left[ {\frac{{r + R_{h} - r\cos \left( {\frac{\beta }{2s}} \right)}}{{\left( {r + R_{h} } \right)\cos \left( {\frac{\beta }{2s}} \right) - r}}} \right]^{\frac{1}{2}} \times \rho , \hfill \\ {\text{Caustic}}\;{\text{point}} \hfill \\ \left( {\frac{2s}{{\sin \beta_{t} }}} \right)^{\frac{1}{2}} \left[ {\left( {\frac{{r + R_{h} }}{r}} \right)^{2} - 1} \right]^{\frac{1}{2}} \times 1.3466\left( \frac{wr}{3} \right)^{\frac{1}{6}} e^{{\frac{i\pi }{{12}}}} , \hfill \\ Shadow\;area \hfill \\ \left( {\frac{2s}{{\sin \beta }}} \right)^{1/2} \left\{ {\left[ {\left( {\frac{{r + R_{h} }}{r}} \right)^{2} - 1} \right]^{\frac{1}{2}} + \frac{{\beta - \beta_{t} }}{2s}} \right\} \times 1.3466\left( \frac{wr}{3} \right)^{\frac{1}{6}} e^{{\frac{i\pi }{{12}}}} \hfill \\ \end{gathered} \right.$$(26)where β is the angular distance between the transmitter and the receiver. When the reflection wave is tangent to the ground, β is βt. w is the wavelength number (= 2π/λ), λ is the wavelength, ρ is the correction factor.
-
(4)
Ci is the ionospheric reflection coefficient, which is a function of the ionospheric incidence angle q, calculated by interpolation. q is also affected by the ionospheric reflection height and is defined as follows:
$$q = \arcsin \left( {r\cos (\varphi )/\left( {r + R_{h} } \right)} \right)\quad {\text{rad}}$$(27) -
(5)
At and Ar are antenna factors of the transmitting and receiving antenna, which are determined by the geographical environment and elevation angle φ. They are computed through interpolation.
-
(6)
Cg is the ground reflection coefficient, which is defined as follows:
$$C_{g} = \left\{ {\begin{array}{*{20}l} {\frac{{v^{2} \sin (\varphi ) - \sqrt {v^{2} - \cos^{2} (\varphi )} }}{{v^{2} \sin (\varphi ) + \sqrt {v^{2} - \cos^{2} (\varphi )} }}} \hfill & {\varphi > 0} \hfill \\ {E_{D} /E_{F} } \hfill & {\varphi \le 0} \hfill \\ \end{array} } \right.$$(28)where v is related to the ground reflection point environment, ED is the diffraction field strength, and EF is the free space field strength44.
With these parameters, we can predict the field strength using Eq. (23) and then calculate the RMSE between the predicted values and the measured field strength. Suppose n3 is the total number of field strength data, EOBS and EPRE represent the measured field strength and predicted field strength respectively, it is defined as follows:
Based on the comparison of ionospheric reflection height, we can further analyze the variation of relevant parameters based on the ITU and the PRO model in East Asia. Figure 10 shows the differences in relevant parameters over East Asia based on the two models (ITU and PRO). As can be seen from Fig. 10, the change in color represents a difference in the parameters based on the two models
-
(1)
As shown in Fig. 10a, the elevation angles calculated by the PRO model increase in the eastern part, whereas those decrease in the western part. The difference is greatest around the transmitter and in the northeast corner of this region.
-
(2)
As shown in Fig. 10b, the variation patterns of the sky wave propagation path length are similar to those of the elevation angle. The difference is also greatest around the transmitter and in the northeast corner of this region.
-
(3)
As shown in Fig. 10c, the ionospheric focusing factor calculated by the PRO model increases for most links in this region. The difference is greatest in the northeast corner of this region.
-
(4)
As shown in Fig. 10d, the ionospheric reflection coefficient decreases around the transmitter according to the PRO model. The difference is greatest in the western part of the transmitter.
-
(5)
As shown in Fig. 10e,f, the antenna factors of both the transmitter and receiver have a significant reduction in predicted results based on the PRO model, particularly in the western part of the region. The difference is greatest in the western part and northeast corner of this region.
Differences of relevant parameters in East Asia based on the two models. (a) the difference in φ between the ITU model and the PRO model. (b) the difference in P between the ITU model and the PRO model. (c) the difference in F between the ITU model and the PRO model. (d) the difference in Ci between the ITU model and the PRO model. (e) the difference in At between the ITU model and the PRO model. (f) the difference in Ar between the ITU model and the PRO model.
Improved prediction of field strength
Figure 11 shows the map of the transmitter and receivers (Mapping software: The MathWorks Inc. (2024). MATLAB Online. Available: https://matlab.mathworks.com). We select this region due to the regional nature of the ionospheric data and the limited data available. The transmitter is at (37.4°N, 140.8°E). The receivers are arranged in a northeast-to-southwest direction over the Pacific Ocean, with the nearest receiver approximately 220 km from the transmitter and the farthest receiver about 2000 km away. We have a total of 55 different receiver stations. One of the receivers, located at (35.6°N, 139.5°E), is a fixed receiver and it measured field strength variations over 24 h. Figure 12 shows the spatiotemporal distribution of the measured and predicted field strength of 40 kHz LF sky waves within a range of 2000 km based on the ITU model and the PRO model, where the x-axis represents the propagation distance and the y-axis represents the JST. The larger the bubble, the lighter the color, and the greater the field strength. Upon analysis, the RMSE between the predicted field strength and the measured field strength of 78 point-to-point links using the ITU model is about 6.17 dB, while the RMSE between the predicted field strength and the measured field strength based on the model in this paper is approximately 5.79 dB. This represents a notable enhancement in accuracy of about 6.16 % compared to the ITU model, signifying significant improvement. It can be seen from the above section that the change in ionospheric reflection height leads to the change in relevant parameters. Combined with the wave-hop theory, it becomes apparent that alterations in one parameter influence others, rendering it difficult to isolate the impact of a single parameter change on field strength. Notably, all relevant parameters are either directly or indirectly linked to the elevation angle ψ, and even minor fluctuations in ψ induce changes in other parameters, consequently affecting field strength. Hence, the primary reason behind the enhanced accuracy in field strength prediction lies in the fluctuations of ψ triggered by variations in foE.
In Fig. 13, the overall error statistics are presented. The error distribution based on the two models conforms to a normal distribution. In Fig. 13a, there are 21 data points within the error range of − 0.5 to 2 dB, accounting for the largest proportion. Similarly, Fig. 13b depicts 20 data points within the error range of − 2.6 to − 0.4 dB, accounting for the largest proportion.
Conclusion
This paper proposes a prediction model for foE in East Asia utilizing the SML and SCSI methods. Comparing the ITU model and measured data, the proposed model can obtain reasonable variation patterns of foE and significantly improve the prediction accuracy of foE. Therefore, we believe that the SCSI method combined with the Manhattan distance has better prediction accuracy in a small range. Based on foE, the relatively improved parameters such as ionospheric reflection height, reflection coefficient, focusing factor, etc., are employed to improve the prediction of LF sky wave propagation, resulting in approximately 6.16 % improvement in accuracy for field strength prediction of LF sky wave propagation within 2000 km at 40 kHz compared to the ITU model.
Although existing established foE prediction methods have achieved a certain level of accuracy, validation reveals that prediction errors persist in some regions. To address this, the proposed model effectively captures the variation characteristics of foE in the East Asia region, thereby improving prediction accuracy and providing a novel approach for localized predictions.
In the future, a better mapping relationship between foE and solar activity index along with other parameters can be established by augmenting the sample size and utilizing prediction methods for other ionospheric parameters.
Data availability
The data supporting the main findings of this study are available in Supplementary Information. Source data are provided as a Source Data file. Source data are provided with this paper.
References
Nichols, S. & Melia, A. Understanding LF and HF Propagation (2010).
Appleton, E. V. The propagation of radio waves over the earth. Nature 115(2889), 382–382. https://doi.org/10.1038/115382d0 (1925).
Booker, H. G. Propagation of wave-packets incident obliquely upon a stratified doubly refracting ionosphere. Philos. Trans. R. Soc. A 237, 411–451 (1938).
Budden, K. G. & Chang, H. Radio waves in the ionosphere. Phys. Today 15(2), 44–46. https://doi.org/10.1063/1.3058017 (1962).
Ferguson, J. A., Morfitt, D. G. & Hansen, P. M. Statistical model for low-frequency propagation. Radio Sci. 20, 528–534 (1985).
Hayakawa, M. & Molchanov, O. A. Ionospheric perturbation associated with earthquakes, as revealed from sub-ionospheric VLF/LF propagation. In Asia-Pacific Conference on Environmental Electromagnetics 1–4 (2003).
Berenger, J.-P. Use of the FDTD method for computing the VLF-LF propagation in the Earth-Ionosphere waveguide. In 2006 IEEE Antennas and Propagation Society International Symposium 3785–3788 (2006).
Andre, L., de Sa, A. & Marshall, R. A. Lightning distance estimation using LF lightning radio signals via analytical and machine-learned models. IEEE Trans. Geosci. Remote Sens. 58(8), 5892–5907. https://doi.org/10.1109/TGRS.2020.2972153 (2020).
Gao, J., Xi, X. & Zhou, L. Effects of different lower ionospheric models on the calculation of low-frequency sky waves. In 2023 International Applied Computational Electromagnetics Society Symposium (ACES-China) 1–2 (2023).
Hu, X., Zhou, L., Mu, Z. & Guo, W. Prediction of stochastic properties of low-frequency sky wave in the earth-ionosphere waveguide based on the S-FDTD method. In 2023 International Applied Computational Electromagnetics Society Symposium (ACES-China) 1–3 (2023).
Recommendation ITU-R P. 684–6: Prediction of field strength at frequencies below about 150 kHz. (International Telecommunication Union, 2012).
Gao, J., Yuan, Y., Zhao, Z. & Xi, X. Lower ionospheric remote sensing using low-frequency sky waves. IEEE Geosci. Remote Sens. Lett 21, 1–5 (2024).
Zhen, W., Ou, M., Zhu, Q., Dong, X. & Liu, D. Review on ionospheric sounding and modeling. Chin. J. Radio Sci. 38, 625–645 (2023).
Committee on Space Research (COSPAR) and the International Union of Radio Science (URSI). Availability. Accessed on: 15 April 2024. [Online]. Available: https://irimodel.org/
Yue, X., Wan, W., Liu, L. & Ning, B. An empirical model of ionospheric foE over Wuhan. Earth, Planets Space 58(3), 323–330. https://doi.org/10.1186/BF03351928 (2006).
Chen, M., Li, C., Ding, L. & Gu, B. Two decades variation about ionospheric parameters foF2, foE and M3000F2. In 2011 International Conference on Multimedia Technology 2769–2772 (2011).
Somoye, E. O., Onori, E. O. & Akala, A. O. Comparison of foE and M(3000)F2 variability at Ibadan, Singapore and Slough. J. Atmos. Sol.-Terr Phys. 92, 18–22 (2013).
Wongcharoen, P., Kenpankho, P., Sepsirisuk, K., Supnithi, P., Noppanakeepong, S., Lerkvaranyu, S., Tsugawa, T. & Nagatsuma, T. The variation of critical frequency of E layer over Chumphon, Thailand. In 2013 IEEE International Conference on Space Science and Communication (IconSpace) 198–201 (2013).
Abe, A., Rabiu, A. & Adeniyi, J. O. Variability of foE in the equatorial ionosphere with solar activity. Adv. Space Res. 51, 69–75 (2013).
Recommendation ITU-R P.1239–3: ITU-R Reference Ionospheric Characteristics, (International Telecommunication Union, 2012).
Sivakrishna, K., Venkata Ratnam, D. & Sivavaraprasad, G. A bidirectional deep-learning algorithm to forecast regional ionospheric TEC maps. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 15, 4531–4543 (2022).
Nigusie, A., Tebabal, A. & Feyissa, F. Machine learning based storm time modeling of ionospheric vertical total electron content over Ethiopia. Sci. Rep. 14, 19293 (2024).
Sulungu, E. D. Performance of IRI 2016 model in predicting total electron content (TEC) compared with GPS-TEC over East Africa during 2019–2021. Sci. Rep. 14, 10010 (2024).
Masashi, S. Introduction to Statistical Machine Learning (China Machine Press, 2016).
Cuevas, E., Galvez, J., Avalos, O. & Fernando, W. Fundamental machine learning methods. In Machine Learning and Metaheuristic Computation (eds Cuevas, E. et al.) 59–143 (Wiley, 2024). https://doi.org/10.1002/9781394229680.ch3.
Person, K. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dubl. Phil. Mag. 2, 559–572 (1901).
Dvinskikh, N. I. Expansion of ionospheric characteristics fields in empirical orthogonal functions. Adv. Space Res. 8, 179–187 (1988).
Zhao, L. & Ding, L. Least squares approximations to lognormal sum distributions. IEEE Trans. Veh. Technol. 56, 991–997 (2007).
Bao, D., Wang, Y. & Li, S. Dynamic graph embedding PCA to extract spatio–temporal information for fault detection. IEEE Trans. Ind. Inform. https://doi.org/10.1109/TII.2024.3485805 (2024).
Qian, L., Burns, A. G., Solomon, S. C. & Wang, W. Annual/semiannual variation of the ionosphere. Geophys. Res. Lett. 40, 1928–1933 (2013).
World Data Center for Ionosphere and Space Weather. Ionospheric Data in Japan Dataset: 1998–2013. National Institute of Information and Communications Technology. https://wdc.nict.go.jp/IONO/wdc/index.html.
Moon, S., Kim, Y. H., Kim, J. H., Kwak, Y. S. & Yoon, J. Y. Forecasting the ionospheric F2 parameters over Jeju station (33.43°N, 126.30°E) by using long short-term memory. J. Korean Phys. Soc. 77, 1265–1273 (2020).
Dai, L., Ding, Z., Yang, S. & Tang, Z. Preliminary study on ionospheric foE empirical model in middle and low latitudes of China. Chin. J. Radio Sci 36, 74–80 (2021).
World Data Center for Solar Influences Data Analysis Center. Sunspot Number data Dataset. (Royal Observatory of Belgium, Brussels). https://www.sidc.be/SILSO/datafiles.
Wang, J. et al. A regional model for the prediction of M(3000)F2 over East Asia. Adv. Space Res. 8, 2036–2051 (2020).
Tao, T. Stability Analysis and Application of Cubic Spline Interpolation Function. M.S. thesis (Chengdu University of Technology, 2021).
Mukesh, R. et al. Prediction of ionospheric vertical total electron content from GPS data using ordinary kriging-based surrogate model. Astrophys. Space Sci. 364, 15 (2019).
Haviluddin, Iqbal, M., Putra, G. M., Puspitasari, N., Setyadi, H. J., Dwiyanto, F. A., Wibawa, A. P. & Alfred, R. A Performance Comparison of Euclidean, Manhattan and Minkowski Distances in K-Means Clustering. In 2020 6th International Conference on Science in Information Technology (ICSITech), 184–188 (2020).
Wang, J., Li, X., Dong, X., Li, J. & Li, Y. A data dimensionality reduction algorithm for aerospace telemetry data mining. In 2024 9th International Conference on Intelligent Computing and Signal Processing (ICSP) 35–39 (2024).
Matiu-Iovan, L. A cubic spline interpolation algorithm implemented on a system with digital signal processor. In 2013 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA) 343–348 (2013).
Xu, X. & Zhong, T. Construction and realization of cubic spline interpolation function. Ord. Industry Auto (in Chinese), 76–78 (2006).
Stanisκawska, I., Juchnikowski, G. & Cander, L. R. Kriging method for instantaneous mapping at low and equatorial latitudes. Adv. Space Res. 18(6), 217–220. https://doi.org/10.1016/0273-1177(95)00927-2 (1996).
Pan, W. LF VLF ELF Wave Propagation (2004).
Recommendation ITU-R P.368–10: Ground-wave propagation prediction method for frequencies between 10 kHz and 30 MHz. (International Telecommunication Union, 2022)
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 62031008 and by the State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information Systems under Grant CEMEE2022G0201.
Author information
Authors and Affiliations
Contributions
Conceptualization, J.W. and C.Y.; methodology, J.W., C.D. and Q.Y.; software, J.W., C.D. and C.Y.; validation, C.D., Q.Y. and C.Y.; formal analysis, J.W., C.D. and Q.Y.; investigation, J.W. and C.D.; resources, J.W., C.D. and Q.Y.; data curation, J.W., C.D. and Q.Y.; writing—original draft preparation, J.W. and C.D.; writing—review and editing, J.W., C.D., Q.Y. and C.Y.; visualization, J.W. and C.D.; supervision, J.W. and C.D.; project administration, J.W.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wang, J., Duan, C., Yu, Q. et al. Predicting low ionospheric parameters and low frequency sky wave propagation strength using machine learning. Sci Rep 15, 4079 (2025). https://doi.org/10.1038/s41598-025-87930-8
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-025-87930-8