Introduction

Exploring the structure, function, and interactions of proteins is crucial for understanding the molecular basis of diseases, thereby facilitating the development of precise and effective therapeutic interventions. NMR, X-ray crystallography, and cryo-electron microscopy (cryo-EM) are powerful techniques commonly applied in protein research. Compared to X-ray crystallography and cryo-EM, multidimensional (nD) NMR offers the unique advantage of providing structural, dynamic, and interaction information on proteins in solution under physiological conditions, without disrupting or modifying endogenous conformations1,2. Unfortunately, obtaining nD NMR spectra is usually time-consuming: the sampling time grows exponentially with the number of indirect dimensions. Non-uniform sampling (NUS) greatly shortens the sampling time by undersampling the indirect dimensions of the NMR spectrum and is widely applied in the rapid acquisition of NMR spectra. The acquired NUS data require reconstruction algorithms to obtain high-quality NMR spectra, so developing robust reconstruction algorithms is imperative. For example, Orekhov et al.3,4,5 demonstrated multi-dimensional decomposition (MDD) and compressed sensing (CS) as effective tools for reconstructing NUS data. Other traditional reconstruction algorithms, such as SMILE6 and hmsIST7, have also been proposed. However, these traditional algorithms require manual adjustment of key parameters, which can lead to suboptimal performance. In addition, some traditional algorithms (like MDD) have long computation times. Fortunately, deep learning (DL) has witnessed successful applications in various fields like image processing8,9 and speech recognition10,11. Similarly, DL has also achieved great success in the field of NMR12. For instance, Li et al.13 introduced DEEP Picker, which accomplishes peak picking in complicated NMR spectra. 
Schmid et al.14 also introduced a DL-based approach to accomplish peak picking in 1D NMR spectra. Klukowski et al.15 proposed ARTINA which enables the completely automated analysis of protein NMR data within hours after completing the measurements. Wu et al.16 proposed DeNoising Unet (DN-Unet) to suppress noise in NMR spectra. Karunanithy et al.17 demonstrated that by processing NMR spectra of protonated, uniformly 13C-labelled samples with FID-Net, the resulting spectra are comparable to typical deuterated methyl-TROSY spectra. In addition, DL has also made progress in the NUS reconstruction of NMR spectra. For example, frequency domain reconstruction networks based on DenseNet18 and HRN19 have been proposed by Qu et al.20 and Luo et al.21, respectively. In addition, Hansen et al.22,23 introduced an LSTM-based reconstruction network24 and WaveNet-based FID-Net25 for the time domain reconstruction of NMR spectra. However, these DL-based reconstruction methods focus only on single domains (either time domain or frequency domain) with each having its advantages and challenges. Specifically, the advantage of the frequency-domain reconstruction method is that it can use the sparsity of the NMR spectra to achieve high-quality reconstruction even with few sampling points. The shortcoming is that the artifacts generated from high-amplitude peaks may even appear stronger than the weak peaks, which leads to the occurrence of peak missing and artifacts in the reconstructed spectra. Time-domain reconstruction directly processes sampled time-domain data, avoiding the interference of under-sampled artifacts caused by strong peaks on the reconstruction of weak peaks, and thus benefiting effective reconstruction. However, time domain reconstruction predicts the unsampled data based on the sampled points, requiring a sufficient number of sampled points to ensure reconstruction accuracy. 
When there are few sampled points, the effectiveness of time domain reconstruction is seldom as high as frequency domain reconstruction. Therefore, the combination of information from both domains has the potential to produce better results.

In addition, a long-standing and important issue remains unresolved: users cannot assess the quality of the reconstructed spectra without fully sampled spectra, which are not acquired in practical applications. Indicators commonly employed for quality assessment in algorithm development studies include the Pearson correlation coefficient (PCC), root mean square deviation (RMSD), and relative L2 norm error (RLNE). These indicators all require the fully sampled spectrum as a reference. However, when the NUS technique is needed to shorten the experimental time, fully sampled spectra are difficult and often impossible to obtain. Therefore, full-reference assessment is impractical in real-world applications. Without quality metrics, the algorithm user worries: Can I trust this reconstructed result?
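To make the dependence on a reference concrete, the following is a minimal NumPy sketch of the three full-reference indicators, assuming their usual definitions (RLNE as the L2 norm of the difference divided by the L2 norm of the reference). The function names are illustrative and are not taken from the JTF-Net code; every function needs the fully sampled spectrum as its second argument.

```python
import numpy as np

def rlne(reconstructed, reference):
    """Relative L2 norm error: ||x_rec - x_ref||_2 / ||x_ref||_2 (assumed definition)."""
    rec = np.ravel(np.asarray(reconstructed, dtype=float))
    ref = np.ravel(np.asarray(reference, dtype=float))
    return np.linalg.norm(rec - ref) / np.linalg.norm(ref)

def rmsd(reconstructed, reference):
    """Root mean square deviation over all spectral points."""
    rec = np.ravel(np.asarray(reconstructed, dtype=float))
    ref = np.ravel(np.asarray(reference, dtype=float))
    return np.sqrt(np.mean((rec - ref) ** 2))

def pcc(reconstructed, reference):
    """Pearson correlation coefficient between the two spectra."""
    rec = np.ravel(np.asarray(reconstructed, dtype=float))
    ref = np.ravel(np.asarray(reference, dtype=float))
    return np.corrcoef(rec, ref)[0, 1]

# Toy example: a "reconstruction" that overestimates every intensity by 10%.
reference = np.array([[3.0, 4.0], [0.0, 12.0]])
reconstructed = 1.1 * reference
print(rlne(reconstructed, reference), rmsd(reconstructed, reference), pcc(reconstructed, reference))
```

None of these functions can be evaluated when only the NUS data are available, which is the gap a reference-free metric has to fill.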

In this paper, a joint time-frequency domain deep learning network called JTF-Net is proposed. Overcoming the limitations of single-domain DL reconstruction, JTF-Net shows excellent performance in the reconstruction of nD protein NMR spectra. More importantly, a reference-free quality assessment metric for reconstructed NMR spectra is proposed, denoted as the reconstruction quality assurance ratio (REQUIRER). Using simulated spectra, a quality space that reflects the correlation between uncertainties and the RLNE of the reconstructed spectra is constructed. The quality threshold, defined as a specific RLNE value used to assess potential peak loss or artifacts in the reconstructed spectra, was determined through testing on simulated spectra. The quality thresholds were set to 0.35 for 2D spectra and 0.55 for 3D spectra. Based on this quality space and these quality thresholds, the REQUIRER metric can be obtained after the reconstruction of real experimental spectra, providing users of JTF-Net with a quality assessment of the reconstructed spectra without the fully sampled spectra.

Results and discussion

Comparison with current reconstruction algorithms

In this section, the performance of JTF-Net is demonstrated by reconstructing the 2D 15N-1H HSQC (Heteronuclear Single Quantum Coherence) spectra of GB1 (601 × 170) and T4L L99A protein22 (335 × 256), with system sizes of 56 and 172, respectively. The reconstructed results were compared with the traditional algorithms SMILE6 and hmsIST7, and with the DL reconstruction algorithms FID-Net23 (time domain) and EDHRN21 (frequency domain). All spectra were undersampled with 12.5% Poisson-Gap sampling schemes. Figure 1 shows the results for the 2D 15N-1H HSQC spectrum of protein GB1. The reconstructed spectra from all methods show no peak loss or artifact peaks, and the spectrum reconstructed by JTF-Net has the lowest RLNE, which means that it is closest to the real fully sampled spectrum. The reconstruction of protein T4L L99A is challenging because of the large number of spectral peaks and the presence of some peaks with low intensity. Figure 2 shows the results for the 2D 15N-1H HSQC spectra of protein T4L L99A, where the red and black boxes mark artifact peaks and peak loss, respectively. There are numerous peak losses when employing single-domain DL algorithms and SMILE, while hmsIST does better, with only a few peak losses. In terms of artifact peaks, both the traditional and single-domain DL algorithms show a few. Notably, JTF-Net achieves the lowest RLNE value without any peak loss or artifact peaks. Although the spectra of the two samples have different sizes and sampling schemes, JTF-Net does not require re-training and still achieves high-quality reconstruction with one trained model, reflecting its universality.

Fig. 1: 2D 15N-1H HSQC (Heteronuclear Single Quantum Coherence) spectra of protein GB1.

a The fully sampled spectrum. b–f The reconstruction results of hmsIST7, SMILE6, FID-Net23, EDHRN21, and JTF-Net (Joint time-frequency domain deep learning network), respectively. All reconstruction results are displayed using contour plots at the same contour level (contour level = 6). The sampling schemes are the same 12.5% Poisson-Gap sampling for all methods. RLNE: Relative L2 norm error.

Fig. 2: 2D 15N-1H HSQC (Heteronuclear Single Quantum Coherence) spectra of protein T4L L99A.

a The fully sampled spectrum. b–f The reconstruction results of hmsIST7, SMILE6, FID-Net23, EDHRN21, and JTF-Net, respectively. The black and red boxes indicate peak loss and artifact peaks, respectively. All reconstruction results are displayed using contour plots at the same contour level (contour level = 21). The sampling schemes are the same 12.5% Poisson-Gap sampling for all methods. RLNE: Relative L2 norm error.

To demonstrate the robustness of JTF-Net to different sampling rates, JTF-Net was used to reconstruct the HSQC spectra of T4L L99A protein, which were undersampled by Poisson-Gap sampling schemes with sampling rates of 10%, 12.5%, 15%, 17.5%, and 20%. Each sampling rate includes 10 different sampling schemes. Figure 3 shows that JTF-Net consistently achieves the lowest RLNEs across different sampling rates. The hmsIST method also performs well, but there is still a significant gap between hmsIST and JTF-Net at sampling rates of 10% and 12.5%; this gap decreases as the sampling rate increases. In addition, JTF-Net always has a larger RLNE advantage over EDHRN, SMILE, and FID-Net. Thus, JTF-Net provides stable and superior performance across different sampling rates, further showcasing its universality.

Fig. 3: Comparison of RLNE (relative L2 norm error) among different methods at different sampling rates.

The compared methods are JTF-Net, hmsIST7, SMILE6, FID-Net23, and EDHRN21. Sampling rates are 10%, 12.5%, 15%, 17.5%, and 20%. Each method was tested with n = 10 different Poisson-Gap sampling schemes at each sampling rate. Data are presented as mean ± standard deviation (SD) with overlaid individual data points. Source data are provided as a Source Data file.

In addition, JTF-Net can also be applied to reconstructing 3D spectra. The HNCO spectrum of protein Azurin, with a system size of 128, was undersampled using an 8% Poisson-Gap sampling scheme and used to demonstrate the performance of JTF-Net. As shown in Fig. 4, the reconstructed spectra from all methods (FID-Net cannot reconstruct 3D spectra) show no peak loss or artifact peaks, but JTF-Net achieves the best RLNE.

Fig. 4: 3D HNCO spectra of protein Azurin and the CN plane projections of HNCO.

The HNCO experiment transfers magnetisation from 1H to 15N and then selectively to the carbonyl 13C via the 15NH–13CO J-coupling. a The fully sampled spectrum. b–e The reconstruction results of hmsIST7, SMILE6, EDHRN21, and JTF-Net, respectively. All reconstruction results are displayed using contour plots at the same contour level (contour level = 12). The sampling scheme is the same 8% Poisson-Gap sampling for all methods. RLNE: Relative L2 norm error.

In the above 2D and 3D spectral reconstructions, JTF-Net achieves the best RLNE. In particular, compared with single-domain reconstruction algorithms such as FID-Net and EDHRN, JTF-Net demonstrates a clear advantage, which arises from its combination of time- and frequency-domain information during the reconstruction process. Data consistency ensures the efficient utilization of the sampled information, preventing excessive alterations to the sampled data during the reconstruction process. In addition, compared with the traditional methods SMILE and hmsIST, JTF-Net is fully automatic and does not require key parameters to be set manually. In terms of reconstruction time, JTF-Net achieves the shortest single reconstruction times among these methods for both the 2D 15N-1H HSQC spectra of the GB1 and T4L L99A proteins and the 3D HNCO spectra of the Azurin protein, with details available in Supplementary Part 9. Data preprocessing, such as apodization, baseline correction, zero-filling, and phase correction, can affect the reconstruction results, as shown in Supplementary Part 10. Based on the testing results in the Supplementary Information, it is recommended to perform phase correction, apodization, and baseline correction on the data before reconstruction. Among the apodization functions, the EM window provided the best results on the HSQC spectrum of the GB1 protein. This does not mean that the EM window is the best for all data, but it is advised to try the EM window first. Additionally, if the number of points in the indirect dimension is not divisible by 8, zero-filling to the next multiple of 8 is recommended. The number of residues also affects reconstruction quality. Test results show that as the number of residues increases, the reconstruction quality of JTF-Net decreases, but REQUIRER can still provide good assessments of the quality of the reconstructed spectra. 
In addition, the reconstruction of spectra of intrinsically disordered proteins and of spectra with strong peak overlap is relatively difficult for JTF-Net. However, the test results show that JTF-Net remains effective in these two cases. Further details are available in Supplementary Part 11.
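The zero-filling recommendation above can be sketched as follows. This is a minimal NumPy illustration, not the authors' preprocessing code: the helper name `zero_fill_to_multiple` is hypothetical, and it assumes that the padding is appended at the end of the indirect dimension up to the next multiple of 8.

```python
import numpy as np

def zero_fill_to_multiple(fid, multiple=8, axis=0):
    """Append zeros along `axis` until its length is a multiple of `multiple`."""
    fid = np.asarray(fid)
    n = fid.shape[axis]
    target = -(-n // multiple) * multiple  # ceiling division, then scale back up
    pad_width = [(0, 0)] * fid.ndim
    pad_width[axis] = (0, target - n)      # pad only at the end of the chosen axis
    return np.pad(fid, pad_width, mode="constant")

# Example: 170 indirect points are padded to 176; 256 points are left unchanged.
fid_a = np.ones((170, 16), dtype=complex)
fid_b = np.ones((256, 16), dtype=complex)
print(zero_fill_to_multiple(fid_a).shape, zero_fill_to_multiple(fid_b).shape)
```

Lengths that are already divisible by 8 pass through untouched, so the helper can be applied unconditionally before reconstruction.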

Reference-free quality assessment of reconstruction spectra by REQUIRER

To validate the feasibility of the REQUIRER metric, it was tested on many 2D and 3D protein spectra. A spectrum with high SNR is expected to be reconstructed with high quality and low RLNE, and a spectrum with low SNR with low quality and high RLNE. Here, Gaussian noise at different levels was artificially added to the fully sampled spectra, producing a series of spectra with different SNRs. For 2D data, the 15N-1H HSQC spectra of the GB1 and T4L L99A proteins were used for testing. The evaluation result using the REQUIRER metric for the GB1 protein at different SNR levels is shown in Fig. 5. The REQUIRER values of (a–d) are consistently high (above 79%), indicating a high likelihood that the RLNE will be below 0.35 (2D quality threshold). Indeed, the actual RLNEs for these spectra are all below 0.35. There is one artifact peak in Fig. 5e; its REQUIRER value is lower than those of (a–d), and its actual RLNE is larger than those of (a–d). The reconstruction result (f) shows several artifact peaks, and the REQUIRER of (f) is lower than 50%, with the actual RLNE of (f) above 0.35. The T4L L99A protein was also used to validate REQUIRER at different SNR levels. The results are shown in Figure S10 in Supplementary Part 12, and REQUIRER consistently provided good assessments of the quality of the reconstructed spectra.

Fig. 5: The reconstructed 2D 15N-1H HSQC (Heteronuclear Single Quantum Coherence) spectra of protein GB1 with different signal-to-noise ratios (SNR) by JTF-Net.

Their REQUIRER (reconstruction quality assurance ratio) metrics are presented. The real RLNEs (relative L2 norm errors) for the reconstructed spectra of (a–f) are 0.0803, 0.1199, 0.1406, 0.2022, 0.3670, and 0.5693, respectively. The red box indicates artifact peaks.

In addition, a higher sampling rate is generally expected to lead to higher reconstruction quality and lower RLNEs, and a lower sampling rate to lower reconstruction quality and higher RLNEs. The fully sampled 15N-1H HSQC spectra of the GB1 and T4L L99A proteins were undersampled using different sampling schemes with different sampling rates (3%–20%). As shown in Fig. 6, the reconstructed spectra at sampling rates of 20%, 15%, and 12.5% exhibit no peak loss or artifact peaks. The REQUIRER values for these reconstructed spectra of protein T4L L99A are remarkably high (REQUIRER > 91%), and the actual RLNE values are significantly below the quality threshold. When the sampling rate is reduced to 8%, the reconstructed spectrum exhibits a few lost peaks and artifact peaks, and its REQUIRER is lower than those of (a–c). The actual RLNE value is larger than the quality threshold. As the sampling rate decreases further, the reconstruction results in Fig. 6e–f exhibit noticeable peak loss and artifact peaks compared to the fully sampled spectrum. The REQUIRER of (e–f) also decreases significantly compared to (d). Indeed, the actual RLNEs of (e–f) are all above 0.35 (2D quality threshold). In Supplementary Part 13, REQUIRER also provides a good quality assessment of the reconstruction results for the GB1 protein at different sampling rates.

Fig. 6: The reconstructed 2D 15N-1H HSQC (Heteronuclear Single Quantum Coherence) spectra of protein T4L L99A with different sampling rates (SR) by JTF-Net.

Their REQUIRER (reconstruction quality assurance ratio) metrics are presented. The real RLNEs (relative L2 norm errors) for the reconstructed spectra of (a–f) are 0.1323, 0.1574, 0.1694, 0.3701, 0.4624, and 0.8507, respectively. The black boxes and red boxes indicate peak loss and artifact peaks, respectively.

Additionally, the 3D HNCO spectrum (732 × 60 × 60) of the Azurin protein was also used for testing. Artificial noise was added to the fully sampled spectra to gradually reduce their SNR, and an 8% Poisson-Gap sampling scheme was used. The evaluation results are shown in Fig. 7. The REQUIRER values of (a–c) are consistently high (above 82%), indicating a high likelihood that the RLNE will be below 0.55 (3D quality threshold), and the actual RLNE values are indeed lower than the 3D quality threshold. The REQUIRER values of (d,e) are lower than those of (a–c), and the actual RLNEs of (d,e) are larger than those of (a–c). When the reconstruction quality is poor (Fig. 7f), REQUIRER is low.

Fig. 7: The projections on 13C-15N planes of the 3D HNCO spectra of protein Azurin reconstructed by JTF-Net with different signal-to-noise ratios (SNR).

Their REQUIRER (reconstruction quality assurance ratio) metrics are indicated. The real RLNEs (relative L2 norm errors) for the reconstructed spectra of (a–f) are 0.3908, 0.4168, 0.5085, 0.5681, 0.6257, and 0.6781, respectively. The black and red boxes indicate peak loss and artifact peaks, respectively.

To further validate the universality of REQUIRER, four 2D spectra and nine 3D spectra from the Biological Magnetic Resonance Bank (BMRB) were also tested. Different reconstruction methods were applied to these different types of spectra, and the results are displayed in Table 1. The test results show that JTF-Net achieves the best RLNE for all spectra, showing a clear advantage over the other methods. In addition, SNR has a great effect on the reconstruction results. For the HNCO spectra of At1g24000.1, A3DK08, and O64736, which have high SNR, high reconstruction quality was achieved. In contrast, the HNCO spectrum of the probable 30S ribosomal protein showed a poor reconstruction result due to its low SNR. Similarly, the HNCACB spectrum of thiamine triphosphatase showed better reconstruction quality than that of BT_p548217 due to its higher SNR.

Table 1 Comparison of reconstruction indicators of different reconstruction methods

In terms of REQUIRER, the reconstructed HMQC spectrum of protein Yqal has a REQUIRER value that is not very high, with its actual RLNE slightly higher than the quality threshold. The reconstructed HSQC spectrum of BH09830 has a low REQUIRER value, with the actual RLNE significantly exceeding the quality threshold. For 3D spectra, the reconstructed HNCO spectra of proteins At1g24000.1, A3DK08, and O64736 have high REQUIRER values, and their actual RLNEs are all smaller than the quality threshold, while the reconstructed HNCO spectrum of the probable 30S ribosomal protein has a low REQUIRER value, with its actual RLNE higher than the quality threshold. The reconstructed CBCA(CO)NH spectrum of protein YorP, HNCACB spectrum of thiamine triphosphatase, and HBHA(CO)NH spectrum of protein Ykvr have high REQUIRER values, and their actual RLNEs are smaller than the quality threshold, while the reconstructed HNCACB spectrum of the BT_p548217 protein has a low REQUIRER value, with its actual RLNE higher than the quality threshold. The reconstructed spectra are shown in Supplementary Part 14. Although JTF-Net achieved the best performance on the data presented here, it cannot be tested on all NMR data. Therefore, we cannot guarantee that it will achieve the best results on every dataset. In addition, based on testing results (Supplementary Part 15), it is suggested to use JTF-Net on 2D data with a sampling rate of 10%–50% and 48–2048 points in the indirect dimension, and on 3D data with a sampling rate of 5%–70% and between 24 × 24 and 256 × 256 points in the indirect dimensions, with Poisson-Gap sampling schemes for both 2D and 3D spectra.

A higher REQUIRER value indicates a greater likelihood that the reconstructed spectrum has an actual RLNE lower than the quality threshold, suggesting good reconstruction quality. Conversely, a lower REQUIRER value indicates a higher probability that the actual RLNE exceeds the quality threshold, suggesting poor reconstruction quality. Although REQUIRER has proven to be feasible, it is not without flaws. A 99% REQUIRER still leaves a 1% chance of poor quality, and an 18% REQUIRER means that while 82% of cases might exceed the quality threshold, 18% could still be good. For example, although the actual RLNE for the HNCOCA spectrum of the E. coli YfgJ protein in Table 1 is below the quality threshold, its REQUIRER is not very high. Additionally, for NOESY-related spectra, the loss of small peaks does not lead to significant changes in the RLNE metric, and thus REQUIRER is unsuitable in this case. Thus, REQUIRER must be used judiciously. Even with these flaws, the overall effectiveness and reliability of REQUIRER are evident. When fully sampled spectra are not available, REQUIRER serves as the only metric capable of evaluating the quality of reconstructed spectra.

Methods

Uncertainty

Compared to non-Bayesian neural networks, which can only perform regression and classification tasks, Bayesian neural networks (BNNs) also provide uncertainty, comprising epistemic uncertainty and aleatoric uncertainty, which helps in evaluating the reliability of predictive outcomes. Epistemic uncertainty mainly arises from a lack of knowledge or incomplete information. For the NUS reconstruction of NMR spectra, if the NUS data deviate from the training set, the model may not reconstruct the NUS spectra accurately, leading to higher epistemic uncertainty. Meanwhile, aleatoric uncertainty refers to uncertainty that arises from the data itself. In the case of NUS NMR spectra, aleatoric uncertainty primarily arises from unsampled points. One common way to implement BNNs is dropout, which can be regarded as a Bayesian approximation26,27. Therefore, dropout is incorporated into JTF-Net to obtain uncertainty. During training, a certain proportion of neurons is randomly dropped out at each layer. Dropout is also kept active during multiple test passes on the same input data. Each test pass randomly drops different neurons, generating a series of reconstructed results. The variance of these reconstructed results is the epistemic uncertainty, which measures the uncertainty of the trained model with respect to the input data. Aleatoric uncertainty, in contrast, is caused by factors related to the input data. To accurately predict the aleatoric uncertainty of the input data, a branch is added to JTF-Net, as shown in Fig. 8(a), and trained with a loss function including the aleatoric uncertainty σ, expressed as26:

$${L}_{{{{\rm{ale}}}}}=\frac{1}{2\sigma {(x)}^{2}}{\Vert y-{f}^{{{{\rm{W}}}}}(x)\Vert }^{2}+\frac{1}{2}\,\log \sigma {(x)}^{2},$$
(1)

where σ(x) denotes the aleatoric uncertainty for input x, fW denotes the trained model with the parameters W, and y represents the ground truth of the reconstructed spectrum.
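A minimal NumPy sketch of Eq. (1) is given below for illustration. It assumes, as is common practice but not stated in the text, that the uncertainty branch outputs the log-variance log σ(x)² at every point (for numerical stability) and that the loss is averaged over all points. The function name and arguments are hypothetical.

```python
import numpy as np

def aleatoric_loss(y_true, y_pred, log_var):
    """Pointwise form of Eq. (1), averaged over all spectral points.

    log_var is assumed to hold log(sigma(x)^2) as predicted by the uncertainty branch.
    """
    sq_err = np.abs(np.asarray(y_true) - np.asarray(y_pred)) ** 2
    log_var = np.asarray(log_var, dtype=float)
    # 1/(2 sigma^2) * ||y - f(x)||^2  +  1/2 * log sigma^2
    return np.mean(0.5 * np.exp(-log_var) * sq_err + 0.5 * log_var)

# A large residual is penalised less when the predicted variance is also large,
# while the log term discourages inflating the variance everywhere.
y_true = np.full(4, 10.0)
y_pred = np.zeros(4)
print(aleatoric_loss(y_true, y_pred, np.zeros(4)))          # sigma^2 = 1
print(aleatoric_loss(y_true, y_pred, np.full(4, np.log(100.0))))  # sigma^2 = 100
```

The two terms pull in opposite directions, which is what lets the branch learn a per-point uncertainty without any uncertainty labels.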

Fig. 8: JTF-Net and REQUIRER workflow.

a Framework of JTF-Net (Joint time-frequency domain deep learning network), which consists of the t-modules and f-modules. TFF/FFF is performed before passing the data to the next t-module/f-module. The features from the last t and f-module in JTF-Net were fused by FFF and input to an encoder-decoder module to get the outputs. FT/IFT: Fourier transform /Inverse Fourier transform. b Reconstruction process: Each spectrum undergoes multiple reconstructions. The final reconstructed spectrum is the mean of these reconstructions; Epistemic uncertainty (Epi) is their variance, and aleatoric uncertainty (Ale) is the mean of aleatoric uncertainties. c JTF-Net performs reconstruction on numerous simulated spectra to build the quality space. REQUIRER (Reconstruction quality assurance ratio) is the percentage of points with RLNE (relative L2 norm error) below the quality threshold within the quality cuboid. NUS Non-uniform sampling. NMR Nuclear magnetic resonance.

Network Structure

JTF-Net (Fig. 8a) consists of time-domain reconstruction modules (t-modules) and frequency-domain reconstruction modules (f-modules). Both the t-modules and f-modules in JTF-Net employ the Encoding and Decoding (ED) structure. The key difference is that the t-modules use dilated convolutions to capture long-term dependencies in time-domain reconstruction, whereas the f-modules use standard convolutions. The details of the ED structure and the convolution kernel parameters are provided in Supplementary Part 1. The optimal numbers of t-modules and f-modules are 8 for the 1D JTF-Net model (used for reconstructing 2D spectra) and 14 for the 2D JTF-Net model (used for reconstructing 3D spectra), with the results shown in Supplementary Part 5. The undersampled FID and its frequency-domain spectrum (after Fourier transformation) are input into the t-modules and f-modules, respectively. After completing the reconstruction in a t-module/f-module, feature fusion is performed before passing the data to the next t-module/f-module. Time-domain and frequency-domain feature fusion (TFF/FFF) can be represented as:

$$\begin{aligned}{\rm{FFF}}_{\rm{n}}&=\frac{1}{2}\left({f}_{\rm{n}}({x}_{\rm{n}}^{\rm{f}})+F({t}_{\rm{n}}({x}_{\rm{n}}^{\rm{t}}))\right)\\ {\rm{TFF}}_{\rm{n}}&=\frac{1}{2}\left({t}_{\rm{n}}({x}_{\rm{n}}^{\rm{t}})+{F}^{-1}({f}_{\rm{n}}({x}_{\rm{n}}^{\rm{f}}))\right),\end{aligned}$$
(2)

where fn and tn denote the n-th f-module/t-module, xnf and xnt refer to the frequency-domain and time-domain data fed into the n-th f-module/t-module. F and F−1 denote the Fourier and inverse Fourier transforms, respectively. The t-modules predict unsampled points based on the sampled points. At this stage, time-domain reconstruction is free from the interference of undersampling artifacts caused by strong peaks on the reconstruction of weak peaks, preserving weak peaks better. Thus, after FFF, these peaks that were lost during the f-module may be recovered. In addition, after TFF, the time-domain reconstruction module can access more information, enabling it to achieve better reconstruction results at low sampling rates. Additionally, to fully utilize the available sampled data information, a data consistency (DC) module is employed to constrain the reconstruction results after the last transposed layer (see Supplementary Part 2). After multiple rounds of reconstruction by the t-modules and f-modules, the features from the last t-module and f-module in JTF-Net were fused by FFF. These fused features are then processed by an ED module and multiple convolutional layers for final adjustments to output the reconstructed spectrum. Moreover, the aleatoric uncertainty can be obtained from the aleatoric uncertainty branch (green rectangle in Fig. 8a), which consists of three transposed convolution layers. The loss function of JTF-Net consists of three parts: time-domain reconstruction module loss Lt, frequency-domain reconstruction module loss Lf (see Supplementary Part 3 for the specific formula of Lt and Lf), and aleatoric uncertainty loss Lale, expressed as

$$L={L}_{{{{\rm{t}}}}}+{L}_{{{{\rm{f}}}}}+{L}_{{{{\rm{ale}}}}}.$$
(3)
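The fusion step of Eq. (2) can be sketched with NumPy's FFT as follows. This is only an illustration on single 1D arrays: the arrays `t_out` and `f_out` stand in for the outputs of the n-th t-module and f-module, and the real network may fuse multi-channel feature maps. The `data_consistency` helper shows one common hard-replacement form of a DC step; the form actually used in JTF-Net is described in Supplementary Part 2, so this variant is an assumption.

```python
import numpy as np

def fuse(t_out, f_out):
    """Return (TFF_n, FFF_n) from the time- and frequency-domain outputs, following Eq. (2)."""
    fff = 0.5 * (f_out + np.fft.fft(t_out, axis=-1))   # frequency-domain fusion
    tff = 0.5 * (t_out + np.fft.ifft(f_out, axis=-1))  # time-domain fusion
    return tff, fff

def data_consistency(t_pred, fid_measured, mask):
    """Assumed hard DC: keep measured points where mask is True, predictions elsewhere."""
    return np.where(mask, fid_measured, t_pred)

rng = np.random.default_rng(0)
t_out = rng.standard_normal(16) + 1j * rng.standard_normal(16)
f_out = rng.standard_normal(16) + 1j * rng.standard_normal(16)
tff, fff = fuse(t_out, f_out)

# By linearity of the Fourier transform, the two fused outputs form a Fourier pair.
print(np.allclose(np.fft.fft(tff), fff))
```

Because the Fourier transform is linear, FFF is exactly the transform of TFF, so both branches continue from the same averaged estimate expressed in their own domain.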

Dataset and training

DL requires a large amount of training data to train models. However, the quantity of data acquired through NMR experiments is usually limited. Starting from NMR physics, the theoretical expressions provided in Supplementary Part 4 are used to generate a substantial amount of simulated NMR data that closely approximates real experimental data. Additionally, JTF-Net employs a dimensionality reduction approach for reconstructing NMR spectra. Specifically, it utilizes a 1D model for reconstructing 2D spectra and a 2D model for reconstructing 3D spectra. The numbers of simulated spectra in the training set and validation set are 40,000 and 4,000, respectively. The JTF-Net model was trained on a server with 8 NVIDIA Tesla GPUs. The following fixed mutual hyperparameters were used: the initial learning rate was 0.001 and the batch size was 4. Adam optimization with the default settings was adopted. The model was trained with early stopping, and the learning rate was halved when the validation loss did not decrease.

Reconstruction Quality Assurance Ratio (REQUIRER)

When applying JTF-Net to a real NUS spectrum, multiple reconstructions are performed, each giving a different output because dropout is active. As shown in Fig. 8b, the final reconstructed spectrum is the mean of these reconstructed spectra, while the epistemic uncertainty is their variance. The final aleatoric uncertainty is the mean of the aleatoric uncertainties. Although full-reference quality assessment indicators like RLNE cannot be obtained due to the lack of fully sampled spectra, aleatoric and epistemic uncertainties can be obtained once the reconstruction process is complete. If the relationship between these uncertainties and the RLNE can be established, it becomes possible to infer the RLNE of the reconstructed spectra from the uncertainties. Here, the trained JTF-Net was used to reconstruct a large number of simulated spectra, yielding the reconstructed spectra, aleatoric uncertainties, and epistemic uncertainties. Since the spectra are simulated, the fully sampled spectra are available, and the RLNEs of all reconstructed spectra can be calculated. To correlate RLNE with aleatoric and epistemic uncertainties, the quality space was constructed, as shown in Fig. 8(c) and Supplementary Part 6. In this space, every point, denoted as a quality point, is formed by three indicators: epistemic uncertainty, aleatoric uncertainty, and RLNE, which are derived from the reconstruction of an individual simulated spectrum. As the uncertainty increases, RLNE exhibits a growing trend. This serves as the foundation for our reference-free quality assessment method.
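The aggregation of repeated dropout reconstructions described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the inputs are the stacked outputs of T stochastic passes, the function names are hypothetical, and reducing each uncertainty map to a single number by averaging over all points is an assumption, since the text does not specify how the per-spectrum values are obtained.

```python
import numpy as np

def aggregate_runs(spectra, aleatoric_maps):
    """Combine T stochastic reconstructions of one NUS spectrum.

    spectra, aleatoric_maps: arrays of shape (T, ...) from T dropout-active passes.
    """
    spectra = np.asarray(spectra, dtype=float)
    mean_spectrum = spectra.mean(axis=0)   # final reconstructed spectrum
    epistemic_map = spectra.var(axis=0)    # variance across passes
    aleatoric_map = np.asarray(aleatoric_maps, dtype=float).mean(axis=0)
    return mean_spectrum, epistemic_map, aleatoric_map

def scalar_uncertainties(epistemic_map, aleatoric_map):
    """Assumed reduction to one (E0, A0) pair per spectrum: the mean over all points."""
    return float(np.mean(epistemic_map)), float(np.mean(aleatoric_map))

# Toy example with T = 2 passes over a 2-point "spectrum".
spectra = [[1.0, 2.0], [3.0, 4.0]]
aleatoric = [[0.2, 0.4], [0.4, 0.6]]
mean_spec, epi, ale = aggregate_runs(spectra, aleatoric)
print(mean_spec, epi, ale, scalar_uncertainties(epi, ale))
```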

If a reconstructed spectrum has peak loss or artifacts, it is not considered a high-quality spectrum in our evaluation. However, determining from the RLNE whether a reconstructed experimental spectrum has peak loss or artifacts is challenging. To address this issue, 2D and 3D NUS reconstructions were conducted on a large number of simulated spectra with varying sampling rates, peak numbers, and signal-to-noise ratios (SNR) to determine the RLNE threshold. The results show that, among reconstructions in which artifact peaks or peak loss (A/L) are observed, the minimum RLNEs are 0.3535 for the 2D spectra and 0.5562 for the 3D spectra. To be on the safe side, the quality thresholds for 2D and 3D reconstructed spectra are set to 0.3500 and 0.5500, respectively. The details of determining the quality thresholds can be found in Supplementary Part 7. In a real-world application, JTF-Net reconstructs an experimental NUS spectrum and provides its aleatoric uncertainty (A0) and epistemic uncertainty (E0). The point (E0, A0) can be extended into a line along the RLNE axis in the quality space, as shown in Fig. 8c. The real RLNE of the reconstructed spectrum may be any of the values on this line. Within a narrow cuboid centered on this line, there are quality points whose uncertainties are close to those of this reconstructed spectrum. This cuboid is called the quality cuboid, and its construction is described in Supplementary Part 8. The REQUIRER is then calculated as

$${{{\rm{REQUIRER}}}} = \frac{M}{N} \times 100 \%, $$
(4)

where N is the total number of quality points in the quality cuboid (N is about 100 in practical use), and M represents the number of quality points whose RLNE falls below the quality threshold in the quality cuboid. As M is not larger than N, the REQUIRER ranges from 0% to 100%.
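Eq. (4) can be illustrated with the following NumPy sketch. The construction of the quality cuboid is given in Supplementary Part 8 and is not reproduced here; the fixed half-widths `half_width_e` and `half_width_a`, like the function name itself, are hypothetical stand-ins used only to select the quality points near (E0, A0).

```python
import numpy as np

def requirer(quality_points, e0, a0, threshold, half_width_e, half_width_a):
    """REQUIRER = M / N * 100% over the quality points inside the cuboid around (E0, A0).

    quality_points: array of shape (K, 3) with columns
    (epistemic uncertainty, aleatoric uncertainty, RLNE) from simulated spectra.
    Returns None when no quality point falls inside the cuboid.
    """
    pts = np.asarray(quality_points, dtype=float)
    inside = (np.abs(pts[:, 0] - e0) <= half_width_e) & (np.abs(pts[:, 1] - a0) <= half_width_a)
    n = int(inside.sum())                          # N: points in the quality cuboid
    if n == 0:
        return None
    m = int((pts[inside, 2] < threshold).sum())    # M: points with RLNE below the threshold
    return 100.0 * m / n

# Toy quality space: four points near (1.0, 1.0), three of them below the 2D threshold.
points = [
    [1.00, 1.00, 0.20],
    [1.00, 1.00, 0.40],
    [1.05, 0.95, 0.30],
    [0.95, 1.05, 0.10],
    [5.00, 5.00, 0.10],  # far from (E0, A0), so it is ignored
]
print(requirer(points, e0=1.0, a0=1.0, threshold=0.35, half_width_e=0.1, half_width_a=0.1))
```

The cuboid is unbounded along the RLNE axis, so only the two uncertainty coordinates decide membership, and the RLNE coordinate is used solely for counting M.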

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.