Deciphering the history of ERK activity from fixed-cell immunofluorescence measurements

Ram, Abhineet; Pargett, Michael; Choi, Yongin; Murphy, Devan; Teragawa, Carolyn; Cabel, Markhus; Kosaisawe, Nont; Quon, Gerald; Albeck, John G.

doi:10.1038/s41467-025-58348-7

Download PDF

Article
Open access
Published: 21 May 2025

Deciphering the history of ERK activity from fixed-cell immunofluorescence measurements

Nature Communications volume 16, Article number: 4721 (2025) Cite this article

1742 Accesses
3 Altmetric
Metrics details

Subjects

Abstract

The RAS/ERK pathway plays a central role in diagnosis and therapy for many cancers. ERK activity is highly dynamic within individual cells and drives cell proliferation, metabolism, and other processes through effector proteins including c-Myc, c-Fos, Fra-1, and Egr-1. These proteins are sensitive to the dynamics of ERK activity, but it is not clear to what extent the pattern of ERK activity in an individual cell determines effector protein expression, or how much information about ERK dynamics is embedded in the pattern of effector expression. Here, we evaluate these relationships using live-cell biosensor measurements of ERK activity, multiplexed with immunofluorescence staining for downstream target proteins of the pathway. Combining these datasets with linear regression, machine learning, and differential equation models, we develop an interpretive framework for immunofluorescence data, wherein Fra-1 and pRb levels imply long-term activation of ERK signaling, while Egr-1 and c-Myc indicate more recent activation. Analysis of multiple cancer cell lines reveals a distorted relationship between ERK activity and cell state in malignant cells. We show that this framework can infer various classes of ERK dynamics from effector protein stains within a heterogeneous population, providing a basis for annotating ERK dynamics within fixed cells.

Generation of pulsatile ERK activity in mouse embryonic stem cells is regulated by Raf activity

Article Open access 10 June 2023

Low-frequency ERK and Akt activity dynamics are predictive of stochastic cell division events

Article Open access 04 June 2024

ERK signalling: a master regulator of cell behaviour, life and fate

Article 23 June 2020

Introduction

The RAS signaling pathway directs multiple cellular behaviors and regulates tissue homeostasis¹. Its terminal kinase, Extracellular Signal-Regulated Kinase (ERK), is essential for cellular decisions to enter the cell cycle, migrate, or differentiate. In cancer and other diseases, changes in the quantitative strength and timing of ERK signaling play a critical role in disease progression and treatment. For example, individual cell fates can be altered by short interruptions in ERK activity^2,3,4, while residual ERK activity following targeted kinase inhibitor treatment enables re-entry to the cell cycle^5,6 and determines therapeutic efficacy⁷. Therefore, accurate measurement of ERK pathway activity has substantial clinical applicability. However, current methods to assay ERK activity in situ are limited and have not been highly successful as diagnostic markers⁸.

Measuring cellular ERK activity is complex because the timing, duration, localization, and amplitude of ERK activation vary widely between cells^9,10,11,12. These differences have functional importance because they influence the expression of numerous effector proteins encoded by ERK target genes (ETGs), including the Immediate Early Genes (IEGs) FOS (encoding c-Fos), FOSL1 (Fra-1), MYC (c-Myc), and EGR1 (Egr-1)^{13,14,15,16,17}. The amplitude, duration, and frequency of ERK activity interact with ETG expression by increasing transcription rate, by stabilizing protein products, and by activating negative feedbacks on expression¹⁸. These effects vary from gene to gene, and consequently some gene products (e.g., Fra-1) integrate ERK activity over time, whereas others (e.g., c-Fos, Egr-1) respond maximally to pulsed ERK activity at intermediate frequencies^19,20,21. Correspondingly, cellular phenotypes are responsive to ERK dynamics; pulsatile activation correlates with proliferation and protection from apoptosis in some systems, while sustained activity correlates with cell cycle arrest^11,22. Advances in live-cell imaging and CRISPR-based tagging have allowed a higher-resolution view of how patterns of activation and deactivation (ERK dynamics) correlate with ETG expression^19,20. However, at the single-cell level, the correlations in these studies appear modest, and the ETG responses are not reliably predictable. This uncertainty is not necessarily random and could arise from various features, including systemic non-linearities and unmeasured factors. Therefore, it remains important to ask whether the dynamic pattern of ERK activity in an individual cell reliably determines the expression of multiple ERK targets. Conversely, can the expression state of ERK target genes reveal information about the prior history of ERK activity dynamics in the cell? Such questions motivate the need for further quantitative analyses and modeling.

A quantitative framework connecting ERK dynamics to ETG expression states would have several notable uses. First, it would help elucidate the cell state landscape that is accessible through growth factor pathway modulation. Transcriptome-based single-cell state measurements have become widely available, but how non-transcriptional events in upstream signaling pathways drive cell state decisions and transitions remains a pressing question. Second, it would provide a comparison of how different fixed markers reflect actual ERK activity. Currently, RAS/ERK activity is assayed using antibodies for phosphorylated ERK (pERK)^7,23,24 or for proteins that are induced by ERK activity, such as c-Fos or DUSP6, which persist for longer than ERK phosphorylation^25,26. Knowing how different markers respond to ERK activity could help in choosing the best activity marker under different circumstances and would be of particular interest in assessing the efficacy of cancer treatments targeted to the pathway. Finally, oncogene-induced ERK activation is distinct from normal physiological patterns generated by endogenous growth factors and is often sustained or stochastically fluctuating^11,27. Knowledge of the dynamic patterns of ERK signaling found in a tissue can therefore be informative about the source of ERK hyperactivation.

In addition to ERK activity biosensors, previous work has demonstrated that synthetic ETGs can detect ERK dynamics²⁸. While such genetically encoded tools are impractical clinically, it is possible that similar information could be derived from endogenous ETGs, which range in their sensitivity and timing in response to dynamic ERK activity^20,29. In principle, the history of cellular ERK activity could be inferred using fixed-cell measurements alone and used to evaluate the dynamic nature of ERK activity within biopsies of tumor tissue.

Here, we develop such an interpretive framework, using a live-cell biosensor of ERK activity in combination with cyclic immunofluorescence for ETG-encoded proteins and other proteins regulated by ERK, including Egr-1, Fra-1, c-Jun, c-Myc, and c-Fos, and phosphorylated proteins such as pERK, pc-Fos, and pRb (a downstream marker of ERK-dependent cell cycle entry; for convenience we collectively refer to all of these measurements as ETGs). Because of the well-established interdependence of protein stability and ERK-mediated phosphorylation that modulate ETG expression¹⁵, we hypothesized that this panel would provide the best assessment of ERK activity available through immunofluorescence. Using statistical models and machine learning to predict ERK activity features based on the expression of each marker, we characterized how various ETGs report ERK history with varying memory span. The characteristics of Fra-1 and pRb as long-term integrators and Egr-1, c-Fos, or c-Myc as short-term responders provided significant predictive capacity in our models. We further expanded on these models by experimentally examining multiple cancer cell lines and by computationally simulating ERK-driven gene expression, demonstrating that fixed-cell analysis of ETGs can be broadly useful in revealing the dynamic history of ERK activity.

Results

A dataset linking live-cell ERK activity to ERK target immunofluorescence

We generated a dataset that enables correlation of ERK activation to downstream target expression and modification by first collecting live ERK activity measurements under differential activation states of the RAS/ERK pathway. In MCF10A mammary epithelial cells, we used EKAR3.1, a calibrated FRET-based biosensor that measures the balance between phosphorylation by ERK activity and dephosphorylation by competing phosphatases (Fig. 1a, Supplementary Fig. S1a–d). ERK activity was stimulated in a dose-dependent manner with varying Epidermal Growth Factor (EGF) concentrations (Fig. 1b, S1e). To increase the temporal diversity of activity patterns, we added MEK inhibitor (MEKi) at varying times after EGF stimulation and included treatments where EGF was added at different timepoints of the experiment (Fig. 1c, Supplementary Fig. S1e, Supplementary Table 1). These treatments generated a wide range of ERK signaling patterns, including sustained and pulsatile activity with varying duration and magnitude (Fig. 1d). Consistent with previous studies^19,30, we found that ERK activation is heterogenous from cell to cell within each stimulation condition.

**Fig. 1: ERK activity and target genes are dose-responsive to Epidermal Growth Factors.**

Immediately following live-cell data collection, we fixed the cells and conducted cyclic immunofluorescence staining to measure levels of eight targets downstream of ERK (Fig. 1a, e, Supplementary Movie 1). This protocol (4i) was adapted from Gut et al.³¹ and validated for our 96-well plate experiments (Supplementary Fig. S2a–e). After quantifying nuclear antibody staining intensities, we found that most targets were dose-responsive to EGF and suppressed by MEKi treatment, except c-Jun (Fig. 1e, f, Supplementary Fig. S1e). c-Jun increased moderately with both MEK inhibition and EGF concentration, suggesting that its expression is not directly regulated by ERK activity in MCF10A.

We then analyzed the correlation between ERK activity and the expression of each target. To link live-cell ERK activity measurements with the respective 4i data for each cell, we aligned the corresponding image datasets and generated a heatmap ordered by the mean ERK activity measurement in each cell (Fig. 2a). While both ERK activity and 4i targets were variable across the data set, most of the 4i targets exhibited some correlation with mean ERK activity. We calculated the Pearson correlation between ERK pulse features, such as Frequency, for each cell and each ETG measurement (Fig. 2b, c). The strongest correlations were between the Sum of duration and Fra-1 or pRb. Interestingly, Egr-1 was uniquely correlated with Average derivative of ERK activity, supporting the previous notion that Egr-1 selectively decodes pulsatile ERK activation²¹. Notably, c-Jun had a weak correlation with any ERK feature and therefore serves as a useful negative control in our analyses, providing baseline correlation values for proteins unregulated by ERK.

**Fig. 2: ERK target gene expression moderately correlates with features of ERK dynamics.**

We performed a more granular time-sensitive analysis by calculating the Pearson correlation between each ETG and the EKAR3.1 FRET measurement at each live-cell timepoint (Fig. 2d). Correlations to Fra-1 and pRb were distributed across the time series (r = ~0.5, 0.4, respectively) after the initial stimulus, apart from a period where ERK activity is weakest, about 2 to 4 hours after EGF addition. In contrast, c-Myc, c-Fos, and pc-Fos mildly correlated to ERK activity from 2–5 hours prior to fixation (r = ~0.3) and at their highest levels (r = ~0.55) during the last hour before fixation. pERK staining was most correlated to ERK activity immediately prior to fixation (r = ~0.6), consistent with its rapid time-varying nature. Finally, Egr-1 correlated only to ERK activity 30 minutes to 1 hour prior to fixation (r: ~0.5).

To visualize spatial correlations of ERK-ETG signaling within the dataset, we plotted a spatial heatmap of signaling and gene expression, where cells within a single image are clustered in a heatmap visualization by proximity to one another (Fig. 2e). This analysis shows ERK activation within clusters of neighboring cells (5–30 cells) throughout the experiment. Consistent with its correlation to recent ERK activity (Fig. 2d), Egr-1 expression is higher in these clusters of cells subject to locally synchronized ERK activity bursts near the end of the time-lapse imaging phase (Fig. 2f).

Neural network-based models reveal non-linear time dependence of ETGs on ERK dynamics

To look beyond linear correlations and identify potential non-linear or complex relationships between live ERK activity and ETGs, we trained a convolutional neural network (CNN) to use the ERK activity time series to predict expression levels of each ERK target in individual cells (Fig. 3a, top). As a comparison to the CNN, we fitted linear regression models using the values at each timepoint of the ERK time series as individual variables to predict final ETG levels (denoted TS linear). We also compared the performance of these time series-based models with that of feature-based models, using nine ERK activity features (as shown in Fig. 2b; denoted Featurized linear). We found that the CNN achieved the highest performance in predicting all ERK targets, except for pERK (Fig. 3b). To account for overfitting, we calculated the mean squared error (MSE) on unseen data (test set); the CNN exhibited the least error for all targets, again except for pERK (Fig. 3b bottom). Though the CNN performed best for most targets, significant variance remained uncaptured (Fig. 3c). The improvements in variance explained were significant only for Egr-1 and pRb, suggesting non-linear responses to ERK activation. For many 4i targets, the featurized linear models underperformed both methods in R² and test set error, showing that a priori featurization can miss key signal aspects.

**Fig. 3: Convolutional neural network identifies non-linear signal transmission.**

We then used CNN model parameter weights (feature importance) to identify specific timepoints influencing target expression. Initially, we found that feature importance was overshadowed by the initial stimulus response, likely reflecting cellular biases rather than direct biochemical regulation of ETGs (Supplementary Fig. S3). Therefore, we limited the model to timepoints greater than 5 hours after treatment, with minimal decrease in CNN performance. A CNN trained on these time points was broadly consistent with the linear Pearson correlation analysis (Fig. 2d), but with less disperse feature importance spread over time, especially in the case of pRb (Fig. 3d). The CNN for Fra-1 expression is influenced by a broad time span, peaking 12 hours prior, with minimal effect from the last two hours. c-Fos and pc-Fos are influenced over six hours but focus on the last two to four hours. The Egr-1 model is strongly affected by ERK activation in the last two hours, while pERK, c-Myc, and pRb are influenced mainly in the final hour. This approach confirms that each ETG has a distinct sensitivity to ERK timing, with a non-linear relationship in some cases.

Regression modeling provides backward inference of ERK dynamics

While the preceding models consider whether ERK dynamics can predict the expression strength of ETGs, a model of the reverse relationship – i.e., inference of ERK dynamics based on the pattern of ETG expression – would be of significant value because live-cell measurements are often unavailable. To explore such models, we performed cross-validated linear regression using the 4i ETG measurements as predictors and FRET-measured dynamic ERK features as output variables. We restricted our analysis to our existing feature set (Fig. 2b), which provided a convenient and intuitive means to develop such models, but our approach would remain applicable to other featurizations of ERK activity dynamics. Our analysis considered both single ETGs as predictors (Fig. 4a, b) and multiple linear regression (MLR) models using all 4i measurements from each cell (Fig. 4a, c). The top single ETG predictor based on variance explained (R²) was Fra-1 for Sum of ERK duration (R² = 0.42). MLR models improved the predictions for all features, with the Sum of duration still the best-predicted feature (R² = 0.53). In comparison, amplitude characteristics of ERK activity, like Max or Average peak height, and features related to pulsatile ERK behavior, such as Average duration and Frequency, were poorly predicted by individual parameters (R² < 0.15) or by MLR (R² < 0.2).

**Fig. 4: ERK target gene expression predicts history of ERK activation.**

For all ERK features, we found that near-maximal R² values could be achieved with 2 to 3 predictor ETGs (Fig. 4d, Supplementary Fig. S4). Fra-1 and pRb were the primary contributors to aggregate-based features (Mean, Sum of duration, Sum of peak height), and Egr-1 and pRb the main contributors to dynamics-related features (Average derivative, Average inter-pulse interval, and Frequency; Fig. 4d). Effectively, Fra-1, Egr-1, and pRb alone capture >90% of the predictivity achieved by the full ETG panel in this dataset. Pairwise correlations between ETGs indicated shared mutual information (Supplementary Fig. S4c), explaining why relatively few are needed for optimal models.

While R² values for single-cell models were modest, strong predictions of all features were achieved when using the average ETG values for each condition to predict average features (Fig. 4e, f); R² values ranged from 0.72 to 0.98 across MLR models of all features. Fra-1 and pRb were consistently strong individual predictors for all features (R² > 0.7) except Frequency, where Egr-1 was the top predictor. While Frequency remained the most difficult feature to predict, its MLR R² rose from 0.08 at the single-cell level to 0.39 in the average model. Overall, the population average models reconcile the modest predictive power of the single-cell models and confirm the classical view that, on average, ERK strongly determines ETG expression levels.

Notably, pERK performed poorly as a predictor of ERK activity, in both forward (Fig. 2) and reverse (Fig. 4) modeling approaches. This weak relationship arises partly from conditions where MEK inhibition following EGF quickly diminishes the pERK signal (see Fig. 1). Removing these treatments improved regression models for pERK and slightly for other 4i measurements. (Supplementary Fig. S4d). These results indicate that pharmacological inhibition renders pERK an unreliable predictor of ERK activity histories and that relying solely on pERK staining can lead to misinterpretations of pathway activation. In contrast, Fra-1 and pRb measurements remain robust to MEKi treatments and better predict long-term ERK activation.

Cancer cell lines differ in the relationship between ERK dynamics and ETG state

To establish the generality of the modeled relationships, we next expanded our investigation to additional cell types, multiple mitogenic stimuli (EGF or bovine serum), and additional candidate ETG stains. We generated stable MCF7 (breast adenocarcinoma), HCC827, and A549 (both lung adenocarcinoma) cell lines carrying EKAR3.1. These lines were subjected to a similar live- and fixed-cell analysis. Additional immunofluorescence stains, including phospho-EGFR, phosho-4EBP1, phospho-p70S6K, NF-1, DUSP1/4, EZH2, FoxO1, RSK1, GSK3β, and E-Cadherin were added, to expand the panel to 19 signaling-related proteins (Supplementary Figs. S5–S7, Supplementary movies 2–4). We first noted distinct overall ERK activity patterns for each of the cell lines. HCC827 cells featured high average ERK activity regardless of EGF stimulation conditions, consistent with their EGFR mutant (E746-A750 deletion) status (Fig. 5a). This activity was highly variable and disorganized at the single-cell level (Fig. 5b). In KRAS-mutant (G12S) A549 cells, ERK activity had a higher baseline relative to full inhibition and was more responsive to EGF, but individual cells lacked pulsatile ERK activity, unlike the other cell lines (Fig. 5a, b). MCF7 cells (estrogen receptor positive, PIK3CA mutant) showed a strong but brief response to a high concentration of EGF and a more sustained response to lower EGF in the population mean (Fig. 5a). In individual MCF7 cells, ERK activity occurred in discrete pulses, similar to MCF10A cells. Overall, this panel of cells provides a diverse set of ERK activity profiles that widens the scope of gene regulatory behaviors to be observed.

**Fig. 5: Cancer cell types display deficiencies in processing ERK dynamics.**

In general, ETG correlations with ERK activity time points (r) were lower in cancer cells than in MCF10A cells (Fig. 2d), with MCF7, in particular, showing no correlations above 0.15 (Fig. 5c). However, a number of features remained similar to the MCF10A dataset; Fra-1 was among the ETGs most highly correlated with ERK activity over a wide range of time points for HCC827 and A549 cells (Fig. 5c). Also as in MCF10A, for all 3 cell lines, Egr-1, and pERK were most correlated with ERK activity in the 1.5 hours prior to fixation. Some cell line-specific differences were noted; pc-Fos showed elevated correlations across the time points only in HCC827, while pp-70S6k showed a strong late time point correlation in A549. Unlike in MCF10A, some ETGs such as c-Myc and c-Fos showed very weak time series correlations with ERK activity in all three cancer cell lines, while c-Jun showed a higher degree of correlation only in A549. Additional features emerged as moderate correlates of long-term ERK activation when per-condition averages were considered, such as Fra-1, RSK1, and DUSP6 in MCF7 cells (Supplementary Fig. S8). Interestingly, pRb showed a low correlation with ERK activity across all conditions for all three cancer cell lines, despite being one of the strongest correlates with ERK activity in MCF10A cells.

Consistent with the lower correlation values observed for the cancer cell lines, regression models of ERK activity features based on ETG measurements were somewhat less predictive of cell-to-cell variance than the corresponding models for MCF10A (Fig. 5d, Supplementary Fig. S8a). While averaging values for each condition improved predictions (Supplementary Fig. S8c), correlations remained below those observed for MCF10A cells. Altogether, while some dynamic features of ETG induction by ERK are conserved across the cell lines, there is a strong general trend toward lower correlations and significant variation among cell lines as to which features are most highly correlated. We interpret these differences as an indication that in the tumor cell lines examined, ERK is active but its linkage to downstream regulation of gene expression and other processes is dysregulated. Interestingly, we also noted that while the cell cycle marker pRb is well correlated with the ERK-related markers c-Myc and Fra-1 in MCF10A (Supplementary Fig. S4c), pRb is correlated strongly with c-Myc but poorly with Fra-1 across all three cancer cell lines (Supplementary Fig. S8d). Given that Fra-1 remains the best-correlated marker of ERK activity in these lines, this result suggests that cell cycle regulation has become unlinked from ERK activity in these cells. In accord with this the percentage of pRb-positive cells, a reliable indicator of cell cycle progression, was insensitive to MEK inhibition in all three cancer cell lines (Fig. 5e). Further, using UMAPs to visualize ETG expression across cell lines, we observed that cycling cells (pRb-positive) consistently clustered together (Supplementary Figs. S2f, S5e, S6e, S7e). Among the ETGs, c-Myc was uniquely associated with cycling cells, while other ETGs were variably expressed in both cycling and non-cycling populations. The retained correlation of pRb with c-Myc may reflect the role of c-Myc in translation regulation, as protein synthesis rate is a key factor regulating cell cycle entry independently of ERK activity².

Classification models uncover prototypical patterns of ERK signaling with distinct gene expression profiles

We next explored how the combined live-cell/immunofluorescence datasets could be used to augment immunofluorescence-only datasets, by developing a method to provide concise visual annotations of inferred ERK activity history for immunofluorescence data. We first used k-means clustering to group cellular ERK activity time series into similar response classes, or prototypes (Fig. 6a). To facilitate intuitive annotations, we chose five clusters, which represented moderate activity with recent inactivation (cluster 1), consistently low activity (cluster 2), recent activation (cluster 3), mid-term activation (cluster 4), and high long-term activation (cluster 5). Analysis of the ETG expression levels in each cluster was consistent with our previous statistical models (Fig. 6b). Most ETG stains were highly expressed in clusters 3 and 5, and to a lesser extent in cluster 1, with varying enrichment across stains, offering potential for class distinction. Clustering in cancer cell lines revealed differential expression in pERK, pp-70SK6, and other ETGs, which could provide a basis for discriminating similar clusters in these lines (Supplementary Fig. S9).

**Fig. 6: Inferring spatiotemporal ERK patterns using classification models.**

We next trained ensemble-based classifiers to predict these prototypes of ERK signaling history using ETG staining levels (Fig. 6c, d). The overall prediction accuracy of our model was ~60% (compared to 20% for random selection), while individual class predictions varied in accuracy. Long-term high activity class predictions were the most accurate (83%), and moderate activation classifications were the least accurate (33%). The classifier’s residual confusion reflects poor separation of some classes due to wide intra-class variation in ERK activity among individual cells (Fig. 6c). In this classification, pRb and pc-Fos were found to be the most important predictors, followed by Fra-1 and cMyc; Egr-1 was of relatively low importance (Fig. 6d). This result indicates that while pc-Fos may not explain a high amount of variance in the ERK history, it carries particularly useful information for distinguishing among the five classes identified here.

Because our other models performed more effectively on averages of multiple cells (Fig. 4e), we asked whether using ETG measurements from groups of adjacent cells could improve the prediction of ERK histories. We grouped cells within each image into hexagonal regions (radius = 50 μm; each containing from ~5 to ~30 cells) and calculated the average ERK activity time course and ETG expression values for each region. We then repeated the generation of clusters and training of classification models. To carry this approach to its practical limit, we also performed the same analysis on whole images (500-1000 cells). We then compared models generated with each approach, across all four cell lines (Fig. 6e). For all cell lines, the regional model improved model accuracy, in some cases dramatically (e.g. from 59% to 79% for MCF10A or 31% to 56% for MCF7 cells). Whole-image models, however, did not consistently outperform regional models, likely due to training on only a small number of samples, when each image only provides one data point. Finally, we overlaid markers of the spatiotemporal histories inferred by the single-cell or regional models on images from the immunofluorescence dataset (Fig. 6f). These images demonstrate both variegated spatial distribution of ERK of history and inhibitor-driven shifts in predominant ERK profiles.

Dynamical systems modeling of ERK-driven gene expression

To investigate the theoretical limits of predicting ERK dynamics from ETG levels, we extended an ordinary differential equation (ODE) model representing the regulation of ETGs^19,29 (Fig. 7a). For a given ERK activity time series, the model simulates the mRNA and protein levels of a hypothetical ERK-responsive gene (sim-ETG). We constructed 1000 hypothetical sim-ETGs by randomly assigning each one with different values for 6 critical parameters: mRNA degradation rate, protein degradation rate, phosphorylated protein degradation rate, protein dephosphorylation rate, negative feedback half-max concentration, and fractional expression at baseline (Fig. 7c, Supplementary Table 2). These 1000 configurations survey the parameter space, allowing us to identify sim-ETGs that capture different aspects of ERK signaling. Using 10,000 randomly selected live-cell ERK activity measurements from our experimental data, we simulated responses of all 1000 sim-ETGs for each cell (Fig. 7b, Supplementary Fig. S10a). The end point sim-ETG protein values (representing a fixed-cell 4i measurement of the hypothetical protein) were examined with single variable regression modeling to characterize each sim-ETG’s capacity to predict ERK dynamics features. We found that 49% of sim-ETGs could predict average ERK activity with an R² above 0.5, and more than 1% were excellent predictors (R² > 0.8) (Fig. 7d, Supplementary Fig. S10b). However, only 12% of sim-ETGs could predict maximum activation or average pulse height with an R² above 0.5, with a maximum R² around 0.6 (Supplementary Fig. S10c). Models for predicting dynamic ERK features like Frequency or Average derivative were overall worse than integrative features like the mean or sum of duration, reflecting that sim-ETGs under this model are variations on an integrator of ERK activity (Supplementary Fig. S10c).

**Fig. 7: Mathematical model identifies limits of ERK activity prediction method.**

To visualize simulated gene expression responses, we plotted a single cell’s ERK signal along with the response of the top five predictors of the mean (Fig. 6f), which included both genes activated by and inhibited by ERK. While our experimental ETG measurements were selected based on known positive responders to ERK, 20% of sim-ETGs were negatively regulated by ERK (Supplementary Fig. S10b); therefore, experimental prediction of ERK activity would likely be improved by including genes that are inhibited by ERK³². We then analyzed which parameters most influence how well an individual sim-ETG predicts mean ERK activity by examining the weights from an MLR model of sim-ETGs (Supplementary Fig. S10d,e). We found that low mRNA and phosphorylated protein degradation rates were generally associated with accurate recording of average ERK activity, which is consistent with the measured parameter values and behavior of Fra-1¹⁹.

We next used sim-ETGs to examine the distinction between genes that reflect long-term history (e.g. Fra-1) and those responsive to recent ERK activity (e.g. Egr-1 and c-Myc). We calculated the correlation between the ERK activity at each timepoint and end-point protein expression (analogous to the experimental data in Fig. 2e). As expected, genes predicting mean ERK activity are correlated with ERK activity over a broad time span, behaving like Fra-1. Genes are less effective at predicting mean correlate with recent activation, similar to Egr-1 or c-Myc (Fig. 7e). Notably, No sim-ETG in this model specifically predicted intermediate activation timescales (5-10 hours before fixation), indicating that such behaviors are rare in the system topology we studied.

Finally, we investigated how increasing the number of ETG measurements affects the ability of MLR models to predict ERK activity features (Fig. 7g). Overall, the predictability of ERK activity features showed similarities to our experimental findings. Some features (Mean, Sum of duration) were well predicted with only a few sim-ETGs, while others (Average inter-peak interval, Average duration) were poorly predicted regardless of the number of sim-ETGs. However, in some cases (Average derivative, Frequency), high predictability was feasible, but only with a larger set of 5 – 20 sim-ETGs (Fig. 7g inset). These results were not obtained through overfitting, as the test set error decreased with more sim-ETGs (Fig. S10f). Altogether, these simulations suggest that predictions of some of the features in our experimental models (e.g. Mean, Max, Average peak height; see Fig. 4a), underperform their theoretical counterparts and could be substantially improved with just one or two additional ETG antibodies well suited for those parameters. Other features such as Average peak duration will likely not improve greatly even with additional stains. Average derivative and Frequency predictions could potentially be improved, albeit with the considerable investment of identifying at least several additional antibodies suitable for those features.

Discussion

Here, we report unique high-content datasets that make it possible to assess how ERK activity relates to its downstream effectors at the single-cell level. Our approach, combining live imaging followed by multiplexed immunofluorescence and quantitative models, establishes proof of principle that single-cell ERK activity measurements and endpoint ETG staining contain mutual information and can be modeled bidirectionally, allowing the inference of key parameters in each domain from observations of the other. Our data show that the relationship between activity dynamics and individual cell expression states is significant, despite the inherent noisiness of signal transduction pathways at the single-cell level³³.

Previous modeling efforts in this area have focused on single ETGs^17,19, on multiple genes at the mRNA level¹³, or on the population average of ETG proteins²¹. Our analysis offers the advantage of examining multiple ETGs while retaining single-cell relationships, enabling models that account for both cell-to-cell variation and differences between ETGs. In this type of dataset, cell-to-cell variation becomes an asset rather than a limitation, where the thousands of individual cells represent semi-independent trials that can be leveraged to establish complex relationships, even when a relatively small number (~20) of external conditions are provided. Our results quantitatively validate the existing but imprecise concept that certain genes such as c-Fos or Egr-1 represent markers of transient ERK activity, whereas others such as Fra-1 are markers of sustained activity. Our data also place these markers within the context of additional genes such as c-Myc, enabling a more comprehensive model of how ETGs collectively compute responses to dynamic ERK activity.

The approach of combining multiplexed live- and fixed-cell analysis on the same cells is finding increasing utility and has been used to interrogate details of CDK-mediated cell cycle regulation³⁴, SMAD-regulated stem cell differentiation³⁵, and the dependence of calcium signaling on gene expression state³⁶. The modeling approaches we develop here provide new capabilities for such datasets. We show that combining fixed-cell measurements of these outputs can reliably identify ERK signaling activity history in different cell lines, despite the complication that these relationships vary somewhat between cell lines.

Our models indicate that single-cell ETG measurements provide reliable information about two main types of ERK behavior: long-term history and short-term activation. However, the unexplained variance in ERK activity in our single-cell models raises the question of which other cellular parameters could improve ERK history predictions. Our analysis of simulated ETGs demonstrates that additional ETG measurements could more finely resolve signaling behaviors, especially for features such as Frequency. These targets could be identified by screening transcriptomic or proteomic datasets for their expression responses to ERK¹³. Additional information could also be obtained by combining antibody measurements with multiplexed single-cell transcript measurements, such as MERFISH³⁷. Our work also shows that using expression data from groups of cells, rather than individual cells, provides improved prediction of the average ERK activity for the group. Grouping cells in this way makes sense because ERK activity is often important for collective cell behaviors, such as generating patterns of collective migration³⁸ or regulating cell death and survival in proportion to cell density³⁹. Finally, orthogonal markers of internal cell state provide contextual information that improves signaling prediction models⁴⁰. It is feasible that such contextual markers, including protein translation rates, organelle morphology, chromatin state, or cell type identity, could improve predictions.

Several aspects of our analysis suggest that the quantitative relationship between pathway activity and gene expression evolves and may be under selection for certain characteristics. The ETGs that we measured in this study appear to be more biased toward control by the duration of ERK activity (as opposed to amplitude, Fig. 4), than would be expected per the ODE model and simulations (Fig. 7). Such an effect could arise from several factors, including saturation of a particular gene’s response to ERK. A key concern is that our ERK biosensor may itself become saturated and fail to capture high ERK activation levels, but we have directly accounted for this measurement issue by calibrating the reporter to provide a linear readout of ERK substrate phosphorylation⁴¹. A remaining possibility is that the parameters of the ETG response have been selected for sensitivity to duration rather than to amplitude during evolution, which would be consistent with the finding that the relative timing of biochemical events is an important mode in gene regulation⁴². Our data from cancer cells, in which ERK-ETG response functions appear significantly weaker than in non-tumor MCF10A cells, further imply the malleability of these quantitative parameters. It is unclear what drives this change within the micro-evolutionary setting of a tumor, as it appears to occur similarly across the tumor cells examined. The loss of signaling regulation appears to be independent of the driver mutation and may be a late development in tumorigenesis, consistent with models that show initial dependence on driver mutations, followed by dysregulated signaling after various checkpoints are lost⁴³. Notably, however, MCF7 cells, which contain a PIK3CA mutation, show the loss of correlation more strongly than lines driven by ERK pathway mutations (A549, HCC827). More work is needed to understand these differences and assess how they are influenced by selective pressures.

In principle, models such as the ones generated here (Fig. 6) could be used to determine important details of ERK signaling dynamics within fixed tissue samples in a clinical setting. For example, the ability to infer the long-term patterns of ERK activity in samples from patients treated with RAS, MEK, EGFR, or other targeted pathway inhibitors could provide a more reliable indication of the effectiveness of long-term ERK activity suppression than current markers, helping to reveal areas of drug resistance. It is also possible that such models could assist in diagnosing the dominant mechanisms of cell signaling within a tumor (or within sub-regions of a tumor), especially in cases where analysis of genetic alterations is ambiguous (e.g., when variants of unknown significance are present). By analogy, single-point measurements of hemoglobin A1C provide a broadly reliable indication of a patient’s time-averaged blood sugar that is useful in the clinical management of diabetes.

Our analysis of cancer cell lines adds needed depth to the question of how ERK signaling differs between normal and tumor cells. We find several key parallels between the cell lines examined here and previous biosensor data. The sustained ERK activity found in RAS mutant A549 cells is similar to the signaling phenotypes observed when expressing oncogenic RAS or RAF in non-tumor epithelial cell lines¹¹ or engineered mouse embryonic fibroblasts⁴¹. Similarly, the highly sporadic pattern of ERK activity in EGFR mutant HCC827 cells is similar to kinetics found in models of other EGFR- or receptor-driven model systems^11,29. These parallels strengthen the case that ERK kinetics cluster into similar forms according to the activating mutation in the pathway. Furthermore, we found weaker ERK-ETG relationships, consistent with the degradation of information transmission observed in tumor cells²⁷. While this loss of correlation poses some challenges for using the same models across different cell types, our data nonetheless support the utility of multiplexed immunofluorescence datasets for capturing cell signaling states via histological methods. For example, Fra-1 is consistently correlated with long-term ERK activity, while Egr-1 and pERK correlate on the short term, across all models tested. Combining these stains could potentially distinguish cells with sustained ERK activity (e.g., RAS mutants⁴¹) or cells with highly stochastic ERK activity (e.g., EGFR-driven cancer cells^11,29) from cells in which ERK is activated in an organized pattern (e.g., non-tumor tissue^44,45,46).

Notably, while all three cancer cell lines examined lack the pRb-Fra-1 correlation seen in non-tumor cells, they retain the pRb-c-Myc correlation. (Fig. S8d). Such changes suggest that c-Myc remains more tightly coupled to proliferation than other ETGs and imply that targetable tumor-specific pathway dependencies can be detected in fixed histological samples. The disruption of c-Jun, Egr-1, and Cyclin-D1 regulation in cells with B-Raf mutations suggests additional such signatures that could be detected^27,47. Furthermore, the coherence between ETGs within the same cell, available in our datasets but not explicitly explored here, could provide an indicator of transformed signaling activity. Thus, our study supports the general feasibility of inferring functionally significant features of activity dynamics from fixed cell immunofluorescence. A significantly broader training set across tumor samples and primary cells will be needed but would be conceptually straightforward given the analysis framework developed here. Further incorporation of multiplexed signatures of other dynamically regulated pathways such as the cell cycle⁴⁸ or metabolic and stress response signaling could also support both generalized and patient-specific models.

Signaling by receptor tyrosine kinase pathways is inherently a multi-input, multi-output system, and here we explore only a limited number of ligand inputs and a restricted set of signals and downstream targets. While serum contains a mixture of ligands, a wider diversity of stimuli needs to be considered to span the varying dynamics that can be driven by receptors including FGFRs, IGFRs, PDGFRs, and TrkA/B^30,41,49. It will also be important to consider parallel pathways such as PI3K/AKT/mTOR and JAK/STAT and their target genes to fully characterize the regulation of cell state by these receptors, as these pathways work in coordination to determine cellular phenotypes^50,51. Another key caveat in our datasets is that EKAR3.1, like many ERK biosensors, is partially sensitive to cyclin-dependent kinases (CDKs) during mitosis⁵². Although mitotic events are rarer than ERK activity changes, some variation in EKAR measurements likely arises from CDK activity, and thus some of the correlation between EKAR and pRb is likely attributable to this cross-specificity. This point serves as a reminder that co-variance or cross-talk among measurements will bias these types of machine learning analyses, and should be carefully evaluated. Finally, while datasets on cell lines are sufficient to generate an initial framework, future models will need to be developed by collecting data from primary tissue, patient-derived organoids, or other in vivo systems, to extend these models toward physiological relevance.

Methods

See Supplementary Table 3 for a list of reagents, materials, and software used in the study.

Reporter cell line generation

The EKAR3.1 construct was produced from the EKAR3 plasmid by excising the BamHI-MfeI fragment and replacing it with the annealed oligos 5’- gatccgctccagatgtccctagaactccagtggataaagcaaagctgtcattccaatttccgc and 5’- aattgcggaaattggaatgacagctttgctttatccactggagttctagggacatctggagcg. This modification introduces an altered substrate sequence with an alanine added between a serine-proline pair near the ERK targeting region, removing a potential secondary phosphorylation site that would be detrimental to the linearity of the reporter signal relative to ERK activity. Stable cell lines were created by electroporating MCF10A (clone 5e), MCF7, HCC827, or A549 cells with the EKAR3.1 construct on the piggyBAC transposase system⁵³. Cells were selected with neomycin (250 μg/ml 2 weeks) until they were resistant to selection ( ~ 2 weeks).

Cell culture and media

MCF10A cells (clone 5e)⁵⁴ were maintained in DMEM/F12 supplemented with 5% horse serum, 20 ng/ml EGF, 10 μg/ml Insulin, 500 ng/ml hydrocortisone, and 100 ng/ml cholera toxin. HCC827 (ATCC, CRL-2868) and A549 (ATCC, CCL-185) cells were maintained in RPMI supplemented with 10% Fetal Bovine Serum (FBS) and 2.5 mM L-Glutamine. MCF7 (ATCC, HTB-22) cells were maintained in DMEM supplemented with 10% FBS. 10 cm plates were passaged approximately every four days and re-plated at a 1:10 dilution. Imaging experiments were conducted in custom DMEM/F12 lacking phenol red, riboflavin, and folate. This “imaging media” was supplemented with 500 ng/ml hydrocortisone, 17.5 mM glucose, 1 mM sodium pyruvate, 2 mM glutamine, 50 μg/ml penicillin/streptomycin. Before plating cells for imaging experiments, 5 μl of Rat tail collagen was added to the middle of each well of a glass bottom 96-well plate (Cellvis) and incubated for 45 mins at 37 °C. Cells were trypsinized, plated at 6000 cells per well, and then incubated at 37 °C for 45–60 mins. Growth media was then added, and the plate was incubated overnight. The next day, immediately before the imaging experiment, the plate was washed 3x with imaging media, and the media was changed to imaging media. The experiment began one hour after this media change. In experiments containing cancer cells, HCC827, A549, and MCF7 cells were plated in tandem, with each cell line plated on 1/3^rd of the available wells on the plate.

Live cell microscopy and data acquisition

Prepared 96-well plates were imaged on a Nikon Ti-E inverted microscope with a stage-top incubator (37 °C, 5% CO₂). Coordinates within each well of the 96-well plate were imaged at 6 minute increments which were automated by the Nikon Elements AR software. Images were captured using an Andor Zyla 5.5 scMOS camera and a 20x/0.75 NA objective. Chroma #49001 (ET-CFP) and #49003 (ET-YFP) excitation/emission filter cubes were used for mTurquoise2 and YPet measurements, respectively. Further details are described in ref. ⁵³. Coordinates of each acquisition area were saved for future imaging of immunostaining experiments.

Cyclic immunofluorescence

Immediately after the final acquisition of the live cell experiment, cells were fixed in freshly prepared 12% paraformaldehyde for 10 min. Cells were then permeabilized with fresh, cold methanol for 10 mins (2 times total). Cells were then ready for iterative rounds of staining (4i) using a protocol adapted from ref. ³¹. Briefly, the iterative protocol involves rounds of elution, blocking, primary staining, secondary staining, Hoechst staining, and finally image acquisition in a specific imaging buffer. Recipes for buffers are as follows: Elution buffer (0.5 M Glycine, 3 M Urea, 3 M Guanidinium Chloride, 70 mM TCEP), Blocking buffer (200 mM NH₄Cl, 300 mM Maleimide, 2% BSA in PBS), primary/secondary staining buffer (200 mM NH₄Cl, 2% BSA in PBS), Hoechst-33342 stain (1:10,000 in PBS), and 4i imaging buffer (700 mM N-Acetyl Cysteine). Antibodies were incubated for 24 to 48 hours from varying concentrations recommended by the manufacturer. For the MCF10A-only experiments, the protocol was validated during the first replicate experiment to ensure that antibodies were eluted, data is shown in Fig. S2b–d. For the second and third replicate experiments, a visual inspection was completed prior to each round of staining to ensure proper antibody elution. In the cancer cell experiments with additional antibody stains, a visual inspection was conducted between rounds to ensure proper elution.

Phos-tag western blotting

MCF10A 5e cells were plated on 6-well dishes the day before lysing. Cells were treated with EGF, PD0235901, or imaging media and lysed at the indicated time points. This procedure involved rinsing each well twice with ice-cold PBS, cell scraping, and lysis with RIPA buffer (Sigma) witha Halt protease inhibitor cocktail and 1 μM DTT. Cells were lysed at 80–90% confluency with 50 μl of lysis buffer per well. 2 μl of each sample was then loaded in pre-cast phos-tag gels (Wako-Chem) and ran at 100 V for 3 hours. The gel was chelated two times with transfer buffer and 10 mM EDTA for 15 minutes each and rinsed once more with just transfer buffer. Proteins were transferred overnight at 50 V. The membrane was blocked with Li-COR Odyssey blocking buffer and blotted with anti-GFP antibody (24 hr incubation). The membrane was then blotted with Li-COR 800 anti-Mouse secondary antibody and imaged using a fluorescent scanner (Sapphire-Azure Biosystems). Intensities of the resulting phosphorylated EKAR3.1 reporter and total EKAR3.1 bands were measured in ImageJ.

Image processing

Imaging data were saved as .nd2 files and accessed using the Bio-Formats toolbox for MATLAB (available from www.openmicroscopy.org/bio-formats), and processed with a custom MATLAB cell segmentation pipeline⁵³. The procedure identified each cell’s nucleus using either EKAR3.1 (live-cell) or Hoechst 33342 (IF) as a nuclear marker. The cytoplasm was defined as a ring around each cell’s nucleus. Background signal intensity was measured by imaging a well with no cells, but containing live-cell imaging media or 4i imaging buffer. Cell position tracking and linking were performed using uTrack 2.0⁵⁵. The resulting single-cell data was filtered to remove cells with less than 15 hours of tracking data. FRET measurements of ERK activity for each cell were calculated with 1 – ((CFP/YFP) / R_p), where CFP and YFP are the intensities of Cyan and YFP channels measured in each cell, respectively. R_p is the ratio of total power collected of CFP over YFP where the power of each channel is the integral of the spectral product of excitation intensity, filter transmittances, exposure time, fluorophore absorption and emission properties, and quantum efficiency of the camera (detailed in appendix of ref. ⁴¹). To link live-cell and fixed-cell data together, X-Y coordinates of cells in each well were registered to align cells across measurements. For wells with troublesome alignment results, image registration was conducted to calculate the shift in orientation between the images.

Batch effect correction

To correct for batch effects in the immunofluorescence measurements across three replicates of MCF10A experiments, we scaled measurements of each target to optimally align median values across identical treatments. As many measurement distributions were sufficiently long-tailed to bias even median values, the normalization was performed in logspace. For each 4i target, we calculated the median value for each treatment and matched identical treatments across replicates. These treatments included all EGF doses at timepoint 30, MEKi at timepoint 30, and imaging media control. We then took the log₁₀ of these values and fit a linear model (Eq. 1):

$${{Intensity}}_{{replicate}3}={\beta }_{1}\left({{Intensity}}_{{replicate}{{\rm{\_}}}n}\right)+{\beta }_{0}$$

(1)

Where Intensity_{replicate_n} represents log₁₀ median values for either replicate 1 or replicate 2, Intensity_replicate3 represents the corresponding log₁₀ median values for the third experimental replicate, and β₀ and β₁ are the scaling factors. These scaling factors were then used to correct all single-cell values for replicates 1 and 2. The corrected values were then returned to the linear scale by exponentiating. The normalization resulted in effective batch correction across all conditions. (Fig. S2c, d). A similar method was conducted in cancer cell experiments; however, since the three cell lines were plated in tandem and our goals was to preserve cell-line differences, we computed normalization samples as the median over all cell lines for each condition in each replicate. As such, all cell lines in each replicate were scaled with the identical scaling factor(β₁) per antibody. A visualization of these normalization results for each cell line is shown in Fig. S5d, S6d, S7d. Nuclear and Cytosolic measurements for each antibody stain were normalized individually.

EKAR3.1 Calibration

FRET measurements from MCF10A cells were calibrated to deliver a quantitative linear readout of ERK activity, as described previously⁴¹; refer to the Appendix of this past study for a detailed derivation of this method. Briefly, we used Phos-Tag immunoblotting to quantify the fraction of the EKAR3.1 reporter that is phosphorylated in 3 concentrations of EGF (15 mins), phosphorylation inhibited (MEKi for 2 hours), and control conditions (Supplementary Figs. S1b and S11). These values were then linearly fit against the average EKAR3.1 signal for the same conditions (Eq. 2).

$$\frac{{{EKAR}}^{A}}{{{EKAR}}^{T}}=\left({K}_{{AU}}+{K}_{{AP}}\right)\frac{{{EKAR}}^{P}}{{{EKAR}}^{T}}$$

(2)

Here, EKAR^P/EKAR^T represents the phos-tag ratio between phosphorylated and total reporter. The EKAR3.1 signal is pre-processed to estimate the fraction of EKAR molecules in an “associated” conformation (${f}_{A}$ or EKAR^A/EKAR^T), i.e.: ${f}_{A}=\frac{{{EKAR}}^{A}}{{{EKAR}}^{T}}=1-\frac{{I}_{{CFP}}}{{I}_{{YFP}}}/{R}_{P}$, where ${I}_{{CFP}}$ and ${I}_{{YFP}}$ refer to the measured intensity of the cyan (donor) and yellow (acceptor) channels, respectively, and ${R}_{P}$ refers to the corresponding ratio of imaging power in each of these channels. K_AU and K_AP represent the fractions of EKAR in the “associated” state when completely unphosphorylated and when completely phosphorylated, respectively. Single-cell FRET measurements (i.e. f_A) were then used to estimate the concentration ratio of active ERK to the competing phosphatase(s) (Eq. 3). This ratio is the quantitative measure of ERK activity in a cell (ERK^A/PPASE^A).

$$\frac{{{ERK}}^{A}}{{{PPASE}}^{A}}=\frac{{f}_{A}-\,{K}_{{AU}}}{{K}_{{AP}}-\,{f}_{A}}$$

(3)

Data analysis and regression modeling

Cells with less than 15 hours of data were removed prior to analysis, and cells out of the expected range of the FRET measurements were removed. FRET measurements were then adjusted using the reporter calibration model created from the phos-tag experiments. Thus, statistical models were created on cells that had complete EKAR and ETG measurements. Models were created using 10-fold cross-validation. The data were randomly assigned to 10 groups, with the 10^th group held out of the model fitting procedure. The model was then tested against the 10^th group (test-set) to collect the test error (residual mean squared error, RMSE). This procedure is repeated for a total of 10 times to collect RMSE values from 10 test sets. The model that produced the lower test-set error was then refitted to the entire dataset to calculate the reported RMSE values.

Pulse analysis and peak detection

The findpeaks function in MATLAB was used to find local maxima (peaks) for each cell’s ERK activity. Pulse features were then calculated based on the identified peaks. Frequency was calculated using the meanfreq function in MATLAB.This function estimates the mean normalized frequency of the power spectrum of each ERK activity trace.

Statistical tests

For single-cell immunofluorescence data, each statistical comparison was made by t-test with unequal variances, and false discovery rate was controlled within each dataset via the Benjamini and Hochberg Step-Up procedure (α = 0.05). The variance for each experiment was determined from single-cell samples and added to the variance across experiments. This corresponds to a linear error model: ε_i = ε_cell + ε_exp, where the error (from the mean) of an individual cell ε_i equals the sum of the errors arising from cell-to-cell variation ε_cell and from experiment variation ε_exp.

Spatial heatmap generation

Each cell’s time-averaged coordinates were used to calculate the average Euclidean distance between each pair of cells within each well of the 96-well plate. Hierarchical clustering was performed on this distance matrix. The optimal leaf order was calculated by maximizing the sum of the similarity between adjacent leaves by flipping tree branches and without dividing the clusters. This order was then used to sort and display the live-cell and fixed-cell data.

ETG prediction models and evaluation

CNN models consisted of 1) a feature learning module and 2) a prediction module. The feature learning module consists of 2 convolutional layers (16 channels and kernel size of 16) followed by an FC layer with a size of 192 to match the initial input size. The prediction module consists of 2 FC layers (each size of 64 with relu activations) followed by a final linear FC layer that outputs a single ETG prediction. We performed hyperparameter variation for the learning rate ([0.01,0.001,0.0001]) and L2-regularization ([0.1,0.01,0.001]) to compare the average test prediction performance (MSE). While the performance was robust across parameters, we chose a learning rate of 0.001 and L2-regularization of 0.01, which had the best overall performance. We trained one model per ETG for 100 epochs using the Adam optimizer with a learning rate of 0.001 and L2 regularization of 0.001. For the linear model, linear regression was implemented using the sklearn python package with default parameters. The inputs were either the raw or featurized ERK activity for the linear model or the featurized linear model, respectively. Evaluation of the model was performed by holding out 20% of the total data as a test set per fold, then splitting the remaining data between the training set, consisting of 64% of the total data, and the validation set consisting of 16% of total data,.with each fold having roughly the same representation from each well of origin and treatment.

To identify significant time points, we used feature attribution, specifically Integrated Gradient⁵⁶, to identify input time points that the model considers significant tothe prediction of ETG. Integrated Gradient was implemented using the Python package Captum⁵⁷. Feature attribution outputs score from each input time point to ETG per cell, which was averaged across cells for summarized visualization in the form of a heatmap.

To test the importance of the timepoints after the initial stimulation, we trained new CNN models to only use timepoints 2 hours after stimulation for ETG prediction. This model was trained on 14 hours of ERK activity data. The model and training setup used is identical to the setup used for the model with all the time points (19 hours of ERK activity data).

While recurrent neural networks (RNNs) or sequence-based models are typically better suited for time series data, they are more difficult to train, especially for noisy continuous data. For our simple regression task with the additional goal of identifying the most informative time points, CNN was sufficient. Additionally, RNNs would have a harder time modeling spikes from stimulus as it cannot consider absolute time/position. While more advanced sequenced models such as attention-based models could have been used, the goal here was not necessarily to achieve the best prediction possible but to identify significant time points. Therefore, we chose to use CNN for our analysis with MSE and R² metrics for evaluation.

Temporal ERK Signaling Classifier

EKAR time series data were clustered into five groups using k-means clustering. Each group was assigned its class label. Because the k-means algorithm clusters all cells into a respective class, individual cells whose signal did not correlate with the mean of the class were removed. To do so, we calculated the Pearson correlation (r) between each cell and the mean of its class and enforced a minimum threshold of r = 0.7. Next, an ensemble learning method (AdaBoost M2⁵⁸ with 500 learning cycles) was fit using antibody staining measurements to predict the class labels of each cell. This was done using MATLAB’s fitcensemble function. 10-fold cross-validation was used and the model with the lowest test set error was displayed. Predictor Importance was estimated by summing the estimates over all weak learners in the ensemble–MATLAB predictorImportance(ens) function. Regional and Whole-well models were trained similarly, except that the data was first grouped into regional neighborhoods or well averages. For whole-well models (Fig. 6e, green bars), a standard decision tree classifier was used due to the small sample size. MCF10A classification models used nuclear pERK, Fra-1, c-Myc, c-Jun, (p)c-Fos, Egr-1, and pRb as predictors, whereas cancer cell lines additionally used DUSP1^nuc, DUSP6^cyt, p4EBP1 ^nuc, FoxO1 ^nuc, RSK1 ^nuc, GSK3β^nuc, EZH2 ^nuc, pEGFR^cyt, pp70S6k^cyt, NF-1^cyt, and E-Cadherin^cyt.

Ordinary Differential Equation Modeling

The ODE model was adapted from Davies et al. ²⁹. The model of ERK-dependent gene expression (Eqs. 4–7) was constructed from a mass action approximation. This process is modeled in four steps (Eq. 4) phosphorylation of a transcription factor by ERK (TF^P), (Eq. 5) transcription of target mRNA (mRNA), (Eq. 6) translation of target protein (P), and (Eq. 7) potential stabilization of target protein by ERK-dependent phosphorylation (P^P). A regulatory term is included in the transcription process allowing negative feedback from the target protein onto its own production. The model is formulated as a delay differential equation to account for the effective lag times of transcription and translation without explicitly addressing the complex processes involved. For the purpose of simulating a set of specific hypothetical ETGs, we selected a reduced parameter space from which to sample (see Supplemental Table 2). Given the relative simplicity of the model, the parameters to vary were chosen analytically, avoiding direct correlations and minimizing the sampling space dimension.

$$\frac{d}{{dt}}\,{{TF}}^{P}\left(t\right)=\,{k}_{{pTF}} * {ERK}\left(t\right) * \left({{TF}}^{T}-\,{{TF}}^{P}\left(t\right)\right)-\,{k}_{{dTF}} * \,{{TF}}^{P}\,(t)$$

(4)

$$\frac{d}{{dt}}{mRNA}\left(t\right)=\frac{{k}_{b}+\,{k}_{m}*\,{{TF}}^{P}(t-\,{\tau }_{m})}{{\left(\frac{P\left(t-{\tau }_{m}\right)+{P}^{P}(t-{\tau }_{m})}{{k}_{D}}\right)}^{v}+1}-\,{k}_{{{\varnothing }}m} * {mRNA}(t)$$

(5)

$$\frac{d}{{dt}}\,P\left(t\right)=\,{k}_{P} * {mRNA}\left(t-\,{\tau }_{P}\right)+{k}_{{dP}} * {P}^{P}\left(t\right)-\left({k}_{{{\varnothing }}P}+\,{k}_{{P}^{P}}\right) * P(t)$$

(6)

$$\frac{d}{{dt}}\,{P}^{P}\left(t\right)=\,{k}_{{pP}} * {ERK}\left(t\right) * P\left(t\right)-\left({k}_{{{\varnothing }}{P}^{P}}+\,{k}_{{dP}}\right) * {P}^{P}(t)$$

(7)

UMAP projections

To embed the high-dimensional immunofluorecence data into 2 dimensions for visualization, we used the UMAP algorithm⁵⁹. UMAP was run using the umap-learn python package. Default hyperparameters were used: n_neighbors = 15, min_dist = 0.1, and n_components = 2. UMAPs of MCF10A data were created with nuclear measurements of Egr-1, Fra-1, c-Jun, c-Myc, c-Fos, pERK, pc-Fos, and pRb. UMAPs of HCC827, A549, and MCF7 were created with nuclear measurements of Egr-1, Fra-1, c-Jun, c-Myc, c-Fos, pERK, pc-Fos, pRb, DUSP1, p4EBP1, FoxO1, RSK1, GSK3β, EZH2, along with cytosolic measurements of pEGFR, pp-70s6k, NF-1, E-Cadherin, and DUSP6.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All source data for figures, including all processed FRET and immunofluorescence values for single cells, have been deposited in a Figshare repository and are available at this link: https://doi.org/10.6084/m9.figshare.27047617. Due to the file size limitations for existing repositories, raw microscopy image sets will be provided by the corresponding author upon e-mailed request.

Code availability

All custom MATLAB and Python code for model generation, data analysis, and figure plotting are provided here: https://doi.org/10.6084/m9.figshare.27047617.

References

Lavoie, H., Gagnon, J. & Therrien, M. ERK signalling: a master regulator of cell behaviour, life and fate. Nat. Rev. Mol. Cell Biol. 21, 607–632 (2020).
CAS PubMed Google Scholar
Min, M., Rong, Y., Tian, C. & Spencer, S. L. Temporal integration of mitogen history in mother cells controls proliferation of daughter cells. Science 368, 1261–1265 (2020).
ADS CAS PubMed PubMed Central Google Scholar
Zwang, Y. et al. Two phases of mitogenic signaling unveil roles for p53 and EGR1 in elimination of inconsistent growth signals. Mol. Cell 42, 524–535 (2011).
CAS PubMed PubMed Central Google Scholar
Jones, S. M. & Kazlauskas, A. Growth-factor-dependent mitogenesis requires two distinct phases of signalling. Nat. Cell Biol. 3, 165–172 (2001).
CAS PubMed Google Scholar
Yang, C., Tian, C., Hoffman, T. E., Jacobsen, N. K. & Spencer, S. L. Melanoma subpopulations that rapidly escape MAPK pathway inhibition incur DNA damage and rely on stress signalling. Nat. Commun. 12, 1747 (2021).
ADS CAS PubMed PubMed Central Google Scholar
Gerosa, L. et al. Receptor-driven ERK pulses reconfigure MAPK signaling and enable persistence of drug-adapted BRAF-mutant melanoma cells. Cell Syst. 11, 478–494.e9 (2020).
CAS PubMed PubMed Central Google Scholar
Bollag, G. et al. Clinical efficacy of a RAF inhibitor needs broad target blockade in BRAF-mutant melanoma. Nature 467, 596–599 (2010).
ADS CAS PubMed PubMed Central Google Scholar
Milde-Langosch, K. et al. Expression and prognostic relevance of activated extracellular-regulated kinases (ERK1/2) in breast cancer. Br. J. Cancer 92, 2206 (2005).
CAS PubMed PubMed Central Google Scholar
Aoki, K. et al. Stochastic ERK activation induced by noise and cell-to-cell propagation regulates cell density-dependent proliferation. Mol. Cell 52, 529–540 (2013).
CAS PubMed Google Scholar
Ponsioen, B. et al. Quantifying single-cell ERK dynamics in colorectal cancer organoids reveals EGFR as an amplifier of oncogenic MAPK pathway signalling. Nat. Cell Biol. 23, 377–390 (2021).
CAS PubMed PubMed Central Google Scholar
Aikin, T. J., Peterson, A. F., Pokrass, M. J., Clark, H. R. & Regot, S. MAPK activity dynamics regulate non-cell autonomous effects of oncogene expression. Elife 9, e60541 (2020).
Albeck, J. G., Mills, G. B. & Brugge, J. S. Frequency-modulated pulses of ERK activity transmit quantitative proliferation signals. Mol. Cell 49, 249–261 (2013).
CAS PubMed Google Scholar
Uhlitz, F. et al. An immediate–late gene expression module decodes ERK signal duration. Mol. Syst. Biol. 13, 928 (2017).
PubMed PubMed Central Google Scholar
Murphy, L. O., Smith, S., Chen, R.-H., Fingar, D. C. & Blenis, J. Molecular interpretation of ERK signal duration by immediate early gene products. Nat. Cell Biol. 4, 556–564 (2002).
CAS PubMed Google Scholar
Murphy, L. O., MacKeigan, J. P. & Blenis, J. A network of immediate early gene products propagates subtle differences in mitogen-activated protein kinase signal amplitude and duration. Mol. Cell. Biol. 24, 144–153 (2004).
CAS PubMed PubMed Central Google Scholar
Cook, S. J., Aziz, N. & McMahon, M. The repertoire of fos and jun proteins expressed during the G1 phase of the cell cycle is determined by the duration of mitogen-activated protein kinase activation. Mol. Cell. Biol. 19, 330–341 (1999).
CAS PubMed PubMed Central Google Scholar
Nakakuki, T. et al. Ligand-specific c-Fos expression emerges from the spatiotemporal control of ErbB network dynamics. Cell 141, 884–896 (2010).
CAS PubMed PubMed Central Google Scholar
Ram, A. et al. A guide to ERK dynamics, part 2: downstream decoding. Biochem. J. 480, 1909–1928 (2023).
CAS PubMed Google Scholar
Gillies, T. E., Pargett, M., Minguet, M., Davies, A. E. & Albeck, J. G. Linear integration of ERK activity predominates over persistence detection in Fra−1 regulation. Cell Syst. 5, 549–563.e5 (2017).
CAS PubMed PubMed Central Google Scholar
Wilson, M. Z., Ravindran, P. T., Lim, W. A. & Toettcher, J. E. Tracing information flow from Erk to target gene induction reveals mechanisms of dynamic and combinatorial control. Mol. Cell 67, 757–769.e5 (2017).
CAS PubMed PubMed Central Google Scholar
Saito, T. H., Uda, S., Tsuchiya, T., Ozaki, Y.-I. & Kuroda, S. Temporal decoding of MAP kinase and CREB phosphorylation by selective immediate early gene expression. PLoS One 8, e57037 (2013).
ADS CAS PubMed PubMed Central Google Scholar
Ender, P. et al. Spatiotemporal control of ERK pulse frequency coordinates fate decisions during mammary acinar morphogenesis. Dev. Cell 57, 2153–2167.e6 (2022).
CAS PubMed Google Scholar
Escobar-Hoyos, L. F. et al. Altered RNA splicing by mutant p53 activates oncogenic RAS signaling in pancreatic cancer. Cancer Cell 38, 198–211.e8 (2020).
CAS PubMed PubMed Central Google Scholar
Flaherty, K. T. et al. Inhibition of mutated, activated BRAF in metastatic melanoma. N. Engl. J. Med. 363, 809–819 (2010).
CAS PubMed PubMed Central Google Scholar
Buffet, C. et al. DUSP5 and DUSP6, two ERK specific phosphatases, are markers of a higher MAPK signaling activation in BRAF mutated thyroid cancers. PLoS One 12, e0184861 (2017).
PubMed PubMed Central Google Scholar
Yoo, D. H., Im, Y. S., Oh, J. Y., Gil, D. & Kim, Y.-O. DUSP6 is a memory retention feedback regulator of ERK signaling for cellular resilience of human pluripotent stem cells in response to dissociation. Sci. Rep. 13, 1–16 (2023).
ADS Google Scholar
Bugaj, L. J. et al. Cancer mutations and targeted drugs can disrupt dynamic signal encoding by the Ras-Erk pathway. Science 361, eaao3048 (2018).
Ravindran, P. T., McFann, S., Thornton, R. H. & Toettcher, J. E. A synthetic gene circuit for imaging-free detection of signaling pulses. Cell Syst. 13, 131–142.e13 (2022).
CAS PubMed Google Scholar
Davies, A. E. et al. Systems-level properties of EGFR-RAS-ERK signaling amplify local signals to generate dynamic gene expression heterogeneity. Cell Syst. 11, 161–175.e5 (2020).
CAS PubMed PubMed Central Google Scholar
Ryu, H. et al. Frequency modulation of ERK activation dynamics rewires cell fate. Mol. Syst. Biol. 11, 838 (2015).
Gut, G., Herrmann, M. D. & Pelkmans, L. Multiplexed protein maps link subcellular organization to cellular states. Science 361, eaar7042 (2018).
Yamamoto, T. et al. Continuous ERK activation downregulates antiproliferative genes throughout G1 phase to allow cell-cycle progression. Curr. Biol. 16, 1171–1182 (2006).
CAS PubMed Google Scholar
Cheong, R., Rhee, A., Wang, C. J., Nemenman, I. & Levchenko, A. Information transduction capacity of noisy biochemical signaling networks. Science 334, 354–358 (2011).
ADS CAS PubMed PubMed Central Google Scholar
Gookin, S. et al. A map of protein dynamics during cell-cycle progression and cell-cycle exit. PLoS Biol. 15, e2003268 (2017).
PubMed PubMed Central Google Scholar
Teague, S. et al. Time-integrated BMP signaling determines fate in a stem cell model for early human development. Nat. Commun. 15, 1471 (2024).
ADS MathSciNet CAS PubMed PubMed Central Google Scholar
Yao, J., Pilko, A. & Wollman, R. Distinct cellular states determine calcium signaling response. Mol. Syst. Biol. 12, 894 (2016).
PubMed PubMed Central Google Scholar
Moffitt, J. R. & Zhuang, X. RNA imaging with multiplexed error-robust fluorescence in situ hybridization (MERFISH). Methods Enzymol. 572, 1–49 (2016).
CAS PubMed PubMed Central Google Scholar
Nishimura, M., Inoue, Y. & Hayashi, S. A wave of EGFR signaling determines cell alignment and intercalation in the Drosophila tracheal placode. Development 134, 4273–4282 (2007).
CAS PubMed Google Scholar
Nakamura, A., Goto, Y., Kondo, Y. & Aoki, K. Shedding light on developmental ERK signaling with genetically encoded biosensors. Development 148, dev199767 (2021).
Kramer, B. A., Sarabia Del Castillo, J. & Pelkmans, L. Multimodal perception links cellular state to decision-making in single cells. Science 377, 642–648 (2022).
ADS CAS PubMed Google Scholar
Gillies, T. E. et al. Oncogenic mutant RAS signaling activity is rescaled by the ERK/MAPK pathway. Mol. Syst. Biol. 16, e9518 (2020).
CAS PubMed PubMed Central Google Scholar
Lin, Y., Sohn, C. H., Dalal, C. K., Cai, L. & Elowitz, M. B. Combinatorial gene regulation by modulation of relative pulse timing. Nature 527, 54–58 (2015).
ADS CAS PubMed PubMed Central Google Scholar
Sarkisian, C. J. et al. Dose-dependent oncogene-induced senescence in vivo and its evasion during mammary tumorigenesis. Nat. Cell Biol. 9, 493–505 (2007).
CAS PubMed Google Scholar
Hiratsuka, T. et al. Intercellular propagation of extracellular signal-regulated kinase activation revealed by in vivo imaging of mouse skin. Elife 4, e05178 (2015).
PubMed PubMed Central Google Scholar
DeCuzzi, N. L. et al. Spatiotemporal clusters of ERK activity coordinate cytokine-induced inflammatory responses in human airway epithelial cells. Am. J. Respir. Cell Mol. Biol. https://doi.org/10.1165/rcmb.2024-0256OC (2024).
Gagliardi, P. A. et al. Collective ERK/Akt activity waves orchestrate epithelial homeostasis by driving apoptosis-induced survival. Dev. Cell https://doi.org/10.1016/j.devcel.2021.05.007 (2021).
Comandante-Lou, N., Baumann, D. G. & Fallahi-Sichani, M. AP-1 transcription factor network explains diverse patterns of cellular plasticity in melanoma cells. Cell Rep. 40, 111147 (2022).
CAS PubMed PubMed Central Google Scholar
Stallaert, W. et al. The molecular architecture of cell cycle arrest. Mol. Syst. Biol. 18, e11087 (2022).
CAS PubMed PubMed Central Google Scholar
Raina, D., Fabris, F., Morelli, L. G. & Schröter, C. Intermittent ERK oscillations downstream of FGF in mouse embryonic stem cells. Development 149, dev199710 (2022).
Chen, J.-Y., Lin, J.-R., Cimprich, K. A. & Meyer, T. A two-dimensional ERK-AKT signaling code for an NGF-triggered cell-fate decision. Mol. Cell 45, 196–209 (2012).
CAS PubMed Google Scholar
Gross, S. M. & Rotwein, P. Quantification of growth factor signaling and pathway cross talk by live-cell imaging. Am. J. Physiol. Cell Physiol. 312, C328–C340 (2017).
PubMed PubMed Central Google Scholar
Hirashima, T. Live imaging approach of dynamic multicellular responses in ERK signaling during vertebrate tissue development. Biochem. J. 479, 129–143 (2022).
CAS PubMed Google Scholar
Pargett, M., Gillies, T. E., Teragawa, C. K., Sparta, B. & Albeck, J. G. Single-Cell Imaging of ERK Signaling Using Fluorescent Biosensors. Methods Mol. Biol. 1636, 35–59 (2017).
CAS PubMed PubMed Central Google Scholar
Janes, K. A., Wang, C.-C., Holmberg, K. J., Cabral, K. & Brugge, J. S. Identifying single-cell molecular programs by stochastic profiling. Nat. Methods 7, 311–317 (2010).
CAS PubMed PubMed Central Google Scholar
Jaqaman, K. et al. Robust single-particle tracking in live-cell time-lapse sequences. Nat. Methods 5, 695–702 (2008).
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic Attribution for Deep Networks. In Proceedings of the 34th International Conference on Machine Learning (eds. Precup, D. & Teh, Y. W.) vol. 70 3319–3328 (PMLR, 06–11 Aug 2017).
Kokhlikyan, N. et al. Captum: A unified and generic model interpretability library for PyTorch. arXiv [cs.LG] (2020).
Freund, Y. & Schapire, R. E. Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on International Conference on Machine Learning 148–156 (Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1996).
McInnes, L., Healy, J. & Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv [stat.ML] (2018).

Download references

Acknowledgements

This work was supported by the National Institute of General Medical Sciences (R01GM115650 and R35GM139621 to JGA; T32GM007377 and 5R25GM056765 to AR) the National Heart, Lung, and Blood Institute (R01HL151983 to JGA), and the National Cancer Institute (U54CA283766). We thank Taryn Gillies for their helpful feedback on the manuscript.

Author information

Authors and Affiliations

Department of Molecular and Cellular Biology, University of California, Davis, CA, USA
Abhineet Ram, Michael Pargett, Yongin Choi, Devan Murphy, Carolyn Teragawa, Markhus Cabel, Nont Kosaisawe, Gerald Quon & John G. Albeck

Authors

Abhineet Ram
View author publications
Search author on:PubMed Google Scholar
Michael Pargett
View author publications
Search author on:PubMed Google Scholar
Yongin Choi
View author publications
Search author on:PubMed Google Scholar
Devan Murphy
View author publications
Search author on:PubMed Google Scholar
Carolyn Teragawa
View author publications
Search author on:PubMed Google Scholar
Markhus Cabel
View author publications
Search author on:PubMed Google Scholar
Nont Kosaisawe
View author publications
Search author on:PubMed Google Scholar
Gerald Quon
View author publications
Search author on:PubMed Google Scholar
John G. Albeck
View author publications
Search author on:PubMed Google Scholar

Contributions

A.R., M.P., and J.G.A. conceptualized the study, interpreted data, performed data analysis, and wrote the manuscript. A.R. and C.T. conducted imaging experiments. M.P. created the ordinary differential equation model and the scripts for gene simulations. Y.C. and G.Q. performed convolutional neural network analysis. D.M. assisted with immunoblots and image registration. M.C. assisted with cell culture. N.K. assisted with data analysis.

Corresponding author

Correspondence to John G. Albeck.

Ethics declarations

Competing interests

John Albeck has received research grants from Kirin Corporation. The other authors declare no competing interests.

Inclusion and Ethics

All collaborators have fulfilled the criteria for authorship required by Nature Portfolio and are included as authors of this study. This research is locally relevant and included local researchers throughout the entire research process including study design, study implementation, data ownership, intellectual property, and authorship of publications. Roles and responsibilities were agreed amongst collaborators ahead of the research. This research was not severely restricted or prohibited in the researchers’ setting. This work does not result in stigmatization, incrimination, discrimination or otherwise personal risk to participants, nor does it risk the health, safety, and security of the researchers. We have considered relevant local and regional research in the citations.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Reporting Summary

Description of Additional Supplementary Files

Supplementary Movie 1

Supplementary Movie 2

Supplementary Movie 3

Supplementary Movie 4

Transparent Peer Review file

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Ram, A., Pargett, M., Choi, Y. et al. Deciphering the history of ERK activity from fixed-cell immunofluorescence measurements. Nat Commun 16, 4721 (2025). https://doi.org/10.1038/s41467-025-58348-7

Download citation

Received: 10 February 2023
Accepted: 19 March 2025
Published: 21 May 2025
DOI: https://doi.org/10.1038/s41467-025-58348-7