The scale of the challenge

The 2019 Nobel Prize in Chemistry recognized John B. Goodenough, M. Stanley Whittingham and Akira Yoshino for their contributions to the development of lithium-ion batteries (LIBs), which have revolutionized energy storage. LIBs are a keystone technology for zero-emission energy systems that seek to mitigate climate change amid escalating environmental concerns. The transportation sector, a major contributor to greenhouse gas emissions (28% in the US alone)1, can achieve significant decarbonization through electrification. This need is reflected in substantial government funding allocated through initiatives like the US Bipartisan Infrastructure Law and Inflation Reduction Act. Leading automakers, including General Motors, Ford, Toyota, and Volvo, have committed to a complete transition to electric vehicles (EVs) within the next decade, further amplifying this shift2,3,4,5,6. Global EV sales are projected to reach 85 million by 20307. The transportation revolution extends beyond ground vehicles, with the unveiling of electric aerial vehicles such as the Jetson personal electric aerial vehicle8 and the Alice electric aircraft9. These commitments and innovations highlight the burgeoning demand for high-performance energy storage solutions as the automotive industry embraces a sustainable transportation future powered by advanced batteries.

While LIBs stand out as the predominant energy storage solution for EVs, their global adoption remains in its early stages. As of 2022, EVs represented a mere 6.2% of the total vehicles in the United States10. The pivotal challenges impeding the transition from internal combustion vehicles to EVs encompass accurately estimating driving range, ensuring battery longevity, addressing safety concerns, and establishing economic viability. A further obstacle is the gap between battery research conducted in academic settings and the practical implementation of these technologies in industry. Bridging this divide is crucial for advancing the evolution of EV batteries. As the demand for advanced energy storage solutions continues to surge, there is an escalating need for innovative methodologies that can translate from academic research, such as cell modeling, to practical applications at the system level. Addressing these challenges is imperative for the successful integration of EVs into mainstream transportation.

To address the challenge of EV range anxiety, predicting future battery performance is essential for estimating the potential driving distance. Intelligent learning algorithms driven by data, specifically machine learning11, offer a promising avenue for realizing these predictions. However, traditional machine learning, commonly employed in academic battery research, relies heavily on extensive training data. For example, modeling battery lifetime performance often necessitates a range of real data, spanning from fresh to aged cells12, and includes both damaged and healthy cells operating under diverse dynamic conditions. Laboratory generation of such data is not only tedious and time-consuming but also financially burdensome13. Alternatively, extracting data from the industry confronts obstacles related to data sharing and privacy, particularly when proprietary information is involved14. Furthermore, conventional machine learning models are essentially black boxes without defined physical constraints. In battery prognostics, these models may produce predictions beyond practically feasible ranges or exhibit low accuracy for unseen data15,16. The absence of physical information hampers the safety, maintenance, and applicability of such models in battery science. Unless machine learning becomes more interpretable in a physical sense, its translation into industry applications is at risk17,18,19.

Thus, although LIBs dominate the energy storage landscape, significant challenges persist in areas such as driving range, lifespan, safety, and cost, hindering widespread EV adoption. A primary obstacle is the disconnect between academic research and industrial applications. To bridge this gap and enhance battery performance, we emphasize integrating physics and machine learning methods. This perspective explores the transformative potential of combining physics and machine learning in battery research, with a focus on the modeling and control of LIBs and their application in the EV industry. Although emerging battery technologies such as Na-ion and solid-state batteries are attracting attention, this article concentrates on LIBs. However, the same methods can also be extended to other battery technologies.

Where we are and the potential

Understanding end-user requirements is pivotal in battery research. Notably, battery experiments conducted in academic laboratories frequently operate under conditions and parameters that diverge substantially from those encountered by EVs20. The pitfalls associated with the creation of real-life experimental conditions can hinder the scaling-up and manufacturing of batteries, as well as impede the seamless transfer of technology to industry. This section delves into some of the bottlenecks inherent in existing methodologies.

A critical challenge in meeting industry demands lies in the scarcity of high-accuracy and economically viable battery models, which are essential for optimizing performance, ensuring safety, and facilitating diagnosis. Existing LIB models fall broadly into two categories: physics-based and machine-learning models. Physics-based models, classified as white-box models, intricately capture the dynamics of physical processes and are commonly formulated through partial differential equations (PDEs) or ordinary differential equations. This category can be further divided into electrochemical models and equivalent circuit models (ECMs)21.

Electrochemical models, exemplified by the Doyle-Fuller-Newman or single particle models, are formulated through nonlinear partial differential algebraic equations, which makes them complex and requires the identification of numerous parameters22. The computational intensity of physics-based models becomes apparent, particularly in solving multiphysics PDEs. Moreover, simulations involving nonlinear high-order PDEs, such as the Allen–Cahn, Cahn–Hilliard, and Navier–Stokes equations, may encounter convergence challenges23. Additionally, physics-based models, grounded in first-principles analysis of idealized systems, frequently fall short in predicting the behavior of real-world systems24. On the other hand, though ECMs offer a low computational burden, they exhibit limitations in capturing crucial internal electrochemical states, such as side-reaction overpotentials and electrode surface concentration, which are vital for fast charging and power prediction. Furthermore, many of these models rely on constant parameters, rendering them often inadequate in capturing nonlinear phenomena within battery dynamics.
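To make the ECM side of this comparison concrete, the following is a minimal sketch of a first-order (single RC pair) equivalent circuit model with constant parameters. The linear open-circuit-voltage curve, all parameter values, and the function names are illustrative assumptions and are not taken from any of the cited works.

```python
import math

def ocv(soc):
    """Hypothetical linear open-circuit-voltage curve (V) versus state of charge."""
    return 3.0 + 1.2 * soc

def simulate_ecm(current, dt, capacity_ah=2.0, r0=0.05, r1=0.03, c1=2000.0, soc0=1.0):
    """Simulate a first-order ECM with constant (assumed) parameters.

    current: list of currents in A (positive = discharge)
    dt: time step in s
    Returns a list of (soc, terminal_voltage) tuples, one per input sample.
    """
    soc, v1 = soc0, 0.0
    decay = math.exp(-dt / (r1 * c1))  # exact discretization of the RC branch
    out = []
    for i in current:
        v_t = ocv(soc) - v1 - r0 * i                # terminal voltage
        out.append((soc, v_t))
        v1 = decay * v1 + r1 * (1.0 - decay) * i    # RC polarization voltage
        soc -= i * dt / (capacity_ah * 3600.0)      # coulomb counting
    return out

# 2 A (1C for the assumed 2 Ah cell) discharge for 10 minutes
trace = simulate_ecm([2.0] * 600, dt=1.0)
```

The model is cheap to evaluate, but r0, r1, and c1 stay fixed regardless of state of charge, temperature, or age, which is the constant-parameter limitation noted above, and no internal electrochemical state such as surface concentration appears anywhere in it.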

In contrast, machine learning operates as a black-box model that typically lacks the incorporation of physically meaningful information, thereby constraining its utility for accurate physical state estimation. As previously highlighted, machine learning necessitates substantial and high-quality data for effective training and validation and operates independently of the laws of physics, which occasionally yields solutions that are physically impractical. Nonetheless, the advantage of machine learning over physics-based models lies in its ability to discern patterns within measured data, particularly in cases where the underlying physical laws are not well understood25.

Table 1 compares pure physics-based and machine-learning models. Physics-based models have lower data requirements and better extrapolation and interpretability, but higher computation costs. Machine learning, by contrast, offers lower computation costs but has higher data requirements and worse extrapolation and interpretability. Based on this comparison, we are motivated to explore the prospects of combining physics and machine learning to compensate for their respective weaknesses while retaining the strengths of each approach.

Table 1 Comparison of physics-based and machine-learning models

Why can integrating physics and machine learning help?

The integration of physics and machine learning proves advantageous for battery management due to the essential roles played by both disciplines. Managing batteries poses a real engineering challenge, requiring consideration of multiple factors simultaneously, including accuracy, robustness, computation cost, deployment cost, and more. Consequently, it becomes imperative to leverage available information in an optimized manner to address this multifaceted problem effectively26.

We illustrate the inter-correlation of resolved, modeled, and observed physics with respect to physics-based and machine-learning models in Fig. 1. Physics-based models, grounded in the first principles of LIB mechanisms, offer notable explainability and generalization capability27. However, enhancing the physics-based understanding of battery dynamics remains a formidable challenge, especially since the resolved physics is considerably limited. Although battery dynamics can be described by modeled physics, such as the pseudo-two-dimensional (P2D) model, such models rely on PDEs with constrained capability and high computation costs. To alleviate the complexity of battery modeling, numerous assumptions are introduced28. Moreover, the parameterization of the P2D model through experimental methods typically requires several months and yields limited accuracy due to measurement errors and the inherent constraints of the P2D model29,30. In addition, much of the observed and unobserved physics is still difficult to model due to its complexity and the limitations of current research.

Fig. 1: Limitations of pure physics-based models and machine learning models.

Machine learning models derive their strength from extensive battery data obtained under both laboratory and real-application conditions, coupled with diverse model architectures31. As shown in Fig. 1, in an ideal scenario, a perfectly trained and tuned machine-learning model has the potential to encompass the entire physics of battery behavior. However, labeling battery data remains labor-intensive and costly, even under ideal laboratory conditions, and there is still no machine learning architecture that is fully cognizant of battery mechanisms32,33. Real-time extraction of limited labels, such as state of charge, capacity, open circuit voltage, and resistance, is possible in laboratory conditions with expensive measurements involving voltage, current, and temperature. However, these labels only represent external battery performance, leaving the internal physics mechanisms unaccounted for in real time34. For batteries in real-world applications, like EVs and energy storage systems, labeling becomes even more challenging than in laboratory conditions, as periodic check-up tests are impractical35. In battery management research, general machine learning models like long short-term memory neural networks and convolutional neural networks are extensively used. However, a distinctive architecture based on the unique mechanisms of batteries is yet to be fully realized.

By synergizing physics and machine learning, we can harness the complementary strengths of both models, leading to a substantial enhancement in battery management research. This paper offers a concise overview of the integration of physics and machine learning, aiming to shed light on its potential to address prevailing challenges in battery research. The integration of physics and machine learning emerged as a viable solution to energy storage system (ESS) prediction challenges around 202036. A noteworthy example is the work by Sendek et al. in 202337, which estimated the time required to determine the ionic conductivity of 18,000 Li-ion-containing materials. In a comparative analysis, considering present-day technology, the estimated times for experimental, physical model, and integration of physics and machine learning approaches were found to be 1500 years, 700 years, and one month, respectively. This striking comparison underscores the accelerated prediction speed offered by the integration of physics and machine learning, surpassing the efficiency of either physics or machine learning alone.

In 2021, Karniadakis et al.38 introduced the innovative approach of crafting machine learning algorithms that incorporate physical information, aiming to mitigate the arduous task of training networks with vast amounts of data. This methodology capitalizes on leveraging the insights offered by physical laws, advocating for the integration of physical models into machine learning frameworks. Notably, this approach is applicable even in scenarios involving partially understood and uncertain systems, demonstrating scalability to address large-scale problems. In practical terms, where retrieving and cataloging data from experiments or real-life operations incurs substantial costs, conventional machine learning becomes a less favorable option for performance modeling.

The integration of physics and machine learning provides a computationally efficient alternative to high-fidelity simulations, mitigating computational costs. Machine learning exhibits the capability to analyze a broad spectrum of degradation mechanisms and operating conditions, capturing infrequent loading events often overlooked by simplified physics-based simulations39. To overcome this limitation, the amalgamation of physics and machine learning has garnered attention. In April 2021, Nature Machine Intelligence presented machine learning as a promising tool for estimating battery degradation39. Given its advantages, the battery research community is actively exploring integration strategies to delve into dynamic aspects such as degradation and state of health40,41,42, thermal runaway and safety features23, internal potential predictions42, and lithium diffusivity43. A crucial challenge is accurately replicating real-world battery usage in experiments. This necessitates meticulous testing with precise charging and resting protocols that mimic EV cycling schedules44,45,46. This process can be time-consuming, spanning months to years, particularly when considering calendar aging. For instance, creating a dataset of 124 lithium iron phosphate cells for predicting early failures during fast charging can take several months47. In the context of second-life applications, making retired EV batteries suitable for a circular economy involves reusing, remanufacturing, repurposing, or recycling them for commercial availability at a significantly lower price48. Achieving accurate and realistic battery lifespan predictions under diverse conditions, ranging from temperature variations to material modifications, poses a formidable challenge for smaller, individual academic research laboratories.

In contrast to standalone physics or machine learning approaches, the integration of physics and machine learning offers several advantages in terms of enhanced interpretability and explainability. These advantages, coupled with limited data requirements and lower computational costs, serve to bridge the gaps between industry and academia. Through enhanced generalization and transferability, this integration can leverage inexpensive simulation data from physics to enhance machine learning estimation accuracy. This facilitates low-cost data acquisition and successful calibration, even with a limited number of observations, thus addressing data generation limitations in laboratories or restrictions in accessibility to experimental data from industry. Effective knowledge transfer between academic research and industrial applications hinges on the exchange of data and information related to real-life battery usage mechanisms. A critical gap exists in the systematic collection, standardization, and accessibility of experimental battery data. Integrated cyber-physical systems, often referred to as digital twins, play a pivotal role in building a robust database of open-sourced meaningful data. This initiative aims to bridge the gap between academia and industry with applications encompassing the diagnosis of failure mechanisms, cost-effective computation, optimized manufacturing protocols, shortened product development periods, scalability, and contributions to the circular economy.

How do we integrate physics and machine learning?

The integration of physics and machine learning is classified into two distinct categories: internal integration and external integration, represented by Fig. 2a. As illustrated in Fig. 2b, the physics-informed neural network represents a typical example of internal integration. It effectively enhances the model architecture and training efficiency of the machine learning model by integrating battery physics into the model structure and loss equations. As depicted in Fig. 2c, examples of external integration include machine learning-based parameterization of physics-based models, the generation of data for machine learning through physics-based models, and the acceleration of physical model aging using machine learning. In essence, internal integration aims to overcome the limitations of one method by leveraging the strengths of the other, while external integration seeks to combine the inherent advantages of both methods. To provide a comprehensive understanding, we will briefly review instances of the integration of physics and machine learning in battery health and safety management, highlighting the intricacies of their combination.

Fig. 2: How to integrate physics and machine learning.

a General integration framework; b representative examples of internal integration; c representative examples of external integration.
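The internal integration described above, in which battery physics enters the loss equations, can be sketched as a composite loss that adds a physics residual to the usual data misfit. The example below is only an illustration under stated assumptions: the governing equation is the RC-branch ODE of a simple ECM rather than a full electrochemical model, derivatives are taken by finite differences instead of the automatic differentiation a real physics-informed neural network would use, and the function name, parameter values, and weighting factor lam are hypothetical.

```python
import math

def physics_informed_loss(pred, t, current, data, r1=0.03, c1=2000.0, lam=1.0):
    """Composite loss = data misfit + lam * physics residual.

    pred:    model predictions of the RC polarization voltage v1 at times t
    current: applied current (A) at times t
    data:    dict {index: measured v1} for the few labelled points
    The physics term penalizes violation of the assumed ODE
    dv1/dt = -v1 / (r1 * c1) + i / c1, using forward finite differences.
    """
    data_loss = sum((pred[k] - v) ** 2 for k, v in data.items()) / max(len(data), 1)
    residuals = []
    for k in range(len(t) - 1):
        dvdt = (pred[k + 1] - pred[k]) / (t[k + 1] - t[k])
        rhs = -pred[k] / (r1 * c1) + current[k] / c1
        residuals.append((dvdt - rhs) ** 2)
    physics_loss = sum(residuals) / len(residuals)
    return data_loss + lam * physics_loss

# Compare a physically consistent prediction with an unphysical one.
t = [float(k) for k in range(120)]
i_app = [2.0] * 120
tau = 0.03 * 2000.0
exact = [0.03 * 2.0 * (1.0 - math.exp(-tk / tau)) for tk in t]
labels = {0: exact[0], 60: exact[60]}          # only two labelled samples
loss_physical = physics_informed_loss(exact, t, i_app, labels)
loss_unphysical = physics_informed_loss([0.0] * 120, t, i_app, labels)
```

Because the physics term is evaluated at every time step and not only at the two labelled points, it constrains the prediction where no measurements exist, which is how this kind of loss reduces the amount of labelled data needed.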

Battery health management

Effective battery health management, encompassing diagnosis, prognosis, and optimization, is paramount for enhancing efficiency and reliability throughout the battery’s entire life cycle49. Real-time improvement in battery health management is achievable by harnessing the potential of both physics and machine learning. In Table 2, we present a comprehensive summary of representative papers focusing on battery health management based on the integration of physics and machine learning. This table provides classification and detailed insights into the strategies employed to seamlessly integrate physics and machine learning for superior battery health management.

Table 2 Representative papers on the integration of physics and machine learning for battery health management

In diagnosing battery health states, Weddle et al.50 constructed a P2D model to generate synthetic data for machine learning, validating their method with experimental data. Thelen et al.51 proposed a framework relying solely on early-life experimental data for comprehensive battery health diagnosis across the entire life cycle. To enhance machine learning training, late-life battery data were simulated using a half-cell model. Similarly, Hofmann et al.52 suggested a battery health diagnosis method, employing fused experimental data and P2D model-simulated data for machine learning training. Lin et al.53 developed a framework that integrates an electrochemical impedance spectroscopy-based model with machine learning for battery health diagnosis, utilizing both electrochemical impedance spectroscopy and ECM parameters as inputs for the machine learning model. Additionally, Li et al.54 pioneered a battery digital twin for diagnosing capacity, resistance, and aging modes, leveraging field data, impedance-based models, and artificial intelligence. The parameters of the open circuit voltage-coupled ECM were identified based on battery charging data using an optimization algorithm throughout the full life cycle. Based on their previous work on battery aging diagnosis using physics-based methods55,56,57, Dubarry et al.58 proposed a battery digital twin. It is derived from field data and physics models and utilizes the generated synthetic dataset to train machine learning for comprehensive battery health diagnosis in field applications. Kim et al.59 proposed a deep learning framework with synthetic data generated by a physics-based method, diagnosing battery aging at the degradation-mode level with improved performance.
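The common pattern in several of these studies, calibrating a physics-based model on the data that exist and then simulating the data that do not, can be illustrated with a deliberately simple sketch. It assumes a square-root-of-cycle-number capacity fade law, Q(n) = 1 - k*sqrt(n), as a stand-in for the P2D or half-cell models used in the cited works; the fade law, the function names, and all numbers are assumptions for illustration only.

```python
import math
import random

def fit_sqrt_fade(cycles, capacities):
    """Least-squares fit of k in the assumed fade law Q(n) = 1 - k * sqrt(n)."""
    num = sum(math.sqrt(n) * (1.0 - q) for n, q in zip(cycles, capacities))
    den = sum(cycles)  # sum of sqrt(n)**2
    return num / den

def synthesize_late_life(k, cycles, noise=0.002, seed=0):
    """Generate synthetic (cycle, capacity) pairs from the calibrated model."""
    rng = random.Random(seed)
    return [(n, 1.0 - k * math.sqrt(n) + rng.gauss(0.0, noise)) for n in cycles]

# Early-life "experimental" data (here generated noise-free for the demo).
early_cycles = list(range(10, 101, 10))
early_caps = [1.0 - 0.005 * math.sqrt(n) for n in early_cycles]

k_hat = fit_sqrt_fade(early_cycles, early_caps)
synthetic = synthesize_late_life(k_hat, range(200, 1001, 100))

# Augmented training set: measured early life plus simulated late life.
training_set = list(zip(early_cycles, early_caps)) + synthetic
```

Any regression or neural network model could then be trained on training_set; the quality of the result depends on how well the physics model extrapolates, which is why the cited works validate against experimental data.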

In the realm of prognosis, leveraging the internal integration of physics and machine learning, Flores et al.60 introduced a physics-constrained neural network for predicting battery remaining useful life (RUL). This approach involved extracting a neural differential operator from the initial 100 cycles of aging data. Nascimento et al.61 integrated physics into the neural network cell to enhance the architecture for battery modeling and prognosis. Similarly, Shi et al.40 enhanced the long short-term memory architecture by incorporating a physics-based aging model for effective degradation modeling and RUL prediction. Wang et al.62 devised a neural network architecture based on the ECM for battery modeling and prognosis. For external integration of physics and machine learning, Zhang et al.63 extracted machine learning features from the ECM parameters to prognosticate battery life. Ma et al.64 developed a versatile physics model to inform a machine-learning model for predicting battery RUL. Other works65,66 extracted hybrid features utilizing an electrochemical model and measurable data. Furthermore, in the domain of physics and machine learning-based health optimization, Wei et al.67 pioneered a strategy for optimized fast charging. This involved estimating and controlling the internal states of the P2D model with machine learning, which not only reduced charging time but also effectively limited health costs. Although the aforementioned works concentrate on the battery cell, they can also be applied to battery modules and packs. Since battery modules and packs are made up of battery cells connected in series and parallel, a few representative cells can be used to manage both modules and packs. Meanwhile, by combining physics with machine learning, the representative cells in a module or battery pack can be identified using electrode-level parameters. Additionally, machine learning can address the challenge of cell state/parameter non-uniformity across the pack. This, in turn, will be beneficial for battery control, e.g., fast charging technologies that do not lead to lithium plating and additional lifetime loss.

Battery safety management

In LIBs, a critical safety concern is thermal runaway68. This hazardous event arises from internal short circuits, resulting in an uncontrollable increase in cell temperature, potentially leading to combustion. The irreversible phenomenon of battery explosion poses a profound safety risk in industrial applications. Finegan et al.69 present a comprehensive set of perspectives on the integration of physics and machine learning for predicting battery safety. In this context, we provide a summary of representative papers that have successfully integrated physics and machine learning for battery safety management in Table 3.

Table 3 Representative papers on the integration of physics and machine learning for battery safety management

In the domain of thermal behavior modeling or monitoring, Mesgarpour et al.70 innovatively devised a pattern-based machine learning model for assessing battery thermal behavior. This approach seamlessly integrates pattern recognition and physics-constructed training into the machine learning framework. Boonma et al.71 predicted battery thermal behavior by utilizing a machine-learning model with physics-based loss functions. Pang et al.72 introduced an approach for heat generation rate estimation based on a machine learning model, incorporating additional features derived from a physics-based model. Wei et al.73 proposed a thermal modeling and diagnosis method employing machine learning, with inputs derived from a physics-based lumped thermal model and measurable signals. Cho et al.74 suggested a machine learning-based battery temperature estimation method with augmented input from the physics-based model. Zheng et al.75 proposed a sensorless temperature estimation method based on a physics-based thermal model and ECM-augmented features with machine learning. Yang et al.76 developed a machine learning-based battery thermal model under an external short circuit with a physics-based loss function.
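Several of the approaches above feed the output of a physics-based thermal model into a machine learning model as an additional input. A minimal sketch of that idea follows, assuming a lumped thermal balance m*cp*dT/dt = I^2*R - hA*(T - T_amb) integrated with forward Euler. The parameter values and function names are hypothetical and do not reproduce the model of any cited paper.

```python
def lumped_thermal(current, dt, r_int=0.05, mass_cp=90.0, h_a=0.2, t_amb=25.0, t0=25.0):
    """Forward-Euler integration of m*cp*dT/dt = I^2*R - hA*(T - T_amb).

    current: list of currents in A, dt: time step in s.
    Returns the physics-based temperature estimate (deg C) at each sample.
    """
    temp, out = t0, []
    for i in current:
        out.append(temp)
        temp += dt * (i * i * r_int - h_a * (temp - t_amb)) / mass_cp
    return out

def augment_features(voltage, current, dt):
    """Stack measured signals with the physics-based temperature estimate."""
    t_phys = lumped_thermal(current, dt)
    return [[v, i, tp] for v, i, tp in zip(voltage, current, t_phys)]

temps = lumped_thermal([2.0] * 5000, dt=1.0)
features = augment_features([3.7] * 5, [2.0] * 5, dt=1.0)
```

Each feature row now holds voltage, current, and a model-based temperature, so a downstream learner only has to correct the residual error of the lumped model instead of learning the thermal dynamics from scratch.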

To model and predict battery thermal runaway, Kim et al.23 introduced a multiphysics-informed neural network, shaping the training process with multiphysics-based loss functions. Goswami et al.77 expanded the input parameters for machine learning in the prediction of thermal runaway, utilizing the simulated image of battery temperature generated by the physics model as a key input feature. Additionally, Chen et al.78 developed an early detection method of battery lithium plating based on a machine learning framework and physics-based electrochemical signatures. Firoozi et al.79 proposed a fault detection algorithm by seamlessly integrating physics and machine learning. The ultimate decision was derived from a collective assessment of outputs from both the physics model and machine learning components.
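A decision derived from the outputs of both a physics model and a machine learning component can take many forms. The sketch below shows one simple possibility, a weighted combination of a normalized voltage residual and a classifier probability. It is not the algorithm of Firoozi et al.; the scoring rule, thresholds, and names are assumptions chosen only to illustrate the idea of a collective assessment.

```python
def fused_fault_decision(v_measured, v_model, ml_fault_prob,
                         residual_limit=0.05, w_physics=0.5, threshold=0.5):
    """Combine a physics residual and an ML probability into one fault flag.

    v_measured:    measured terminal voltage (V)
    v_model:       voltage predicted by the physics model (V)
    ml_fault_prob: fault probability from a machine learning classifier, in [0, 1]
    Returns (score, is_fault).
    """
    # Physics evidence: model-measurement mismatch scaled to [0, 1].
    physics_score = min(abs(v_measured - v_model) / residual_limit, 1.0)
    score = w_physics * physics_score + (1.0 - w_physics) * ml_fault_prob
    return score, score >= threshold

healthy = fused_fault_decision(3.70, 3.69, ml_fault_prob=0.1)
faulty = fused_fault_decision(3.50, 3.70, ml_fault_prob=0.9)
```

Requiring agreement between two independent sources of evidence is one way to reduce false alarms from either component alone; the weight w_physics would need to be tuned on labelled fault data.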

Future directions and perspectives

Based on our systematic survey and analyses of the existing challenges on battery health and safety, we identify the following perspectives of research in battery technology through the synergy of physics and machine learning.

Internal integration

While substantial progress has been made in improving machine learning with battery physics, refining the architecture and training efficiency is currently in a nascent stage. A uniformly accepted machine learning model with a physics-based architecture is yet to be fully developed, necessitating improvement in the internal architecture to align with the genuine internal structure of batteries. Enhancing training efficiency is conceivable with a physics-based machine learning architecture, ensuring that input data flow cohesively within the machine learning model, akin to current flow in real batteries. The cost function for training machine learning warrants refinement, considering accuracy, robustness, and cost across multiple dimensions.

The modeled physics, while not perfect, can be effectively improved with machine learning, addressing the limitations and assumptions inherent in physics-based modeling, such as the P2D model. Machine learning can efficiently utilize measured data to enhance physics-based modeling and mitigate the limitations of PDEs. By leveraging machine learning, the identification of particle number and size, as well as capturing particle dynamics during usage, can be improved for advanced battery management.

External integration

A promising avenue involves integrating machine learning with fractional-order physics to enhance the realism of physical knowledge. Fractional-order models exhibit superior performance in capturing diffusion dynamics and charge transfer reactions, especially in the mid- and low-frequency regions, overcoming the limitations of conventional integer-order models based on ordinary differential equations (ODEs) and PDEs80,81. As these models show promise, a further research direction is to improve their applicability and understanding. In the realm of model parameter identification, strategies to determine optimum weights are yet to be well-defined, urging the need for continuous exploration and refinement. Moreover, the integration of physics and machine learning calls for an appropriate weighting method to balance multi-task losses during machine learning training, emphasizing the importance of methodical approaches for model parameter tuning82.
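The need to balance multi-task losses arises because data, physics, and boundary terms often differ by orders of magnitude. One simple heuristic, shown here only as an illustration and not as a method proposed in the cited literature, is to scale each term by the inverse of its initial value so that all terms contribute equally at the start of training. The loss names and values are made up.

```python
def balanced_weights(initial_losses, eps=1e-12):
    """Weights that make every loss term contribute equally at initialization.

    initial_losses: dict {term_name: loss value at the start of training}
    """
    n = len(initial_losses)
    return {name: 1.0 / (n * max(val, eps)) for name, val in initial_losses.items()}

def total_loss(losses, weights):
    """Weighted sum of the individual loss terms."""
    return sum(weights[name] * losses[name] for name in losses)

initial = {"data": 1e-3, "physics": 1e-6, "boundary": 10.0}
weights = balanced_weights(initial)
start_value = total_loss(initial, weights)
```

With these weights the physics term, which is numerically tiny here, is not drowned out by the boundary term. More elaborate schemes adapt the weights during training, and choosing among them remains the open question the paragraph above points to.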

Aging-related features can be efficiently extracted using physical models, enabling the rapid training of machine learning models without the need for extensive failure data to predict potential safety and health hazards13. To predict RUL, recalibrating aging parameters using discharge curves obtained at different battery ages is typically required. This process, however, poses challenges for real-life deployments due to the need for numerous discharge curves. A potential solution could involve proposing a methodology to update aging parameters during deployment, eliminating the need for repeated calibration. The hybrid model can learn new aging parameters as the battery operates in the field, preventing operational disruptions. The integration of physics and machine learning becomes instrumental in forecasting unseen battery failures, such as faulty tabs, foreign debris, welding burrs, misplacement of electrodes and current collectors, and weakening of the separator (e.g., thermal runaway). These physical insights into battery behavior are embedded in machine learning, allowing for accurate predictions of cell failure risks, including issues like bloating, expansion or contraction, and cell hardening due to electrolyte depletion. This integrated approach is paramount for ensuring battery safety and reliability.
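Updating aging parameters during deployment, without repeated offline calibration, could be approached with a recursive estimator. The following sketch uses scalar recursive least squares with a forgetting factor to track the coefficient k of an assumed fade law, capacity loss = k*sqrt(n). The fade law, the class name, and the numbers are illustrative assumptions and not a method described in the cited works.

```python
import math

class ScalarRLS:
    """Recursive least squares for y = theta * x with forgetting factor lam."""

    def __init__(self, theta0=0.0, p0=1e6, lam=0.99):
        self.theta, self.p, self.lam = theta0, p0, lam

    def update(self, x, y):
        """Fold in one new observation (x, y) and return the updated estimate."""
        gain = self.p * x / (self.lam + x * self.p * x)
        self.theta += gain * (y - self.theta * x)
        self.p = (self.p - gain * x * self.p) / self.lam
        return self.theta

# Field observations arrive one at a time as the battery ages.
estimator = ScalarRLS()
true_k = 0.005  # used only to generate demo data
for n in range(100, 600, 100):
    capacity = 1.0 - true_k * math.sqrt(n)
    estimator.update(math.sqrt(n), 1.0 - capacity)
```

Each update costs a handful of arithmetic operations and needs no stored history, which suits onboard use. The forgetting factor lets the estimate drift if the degradation rate changes over the battery's life.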

Battery data in sustainable value chain

By embedding smart sensors in battery cells, battery data can be monitored and collected at the electrode level. This would benefit the observation of physics inside the battery, thereby improving the understanding of battery aging mechanisms and enhancing battery management with physics and machine learning. The concept of the battery passport83 is effective for improving battery management and control over the full life span, and it can be facilitated by integrating physics and machine learning. The information recorded in the battery passport can be enriched with battery data from the whole life cycle, and the combination of physics and machine learning can give insight into battery behavior. For the battery passport to be feasible, the most important characteristics of battery cells need to be extracted from large amounts of data, i.e., voltage, current, and temperature, to reduce the required memory resources while retaining the key information for second-life usage or recycling. Battery data genomes84 are critical to building sustainable economic systems. However, they are restricted by conflicts of interest and confidentiality concerns in industry. By building universal battery management functionalities in the cloud with physics and machine learning, users and industries can download the generalized model, fine-tune its parameters with their confidential data according to their battery chemistry and application scenarios, and, if they choose, upload the updated parameters to improve the universal battery digital twin. In this way, confidential data can be protected, with only some model parameters being shared for the improvement of battery management.
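The download, fine-tune, and upload cycle described above resembles federated averaging, in which only model parameters leave each participant and the raw data stay local. Reading the proposal this way is our interpretation, not a term used in the text. A minimal sketch of the server-side aggregation step follows; the parameter names and sample counts are hypothetical.

```python
def federated_average(client_updates):
    """Sample-weighted average of locally fine-tuned parameters.

    client_updates: list of (num_samples, {param_name: value}) tuples,
    one per participant. Only these parameters are shared; raw battery
    data never leave the participant.
    """
    total = sum(n for n, _ in client_updates)
    names = client_updates[0][1].keys()
    return {
        p: sum(n * params[p] for n, params in client_updates) / total
        for p in names
    }

updates = [
    (100, {"r0": 0.05, "fade_k": 0.004}),  # e.g. a fleet operator
    (300, {"r0": 0.07, "fade_k": 0.006}),  # e.g. a cell manufacturer
]
global_params = federated_average(updates)
```

Weighting by sample count gives participants with more data more influence on the shared model. Whether shared parameters could still leak proprietary information is a separate question that this sketch does not address.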

Battery digital twin

As depicted in Fig. 3, the integration of physics and machine learning for the realization of a battery digital twin in the cloud holds the potential to unlock key functionalities in advanced battery health and safety management. This integration aims to enhance safety and reliability, prolong battery lifetime, enable optimal fast charging without health deterioration, achieve cost savings, and elevate residual values.

Fig. 3: Physics and machine learning for advanced battery health and safety management.

Integration of physics and machine learning for real-time prediction and estimation in onboard battery management is crucial for achieving lifelong learning. This approach goes beyond simply implementing learned knowledge, extending to the continuous discovery and adaptation of new behavior throughout the entire battery lifetime. Digital twins85,86,87,88, representing digital replicas of an energy storage system based on real-life data, play a pivotal role in accurate state estimations, covering aspects such as charge, remaining discharge energy89, power, health, or safety90,91. A promising application of the integration of physics and machine learning is in the realm of digital twins as intelligent cloud platforms for the health prognosis of LIBs. Establishing the infrastructure for storing meaningful battery data is essential to support battery models and accelerate the adoption of new battery chemistries. These digital twins can serve as virtual testbeds, particularly beneficial for integration into larger energy management systems, such as smart grids or renewable energy installations. They facilitate the anticipation of maintenance needs, optimization of scheduling, and minimization of downtime. Both physics and machine learning contribute to building digital twins that effectively meet industrial demands in a cost-efficient manner91.

Conclusions

This paper provides a comprehensive review of recent literature concerning the integration of physics and machine learning for the purpose of battery health and safety management. In doing so, we have identified promising areas where this integration has the potential to substantially enhance battery performance. Our discussion extends to future directions and perspectives, elucidating how the fusion of physics and machine learning can advance battery management practices. This symbiosis opens up novel avenues in battery technology, fostering a profound comprehension of battery physics, precise prediction of performance metrics, real-time control, optimization, and informed decision-making, thereby elevating safety measures to new heights.

The integration of physics and machine learning introduces a transformation in battery technology, offering intelligent energy storage management and optimizing battery architectures. The improved modeling, prediction, and reliability achieved through this integration are poised to redefine the landscape of battery applications. These perspectives extend beyond theoretical realms, reaching practical domains where this integration can revolutionize diverse applications, including electric vehicles, electric aircraft, power grids, portable electronics, and beyond. This work advocates for leveraging the potential combination of physics and machine learning in energy storage technology to propel us toward a cleaner, greener, and more sustainable future.