Introduction

With the rapid development of China’s economy, the chemical industry, one of the pillar industries of the national economy, has achieved remarkable growth. However, the production safety situation in the chemical industry remains severe, and safety accidents occur frequently. For instance, the “8·12” extraordinarily severe fire and explosion accident at the dangerous goods warehouse of Ruihai Company in Tianjin Port in 2015 resulted in 165 fatalities, 8 people missing, and 798 injured, with direct economic losses of RMB 6.866 billion. The “3·21” catastrophic explosion accident in Xiangshui, Jiangsu Province, in 2019 caused 78 deaths and 76 serious injuries, as well as severe pollution of the atmosphere and water bodies, with direct economic losses of RMB 1.986 billion. Such accidents pose a huge threat to people’s lives, property, and the environment. It is therefore important to analyze quantitatively the risk factors that contribute to chemical safety accidents, to obtain importance values that jointly reflect the occurrence probability and the consequences of each factor, and, on that basis, to manage and control the critical factors so as to reduce accident risk.

Building an accident evolution model that accounts for the coupling relationships among risk factors is an important method for quantitative risk analysis and evaluation, and many scholars have studied it1,2,3. Pan et al.4 proposed a risk analysis model based on Bayesian networks (BNs). By combining fuzzy set theory with the noisy OR-gate model, they addressed the difficulty of obtaining conditional probability parameters and the inaccuracy of assessment results in traditional BNs. Babaleye and Kurt5 proposed an integrated safety analysis approach for the decommissioning risk analysis of offshore installations. The approach combines hierarchical Bayesian analysis with a Bayesian network to assess the accident causes leading to a futile decommissioning operation. Güler et al.6 used a Bow-Tie based fuzzy Bayesian network method to analyze the risk of tank cleaning operations. Wang et al.7 established a reliability prediction model for emergency evacuation of offshore platforms based on the K2 structure learning algorithm and a Bayesian network parameter learning method. Their findings can provide insights for the development of cost-effective Escape, Evacuation, and Rescue (EER) strategies for offshore platforms.

The accident evolution models above are built on Bayesian networks, and their modeling process ignores the influence of low-probability risk factors. In chemical safety accidents, however, low-probability risk factors may lead to serious consequences. Therefore, this paper uses a complex network to establish an accident evolution model that considers all risk factors.

Some scholars have also studied the risk factors involved in chemical accidents. Xiang et al.8 investigated the risk coupling characteristics of chemical safety accidents by classifying and identifying the risk factors associated with such accidents. They analyzed the coupling patterns of these risk factors and developed an N-K model based on complex networks. The results indicated that management and equipment technology factors are key coupling factors, suggesting the need for enhanced management and equipment maintenance. Maduabuchi et al.9 analyzed the immediate causes, prerequisites, and underlying failures of accidents in order to identify the key risk factors. Their results indicated that the primary risk factors are design defects, site process safety management, and barrier management/control. Soltanzadeh et al.10 proposed an integrated approach to assess safety and security risks in chemical process industries. The validity and reliability of the taxonomy were assessed using the Delphi method, involving Subject Matter Experts (N = 25), and through statistical analysis. Subsequently, the Fuzzy Analytical Hierarchy Process was employed to determine the importance level of each contributing factor and dimension, thereby enabling the integration of safety and security risk levels. The findings of that study highlighted the dimensions and contributing factors that have the most significant impact on the integrated safety and security risk level.

The aforementioned studies primarily consider the impact of material, machinery, and management factors on chemical accidents, and give less consideration to human and environmental factors. Chemical production involves a wide range of risk factors. Most of the materials and products of chemical enterprises are hazardous (flammable, explosive, or toxic), and the equipment itself operates under hazardous conditions such as high pressure and extreme temperatures, which can easily result in production safety accidents11,12. In addition to these intrinsic safety factors, chemical safety production is also affected by external factors such as personnel and the environment13,14. For example, illegal operations by personnel can pose safety hazards15, and extreme weather conditions such as high or low temperatures can affect the performance of individual equipment16. The impact of these factors on safety risks cannot be ignored. Therefore, this paper presents a complex-network-based method for the quantitative analysis of chemical accident risk that considers human, environmental, and other factors.

In the present work, we first constructed a safety risk network for chemical enterprises based on complex network theory. In this network, the nodes are the extracted risk factors and accident results, and the edges are the causal relationships between factors and between factors and results. Subsequently, a method for calculating node risk thresholds and dynamic risk values, which comprehensively considers multiple factors, was designed to deduce the evolution mechanism of chemical accidents. On this foundation, an evolution model for work safety accidents in chemical enterprises was developed to simulate the actual development process of chemical accidents. Furthermore, quantitative evaluations of the importance of the risk factors were obtained, and specific control measures were formulated. Finally, the rationality of the proposed evolution model and the accuracy of the quantitative analysis of risk factors were validated through practical cases of chemical accidents.

Methods

As shown in Fig. 1, the research framework covers the causal evolution and quantitative analysis of external risk factors in chemical production accidents using complex networks. It includes the identification and extraction of external risk factors, the construction of the Chemical Enterprise Safety Risk Network (CESRN), and the simulation and analysis of the evolution of that network. The data for this study come from 481 representative safety production accident records of 30 hazardous chemical enterprises in China’s chemical industry and hazardous chemical field from 2010 to 2022. By analyzing the textual data of the accidents, we can examine the multiple causal factors in each record and determine the relationship between these causal factors and typical refining equipment safety production accidents. Complex network models are sensitive to data skewness, so it is important to ensure that the accident dataset is not skewed. Based on the year of occurrence and the type of accident result in Table 1, the Shapiro-Wilk test (p-value 0.734 > 0.05) indicates that the input dataset is not skewed.

Fig. 1 Analytical framework for this study.

Table 1 481 representative safety production accident records.

Extraction risk factors

The Cognitive Reliability and Error Analysis Method (CREAM)17 is used to extract the risk factors and accident chains that contribute to chemical accidents. The first step is to determine the antecedent category by referring to the “Error Mode Precedent Table” of Shen et al.18 and to GB/T 13861-2022 Classification and Code for hazardous and harmful factors in the procedure. In the next step, the identified antecedents are treated as consequences and compared with the “Consequence Precedent List” of Shen et al.18 to identify the corresponding general and specific antecedents. The general antecedents obtained are in turn treated as consequences, and the above steps are repeated until the initial cause of the accident is identified.

First, 10% of the safety production accident records were randomly selected and their accident chains were extracted. The accident chains were then normalized with CREAM, followed by expert review and consistency training. This method has been shown to produce objective and representative risk factors and accident chains when the pass rate exceeds 80%19. The risk factors and accident chains were then extracted from all 481 safety production accident records, and the above process was repeated until the consistency check was passed, which yielded the Typical Chemical Safety Production Accidents Risk Factors Database. The accident descriptions contained 24 Human Factors (HF), 17 Material and Machine Conditions (MC), 7 Management Factors (MF), 20 Environmental Conditions (EC), and 19 Accident Factors (A). Details are provided in Table 2.

Table 2 Four types of risk factors and accidents.

Network construction

Chemical processes are nonlinear and complex20. With the rise of complex network theory in recent years, it has become increasingly popular for constructing accident causation networks. Many scholars have used this method to simulate complex dynamic accident processes and explore the major causal factors in accidents21,22,23,24,25,26. Therefore, we construct a safety risk network for chemical enterprises based on complex network theory and define dynamic risk values and a risk transmission mechanism on it, thereby establishing an evolutionary model of the CESRN.

Network structure

The safety risk network for chemical enterprises is defined as G, with adjacency matrix \(M(m_{ij})_{n\times n}\), where n represents the total number of nodes in the network. In this network, the nodes are the extracted risk factors and accident results, the edges are the causal relationships between factors and between factors and results, and the edge values are the strengths of the correlation between the connected nodes.

$$\begin{aligned} \begin{aligned} M&=\begin{bmatrix}m_{1,1}& m_{1,2}& \cdots & m_{1,n}\\ m_{2,1}& m_{2,2}& \cdots & m_{2,n}\\ \vdots & \vdots & \vdots & \vdots \\ m_{n,1}& m_{n,2}& \cdots & m_{n,n}\end{bmatrix} \\ m_{ij}&=w_{ij}e_{ij} \\ w_{ij}&=\frac{m_{ij}}{m_i+m_j-m_{ij}} \\ e_{ij}&={\left\{ \begin{array}{ll}1,i\Rightarrow j \text { Connected in the accident chain}\\ 0,i\Rightarrow j\text { Not connected in the accident chain}\end{array}\right. } \end{aligned} \end{aligned}$$
(1)

\(m_{ij}\), the entry of M, is the connection status from node i to node j. \(w_{ij}\) is the co-occurrence rate of the edge from node i to node j. In the expression for \(w_{ij}\), \(m_i\) and \(m_j\) represent the frequencies of occurrence of nodes i and j, and the \(m_{ij}\) on the right-hand side is the co-occurrence frequency of nodes i and j, not the matrix entry. \(e_{ij}\) indicates whether node i is connected to node j: it is 1 if connected and 0 otherwise.
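Formula (1) can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each accident record has already been reduced to one ordered accident chain of node identifiers, that an edge exists between consecutive nodes of a chain, and that occurrence and co-occurrence frequencies are counted once per record. The function name `build_weighted_adjacency` and the data layout are illustrative choices, not part of the original study.

```python
from collections import Counter
from itertools import combinations


def build_weighted_adjacency(chains):
    """Return (nodes, M) with M[i][j] = w_ij * e_ij, following formula (1).

    Assumption: `chains` is a list of ordered lists of node IDs, one per record.
    """
    nodes = sorted({node for chain in chains for node in chain})
    index = {node: k for k, node in enumerate(nodes)}
    occur = Counter()     # occurrence frequency of each node (m_i)
    co_occur = Counter()  # co-occurrence frequency of each unordered node pair
    edges = set()         # directed pairs (i, j) with e_ij = 1
    for chain in chains:
        present = set(chain)
        occur.update(present)
        co_occur.update(frozenset(p) for p in combinations(sorted(present), 2))
        edges.update(zip(chain, chain[1:]))  # consecutive nodes are connected
    size = len(nodes)
    M = [[0.0] * size for _ in range(size)]
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        pair = co_occur[frozenset((a, b))]
        # co-occurrence rate: pair count over the count of records with a or b
        M[index[a]][index[b]] = pair / (occur[a] + occur[b] - pair)
    return nodes, M
```

For example, three hypothetical chains `["HF18", "HF20", "A11"]`, `["HF18", "HF21", "A16"]` and `["HF18", "HF20", "A16"]` give HF18 an occurrence frequency of 3, HF20 a frequency of 2, and a co-occurrence frequency of 2, so the weight of the edge from HF18 to HF20 is 2 / (3 + 2 - 2) = 2/3.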

Dynamic risk value

According to the energy propagation theory27,28, accidents are caused by the application of energy above a threshold. Risk factors themselves have a certain risk value and maximum energy tolerance, known as the risk threshold. When the risk value of a certain risk factor is greater than or equal to its risk threshold, the risk factor will be triggered and release its own energy29. The released energy will be transferred along the risk network, triggering other risk factors in succession, ultimately leading to accidents30.

In order to calculate the risk threshold, this study used eleven complex network topology properties31 in three dimensions of risk factors: degree, strength, and centrality coefficients, and combined them with the Analytic Network Process (ANP) algorithm. As opposed to existing methods32, this study comprehensively examines the amplification effect of risk factors in the process of chemical safety production accidents. Here are the relevant indicators.

The in degree \(K_i^{in}\), out degree \(K_i^{out}\) and total degree \(K_i^{total}\) of node i are the numbers of edges that enter the node, leave the node, and connect to the node in either direction, respectively. They are calculated as follows:

$$\begin{aligned} \begin{aligned} K_i^{\textrm{in}}=\sum _{j\ne i}e_{ji}\\ K_i^{\textrm{out}}=\sum _{j\ne i}e_{ij}\\ K_i^{\textrm{total}}=K_i^{\textrm{in}}+K_i^{\textrm{out}} \end{aligned} \end{aligned}$$
(2)

where \(e_{ij}\) is the edge from node i to node j in formula (1).

The in strength \(S_i^{in}\), out strength \(S_i^{out}\) and total strength \(S_i^{total}\) of node i are the sums of the weights of the edges that enter the node, leave the node, and connect to the node in either direction, respectively. They can be written as follows:

$$\begin{aligned} \begin{aligned} S_i^{\textrm{in}}=\sum _{j\ne i}w_{ji}e_{ji}\\ S_i^{\textrm{out}}=\sum _{j\ne i}w_{ij}e_{ij}\\ S_i^{\textrm{total}}=S_i^{\textrm{in}}+S_i^{\textrm{out}} \end{aligned} \end{aligned}$$
(3)

where \(w_{ij}\) is the co-occurrence rate of the edge from node i to node j in formula (1).
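Formulas (2) and (3) can be read directly off the weighted adjacency matrix. The sketch below is a minimal Python illustration; it assumes the matrix is a list of lists in which a positive entry means that the corresponding edge exists, and the function name `degree_and_strength` is hypothetical.

```python
def degree_and_strength(M):
    """Compute degree (formula 2) and strength (formula 3) for every node.

    Assumption: M[i][j] > 0 exactly when there is an edge from i to j.
    """
    n = len(M)
    stats = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        k_in = sum(1 for j in others if M[j][i] > 0)   # edges entering i
        k_out = sum(1 for j in others if M[i][j] > 0)  # edges leaving i
        s_in = sum(M[j][i] for j in others)            # weight entering i
        s_out = sum(M[i][j] for j in others)           # weight leaving i
        stats.append({
            "k_in": k_in, "k_out": k_out, "k_total": k_in + k_out,
            "s_in": s_in, "s_out": s_out, "s_total": s_in + s_out,
        })
    return stats
```

For a three-node matrix with edges 0→1 (weight 0.5), 0→2 (weight 0.25) and 1→2 (weight 1.0), node 1 has in degree 1, out degree 1, and total strength 1.5.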

Betweenness centrality coefficient represents the centrality and transitivity of nodes. The Betweenness centrality of node i can be written as:

$$\begin{aligned} \begin{aligned} BC_i=\frac{\sum _{v\ne i\ne j\in \textrm{N}}\frac{\sigma _{vj}(i)}{\sigma _{vj}}}{(n-1)(n-2)} \end{aligned} \end{aligned}$$
(4)

where \(\sigma{_{vj}}(i)\) is the number of shortest paths from node v to node j through node i, \(\sigma{_{vj}}\) is the number of shortest paths from node v to node j in the network and n represents the total number of nodes. \(\textrm{N}\) is the set of all nodes.
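The betweenness centrality of formula (4) can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the tool used in the study: shortest paths are measured in hops (edge weights are ignored), the network is given as a dictionary that maps every node to a list of its successors, and pairs of nodes with no connecting path contribute nothing. The function names are hypothetical.

```python
from collections import deque


def shortest_path_counts(adj, source):
    """Breadth-first search returning hop distances and shortest-path counts."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """Normalized betweenness centrality of every node, as in formula (4)."""
    nodes = list(adj)
    n = len(nodes)
    info = {s: shortest_path_counts(adj, s) for s in nodes}
    result = {}
    for i in nodes:
        dist_i, sigma_i = info[i]
        total = 0.0
        for v in nodes:
            dist_v, sigma_v = info[v]
            if v == i or i not in dist_v:
                continue
            for j in nodes:
                if j in (v, i) or j not in dist_v or j not in dist_i:
                    continue
                # i lies on a shortest v-j path when the distances add up
                if dist_v[i] + dist_i[j] == dist_v[j]:
                    total += sigma_v[i] * sigma_i[j] / sigma_v[j]
        result[i] = total / ((n - 1) * (n - 2)) if n > 2 else 0.0
    return result
```

In a three-node chain a→b→c, only b lies between another pair of nodes, so its value is 1 / ((3 - 1)(3 - 2)) = 0.5 and the other two values are 0.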

Closeness centrality coefficient indicates the degree to which a node is centered in a directed weighted network. It can be written as follows:

$$\begin{aligned} \begin{aligned} C_C (i)=\frac{n-1}{\sum _{j=1,j\ne i}^nl_{ij}} \end{aligned} \end{aligned}$$
(5)

where \(l_{ij}\) is the shortest path length from node i to j.
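A minimal Python sketch of formula (5) is given below. It assumes hop-count path lengths and a dictionary that maps every node to its list of successors. Formula (5) does not say how to treat nodes that cannot be reached from node i, so the sketch returns `None` in that case; this convention, like the function name `closeness`, is an assumption made for illustration.

```python
from collections import deque


def closeness(adj, i):
    """Closeness centrality of node i, as in formula (5), using hop counts.

    Returns None when some other node is unreachable from i (assumed convention).
    """
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    n = len(adj)
    if n < 2 or len(dist) < n:
        return None  # the sum of path lengths is undefined
    return (n - 1) / sum(d for node, d in dist.items() if node != i)
```

In a directed cycle a→b→c→a, node a reaches b in one step and c in two, so its closeness is 2 / 3.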

The degree centrality coefficient \(C_D(i)\) indicates the degree of direct connection to other nodes. It is calculated as:

$$\begin{aligned} \begin{aligned} C_D(i)=\frac{K_i^{\textrm{total}}}{n-1} \end{aligned} \end{aligned}$$
(6)

Clustering coefficient indicates the degree of aggregation of nodes in a directed weighted network. Specifically, the Clustering coefficient of node i can be calculated with:

$$\begin{aligned} \begin{aligned} C_l(i)=\frac{\left| \{e_{jv} : j, v \in \textrm{N}_i^{\textrm{in}}\}\right| +\left| \{e_{jv} : j, v \in \textrm{N}_i^{\textrm{out}}\}\right| }{K_i^{\textrm{in}}(K_i^{\textrm{in}}-1)+K_i^{\textrm{out}}(K_i^{\textrm{out}}-1)} \end{aligned} \end{aligned}$$
(7)

where \(\textrm{N}_i^{\textrm{in}}\) is the set of nodes with an edge into node i and \(\textrm{N}_i^{\textrm{out}}\) is the set of nodes reached by an edge out of node i. The numerator counts the edges \(e_{jv}\) whose endpoints both lie in the same set.
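Formula (7) can be sketched in Python as follows. The sketch assumes a dictionary that maps every node to a list of distinct successors, counts the directed edges among the in-neighbors and among the out-neighbors of node i, and returns `None` when the denominator is zero. The function name `clustering` and the `None` convention are illustrative assumptions.

```python
def clustering(adj, i):
    """Clustering coefficient of node i, as in formula (7).

    Returns None when the denominator is zero (coefficient undefined).
    """
    preds = {u for u in adj if u != i and i in adj[u]}  # nodes with an edge into i
    succs = {v for v in adj[i] if v != i}               # nodes reached from i

    def links(group):
        # number of directed edges whose endpoints both lie in `group`
        return sum(1 for a in group for b in adj[a] if b in group and b != a)

    denom = len(preds) * (len(preds) - 1) + len(succs) * (len(succs) - 1)
    if denom == 0:
        return None
    return (links(preds) + links(succs)) / denom
```

For a node c with two in-neighbors a and b and a single edge b→a between them, the coefficient is 1 / (2 × 1) = 0.5. A node whose total degree is 1 has a zero denominator, so its coefficient is undefined.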

Katz centrality coefficient indicates the tightness and importance of the connection between nodes and neighbor nodes. Specifically, the Katz centrality of node i can be defined as:

$$\begin{aligned} \begin{aligned} C_K(i)=\mathrm {\alpha }\sum _{j=1}^nM_{ij}C_K(j)+\mathrm {\beta } \end{aligned} \end{aligned}$$
(8)

where \(\mathrm {\alpha }\) is a damping factor that controls the influence of neighbor centrality and \(\mathrm {\beta }\) is a constant term that adjusts the initial value or offset of centrality. We set \(\mathrm {\alpha }\) and \(\mathrm {\beta }\) to 0.1 and 1.0, respectively. \(M_{ij}\) is the entry of the adjacency matrix in formula (1).
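Because formula (8) defines each Katz centrality in terms of the others, it is usually solved by fixed-point iteration. The Python sketch below is a minimal illustration, assuming the matrix is a list of lists indexed as in formula (8) and that \(\mathrm {\alpha }\) is small enough for the iteration to converge. The function name, the tolerance and the iteration limit are hypothetical choices, and no normalization of the result is applied.

```python
def katz_centrality(M, alpha=0.1, beta=1.0, tol=1e-10, max_iter=1000):
    """Solve C_K(i) = alpha * sum_j M[i][j] * C_K(j) + beta by iteration."""
    n = len(M)
    x = [beta] * n  # start from the constant term
    for _ in range(max_iter):
        new = [alpha * sum(M[i][j] * x[j] for j in range(n)) + beta
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    raise RuntimeError("Katz iteration did not converge; alpha may be too large")
```

For a two-node matrix with a single entry M[0][1] = 1, node 1 keeps the constant value 1.0 and node 0 obtains 0.1 × 1.0 + 1.0 = 1.1.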

ANP33,34 is a method for analyzing interrelationships between elements in complex networks. In this study, it is used to combine the above indicators into a risk threshold. The first step is a correlation analysis of the elements: the decomposed elements at each level are analyzed for independence, interdependence, and correlation based on the formulas of the indicators. The second step is to construct a network hierarchy using Super Decisions version 3.2 software. The third step is to construct an unweighted super matrix: element judgment matrices and overall judgment matrices are calculated by conducting element correlation analysis and comparing the relative importance of pairs of related elements with respect to the comparison criteria. From these comparisons, the weights of the different element groups and the relative weights within each judgment matrix are determined, and an initial unweighted super matrix is generated. The weighted super matrix and the limit matrix are then calculated, which gives the comprehensive weight of each indicator. Finally, the comprehensive weights are combined with the topology indicators into a comprehensive value for each risk factor, from which its threshold value \(C_i\) is determined. The larger the degree, strength, and centrality of a risk factor, the stronger its connectivity in the network and the more easily it triggers accidents, so the lower its risk threshold should be. Accordingly, the threshold of a direct risk factor of the accident is the reciprocal of this comprehensive value, and the risk threshold of an accident (A) is the sum of the thresholds of its direct risk factors. The initial risk value of a risk factor is its energy value in a steady state, which is determined by the characteristics of the risk factor itself and the environment it is in. Following Luo et al.35, this study also assumes that the initial risk value \(L_i(0)\) of each risk factor is half of its risk threshold. The risk value of the outcome factor is the sum of the risk values of the main risk factors that led to the accident.
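The last step of this procedure can be sketched in Python. The sketch assumes that the comprehensive value of a risk factor is the weighted sum of its topology indicators, which the text does not state explicitly, and it takes the ANP weights as given inputs. The function names and the numbers in the usage note are hypothetical and are not results of the study.

```python
def risk_threshold(indicators, weights):
    """Threshold C_i as the reciprocal of the comprehensive indicator value.

    Assumption: the comprehensive value is a weighted sum of the indicators.
    """
    composite = sum(weights[name] * value for name, value in indicators.items())
    if composite <= 0:
        raise ValueError("comprehensive indicator value must be positive")
    return 1.0 / composite


def initial_risk(threshold):
    """Initial risk value L_i(0), taken as half of the risk threshold."""
    return 0.5 * threshold


def accident_threshold(direct_factor_thresholds):
    """Accident threshold as the sum of the direct risk factors' thresholds."""
    return sum(direct_factor_thresholds)
```

With hypothetical weights of 0.25 for total degree and 0.5 for total strength, a factor with total degree 4 and total strength 1.0 has a comprehensive value of 1.5, a threshold of 2/3, and an initial risk value of 1/3.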

Risk transfer mechanism

The risk transmission mechanism is the rule by which risk is transmitted: it allocates risk values according to the strength of the correlation between risk factors. For any risk factor i, \(E_{i}(t)\) is its risk value at time t. When the risk value of risk factor i exceeds its threshold \(C_{i}\), i.e. \(E_{i}(t)>C_{i}\), the risk factor is activated. The edges leaving the activated risk factor then enter the triggered state, and each directly connected risk factor receives an increase in its risk value and may itself be activated and continue transmitting energy, ultimately leading to an accident. Specifically, for each node j in the set of neighbor nodes \(N_i\) of node i, the risk value received from i at time \(t + 1\) is the share of the risk value of i allocated in proportion to the weight \(w_{ij}\) of the directed edge. The risk value of node j is therefore \(E_j(t+1) = \frac{w_{ij}}{\sum _{k\in {N_{i}}}w_{ik}}E_i(t)+E_j(t)\).
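The transfer rule can be simulated with a short Python sketch. Several details are assumptions made only for illustration, because the text does not fix them: each node fires at most once, a fired node keeps its own risk value, all nodes that exceed their threshold at the same time step fire together, and every node has an entry in both the threshold and the initial risk dictionaries. The function name `simulate` and the data layout are hypothetical.

```python
def simulate(weights, thresholds, initial_risk, max_steps=50):
    """Propagate risk along weighted edges until no new node is activated.

    weights: {i: {j: w_ij}} for directed edges i -> j.
    Returns (activation order, final risk values).
    """
    risk = dict(initial_risk)
    fired = []  # nodes in the order they were activated
    for _ in range(max_steps):
        active = [n for n in risk if n not in fired and risk[n] > thresholds[n]]
        if not active:
            break
        gains = {}
        for i in active:
            out_edges = weights.get(i, {})
            total = sum(out_edges.values())
            if total == 0:
                continue  # no outgoing weight to share
            for j, w in out_edges.items():
                # node j receives the share w_ij / sum_k w_ik of E_i(t)
                gains[j] = gains.get(j, 0.0) + (w / total) * risk[i]
        for j, gain in gains.items():
            risk[j] += gain
        fired.extend(active)
    return fired, risk
```

In a hypothetical four-node network where HF18 starts far above its threshold and passes three quarters of its risk value to HF20 and one quarter to HF21, HF20 can be pushed over its threshold and go on to activate the accident node, while HF21 stays below its threshold and the path through it is not followed.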

Results and discussion

Using the 481 representative safety production accident records, the CESRN shown in Fig. 2 is constructed with formula (1). The topological properties of each influencing factor are then analyzed, and the ANP method is combined with expert groups to obtain the unweighted and weighted super matrices from the relative importance of the indicators, using Super Decisions version 3.2 software. The comprehensive weight of each indicator is given in the "Indicator weight with ANP" column of Table 4, and the threshold value of each influencing factor is shown in Fig. 3.

Fig. 2 Chemical Enterprise Safety Risk Network (CESRN).

Fig. 3 Risk threshold.

Accident evolution analysis

This section selects a personnel falling accident from the existing dataset for network evolution. The purpose is to verify that the CESRN can explain chemical safety production accidents triggered by external risks. In addition, the CESRN comprehensively considers the amplification effect of risk factors in chemical safety production accidents. The accident began at 10 am, when the operator of the first circulating water plant started the fan and found friction between the fan fins and the air duct. Workshop equipment engineer A contacted maintenance workshop inspection team leader B to have the fan repaired. B and fitter team leader C, who was responsible for the maintenance task, agreed to repair the fan in the afternoon, and they notified E, the maintenance workshop inspector for the area. At around 2:10 pm, C and D arrived at cooling water tower fan 1#, saw that the air duct door was open, and inspected the fan. E then arrived, and C and D explained the equipment malfunction to him. After that, E, C, and D entered the air duct again. E inspected the malfunction and explained the maintenance requirements to C and D before leaving the air duct. At this point, F entered the air duct and carried out maintenance together with C and D. After E came out, he contacted team leader B to confirm whether the fan was powered off, and B called G to confirm whether fan 1# was powered off. At around 2:40 pm, E waved his hand at G, who was coming out of the operating room at the top of the cooling water tower. Because G had earlier received the phone call asking whether the power was off, he made a gesture and returned to the operating room. G started fan 1# on the DCS and stopped it immediately after discovering that it was powered. At that moment, C, D, and F, who were carrying out maintenance work inside the air duct, were startled by the sound of the motor starting. D and F fell onto the work platform and did not fall further, while C fell onto the filling layer 3.6 meters below the work platform.

According to the accident description, the operators had no awareness of wearing a safety belt during this work at height. This means that they had neglected risk protection for a long time, which created a safety hazard for this operation. In the first scenario, the operators are assumed to be completely unaware that safety belts are required during work at height, and the initial risk value of inadequate awareness of job risk protection (HF18) is set to 10 times its risk threshold, so that at \(t=0\), \(L_{HF18}(0)=8.47\).

Table 3 shows the specific process of accident evolution. Triggered nodes are the nodes whose risk value changes at the current time step, activated nodes are the nodes whose risk value exceeds their risk threshold, and next triggered nodes are the neighbors of the activated nodes. Ultimately, the risk evolution path is inadequate awareness of job risk protection (HF18) \(\rightarrow\) decision-making errors (HF20) \(\rightarrow\) personnel deaths (A11).

Table 3 Risk transmission process of high altitude falling accident.

To verify the rationality of this network and its interpretability for accidents, a second scenario is considered in which the operating personnel are aware of the importance of job risk prevention but have not fully focused on reasonable protection. The risk value of inadequate awareness of job risk protection (HF18) is set to 5 times its risk threshold, so that at \(t=0\), \(L_{HF18}(0)=4.24\). The evolution of the accident is shown in Table 3. In this case, the risk evolution path is inadequate awareness of job risk protection (HF18) \(\rightarrow\) violation of operating procedures (HF21) \(\rightarrow\) personnel injury (A16).

In terms of accident evolution, the network performs well in deducing accidents through to their outcome and in the rationality of the deduced paths, but it is less satisfactory in restoring the full detail and complexity of accidents. Specifically, constructing the network condenses the risk factors in the 481 accident records into 68 risk factors and 19 results, so the extraction of risk factors must balance specificity against generalization. On this basis, the construction process of the CESRN is considered to be reliable.

Analysis of basic topology indicators

Figure 4a displays the distributions of in degree, out degree, and total degree of the 87 nodes in the CESRN. The average total degree is 2.5, which means that each node is involved in about 3 causal relationships on average. Node HF13 has the highest total degree, followed by HF04, HF08, HF10 and HF01. These nodes have more connections with other risk factors in the network. HF13 and HF08 have the highest out degree, 16, followed by HF04 and HF10. Causal nodes with high out degree can trigger more types of causal relationships and are crucial for amplifying the cascading effects in the causal sub network. HF04 and HF11 have the highest in degree, 11, followed by HF01 and HF10. Causal nodes with high in degree are easily influenced by other nodes.

Fig. 4 Degree and strength of CESRN.

Figure 4b displays the distributions of out strength, in strength and total strength of the 87 nodes in the CESRN. The average total strength is 0.721. In the CESRN, the strength of a node is the sum of the weights of the edges directly connected to it. Nodes EC05, MC07 and MC12 have the highest total strength and a significant cascading effect, followed by MC1, MC06 and EC12. They are closely related to other nodes in the network. In addition, EC05, EC09, EC12, EC14, EC16, EC18, EC20, MC07, MC12, MC14, and MC16 have the highest out strength, which is 1. EC05, EC19, MC06, MC07, MC11, and MC12 have the highest in strength, which is also 1.

Betweenness centrality is a measure of network centrality based on shortest paths. The betweenness centrality of a node is the proportion of all shortest paths between other pairs of nodes that pass through that node, and it indicates the role of the node as a hub in the network. Almost 69% of causal nodes have zero betweenness centrality because they do not lie on any shortest path between other node pairs. Node HF13 has the highest betweenness centrality, 0.1, followed by HF04, HF08, HF01, HF10 and HF11. The betweenness centrality values of the other nodes are less than 0.02. Closeness centrality represents the degree to which a node is centered in a directed weighted network.

Of the causal nodes, 44 have clustering coefficients greater than zero and 21 have clustering coefficients of zero. The remaining three causal nodes have no defined clustering coefficient because their total degree equals 1. The average clustering coefficient of the network is 0.117. This is much higher than the value of 0.033 obtained for a random network with the same number of nodes generated with Gephi version 0.10.1.

Katz centrality indicates the tightness and importance of connections between a node and its neighbor nodes. Node HF04 has the highest Katz centrality, 0.22, followed by HF11, HF01, HF13, HF10 and HF08. The Katz centrality values of the other nodes are less than 0.07.

The diameter of the CESRN is 7 and the average path length is 2.97, so any two factors interact within about 3 steps on average. The distribution of shortest paths between nodes in the chemical accident causation network shows that the factors are closely related and that the average degree of separation is small. Once a factor fails, its effect diffuses and propagates efficiently and can easily trigger a chain reaction. Therefore, it is necessary to establish a rapid response mechanism that quickly cuts off the transmission path of an influencing factor when it fails, in order to prevent a chain reaction.
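The diameter and the average path length of a directed network can be computed with breadth-first search. The Python sketch below assumes hop-count path lengths, a dictionary that maps every node to its list of successors, and averaging over the ordered pairs that are actually connected by a path. These conventions and the function name `path_statistics` are assumptions for illustration and may differ from the software settings used in the study.

```python
from collections import deque


def path_statistics(adj):
    """Return (diameter, average shortest path length) over reachable pairs."""
    lengths = []
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # keep the distance to every node reachable from `source`
        lengths.extend(d for node, d in dist.items() if node != source)
    if not lengths:
        return 0, 0.0
    return max(lengths), sum(lengths) / len(lengths)
```

For the chain a→b→c, the reachable pairs have lengths 1, 2 and 1, so the diameter is 2 and the average path length is 4/3.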

The average path lengths from risk factors to accidents for Human factor, Material and machine condition, Management factor and Environmental condition are 2.52, 0.82, 2.80, and 1.29, respectively, so the four types of risk factors are clearly distinguishable. The average path length from Material and machine condition to accidents is the shortest, which suggests that it is the direct cause of chemical safety accidents. The average path length from Management factor to accidents is the longest, so it can be seen as an indirect cause that may lead to accidents by triggering other risk factors. Because the CESRN has a small diameter and a short average path length, the domino effect often occurs quickly.

Analysis of risk factors importance

Based on real accidents for simulation and evolutionary reasoning analysis, the importance of risk factors is defined as the ratio of the risk thresholds to the minimum risk value that causes accidents to occur. It can be written as:

$$\begin{aligned} H_i=\frac{C_i}{S_i} \end{aligned}$$
(9)

\(S_i\) refers to the minimum risk value of risk factor i which leads to accidents. The larger the \(H_i\), the higher the importance of risk factors.

Fig. 5 The importance of risk factors.

As shown in Fig. 5, MF03 has the highest importance, 6.52, and a relatively low frequency, 0.2%. This indicates that although process modification is unlikely to occur, it can easily lead to accidents. HF16 has relatively high importance and frequency, 1.22 and 1.59% respectively, which indicates that enterprises and national regulatory authorities should strengthen safety knowledge training for practitioners. Labor outsourcing is quite common in chemical enterprises in China. Many workers have not received systematic training in safety knowledge and skills, which results in frequent accidents due to non-standard operations. HF04 has a relatively low importance, 0.36, and a relatively high frequency, 6.55%. It seems that although there are many regulatory loopholes, those risk factors are not sensitive.

Overall, the importance values of Management factor, Material and machine condition, Human factor and Environmental condition are 7.92, 15.5, 12.06 and 22.21, respectively. Thus, Material and machine condition and Environmental condition dominate the impact on accidents, while Human factor and Management factor have a somewhat smaller impact. These conclusions are basically consistent with the expert inspection analysis, which shows that the quantitative analysis of risk factors has practical application value.

Sensitivity analysis and validation

The objective of sensitivity analysis is to evaluate how the model’s output responds to changes in its input parameters, which is essential for understanding and optimizing the model. Sensitivity analysis shows how the model performs under different input conditions, which allows its stability and reliability to be evaluated, and it identifies the parameters that have the greatest impact on the model. In this study, an ANP model based on subjective expert evaluation is used to determine the risk threshold for each node in the network construction. To evaluate the robustness of the model and identify the key indicators related to risk values, it is therefore necessary to perform sensitivity testing of the risk threshold indicators.

As shown in Table 4, the sensitivity analysis adjusts the weights of the 11 indicators to obtain different risk thresholds and tests them on 20 cases to obtain the network’s accident triggering accuracy and accident interpretability rate. Specifically, 10 test cases are accidents that are not in the 481-record accident dataset, and the other 10 cases are network rationality tests that use lower initial risk values for the same 10 accidents. The sensitivity tests showed that when the weights of the centrality coefficient and strength indicators were high, the average accuracy was 75.00% and 70.00%, respectively, and changing their proportions significantly affected accident deduction accuracy. Therefore, these two types of indicators are key indicators for measuring risk value. Additionally, when the Katz centrality and total strength weights are high, the average accuracy is 55.00% and 50.00%, respectively, so these two indicators play a crucial role in measuring the risk value of a node, and changing their weights significantly affects accident inference accuracy. The accident triggering accuracy and accident interpretability rate are highest when the indicator weights obtained with ANP are used, so the CESRN is reliable and stable.

Table 4 Sensitivity analysis of indicator weights.
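The weight perturbation procedure described above can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes that a node’s risk threshold is a weighted sum of normalized indicator scores and that a node is triggered when its risk value reaches that threshold. The function names, the `clustering` indicator, the baseline weights, and the four test cases are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# One test case: (normalized indicator scores, node risk value, observed trigger label).
Case = Tuple[Dict[str, float], float, bool]


def boost_weight(weights: Dict[str, float], name: str, factor: float) -> Dict[str, float]:
    """Multiply one indicator weight by `factor`, then renormalize so the weights sum to 1."""
    scaled = {k: (v * factor if k == name else v) for k, v in weights.items()}
    total = sum(scaled.values())
    return {k: v / total for k, v in scaled.items()}


def threshold(indicators: Dict[str, float], weights: Dict[str, float]) -> float:
    """Assumed threshold form: weighted sum of normalized indicator scores."""
    return sum(weights[k] * indicators[k] for k in weights)


def accuracy(cases: List[Case], weights: Dict[str, float]) -> float:
    """Fraction of cases where the predicted trigger (risk >= threshold) matches the label."""
    hits = 0
    for indicators, risk, triggered in cases:
        predicted = risk >= threshold(indicators, weights)
        hits += int(predicted == triggered)
    return hits / len(cases)


def sweep(cases: List[Case], baseline: Dict[str, float], factor: float) -> Dict[str, float]:
    """Boost each indicator weight in turn and record the resulting accuracy."""
    return {name: accuracy(cases, boost_weight(baseline, name, factor)) for name in baseline}


# Hypothetical baseline weights (stand-in for the ANP-derived weights).
baseline = {"katz_centrality": 0.5, "total_strength": 0.3, "clustering": 0.2}

# Synthetic test cases, not taken from the accident dataset.
cases: List[Case] = [
    ({"katz_centrality": 0.8, "total_strength": 0.4, "clustering": 0.5}, 0.70, True),
    ({"katz_centrality": 0.2, "total_strength": 0.9, "clustering": 0.5}, 0.40, False),
    ({"katz_centrality": 0.6, "total_strength": 0.6, "clustering": 0.6}, 0.55, False),
    ({"katz_centrality": 0.3, "total_strength": 0.7, "clustering": 0.2}, 0.45, True),
]

if __name__ == "__main__":
    print("baseline accuracy:", accuracy(cases, baseline))
    for name, acc in sweep(cases, baseline, factor=4.0).items():
        print(f"boosted {name}: accuracy = {acc:.2f}")
```

In this toy setting, an indicator whose boosted weight lowers the accuracy the most would be read as the most sensitive one, which mirrors the comparison of weight settings reported in Table 4.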

Application discussion

The experimental results show that a lack of safety awareness among personnel in the chemical production industry is the most significant contributor to chemical safety accidents. In numerous actual chemical safety accidents, the direct causes highlighted in the accident reports are mostly failures of personnel to operate in accordance with safety regulations, such as not wearing safety harnesses when working at heights and not conducting combustible gas tests before hot work operations. This indicates that a lack of safety awareness and the neglect of potential safety hazards are widespread in the chemical production process. Therefore, enhancing safety training for the relevant personnel and raising their safety awareness can effectively prevent chemical safety accidents.

In this paper, a quantitative analysis of risk factors in the chemical production process is conducted by establishing an accident evolution model based on complex network theory. This approach, which constructs a complex network linking the various risk factors with accident outcomes, can serve as a reference for other scholars engaged in related research. The proposed risk analysis methodology is pertinent not only to the chemical industry but also to other industries in which production safety accidents may occur, such as coal mining, electric power, and construction.

Conclusion

This paper primarily investigates the impact of Human factor, Environmental condition, Material and machine condition, and Management factor on production safety accidents in chemical enterprises. First, a safety risk network for chemical enterprises was constructed based on complex network theory, with each risk factor defined as a node in the network. Second, a method that comprehensively considers multiple factors was designed to calculate node risk thresholds and dynamic risk values. On this foundation, an evolution model for production safety accidents in chemical enterprises was established. Finally, the proposed method was validated on actual chemical accident cases, and the results demonstrate its feasibility and the accuracy of its quantitative analysis of chemical accident risks. The analysis indicates that chemical accidents are caused by the combined effects of various risk factors, with environmental conditions playing a dominant role. Based on the results of the quantitative risk analysis, stringent supervision of process changes in chemical production and enhanced safety training for the personnel involved are effective measures to reduce the risk of chemical accidents.

The current work has several limitations. First, the complex network requires manual encoding of the causes and accidents, which is time-consuming, and the cost of manual encoding grows as the number of accident cases increases. Future work could explore a method that automatically encodes accident chains from causal relationships, using text mining techniques from natural language processing. Second, in the accident evolution model for work safety in chemical enterprises, the propagation weights of the risk values are deterministic, which may cause deviations from reality when the accident scenarios are complex. Future research could analyze the propagation weights of risk values quantitatively to obtain more accurate and reasonable weight values.