Abstract
Benzene separation from hydrocarbon mixtures is a challenge in the refining and petrochemical industries. The application of liquid–liquid extraction process using ionic liquids (I.Ls) is an option for this separation. The selection of the most appropriate I.L. for this application is a challenging task due to the variety of anion and cation structures. In the current study, the benzene distribution between the aliphatic hydrocarbon-rich and I.L.-rich phases has been evaluated using the Quantitative Structure–Property Relationship (QSPR) method. A dataset comprising of 112 ternary systems (namely, I.L., benzene, and aliphatic hydrocarbon) was compiled after an extensive review of literature. The primary dataset consists of 17 anions, 20 cations, and 12 aliphatic hydrocarbons. Therefore, the impact of the structure of anion, cation, or aliphatic hydrocarbon on the benzene distribution between the aliphatic hydrocarbon-rich and I.L.-rich phases has been investigated. The linear QSPR models were constructed using Multiple Linear Regression (MLR). The statistical evaluation of the final linear model showed that the constructed model (R2 = 0.900) has an acceptable capability to predict the mole fraction of benzene in the I.L.-rich phase. Additionally, non-linear QSPR models were developed using Genetic Programming (GP) and Artificial Neural Network (ANN) machine learning methods. The statistical evaluation of the GP model (R2 = 0.927) and ANN model (R2 = 0.939) showed that non-linear models had slightly higher prediction accuracy compared to the linear model. The final QSPR model was developed using the BELe3 cation descriptor which is a 2D Burden eigenvalues descriptor and HTm anion descriptor which is a 3D GETAWAY descriptor. After model construction, the selected molecular descriptors of anion and cation structures has been interpreted. The results showed that the size and the electronegativity of the atoms in the anion and cation structure are probably important parameters that affect the benzene distribution between the aliphatic hydrocarbon-rich and I.L.-rich phases. Additionally, the anion shape can be considered as an effective parameter in the benzene extraction process.
Similar content being viewed by others
Introduction
Aromatic compounds especially BTX (i.e., benzene, toluene, and xylenes) and ethyl benzene are valuable raw materials in the petrochemical industries1. In addition, the reduction of sulfur, nitrogen, and aromatic compounds (especially Benzene) in the middle distillate products has considerable importance due to their environmental impacts2,3. Therefore, great attention has been focused on the separation of aromatic compounds from naphtha-cracking streams. The separation and purification of aromatic compounds are usually challenging due to the close boiling points of different hydrocarbons and the different combinations of azeotropes that may be formed4,5. Based on aromatic concentration, three separation techniques are categorized commercially: (I) liquid–liquid extraction is predominantly employed typically for low aromatic concentrations (in the range of 20 to 65 wt%); (II) extractive distillation is commonly chosen for moderate aromatic concentrations (in the range of 65 to 90 wt%), and (III) the azeotropic distillation is used for extremely high aromatic concentrations (> 90 wt%)6,7.
The liquid extraction process is regarded as the most favorable approach in the aromatic separation process for mixtures containing lower than 20 wt% aromatic contents. The notable advantages of this method include low energy consumption, preservation of both physical properties and chemical structures, ease of application, and mild operating conditions. In the extraction process, the selection of an appropriate solvent is important. In other words, an ideal solvent should be environmentally friendly, easily regenerated, sufficiently available at low cost, have a high level of aromatic compound selectivity, have a significant distribution ratio, and maintain a minimum solvent-to-feed ratio2,8. In addition, the physical and thermodynamic properties of the solvent such as density, surface tension, viscosity, and thermal stability should be applicable in the industry. Commonly, organic solvents such as furfuryl alcohol, N-formyl morpholine (NFM), sulfolane, ethylene glycols, dimethyl sulfoxide (DMSO), and N-methyl pyrrolidone (NMP) are employed in the aromatic/aliphatic separation process. Due to the high levels of toxicity, volatility, flammability, and regeneration costs of these organic solvents, alternative solvents are being investigated3,8.
Nowadays, ionic liquids (I.Ls) have been gradually noticed as an attractive and promising alternative to traditional organic solvents for the extraction of aromatic and aliphatic hydrocarbons. In general, I.Ls exhibit low vapor pressure, notable thermal and chemical stability, effective solubility for both organic and inorganic compounds, and low environmental impacts. Additionally, I.Ls are often recognized as “designer solvents” due to their potential for diverse combinations of anions and cations, resulting in numerous I.Ls with distinct specifications and a wide range of applications6,9. In the petroleum refining industries, numerous separation processes using I.Ls have been investigated. These processes include desulfurization, denitrogenation10, and dearomatization of fuels11.
Investigation of liquid–liquid equilibrium (LLE) data is needed to better understand the role of I.Ls in the liquid–liquid extraction process12. As a result, the number of experimental studies on the LLE of ternary systems containing I.L., aromatic hydrocarbon, and aliphatic hydrocarbon has increased over the recent decades. The literature predominantly discusses imidazolium-based I.Ls as the preferred solvents for extracting aromatic hydrocarbons from the mixture of aromatic/aliphatic components13,14. Furthermore, other categories of I.Ls based on pyridinium, ammonium, etc. have been investigated in some studies15,16.
The vast number of potential I.Ls that can be synthesized makes it impossible to select the optimal I.L. for a particular task or any separation/extraction process only using experimental measurements. Therefore, the development of predictive models can be significantly time-saving and cost-effective. In the literature, various models such as COSMO-RS17 and UNIFAC18 have been used to calculate LLE data, but limited ternary systems were studied in the development of each model. Therefore, the variety of anions, cations, and aliphatic hydrocarbons in the studied systems was very limited.
The QSPR method is a commonly used technique that utilizes molecular descriptors to establish quantitative relationships between a chemical structure and its corresponding property19,20,21. Therefore, it is possible to design and/or develop new solvents or chemical structures using the QSPR method for different applications. This modeling method not only offers quantitative information about the desired property but also provides valuable qualitative insights via descriptor interpretation. The QSPR method has been extensively applied in various studies related to ILs, demonstrating its effectiveness in predicting their thermodynamic and physicochemical properties22,23,24,25.
In our recent researches, we have utilized the QSPR approach to develop descriptive and predictive models for estimating the distribution of thiophene26 , a sulfur-containing compound, and pyridine27, a nitrogen-containing compound, between the hydrocarbon-rich and I.L.-rich phases. This study focuses on the separation of benzene from fuels. Benzene is a toxic aromatic cyclic hydrocarbon and exposure to it beyond the standard limit can cause serious problems for human health such as cancer. On the other hand, benzene is used as a raw material for the production of resins, drugs, dyes, and many other chemical products. Therefore, the separation of benzene from the fuels has received much attention in different studies. After reviewing the literature, it was found that despite the availability of LLE data for a considerable number of ternary systems including I.L., benzene (as an aromatic hydrocarbon), and an aliphatic hydrocarbon, there is no QSPR model to predict the distribution of benzene between the aliphatic hydrocarbon-rich and I.L.-rich phases. In order to fill this gap, a dataset containing LLE data for 112 ternary systems was collected. This dataset covers 17 anions, 20 cations, and 12 aliphatic hydrocarbons. By the combination of the involved anion, cation, and aliphatic hydrocarbon structures in the present dataset, 4080 possible ternary systems (i.e., 20 × 17 × 12 = 4080 systems) can be formed. It should be noted that there is only experimental data for 3% (112 ternary systems) of the 4080 systems, so far. As a result, developing a QSPR model can estimate the benzene distribution between the aliphatic hydrocarbon-rich and I.L.-rich phases of the remaining non-studied systems. Additionally, the impact of the variation of anion, cation, and aliphatic hydrocarbon structure on benzene separation has been investigated.
QSPR method
Dataset
Collecting an adequate and reliable dataset is the first step in the QSPR method. A dataset including 112 ternary systems was collected after a comprehensive review of the literature3,9,11,12,13,14,15,16,17,18,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61. The collected dataset includes 17 anions, 20 cations, and 12 aliphatic hydrocarbons. More details regarding the ternary systems for mentioned dataset are presented in Table 1. All LLE data points were measured at 298.15 K and 1 atm.
The dataset included various aliphatic hydrocarbons such as straight-chain hydrocarbon32, cyclic hydrocarbon61, and branched hydrocarbon35. Also, imidazolium-, pyrrolidinium-, ammonium-, and pyridinium-based I.Ls were present in the dataset.
In order to estimate the benzene distribution between the aliphatic hydrocarbon-rich and I.L.-rich phases, it was tried to investigate the relationship between the benzene mole fraction in the aliphatic hydrocarbon-rich phase (X2) and the benzene mole fraction in the I.L.-rich phase (Y2) for all ternary systems included in the main dataset and predict Y2 in terms of X2 using a linear model.
Due to the variation of different structures (i.e., anion, cation, and aliphatic hydrocarbon) in each ternary system, it is necessary to apply the effect of the structure of each group (i.e., anions, cations, or aliphatic hydrocarbons) using molecular descriptors in the model. Therefore, the descriptors of each group were considered to develop the model.
Two subsets (i.e., train and test subsets) were used to validate the constructed QSPR model. Train subset was used for the development and internal validation of the QSPR model. Test subset was used for external validation of the constructed model.
Basic theory
All 112 ternary systems of the dataset including I.L. (1), benzene (2), and aliphatic hydrocarbon (3) were investigated. A relationship between the values of Y2 and X2 was identified in every ternary system. Consequently, Eq. (1) can be developed to predict Y2 values at any desired X2 values:
where f can be a linear or non-linear function of X2. According to Table 1, the structures of anions, cations, and aliphatic hydrocarbons are different in each of the ternary systems. In Eq. (1), the only independent variable in the model is X2, so the prediction of Y2 values is not possible with this model as it does not consider the impact of anion, cation, and aliphatic hydrocarbon structures. As a result, anion, cation, and aliphatic hydrocarbon descriptors should be considered as independent variables along with X2 (see Eq. (2)).
Molecular descriptors calculation
Structural features of various compounds are encrypted in molecular descriptors such as constitutional, geometrical, topological, etc. In order to calculate different types of descriptors, it was necessary to use the optimal structure for each compound in the dataset. Therefore, the 3D structure of all anions, cations, and aliphatic hydrocarbons was drawn using Chembio3D Ultra software. Afterward, the structure optimization was carried out based on the Density Functional Theory (DFT). In this regard, the 6-31++G(d,p) basis set and B3LYP level were employed. In the next step, Dragon software was employed to calculate the molecular descriptors for the optimized structures. Descriptors with constant or almost constant for each group were eliminated. Finally, 1247 anion descriptors, 1282 cation descriptors, and 1079 aliphatic hydrocarbon descriptors were calculated.
Model construction
Descriptors selection
In order to identify the most important and effective descriptors among a large number of descriptors (nearly 3608 descriptors), variable selection is important. The Genetic Algorithm (GA) method was employed to select appropriate variables as one of the most effective and efficient variable selection methods. GA is known for its capability to explore the large search space efficiently by mimicking the process of natural selection62. It operates by generating a population of potential solutions (in this case, the subsets of descriptors) and iteratively improving them through operations such as selection, crossover, and mutation. The fitness function used in GA evaluates the predictive accuracy and complexity of the developed model to ensure that the selected descriptors not only improve the prediction performance but also prevent overfitting63. The fitness function employed in this study was based on the Leave-One-Out Cross-Validation (LOO-CV) coefficient of determination (R2). By optimizing the fitness function through LOO-CV, the best subset of descriptors was identified in such a way to improve the predictive performance of the developed model while maintaining the model simplicity.
The performance of the GA in selecting appropriate descriptors is highly dependent on some key parameters such as population size, mutation rate, and the number of iterations64. Population size controls the number of candidate solutions (i.e., subsets of descriptors) evaluated in each iteration. A carefully chosen population size of 500 allows for sufficient exploration of the descriptor space while maintaining the computational efficiency. Mutation rate defines the likelihood of the random changes in the solutions, which helps maintain diversity in the population and avoids early convergence to suboptimal solutions63,65. In the present study, a mutation rate of 20 was selected to balance the exploration and exploitation of the search space. Iterations refer to the number of generations through which the population evolves. In this study, 10,000 iterations were used ensuring that the GA converged towards an optimal subset of descriptors without overfitting risk.
Linear model
For discovering the correlation between the independent variables (i.e., GA-selected descriptors and X2) and the Y2, Multiple Linear Regression (MLR) was employed to construct a QSPR model. The GA-MLR model was developed using the QSARINS software66,67,68.
Non-linear models
Machine learning has received much attention in recent years as a powerful tool in the development of predictive models. Machine learning methods can develop non-linear models, which is more accurate compared to the linear models. In this study, Genetic Programming (GP) and Artificial Neural Network (ANN), as two common methods in the field of machine learning approaches, were used to develop non-linear models.
In order to develop non-linear GP based model which is biological science inspired, GPTIPS code was employed. The input data of GPTIPS was X2 and the final selected descriptors. In this method, the model was developed using selected mathematical operators (e.g. addition, subtraction, multiplication, and division) and adjusting the parameters of the code. After several iterations of modeling and comparing the statistical parameters of the developed models, the best model has been selected. More details on the model development procedure using GP can be found elsewhere27,69.
Multi-Layer Perceptron (MLP), which is one of the most widely used artificial neural networks, was chosen to develop the ANN model. MLP consists of an input layer, an output layer, and one or more hidden layers, each layer containing a number of neurons. In this study, X2 and the final selected descriptors were the input of the network and Y2 was the output of the network. More details on the model development procedure using MLP can be found elsewhere27,69.
Statistical parameters and models validation
Table 2 provides statistical parameters that can be utilized to assess the predictive capability of the QSPR model that has been developed. Where n is the number of data in the train, test, or main set, p is the number of independent variables, \({\text{Y}}_{{{2}_{{\text{i}}} }}^{{{\text{Exp}}}}\). is the experimental Y2 value, \({\text{Y}}_{{{2}_{{\text{i}}} }}^{{{\text{Pre}}}}\). is the predicted Y2, and \(\overline{{\text{Y}}}_{{2}}\) is the average of the experimental Y2 values.
Q2LOO-CV, which is defined as the LOO-CV coefficient of detertion, was initially used for internal validation of the constructed model69. Additionally, Leave-Many-Out Cross Validation (LMO-CV) was performed, where 25% of the data was left out in each iteration. The Q2LMO-CV was also calculated, further supporting the robustness and predictive power of the model. The calculation of Q2 for LMO-CV follows a similar procedure as that for LOO-CV (Eq. (7)). More detailed information about cross-validation techniques can be found elsewhere70,71. Additionally, external validation is commonly employed in the model validation process considering train and test sets.
Results
Linear model construction
In our recent research, we examined 84 ternary systems comprising of I.L., thiophene, and hydrocarbon solvent. The results revealed a correlation between the mole fraction of thiophene in the hydrocarbon-rich phase and the mole fraction of thiophene in the I.L.-rich phase. In another investigation, we examined 51 ternary systems comprising of I.L., pyridine, and hydrocarbon solvent. The results revealed a correlation between the mole fraction of pyridine in the hydrocarbon-rich phase and the mole fraction of pyridine in the I.L.-rich phase. Thiophene is an aromatic heterocycle compound containing a sulfur atom and pyridine is an aromatic heterocycle compound containing a nitrogen atom.
In this study, by reviewing the literature, 112 ternary systems containing I.L., aromatic hydrocarbon (benzene in the current study), and aliphatic hydrocarbon were gathered. In total, 17 anions, 20 cations, and 12 aliphatic hydrocarbons were included in the dataset. For each ternary system, the relation between the benzene mole fraction in the I.L.-rich phase (Y2) and the benzene mole fraction in the aliphatic hydrocarbon-rich phase (X2) was calculated. It should be noted that several polynomial forms were examined in the fitting process and it was found that the linear relation between Y2 and X2 is more accurate in spite of its simplicity. The results of this analysis are tabulated in Table 3. As observed in Table 3, the R2 for all correlations are greater than 0.85. Consequently, a strong correlation between X2 and Y2 was confirmed.
In order to provide a model that predicts the benzene distribution between the aliphatic hydrocarbon-rich and I.L.-rich phases, Eq. (12) was developed using all the data points in the dataset (see Table 4). In other words, a linear model was developed using X2 to predict Y2 values. Y2 is the target variable and X2 is the independent variable. As Table 3 shows, in each ternary system, anion, cation, and aliphatic hydrocarbon are different. Therefore, it was necessary to add appropriate molecular descriptors to the model from each of the three mentioned groups. In this regard, Eq. (13) was constructed using one anion descriptor and X2, Eq. (14) was constructed using one cation descriptor and X2, and Eq. (15) was developed using one aliphatic hydrocarbon descriptor and X2. The statistical parameters of the developed models are presented in Table 4. As the statistical parameters of the developed models show, R2 value for the developed model using only X2 was 0.698. This value is equal to 0.861, 0.765, and 0.710 for the models that include X2 and one anion descriptor, one cation descriptor, or one aliphatic hydrocarbon descriptor, respectively. These results show that the structure of anion and cation has a considerable impact on determining the Y2 value; In contrast, the addition of the aliphatic hydrocarbon descriptor to the model had a negligible effect on increasing the accuracy of the constructed model. For further investigation, Eq. (16) was developed using X2, an anion descriptor, and an aliphatic hydrocarbon descriptor. As the comparison of the statistical parameters of Eqs. (13) and (16) shows, there is almost no progress in the model estimation capability. Also, Eq. (17) was developed using X2, a cation descriptor, and an aliphatic hydrocarbon descriptor. Similarly, the comparison of Eqs. (17) and (14) shows the insignificant effect of the addition of aliphatic hydrocarbon descriptor on the developed model. The reason can be due to the similarity of the structure of hydrocarbons in the present dataset. Therefore, the model was developed using only X2, an anion descriptor, and a cation descriptor (Eq. 18).
Plot (a) in Fig. 1 shows the Y2 values predicted using Eq. (12) versus Y2Exp and plot (b) shows the Y2 values predicted using Eq. (18) versus Y2Exp. The results clearly show that considering the anion and cation descriptors in the developed model increases the estimation capability of the constructed model.
The estimated Y2 values versus experimental Y2 values in the main set for the developed model using (a) only X2 variable (i.e., Eq. (12)), (b) X2 and GA-chosen descriptors regardless turning point (i.e., Eq. (18)), and (c) X2 and GA-chosen descriptors considering turning point (i.e., Eq. (21) or (22)).
In order to investigate the results of modeling using polynomials containing higher degrees of X2, the second to fifth power of X2 was also considered as independent variables along with X2 and anion, cation and aliphatic hydrocarbon descriptors. The mentioned variables were given as input to QSARINS software. The outcome of this analysis confirmed that the inclusion of higher powers of X2 do not improve the accuracy of the developed model. Therefore, the linear model seems to be the most appropriate model.
As plot (b) in Fig. 1 shows, the predicted Y2 values are negative for some data points. By considering the X2 = 0.15 as a turning point and constructing the model separately for X2 range larger than 0.15 and lower than 0.15, this problem was solved (Please check plot (c) in Fig. 1). In addition, the comparison of the X2 coefficient for the models developed using only the X2 variable shows that in the range of 0–0.15, the Y2 values have a stronger dependence on the X2 value (see Eqs. (19) and (20)) in Table 4). Therefore, developing two linear models with the same variables for the range of 0 to 0.15 and 0.15 to 1 can upgrade the estimation capability of the model. In this regard, the model was initially created for the X2 range of 0.15 to 1 by selecting suitable descriptors for both anions and cations. Subsequently, the determined variables were utilized to develop the model for the range of 0 to 0.15.
In the construction of QSPR models, identifying the appropriate number of descriptors used in the model is important. The breaking point plot is used as a valuable tool to determine the optimal number of descriptors. In this regard, models were developed using X2 and anion or cation descriptors for the range of 0.15–1. Also, R2 and Fisher function (F) values were determined for each constructed model. As a result, the effect of increasing the number of descriptors of each group on the estimation capability of the constructed model was determined. Considering the models developed using only X2 and anion descriptors (see Fig. 2), only one anion descriptor was needed to develop the model. Moreover, an increment in the number of anion descriptors not only had a negligible effect on the accuracy of the model but also caused a considerable decrease in the F-value. In addition, the breaking point plot confirmed that increasing the number of cation descriptors does not result in a significant improvement in the accuracy of the model. Therefore, due to the effectiveness of the presence of cation descriptors in determining Y2 values, using one cation descriptor in the model is sufficient. Finally, three variables (i.e., X2, one anion descriptor, and one cation descriptor) were selected to develop the final model. Taking into account the X2 = 0.15 as the turning point, the model was constructed for the range of 0.15 to 1 using three mentioned variables (Eq. (21)). Afterward, considering the points (0,0) and (0.15, Y2 predicted by Eq. (21), the model for the range 0 to 0.15 was also developed (Eq. (22)). As a result, the final model consists of two linear models developed for 0 to 0.15 and 0.15 to 1 ranges.
The plot (c) in Fig. 1 shows the Y2 values predicted using Eqs. (21) and (22) versus Y2Exp. A comparison between plots (b) and (c) in Fig. 1 reveals that the constructed model considering the turning point is not only more accurate compared to the model without the turning point, but also there is no negative predicted Y2 value in the model.
Evaluation and validation of the constructed linear QSPR model
In addition to internal validation, external validation is also very important for the evaluation of the constructed QSPR models. In the external validation, most of the data are classified into train set to be used for the model development. The rest of the data are classified in the test set to investigate the validation of the constructed model. In this regard, 26% of the total data (i.e., 30 ternary systems which are identified in Table 3) including [BF4] anion, [DCA] anion, [TfO] anion, [BMpyr] cation, [C10MIM] cation, [N4111] cation, n-decane and n-dodecane were involved in the test set. The external validation of the model is more reliable due to the inclusion of three anions, two cations, and two aliphatic hydrocarbons exclusively in the test set. In other words, the test set contains four ternary systems in which all anion, cation, and aliphatic hydrocarbon structures are different considering the involved structures in the train set (i.e., 61st, 62nd, 66th, and 67th systems).
To highlight the benefits of considering the X2 = 0.15 as a turning point, Eq. (23) was constructed utilizing the train set for X2 values higher than 0.15, at first (see Table 5). The linear model for the X2 range lower than 0.15 (Eq. (24)) was developed considering two points (i.e., (0,0) and (0.15, Y2 predicted by Eq. (23)). The Y2 values predicted using Eq. (23) or (24) for all data points can be found in the supporting information Excel file.
Figure 3 shows William’s plot, residual plot, and Y2 values predicted using Eq. (23) or (24) versus experimental Y2 values for test and train sets. According to the William’s plot, no outlier is observed in the employed dataset. The outcome of statistical evaluation of the final QSPR model after train and test classification, are reported in Table 6.
Non-linear model construction
As mentioned, GP and ANN methods were chosen to develop non-linear models. First, Eq. (25) was developed using the four mathematical operators of addition, subtraction, multiplication and division.
Afterward, the MLP network was employed to develop a non-linear ANN model. The network inputs were X2 along with an anion descriptor and a cation descriptor selected by GA (see Fig. 4). For the hidden layers, the hyperbolic tangent sigmoid non-linear activation function was selected and for the output layer, the linear activation function was selected. Since the Scaled Conjugate Gradient learning algorithm resulted in the minimum MSE values, it was used to train the network. Finally, an MLP network containing one hidden layer with 5 neurons was selected as the most appropriate MLP network. The predicted Y2 values using the GP and MLP models as well as the weights and biases of the MLP network can be found in the supporting information Excel file.
Evaluation and validation of the constructed non-linear QSPR models
The train and test sets for the non-linear models were considered the same as the linear model. The values of the predicted Y2 versus experimental Y2 and also the residual values versus experimental Y2 for GP and ANN models are presented in Fig. 5. The statistical parameters of the final linear and non-linear constructed QSPR models for train and test sets are presented in Table 7.
(a) Estimated Y2 values versus experimental Y2 values using GP-based model, (b) residuals versus experimental Y2 using GP-based model, (c) estimated Y2 values versus experimental Y2 values using MLP-based model, and (d) residuals versus experimental Y2 using MLP-based model (Train and test sets are considered).
In this study, the accuracy of the QSPR models was evaluated by computing and presenting the statistical parameters of the developed models which are tabulated in Table 8. A comparison between the statistical parameters of Eq. (12) and Eq. (18) reveals that considering the anion and cation descriptors in the model development improves the model accuracy significantly. Additionally, a comparison between the statistical parameters of Eq. (18) with Eqs. (21) and (22) confirms that considering the turning point in the model development increases the estimation capability of the model. On the other hand, the comparison of statistical parameters of non-linear models with the final linear model (i.e., Eq. (23) or (24)) shows that although the prediction accuracy of the model has increased slightly, this increase is not significant. One reason can be the variety of the anions and cations in the dataset. The more important reason that seems to be the almost linear subordination of Y2 to X2 and the selected descriptors. Therefore, compared to non-linear models, the developed linear model has a favorable prediction accuracy along with its simplicity. Therefore, the linear model was considered a more favorable model due to its less complexity and applicability in the interpretation of descriptors.
Discussion
In the development of the QSPR model, it was found that the effect of the aliphatic hydrocarbon structure on the estimation of Y2 value is insignificant. Therefore, the aliphatic hydrocarbon descriptor was not considered in the developed final model. The name and structure of all aliphatic hydrocarbons are presented in Table 9.
After developing the QSPR model, the analysis of the chosen descriptors within the model helps to find their possible importance in estimating the target variable (Y2). In this regard, anion and cation descriptors that appeared in the model have been introduced and investigated.
Interpretation of the anion descriptor
HTm descriptor was GA-selected anion descriptor in the final QSPR model. HTm is a GETAWAY (GEometry, Topology, and Atom-Weights AssemblY) descriptor. GETAWAY descriptors attempt to match the 3D molecular geometry provided by the molecular influence (or leverage) matrix and atom connectivity using chemical information, topology, and various atomic weighting schemes72. Considering the matrix type, GETAWAY descriptor can be classified into two categories: influence/distance matrix R (R-GETAWAY) and molecular influence matrix H (H-GETAWAY). The matrix type for calculating the HTm descriptor is the influence matrix H and it is weighted by atomic masses. As a symmetric matrix, the molecular influence matrix (H) is defined by Eq. (26)73:
H is a symmetric A × A matrix, with the M matrix comprising three columns representing the central Cartesian atomic coordinates (x, y, z), and the A row representing the atoms of a molecule. The calculations are performed on the H-filled molecular graph. The superscript T is the transposed of the M matrix. Diagonal elements in the H matrix (i.e., hii) are called leverage and are determined as follows:
where h is the element of the H matrix, the average value of the diagonal terms in the H matrix is represented by \(\overline{{\text{h}}}\), while the rank of the M matrix is denoted by D. For linear molecules, D is equal to 1, for planar molecules it is 2, and for 3D-molecules it is 3. It is important to note that the molecular influence matrix remains unchanged regardless of rotation.
Leverages (hii) represent the contribution or influence of each atom in determining the shape of the molecule. Mantle atoms in the structure of molecules always have higher leverage values compared to the atoms near the center of the molecule. In addition, the size and shape of the molecule are effective in determining the maximum leverage values. In spherical molecular structures, the leverage values are lower compared to the linear molecular structures. The maximum leverage decreases as the number of atoms (i.e., the molecular size) increases in the molecules that have almost the same shape. Consequently, the HTm descriptor is particularly useful for studying the molecular systems where atom connectivity and the spatial distribution of atomic masses have a significant effect on the molecular behavior, such as in the extraction processes. This molecular descriptor is capable to reflect the geometrical positioning of atoms, along with their influence on the molecular interactions, provides a more accurate representation of the overall molecular structure.
The off-diagonal elements (hij) reflect the ability of atom j to interact with atom i. The positive values of hij indicate that atoms i and j are located in the same molecular region and the probability of atomic interactions between them is higher. On the other hand, the negative values of hij show that the mentioned atoms are located in the molecular region opposite to the center of the molecule and the probability of atomic interactions between them is low.
The coefficient of HTm is positive in Eq. (23) or (24) which shows that increasing the HTm values of the anion structures improves the benzene extraction. The HTm descriptor values are presented in Table 10. The bond length and the number of bonds are effective in determining the leverage value. Therefore, between the atoms close to the surface of the anion, the larger atoms (which are chlorine, sulfur, fluorine, oxygen, and nitrogen atoms, respectively), have larger leverage values. Therefore, the presence of electronegative atoms near the anion surface can increase the interaction between the hydrogen atom in benzene and the electronegative atom in anion. Additionally, the size and shape of anion structure is effective in determining HTm descriptor values. Therefore, the HTm descriptor’s dependence on the atomic masses and the molecular topology makes it a key factor in evaluating the intermolecular interactions, especially in systems where the electron density and spatial arrangement play pivotal roles. In the benzene extraction process, these interactions are crucial because they directly affect the efficiency of the phase transfer and separation processes. Therefore, the HTm descriptor, through its leverage of the atomic masses and 3D structural information, offers deeper insight into the anion’s capacity to interact with benzene molecules, especially in I.L. systems where the complex molecular behavior is involved.
Interpretation of the cation descriptor
BELe3 descriptor was GA-chosen cation descriptor in the QSPR model. BELe3 is Burden eigenvalues descriptor that is weighted by atomic Sanderson electronegativities. In order to calculate the Burden eigenvalues, Burden matrix (B) is calculated. Burden matrix which is calculated based on the hydrogen-depleted graph is defined as Eq. (31)74.
where Zi is the atomic number of atoms i, and E is the edges of the graph. The conventional bond order, denoted as \({\pi }_{ij}^{*}\), represents different values for various types of bonds. For instance, it takes on values of 0.1, 0.15, 0.2, 0.3, and 0.01 for single, aromatic, double, triple, and terminal bonds, respectively.
The Burden eigenvalues reflect the molecular connectivity and help to predict the interaction strength in the solvent systems. Higher eigenvalues can suggest stronger interactions between the cation and benzene, enhancing the extraction efficiency. In addition, the BELe3 descriptor is particularly relevant for evaluating the interactions in I.L. systems due to its sensitivity to the electronic environment of the cation. Table 11 illustrates that the BELe3 descriptor values is increased by an increase in the length of the alkyl side chain in the cation structure. It may be due to the increase in the number of single bonds. As the alkyl side chain length increases, not only the number of single bonds increase, but also the steric effects of the cation play critical role in the molecular interactions. Eq. (23) or (24) show the positive effect of increasing BELe3 values on the benzene extraction. Therefore, a longer alkyl chain may enhance the benzene extraction. It should be noted that as the cation size increases, the Coulombic interaction between the anion and the cation decreases. Consequently, the CH-π interaction between the cation and benzene increases. This enhancement is particularly crucial in the I.L. systems, where these interactions can significantly influence the solubility and extraction outcomes. Moreover, the electronegativity of the atoms in the structure of the cations is effective in determining the Y2 value. It seems that the impact of electronegativity in this context underscores the significance of cation structure design in optimizing the extraction efficiency, as it affects the overall interaction energy and stability of the I.L. with benzene.
Conclusion
In the present study, after collecting an extensive dataset including 112 ternary systems containing I.L., benzene, and aliphatic hydrocarbon, new linear and non-linear QSPR models were developed to estimate the benzene distribution between the aliphatic hydrocarbon-rich and I.L.-rich phases. For this purpose, the existence of correlation between X2 and Y2 values was confirmed for each of the ternary systems, at first. Afterward, considering the anion, cation, and aliphatic hydrocarbon molecular descriptors in the model, it was found that the aliphatic hydrocarbon descriptor has a negligible effect in predicting Y2 values. Consequently, only cation and anion molecular descriptors were considered in the constructed model. It was also found that considering the X2 = 0.15 as a turning point is effective in increasing the estimation capability of the constructed model. Moreover, the breaking point plot showed that only one anion descriptor and one cation descriptor were sufficient for the development of the model. As a result, the final linear model was constructed using X2, one anion descriptor, and one cation descriptor considering the turning point. Internal and external validation confirmed that the linear constructed QSPR model can estimate Y2 with reasonable accuracy. GP and MLP machine learning method were employed for developing non-linear QSPR models. The results showed that the prediction accuracy for the non-linear models is only slightly better compared to the linear model. Therefore, the linear model was preferred in this study due to its simplicity and ease of interpretation.
Interpretation of the GA-chosen descriptors in the constructed model revealed that the size of the cation and anion, the electronegativity of the atoms in the cation structure, the shape of anion, and the presence of electronegative atoms near the surface of the anions are effective in predicting Y2 values. The QSPR-developed model is an applicable tool to predict Y2 values for unstudied ternary systems (only 112 ternary systems from 4080 synthesizable ternary systems using 17 anions, 20 cations, and 12 aliphatic hydrocarbons have been studied). The outcome of the present study is a step forward in the development of a computational approach for solvent screening in the extractive removal of benzene from fuel mixtures containing aromatic/aliphatic compounds using I.Ls as green solvents.
Data availability
It should be justified that “All data generated or analysed during this study are included in this published article [and its supplementary information files]”.
Abbreviations
- C2MIM:
-
1-Ethyl-3-methyl imidazolium
- C3MIM:
-
1-Propyl-3-methylimidazolium
- C4MIM:
-
1-Butyl-3-methyl imidazolium
- C5MIM:
-
1-Pentyl-3-methylimidazolium
- C6MIM:
-
1-Hexyl-3-methylimidazolium
- C8MIM:
-
1-Methyl-3-octyl imidazolium
- C10MIM:
-
1-Decyl-3-methylimidazolium
- C12MIM:
-
1-Dodecyl-3-methylimidazolium
- COC2MIM:
-
1-(2-Methoxyethyl)-3-methylimidazolium
- BMPyr:
-
1-Butyl-1-methylpyrrolidinium
- EMPY:
-
1-Ethyl-3-methyl pyridinium
- EPY:
-
1-Ethylpyridinium
- BPY:
-
1-Butylpyridinium
- HPY:
-
1-Hexylpyridinium
- N4111 :
-
Butyltrimethylammonium
- N4441 :
-
Tributylmethylammonium
- C2 :
-
Ethyl-(2-hydroxy-ethyl)-dimethyl-ammonium
- TEMA:
-
(2-Hydroxyethyl) methylammonium
- Pre:
-
Predicted data
- Exp:
-
Experimental data
- SCN:
-
Thiocyanate
- MeSO4 :
-
Methyl sulfate
- OAc:
-
Acetate
- DEP:
-
Diethyl phosphate
- DCA:
-
Dicyanamide
- MP:
-
Methyl phosphonate
- OTf:
-
Trifluoromethanesulfonate
- EtSO4 :
-
Ethyl sulfate
- OcSO4 :
-
Octyl sulfate
- MDEGSO4 :
-
Diethylenglycol monomethyl ether sulfate
- BF4 :
-
Tetrafluoroborate
- NTF2 :
-
Bis(trifluoromethylsulfonyl)imide
- Cl:
-
Chloride
- AlCl4 :
-
Aluminum tetrachloride
- FeCl4 :
-
Tetrachloroferrate
- NO3 :
-
Nitrate
- PF6 :
-
Hexafluorophosphate
- Cat:
-
Cation descriptor
- Ani:
-
Anion descriptor
- HC:
-
Aliphatic hydrocarbon descriptor
References
García, S., Larriba, M., Casas, A., García, J. & Rodríguez, F. Separation of toluene and heptane by liquid-liquid extraction using binary mixtures of the ionic liquids 1-butyl-4-methylpyridinium bis(trifluoromethylsulfonyl)imide and 1-ethyl-3-methylimidazolium ethylsulfate. J. Chem. Eng. Data 57(9), 2472–2478. https://doi.org/10.1021/je300635c (2012).
Rodriguez, N. R., Requejo, P. F. & Kroon, M. C. Aliphatic–aromatic separation using deep eutectic solvents as extracting agents. Ind. Eng. Chem. Res. 54(45), 11404–11412. https://doi.org/10.1021/acs.iecr.5b02611 (2015).
Requejo, P. F., Gómez, E., Calvar, N. & Domínguez, Á. Application of pyrrolidinium-based ionic liquid as solvent for the liquid extraction of benzene from its mixtures with aliphatic hydrocarbons. Ind. Eng. Chem. Res. 54(4), 1342–1349. https://doi.org/10.1021/ie5040489 (2015).
Ayuso, M. et al. Separation of benzene from methylcycloalkanes by extractive distillation with cyano-based ionic liquids: Experimental and CPA EoS modelling. Sep. Purif. Technol. 234, 116128. https://doi.org/10.1016/j.seppur.2019.116128 (2020).
Ángeles, D., Begoña, G., Patricia, F. R. & Sandra, C. Extraction of aromatic compounds from their mixtures with alkanes: from ternary to quaternary (or higher) systems. In Solvents, Ionic Liquids and Solvent Effects (eds. Daniel, G.-M. & Magdalena, M.) (IntechOpen, 2019).
Lyu, Y., Brennecke, J. F. & Stadtherr, M. A. Review of recent aromatic–aliphatic–ionic liquid ternary liquid-liquid equilibria and their modeling by COSMO-RS. Ind. Eng. Chem. Res. 59(19), 8871–8893. https://doi.org/10.1021/acs.iecr.0c00581 (2020).
Meindersma, G. W. & de Haan, A. B. Conceptual process design for aromatic/aliphatic separation with ionic liquids. Chem. Eng. Res. Des. 86(7), 745–752. https://doi.org/10.1016/j.cherd.2008.02.016 (2008).
Hadj-Kali, M. K., Salleh, Z., Ali, E., Khan, R. & Hashim, M. A. Separation of aromatic and aliphatic hydrocarbons using deep eutectic solvents: a critical review. Fluid Phase Equilib. 448, 152–167. https://doi.org/10.1016/j.fluid.2017.05.011 (2017).
Requejo, P. F., Calvar, N., Domínguez, Á. & Gómez, E. Comparative study of the LLE of the quaternary and ternary systems involving benzene, n-octane, n-decane and the ionic liquid [BMpyr][NTf2]. J. Chem. Thermodyn. 98, 56–61. https://doi.org/10.1016/j.jct.2016.02.027 (2016).
Kȩdra-Królik, K., Fabrice, M. & Jaubert, J.-N. Extraction of thiophene or pyridine from n-heptane using ionic liquids. Gasoline and diesel desulfurization. Ind. Eng. Chem. Res. 50(4), 2296–2306. https://doi.org/10.1021/ie101834m (2011).
Domínguez, I., Requejo, P. F., Canosa, J. & Domínguez, Á. (Liquid+liquid) equilibrium at T=298.15K for ternary mixtures of alkane+aromatic compounds+imidazolium-based ionic liquids. J. Chem. Thermodyn. 74, 138–143. https://doi.org/10.1016/j.jct.2014.01.022 (2014).
González, E. J., Calvar, N., Gómez, E. & Domínguez, Á. Separation of benzene from alkanes using 1-ethyl-3-methylpyridinium ethylsulfate ionic liquid at several temperatures and atmospheric pressure: Effect of the size of the aliphatic hydrocarbons. J. Chem. Thermodyn. 42(1), 104–109. https://doi.org/10.1016/j.jct.2009.07.017 (2010).
Arce, A., Earle, M. J., Rodríguez, H. & Seddon, K. R. Separation of benzene and hexane by solvent extraction with 1-Alkyl-3-methylimidazolium bis{(trifluoromethyl)sulfonyl}amide ionic liquids: effect of the alkyl-substituent length. J. Phys. Chem. B 111(18), 4732–4736. https://doi.org/10.1021/jp066377u (2007).
Zhou, T. et al. Evaluation of the ionic liquids 1-alkyl-3-methylimidazolium hexafluorophosphate as a solvent for the extraction of benzene from cyclohexane: (Liquid+liquid) equilibria. J. Chem. Thermodyn. 48, 145–149. https://doi.org/10.1016/j.jct.2011.12.006 (2012).
Gómez, E., Domínguez, I., Calvar, N. & Domínguez, Á. Separation of benzene from alkanes by solvent extraction with 1-ethylpyridinium ethylsulfate ionic liquid. J. Chem. Thermodyn. 42(10), 1234–1239. https://doi.org/10.1016/j.jct.2010.04.022 (2010).
Requejo, P. F., Calvar, N., Gómez, E. & Domínguez, Á. Study of the suitability of two ammonium-based ionic liquids for the extraction of benzene from its mixtures with aliphatic hydrocarbons. Fluid Phase Equilib. 426, 17–24. https://doi.org/10.1016/j.fluid.2016.02.006 (2016).
Gómez, E., Domínguez, I., Calvar, N., Palomar, J. & Domínguez, Á. Experimental data, correlation and prediction of the extraction of benzene from cyclic hydrocarbons using [Epy][ESO4] ionic liquid. Fluid Phase Equilib. 361, 83–92. https://doi.org/10.1016/j.fluid.2013.10.033 (2014).
Peng, D., Horvat, D. P. & Picchioni, F. Computer-aided ionic liquid design and experimental validation for benzene-cyclohexane separation. Ind. Eng. Chem. Res. 60(13), 4951–4961. https://doi.org/10.1021/acs.iecr.0c05935 (2021).
Yousefinejad, S. & Hemmateenejad, B. Chemometrics tools in QSAR/QSPR studies: a historical perspective. Chemom. Intell. Lab. Syst. 149, 177–204. https://doi.org/10.1016/j.chemolab.2015.06.016 (2015).
Katritzky, A. R., Pacureanu, L., Dobchev, D. & Karelson, M. QSPR study of critical micelle concentration of anionic surfactants using computational molecular descriptors (in eng). J. Chem. Inf. Model. 47(3), 782–93. https://doi.org/10.1021/ci600462d (2007).
Tropsha, A. Best practices for QSAR model development, validation, and exploitation. Mol. Inf. 29(6–7), 476–488. https://doi.org/10.1002/minf.201000061 (2010).
Gorji, A. E. & Sobati, M. A. Toward molecular modeling of thiophene distribution between the ionic liquid and hydrocarbon phases: Effect of hydrocarbon structure. J. Mol. Liq. 287, 110976. https://doi.org/10.1016/j.molliq.2019.110976 (2019).
A. E. Gorji and M. A. Sobati, How anion structures can affect the thiophene distribution between imidazolium-based ionic liquid and hydrocarbon phases? A theoretical QSPR study. Energy Fuels 33(9), 8576–8587. https://doi.org/10.1021/acs.energyfuels.9b02416 (2019).
Nekoeinia, M., Yousefinejad, S. & Abdollahi, A. Prediction of E T N polarity scale of ionic liquids using a QSPR approach. Ind. Eng. Chem. Res. 54, 12682–12689. https://doi.org/10.1021/acs.iecr.5b02982 (2015).
Peng, D., Kleiweg, A.-J., Winkelman, J. G. M., Song, Z. & Picchioni, F. A hierarchical hybrid method for screening ionic liquid solvents for extractions exemplified by the extractive desulfurization process. ACS Sustain. Chem. Eng. 9(7), 2705–2716. https://doi.org/10.1021/acssuschemeng.0c07866 (2021).
Gorji, A. E., Sobati, M. A., Alopaeus, V. & Uusi-Kyyny, P. Toward solvent screening in the extractive desulfurization using ionic liquids: QSPR modeling and experimental validations. Fuel 302, 121159. https://doi.org/10.1016/j.fuel.2021.121159 (2021).
Amereh, M., Ebrahimpoor Gorji, A. & Sobati, M. A. Toward solvent selection for the extractive removal of pyridine from fuels using ionic liquids: a QSPR study. Fuel 343, 127820. https://doi.org/10.1016/j.fuel.2023.127820 (2023).
González, E. J., Calvar, N., Gómez, E. & Domínguez, Á. Separation of benzene from linear alkanes (C6–C9) Using 1-Ethyl-3-methylimidazolium ethylsulfate at T = 298.15 K. J. Chem. Eng. Data 55(9), 3422–3427. https://doi.org/10.1021/je1001544 (2010).
González, E. J., Calvar, N., González, B. & Domínguez, Á. (Liquid+liquid) equilibria for ternary mixtures of (alkane+benzene+[EMpy] [ESO4]) at several temperatures and atmospheric pressure. J. Chem. Thermodyn. 41(11), 1215–1221. https://doi.org/10.1016/j.jct.2009.05.008 (2009).
Zhang, F. et al. Benzyl- and vinyl-functionalized imidazoium ionic liquids for selective separating aromatic hydrocarbons from alkanes. Ind. Eng. Chem. Res. 55(3), 747–756. https://doi.org/10.1021/acs.iecr.5b03814 (2016).
Domínguez, I., González, E. J., González, R. & Domínguez, Á. Evaluation of [C3mim][NTf2] as solvent for the liquid-liquid extraction of benzene from mixtures of benzene and hexane. Sep. Sci. Technol. 47(2), 331–336. https://doi.org/10.1080/01496395.2011.621161 (2012).
Lee, K.-H., You, S.-H. & Park, S.-J. The selectivity of imidazolium-based ionic liquids with different anions to BTX aromatics in hexane at 298.15 K and atmospheric pressure. Korean J. Chem. Eng. 33(10), 2982–2989. https://doi.org/10.1007/s11814-016-0140-4 (2016).
Domańska, U., Pobudkowska, A. & Królikowski, M. Separation of aromatic hydrocarbons from alkanes using ammonium ionic liquid C2NTf2 at T=298.15K. Fluid Phase Equilibria 259(2), 173–179. https://doi.org/10.1016/j.fluid.2007.06.025 (2007).
Sakal, S. A., Lu, Y.-Z., Jiang, X.-C., Shen, C. & Li, C.-X. A Promising ionic liquid [BMIM][FeCl4] for the extractive separation of aromatic and aliphatic hydrocarbons. J. Chem. Eng. Data 59(3), 533–539. https://doi.org/10.1021/je400076x (2014).
Wiśniewski, P. et al. Effect of the ionic liquids on extraction of aromatic and sulfur compounds from the model petrochemical stream. Fluid Phase Equilibria 552, 113296. https://doi.org/10.1016/j.fluid.2021.113296 (2022).
Manohar, C. V., Rabari, D., Kumar, A. A. P., Banerjee, T. & Mohanty, K. Liquid–liquid equilibria studies on ammonium and phosphonium based ionic liquid–aromatic–aliphatic component at T=298.15K and p=1bar: correlations and a-priori predictions. Fluid Phase Equilibria 360, 392–400. https://doi.org/10.1016/j.fluid.2013.10.005 (2013).
García, J., Fernández, A., Torrecilla, J. S., Oliet, M. & Rodríguez, F. Ternary liquid−liquid equilibria measurement for hexane and benzene with the ionic liquid 1-Butyl-3-methylimidazolium methylsulfate at T = (298.2, 313.2, and 328.2) K. J. Chem. Eng. Data 55(1), 258–261. https://doi.org/10.1021/je900321j (2010).
Mokhtarani, B. et al. Ternary (liquid–liquid) equilibria of nitrate based ionic liquid+alkane+benzene at 298.15K: experiments and correlation. Fluid Phase Equilibria 341, 35–41. https://doi.org/10.1016/j.fluid.2012.12.025 (2013).
Hashim, M. A., Zulhaziman, M., Salleh, M., Ali, E. & Hadj-Kali, M. K. Selective extraction of benzene from benzene–cyclohexane mixture using 1-ethyl-3-methylimidazolium tetrafluoroborate ionic liquid. AIP Conf. Proc. 2124(1), 020028. https://doi.org/10.1063/1.5117088 (2019).
Domínguez, I., Calvar, N., Gómez, E. A. & Domínguez, Á. Separation of benzene from heptane using tree ionic liquids: BMimMSO4, BMimNTf2, and PMimNTf2. Procedia Eng. 42, 1597–1605. https://doi.org/10.1016/J.PROENG.2012.07.553 (2012).
Pereiro, A. B. & Rodríguez, A. An ionic liquid proposed as solvent in aromatic hydrocarbon separation by liquid extraction. AIChE J. 56(2), 381–386. https://doi.org/10.1002/aic.11937 (2010).
Revelli, A.-L., Mutelet, F. & Jaubert, J.-N. Extraction of benzene or thiophene from n-heptane using ionic liquids. NMR and thermodynamic study. J. Phys. Chem. B 114(13), 4600–4608. https://doi.org/10.1021/jp911978a (2010).
Letcher, T. & Reddy, P. Ternary (liquid + liquid) equilibria for mixtures of 1-hexyl-3-methylimidazolium (tetrafluoroborate or hexafluorophosphate) + benzene + an alkane at T=298.2 K and p=0.1 MPa. J. Chem. Thermodyn. 37, 415–421. https://doi.org/10.1016/j.jct.2004.05.001 (2005).
Enayati, M., Mokhtarani, B., Sharifi, A. & Mirzaei, M. Extraction of benzene from heptane with pyridinium based ionic liquid at (298.15, 308.15 and 318.15) K. Fluid Phase Equilibria 411, 53–58. https://doi.org/10.1016/j.fluid.2015.12.009 (2016).
Deenadayalu, N., Ngcongo, K. C., Letcher, T. M. & Ramjugernath, D. Liquid−liquid equilibria for ternary mixtures (an ionic liquid + benzene + heptane or hexadecane) at T = 298.2 K and atmospheric pressure. J. Chem. Eng. Data 51(3), 988–991. https://doi.org/10.1021/je050494l (2006).
Gómez, E., Domínguez, I., González, B. & Domínguez, Á. Liquid−liquid equilibria of the ternary systems of alkane + aromatic + 1-ethylpyridinium ethylsulfate ionic liquid at T = (283.15 and 298.15) K. J. Chem. Eng. Data 55(11), 5169–5175. https://doi.org/10.1021/je100716c (2010).
Requejo, P. F., Calvar, N., Domínguez, Á. & Gómez, E. Application of the ionic liquid tributylmethylammonium bis(trifluoromethylsulfonyl)imide as solvent for the extraction of benzene from octane and decane at T = 298.15 K and atmospheric pressure. Fluid Phase Equilibria 417, 137–143. https://doi.org/10.1016/J.FLUID.2016.02.028 (2016).
Domínguez, I., González, E. J., González, R. & Domínguez, Á. Extraction of benzene from aliphatic compounds using commercial ionic liquids as solvents: study of the liquid-liquid equilibrium at T = 298.15 K. J. Chem. Eng. Data 56(8), 3376–3383. https://doi.org/10.1021/je200334e (2011).
Requejo, P. F., Calvar, N., Domínguez, Á. & Gómez, E. Determination and correlation of (liquid+liquid) equilibria of ternary and quaternary systems with octane, decane, benzene and [BMpyr][DCA] at T=298.15K and atmospheric pressure. J. Chem. Thermodyn. 94, 197–203. https://doi.org/10.1016/j.jct.2015.11.016 (2016).
Calvar, N., Domínguez, I., Gómez, E., Palomar, J. & Domínguez, Á. Evaluation of ionic liquids as solvent for aromatic extraction: Experimental, correlation and COSMO-RS predictions. J. Chem. Thermodyn. 67, 5–12. https://doi.org/10.1016/j.jct.2013.07.011 (2013).
Letcher, T. M. & Naicker, P. K. Ternary liquid−liquid equilibria for mixtures of an n-alkane + an aromatic hydrocarbon + N-Methyl-2-pyrrolidone at 298.2 K and 1 atm. J. Chem. Eng. Data 43(6), 1034–1038. https://doi.org/10.1021/je980114e (1998).
González, E. J., Calvar, N., González, B. & Domínguez, Á. Liquid extraction of benzene from its mixtures using 1-ethyl-3-methylimidazolium ethylsulfate as a solvent. J. Chem. Eng. Data 55(11), 4931–4936. https://doi.org/10.1021/je100508y (2010).
Salleh, M. Z. M., Hadj-Kali, M. K., Hashim, M. A. & Mulyono, S. Ionic liquids for the separation of benzene and cyclohexane – COSMO-RS screening and experimental validation. J. Mol. Liquids 266, 51–61. https://doi.org/10.1016/j.molliq.2018.06.034 (2018).
Zhou, T. et al. Deep separation of benzene from cyclohexane by liquid extraction using ionic liquids as the solvent. Ind. Eng. Chem. Res. 51(15), 5559–5564. https://doi.org/10.1021/ie202728j (2012).
Ismail, M., Bustam, M. A. & Man, Z. Extraction of benzene and cyclohexane using [BMIM][N(CN)<SUB>2</SUB>] and their equilibrium modelling. Adv. Mater. Sustaina. Growth 1901, 080001, https://doi.org/10.1063/1.5010516 (2017).
Wang, R., Wang, J., Meng, H., Li, C. & Wang, Z. Liquid−liquid equilibria for benzene + cyclohexane + 1-methyl-3-methylimidazolium dimethylphosphate or + 1-ethyl-3-methylimidazolium diethylphosphate. J. Chem. Eng. Data 53(5), 1159–1162. https://doi.org/10.1021/je700759h (2008).
Domínguez, I., Calvar, N., Gómez, E. & Domínguez, Á. Liquid–liquid extraction of aromatic compounds from cycloalkanes using 1-butyl-3-methylimidazolium methylsulfate ionic liquid. J. Chem. Eng. Data 58(2), 189–196. https://doi.org/10.1021/je300826t (2013).
Lyu, Z. et al. Simulation based ionic liquid screening for benzene–cyclohexane extractive separation. Chem. Eng. Sci. 113, 45–53. https://doi.org/10.1016/j.ces.2014.04.011 (2014).
Salleh, M. Z. M., Hadj-Kali, M. K., Hashim, M. A. & Ali, E. Extractive separation of benzene and cyclohexane using 1-butyl-3-methylimidazolium acetate. IOP Conf. Ser.: Mater. Sci. Eng. 458(1), 012067. https://doi.org/10.1088/1757-899X/458/1/012067 (2018).
González, E. J., Calvar, N., González, B. & Domínguez, Á. Measurement and correlation of liquid–liquid equilibria for ternary systems cyclooctane+aromatic hydrocarbon+1-ethyl-3-methylpyridinium ethylsulfate at T=298.15K and atmospheric pressure. Fluid Phase Equilibria 291(1), 59–65. https://doi.org/10.1016/j.fluid.2009.12.019 (2010).
Domínguez, I., González, E. J. & Domínguez, Á. Liquid extraction of aromatic/cyclic aliphatic hydrocarbon mixtures using ionic liquids as solvent: Literature review and new experimental LLE data. Fuel Process. Technol. 125, 207–216. https://doi.org/10.1016/j.fuproc.2014.04.001 (2014).
Mitchell, M. An Introduction to Genetic Algorithms (The MIT Press, 1996). https://doi.org/10.7551/mitpress/3927.001.0001.
Haupt, R. L. & Haupt, S. E. Practical Genetic Algorithms (Wiley InterScience electronic collection) (Wiley, 2004).
Goldberg, D. E. Genetic Algorithms in Search Optimization and Machine Learning (. Addison-Wesley Longman Publishing Co., 1989).
Whitley, D. A Genetic Algorithm Tutorial, Statistics and Computing, vol. 4 (Springer, 1998). https://doi.org/10.1007/BF00175354.
Gramatica, P. Principles of QSAR modeling: comments and suggestions from personal experience. Int. J. Quant. Struct.-Property Relationships (IJQSPR) 5(3), 61–97. https://doi.org/10.4018/IJQSPR.20200701.oa1 (2020).
Gramatica, P. & Sangion, A. A historical excursus on the statistical validation parameters for QSAR models: a clarification concerning metrics and terminology. J. Chem. Inf. Model. 56(6), 1127–1131. https://doi.org/10.1021/acs.jcim.6b00088 (2016).
Gramatica, P., Chirico, N., Papa, E., Cassani, S. & Kovarich, S. QSARINS: A new software for the development, analysis, and validation of QSAR MLR models. J. Comput. Chem. 34(24), 2121–2132. https://doi.org/10.1002/jcc.23361 (2013).
Ghomisheh, Z., Gorji, A. E. & Sobati, M. A. Prediction of critical properties of sulfur-containing compounds: new QSPR models. J. Mol. Graph. Model. 101, 107700. https://doi.org/10.1016/j.jmgm.2020.107700 (2020).
Xu, Q.-S. & Liang, Y.-Z. Monte Carlo cross validation. Chemometr. Intel. Lab. Syst. 56(1), 1–11. https://doi.org/10.1016/S0169-7439(00)00122-2 (2001).
Hastie, T., Tibshirani, R. & Friedman, J. H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer series in statistics) (Springer, 2001).
Gackowski, M. et al. Application of multivariate adaptive regression splines (MARSplines) for predicting antitumor activity of anthrapyrazole derivatives. Int. J. Mol. Sci. 23, 9. https://doi.org/10.3390/ijms23095132.
Leszczynski, J. Handbook of Computational Chemistry (Handbook of Computational Chemistry) (Springer, 2012).
Todeschini, R., Consonni, V., Mannhold, R., Kubinyi, H. & Timmerman, H. Handbook of Molecular Descriptors (Methods & Principles in Medicinal Chemistry) (Wiley, 2008).
Acknowledgements
The authors sincerely acknowledge Prof. Paola Grammatica, senior professor of the University of Insubria for providing the free license of the QSARINS software.
Author information
Authors and Affiliations
Contributions
Mahdieh Amereh: Data curation; Investigation; Methodology; Formal analysis; Software; Writing—Original Draft; Visualization; Validation Ali Ebrahimpoor Gorji: Investigation; Methodology; Formal analysis; Validation Mohammad Amin Sobati: Conceptualization; Methodology; Supervision; Writing—Review & Editing; Project administration.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Amereh, M., Gorji, A.E. & Sobati, M.A. Molecular insights on the solvent screening for the benzene extraction from fuels using ionic liquids via QSPR method. Sci Rep 14, 30718 (2024). https://doi.org/10.1038/s41598-024-79639-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-024-79639-x