Using data-driven models to simulate the performance of surfactants in reducing heavy oil viscosity | Scientific Reports
Scientific Reports volume 14, Article number: 27670 (2024) Cite this article
Unconventional resources, particularly heavy oil reservoirs, are critical for meeting ever-increasing global energy demand, and a substantial body of literature explores the challenges associated with exploring and exploiting these underground resources. By injecting surfactants into heavy oil, chemically enhanced oil recovery (EOR) may enable emulsification, which can reduce the viscosity of heavy oil and facilitate extraction and transportation. In this work, a large experimental dataset, containing 2020 data points, was extracted from the literature for modeling oil-in-water (O/W) emulsion viscosity using machine learning (ML) methods. The algorithms used pressure, temperature, salinity, surfactant concentration, type of surfactant, shear rate, and crude oil density as inputs. Five ML algorithms were selected and optimized: adaptive boosting (AB), convolutional neural network (CNN), ensemble learning (EL), artificial neural network (ANN), and decision tree (DT). A combined simulated annealing (CSA) method was utilized to optimize all algorithms. With AARE, R2, MAE, MSE, and RMSE values of 8.982, 0.996, 0.004, 0.0002, and 0.0132, respectively, the ANN predictor exhibited the highest accuracy in predicting O/W emulsion viscosity for the total data (train and test subsets combined). A Monte-Carlo sensitivity analysis was also performed to determine the impact of input features on the model output. By using the proposed ML predictor, emulsion viscosity predictions can be expedited without the need for expensive and time-consuming experiments.
The large-scale exploitation of heavy oil reservoirs, as unconventional fossil resources, has become hugely important in meeting economic development needs and plays a leading role in the further development of the oil and gas industry, which currently supplies global energy demands1,2,3,4. Nevertheless, the merit of these replacement resources comes with several drawbacks, as they contain highly viscous asphaltenic deposits that trouble both production and transportation units5. The scientific community has addressed these technical and economic hurdles with thermal, chemical, and miscible enhanced recovery treatment methods6,7,8,9. Considering the arguments made by Chen, et al.10, the utilization of surface-active agents (surfactants) seems a promising approach to reduce the viscosity of heavy oil and assist the production operation through emulsification.
Emulsification is known as the formation of tiny droplets of a liquid that are dispersed into a continuous phase of another immiscible liquid. Having a binary immiscible liquid mixture, e.g., oil and water, one can form emulsions by adding a surfactant and introducing energy to the system11. It should be noted that the nature of the emerging emulsion and its stability is primarily contingent upon the type of surfactant and emulsion preparation protocols12. Surfactants have amphipathic molecular structure which provides them with a hydrophilic head and a hydrophobic tail. This allows them to overcome the strong interfacial tension (IFT) between two immiscible liquids, alter rock wettability, and ultimately produce thermodynamically stabilized emulsions10,13. Due to the mobility improvements, oil-in-water (O/W) emulsions have shown favorable impacts in chemical enhanced oil recovery (EOR) and heavy oil transportation14,15,16,17,18,19,20. Therefore, the technical viability and controlling factors of this recovery method have been extensively investigated in the literature.
Organic (asphaltene and wax) and inorganic (inorganic salt) scales in crude oil are related to emulsion stability21. To elaborate, the tightness of an O/W emulsion is directly affected by the asphaltene content of the crude oil, which is associated with smaller water droplet sizes22. Increasing wax content, likewise, can form tighter emulsions23. Furthermore, Kumar and Mahto24 stated that increasing the pH of the aqueous phase can slightly reduce the emulsion viscosity and remarkably enhance the stability. Among other key influencing factors, higher shear rate (and consequently shear time), higher salinity, and higher temperature have a synergetic effect on emulsion stability25,26,27. Clearly, decades of extensive laboratory efforts have been made to optimize O/W emulsification. However, unlike for water-in-oil (W/O) emulsions, empirical viscosity prediction models for O/W emulsions are scarce28. So far, Shi, et al.29 are the only ones to tackle this problem, combining two well-known models originally introduced for W/O emulsions30,31 to estimate O/W emulsion viscosity. Therefore, it remains unclear whether these valuable experimental results can help model the viscosity reduction of heavy oil through O/W emulsification and eliminate costly, time-consuming measurements. In addition, accurate viscosity prediction is conducive to effectively designing wellbore completions and surface facilities.
Computationally efficient, flexible, and powerful classic and modern ML techniques have attracted attention in various sectors of geoscience, including geology32,33, geophysics34,35,36, petrophysics37,38,39,40,41,42, fluid properties43, rock-fluid integration44,45, pore-scale simulations46,47,48, EOR49,50,51, drilling engineering52,53,54,55, and production operations56,57,58. A relevant line of research has employed classic machine learning (ML) techniques for W/O emulsion viscosity modeling. For instance, the behavior of petroleum emulsions was studied by Umar, et al.59 using gene expression programming (GEP) and a statistical approach; they discovered that the aging time of an emulsion is strongly linked to emulsion viscosity. Moreover, Li, et al.60 developed a viscosity prediction model based on the Cuckoo optimization algorithm least squares support vector machine (COA-LSSVM) and demonstrated better results in comparison to empirical models. To the best of the authors' knowledge, no study has focused on modeling the O/W emulsion viscosity using ML.
The objective of the current research is to present ML-assisted O/W emulsion viscosity predictors based on a dataset of 2020 data points. To do so, five ML algorithms were implemented: adaptive boosting (AB), convolutional neural network (CNN), ensemble learning (EL), artificial neural network (ANN), and decision tree (DT). Pressure, temperature, salinity, surfactant concentration, type of surfactant, shear rate, and crude oil density are fed as input features to each algorithm to predict the O/W emulsion viscosity. Subsequently, the performance of all predictors is evaluated and compared.
In this work, a databank consisting of 2020 data points of oil-in-water emulsion viscosities was gathered from previously well-established papers9,14,61,62,63,64,65,66,67,68,69,70,71,72. In these resources, a great deal of effort has been dedicated to experimentally demonstrating the performance of various surfactants, including sodium dodecyl sulfate (SDS), boronic ester anionic-nonionic surfactant (SYW), SYW compounded with oleic acid and ethanolamine (SYG), gemini surfactant (GS), sodium nonylphenol polyoxyethylene ether sulfate (NPES), octyl phenol ethoxylate (Triton X-100), aliphatic alcohol ethoxylate (AEO-9), Span 60, Ralufon 414, alkylphenol polyoxyethylene ether (APE), erucamidopropyl hydroxypropyl sulfobetaine, sodium lignosulfonate (SLS), cetyl trimethyl ammonium chloride (CTAC), cetyl trimethyl ammonium bromide (CTAB), octadecyl trimethylammonium chloride (OTAC), sodium carbonate (Na2CO3), polyoxyethylene sorbitan monooleate (PS-81), Karanj surfactant, chitosan-based cationic surfactant (CBCS), a newly developed surfactant molecule (AA), and tri-triethanolamine monosunflower ester, in reducing the viscosity of heavy crude oil by O/W emulsion formation. It is worth noting that other surfactants used for crude oil viscosity reduction are reported in the literature; however, due to missing values for important influencing factors such as salinity and shear rate, those data could not be used in this study.
There are two types of duplicates: Type I (identical independent and dependent parameters) and Type II (same independent features, different response values). For Type I duplicates, the first occurrence is kept, while Type II duplicates are excluded entirely. Low-variance features have nearly constant values and should be excluded, as they do not aid modeling; the data should be normalized before calculating variance, and features with variances below 0.005 are considered low-variance. Row and column removal is applied iteratively, since deleting rows can create new low-variance columns and deleting columns can create new duplicate rows. This process identified and removed 2 Type I and 10 Type II duplicates, resulting in a database with 2008 rows and 8 columns.
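The iterative cleaning loop described above can be sketched in pandas; the function name, column names, and loop structure here are illustrative assumptions, not the authors' code:

```python
import pandas as pd

def clean(df, target="viscosity", var_threshold=0.005):
    # Sketch of the iterative duplicate / low-variance cleaning described above.
    while True:
        shape_before = df.shape
        # Type I duplicates (same inputs AND response): keep first occurrence
        df = df.drop_duplicates(keep="first")
        # Type II duplicates (same inputs, different response): drop all copies
        inputs = [c for c in df.columns if c != target]
        df = df[~df.duplicated(subset=inputs, keep=False)]
        # Low-variance columns: min-max normalize, then drop variance < threshold
        norm = (df[inputs] - df[inputs].min()) / (df[inputs].max() - df[inputs].min())
        variances = norm.var().fillna(0.0)  # constant columns give NaN -> treat as 0
        df = df.drop(columns=variances[variances < var_threshold].index)
        if df.shape == shape_before:        # iterate until nothing changes
            return df
```

Normalizing before the variance check keeps the 0.005 threshold meaningful across features with very different scales.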
Often, databases contain collinear independent parameters, which add complexity and increase computation time. Removing these features can simplify the problem without losing information, thus reducing computational effort. Since the data distribution is not normal, the Spearman correlation factor (R) is used to assess collinearity, and a correlation matrix heat-map is ideal for displaying R values. Features with R values of 0.9 or higher are considered collinear; when collinearity is detected, only one feature is kept and the others are discarded. In this study, no features were found to be collinear, as shown in Fig. 1.
Spearman correlation heat-map used to check for collinear features in our database.
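A minimal sketch of the Spearman-based collinearity filter described above, assuming a simple upper-triangle scan over the absolute correlation matrix (the function name and loop are illustrative):

```python
import numpy as np
import pandas as pd

def drop_collinear(df, threshold=0.9):
    # Absolute Spearman correlations between all feature pairs
    corr = df.corr(method="spearman").abs()
    # Keep only the upper triangle so each pair is inspected once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    # Drop any column correlated >= threshold with an earlier column
    to_drop = [col for col in upper.columns if (upper[col] >= threshold).any()]
    return df.drop(columns=to_drop)
```

Because Spearman works on ranks, a monotone transform of a feature (e.g. its cube) is detected as perfectly collinear with the original.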
The data used includes the viscosity of O/W emulsions as the output for the ML models, with pressure, temperature, salinity, surfactant concentration, type of surfactant, shear rate, and crude oil density as inputs. Table 1 shows the statistical characteristics of all data points, including minimum, maximum, mean, standard deviation, kurtosis, skewness, and sampling error.
Figure 2 shows box-plots for each parameter to provide a quick visual understanding of data distribution. To avoid scale differences affecting our judgment, data is standardized (Eq. 1), resulting in a data vector with a mean of 0 and a standard deviation of 1.
Box-plot generated for our database.
Outliers in this study are data points far from the majority; they are not necessarily errors but interesting anomalies. Including them can cause model instability and poor predictions due to skewed data distributions. To address this, outliers beyond 3 times the standard deviation from the mean were excluded. This method removed 116 outliers (5.78% of the data), reducing the database to 1892 rows and 8 columns, including the target parameter.
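The 3-standard-deviation rule can be sketched as follows: standardize each column and keep only rows whose every value lies within the cutoff (the helper name and toy data are assumptions):

```python
import numpy as np
import pandas as pd

def drop_outliers(df, k=3.0):
    # z-score each column, then keep rows within k standard deviations everywhere
    z = (df - df.mean()) / df.std()
    return df[(z.abs() <= k).all(axis=1)]
```

With many near-typical rows, a single extreme value is flagged and removed; note that with very few rows, an extreme point inflates the standard deviation enough to mask itself.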
Including features with vastly different scales can mislead modeling efforts. To avoid this, all parameters in the database were scaled. Data scaling speeds up optimization and eliminates data offset. “Standard scaling” (Eq. 1), was used, fitting the scaling on the training data and then applying it to the testing dataset. This ensures all features have equal influence on the model, regardless of their original scale.
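The fit-on-train, apply-to-test scaling described above can be sketched with scikit-learn's StandardScaler (the toy arrays are illustrative):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test = np.array([[2.5]])

scaler = StandardScaler().fit(X_train)   # learns mean and std from training data only
X_train_s = scaler.transform(X_train)    # scaled train set: mean 0, std 1
X_test_s = scaler.transform(X_test)      # same mean/std reused, avoiding test-set leakage
```

Fitting on the training subset only ensures the test set cannot influence the scaling parameters.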
In this section, the fundamental theories of the selected algorithms of the current study are explained.
DT is a non-parametric supervised learning algorithm without distributional assumptions that has shown effective capability in prediction, missing value handling, data manipulation, interpretation, and cause-and-effect analysis73. The underlying idea of the DT classifier or regressor is to follow a binary splitting process. DT commences with a splitting criterion, which is specified based on the input data, and parent nodes are sequentially split into two child nodes until they eventually reach a terminal node where the errors are satisfactory74. Owing to this sequence of partitioning rules, one can generate an informative, robust prediction scheme that can handle both numerical and categorical target types75. In spite of DT's flexibility, it is an unstable algorithm, since a small change in the input data can have large effects on model outputs. In addition, extra care is needed for hyper-parameter tuning of DT, as it is susceptible to overfitting. Figure 3 illustrates the flow chart of the DT algorithm.
Flowchart of DT algorithm.
The central premise of EL is to take advantage of the multiple individual model predictions and combine them to enhance the accuracy of the final prediction54. EL has a three-stage process: the generation phase (creating an ensemble of models), the pruning phase (selecting a smaller group of models to reduce computational load and, if possible, to enhance accuracy), and the integration phase (using constant or nonconstant weighting functions to acquire ensemble prediction based on single models’ prediction)76. Bagging, boosting, and stacking are the main ensemble methods that empower well-known algorithms like random forest, adaptive boosting, extreme gradient boosting, light gradient boosting machines, and categorical boosting77. In the bagging method, the training set is split and assigned to each base learner using random sampling. Then, majority voting is applied to base learners to achieve a strong regressor or classifier. Boosting starts with training a weak learner, evaluating the prediction of that weak learner, selecting the highly erroneous training samples, and continuing to the next weak learner by passing the adjusted training set which includes poorly predicted samples of the prior step. Stacking uses the same training set on different base learners (level_0 learners) to form metadata. Later, metadata are fed into another learner, which is called a meta-learner, to generate final predictions. Meta-learning is a ML method where a ML algorithm learns from predictions of other algorithms to compute superior outputs, compared to the individual models that make up the ensemble78. In this study, the stacking method is utilized to combine three algorithms, including support vector machine (SVM), DT, and k nearest neighbor (KNN).
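The stacking scheme described above (SVM, DT, and KNN as level_0 learners feeding a meta-learner) can be sketched with scikit-learn; the meta-learner choice and the synthetic data below are assumptions, not the authors' configuration:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the emulsion viscosity data
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

stack = StackingRegressor(
    estimators=[("svm", SVR()),                                # level_0 learners
                ("dt", DecisionTreeRegressor(random_state=0)),
                ("knn", KNeighborsRegressor(n_neighbors=4))],
    final_estimator=LinearRegression(),                        # meta-learner
)
stack.fit(X, y)
score = stack.score(X, y)   # R^2 on the training data
```

Internally, the level_0 predictions are produced by cross-validation before the meta-learner is fit, which limits information leakage from the base models.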
AB, the so-called AdaBoost algorithm, belongs to the ensemble methods category in ML, similar to the bagging technique but more complicated79. Unlike random forest, which trains a predetermined number of samples in parallel, AB sequentially generates estimators from modified versions of the training dataset, i.e., boosting. As previously underlined, the key theme of boosting is to form an ensemble using a combination of rules to improve the performance of each ensemble member80. As shown in Fig. 4, boosting starts by assigning equal weights to each data point. In each iteration, these weights are updated such that weakly predicted data points are given higher weight in the subsequent stage.
Flowchart of AB algorithm.
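A minimal AdaBoost regression sketch on synthetic data; the base learner is left at the library default and is not the authors' exact setting:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor

# Synthetic stand-in data; each boosting stage re-weights poorly predicted samples
X, y = make_regression(n_samples=300, n_features=5, noise=0.1, random_state=0)
ab = AdaBoostRegressor(n_estimators=99, random_state=0).fit(X, y)
r2 = ab.score(X, y)   # R^2 on the training data
```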
Evolved from the notion of simulating human brain functionality, the ANN algorithm can learn key information patterns in a bounded range of data types by constructing a system of weight vectors and consecutively optimizing those weights81. ANNs comprise three types of layers, namely input, output, and hidden layers, and can model highly non-linear systems with intricate relationships among variables82. Normally, each neuron in a layer is linked to the neurons in the adjacent layer through a weighted connection (\(w_{ij}\))83. Note that the optimal multi-layer feedforward network structure should be determined based on the system complexity. Once the input layer receives the data, its neurons pass them to the neurons of the first hidden layer through the aforementioned weighted connections, and so on down to the neurons of the output layer. Mathematically speaking, neurons 1 to \(i\) on a hidden layer transfer data \(x_i\) to neuron \(j\) in the next hidden layer through Eq. (2).
In this weighted sum, \(w_{ij}\) denotes the strength of the link between the \(i^{th}\) neuron on one layer and the \(j^{th}\) on the next, and \({\theta}_{j}\) is the bias parameter.
Then, the transfer of \({net}_{j}\) to the next layer is finalized by employing an activation function, such as Sigmoid, ReLU, Tanh, etc.84,85. Finally, an optimizer helps shape the ANN by updating the weights in each backpropagation pass86. The whole process of a typical ANN is schematically shown in Fig. 5.
Interaction between different components of an ANN.
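The weighted sum of Eq. (2) followed by an activation can be written as a short numpy sketch; the sigmoid choice and the toy shapes are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w, theta):
    # net_j = sum_i w_ij * x_i + theta_j  (Eq. 2), then the activation function
    net = w.T @ x + theta
    return sigmoid(net)

x = np.array([0.5, -1.0])      # outputs of neurons i on the previous layer
w = np.array([[1.0, 0.0],      # w[i, j]: link strength from neuron i to neuron j
              [0.0, 1.0]])
theta = np.array([0.0, 0.0])   # bias parameters theta_j
out = forward(x, w, theta)
```

During training, an optimizer adjusts `w` and `theta` in each backpropagation pass; only the forward step of Eq. (2) is shown here.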
CNN has revolutionized the field of deep learning with its outstanding performance on nontrivial problems, especially in computer vision and natural language processing87. Several CNN architecture variants, for example, the classic CNN models LeNet-5, AlexNet, MobileNet v3, and GhostNet, are presented in the literature. Since these variants share most basic concepts, LeNet88 is taken as an example to explain CNN functionality; note that LeNet is also the architecture of interest in this research. LeNet is conceptualized with three types of layers: convolutional, pooling, and fully-connected layers. Most of the computation takes place in the convolutional layers, which comprise sets of feature maps with neurons. The core idea of pooling is to down-sample and reduce the computational load for the subsequent layer. The fully-connected layer is similar to a simple ANN, where each neuron in a layer is connected to every neuron in the next layer. The detailed implementation of CNN is explained in the “Model development” section.
As a common practice in ML workflows, the data bank was preprocessed and split into train (80% of total data) and test (20% of total data) subsets. The preprocessing stage included removing all missing values and normalizing input features. Then the train set was fed to all algorithms along with performing optimization operations.
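The 80/20 split described above can be sketched with scikit-learn (the random arrays and `random_state` are placeholders):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.default_rng(0).random((100, 7))   # 7 input features (illustrative)
y = np.random.default_rng(1).random(100)        # stand-in viscosity target

# 80% train / 20% test, as in the workflow above
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
```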
The combined simulated annealing (CSA)89, as a powerful optimization method, was plugged into all selected algorithms to find optimum hyperparameters. CSA is an enhanced version of simulated annealing (SA), which was primarily developed by Metropolis, et al.90. Through CSA, a series of individual SAs is coupled through the costs of the SA processes and the acceptance probability function (\(A^{\eta}\)). The Metropolis rule90 is commonly utilized to quantify \(A^{\eta}\), which is defined in Eq. (3):

$$A^{\eta}\left(\psi,{x}_{i}\to {y}_{i}\right)=\frac{\exp\left(\dfrac{E\left({x}_{i}\right)-\underset{x\in \eta}{\max}E\left(x\right)}{{T}_{k}^{a}}\right)}{\psi}\quad(3)$$

where \(\psi\) is the coupling term, a function of all costs of the solutions in the state set \(\eta \in S\); \({x}_{i}\) and \({y}_{i}\) are individual solutions in \(\eta\); \(E\) is the cost function; and \({T}_{k}^{a}\) denotes the acceptance temperature.
The robustness and accuracy of the models in this study have been evaluated using different statistical quality measures, including mean absolute error (MAE) (Eq. (4)), mean squared error (MSE) (Eq. (5)), root mean squared error (RMSE) (Eq. (6)), coefficient of determination (R2) (Eq. (7)), and average absolute relative error (AARE%) (Eq. (8)).
where N is the total number of observations.
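These quality measures can be computed directly in numpy; the AARE form below assumes the usual percent definition \(100/N\sum \left|y_{exp}-y_{pred}\right|/\left|y_{exp}\right|\):

```python
import numpy as np

def metrics(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))                         # MAE, Eq. (4)
    mse = np.mean(err ** 2)                            # MSE, Eq. (5)
    rmse = np.sqrt(mse)                                # RMSE, Eq. (6)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                         # R^2, Eq. (7)
    aare = 100.0 * np.mean(np.abs(err / y_true))       # AARE%, Eq. (8); needs y_true != 0
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2, "AARE%": aare}
```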
The optimum structure of the presented models is found based on the performance of each model under various hyper-parameters. For DT, the overall complexity is generally controlled by its maximum depth; the optimum depth prevents the model from under- or over-fitting. The maximum depth was varied from 0.1 to 10 over 60 trials, and the results were examined based on the R2 of the train set. As shown in Fig. 6, the optimum maximum depth for the train set is 6.04, as the R2 values approach 0.99 for all depths equal to or greater than 6.04. Three algorithms were used in parallel, including SVM, DT, and KNN, in the structure of the EL model. The KNN algorithm employed the Euclidean distance metric, and the optimal number of neighbors was determined to be four. The SVM was modeled with a Radial Basis Function kernel based on Euclidean distance; its parameters were selected to balance model complexity and performance by setting the regularization parameter C to 120, the epsilon parameter to 0.001, and the gamma parameter to 0.02. For the adaptive boosting model, the number of base estimators was targeted as the determinative hyper-parameter for adjusting model accuracy. To acquire a low-biased model, the number of estimators was varied in the range of 1 to 160; R2 values were monitored and depicted in Fig. 7, which suggests an optimum value of 99. Optimization led to a structure of 3 convolutional layers, one pooling layer, and a fully connected neural network for the CNN model. Various numbers of hidden layers can be utilized for constructing the ANN; in this study, three hidden layers were considered, with 15 and 1 neurons allocated to the first and third hidden layers, respectively. The number of neurons in the second hidden layer was then determined using CSA by examining AARE values for neuron counts in the range of 1 to 15. It is evident from Fig. 8 that 5 neurons in the second hidden layer provide better results for the train set.
Coefficient of determination versus maximum depth of DT for the train set.
Coefficient of determination versus the number of estimators of AB for the train set.
Coefficient of determination versus the number of neurons in the second hidden layer of ANN for the train set.
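The maximum-depth sweep described above can be sketched as follows, training one tree per candidate depth and tracking R2 on the train set (synthetic data; integer depths are assumed here):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the 7-feature viscosity dataset
X, y = make_regression(n_samples=500, n_features=7, noise=0.1, random_state=0)

depths = list(range(1, 11))
scores = [DecisionTreeRegressor(max_depth=d, random_state=0).fit(X, y).score(X, y)
          for d in depths]
best_depth = depths[int(np.argmax(scores))]
```

Because deeper trees can only add splits, train-set R2 is nondecreasing in depth; the practical optimum is the smallest depth beyond which R2 plateaus, which is why the text selects the depth where R2 first approaches 0.99.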
The optimized models in the previous section were employed to predict the viscosity of O/W emulsions based on the pressure, temperature, salinity, surfactant concentration, type of surfactant, shear rate, and crude oil density in various thermodynamic conditions. Statistical parameters, such as AARE, R2, MAE, MSE, and RMSE, were utilized to compare the performance of all selected algorithms. These error values are graphically compared in Fig. 9. Taking into account these error values, ANN outperforms other algorithms in viscosity prediction of total data (train and test subsets together) with AARE, R2, MAE, MSE, and RMSE values of 8.982, 0.996, 0.004, 0.0002, and 0.0132, respectively.
Graphical analysis of statistical parameters for train, test, and total sets, (a) RMSE, (b) MSE, (c) MAE, (d) R2, and (e) AARE%.
Furthermore, predicted versus actual values of O/W emulsion viscosity are presented in Fig. 10. The ideal prediction line (the 45° cross line) is added to this plot for ease of evaluation. The train and test data points are mostly concentrated around the ideal prediction line with high determination coefficients, which indicates high conformity of the predicted values with the actual ones. Nevertheless, ANN (Fig. 10d) yields more accurate values, with more diagonally concentrated data points and R2 values of 0.996 and 0.995 for the train and test subsets, respectively.
Predicted viscosity versus actual viscosity of (a) AB, (b) CNN, (c) EL, (d) ANN, and (e) DT for train and test subsets.
To further investigate the applicability of the proposed ML-based models, relative errors are illustrated in Fig. 11 for each experimental train and test data point. According to this figure, lower viscosities are associated with higher errors, since there are far fewer experimental results with such strong viscosity reduction outcomes. Moreover, as expected, the relative errors for ANN are more densely concentrated around the zero-error line, which confirms its superiority.
Relative error plot of predicted values of (a) AB, (b) CNN, (c) EL, (d) ANN, (e) DT for train and test subsets.
So far, the aforementioned analysis has demonstrated the superior proficiency of the ANN model in O/W emulsion viscosity prediction. To provide an all-in-one comprehensive analysis, the cumulative data frequency (CDF) function was employed to study the distribution of absolute relative errors of the selected algorithms. According to Fig. 12, the CNN, ANN, EL, and DT models yield acceptable accuracy for most of the data points; however, ANN exhibits higher accuracy compared to the results obtained by the other models.
Absolute relative error distribution for total data.
Finally, sensitivity analysis was carried out to determine the positive or negative impacts of all input features on O/W emulsion viscosity estimation using ANN. To do so, Monte-Carlo simulations91 were performed in multiple trials with different probabilistic model inputs to identify uncertainties. Based on this uncertainty evaluation, the importance of the features is shown in Fig. 13. Accordingly, pressure, surfactant concentration, temperature, shear rate, type of surfactant (based on the hydrophilic-lipophilic balance (HLB)), salinity, and crude oil density have the highest to lowest impact on viscosity prediction. According to the obtained results, salinity and HLB have a positive effect on O/W emulsion viscosity, meaning that viscosity increases as each of these factors increases. Pressure, temperature, surfactant concentration, shear rate, and crude oil density have a negative impact and can reduce the O/W emulsion viscosity. It is worth noting that, in agreement with the findings of various studies in the literature92,93,94, temperature has a negative effect on emulsion viscosity.
Feature importance plot for ANN model.
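A one-at-a-time Monte-Carlo sensitivity sweep can be sketched as below; the stand-in linear model and its signed weights are purely illustrative assumptions, not the trained ANN:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical signed effect of each of the 7 inputs (illustrative only)
weights = np.array([-0.8, -0.5, 0.2, -0.4, 0.3, -0.3, -0.1])

def model(X):
    # Stand-in for the trained predictor
    return X @ weights

def mc_sensitivity(model, n_features, n_trials=2000):
    base = rng.random((n_trials, n_features))    # random probabilistic inputs
    effects = np.empty(n_features)
    for j in range(n_features):
        perturbed = base.copy()
        perturbed[:, j] += 0.1                   # small positive perturbation of feature j
        # Signed mean response: > 0 means the feature raises the output
        effects[j] = np.mean(model(perturbed) - model(base))
    return effects

effects = mc_sensitivity(model, 7)
```

The sign of each entry of `effects` indicates a positive or negative impact, and its magnitude ranks feature importance, mirroring the analysis shown in Fig. 13.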
To evaluate the effectiveness of the five machine learning models (AB, CNN, EL, ANN, and DT), this study used a Taylor diagram with standard deviation, correlation coefficient, and normalized root mean square error (NRMSE) metrics (Fig. 14). Taylor diagrams offer a comprehensive view of model performance, showing concordance and variability between measured and predicted values95. The ANN model had a standard deviation of 164.43, while the AB, CNN, EL, and DT models had standard deviations of 167.85, 164.42, 156.74, and 159.27, respectively, indicating high accuracy in reproducing the dispersion of the data. The correlation coefficient validated model accuracy, with ANN being the most efficient (0.998), followed by CNN (0.997), AB (0.996), EL (0.994), and DT (0.994). The ANN model also had the lowest relative prediction error (0.013), compared with 0.024 for DT. Overall, the ANN model was the most accurate based on these metrics.
Taylor diagram comparing the trained AB, CNN, EL, ANN, and DT models.
Using surface-active agents (surfactants) in heavy oil recovery reduces viscosity by forming O/W emulsions. In this regard, predictive models can be used to assess and improve surfactant-assisted recovery methods by simulating operational prospects and acquiring accurate estimations of the resulting emulsions. This study examined the effects of surfactant injection on O/W emulsion viscosity using a large data bank to develop ML predictors. Using important input parameters, such as pressure, temperature, salinity, surfactant concentration, type of surfactant, shear rate, and crude oil density, the predictors were found to be well-suited to such a task over a wide range of thermodynamic conditions. A comparison of all developed models revealed that ANN demonstrated the most reliable performance in terms of accuracy. Furthermore, sensitivity analyses revealed that pressure and surfactant concentration were the most influential input parameters affecting ANN's output. Modeling the effect of various surfactants in reducing crude oil viscosity using the proposed approaches and other data-driven models suffers from limitations which could affect the results. One limitation is the reliance on a dataset that, while expanded, still may not capture the full range of variability in certain parameters, particularly in extreme thermodynamic conditions. Furthermore, the prediction of viscosity is highly dependent on the accuracy and availability of input data, and certain factors, such as surfactant interactions and complex fluid behaviors, may not have been fully explored. We recommend that future studies incorporate larger and more diverse datasets to further enhance model accuracy and explore the interactions between parameters in greater detail.
All the data generated or analyzed during this study are included in this published article (in Figures and Tables). The datasets are also available from the corresponding author on reasonable request.
Martínez-Palou, R. et al. Transportation of heavy and extra-heavy crude oil by pipeline: A review. J. Petrol. Sci. Eng. 75, 274–282 (2011).
Shah, A. et al. A review of novel techniques for heavy oil and bitumen extraction and upgrading. Energy Environ. Sci. 3, 700–714 (2010).
Guo, K., Li, H. & Yu, Z. In-situ heavy and extra-heavy oil recovery: A review. Fuel. 185, 886–902 (2016).
Ma, J., Yao, M., Yang, Y. & Zhang, X. Comprehensive review on stability and demulsification of unconventional heavy oil-water emulsions. J. Mol. Liq. 350, 118510 (2022).
Nguyen, M. T. et al. Recent advances in asphaltene transformation in heavy oil hydroprocessing: Progress, challenges, and future perspectives. Fuel Process. Technol. 213, 106681 (2021).
Nasr, T. N. & Ayodele, O. R. in SPE international improved oil recovery conference in Asia Pacific. SPE-97488-MS (SPE).
Das, S. K. Vapex: An efficient process for the recovery of heavy oil and bitumen. SPE J. 3, 232–237 (1998).
Sharma, P., Kostarelos, K. & Salman, M. Optimization of closed-cycle oil recovery: A non-thermal process for bitumen and extra heavy oil recovery. RSC Adv. 11, 26554–26562 (2021).
Si, Y., Zhu, Y., Liu, T., Xu, X. & Yang, J. Synthesis of a novel borate ester anion-nonionic surfactant and its application in viscosity reduction and emulsification of heavy crude oil. Fuel. 333, 126453 (2023).
Chen, W. et al. A Comprehensive Review on Screening, Application, and perspectives of surfactant-based chemical-enhanced oil recovery methods in unconventional Oil reservoirs. Energy Fuels. 37, 4729–4750 (2023).
Walstra, P. Principles of emulsion formation. Chem. Eng. Sci. 48, 333–349 (1993).
Alharbi, G. G. & Abdulhamid, M. A. Optimization of water/oil emulsion preparation: Impact of time, speed, and homogenizer type on droplet size and dehydration efficiency. Chemosphere. 335, 139136 (2023).
Low, L. E., Siva, S. P., Ho, Y. K., Chan, E. S. & Tey, B. T. Recent advances of characterization techniques for the formation, physical properties and stability of Pickering emulsion. Adv. Colloid Interface Sci. 277, 102117 (2020).
Ashrafizadeh, S., Motaee, E. & Hoshyargar, V. Emulsification of heavy crude oil in water by natural surfactants. J. Petrol. Sci. Eng. 86, 137–143 (2012).
Mandal, A., Samanta, A., Bera, A. & Ojha, K. in 2010 International Conference on Chemistry and Chemical Engineering. 190–194 (IEEE).
Abed, S., Abdurahman, N., Yunus, R., Abdulbari, H. & Akbari, S. in IOP Conference Series: Materials Science and Engineering. 012060 (IOP Publishing).
Saniere, A., Hénaut, I. & Argillier, J. Pipeline transportation of heavy oils, a strategic, economic and technological challenge. Oil gas Sci. Technol. 59, 455–466 (2004).
Souas, F., Safri, A. & Benmounah, A. A review on the rheology of heavy crude oil for pipeline transportation. Petroleum Res. 6, 116–136 (2021).
He, L., Lin, F., Li, X., Sui, H. & Xu, Z. Interfacial sciences in unconventional petroleum production: From fundamentals to applications. Chem. Soc. Rev. 44, 5446–5494 (2015).
Rimmer, D., Gregoli, A., Hamshar, J. & Yildirim, E. (ACS, 1992).
Kokal, S. & Al-Juraid, J. in SPE Annual Technical Conference and Exhibition? SPE-48995-MS (SPE).
Kokal, S. & Al-Juraid, J. in SPE Annual Technical Conference and Exhibition? SPE-56641-MS (SPE).
Uetani, T. et al. Experimental investigation of crude-oil emulsion stability: Effect of oil and brine compositions, asphaltene, wax, toluene insolubles, temperature, shear stress, and water cut. SPE Prod. Oper. 35, 320–334 (2020).
Google Scholar
Kumar, S. & Mahto, V. Use of a novel surfactant to prepare oil-in-water emulsion of an Indian heavy crude oil for pipeline transportation. Energy Fuels 31, 12010–12020 (2017).
Davies, G., Nilsen, F. & Gramme, P. in SPE Annual Technical Conference and Exhibition SPE-36587-MS (SPE).
He, M., Pu, W. & Yang, X.-r. in International Field Exploration and Development Conference 3664–3674 (Springer).
Walsh, J. M. The Savvy Separator Series: Part 5. The effect of shear on produced water treatment. Oil Gas Facilities 5, 16–23 (2016).
Zhang, J., Xu, J., Gao, M. & Wu, Y.-x. Apparent viscosity of oil–water (coarse) emulsion and its rheological characterization during the phase inversion region. J. Dispers. Sci. Technol. 34, 1148–1160 (2013).
Shi, S., Wang, Y., Liu, Y. & Wang, L. A new method for calculating the viscosity of W/O and O/W emulsion. J. Petrol. Sci. Eng. 171, 928–937 (2018).
Richardson, E. Über die Viskosität von Emulsionen [On the viscosity of emulsions]. Kolloid-Zeitschrift 65, 32–37 (1933).
Taylor, G. I. The viscosity of a fluid containing small drops of another fluid. Proc. R. Soc. Lond. Ser. A, Containing Papers of a Mathematical and Physical Character 138, 41–48 (1932).
Caté, A., Perozzi, L., Gloaguen, E. & Blouin, M. Machine learning as a tool for geologists. Lead. Edge 36, 215–219 (2017).
Wang, Y. et al. Machine learning prediction of quartz forming-environments. J. Geophys. Res. Solid Earth 126, e2021JB021925 (2021).
Li, S., Liu, N., Li, F., Gao, J. & Ding, J. Automatic fault delineation in 3-D seismic images with deep learning: Data augmentation or ensemble learning? IEEE Trans. Geosci. Remote Sens. 60, 1–14 (2022).
Zhang, B. et al. Exploring factors affecting the performance of deep learning in seismic fault attribute computation. Interpretation 10, T619–T636 (2022).
Zhang, Y., Liu, Y., Zhang, H. & Xue, H. Seismic facies analysis based on deep learning. IEEE Geosci. Remote Sens. Lett. 17, 1119–1123 (2019).
Najafi-Silab, R., Soleymanzadeh, A., Kolah-Kaj, P. & Kord, S. Electrical rock typing using Gaussian mixture model to determine cementation factor. J. Pet. Explor. Prod. Technol. 13, 1329–1344 (2023).
Erofeev, A., Orlov, D., Ryzhov, A. & Koroteev, D. Prediction of porosity and permeability alteration based on machine learning algorithms. Transp. Porous Media 128, 677–700 (2019).
Menke, H. P., Maes, J. & Geiger, S. Upscaling the porosity–permeability relationship of a microporous carbonate for Darcy-scale flow with machine learning. Sci. Rep. 11, 2625 (2021).
Bérubé, C. L. et al. Predicting rock type and detecting hydrothermal alteration using machine learning and petrophysical properties of the Canadian Malartic ore and host rocks, Pontiac Subprovince, Québec, Canada. Ore Geol. Rev. 96, 130–145 (2018).
Anemangely, M., Ramezanzadeh, A., Amiri, H. & Hoseinpour, S. A. Machine learning technique for the prediction of shear wave velocity using petrophysical logs. J. Petrol. Sci. Eng. 174, 306–327 (2019).
Hajibolouri, E. et al. Permeability modelling in a highly heterogeneous tight carbonate reservoir using comparative evaluating learning-based and fitting-based approaches. Sci. Rep. 14, 10209 (2024).
Onwuchekwa, C. in SPE Nigeria Annual International Conference and Exhibition (OnePetro).
Daryasafar, A., Keykhosravi, A. & Shahbazi, K. Modeling CO2 wettability behavior at the interface of brine/CO2/mineral: Application to CO2 geo-sequestration. J. Clean. Prod. 239, 118101 (2019).
Shafiei, A., Tatar, A., Rayhani, M., Kairat, M. & Askarova, I. Artificial neural network, support vector machine, decision tree, random forest, and committee machine intelligent system help to improve performance prediction of low salinity water injection in carbonate oil reservoirs. J. Petrol. Sci. Eng. 219, 111046 (2022).
Da Wang, Y., Blunt, M. J., Armstrong, R. T. & Mostaghimi, P. Deep learning in pore scale imaging and modeling. Earth Sci. Rev. 215, 103555 (2021).
Ishola, O. & Vilcaez, J. Machine learning modeling of permeability in 3D heterogeneous porous media using a novel stochastic pore-scale simulation approach. Fuel 321, 124044 (2022).
Yamaguchi, A. J. et al. Multiscale numerical simulation of CO2 hydrate storage using machine learning. Fuel 334, 126678 (2023).
Daryasafar, A., Ahadi, A. & Kharrat, R. Modeling of steam distillation mechanism during steam injection process using artificial intelligence. Sci. World J. 2014 (2014).
Cheraghi, Y., Kord, S. & Mashayekhizadeh, V. Application of machine learning techniques for selecting the most suitable enhanced oil recovery method; challenges and opportunities. J. Petrol. Sci. Eng. 205, 108761 (2021).
Cheraghi, Y., Kord, S. & Mashayekhizadeh, V. A two-stage screening framework for enhanced oil recovery methods, using artificial neural networks. Neural Comput. Appl. 1–18 (2023).
Zhong, R., Salehi, C. & Johnson, R. Jr. Machine learning for drilling applications: A review. J. Nat. Gas Sci. Eng. 104807 (2022).
Zhou, F., Fan, H., Liu, Y., Zhang, H. & Ji, R. Hybrid model of machine learning method and empirical method for rate of penetration prediction based on data similarity. Appl. Sci. 13, 5870 (2023).
Sagi, O. & Rokach, L. Ensemble learning: A survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 8, e1249 (2018).
Osarogiagbon, A. U., Khan, F., Venkatesan, R. & Gillard, P. Review and analysis of supervised machine learning algorithms for hazardous events in drilling operations. Process Saf. Environ. Prot. 147, 367–384 (2021).
Bikmukhametov, T. & Jäschke, J. Oil production monitoring using gradient boosting machine learning algorithm. IFAC-PapersOnLine 52, 514–519 (2019).
Meribout, M. et al. Multiphase flow meters targeting oil & gas industries. Measurement 165, 108111 (2020).
Ponomareva, I. N., Galkin, V. I. & Martyushev, D. A. Operational method for determining bottom hole pressure in mechanized oil producing wells, based on the application of multivariate regression analysis. Petroleum Res. 6, 351–360 (2021).
Umar, A. A., Saaid, I. M., Sulaimon, A. A. & Pilus, R. M. Predicting the viscosity of petroleum emulsions using gene expression programming (GEP) and response surface methodology (RSM). J. Appl. Math. 2020 (2020).
Li, C., Huang, Q., Ma, S. & Ji, C. An experimental study on the viscosity of water-in-oil emulsions. J. Dispers. Sci. Technol. 37, 305–316 (2016).
Kumar, G., Mani, E. & Sangwai, J. S. Impact of surface-modified silica nanoparticle and surfactant on the stability and rheology of oil-in-water Pickering and surfactant-stabilized emulsions under high-pressure and high-temperature. J. Mol. Liq. 379, 121620 (2023).
Fogang, L. T., Sultan, A. S. & Kamal, M. S. Understanding viscosity reduction of a long-tail sulfobetaine viscoelastic surfactant by organic compounds. RSC Adv. 8, 4455–4463 (2018).
Kumar, S. & Mahto, V. Emulsification of Indian heavy crude oil using a novel surfactant for pipeline transportation. Pet. Sci. 14, 372–382 (2017).
VijayaKumar, S., Zakaria, J. & Ridzuan, N. The role of Gemini surfactant and SiO2/SnO/Ni2O3 nanoparticles as flow improver of Malaysian crude oil. J. King Saud Univ. Eng. Sci. 34, 384–390 (2022).
Chen, H. et al. Formulation and evaluation of a new multi-functional fracturing fluid system with oil viscosity reduction, rock wettability alteration and interfacial modification. J. Mol. Liq. 375, 121376 (2023).
Liu, M., Wu, Y., Zhang, L., Rong, F. & Yang, Z. Mechanism of viscosity reduction in viscous crude oil with polyoxyethylene surfactant compound system. Pet. Sci. Technol. 37, 409–416 (2019).
Sulistyarso, H. B., Pamungkas, J. & Hermawan, Y. D. The effects between interfacial tension and viscosity reduction in viscous crude oil through the addition of surfactant sodium lignosulfonate (SLS) for EOR purpose. Petroleum Sci. Eng. 6, 59–64 (2022).
Gu, X. et al. Investigation of cationic surfactants as clean flow improvers for crude oil and a mechanism study. J. Petrol. Sci. Eng. 164, 87–90 (2018).
Vegad, G. D. & Jana, A. K. Viscosity reduction of Indian heavy crude oil by emulsification to O/W emulsion using Polysorbate-81. J. Surfactants Deterg. 24, 301–311 (2021).
Kesarwani, H., Saxena, A., Saxena, N. & Sharma, S. Oil mobilization potential of a novel anionic Karanj oil surfactant: Interfacial, wetting characteristic, adsorption, and oil recovery studies. Energy Fuels 35, 10597–10610 (2021).
Negi, H., Faujdar, E., Saleheen, R. & Singh, R. K. Viscosity modification of heavy crude oil by using a chitosan-based cationic surfactant. Energy Fuels 34, 4474–4483 (2020).
Al-Roomi, Y., George, R., Elgibaly, A. & Elkamel, A. Use of a novel surfactant for improving the transportability/transportation of heavy/viscous crude oils. J. Petrol. Sci. Eng. 42, 235–243 (2004).
Song, Y. Y. & Ying, L. Decision tree methods: Applications for classification and prediction. Shanghai Arch. Psychiatry 27, 130 (2015).
Kotsiantis, S. B. Decision trees: A recent overview. Artif. Intell. Rev. 39, 261–283 (2013).
Sharma, H. & Kumar, S. A survey on decision tree algorithms of classification in data mining. Int. J. Sci. Res. (IJSR) 5, 2094–2097 (2016).
Mendes-Moreira, J., Soares, C., Jorge, A. M. & Sousa, J. F. D. Ensemble approaches for regression: A survey. ACM Comput. Surv. 45, 1–40 (2012).
Mienye, I. D. & Sun, Y. A survey of ensemble learning: Concepts, algorithms, applications, and prospects. IEEE Access 10, 99129–99149 (2022).
Hospedales, T., Antoniou, A., Micaelli, P. & Storkey, A. Meta-learning in neural networks: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 44, 5149–5169 (2021).
Freund, Y. & Schapire, R. E. in ICML 148–156 (Citeseer).
Meir, R. & Rätsch, G. in Advanced Lectures on Machine Learning: Machine Learning Summer School 2002, Canberra, Australia, February 11–22, 2002, Revised Lectures 118–183 (Springer, 2003).
Zou, J., Han, Y. & So, S. S. Overview of artificial neural networks. Artif. Neural Netw. Methods Appl. 14–22 (2009).
Yegnanarayana, B. Artificial Neural Networks (PHI Learning Pvt. Ltd., 2009).
Zhang, Z. Artificial neural network. in Multivariate Time Series Analysis in Climate and Environmental Research 1–35 (2018).
Rasamoelina, A. D., Adjailia, F. & Sinčák, P. in 2020 IEEE 18th World Symposium on Applied Machine Intelligence and Informatics (SAMI) 281–286 (IEEE).
Feng, J. & Lu, S. in Journal of Physics: Conference Series 022030 (IOP Publishing).
Vani, S. & Rao, T. M. in 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI) 331–336 (IEEE).
Gu, J. et al. Recent advances in convolutional neural networks. Pattern Recognit. 77, 354–377 (2018).
LeCun, Y. et al. Handwritten digit recognition with a back-propagation network. Adv. Neural Inf. Process. Syst. 2 (1989).
Xavier-de-Souza, S., Suykens, J. A., Vandewalle, J. & Bollé, D. Coupled simulated annealing. IEEE Trans. Syst. Man Cybern. B Cybern. 40, 320–335 (2009).
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087–1092 (1953).
Iman, R. L., Helton, J. C. & Campbell, J. E. An approach to sensitivity analysis of computer models: Part II. Ranking of input variables, response surface validation, distribution effect and technique synopsis. J. Qual. Technol. 13, 232–240 (1981).
Partal, P., Guerrero, A., Berjano, M. & Gallegos, C. Influence of concentration and temperature on the flow behavior of oil-in-water emulsions stabilized by sucrose palmitate. J. Am. Oil Chem. Soc. 74, 1203–1212 (1997).
Juntarasakul, O. & Maneeintr, K. in IOP Conference Series: Earth and Environmental Science 012024 (IOP Publishing).
Husin, H. & Hussain, H. H. in Science and Technology Behind Nanoemulsions (IntechOpen, 2018).
Taylor, K. E. Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res. Atmos. 106, 7183–7192 (2001).
Department of Petroleum Engineering, Ahvaz Faculty of Petroleum, Petroleum University of Technology, Ahvaz, Iran
Ehsan Hajibolouri, Reza Najafi-Silab, Amin Daryasafar, Abbas Ayatizadeh Tanha & Shahin Kord
CRediT authorship contribution statement: Ehsan Hajibolouri: Conceptualization, Methodology, Software, Investigation, Validation, Formal analysis, Writing-Review & Editing. Reza Najafi-Silab: Methodology, Software, Investigation, Validation, Formal analysis, Writing-Review & Editing. Amin Daryasafar: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing-Review & Editing. Abbas Ayatizadeh Tanha: Conceptualization, Methodology, Validation, Formal analysis, Writing-Review & Editing. Shahin Kord: Supervision, Methodology, Software, Formal analysis, Writing-Review & Editing.
Correspondence to Amin Daryasafar or Shahin Kord.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Hajibolouri, E., Najafi-Silab, R., Daryasafar, A. et al. Using data-driven models to simulate the performance of surfactants in reducing heavy oil viscosity. Sci Rep 14, 27670 (2024). https://doi.org/10.1038/s41598-024-79368-1
Received: 25 April 2024
Accepted: 08 November 2024
Published: 12 November 2024