A survey of machine learning techniques in structural and multidisciplinary optimization

  • Review Paper
  • Published in: Structural and Multidisciplinary Optimization

Abstract

Machine learning (ML) techniques have been used in an extensive range of applications in the field of structural and multidisciplinary optimization over the last few years. This paper presents a survey of this wide but disjointed literature on ML techniques in the structural and multidisciplinary optimization field. First, we discuss the challenges associated with conventional optimization and how ML can address them. Then, we review the literature in the context of how ML can accelerate design synthesis and optimization. Some real-life engineering applications in structural design, material design, fluid mechanics, aerodynamics, heat transfer, and multidisciplinary design are summarized, and a brief list of widely used open-source codes as well as commercial packages is provided. Finally, the survey culminates with some concluding remarks and future research suggestions. For the sake of completeness, categories of ML problems, algorithms, and paradigms are presented in the Appendix.


References

  • Abueidda DW, Koric S, Sobh NA (2020) Topology optimization of 2D structures with nonlinearities using deep learning. Comput Struct 237:106283


  • Abueidda DW, Lu Q, Koric S (2021) Meshless physics-informed deep learning method for three-dimensional solid mechanics. Int J Numer Meth Eng 122(23):7182–7201


  • Acar E, Rais-Rohani M (2009) Ensemble of metamodels with optimized weight factors. Struct Multidisc Optim 37(3):279–294


  • Acar E, Solanki K (2009) System reliability based vehicle design for crashworthiness and effects of various uncertainty reduction measures. Struct Multidisc Optim 39(3):311–325


  • Adeli H, Park HS (1995) A neural dynamics model for structural optimization—theory. Comput Struct 57(3):383–390


  • Alom MZ, Taha TM, Yakopcic C, Westberg S, Sidike P, Nasrin MS, Asari VK (2019) A state-of-the-art survey on deep learning theory and architectures. Electronics 8(3):292


  • Amsallem D, Farhat C (2008) Interpolation method for adapting reduced-order models and application to aeroelasticity. AIAA J 46(7):1803–1813


  • An D, Liu J, Zhang M, Chen X, Chen M, Sun H (2020) Uncertainty modeling and runtime verification for autonomous vehicles driving control: A machine learning-based approach. J Syst Softw 167:110617


  • Asperti A, Evangelista D, Piccolomini EL (2021) A survey on variational autoencoders from a green AI perspective. SN Computer Science 2(4):1–23


  • Ates GC, Gorguluarslan RM (2021) Two-stage convolutional encoder-decoder network to improve the performance and reliability of deep learning models for topology optimization. Struct Multidisc Optim 63(4):1927–1950


  • Banga S, Gehani H, Bhilare S, Patel S, Kara L (2018) 3D topology optimization using convolutional neural networks. arXiv preprint arXiv:1808.07440

  • Baraldi P, Mangili F, Zio E (2015) A prognostics approach to nuclear component degradation modeling based on Gaussian process regression. Prog Nucl Energy 78:141–154


  • Barber D, Wang Y (2014). Gaussian processes for Bayesian estimation in ordinary differential equations. In International conference on machine learning (pp. 1485–1493). PMLR.

  • Bataleblu AA (2019) Computational intelligence and its applications in uncertainty-based design optimization. In Bridge Optimization-Inspection and Condition Monitoring. IntechOpen.

  • Behzadi MM, Ilieş HT (2021) Real-time topology optimization in 3D via deep transfer learning. Comput Aided Des 135:103014


  • Bendsøe MP (1989) Optimal shape design as a material distribution problem. Struct Optim 1(4):193–202


  • Bi S, Zhang J, Zhang G (2020) Scalable deep-learning-accelerated topology optimization for additively manufactured materials. arXiv preprint arXiv:2011.14177.

  • Bielecki D, Patel D, Rai R, Dargush GF (2021) Multi-stage deep neural network accelerated topology optimization. Struct Multidisc Optim 64(6):3473–3487


  • Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, UK


  • Bostanabad R, Chan YC, Wang L, Zhu P, Chen W (2019) Globally approximate gaussian processes for big data with application to data-driven metamaterials design. J Mech Des 141(11):111402


  • Bühlmann P (2012). Bagging, boosting and ensemble methods. In Handbook of computational statistics (pp. 985–1022). Springer, Berlin, Heidelberg.

  • Burnap A, Pan Y, Liu Y, Ren Y, Lee H, Gonzalez R, Papalambros PY (2016b) Improving design preference prediction accuracy using feature learning. J Mech Des 138(7):071404


  • Burnap A, Liu Y, Pan Y, Lee H, Gonzalez R, Papalambros PY (2016a) Estimating and exploring the product form design space using deep generative models. In: International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 50107, p. V02AT03A013). American Society of Mechanical Engineers.

  • Burnap A, Hauser JR, Timoshenko A (2021) Design and evaluation of product aesthetics: A human-machine hybrid approach. Available at SSRN 3421771.

  • Caldeira J, Nord B (2020) Deeply uncertain: Comparing methods of uncertainty quantification in deep learning algorithms. Mach Learn: Sci Technol 2(1):015002


  • Cang R, Yao H, Ren Y (2019) One-shot generation of near-optimal topology through theory-driven machine learning. Comput-Aided Des 109:12–21


  • Capuano G, Rimoli JJ (2019) Smart finite elements: a novel machine learning application. Comput Methods Appl Mech Eng 345:363–381


  • Cerbone G (1992) Machine learning techniques in optimal design. In: Artificial Intelligence in Design’92 (pp. 699–717). Springer, Dordrecht

  • Cha YJ, Choi W, Büyüköztürk O (2017) Deep learning-based crack damage detection using convolutional neural networks. Comput-Aided Civ Inf Eng 32(5):361–378


  • Chakraborty S (2021) Transfer learning based multi-fidelity physics informed deep neural network. J Comput Phys 426:109942


  • Chan S, Elsheikh AH (2018) A machine learning approach for efficient uncertainty quantification using multiscale methods. J Comput Phys 354:493–511


  • Chandrasekhar A, Suresh K (2021) TOuNN: Topology optimization using neural networks. Struct Multidisc Optim 63(3):1135–1149


  • Chen W, Ahmed F (2021a) MO-PaDGAN: Reparameterizing Engineering Designs for augmented multi-objective optimization. Appl Soft Comput 113:107909


  • Chen W, Ahmed F (2021b) Padgan: Learning to generate high-quality novel designs. J Mech Des 143(3):031703


  • Chen CT, Gu GX (2020) Generative deep neural networks for inverse materials design using backpropagation and active learning. Adv Sci 7(5):1902607


  • Chen X, Chen X, Zhou W, Zhang J, Yao W (2020) The heat source layout optimization using deep learning surrogate modeling. Struct Multidisc Optim 62(6):3127–3148


  • Chen W, Ahmed F (2020) PaDGAN: A generative adversarial network for performance augmented diverse designs. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 84003, p. V11AT11A010). American Society of Mechanical Engineers.

  • Chen T, Guestrin C (2016) Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining (pp. 785–794)

  • Chen W, Chiu K, Fuge M (2019) Aerodynamic design optimization and shape exploration using generative adversarial networks. In AIAA Scitech 2019 Forum (p. 2351)

  • Chhabra JP, Warn GP (2019) A method for model selection using reinforcement learning when viewing design as a sequential decision process. Struct Multidisc Optim 59(5):1521–1542


  • Chi H, Zhang Y, Tang TLE, Mirabella L, Dalloro L, Song L, Paulino GH (2021) Universal machine learning for topology optimization. Comput Methods Appl Mech Eng 375:112739


  • Cortes C, Vapnik V (1995) Support-Vector Networks. Mach Learn 20(3):273–297


  • Cox DR (1958) The regression analysis of binary sequences. J Roy Stat Soc: Ser B (methodol) 20(2):215–232


  • Cutajar K, Osborne M, Cunningham J, Filippone M (2016) Preconditioning kernel matrices. In International conference on machine learning (pp. 2529–2538). PMLR.

  • Dai Y, Li Y, Liu LJ (2019) New product design with automatic scheme generation. Sens Imag 20(1):1–16


  • Deng C, Wang Y, Qin C, Lu W (2020) Self-directed online machine learning for topology optimization. arXiv preprint arXiv:2002.01927.

  • Deng H, To AC (2020) Topology optimization based on deep representation learning (DRL) for compliance and stress-constrained design. Comput Mech 66:449–469


  • Dering M, Cunningham J, Desai R, Yukish MA, Simpson TW, Tucker CS (2018) A physics-based virtual environment for enhancing the quality of deep generative designs. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 51753, p. V02AT03A015). American Society of Mechanical Engineers.

  • Diego-Mas JA, Alcaide-Marzal J (2016) Single users’ affective responses models for product form design. Int J Ind Ergon 53:102–114


  • Doi S, Sasaki H, Igarashi H (2019) Multi-objective topology optimization of rotating machines using deep learning. IEEE Trans Magn 55(6):1–5


  • Dong K, Eriksson D, Nickisch H, Bindel D, Wilson AG (2017) Scalable log determinants for Gaussian process kernel learning. arXiv preprint arXiv:1711.03481

  • Du X, Xu H, Zhu F (2021) A data mining method for structure design with uncertainty in design variables. Comput Struct 244:106457


  • Džeroski S, Ženko B (2004) Is combining classifiers with stacking better than selecting the best one? Mach Learn 54(3):255–273


  • Elingaard MO, Aage N, Bærentzen JA, Sigmund O (2022) De-homogenization using convolutional neural networks. Comput Methods Appl Mech Eng 388:114197


  • Emmert-Streib F, Yang Z, Feng H, Tripathi S, Dehmer M (2020) An introductory review of deep learning for prediction models with big data. Front Artif Intel 3:4


  • Falck R, Gray JS, Ponnapalli K, Wright T (2021) dymos: A Python package for optimal control of multidisciplinary systems. J Open Source Soft 6(59):2809


  • Fernández-Godino MG, Park C, Kim NH, Haftka RT (2016) Review of multi-fidelity models. arXiv preprint arXiv:1609.07196.

  • Ferreiro-Cabello J, Fraile-Garcia E, de Pison Ascacibar EM, Martinez-de-Pison FJ (2018) Metamodel-based design optimization of structural one-way slabs based on deep learning neural networks to reduce environmental impact. Eng Struct 155:91–101


  • Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7(2):179–188


  • Fix E, Hodges JL (1989) Discriminatory analysis. Nonparametric discrimination: Consistency properties. Inter Stat Rev/revue Internationale De Statistique 57(3):238–247


  • Forrester A, Sobester A, Keane A (2008) Engineering design via surrogate modelling: a practical guide. John Wiley & Sons, USA


  • Freiesleben J, Keim J, Grutsch M (2020) Machine learning and design of experiments: Alternative approaches or complementary methodologies for quality improvement? Qual Reliab Eng Int 36(6):1837–1848


  • Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232


  • Fukushima K (1988) Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Netw 1(2):119–130


  • García-Segura T, Yepes V, Frangopol DM (2017) Multi-objective design of post-tensioned concrete road bridges using artificial neural networks. Struct Multidisc Optim 56(1):139–150


  • Gardner JR, Pleiss G, Bindel D, Weinberger KQ, Wilson AG (2018a). Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. arXiv preprint arXiv:1809.11165.

  • Gardner J, Pleiss G, Wu R, Weinberger K, Wilson A (2018b) Product kernel interpolation for scalable Gaussian processes. In International Conference on Artificial Intelligence and Statistics (pp. 1407–1416). PMLR.

  • Garriga AG, Mainini L, Ponnusamy SS (2019) A machine learning enabled multi-fidelity platform for the integrated design of aircraft systems. J Mech Des 141(12):121405


  • Gladstone RJ, Nabian MA, Keshavarzzadeh V, Meidani H (2021) Robust topology optimization using variational autoencoders. arXiv preprint arXiv:2107.10661.

  • Goel T, Haftka RT, Shyy W, Queipo NV (2007) Ensemble of Surrogates. Struct Multidisc Optim 33(3):199–216


  • Golub GH, Reinsch C (1971) Singular value decomposition and least squares solutions. In Linear algebra (pp. 134–151). Springer, Berlin, Heidelberg

  • Gomes WJDS (2020) Shallow and deep artificial neural networks for structural reliability analysis. ASME J Risk Uncertainty Part B 6(4):041006


  • Gomes GSDS, Ludermir TB (2013) Optimization of the weights and asymmetric activation function family of neural network for time series forecasting. Expert Syst Appl 40(16):6438–6446


  • Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Bengio Y (2014) Generative adversarial nets. Adv Neur Info Process Syst 27.

  • Gray JS, Hwang JT, Martins JR, Moore KT, Naylor BA (2019) OpenMDAO: An open-source framework for multidisciplinary design, analysis, and optimization. Struct Multidisc Optim 59(4):1075–1104


  • Harman HH (1976) Modern factor analysis. University of Chicago press, USA


  • Harzing AW (2007). Publish or Perish, available from https://harzing.com/resources/publish-or-perish

  • Hasegawa K, Fukami K, Murata T, Fukagata K (2020) CNN-LSTM based reduced order modeling of two-dimensional unsteady flows around a circular cylinder at different Reynolds numbers. Fluid Dyn Res 52(6):065501


  • He L, Qian W, Zhao T, Wang Q (2020a) Multi-fidelity aerodynamic data fusion with a deep neural network modeling method. Entropy 22(9):1022


  • He P, Mader CA, Martins JR, Maki KJ (2020b) Dafoam: an open-source adjoint framework for multidisciplinary design optimization with openfoam. AIAA J 58(3):1304–1319


  • Hou TY, Lam KC, Zhang P, Zhang S (2019) Solving Bayesian inverse problems from the perspective of deep generative networks. Comput Mech 64(2):395–408


  • Jabarullah Khan NK, Elsheikh AH (2019) A machine learning based hybrid multi-fidelity multi-level Monte Carlo method for uncertainty quantification. Front Environ Sci 7:105


  • Janda T, Zemanová A, Hála P, Konrád P, Schmidt J (2020) Reduced order model of glass plate loaded by low-velocity impact. Int J Comput Methods Exp Meas 8(1):36–46


  • Jang S, Kang N (2020) Generative design by reinforcement learning: Maximizing diversity of topology optimized designs. arXiv preprint arXiv:2008.07119.

  • Jiang J, Fan JA (2019) Global optimization of dielectric metasurfaces using a physics-driven neural network. Nano Lett 19(8):5366–5372


  • Jiang J, Fan JA (2020) Simulator-based training of generative neural networks for the inverse design of metasurfaces. Nanophotonics 9(5):1059–1069


  • Jiang X, Wang H, Li Y, Mo K (2020) Machine learning based parameter tuning strategy for MMC based topology optimization. Adv Eng Softw 149:102841


  • Jin SS (2020) Compositional kernel learning using tree-based genetic programming for Gaussian process regression. Struct Multidisc Optim 62:1313–1351


  • Jung J, Yoon K, Lee PS (2020) Deep learned finite elements. Comput Methods Appl Mech Eng 372:113401


  • Kallioras NA, Lagaros ND (2020) DzAIℕ: Deep learning based generative design. Procedia Manufacturing 44:591–598


  • Kallioras NA, Kazakis G, Lagaros ND (2020) Accelerated topology optimization by means of deep learning. Struct Multidisc Optim 62(3):1185–1212


  • Kambampati S, Du Z, Chung H, Kim HA, Jauregui C, Townsend S, Hedges L (2018). OpenLSTO: Open-source software for level set topology optimization. In: 2018 Multidisciplinary Analysis and Optimization Conference (p. 3882).

  • Kaplan EM, Acar E, Bülent Özer M (2021) Development of a method for maximum structural response prediction of a store externally carried by a jet fighter. Proc Inst Mech Eng Part G: J Aerosp Eng 09544100211022244.

  • Karlik B, Olgac AV (2011) Performance analysis of various activation functions in generalized MLP architectures of neural networks. Int J Artif Intel Exp Sys 1(4):111–122


  • Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L (2021) Physics-informed machine learning. Nature Reviews. Physics 3(6):422–440


  • Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Liu TY (2017) Lightgbm: A highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 30:3146–3154


  • Keshavarzzadeh V, Alirezaei M, Tasdizen T, Kirby RM (2021) Image-based multiresolution topology optimization using deep disjunctive normal shape model. Comput Aided Des 130:102947


  • Khan S, Gunpinar E, Moriguchi M, Suzuki H (2019a) Evolving a psycho-physical distance metric for generative design exploration of diverse shapes. J Mech Des 141(11):111101


  • Khan S, Gunpinar E, Sener B (2019b) GenYacht: An interactive generative design system for computer-aided yacht hull design. Ocean Eng 191:106462


  • Khatouri H, Benamara T, Breitkopf P, Demange J, Feliot P (2020) Constrained multi-fidelity surrogate framework using Bayesian optimization with non-intrusive reduced-order basis. Adv Model Simul Eng Sci 7(1):1–20


  • Kim SH, Boukouvala F (2020) Machine learning-based surrogate modeling for data-driven optimization: A comparison of subset selection for regression techniques. Optim Lett 14(4):989–1010


  • Kingma DP, Welling M (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

  • Kochkov D, Smith JA, Alieva A, Wang Q, Brenner MP, Hoyer S (2021) Machine learning–accelerated computational fluid dynamics. Proc Nat Acad Sci 118(21):e2101784118


  • Kollmann HT, Abueidda DW, Koric S, Guleryuz E, Sobh NA (2020) Deep learning for topology optimization of 2D metamaterials. Mater Des 196:109098


  • Kou J, Zhang W (2019) A hybrid reduced-order framework for complex aeroelastic simulations. Aerosp Sci Technol 84:880–894


  • Kumar M, Yadav N (2011) Multilayer perceptrons and radial basis function neural network methods for the solution of differential equations: A survey. Comput Math Appl 62(10):3796–3811


  • Lafage R, Defoort S, Lefebvre T (2019) WhatsOpt: a web application for multidisciplinary design analysis and optimization. In AIAA Aviation 2019 Forum (p. 2990).

  • Lee J, Jeong H, Kang S (2008) Derivative and GA-based methods in metamodeling of back-propagation neural networks for constrained approximate optimization. Struct Multidisc Optim 35(1):29–40


  • Lee XY, Balu A, Stoecklein D, Ganapathysubramanian B, Sarkar S (2019) A case study of deep reinforcement learning for engineering design: Application to microfluidic devices for flow sculpting. J Mech Des 141(11):111401


  • Lee S, Kim H, Lieu QX, Lee J (2020) CNN-based image recognition for topology optimization. Knowl-Based Syst 198:105887


  • Lee YO, Jo J, Hwang J (2017). Application of deep neural network and generative adversarial network to industrial maintenance: A case study of induction motor fault detection. In: 2017 IEEE international conference on big data (big data) (pp. 3248–3253). IEEE

  • Lee M, Park Y, Jo H, Kim K, Lee S, Lee I (2022) Deep generative tread pattern design framework for efficient conceptual design. J Mech Des 1–28.

  • Lei X, Liu C, Du Z, Zhang W, Guo X (2019) Machine learning-driven real-time topology optimization under moving morphable component-based framework. J Appl Mech 86(1):011004


  • Li Y, Mei F (2021) Deep learning-based method coupled with small sample learning for solving partial differential equations. Mult Tools Appl 80(11):17391–17413


  • Li M, Wang Z (2021) An LSTM-based ensemble learning approach for time-dependent reliability analysis. J Mech Des 143(3):031702


  • Li B, Huang C, Li X, Zheng S, Hong J (2019) Non-iterative structural topology optimization using deep learning. Comput Aided Des 115:172–180


  • Li S, Xing W, Kirby R, Zhe S (2020) Multi-fidelity Bayesian optimization via deep neural networks. Adv Neural Info Proc Syst 33.

  • Liao H, Zhang W, Dong X, Poczos B, Shimada K, Burak Kara L (2020) A deep reinforcement learning approach for global routing. J Mech Des 142(6):061701


  • Lin Q, Hong J, Liu Z, Li B, Wang J (2018) Investigation into the topology optimization for conductive heat transfer based on deep learning approach. Int Commun Heat Mass Transfer 97:103–109


  • Lin Q, Liu Z, Hong J (2019) Method for directly and instantaneously predicting conductive heat transfer topologies by using supervised deep learning. Int Commun Heat Mass Transfer 109:104368


  • Liu K, Tovar A, Nutwell E, Detwiler D (2015) Thin-walled compliant mechanism component design assisted by machine learning and multiple surrogates.

  • Liu D, Wang Y (2019) Multi-fidelity physics-constrained neural network and its application in materials modeling. J Mech Des 141(12):121403


  • Lye KO, Mishra S, Ray D, Chandrashekar P (2021) Iterative surrogate model optimization (ISMO): An active learning algorithm for PDE constrained optimization with deep neural networks. Comput Methods Appl Mech Eng 374:113575


  • Lynch ME, Sarkar S, Maute K (2019) Machine learning to aid tuning of numerical parameters in topology optimization. J Mech Des 141(11):114502


  • Ma SB, Kim S, Kim JH (2020) Optimization design of a two-vane pump for wastewater treatment using machine-learning-based surrogate modeling. Processes 8(9):1170


  • McCulloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5(4):115–133


  • McFall KS (2013) Automated design parameter selection for neural networks solving coupled partial differential equations with discontinuities. J Franklin Inst 350(2):300–317


  • Meng X, Karniadakis GE (2020) A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems. J Comput Phys 401:109020


  • Minisci E, Vasile M (2013) Robust design of a reentry unmanned space vehicle by multifidelity evolution control. AIAA J 51(6):1284–1295


  • Mondal S (2020) Probabilistic machine learning for advanced engineering design optimization and diagnostics, PhD dissertation, Penn State University.

  • Montgomery DC, Peck EA, Vining GG (2021) Introduction to linear regression analysis. John Wiley & Sons, USA


  • Motamed M (2020) A multi-fidelity neural network surrogate sampling method for uncertainty quantification. Int J Uncertain Quantif 10(4).

  • Mozaffar M, Bostanabad R, Chen W, Ehmann K, Cao J, Bessa MA (2019) Deep learning predicts path-dependent plasticity. Proc Natl Acad Sci 116(52):26414–26420


  • Müller J, Park J, Sahu R, Varadharajan C, Arora B, Faybishenko B, Agarwal D (2021) Surrogate optimization of deep neural networks for groundwater predictions. J Global Optim 81(1):203–231


  • Nagarajan HP, Mokhtarian H, Jafarian H, Dimassi S, Bakrani-Balani S, Hamedi A, Haapala KR (2019) Knowledge-based design of artificial neural network topology for additive manufacturing process modeling: A new approach and case study for fused deposition modeling. J Mech Des 141(2):021705


  • Nakamura K, Suzuki Y (2020) Deep learning-based topological optimization for representing a user-specified design area. arXiv preprint arXiv:2004.05461.

  • Napier N, Sriraman SA, Tran HT, James KA (2020) An artificial neural network approach for generating high-resolution designs from low-resolution input in topology optimization. J Mech Des 142(1):011402


  • Naranjo-Pérez J, Infantes M, Jiménez-Alonso JF, Sáez A (2020) A collaborative machine learning-optimization algorithm to improve the finite element model updating of civil engineering structures. Eng Struct 225:111327


  • Nie Z, Lin T, Jiang H, Kara LB (2021) Topologygan: Topology optimization using generative adversarial networks based on physical fields over the initial domain. J Mech Des 143(3):031715


  • Ning C, You F (2018) Data-driven stochastic robust optimization: General computational framework and algorithm leveraging machine learning for optimization under uncertainty in the big data era. Comput Chem Eng 111:115–133


  • Nobari AH, Rashad MF, Ahmed F (2021) Creativegan: Editing generative adversarial networks for creative design synthesis. arXiv preprint arXiv:2103.06242.

  • Odonkor P, Lewis K (2019) Data-driven design of control strategies for distributed energy systems. J Mech Des 141(11):111404


  • Oh S, Jung Y, Lee I, Kang N (2018) Design automation by integrating generative adversarial networks and topology optimization. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 51753, p. V02AT03A008). American Society of Mechanical Engineers.

  • Oh S, Jung Y, Kim S, Lee I, Kang N (2019) Deep generative design: Integration of topology optimization and generative models. J Mech Des 141(11).

  • Owoyele O, Pal P, Vidal Torreira A, Probst D, Shaxted M, Wilde M, Senecal PK (2021) An automated machine learning-genetic algorithm (AutoML-GA) approach for efficient simulation-driven engine design optimization. arXiv e-prints, arXiv-2101

  • Panchal JH, Fuge M, Liu Y, Missoum S, Tucker C (2019) Machine learning for engineering design. J Mech Des 141(11)

  • Pánek D, Orosz T, Karban P (2020) Artap: Robust design optimization framework for engineering applications. arXiv preprint arXiv:1912.11550

  • Parsonage B, Maddock CA (2020) Multi-stage multi-fidelity information correction for artificial neural network based meta-modelling. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 950–957). IEEE

  • Patel J, Choi SK (2012) Classification approach for reliability-based topology optimization using probabilistic neural networks. Struct Multidisc Optim 45(4):529–543


  • Pawar S, Rahman SM, Vaddireddy H, San O, Rasheed A, Vedula P (2019) A deep learning enabler for nonintrusive reduced order modeling of fluid flows. Phys Fluids 31(8):085101


  • Peherstorfer B, Willcox K, Gunzburger M (2018) Survey of multifidelity methods in uncertainty propagation, inference, and optimization. SIAM Rev 60(3):550–591


  • Pereira DR, Piteri MA, Souza AN, Papa JP, Adeli H (2020) FEMa: A finite element machine for fast learning. Neural Comput Appl 32(10):6393–6404


  • Perez RE, Jansen PW, Martins JR (2012) pyOpt: a Python-based object-oriented framework for nonlinear constrained optimization. Struct Multidisc Optim 45(1):101–118


  • Perron C, Rajaram D, Mavris DN (2021) Multi-fidelity non-intrusive reduced-order modelling based on manifold alignment. Proce Royal Soc A 477(2253):20210495


  • Pillai AC, Thies PR, Johanning L (2019) Mooring system design optimization using a surrogate assisted multi-objective genetic algorithm. Eng Optim 51(8):1370–1392


  • Popov AA, Mou C, Sandu A, Iliescu T (2021) A multifidelity ensemble Kalman filter with reduced order control variates. SIAM J Sci Comput 43(2):A1134–A1162


  • Puentes L, Raina A, Cagan J, McComb C (2020) Modeling a strategic human engineering design process: Human-inspired heuristic guidance through learned visual design agents. In Proceedings of the Design Society: DESIGN Conference (Vol. 1, pp. 355–364). Cambridge University Press.

  • Qian C, Ye W (2021) Accelerating gradient-based topology optimization design with dual-model artificial neural networks. Struct Multidisc Optim 63(4):1687–1707


  • Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.

  • Raina A, McComb C, Cagan J (2019). Learning to design from humans: Imitating human designers through deep learning. J Mech Des 141(11)

  • Raissi M, Karniadakis GE (2018) Hidden physics models: Machine learning of nonlinear partial differential equations. J Comput Phys 357:125–141


  • Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707


  • Rasmussen CE (2003). Gaussian processes in machine learning. In Summer school on machine learning (pp. 63–71). Springer, Berlin, Heidelberg

  • Rätsch G, Onoda T, Müller KR (2001) Soft Margins for AdaBoost. Mach Learn 42(3):287–320


  • Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536


  • Sasaki H, Igarashi H (2019a) Topology optimization accelerated by deep learning. IEEE Trans Magn 55(6):1–5


  • Sasaki H, Igarashi H (2019b) Topology optimization of IPM motor with aid of deep learning. Int J Appl Electromagnet Mech 59(1):87–96


  • Shi M, Lv L, Sun W, Song X (2020a) A multi-fidelity surrogate model based on support vector regression. Struct Multidisc Optim 61(6):2363–2375


  • Shi X, Qiu T, Wang J, Zhao X, Qu S (2020b) Metasurface inverse design using machine learning approaches. J Phys D Appl Phys 53(27):275105


  • Shu D, Cunningham J, Stump G, Miller SW, Yukish MA, Simpson TW, Tucker CS (2020) 3d design using generative adversarial networks and physics-based validation. J Mech Des 142(7):071701


  • Singh AP, Medida S, Duraisamy K (2017) Machine-learning-augmented predictive modeling of turbulent separated flows over airfoils. AIAA J 55(7):2215–2227


  • Singla M, Ghosh D, Shukla KK (2020) A survey of robust optimization based machine learning with special reference to support vector machines. Int J Mach Learn Cybern 11(7):1359–1385


  • Solanki KN, Acar E, Rais-Rohani M, Horstemeyer MF, Steele WG (2009) Product design optimisation with microstructure-property modelling and associated uncertainties. Int J Des Eng 2(1):47–79


  • Song H, Choi KK, Lee I, Zhao L, Lamb D (2013) Adaptive virtual support vector machine for reliability analysis of high-dimensional problems. Struct Multidisc Optim 47(4):479–491


  • Sosnovik I, Oseledets I (2019) Neural networks for topology optimization. Russ J Numer Anal Math Model 34(4):215–223


  • Strömberg N (2020) Efficient detailed design optimization of topology optimization concepts by using support vector machines and metamodels. Eng Optim 52(7):1136–1148


  • Su G, Peng L, Hu L (2017) A Gaussian process-based dynamic surrogate model for complex engineering structural reliability analysis. Struct Saf 68:97–109


  • Sun H, Ma L (2020) Generative design by using exploration approaches of reinforcement learning in density-based structural topology optimization. Designs 4(2):10


  • Sun G, Wang S (2019) A review of the artificial neural network surrogate modeling in aerodynamic design. Proc Inst Mech Eng, Part G: J Aeros Eng 233(16):5863–5872


  • Sutton RS, Barto AG (2018) Reinforcement learning: An introduction. MIT press.

  • Tan RK, Zhang NL, Ye W (2020) A deep learning-based method for the design of microstructural materials. Struct Multidisc Optim 61(4):1417–1438


  • Tao J, Sun G (2019) Application of deep learning based multi-fidelity surrogate model to robust aerodynamic design optimization. Aerosp Sci Technol 92:722–737


  • Tenne Y (2019). Enhancing simulation-driven optimization by machine-learning. Int J Model Optim 9(4)

  • Thole SP, Ramu P (2020) Design space exploration and optimization using self-organizing maps. Struct Multidisc Optim 62(3):1071–1088


  • Trehan S, Carlberg KT, Durlofsky LJ (2017) Error modeling for surrogates of dynamical systems using machine learning. Int J Numer Meth Eng 112(12):1801–1827


  • Trinchero R, Larbi M, Torun HM, Canavero FG, Swaminathan M (2018) Machine learning and uncertainty quantification for surrogate models of integrated devices with a large number of parameters. IEEE Access 7:4056–4066


  • Tripathy RK, Bilionis I (2018) Deep UQ: Learning deep neural network surrogate models for high dimensional uncertainty quantification. J Comput Phys 375:565–588


  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Polosukhin I (2017) Attention is all you need. In Advances in neural information processing systems (pp. 5998–6008)

  • Wang C, Yao S, Wang Z, Hu J (2021a) Deep super-resolution neural network for structural topology optimization. Eng Optim 53(12):2108–2121


  • Wang D, Xie C, Wang S (2021c) An adaptive RBF neural network–based multi-objective optimization method for lightweight and crashworthiness design of cab floor rails using fuzzy subtractive clustering algorithm. Struct Multidisc Optim 63(2):915–928


  • Wang L, van Beek A, Da D, Chan YC, Zhu P, Chen W (2022) Data-driven multiscale design of cellular composites with multiclass microstructures for natural frequency maximization. Compos Struct 280:114949


  • Wang F, Song M, Edelen A, Huang X (2019) Machine learning for design optimization of storage ring nonlinear dynamics. arXiv preprint arXiv:1910.14220.

  • Wang D, Xiang C, Pan Y, Chen A, Zhou X, Zhang Y (2021b) A deep convolutional neural network for topology optimization with perceptible generalization ability. Eng Optim 1–16

  • Wiener N (1938) The homogeneous chaos. Am J Math 60(4):897–936


  • Williams CK, Rasmussen CE (2006) Gaussian processes for machine learning (Vol. 2, No. 3, p. 4). Cambridge, MA: MIT press.

  • Williams G, Meisel NA, Simpson TW, McComb C (2019) Design repository effectiveness for 3D convolutional neural networks: Application to additive manufacturing. J Mech Des 141(11)

  • Wu X, Kozlowski T, Meidani H (2018) Kriging-based inverse uncertainty quantification of nuclear fuel performance code BISON fission gas release model using time series measurement data. Reliab Eng Syst Saf 169:422–436


  • Wu J (2017) Introduction to convolutional neural networks. National Key Lab for Novel Software Technology. Nanjing University. China 5(23), 495.

  • Wuraola A, Patel N (2018) SQNL: A new computationally efficient activation function. In: 2018 International Joint Conference on Neural Networks (IJCNN) (pp. 1–7). IEEE

  • Xu Y, Gao Y, Wu C, Fang J, Sun G, Steven GP, Li Q (2021) Machine learning based topology optimization of fiber orientation for variable stiffness composite structures. Int J Num Methods Eng

  • Yamasaki S, Yaji K, Fujita K (2021) Data-driven topology design using a deep generative model. Struct Multidisc Optim 1–20.

  • Yan L, Zhou T (2019). An adaptive surrogate modeling based on deep neural networks for large-scale Bayesian inverse problems. arXiv preprint arXiv:1911.08926.

  • Yang Y, Perdikaris P (2019) Conditional deep surrogate models for stochastic, high-dimensional, and multi-fidelity systems. Comput Mech 64(2):417–434


  • Yao H, Gao Y, Liu Y (2020) FEA-Net: A physics-guided data-driven model for efficient mechanical response prediction. Comput Methods Appl Mech Eng 363:112892


  • Yonekura K, Suzuki K (2021) Data-driven design exploration method using conditional variational autoencoder for airfoil design. Struct Multidisc Optim 1–12.

  • Yonekura K, Hattori H (2019) Framework for design optimization using deep reinforcement learning. Struct Multidisc Optim 60(4):1709–1713


  • Yu Y, Hur T, Jung J, Jang IG (2019) Deep learning for determining a near-optimal topological design without any iteration. Struct Multidisc Optim 59(3):787–799


  • Yuan C, Moghaddam M (2020) Attribute-aware generative design with generative adversarial networks. IEEE Access 8:190710–190721


  • Zhang Y, Ye W (2019) Deep learning-based inverse method for layout design. Struct Multidisc Optim 60(2):527–536


  • Zhang J, Zhao X (2021) Machine-learning-based surrogate modeling of aerodynamic flow around distributed structures. AIAA J 59(3):868–879


  • Zhang X, Xie F, Ji T, Zhu Z, Zheng Y (2021a) Multi-fidelity deep neural network surrogate model for aerodynamic shape optimization. Comput Methods Appl Mech Eng 373:113485


  • Zhang Z, Li Y, Zhou W, Chen X, Yao W, Zhao Y (2021b) TONR: An exploration for a novel way combining neural network with topology optimization. Comput Method Appl Mech Eng 386:114083



Author information


Corresponding author

Correspondence to Ikjin Lee.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Replication of results

In this review paper, we do not provide any results to replicate.

Additional information

Responsible Editor: Byeng D. Youn

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix. ML methods widely used in the context of structural and multidisciplinary optimization

ML algorithms can be categorized into four groups: (1) classification, (2) regression, (3) clustering, and (4) dimension reduction, as shown in Fig. 6. Classification and regression are both supervised learning algorithms, where the main idea is to generate a prediction model. If the predicted response is discrete, it is a classification problem, whereas if the response is continuous, it is a regression problem; therefore, the ML algorithms used for classification and regression are, in general, very similar. The most commonly used classical ML algorithms for classification problems include logistic regression (Cox 1958), k-nearest neighbors (Fix and Hodges 1989), support vector machines (SVM) (Cortes and Vapnik 1995), kernel SVM, naive Bayes, decision tree classification, and random forest classification. The most commonly used classical ML algorithms for regression problems include simple linear regression, multiple linear regression, polynomial regression, Kriging, support vector regression (SVR), decision tree regression, and random forest regression.
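The distinction can be made concrete in a few lines of code. The following minimal sketch (scikit-learn and synthetic data are assumptions for illustration; the survey prescribes no particular library) fits a classifier to a discrete response and a regressor to a continuous response over the same inputs:

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                   # two input features
y_cont = X @ np.array([1.5, -0.7]) + 0.1 * rng.normal(size=200)
y_disc = (y_cont > 0).astype(int)               # thresholding makes the response discrete

reg = LinearRegression().fit(X, y_cont)         # regression: continuous response
clf = LogisticRegression().fit(X, y_disc)       # classification: discrete response
print(reg.predict(X[:3]), clf.predict(X[:3]))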

Fig. 6: Categories of ML problems

Clustering is similar to classification in that both are used for grouping data. The main difference is that classification categorizes labeled data, whereas clustering detects patterns within an unlabeled data set; classification is therefore a supervised learning algorithm, whereas clustering is an unsupervised one. The most commonly used classical ML algorithms for clustering problems include k-means, mean-shift clustering, Gaussian mixture models, density-based spatial clustering, and hierarchical agglomerative clustering.
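A minimal k-means sketch follows (scikit-learn and a synthetic two-cluster data set are assumptions for illustration); it recovers the two groups without using any labels:

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),   # group near the origin
               rng.normal(3.0, 0.5, (50, 2))])  # group near (3, 3)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(labels[:5], labels[-5:])                  # the two groups, found without labels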

Dimension reduction aims to reduce the number of input variables in a dataset, thereby guarding against the curse of dimensionality, which makes algorithms increasingly difficult to run as the dimension of the data grows. Data are transformed from a high-dimensional space into a lower-dimensional space that preserves similar information. Dimension reduction methods can be further categorized into linear and nonlinear methods. The most commonly used linear algorithms for dimension reduction include principal component analysis (Wiener 1938), factor analysis (Harman 1976), linear discriminant analysis (Fisher 1936), and singular value decomposition (Golub and Reinsch 1971). The nonlinear algorithms include kernel principal component analysis, isometric mapping, and t-distributed stochastic neighbor embedding (t-SNE); a minimal sketch of the linear case follows below.
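As a concrete illustration of the linear case, the following sketch (numpy only; the synthetic data and the choice of two components are illustrative assumptions) projects 10-dimensional data onto its two leading principal components via the singular value decomposition:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                # 100 samples in 10 dimensions
Xc = X - X.mean(axis=0)                       # center the data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                             # project onto the 2 leading components
print(Z.shape)                                # (100, 2)

Among the ML methods listed in Fig. 6, we briefly explain those that are widely used in the context of structural and multidisciplinary optimization in the following subsections.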

A.1 Linear regression

Linear regression (Montgomery et al. 2021) models the relationship between a response (dependent) variable y and one or more independent variables x. If there is only one independent variable, it is called simple linear regression. The fundamental idea in linear regression is to find the coefficients of the basis functions that best model the data. Ordinary least squares (OLS) is the most common method used to estimate the unknown coefficients from the given data. Nonlinearity in the function is modeled using more complex basis functions while the model remains linear in the coefficients:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon = \mathbf{x}^{\mathrm{T}} \boldsymbol{\beta} + \varepsilon$$
(A1)

where \(\beta_0, \beta_1, \ldots, \beta_n\) are the unknown coefficients and \(\varepsilon\) is the error term.
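A minimal sketch of estimating \(\boldsymbol{\beta}\) in Eq. (A1) by OLS follows (numpy only; the synthetic data and "true" coefficients are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X1 = np.hstack([np.ones((50, 1)), X])           # prepend a column of ones for beta_0
beta_true = np.array([2.0, 1.0, -0.5, 0.3])     # illustrative "true" coefficients
y = X1 @ beta_true + 0.05 * rng.normal(size=50)
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)  # OLS estimate of beta
print(beta_hat)                                 # close to beta_true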

A.2 Gaussian process (GP)

GP (Rasmussen 2003), also known as Kriging when the mean of the GP is zero, is a stochastic approach that finds wide use in regression, classification, and unsupervised learning. It is usually applied in the linear regression framework with the Gaussian kernel as the basis function, and it is also a preferred approach for inference on functions. A GP is a generalization of the Gaussian probability distribution in which every finite collection of random variables has a multivariate Gaussian distribution; it is a distribution over functions with a continuous domain such as time or space. Since a GP provides model predictions as well as prediction error estimates, even when the simulation is deterministic, it is sought after as a surrogate in the design and analysis of expensive computer experiments, and since GP metamodels can fit complicated surfaces well, they are suited for building accurate global metamodels. A GP is completely specified by its mean function \(m(\mathbf{x})\) and covariance function \(k(\mathbf{x}, \mathbf{x}')\) as

$$f\left(\mathbf{x}\right) \sim GP\left(m\left(\mathbf{x}\right), k\left(\mathbf{x}, \mathbf{x}'\right)\right)$$
(A2)

GP can be extended to multiple outputs by using multiple means and covariances. It permits easy interpolation of data and has a built-in mechanism to account for noise. Furthermore, a GP can quantify the uncertainty of its predictions and has conditional distributions that allow adaptive sampling or Bayesian studies. However, because GP models evaluated on a set of n points lead to multivariate normal distributions, and computing the inverse and determinant of the n × n covariance matrix takes \(O(n^3)\) time, using GP with large-scale datasets is a challenge. Recently, approaches such as fast matrix–vector multiplication (Gardner et al. 2018a, b; Dong et al. 2017) and sparse GP (Cutajar et al. 2016) have been developed to reduce the amount of computation when the dataset exceeds about 100,000 points.
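The conditioning behind these predictions can be written in a few lines. The following minimal sketch (numpy only; the one-dimensional data, Gaussian kernel length scale, and jitter value are illustrative assumptions) computes the GP posterior mean and variance at two test points, with the Cholesky factorization of the covariance matrix as the \(O(n^3)\) step discussed above:

import numpy as np

def rbf(a, b, ell=0.2):
    # Gaussian (RBF) kernel between two 1-D sets of points
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

x = np.linspace(0.0, 1.0, 20)               # training inputs
y = np.sin(2.0 * np.pi * x)                 # training outputs
xs = np.array([0.33, 0.66])                 # test inputs

K = rbf(x, x) + 1e-6 * np.eye(len(x))       # covariance plus jitter for stability
L = np.linalg.cholesky(K)                   # the O(n^3) step
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
Ks = rbf(x, xs)
mean = Ks.T @ alpha                         # posterior (predicted) mean
v = np.linalg.solve(L, Ks)
var = rbf(xs, xs).diagonal() - np.sum(v * v, axis=0)  # prediction error estimate
print(mean, var)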

A.3 Artificial neural network (ANN)

In the 1940s, McCulloch and Pitts (1943) formulated the first NN model. Since its inception, the NN has attracted interest among researchers and found applications in various domains; as a result, better algorithms and more powerful networks have been developed. ANN refers to a biologically inspired sub-domain of artificial intelligence (AI) modeled on the network of the brain. Akin to the human brain, ANNs have neurons (called nodes) that are connected to each other in different layers of the network, as shown in Fig. 7. The basic idea of an ANN is that an input vector x is weighted by w and, along with a bias b, subjected to an activation function f that is linear or nonlinear to produce the output y as

$$y = f\left(\mathbf{w}^{\mathrm{T}} \mathbf{x} + \mathbf{b}\right)$$
(A3)

The weights in Eq. (A3) are optimized during training until the network reaches a specified level of accuracy. Depending on the application, many activation functions are used in ANNs, namely sigmoid, hyperbolic tangent, rectified linear unit (ReLU), Heaviside, signum, and softmax functions (Karlik and Olgac 2011). Researchers have also developed application-specific activation functions (Wuraola and Patel 2018; Gomes and Ludermir 2013). Since ANNs deal with multidimensional data, scaling approaches such as StandardScaler, RobustScaler, MinMaxScaler, and Normalizer can be used for data preprocessing and can prevent activations from converging to zero or diverging to infinity during the learning process.
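As a minimal illustration of Eq. (A3) (numpy only; the weights, biases, and the sigmoid activation are arbitrary choices for this sketch), one layer of two neurons maps a three-dimensional input to a two-dimensional output:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])                # input vector
W = np.array([[0.1, 0.4, -0.2],               # one weight row per neuron
              [0.7, -0.3, 0.5]])
b = np.array([0.0, 0.1])                      # biases
y = sigmoid(W @ x + b)                        # Eq. (A3) with f = sigmoid
print(y)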

Fig. 7: Simple architecture of ANN

ANNs are broadly classified into two categories: feed-forward NNs and feed-backward NNs. In a feed-forward NN, the information passes only in the forward direction, i.e., from the input layer to the hidden layer (if any) and then to the output. Single-layer perceptrons, multi-layer perceptrons, and radial basis function networks are examples of feed-forward NNs. In a feed-backward NN, the inputs are fed in the forward direction and the errors are computed and propagated in the reverse (hence the terminology "back") direction to the previous layers, so as to reduce the error in the cost function by readjusting the weights; examples include Bayesian regularized NNs and Kohonen's self-organizing maps. The loss function is computed as the difference between the prediction and the target after each feed-forward pass, and in the backpropagation process the optimizer iteratively adjusts parameters such as weights and biases to minimize the loss function, as sketched below.
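A minimal sketch of this training cycle follows (numpy only; the one-hidden-layer network, squared loss, learning rate, and synthetic data are illustrative assumptions, and the factor of 2 in the loss gradient is absorbed into the learning rate):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                   # synthetic inputs
y = np.sin(X.sum(axis=1, keepdims=True))       # synthetic targets
W1, b1 = 0.5 * rng.normal(size=(3, 8)), np.zeros(8)
W2, b2 = 0.5 * rng.normal(size=(8, 1)), np.zeros(1)

for _ in range(500):
    h = np.tanh(X @ W1 + b1)                   # feed-forward pass
    pred = h @ W2 + b2
    err = pred - y                             # loss gradient (factor of 2 absorbed)
    gW2 = h.T @ err / len(X)
    gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)         # error propagated backward
    gW1 = X.T @ dh / len(X)
    gb1 = dh.mean(axis=0)
    for p, g in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        p -= 0.1 * g                           # readjust weights and biases
print(float((err ** 2).mean()))                # final mean-squared loss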

ANNs can be used for both regression and classification problems, which are predictive modeling techniques. In the context of classification, since an ANN works by splitting the problem into layered networks of simpler elements, ANNs are reliable when the tasks involve many features. The most attractive feature of ANNs is that they can map any number of inputs to any number of outputs; once trained, predictions are fast and cheap.

NNs are typically black-box approaches; that is, one might not be able to capture the influence of the independent variables on the dependent variables. Overfitting is a fundamental challenge of ANNs, as they depend predominantly on the training data. With traditional CPUs, ANNs were expensive in terms of the computational time needed to train the network, but the advent of cloud computing and increased computing power has relieved this burden. However, as researchers focus on more complex problems and use more layers trained on larger data sets, computational times grow again across multiple training iterations.

A.4 Deep neural network (DNN)

A DNN is created when NNs are stacked one after another. The primary difference between a conventional NN and a DNN is that the former has one or two hidden layers while the latter has several, as shown in Fig. 8. Each circle in the figure calculates a weighted sum of the input vector plus a bias, after which a nonlinear function is applied to obtain the output. DNNs can handle functions with limited regularity and are powerful for high-dimensional problems. The basic idea of a DNN with n hidden layers is to approximate a function using a nonlinear activation function (Emmert-Streib et al. 2020), as represented in

$$\mathrm{DNN}\left(\mathbf{x}\right) = \mathbf{w}^{(n)} \mathbf{x}^{(n)} + \mathbf{b}^{(n)}$$
(A4)

and

$$\mathbf{x}^{(k+1)} = \sigma\left(\mathbf{w}^{(k)} \mathbf{x}^{(k)} + \mathbf{b}^{(k)}\right), \quad k = 0, 1, \ldots, n-1$$
(A5)

where \(\mathbf{w}\) and \(\mathbf{b}\) are the weights and biases of the network and \(\sigma\) is the activation function. A DNN has more complex connections between layers than a network with one or two hidden layers and has an automatic feature-extraction capability. Therefore, when larger training data sets are used, a DNN can provide increasingly accurate predictions, whereas the accuracy of classical ML algorithms remains fairly constant.
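A minimal sketch of Eqs. (A4) and (A5) follows (numpy only; the layer sizes, random weights, and ReLU as the choice of \(\sigma\) are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 16, 1]                    # input, three hidden layers, output
Ws = [0.3 * rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]

x = rng.normal(size=sizes[0])                 # x^(0): the network input
for W, b in zip(Ws[:-1], bs[:-1]):
    x = np.maximum(0.0, W @ x + b)            # Eq. (A5) with sigma = ReLU
y = Ws[-1] @ x + bs[-1]                       # Eq. (A4): affine output layer
print(y)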

Fig. 8: Architecture of DNN

There are three major classes of DNNs, namely supervised, semi-supervised, and unsupervised DNNs. Examples of supervised learning algorithms include deep feed-forward networks (DFNNs) and CNNs. Restricted Boltzmann machines, autoencoders (AEs), GANs, and long short-term memory networks (LSTMs) are examples of unsupervised learning algorithms. Recurrent neural networks (RNN) are an example of semi-supervised learning techniques.

For complex problems such as image classification, natural language processing, and speech recognition, DNNs are more useful than shallow networks. DNNs typically outperform other approaches when the data is large, their architectures are flexible enough to adapt to new problems, and they can work with any data type. Getting trapped in local minima, vanishing gradients, and overfitting are some of the challenges associated with training DNNs, which requires large data.

A.5 Convolutional neural network (CNN)

One of the most widely used DNNs is the CNN (Fukushima 1988). While ANNs are inspired by the human brain, CNNs are inspired by the human visual system and are predominantly applied to image analysis. CNNs consist of two operations, namely convolution and pooling. Unlike in ANNs, in CNNs the neurons in one layer are connected only to nearby neurons in the next layer, which leads to a significant reduction in the number of parameters in the network. A typical CNN consists of an input, an output, and multiple hidden layers comprising a series of convolutional layers (filters or convolution kernels), as shown in Fig. 9. ReLU is the typical activation function used, followed by operations such as pooling layers, fully connected layers, and normalization layers; backpropagation is used for error minimization and weight adjustment. Wu (2017) provides a tutorial on CNNs. Compared to CPU-based architectures, CNNs with GPU-based architectures take less time to train, because the GPU is vastly superior in the computation of the dense algebraic kernels, such as matrix–vector multiplication, of which DL algorithms are mainly composed.
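To make the two operations concrete, the following minimal sketch (numpy only; the random image, single filter, and sizes are illustrative assumptions) applies one 3 × 3 convolution, a ReLU activation, and 2 × 2 max pooling to a small single-channel image:

import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))                 # single-channel "image"
kernel = rng.normal(size=(3, 3))              # one 3x3 convolution filter

conv = np.array([[np.sum(img[i:i + 3, j:j + 3] * kernel)   # valid convolution
                  for j in range(6)] for i in range(6)])
relu = np.maximum(0.0, conv)                  # typical activation
pooled = relu.reshape(3, 2, 3, 2).max(axis=(1, 3))         # 2x2 max pooling
print(pooled.shape)                           # (3, 3) feature map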

Fig. 9: Architecture of CNN

CNNs can easily process high-dimensional inputs such as images. They are good at extracting local information from text and at exploring meaningful semantic and syntactic relationships between phrases and words, and the compositional structure of text data is handled naturally by a CNN's architecture. However, CNNs need large data sets for training and are hence computationally intensive, and encoding the position and orientation of objects is still a challenge for CNNs.

A.6 Reinforcement learning (RL)

RL (Sutton and Barto 2018) is a paradigm of ML in which agents learn by interacting with the environment. Unlike other ML algorithms, RL works by trial-and-error learning and maximizes a reward rather than finding hidden structure. As can be seen in Fig. 10, an RL agent performs an action \(a_t\) while transitioning from a state \(s_t\) to \(s_{t+1}\) and is rewarded \(r_{t+1}\) for that action, and this process is repeated iteratively to maximize the cumulative reward. The probability of transitioning to the new state is expressed by \(P(s_{t+1} \mid s_t, a_t)\). The best sequence of actions that an RL agent can make is called a policy, and the entire set of actions from start to finish that an agent performs is called an episode. Usually, the dynamics of the RL problem are captured by a Markov decision process, as in the sketch below.
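A minimal sketch of tabular Q-learning illustrates these elements (numpy only; the five-state chain environment, the reward at the right end, and the hyperparameters are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2                  # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))         # tabular action-value estimates

for episode in range(300):
    s = 0                                   # every episode starts at the left end
    for _ in range(100):                    # cap the episode length
        # epsilon-greedy action selection (explore with probability 0.3)
        a = rng.integers(n_actions) if rng.random() < 0.3 else int(Q[s].argmax())
        s_next = min(s + 1, 4) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == 4 else 0.0     # reward only at the right end
        # temporal-difference update toward r + gamma * max_a' Q(s', a')
        Q[s, a] += 0.5 * (r + 0.9 * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == 4:                          # goal reached: episode ends
            break

print(Q.argmax(axis=1))                     # greedy policy per state

Under these assumptions, the learned greedy policy moves right in states 0 through 3; state 4 is terminal, so its row is never updated.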

RL usually performs better than other standard learning techniques in solving complex problems. When no training dataset is available, RL can still learn from experience, and it focuses on achieving long-term results that are difficult to accomplish with other techniques. Like other ML techniques, however, RL requires large amounts of data, is computationally expensive, and can be heavily affected by the curse of dimensionality. Combining conventional RL with DL yields deep RL, in which DNNs are used to approximate the rewards and policies that are conventionally represented through state–action pairs. Deep RL can be employed when the state space is complex and very high computational effort would otherwise be required (Fig. 10).
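As an illustration of the reward-driven update at the heart of RL, the following is a minimal tabular Q-learning sketch; the state/action counts and hyperparameters are hypothetical, and the environment supplying \((s_t, a_t, r_{t+1}, s_{t+1})\) transitions is assumed.

```python
import numpy as np

n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))    # tabular action-value estimates
alpha, gamma, eps = 0.1, 0.99, 0.1     # learning rate, discount, exploration

def choose_action(s):
    """Epsilon-greedy policy: explore with probability eps."""
    if np.random.rand() < eps:
        return np.random.randint(n_actions)
    return int(Q[s].argmax())

def q_update(s, a, r, s_next):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))"""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```

Deep RL replaces the table Q with a DNN that maps states (or state–action pairs) to value estimates.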

Fig. 10 Basic setting of RL

A.7 Recurrent neural network (RNN)

RNN (Rumelhart et al. 1986) is one of the common semi-supervised learning algorithms that use sequential (or ordered) data for training; examples of ordered data are DNA sequences, financial data, and time-series data. While making decisions, an RNN uses the current input as well as the history of past inputs that it has learned, stored in its hidden state. Typically, RNNs consist of an input layer, a hidden layer, and an output layer, as shown in Fig. 11. The number of neurons in the hidden layer of an RNN should be between the number of inputs and the number of outputs. The key feature of an RNN is that it makes a copy of the output and feeds it back into the network, so that past information is retained. The knowledge of the network is updated in the hidden state at every time step, and the update can be expressed as

$$h_t = f_w \left( {x_t ,\,h_{t - 1} } \right)$$
(A6)

where \(h_t\) is the new hidden state, \(h_{t-1}\) is the previous hidden state, \(x_t\) is the current input, and \(f_w\) is a fixed function with trainable weights.
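A minimal NumPy sketch of this recurrence is given below, taking \(f_w\) to be the common choice \(\tanh(W_x x_t + W_h h_{t-1} + b)\); the weight shapes and the random sequence are illustrative assumptions.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrence step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

rng = np.random.default_rng(1)
W_x = rng.normal(size=(8, 4))   # input-to-hidden weights (sizes illustrative)
W_h = rng.normal(size=(8, 8))   # hidden-to-hidden weights
b = np.zeros(8)

h = np.zeros(8)                          # initial hidden state
for x_t in rng.normal(size=(5, 4)):      # a sequence of 5 inputs
    h = rnn_step(x_t, h, W_x, W_h, b)    # same weights reused at every step
```

Because the same weights are reused at every step, the parameter count does not grow with the sequence length.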

Fig. 11 RNN with hidden memory state

These algorithms commonly find application in ordinal or temporal problems such as image captioning, speech recognition, and natural language processing. Just as spatial data is processed efficiently by CNNs, RNNs are designed to process sequential data efficiently, and the number of model parameters does not increase as the number of time steps grows. While training an RNN, error gradients are used to update the network weights. Sometimes the error gradients accumulate, resulting in large weight updates (exploding gradients) and an unstable network; conversely, if the weight updates become vanishingly small, one faces the problem of vanishing gradients. These are the two major issues associated with RNNs. To mitigate them, weight initialization methods such as Xavier and He initialization, gradient clipping, and batch normalization are used, or an LSTM or gated recurrent unit (GRU) architecture is adopted. When temporal dependencies in sequences need to be captured, RNNs are among the best choices, although recent developments such as Transformers (Vaswani et al. 2017) can outperform RNNs in such applications.
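As a sketch of one such remedy, the snippet below applies gradient-norm clipping during a single PyTorch training step of a vanilla RNN; the model sizes, random data, and clipping threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
head = nn.Linear(8, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()))

x = torch.randn(16, 5, 4)   # batch of 16 sequences of length 5
y = torch.randn(16, 1)

out, h_n = rnn(x)           # out: hidden states at every time step
loss = nn.functional.mse_loss(head(out[:, -1]), y)
loss.backward()
# Rescale gradients so their global norm stays below 1.0,
# mitigating the exploding-gradient problem
torch.nn.utils.clip_grad_norm_(
    list(rnn.parameters()) + list(head.parameters()), max_norm=1.0)
opt.step()
```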

A.8 Variational autoencoder (VAE)

An autoencoder (AE) is a type of unsupervised learning model that learns from unlabeled data; it has traditionally been used for dimensionality reduction and feature learning, but has recently gained popularity as a generative model that can generate data similar to the training data. Since the VAE (Kingma and Welling 2013) is based on an AE, it consists of two parts: an encoder and a decoder. However, unlike an AE, which represents the latent vector as a deterministic value, a VAE describes the latent vector through a density function; being based on a probabilistic model gives the VAE computational flexibility. The latent vector is used to reconstruct the input image, and VAE training is performed with the goal of reducing the difference between the generated image and the input image, as shown in Fig. 12. The evidence lower bound (ELBO) and the reparameterization trick are used to perform the optimization. The main advantage of a VAE is that the latent-vector information makes it possible to perform other tasks, such as design optimization, in the latent space. However, since the density is not obtained directly, the quality of the generated samples may be somewhat inferior to direct density methods such as PixelRNN or PixelCNN, and the generated images are relatively blurry compared to those of a GAN.
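The reparameterization trick mentioned above can be sketched in a few lines of PyTorch: the latent vector is sampled as \(z = \mu + \sigma\epsilon\) with \(\epsilon \sim \mathcal{N}(0, I)\), so gradients can flow through the encoder outputs; the dimensions below are illustrative.

```python
import torch

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    std = torch.exp(0.5 * log_var)   # sigma from the predicted log-variance
    eps = torch.randn_like(std)
    return mu + std * eps

mu = torch.zeros(4, requires_grad=True)      # encoder outputs (illustrative)
log_var = torch.zeros(4, requires_grad=True)
z = reparameterize(mu, log_var)              # latent vector fed to the decoder

# KL term of the ELBO (closed form for a standard Gaussian prior)
kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
```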

Fig. 12 Architecture of VAE (Asperti et al. 2021)

A.9 Generative adversarial network (GAN)

A GAN (Goodfellow et al. 2014), shown in Fig. 13, trains a model that samples a latent vector from a simple distribution and transforms it into an image, following a game-theoretic approach. The objective function of a GAN consists of the discriminator output for real data and the discriminator output for generated fake data. Because the generator and discriminator train with the goals of minimizing and maximizing this objective function, respectively, the GAN is said to play a minimax game. The story of a counterfeiter (generator) and a police officer (discriminator) is an easy way to understand the concept: when the counterfeiter produces counterfeit currency, the police judge whether it is genuine, and in the process the generator and discriminator evolve competitively to produce ever more authentic-looking counterfeits.
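A minimal PyTorch sketch of this minimax objective follows; the tiny generator and discriminator, the latent dimension, and the stand-in data are all hypothetical.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.Tanh())     # generator: z -> sample
D = nn.Sequential(nn.Linear(32, 1), nn.Sigmoid())  # discriminator: P(real)
bce = nn.BCELoss()

real = torch.randn(16, 32)      # stand-in for a batch of real data
fake = G(torch.randn(16, 8))    # samples generated from latent vectors

# Discriminator: maximize log D(x) + log(1 - D(G(z)))
d_loss = (bce(D(real), torch.ones(16, 1))
          + bce(D(fake.detach()), torch.zeros(16, 1)))
# Generator: fool the discriminator (non-saturating form of the minimax loss)
g_loss = bce(D(fake), torch.ones(16, 1))
```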

Fig. 13 Architecture of GAN (Alom et al. 2019)

GANs are difficult to apply in various fields because of their unstable training; consequently, the deep convolutional GAN (DCGAN) (Radford et al. 2015), which uses a CNN in the generator, was developed. In addition, various models such as the conditional GAN (cGAN), the boundary equilibrium GAN, and the super-resolution GAN have been developed to improve performance and to enable application to new fields. The main advantage of GANs is their ability to create new, novel images.

A.10 Ensemble methods

Ensemble methods are a paradigm of ML techniques that has become popular during the past three decades (Bishop 1995), in which several learners are trained and combined to solve a problem. Boosting, bagging (Bühlmann 2012), and stacking (Džeroski and Ženko 2004) are the most widely used approaches in ensemble methods. AdaBoost (Rätsch et al. 2001), gradient boosting (Friedman 2001), extreme gradient boosting (Chen and Guestrin 2016), and light gradient boosting (Ke et al. 2017) are among the most frequently used boosting algorithms, while the bagging meta-estimator and random forests are the popular bagging algorithms.

Ensemble methods are used to improve the accuracy of a model. Bagging is a variance-reduction technique, whereas the objective of boosting and stacking is to reduce the bias rather than the variance. Ensembles have been shown to serve as insurance against bad predictions, raising a red flag when one of the models consistently performs poorly, especially in regions of interest. Ultimately, all ensemble algorithms attempt to improve model accuracy. Because they combine multiple learners, ensemble methods generalize better than a single learner, which is one of their major advantages.
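The contrast between bagging and boosting can be illustrated with scikit-learn; the synthetic regression data and estimator settings below are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)

# Bagging averages many independently trained learners (variance reduction);
# boosting fits learners sequentially on residual errors (bias reduction)
models = {
    "bagging": BaggingRegressor(n_estimators=50, random_state=0),
    "boosting": GradientBoostingRegressor(n_estimators=50, random_state=0),
}
for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean R^2 = {score:.3f}")
```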
