A survey of machine learning techniques in structural and multidisciplinary optimization

  • Review Paper
  • Published in: Structural and Multidisciplinary Optimization

Abstract

Machine learning (ML) techniques have been used in an extensive range of applications in the field of structural and multidisciplinary optimization over the last few years. This paper presents a survey of this wide but disjointed literature on ML techniques in the structural and multidisciplinary optimization field. First, we discuss the challenges associated with conventional optimization and how ML can address them. Then, we review the literature in the context of how ML can accelerate design synthesis and optimization. Some real-life engineering applications in structural design, material design, fluid mechanics, aerodynamics, heat transfer, and multidisciplinary design are summarized, and a brief list of widely used open-source codes as well as commercial packages is provided. Finally, the survey culminates with some concluding remarks and future research suggestions. For the sake of completeness, categories of ML problems, algorithms, and paradigms are presented in the Appendix.


References

  • Abueidda DW, Koric S, Sobh NA (2020) Topology optimization of 2D structures with nonlinearities using deep learning. Comput Struct 237:106283


  • Abueidda DW, Lu Q, Koric S (2021) Meshless physics-informed deep learning method for three-dimensional solid mechanics. Int J Numer Meth Eng 122(23):7182–7201


  • Acar E, Rais-Rohani M (2009) Ensemble of metamodels with optimized weight factors. Struct Multidisc Optim 37(3):279–294


  • Acar E, Solanki K (2009) System reliability based vehicle design for crashworthiness and effects of various uncertainty reduction measures. Struct Multidisc Optim 39(3):311–325


  • Adeli H, Park HS (1995) A neural dynamics model for structural optimization—theory. Comput Struct 57(3):383–390


  • Alom MZ, Taha TM, Yakopcic C, Westberg S, Sidike P, Nasrin MS, Asari VK (2019) A state-of-the-art survey on deep learning theory and architectures. Electronics 8(3):292


  • Amsallem D, Farhat C (2008) Interpolation method for adapting reduced-order models and application to aeroelasticity. AIAA J 46(7):1803–1813


  • An D, Liu J, Zhang M, Chen X, Chen M, Sun H (2020) Uncertainty modeling and runtime verification for autonomous vehicles driving control: A machine learning-based approach. J Syst Softw 167:110617


  • Asperti A, Evangelista D, Piccolomini EL (2021) A survey on variational autoencoders from a green AI perspective. SN Computer Science 2(4):1–23


  • Ates GC, Gorguluarslan RM (2021) Two-stage convolutional encoder-decoder network to improve the performance and reliability of deep learning models for topology optimization. Struct Multidisc Optim 63(4):1927–1950


  • Banga S, Gehani H, Bhilare S, Patel S, Kara L (2018) 3D topology optimization using convolutional neural networks. arXiv preprint arXiv:1808.07440

  • Baraldi P, Mangili F, Zio E (2015) A prognostics approach to nuclear component degradation modeling based on Gaussian process regression. Prog Nucl Energy 78:141–154


  • Barber D, Wang Y (2014). Gaussian processes for Bayesian estimation in ordinary differential equations. In International conference on machine learning (pp. 1485–1493). PMLR.

  • Bataleblu AA (2019) Computational intelligence and its applications in uncertainty-based design optimization. In Bridge Optimization-Inspection and Condition Monitoring. IntechOpen.

  • Behzadi MM, Ilieş HT (2021) Real-time topology optimization in 3D via deep transfer learning. Comput Aided Des 135:103014


  • Bendsøe MP (1989) Optimal shape design as a material distribution problem. Struct Optim 1(4):193–202


  • Bi S, Zhang J, Zhang G (2020) Scalable deep-learning-accelerated topology optimization for additively manufactured materials. arXiv preprint arXiv:2011.14177.

  • Bielecki D, Patel D, Rai R, Dargush GF (2021) Multi-stage deep neural network accelerated topology optimization. Struct Multidisc Optim 64(6):3473–3487


  • Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, UK


  • Bostanabad R, Chan YC, Wang L, Zhu P, Chen W (2019) Globally approximate gaussian processes for big data with application to data-driven metamaterials design. J Mech Des 141(11):111402


  • Bühlmann P (2012). Bagging, boosting and ensemble methods. In Handbook of computational statistics (pp. 985–1022). Springer, Berlin, Heidelberg.

  • Burnap A, Pan Y, Liu Y, Ren Y, Lee H, Gonzalez R, Papalambros PY (2016b) Improving design preference prediction accuracy using feature learning. J Mech Des 138(7):071404


  • Burnap A, Liu Y, Pan Y, Lee H, Gonzalez R, Papalambros PY (2016a) Estimating and exploring the product form design space using deep generative models. In: International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 50107, p. V02AT03A013). American Society of Mechanical Engineers.

  • Burnap A, Hauser JR, Timoshenko A (2021) Design and evaluation of product aesthetics: A human-machine hybrid approach. Available at SSRN 3421771.

  • Caldeira J, Nord B (2020) Deeply uncertain: Comparing methods of uncertainty quantification in deep learning algorithms. Mach Learn: Sci Technol 2(1):015002


  • Cang R, Yao H, Ren Y (2019) One-shot generation of near-optimal topology through theory-driven machine learning. Comput-Aided Des 109:12–21


  • Capuano G, Rimoli JJ (2019) Smart finite elements: a novel machine learning application. Comput Methods Appl Mech Eng 345:363–381


  • Cerbone G (1992) Machine learning techniques in optimal design. In: Artificial Intelligence in Design’92 (pp. 699–717). Springer, Dordrecht

  • Cha YJ, Choi W, Büyüköztürk O (2017) Deep learning-based crack damage detection using convolutional neural networks. Comput-Aided Civ Inf Eng 32(5):361–378


  • Chakraborty S (2021) Transfer learning based multi-fidelity physics informed deep neural network. J Comput Phys 426:109942


  • Chan S, Elsheikh AH (2018) A machine learning approach for efficient uncertainty quantification using multiscale methods. J Comput Phys 354:493–511


  • Chandrasekhar A, Suresh K (2021) TOuNN: Topology optimization using neural networks. Struct Multidisc Optim 63(3):1135–1149


  • Chen W, Ahmed F (2021a) MO-PaDGAN: Reparameterizing Engineering Designs for augmented multi-objective optimization. Appl Soft Comput 113:107909


  • Chen W, Ahmed F (2021b) Padgan: Learning to generate high-quality novel designs. J Mech Des 143(3):031703


  • Chen CT, Gu GX (2020) Generative deep neural networks for inverse materials design using backpropagation and active learning. Adv Sci 7(5):1902607


  • Chen X, Chen X, Zhou W, Zhang J, Yao W (2020) The heat source layout optimization using deep learning surrogate modeling. Struct Multidisc Optim 62(6):3127–3148


  • Chen W, Ahmed F (2020) PaDGAN: A generative adversarial network for performance augmented diverse designs. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 84003, p. V11AT11A010). American Society of Mechanical Engineers.

  • Chen T, Guestrin C (2016) Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining (pp. 785–794)

  • Chen W, Chiu K, Fuge M (2019) Aerodynamic design optimization and shape exploration using generative adversarial networks. In AIAA Scitech 2019 Forum (p. 2351)

  • Chhabra JP, Warn GP (2019) A method for model selection using reinforcement learning when viewing design as a sequential decision process. Struct Multidisc Optim 59(5):1521–1542


  • Chi H, Zhang Y, Tang TLE, Mirabella L, Dalloro L, Song L, Paulino GH (2021) Universal machine learning for topology optimization. Comput Methods Appl Mech Eng 375:112739


  • Cortes C, Vapnik V (1995) Support-Vector Networks. Mach Learn 20(3):273–297


  • Cox DR (1958) The regression analysis of binary sequences. J Roy Stat Soc: Ser B (methodol) 20(2):215–232


  • Cutajar K, Osborne M, Cunningham J, Filippone M (2016) Preconditioning kernel matrices. In International conference on machine learning (pp. 2529–2538). PMLR.

  • Dai Y, Li Y, Liu LJ (2019) New product design with automatic scheme generation. Sens Imag 20(1):1–16


  • Deng C, Wang Y, Qin C, Lu W (2020) Self-directed online machine learning for topology optimization. arXiv preprint arXiv:2002.01927.

  • Deng H, To AC (2020) Topology optimization based on deep representation learning (DRL) for compliance and stress-constrained design. Comput Mech 66:449–469


  • Dering M, Cunningham J, Desai R, Yukish MA, Simpson TW, Tucker CS (2018) A physics-based virtual environment for enhancing the quality of deep generative designs. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 51753, p. V02AT03A015). American Society of Mechanical Engineers.

  • Diego-Mas JA, Alcaide-Marzal J (2016) Single users’ affective responses models for product form design. Int J Ind Ergon 53:102–114


  • Doi S, Sasaki H, Igarashi H (2019) Multi-objective topology optimization of rotating machines using deep learning. IEEE Trans Magn 55(6):1–5


  • Dong K, Eriksson D, Nickisch H, Bindel D, Wilson AG (2017) Scalable log determinants for Gaussian process kernel learning. arXiv preprint arXiv:1711.03481

  • Du X, Xu H, Zhu F (2021) A data mining method for structure design with uncertainty in design variables. Comput Struct 244:106457


  • Džeroski S, Ženko B (2004) Is combining classifiers with stacking better than selecting the best one? Mach Learn 54(3):255–273


  • Elingaard MO, Aage N, Bærentzen JA, Sigmund O (2022) De-homogenization using convolutional neural networks. Comput Methods Appl Mech Eng 388:114197


  • Emmert-Streib F, Yang Z, Feng H, Tripathi S, Dehmer M (2020) An introductory review of deep learning for prediction models with big data. Front Artif Intel 3:4


  • Falck R, Gray JS, Ponnapalli K, Wright T (2021) dymos: A Python package for optimal control of multidisciplinary systems. J Open Source Soft 6(59):2809


  • Fernández-Godino MG, Park C, Kim NH, Haftka RT (2016) Review of multi-fidelity models. arXiv preprint arXiv:1609.07196.

  • Ferreiro-Cabello J, Fraile-Garcia E, de Pison Ascacibar EM, Martinez-de-Pison FJ (2018) Metamodel-based design optimization of structural one-way slabs based on deep learning neural networks to reduce environmental impact. Eng Struct 155:91–101


  • Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7(2):179–188


  • Fix E, Hodges JL (1989) Discriminatory analysis. Nonparametric discrimination: Consistency properties. Inter Stat Rev/revue Internationale De Statistique 57(3):238–247


  • Forrester A, Sobester A, Keane A (2008) Engineering design via surrogate modelling: a practical guide. John Wiley & Sons, USA


  • Freiesleben J, Keim J, Grutsch M (2020) Machine learning and design of experiments: Alternative approaches or complementary methodologies for quality improvement? Qual Reliab Eng Int 36(6):1837–1848


  • Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232


  • Fukushima K (1988) Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Netw 1(2):119–130


  • García-Segura T, Yepes V, Frangopol DM (2017) Multi-objective design of post-tensioned concrete road bridges using artificial neural networks. Struct Multidisc Optim 56(1):139–150


  • Gardner JR, Pleiss G, Bindel D, Weinberger KQ, Wilson AG (2018a). Gpytorch: Blackbox matrix-matrix gaussian process inference with gpu acceleration. arXiv preprint arXiv:1809.11165.

  • Gardner J, Pleiss G, Wu R, Weinberger K, Wilson A (2018b) Product kernel interpolation for scalable Gaussian processes. In International Conference on Artificial Intelligence and Statistics (pp. 1407–1416). PMLR.

  • Garriga AG, Mainini L, Ponnusamy SS (2019) A machine learning enabled multi-fidelity platform for the integrated design of aircraft systems. J Mech Des 141(12):121405


  • Gladstone RJ, Nabian MA, Keshavarzzadeh V, Meidani H (2021) Robust topology optimization using variational autoencoders. arXiv preprint arXiv:2107.10661.

  • Goel T, Haftka RT, Shyy W, Queipo NV (2007) Ensemble of Surrogates. Struct Multidisc Optim 33(3):199–216


  • Golub GH, Reinsch C (1971) Singular value decomposition and least squares solutions. In Linear algebra (pp. 134–151). Springer, Berlin, Heidelberg

  • Gomes WJDS (2020) Shallow and deep artificial neural networks for structural reliability analysis. ASME J Risk Uncertainty Part B 6(4):041006


  • Gomes GSDS, Ludermir TB (2013) Optimization of the weights and asymmetric activation function family of neural network for time series forecasting. Expert Syst Appl 40(16):6438–6446


  • Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Bengio Y (2014) Generative adversarial nets. Adv Neur Info Process Syst 27.

  • Gray JS, Hwang JT, Martins JR, Moore KT, Naylor BA (2019) OpenMDAO: An open-source framework for multidisciplinary design, analysis, and optimization. Struct Multidisc Optim 59(4):1075–1104


  • Harman HH (1976) Modern factor analysis. University of Chicago press, USA


  • Harzing AW (2007). Publish or Perish, available from https://harzing.com/resources/publish-or-perish

  • Hasegawa K, Fukami K, Murata T, Fukagata K (2020) CNN-LSTM based reduced order modeling of two-dimensional unsteady flows around a circular cylinder at different Reynolds numbers. Fluid Dyn Res 52(6):065501


  • He L, Qian W, Zhao T, Wang Q (2020a) Multi-fidelity aerodynamic data fusion with a deep neural network modeling method. Entropy 22(9):1022


  • He P, Mader CA, Martins JR, Maki KJ (2020b) Dafoam: an open-source adjoint framework for multidisciplinary design optimization with openfoam. AIAA J 58(3):1304–1319


  • Hou TY, Lam KC, Zhang P, Zhang S (2019) Solving Bayesian inverse problems from the perspective of deep generative networks. Comput Mech 64(2):395–408


  • Jabarullah Khan NK, Elsheikh AH (2019) A machine learning based hybrid multi-fidelity multi-level Monte Carlo method for uncertainty quantification. Front Environ Sci 7:105


  • Janda T, Zemanová A, Hála P, Konrád P, Schmidt J (2020) Reduced order model of glass plate loaded by low-velocity impact. Int J Comput Methods Exp Meas 8(1):36–46


  • Jang S, Kang N (2020) Generative design by reinforcement learning: Maximizing diversity of topology optimized designs. arXiv preprint arXiv:2008.07119.

  • Jiang J, Fan JA (2019) Global optimization of dielectric metasurfaces using a physics-driven neural network. Nano Lett 19(8):5366–5372


  • Jiang J, Fan JA (2020) Simulator-based training of generative neural networks for the inverse design of metasurfaces. Nanophotonics 9(5):1059–1069


  • Jiang X, Wang H, Li Y, Mo K (2020) Machine learning based parameter tuning strategy for MMC based topology optimization. Adv Eng Softw 149:102841


  • Jin SS (2020) Compositional kernel learning using tree-based genetic programming for Gaussian process regression. Struct Multidisc Optim 62:1313–1351


  • Jung J, Yoon K, Lee PS (2020) Deep learned finite elements. Comput Methods Appl Mech Eng 372:113401


  • Kallioras NA, Lagaros ND (2020) DzAIℕ: Deep learning based generative design. Procedia Manufacturing 44:591–598


  • Kallioras NA, Kazakis G, Lagaros ND (2020) Accelerated topology optimization by means of deep learning. Struct Multidisc Optim 62(3):1185–1212


  • Kambampati S, Du Z, Chung H, Kim HA, Jauregui C, Townsend S, Hedges L (2018). OpenLSTO: Open-source software for level set topology optimization. In: 2018 Multidisciplinary Analysis and Optimization Conference (p. 3882).

  • Kaplan EM, Acar E, Bülent Özer M (2021) Development of a method for maximum structural response prediction of a store externally carried by a jet fighter. Proc Inst Mech Eng Part G: J Aerosp Eng 09544100211022244.

  • Karlik B, Olgac AV (2011) Performance analysis of various activation functions in generalized MLP architectures of neural networks. Int J Artif Intel Exp Sys 1(4):111–122


  • Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L (2021) Physics-informed machine learning. Nature Reviews. Physics 3(6):422–440


  • Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Liu TY (2017) Lightgbm: A highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 30:3146–3154


  • Keshavarzzadeh V, Alirezaei M, Tasdizen T, Kirby RM (2021) Image-based multiresolution topology optimization using deep disjunctive normal shape model. Comput Aided Des 130:102947


  • Khan S, Gunpinar E, Moriguchi M, Suzuki H (2019a) Evolving a psycho-physical distance metric for generative design exploration of diverse shapes. J Mech Des 141(11):111101


  • Khan S, Gunpinar E, Sener B (2019b) GenYacht: An interactive generative design system for computer-aided yacht hull design. Ocean Eng 191:106462


  • Khatouri H, Benamara T, Breitkopf P, Demange J, Feliot P (2020) Constrained multi-fidelity surrogate framework using Bayesian optimization with non-intrusive reduced-order basis. Adv Model Simul Eng Sci 7(1):1–20


  • Kim SH, Boukouvala F (2020) Machine learning-based surrogate modeling for data-driven optimization: A comparison of subset selection for regression techniques. Optim Lett 14(4):989–1010


  • Kingma DP, Welling M (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

  • Kochkov D, Smith JA, Alieva A, Wang Q, Brenner MP, Hoyer S (2021) Machine learning–accelerated computational fluid dynamics. Proc Nat Acad Sci 118(21):e2101784118


  • Kollmann HT, Abueidda DW, Koric S, Guleryuz E, Sobh NA (2020) Deep learning for topology optimization of 2D metamaterials. Mater Des 196:109098


  • Kou J, Zhang W (2019) A hybrid reduced-order framework for complex aeroelastic simulations. Aerosp Sci Technol 84:880–894


  • Kumar M, Yadav N (2011) Multilayer perceptrons and radial basis function neural network methods for the solution of differential equations: A survey. Comput Math Appl 62(10):3796–3811


  • Lafage R, Defoort S, Lefebvre T (2019) WhatsOpt: a web application for multidisciplinary design analysis and optimization. In AIAA Aviation 2019 Forum (p. 2990).

  • Lee J, Jeong H, Kang S (2008) Derivative and GA-based methods in metamodeling of back-propagation neural networks for constrained approximate optimization. Struct Multidisc Optim 35(1):29–40


  • Lee XY, Balu A, Stoecklein D, Ganapathysubramanian B, Sarkar S (2019) A case study of deep reinforcement learning for engineering design: Application to microfluidic devices for flow sculpting. J Mech Des 141(11):111401


  • Lee S, Kim H, Lieu QX, Lee J (2020) CNN-based image recognition for topology optimization. Knowl-Based Syst 198:105887


  • Lee YO, Jo J, Hwang J (2017). Application of deep neural network and generative adversarial network to industrial maintenance: A case study of induction motor fault detection. In: 2017 IEEE international conference on big data (big data) (pp. 3248–3253). IEEE

  • Lee M, Park Y, Jo H, Kim K, Lee S, Lee I (2022) Deep generative tread pattern design framework for efficient conceptual design. J Mech Des 1–28.

  • Lei X, Liu C, Du Z, Zhang W, Guo X (2019) Machine learning-driven real-time topology optimization under moving morphable component-based framework. J Appl Mech 86(1):011004


  • Li Y, Mei F (2021) Deep learning-based method coupled with small sample learning for solving partial differential equations. Mult Tools Appl 80(11):17391–17413


  • Li M, Wang Z (2021) An LSTM-based ensemble learning approach for time-dependent reliability analysis. J Mech Des 143(3):031702


  • Li B, Huang C, Li X, Zheng S, Hong J (2019) Non-iterative structural topology optimization using deep learning. Comput Aided Des 115:172–180


  • Li S, Xing W, Kirby R, Zhe S (2020) Multi-fidelity Bayesian optimization via deep neural networks. Adv Neural Info Proc Syst 33.

  • Liao H, Zhang W, Dong X, Poczos B, Shimada K, Burak Kara L (2020) A deep reinforcement learning approach for global routing. J Mech Des 142(6):061701


  • Lin Q, Hong J, Liu Z, Li B, Wang J (2018) Investigation into the topology optimization for conductive heat transfer based on deep learning approach. Int Commun Heat Mass Transfer 97:103–109


  • Lin Q, Liu Z, Hong J (2019) Method for directly and instantaneously predicting conductive heat transfer topologies by using supervised deep learning. Int Commun Heat Mass Transfer 109:104368


  • Liu K, Tovar A, Nutwell E, Detwiler D (2015) Thin-walled compliant mechanism component design assisted by machine learning and multiple surrogates.

  • Liu D, Wang Y (2019) Multi-fidelity physics-constrained neural network and its application in materials modeling. J Mech Des 141(12):121403


  • Lye KO, Mishra S, Ray D, Chandrashekar P (2021) Iterative surrogate model optimization (ISMO): An active learning algorithm for PDE constrained optimization with deep neural networks. Comput Methods Appl Mech Eng 374:113575


  • Lynch ME, Sarkar S, Maute K (2019) Machine learning to aid tuning of numerical parameters in topology optimization. J Mech Des 141(11):114502


  • Ma SB, Kim S, Kim JH (2020) Optimization design of a two-vane pump for wastewater treatment using machine-learning-based surrogate modeling. Processes 8(9):1170


  • McCulloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5(4):115–133


  • McFall KS (2013) Automated design parameter selection for neural networks solving coupled partial differential equations with discontinuities. J Franklin Inst 350(2):300–317


  • Meng X, Karniadakis GE (2020) A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems. J Comput Phys 401:109020


  • Minisci E, Vasile M (2013) Robust design of a reentry unmanned space vehicle by multifidelity evolution control. AIAA J 51(6):1284–1295


  • Mondal S (2020) Probabilistic machine learning for advanced engineering design optimization and diagnostics, PhD dissertation, Penn State University.

  • Montgomery DC, Peck EA, Vining GG (2021) Introduction to linear regression analysis. John Wiley & Sons, USA


  • Motamed M (2020) A multi-fidelity neural network surrogate sampling method for uncertainty quantification. Int J Uncertain Quantif 10(4).

  • Mozaffar M, Bostanabad R, Chen W, Ehmann K, Cao J, Bessa MA (2019) Deep learning predicts path-dependent plasticity. Proc Natl Acad Sci 116(52):26414–26420


  • Müller J, Park J, Sahu R, Varadharajan C, Arora B, Faybishenko B, Agarwal D (2021) Surrogate optimization of deep neural networks for groundwater predictions. J Global Optim 81(1):203–231


  • Nagarajan HP, Mokhtarian H, Jafarian H, Dimassi S, Bakrani-Balani S, Hamedi A, Haapala KR (2019) Knowledge-based design of artificial neural network topology for additive manufacturing process modeling: A new approach and case study for fused deposition modeling. J Mech Des 141(2):021705


  • Nakamura K, Suzuki Y (2020) Deep learning-based topological optimization for representing a user-specified design area. arXiv preprint arXiv:2004.05461.

  • Napier N, Sriraman SA, Tran HT, James KA (2020) An artificial neural network approach for generating high-resolution designs from low-resolution input in topology optimization. J Mech Des 142(1):011402


  • Naranjo-Pérez J, Infantes M, Jiménez-Alonso JF, Sáez A (2020) A collaborative machine learning-optimization algorithm to improve the finite element model updating of civil engineering structures. Eng Struct 225:111327


  • Nie Z, Lin T, Jiang H, Kara LB (2021) Topologygan: Topology optimization using generative adversarial networks based on physical fields over the initial domain. J Mech Des 143(3):031715


  • Ning C, You F (2018) Data-driven stochastic robust optimization: General computational framework and algorithm leveraging machine learning for optimization under uncertainty in the big data era. Comput Chem Eng 111:115–133


  • Nobari AH, Rashad MF, Ahmed F (2021) Creativegan: Editing generative adversarial networks for creative design synthesis. arXiv preprint arXiv:2103.06242.

  • Odonkor P, Lewis K (2019) Data-driven design of control strategies for distributed energy systems. J Mech Des 141(11):111404


  • Oh S, Jung Y, Lee I, Kang N (2018) Design automation by integrating generative adversarial networks and topology optimization. In International Design Engineering Technical Conferences and Computers and Information in Engineering Conference (Vol. 51753, p. V02AT03A008). American Society of Mechanical Engineers.

  • Oh S, Jung Y, Kim S, Lee I, Kang N (2019) Deep generative design: Integration of topology optimization and generative models. J Mech Des 141(11).

  • Owoyele O, Pal P, Vidal Torreira A, Probst D, Shaxted M, Wilde M, Senecal PK (2021) An automated machine learning-genetic algorithm (AutoML-GA) approach for efficient simulation-driven engine design optimization. arXiv e-prints, arXiv-2101

  • Panchal JH, Fuge M, Liu Y, Missoum S, Tucker C (2019) Machine learning for engineering design. J Mech Des 141(11)

  • Pánek D, Orosz T, Karban P (2020) Artap: Robust design optimization framework for engineering applications. arXiv preprint arXiv:1912.11550

  • Parsonage B, Maddock CA (2020) Multi-stage multi-fidelity information correction for artificial neural network based meta-modelling. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 950–957). IEEE

  • Patel J, Choi SK (2012) Classification approach for reliability-based topology optimization using probabilistic neural networks. Struct Multidisc Optim 45(4):529–543


  • Pawar S, Rahman SM, Vaddireddy H, San O, Rasheed A, Vedula P (2019) A deep learning enabler for nonintrusive reduced order modeling of fluid flows. Phys Fluids 31(8):085101


  • Peherstorfer B, Willcox K, Gunzburger M (2018) Survey of multifidelity methods in uncertainty propagation, inference, and optimization. SIAM Rev 60(3):550–591


  • Pereira DR, Piteri MA, Souza AN, Papa JP, Adeli H (2020) FEMa: A finite element machine for fast learning. Neural Comput Appl 32(10):6393–6404


  • Perez RE, Jansen PW, Martins JR (2012) pyOpt: a Python-based object-oriented framework for nonlinear constrained optimization. Struct Multidisc Optim 45(1):101–118


  • Perron C, Rajaram D, Mavris DN (2021) Multi-fidelity non-intrusive reduced-order modelling based on manifold alignment. Proce Royal Soc A 477(2253):20210495


  • Pillai AC, Thies PR, Johanning L (2019) Mooring system design optimization using a surrogate assisted multi-objective genetic algorithm. Eng Optim 51(8):1370–1392


  • Popov AA, Mou C, Sandu A, Iliescu T (2021) A multifidelity ensemble Kalman filter with reduced order control variates. SIAM J Sci Comput 43(2):A1134–A1162


  • Puentes L, Raina A, Cagan J, McComb C (2020) Modeling a strategic human engineering design process: Human-inspired heuristic guidance through learned visual design agents. In Proceedings of the Design Society: DESIGN Conference (Vol. 1, pp. 355–364). Cambridge University Press.

  • Qian C, Ye W (2021) Accelerating gradient-based topology optimization design with dual-model artificial neural networks. Struct Multidisc Optim 63(4):1687–1707


  • Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.

  • Raina A, McComb C, Cagan J (2019). Learning to design from humans: Imitating human designers through deep learning. J Mech Des 141(11)

  • Raissi M, Karniadakis GE (2018) Hidden physics models: Machine learning of nonlinear partial differential equations. J Comput Phys 357:125–141


  • Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707


  • Rasmussen CE (2003). Gaussian processes in machine learning. In Summer school on machine learning (pp. 63–71). Springer, Berlin, Heidelberg

  • Rätsch G, Onoda T, Müller KR (2001) Soft Margins for AdaBoost. Mach Learn 42(3):287–320


  • Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536


  • Sasaki H, Igarashi H (2019a) Topology optimization accelerated by deep learning. IEEE Trans Magn 55(6):1–5


  • Sasaki H, Igarashi H (2019b) Topology optimization of IPM motor with aid of deep learning. Int J Appl Electromagnet Mech 59(1):87–96


  • Shi M, Lv L, Sun W, Song X (2020a) A multi-fidelity surrogate model based on support vector regression. Struct Multidisc Optim 61(6):2363–2375


  • Shi X, Qiu T, Wang J, Zhao X, Qu S (2020b) Metasurface inverse design using machine learning approaches. J Phys D Appl Phys 53(27):275105


  • Shu D, Cunningham J, Stump G, Miller SW, Yukish MA, Simpson TW, Tucker CS (2020) 3d design using generative adversarial networks and physics-based validation. J Mech Des 142(7):071701


  • Singh AP, Medida S, Duraisamy K (2017) Machine-learning-augmented predictive modeling of turbulent separated flows over airfoils. AIAA J 55(7):2215–2227


  • Singla M, Ghosh D, Shukla KK (2020) A survey of robust optimization based machine learning with special reference to support vector machines. Int J Mach Learn Cybern 11(7):1359–1385


  • Solanki KN, Acar E, Rais-Rohani M, Horstemeyer MF, Steele WG (2009) Product design optimisation with microstructure-property modelling and associated uncertainties. Int J Des Eng 2(1):47–79


  • Song H, Choi KK, Lee I, Zhao L, Lamb D (2013) Adaptive virtual support vector machine for reliability analysis of high-dimensional problems. Struct Multidisc Optim 47(4):479–491


  • Sosnovik I, Oseledets I (2019) Neural networks for topology optimization. Russ J Numer Anal Math Model 34(4):215–223


  • Strömberg N (2020) Efficient detailed design optimization of topology optimization concepts by using support vector machines and metamodels. Eng Optim 52(7):1136–1148


  • Su G, Peng L, Hu L (2017) A Gaussian process-based dynamic surrogate model for complex engineering structural reliability analysis. Struct Saf 68:97–109


  • Sun H, Ma L (2020) Generative design by using exploration approaches of reinforcement learning in density-based structural topology optimization. Designs 4(2):10


  • Sun G, Wang S (2019) A review of the artificial neural network surrogate modeling in aerodynamic design. Proc Inst Mech Eng, Part G: J Aeros Eng 233(16):5863–5872


  • Sutton RS, Barto AG (2018) Reinforcement learning: An introduction. MIT press.

  • Tan RK, Zhang NL, Ye W (2020) A deep learning-based method for the design of microstructural materials. Struct Multidisc Optim 61(4):1417–1438


  • Tao J, Sun G (2019) Application of deep learning based multi-fidelity surrogate model to robust aerodynamic design optimization. Aerosp Sci Technol 92:722–737


  • Tenne Y (2019). Enhancing simulation-driven optimization by machine-learning. Int J Model Optim 9(4)

  • Thole SP, Ramu P (2020) Design space exploration and optimization using self-organizing maps. Struct Multidisc Optim 62(3):1071–1088


  • Trehan S, Carlberg KT, Durlofsky LJ (2017) Error modeling for surrogates of dynamical systems using machine learning. Int J Numer Meth Eng 112(12):1801–1827


  • Trinchero R, Larbi M, Torun HM, Canavero FG, Swaminathan M (2018) Machine learning and uncertainty quantification for surrogate models of integrated devices with a large number of parameters. IEEE Access 7:4056–4066


  • Tripathy RK, Bilionis I (2018) Deep UQ: Learning deep neural network surrogate models for high dimensional uncertainty quantification. J Comput Phys 375:565–588


  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Polosukhin I (2017) Attention is all you need. In Advances in neural information processing systems (pp. 5998–6008)

  • Wang C, Yao S, Wang Z, Hu J (2021a) Deep super-resolution neural network for structural topology optimization. Eng Optim 53(12):2108–2121


  • Wang D, Xie C, Wang S (2021c) An adaptive RBF neural network–based multi-objective optimization method for lightweight and crashworthiness design of cab floor rails using fuzzy subtractive clustering algorithm. Struct Multidisc Optim 63(2):915–928


  • Wang L, van Beek A, Da D, Chan YC, Zhu P, Chen W (2022) Data-driven multiscale design of cellular composites with multiclass microstructures for natural frequency maximization. Compos Struct 280:114949


  • Wang F, Song M, Edelen A, Huang X (2019) Machine learning for design optimization of storage ring nonlinear dynamics. arXiv preprint arXiv:1910.14220.

  • Wang D, Xiang C, Pan Y, Chen A, Zhou X, Zhang Y (2021b) A deep convolutional neural network for topology optimization with perceptible generalization ability. Eng Optim 1–16

  • Wiener N (1938) The homogeneous chaos. Am J Math 60(4):897–936


  • Williams CK, Rasmussen CE (2006) Gaussian processes for machine learning (Vol. 2, No. 3, p. 4). Cambridge, MA: MIT press.

  • Williams G, Meisel NA, Simpson TW, McComb C (2019) Design repository effectiveness for 3D convolutional neural networks: Application to additive manufacturing. J Mech Des 141(11)

  • Wu X, Kozlowski T, Meidani H (2018) Kriging-based inverse uncertainty quantification of nuclear fuel performance code BISON fission gas release model using time series measurement data. Reliab Eng Syst Saf 169:422–436


  • Wu J (2017) Introduction to convolutional neural networks. National Key Lab for Novel Software Technology. Nanjing University. China 5(23), 495.

  • Wuraola A, Patel N (2018) SQNL: A new computationally efficient activation function. In: 2018 International Joint Conference on Neural Networks (IJCNN) (pp. 1–7). IEEE

  • Xu Y, Gao Y, Wu C, Fang J, Sun G, Steven GP, Li Q (2021) Machine learning based topology optimization of fiber orientation for variable stiffness composite structures. Int J Num Methods Eng

  • Yamasaki S, Yaji K, Fujita K (2021) Data-driven topology design using a deep generative model. Struct Multidisc Optim 1–20.

  • Yan L, Zhou T (2019). An adaptive surrogate modeling based on deep neural networks for large-scale Bayesian inverse problems. arXiv preprint arXiv:1911.08926.

  • Yang Y, Perdikaris P (2019) Conditional deep surrogate models for stochastic, high-dimensional, and multi-fidelity systems. Comput Mech 64(2):417–434


  • Yao H, Gao Y, Liu Y (2020) FEA-Net: A physics-guided data-driven model for efficient mechanical response prediction. Comput Methods Appl Mech Eng 363:112892


  • Yonekura K, Suzuki K (2021) Data-driven design exploration method using conditional variational autoencoder for airfoil design. Struct Multidisc Optim 1–12.

  • Yonekura K, Hattori H (2019) Framework for design optimization using deep reinforcement learning. Struct Multidisc Optim 60(4):1709–1713


  • Yu Y, Hur T, Jung J, Jang IG (2019) Deep learning for determining a near-optimal topological design without any iteration. Struct Multidisc Optim 59(3):787–799


  • Yuan C, Moghaddam M (2020) Attribute-aware generative design with generative adversarial networks. IEEE Access 8:190710–190721


  • Zhang Y, Ye W (2019) Deep learning-based inverse method for layout design. Struct Multidisc Optim 60(2):527–536


  • Zhang J, Zhao X (2021) Machine-learning-based surrogate modeling of aerodynamic flow around distributed structures. AIAA J 59(3):868–879


  • Zhang X, Xie F, Ji T, Zhu Z, Zheng Y (2021a) Multi-fidelity deep neural network surrogate model for aerodynamic shape optimization. Comput Methods Appl Mech Eng 373:113485


  • Zhang Z, Li Y, Zhou W, Chen X, Yao W, Zhao Y (2021b) TONR: An exploration for a novel way combining neural network with topology optimization. Comput Method Appl Mech Eng 386:114083



Author information


Corresponding author

Correspondence to Ikjin Lee.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Replication of results

In this review paper, we do not provide any results to replicate.

Additional information

Responsible Editor: Byeng D. Youn

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix. ML methods widely used in the context of structural and multidisciplinary optimization

ML algorithms can be categorized into four groups: (1) classification, (2) regression, (3) clustering, and (4) dimension reduction, as shown in Fig. 6. Classification and regression are both supervised learning algorithms, where the main idea is to generate a prediction model. If the predicted response is discrete, it is a classification problem, whereas if the response is continuous, it is a regression problem; therefore, the ML algorithms used for classification and regression are, in general, very similar. The most commonly used classical ML algorithms for classification problems include logistic regression (Cox 1958), k-nearest neighbors (Fix and Hodges 1989), support vector machines (SVM) (Cortes and Vapnik 1995), kernel SVM, naive Bayes, decision tree classification, and random forest classification. The most commonly used classical ML algorithms for regression problems include simple linear regression, multiple linear regression, polynomial regression, Kriging, support vector regression (SVR), decision tree regression, and random forest regression.
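The distinction can be made concrete in a few lines of code. The following minimal sketch (scikit-learn and synthetic data are assumptions for illustration; the survey prescribes no particular library) fits a classifier to a discrete response and a regressor to a continuous response over the same inputs:

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                   # two input features
y_cont = X @ np.array([1.5, -0.7]) + 0.1 * rng.normal(size=200)
y_disc = (y_cont > 0).astype(int)               # thresholding makes the response discrete

reg = LinearRegression().fit(X, y_cont)         # regression: continuous response
clf = LogisticRegression().fit(X, y_disc)       # classification: discrete response
print(reg.predict(X[:3]), clf.predict(X[:3]))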

Fig. 6: Categories of ML problems

Clustering is similar to classification in that both are used for grouping data. The main difference is that classification categorizes labeled data, whereas clustering detects patterns within an unlabeled data set; classification is therefore a supervised learning algorithm, whereas clustering is an unsupervised one. The most commonly used classical ML algorithms for clustering problems include k-means, mean-shift clustering, Gaussian mixture models, density-based spatial clustering, and hierarchical agglomerative clustering.
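A minimal k-means sketch follows (scikit-learn and a synthetic two-cluster data set are assumptions for illustration); it recovers the two groups without using any labels:

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),   # group near the origin
               rng.normal(3.0, 0.5, (50, 2))])  # group near (3, 3)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(labels[:5], labels[-5:])                  # the two groups, found without labels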

Dimension reduction aims to reduce the number of input variables in a dataset, thereby guarding against the curse of dimensionality, which makes algorithms increasingly difficult to run as the dimension of the data grows. Data are transformed from a high-dimensional space into a lower-dimensional space that preserves similar information. Dimension reduction methods can be further categorized into linear and nonlinear methods. The most commonly used linear algorithms for dimension reduction include principal component analysis (Wiener 1938), factor analysis (Harman 1976), linear discriminant analysis (Fisher 1936), and singular value decomposition (Golub and Reinsch 1971). The nonlinear algorithms include kernel principal component analysis, isometric mapping, and t-distributed stochastic neighbor embedding (t-SNE); a minimal sketch of the linear case follows below.
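As a concrete illustration of the linear case, the following sketch (numpy only; the synthetic data and the choice of two components are illustrative assumptions) projects 10-dimensional data onto its two leading principal components via the singular value decomposition:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                # 100 samples in 10 dimensions
Xc = X - X.mean(axis=0)                       # center the data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                             # project onto the 2 leading components
print(Z.shape)                                # (100, 2)

Among the ML methods listed in Fig. 6, we briefly explain those that are widely used in the context of structural and multidisciplinary optimization in the following subsections.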

A.1 Linear regression

Linear regression (Montgomery et al. 2021) models the relationship between a response (dependent) variable y and one or more independent variables x. If there is only one independent variable, it is called simple linear regression. The fundamental idea in linear regression is to find the coefficients of the basis functions that best model the data. Ordinary least squares (OLS) is the most common method used to estimate the unknown coefficients from the given data. Nonlinearity in the function is modeled using more complex basis functions while the model remains linear in the coefficients:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon = \mathbf{x}^{\mathrm{T}} \boldsymbol{\beta} + \varepsilon$$
(A1)

where \(\beta_0, \beta_1, \ldots, \beta_n\) are the unknown coefficients and \(\varepsilon\) is the error term.
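A minimal sketch of estimating \(\boldsymbol{\beta}\) in Eq. (A1) by OLS follows (numpy only; the synthetic data and "true" coefficients are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X1 = np.hstack([np.ones((50, 1)), X])           # prepend a column of ones for beta_0
beta_true = np.array([2.0, 1.0, -0.5, 0.3])     # illustrative "true" coefficients
y = X1 @ beta_true + 0.05 * rng.normal(size=50)
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)  # OLS estimate of beta
print(beta_hat)                                 # close to beta_true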

A.2 Gaussian process (GP)

GP (Rasmussen 2003), also known as Kriging when the mean of the GP is zero, is a stochastic approach that finds wide use in regression, classification, and unsupervised learning. It is usually applied in the linear regression framework with the Gaussian kernel as the basis function, and it is also a preferred approach for inference on functions. A GP is a generalization of the Gaussian probability distribution in which every finite collection of random variables has a multivariate Gaussian distribution; it is a distribution over functions with a continuous domain such as time or space. Since a GP provides model predictions as well as prediction error estimates, even when the simulation is deterministic, it is sought after as a surrogate in the design and analysis of expensive computer experiments, and since GP metamodels can fit complicated surfaces well, they are suited for building accurate global metamodels. A GP is completely specified by its mean function \(m(\mathbf{x})\) and covariance function \(k(\mathbf{x}, \mathbf{x}')\) as

$$f\left(\mathbf{x}\right) \sim GP\left(m\left(\mathbf{x}\right), k\left(\mathbf{x}, \mathbf{x}'\right)\right)$$
(A2)

GP can be extended to multiple outputs by using multiple means and covariances. It permits easy interpolation of data and has a built-in mechanism to account for noise. Furthermore, a GP can quantify the uncertainty of its predictions and has conditional distributions that allow adaptive sampling or Bayesian studies. However, because GP models evaluated on a set of n points lead to multivariate normal distributions, and computing the inverse and determinant of the n × n covariance matrix takes \(O(n^3)\) time, using GP with large-scale datasets is a challenge. Recently, approaches such as fast matrix–vector multiplication (Gardner et al. 2018a, b; Dong et al. 2017) and sparse GP (Cutajar et al. 2016) have been developed to reduce the amount of computation when the dataset exceeds about 100,000 points.
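The conditioning behind these predictions can be written in a few lines. The following minimal sketch (numpy only; the one-dimensional data, Gaussian kernel length scale, and jitter value are illustrative assumptions) computes the GP posterior mean and variance at two test points, with the Cholesky factorization of the covariance matrix as the \(O(n^3)\) step discussed above:

import numpy as np

def rbf(a, b, ell=0.2):
    # Gaussian (RBF) kernel between two 1-D sets of points
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

x = np.linspace(0.0, 1.0, 20)               # training inputs
y = np.sin(2.0 * np.pi * x)                 # training outputs
xs = np.array([0.33, 0.66])                 # test inputs

K = rbf(x, x) + 1e-6 * np.eye(len(x))       # covariance plus jitter for stability
L = np.linalg.cholesky(K)                   # the O(n^3) step
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
Ks = rbf(x, xs)
mean = Ks.T @ alpha                         # posterior (predicted) mean
v = np.linalg.solve(L, Ks)
var = rbf(xs, xs).diagonal() - np.sum(v * v, axis=0)  # prediction error estimate
print(mean, var)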

A.3 Artificial neural network (ANN)

In the 1940s, McCulloch and Pitts (1943) formulated the first NN model. Since its inception, the NN has attracted interest among researchers and found applications in various domains; as a result, better algorithms and more powerful networks have been developed. ANN refers to a biologically inspired sub-domain of artificial intelligence (AI) modeled on the network of the brain. Akin to the human brain, ANNs have neurons (called nodes) that are connected to each other in different layers of the network, as shown in Fig. 7. The basic idea of an ANN is that an input vector x is weighted by w and, along with a bias b, subjected to an activation function f that is linear or nonlinear to produce the output y as

$$y = f\left(\mathbf{w}^{\mathrm{T}} \mathbf{x} + \mathbf{b}\right)$$
(A3)

The weights in Eq. (A3) are optimized during training until the network reaches a specified level of accuracy. Depending on the application, many activation functions are used in ANNs, namely sigmoid, hyperbolic tangent, rectified linear unit (ReLU), Heaviside, signum, and softmax functions (Karlik and Olgac 2011). Researchers have also developed application-specific activation functions (Wuraola and Patel 2018; Gomes and Ludermir 2013). Since ANNs deal with multidimensional data, scaling approaches such as StandardScaler, RobustScaler, MinMaxScaler, and Normalizer can be used for data preprocessing and can prevent activations from converging to zero or diverging to infinity during the learning process.
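As a minimal illustration of Eq. (A3) (numpy only; the weights, biases, and the sigmoid activation are arbitrary choices for this sketch), one layer of two neurons maps a three-dimensional input to a two-dimensional output:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])                # input vector
W = np.array([[0.1, 0.4, -0.2],               # one weight row per neuron
              [0.7, -0.3, 0.5]])
b = np.array([0.0, 0.1])                      # biases
y = sigmoid(W @ x + b)                        # Eq. (A3) with f = sigmoid
print(y)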

Fig. 7: Simple architecture of ANN

ANNs are broadly classified into two categories: feed-forward NNs and feed-backward NNs. In a feed-forward NN, the information passes only in the forward direction, i.e., from the input layer to the hidden layer (if any) and then to the output. Single-layer perceptrons, multi-layer perceptrons, and radial basis function networks are examples of feed-forward NNs. In a feed-backward NN, the inputs are fed in the forward direction and the errors are computed and propagated in the reverse (hence the terminology "back") direction to the previous layers, so as to reduce the error in the cost function by readjusting the weights; examples include Bayesian regularized NNs and Kohonen's self-organizing maps. The loss function is computed as the difference between the prediction and the target after each feed-forward pass, and in the backpropagation process the optimizer iteratively adjusts parameters such as weights and biases to minimize the loss function, as sketched below.
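A minimal sketch of this training cycle follows (numpy only; the one-hidden-layer network, squared loss, learning rate, and synthetic data are illustrative assumptions, and the factor of 2 in the loss gradient is absorbed into the learning rate):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                   # synthetic inputs
y = np.sin(X.sum(axis=1, keepdims=True))       # synthetic targets
W1, b1 = 0.5 * rng.normal(size=(3, 8)), np.zeros(8)
W2, b2 = 0.5 * rng.normal(size=(8, 1)), np.zeros(1)

for _ in range(500):
    h = np.tanh(X @ W1 + b1)                   # feed-forward pass
    pred = h @ W2 + b2
    err = pred - y                             # loss gradient (factor of 2 absorbed)
    gW2 = h.T @ err / len(X)
    gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)         # error propagated backward
    gW1 = X.T @ dh / len(X)
    gb1 = dh.mean(axis=0)
    for p, g in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        p -= 0.1 * g                           # readjust weights and biases
print(float((err ** 2).mean()))                # final mean-squared loss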

ANNs can be used for both regression and classification problems, which are predictive modeling techniques. In the context of classification, since an ANN works by splitting the problem into layered networks of simpler elements, ANNs are reliable when the tasks involve many features. The most attractive feature of ANNs is that they can map any number of inputs to any number of outputs; once trained, predictions are fast and cheap.

NNs are typically black-box approaches; that is, one might not be able to capture the influence of the independent variables on the dependent variables. Overfitting is a fundamental challenge of ANNs, as they depend predominantly on the training data. With traditional CPUs, ANNs were expensive in terms of the computational time needed to train the network, but the advent of cloud computing and increased computing power has relieved this burden. However, as researchers focus on more complex problems and use more layers trained on larger data sets, computational times grow again across multiple training iterations.

A.4 Deep neural network (DNN)

A DNN is created when NNs are stacked one after another. The primary difference between a conventional NN and a DNN is that the former has one or two hidden layers while the latter has several, as shown in Fig. 8. Each circle in the figure calculates a weighted sum of the input vector plus a bias, after which a nonlinear function is applied to obtain the output. DNNs can handle functions with limited regularity and are powerful for high-dimensional problems. The basic idea of a DNN with n hidden layers is to approximate a function using a nonlinear activation function (Emmert-Streib et al. 2020), as represented in

$$\mathrm{DNN}\left(\mathbf{x}\right) = \mathbf{w}^{(n)} \mathbf{x}^{(n)} + \mathbf{b}^{(n)}$$
(A4)

and

$$\mathbf{x}^{(k+1)} = \sigma\left(\mathbf{w}^{(k)} \mathbf{x}^{(k)} + \mathbf{b}^{(k)}\right), \quad k = 0, 1, \ldots, n-1$$
(A5)

where \(\mathbf{w}\) and \(\mathbf{b}\) are the weights and biases of the network and \(\sigma\) is the activation function. A DNN has more complex connections between layers than a network with one or two hidden layers and has an automatic feature-extraction capability. Therefore, when larger training data sets are used, a DNN can provide increasingly accurate predictions, whereas the accuracy of classical ML algorithms remains fairly constant.
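A minimal sketch of Eqs. (A4) and (A5) follows (numpy only; the layer sizes, random weights, and ReLU as the choice of \(\sigma\) are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 16, 1]                    # input, three hidden layers, output
Ws = [0.3 * rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]

x = rng.normal(size=sizes[0])                 # x^(0): the network input
for W, b in zip(Ws[:-1], bs[:-1]):
    x = np.maximum(0.0, W @ x + b)            # Eq. (A5) with sigma = ReLU
y = Ws[-1] @ x + bs[-1]                       # Eq. (A4): affine output layer
print(y)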

Fig. 8: Architecture of DNN

There are three major classes of DNNs, namely supervised, semi-supervised, and unsupervised DNNs. Examples of supervised learning algorithms include deep feed-forward networks (DFNNs) and CNNs. Restricted Boltzmann machines, autoencoders (AEs), GANs, and long short-term memory networks (LSTMs) are examples of unsupervised learning algorithms. Recurrent neural networks (RNN) are an example of semi-supervised learning techniques.

For complex problems such as image classification, natural language processing, and speech recognition, DNNs are more useful than shallow networks. DNNs typically outperform other approaches when the data is large, their architectures are flexible enough to adapt to new problems, and they can work with any data type. Getting trapped in local minima, vanishing gradients, and overfitting are some of the challenges associated with training DNNs, which requires large data.

A.5 Convolutional neural network (CNN)

One of the most widely used DNNs is the CNN (Fukushima 1988). While ANNs are inspired by the human brain, CNNs are inspired by the human visual system and are predominantly applied to image analysis. CNNs consist of two operations, namely convolution and pooling. Unlike in ANNs, in CNNs the neurons in one layer are connected only to nearby neurons in the next layer, which leads to a significant reduction in the number of parameters in the network. A typical CNN consists of an input, an output, and multiple hidden layers comprising a series of convolutional layers (filters or convolution kernels), as shown in Fig. 9. ReLU is the typical activation function used, followed by operations such as pooling layers, fully connected layers, and normalization layers; backpropagation is used for error minimization and weight adjustment. Wu (2017) provides a tutorial on CNNs. Compared to CPU-based architectures, CNNs with GPU-based architectures take less time to train, because the GPU is vastly superior in the computation of the dense algebraic kernels, such as matrix–vector multiplication, of which DL algorithms are mainly composed.
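To make the two operations concrete, the following minimal sketch (numpy only; the random image, single filter, and sizes are illustrative assumptions) applies one 3 × 3 convolution, a ReLU activation, and 2 × 2 max pooling to a small single-channel image:

import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))                 # single-channel "image"
kernel = rng.normal(size=(3, 3))              # one 3x3 convolution filter

conv = np.array([[np.sum(img[i:i + 3, j:j + 3] * kernel)   # valid convolution
                  for j in range(6)] for i in range(6)])
relu = np.maximum(0.0, conv)                  # typical activation
pooled = relu.reshape(3, 2, 3, 2).max(axis=(1, 3))         # 2x2 max pooling
print(pooled.shape)                           # (3, 3) feature map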

Fig. 9: Architecture of CNN

CNNs can easily process high-dimensional inputs such as images. They are good at extracting local information from text and at exploring meaningful semantic and syntactic relationships between phrases and words, and the compositional structure of text data is handled naturally by a CNN's architecture. However, CNNs need large data sets for training and are hence computationally intensive, and encoding the position and orientation of objects is still a challenge for CNNs.

A.6 Reinforcement learning (RL)

RL (Sutton and Barto 2018) is a paradigm of ML in which agents learn by interacting with the environment. Unlike other ML algorithms, RL works by trial-and-error learning and maximizes a reward rather than finding hidden structure. As can be seen in Fig. 10, an RL agent performs an action \(a_t\) while transitioning from a state \(s_t\) to \(s_{t+1}\) and is rewarded \(r_{t+1}\) for that action, and this process is repeated iteratively to maximize the cumulative reward. The probability of transitioning to the new state is expressed by \(P(s_{t+1} \mid s_t, a_t)\). The best sequence of actions that an RL agent can make is called a policy, and the entire set of actions from start to finish that an agent performs is called an episode. Usually, the dynamics of the RL problem are captured by a Markov decision process, as in the sketch below.
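A minimal sketch of tabular Q-learning illustrates these elements (numpy only; the five-state chain environment, the reward at the right end, and the hyperparameters are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2                  # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))         # tabular action-value estimates

for episode in range(300):
    s = 0                                   # every episode starts at the left end
    for _ in range(100):                    # cap the episode length
        # epsilon-greedy action selection (explore with probability 0.3)
        a = rng.integers(n_actions) if rng.random() < 0.3 else int(Q[s].argmax())
        s_next = min(s + 1, 4) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == 4 else 0.0     # reward only at the right end
        # temporal-difference update toward r + gamma * max_a' Q(s', a')
        Q[s, a] += 0.5 * (r + 0.9 * Q[s_next].max() - Q[s, a])
        s = s_next
        if s == 4:                          # goal reached: episode ends
            break

print(Q.argmax(axis=1))                     # greedy policy per state

Under these assumptions, the learned greedy policy moves right in states 0 through 3; state 4 is terminal, so its row is never updated.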

RL usually performs better than other standard learning techniques in solving complex problems. When no training dataset is available, RL can still learn from experience, and it focuses on achieving long-term results that are difficult to accomplish with other techniques. Like other ML techniques, however, RL requires large amounts of data, is computationally expensive, and can be heavily affected by the curse of dimensionality. Combining conventional RL with DL yields deep RL, in which DNNs are used to approximate the rewards and policies that are conventionally represented through state–action pairs. Deep RL can be employed when the state space is complex and very high computational effort would otherwise be required (Fig. 10).
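As an illustration of the reward-driven update at the heart of RL, the following is a minimal tabular Q-learning sketch; the state/action counts and hyperparameters are hypothetical, and the environment supplying \((s_t, a_t, r_{t+1}, s_{t+1})\) transitions is assumed.

```python
import numpy as np

n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))    # tabular action-value estimates
alpha, gamma, eps = 0.1, 0.99, 0.1     # learning rate, discount, exploration

def choose_action(s):
    """Epsilon-greedy policy: explore with probability eps."""
    if np.random.rand() < eps:
        return np.random.randint(n_actions)
    return int(Q[s].argmax())

def q_update(s, a, r, s_next):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))"""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```

Deep RL replaces the table Q with a DNN that maps states (or state–action pairs) to value estimates.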

Fig. 10 Basic setting of RL

A.7 Recurrent neural network (RNN)

RNN (Rumelhart et al. 1986) is one of the common semi-supervised learning algorithms that use sequential (or ordered) data for training; examples of ordered data are DNA sequences, financial data, and time-series data. While making decisions, an RNN uses the current input as well as the history of past inputs that it has learned, stored in its hidden state. Typically, RNNs consist of an input layer, a hidden layer, and an output layer, as shown in Fig. 11. The number of neurons in the hidden layer of an RNN should be between the number of inputs and the number of outputs. The key feature of an RNN is that it makes a copy of the output and feeds it back into the network, so that past information is retained. The knowledge of the network is updated in the hidden state at every time step, and the update can be expressed as

$$h_t = f_w \left( {x_t ,\,h_{t - 1} } \right)$$
(A6)

where \(h_t\) is the new hidden state, \(h_{t-1}\) is the previous hidden state, \(x_t\) is the current input, and \(f_w\) is a fixed function with trainable weights.
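A minimal NumPy sketch of this recurrence is given below, taking \(f_w\) to be the common choice \(\tanh(W_x x_t + W_h h_{t-1} + b)\); the weight shapes and the random sequence are illustrative assumptions.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrence step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

rng = np.random.default_rng(1)
W_x = rng.normal(size=(8, 4))   # input-to-hidden weights (sizes illustrative)
W_h = rng.normal(size=(8, 8))   # hidden-to-hidden weights
b = np.zeros(8)

h = np.zeros(8)                          # initial hidden state
for x_t in rng.normal(size=(5, 4)):      # a sequence of 5 inputs
    h = rnn_step(x_t, h, W_x, W_h, b)    # same weights reused at every step
```

Because the same weights are reused at every step, the parameter count does not grow with the sequence length.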

Fig. 11 RNN with hidden memory state

These algorithms commonly find application in ordinal or temporal problems such as image captioning, speech recognition, and natural language processing. Just as spatial data is processed efficiently by CNNs, RNNs are designed to process sequential data efficiently, and the number of model parameters does not increase as the number of time steps grows. While training an RNN, error gradients are used to update the network weights. Sometimes the error gradients accumulate, resulting in large weight updates (exploding gradients) and an unstable network; conversely, if the weight updates become vanishingly small, one faces the problem of vanishing gradients. These are the two major issues associated with RNNs. To mitigate them, weight initialization methods such as Xavier and He initialization, gradient clipping, and batch normalization are used, or an LSTM or gated recurrent unit (GRU) architecture is adopted. When temporal dependencies in sequences need to be captured, RNNs are among the best choices, although recent developments such as Transformers (Vaswani et al. 2017) can outperform RNNs in such applications.
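As a sketch of one such remedy, the snippet below applies gradient-norm clipping during a single PyTorch training step of a vanilla RNN; the model sizes, random data, and clipping threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
head = nn.Linear(8, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()))

x = torch.randn(16, 5, 4)   # batch of 16 sequences of length 5
y = torch.randn(16, 1)

out, h_n = rnn(x)           # out: hidden states at every time step
loss = nn.functional.mse_loss(head(out[:, -1]), y)
loss.backward()
# Rescale gradients so their global norm stays below 1.0,
# mitigating the exploding-gradient problem
torch.nn.utils.clip_grad_norm_(
    list(rnn.parameters()) + list(head.parameters()), max_norm=1.0)
opt.step()
```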

A.8 Variational autoencoder (VAE)

An autoencoder (AE) is a type of unsupervised learning model that learns from unlabeled data; it has traditionally been used for dimensionality reduction and feature learning, but has recently gained popularity as a generative model that can generate data similar to the training data. Since the VAE (Kingma and Welling 2013) is based on an AE, it consists of two parts: an encoder and a decoder. However, unlike an AE, which represents the latent vector as a deterministic value, a VAE describes the latent vector through a density function; being based on a probabilistic model gives the VAE computational flexibility. The latent vector is used to reconstruct the input image, and VAE training is performed with the goal of reducing the difference between the generated image and the input image, as shown in Fig. 12. The evidence lower bound (ELBO) and the reparameterization trick are used to perform the optimization. The main advantage of a VAE is that the latent-vector information makes it possible to perform other tasks, such as design optimization, in the latent space. However, since the density is not obtained directly, the quality of the generated samples may be somewhat inferior to direct density methods such as PixelRNN or PixelCNN, and the generated images are relatively blurry compared to those of a GAN.
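The reparameterization trick mentioned above can be sketched in a few lines of PyTorch: the latent vector is sampled as \(z = \mu + \sigma\epsilon\) with \(\epsilon \sim \mathcal{N}(0, I)\), so gradients can flow through the encoder outputs; the dimensions below are illustrative.

```python
import torch

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    std = torch.exp(0.5 * log_var)   # sigma from the predicted log-variance
    eps = torch.randn_like(std)
    return mu + std * eps

mu = torch.zeros(4, requires_grad=True)      # encoder outputs (illustrative)
log_var = torch.zeros(4, requires_grad=True)
z = reparameterize(mu, log_var)              # latent vector fed to the decoder

# KL term of the ELBO (closed form for a standard Gaussian prior)
kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
```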

Fig. 12 Architecture of VAE (Asperti et al. 2021)

A.9 Generative adversarial network (GAN)

A GAN (Goodfellow et al. 2014), shown in Fig. 13, trains a model that samples a latent vector from a simple distribution and transforms it into an image, following a game-theoretic approach. The objective function of a GAN consists of the discriminator output for real data and the discriminator output for generated fake data. Because the generator and discriminator train with the goals of minimizing and maximizing this objective function, respectively, the GAN is said to play a minimax game. The story of a counterfeiter (generator) and a police officer (discriminator) is an easy way to understand the concept: when the counterfeiter produces counterfeit currency, the police judge whether it is genuine, and in the process the generator and discriminator evolve competitively to produce ever more authentic-looking counterfeits.
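A minimal PyTorch sketch of this minimax objective follows; the tiny generator and discriminator, the latent dimension, and the stand-in data are all hypothetical.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.Tanh())     # generator: z -> sample
D = nn.Sequential(nn.Linear(32, 1), nn.Sigmoid())  # discriminator: P(real)
bce = nn.BCELoss()

real = torch.randn(16, 32)      # stand-in for a batch of real data
fake = G(torch.randn(16, 8))    # samples generated from latent vectors

# Discriminator: maximize log D(x) + log(1 - D(G(z)))
d_loss = (bce(D(real), torch.ones(16, 1))
          + bce(D(fake.detach()), torch.zeros(16, 1)))
# Generator: fool the discriminator (non-saturating form of the minimax loss)
g_loss = bce(D(fake), torch.ones(16, 1))
```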

Fig. 13 Architecture of GAN (Alom et al. 2019)

GANs are difficult to apply in various fields because of their unstable training; consequently, the deep convolutional GAN (DCGAN) (Radford et al. 2015), which uses a CNN in the generator, was developed. In addition, various models such as the conditional GAN (cGAN), the boundary equilibrium GAN, and the super-resolution GAN have been developed to improve performance and to enable application to new fields. The main advantage of GANs is their ability to create new, novel images.

A.10 Ensemble methods

Ensemble methods are a paradigm of ML techniques that has become popular during the past three decades (Bishop 1995), in which several learners are trained and combined to solve a problem. Boosting, bagging (Bühlmann 2012), and stacking (Džeroski and Ženko 2004) are the most widely used approaches in ensemble methods. AdaBoost (Rätsch et al. 2001), gradient boosting (Friedman 2001), extreme gradient boosting (Chen and Guestrin 2016), and light gradient boosting (Ke et al. 2017) are among the most frequently used boosting algorithms, while the bagging meta-estimator and random forests are the popular bagging algorithms.

Ensemble methods are used to improve the accuracy of a model. Bagging is a variance-reduction technique, whereas the objective of boosting and stacking is to reduce the bias rather than the variance. Ensembles have been shown to serve as insurance against bad predictions, raising a red flag when one of the models consistently performs poorly, especially in regions of interest. Ultimately, all ensemble algorithms attempt to improve model accuracy. Because they combine multiple learners, ensemble methods generalize better than a single learner, which is one of their major advantages.
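The contrast between bagging and boosting can be illustrated with scikit-learn; the synthetic regression data and estimator settings below are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)

# Bagging averages many independently trained learners (variance reduction);
# boosting fits learners sequentially on residual errors (bias reduction)
models = {
    "bagging": BaggingRegressor(n_estimators=50, random_state=0),
    "boosting": GradientBoostingRegressor(n_estimators=50, random_state=0),
}
for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean R^2 = {score:.3f}")
```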
