Neural Network-Based Formula for the Buckling Load Prediction of I-Section Cellular Steel Beams

Cellular beams are an attractive option for the steel construction industry due to their versatility in terms of strength, size, and weight. A further benefit is the integration of services within the beam depth, thereby reducing the ceiling-to-floor depth (and thus the building's height), which has a great economic impact. Moreover, the complex localised and global failures characterising these members have led several scientists to focus their research on the development of more efficient design guidelines. This paper proposes an artificial neural network (ANN)-based formula to estimate the critical elastic buckling load of simply supported cellular beams under uniformly distributed vertical loads. The 3645-point dataset used in the ANN design was obtained from an extensive parametric finite element analysis performed in ABAQUS. The independent variables adopted as ANN inputs are the beam length, opening diameter, web-post width, cross-section height, web thickness, flange width, flange thickness, and the distance between the last opening edge and the end support. The proposed model shows strong potential as an effective design tool: the maximum and average relative errors among the 3645 data points were found to be 3.7% and 0.4%, respectively, whereas the average computing time per data point is smaller than a millisecond on any current personal computer.


Introduction
The use of cellular beams (i.e., perforated beams with circular web openings) in the construction sector has significantly increased over the past decade on account of the distinct advantages they offer. Cellular beams are applicable to long-span structures, where the integration of services such as ventilation ducts and lighting systems within the beam is attained, but also to short spans, where spatial interference between concentrated mechanical devices and structural elements may require a compromise solution. Cellular beams allow the height of buildings to be reduced while fitting a required number of floors or, alternatively, more floors to be fitted within a given height limit, which has a significant economic impact on the whole structure's budget.
Furthermore, cellular beams offer practical advantages such as the possibility of (i) fixing the ceilings directly to the beams' lower flanges instead of requiring additional suspension elements, and (ii) allowing future equipment addition or replacement through the existing web openings. In fact, with the wider adoption of Building Information Modelling (BIM), knowledge of those expansion possibilities is becoming a valuable asset for building management.
Long-span and lightweight structures also benefit from flexible designs with fewer columns and foundations, and thus from reduced construction time (Tsavdaridis 2010). The increase in beam depth due to the castellation process (i.e., profile cutting manufacturing) also provides greater flexural stiffness, since the final section has a larger section modulus (Morkhade and Gupta 2015). However, the presence of web openings significantly affects the structural behaviour of these members, and several design guidelines (ASCE 23-97, SCI-P100, SCI-P355) have been developed to account for it. In 2011, Lawson and Hicks published the SCI-P355 (Lawson and Hicks 2011) design guidelines, an update to SCI-P068 (Lawson 1987) and SCI-P100 (Ward 1990), which proposed that the Vierendeel bending resistance depends on the classification of the web of the tees. This approach produced acceptable approximations for openings of specific dimensions, the best results being found with an error of 25-30%. It is worth noting that Chung et al. (2003), Verweij (2010) and Morkhade and Gupta (2015) have reported that the current guidelines, specifically SCI-P100 (Ward 1990) and SCI-P355 (Lawson and Hicks 2011), are inadequate, complicated and conservative when it comes to the design of perforated steel beams.
Artificial Neural Networks (ANN) have become a popular method to predict the response of structures. One study presented the use of ANNs in the evaluation of the load-carrying capacity of the web-post of castellated steel beams based on 140 FE models; the computational technique generated predictions with great accuracy when compared to other methods. Sharifi and Tohidi (2014) also illustrated the application of ANNs to accurately estimate the elastic buckling load capacity of steel bridge girders that have rectangular openings at the bottom zone of the web, which is considered the worst possible location to place an opening with respect to lateral torsional buckling. Their ANN formula was derived from 21 FE models and managed to accurately predict the elastic buckling load. In 2014, Tohidi and Sharifi demonstrated the versatility of ANNs by studying the buckling capacity of steel beams with rectangular web openings that have experienced corrosion in the web. In addition, Tohidi and Sharifi (2015) developed an ANN model to estimate the bearing capacity of steel girders with corrosion at the bearing region. The ANN empirical formulas obtained were reported to be accurate in predicting the residual capacity of deteriorated steel beams.
The current study was motivated by the lack of rational (simple, efficient and accurate) design procedures relating to the buckling response of cellular beams. This paper proposes an ANN-based formula to estimate the critical elastic buckling load of simply supported cellular beams under uniformly distributed vertical loads, as a function of eight independent geometrical parameters. This research is the first step of an ongoing investigation that aims to propose a novel and simple analytical design method to accurately compute the inelastic resistance of cellular steel beams. A FE-based dataset comprising 3645 points was generated for this study, in order to give the ANN model a significant generalisation ability and make it a powerful tool for structural engineers and scientists to (i) estimate the elastic buckling load of cellular steel beams, and (ii) efficiently perform sensitivity analyses to further assess the behaviour of those members.

FE Modelling
Three-dimensional FE models were developed using ABAQUS (Dassault Systèmes Simulia Corp. 2017) and parametrised to generate 3645 simulations. Typical values for the modulus of elasticity and Poisson's ratio were adopted (E = 210 GPa, ν = 0.3). All models are simply supported: one end allows in-plane rotations but not translations, whereas the other admits translations along the beam axis in addition to in-plane rotations. End twisting rotations were prevented by restraining both the top and bottom flange tips against out-of-plane displacements at the supports. A unitary load was applied to the top flange as a uniformly distributed pressure (then converted to a line load for ANN simulation purposes; see Tab. 1). The FE mesh adopted was quad-dominated, using shell elements of type S8R, and was tested against the experimental work conducted by Tsavdaridis and D'Mello (2011) and Surtees and Liu (1995), providing accurate and reliable results (Rajana 2018). The mesh sizes recommended by El-Sawy et al. (2014) for web and flanges were adopted. Fig. 1(a) illustrates the various parameters considered in the parametric analysis, whereas Fig. 1(b) illustrates one application of this type of structural member.

Parametric Analysis
The parametric models were submitted to the ABAQUS Lanczos eigensolver using Python scripts. Tab. 1 presents the possible values taken by each independent (parametric) variable (see Fig. 1(a)) considered in the FEA. The 'first' web opening was placed at the centre of the beam, whereas the remaining ones were offset from the former until (for a fixed beam length, opening diameter, and web-post width) no more circular openings could fit within the member's length.

Introduction
Machine learning, one of the six disciplines of Artificial Intelligence (AI) without which the task of having machines acting humanly could not be accomplished, allows us to 'teach' computers how to perform tasks by providing examples of how they should be done (Hertzmann and Fleet 2012). When there is abundant data (also called examples or patterns) explaining a certain phenomenon, but the underlying theory is poor, machine learning can be a perfect tool. The world is quietly being reshaped by machine learning, with the Artificial Neural Network (also referred to in this manuscript as ANN or neural net) being its (i) oldest (McCulloch and Pitts 1943) and (ii) most powerful (Hern 2016) technique. ANNs also lead the number of practical applications, virtually covering any field of knowledge (Wilamowski and Irwin 2011, Prieto et al. 2016). In its most general form, an ANN is a mathematical model designed to perform a particular task, based on the way the human brain processes information, i.e. with the help of its processing units (the neurons). ANNs have been employed to perform several types of real-world basic tasks. Concerning functional approximation, ANN-based solutions are frequently more accurate than those provided by traditional approaches, such as multi-variate nonlinear regression, besides not requiring a good knowledge of the function shape being modelled (Flood 2008).
The general ANN structure consists of several nodes disposed in L vertical layers (input layer, hidden layers, and output layer) and connected among them, as depicted in Fig. 2. Associated with each node in layers 2 to L, also called a neuron, is a linear or nonlinear transfer (also called activation) function, which receives the so-called net input and transmits an output (see Fig. 5). All ANNs implemented in this work are of the feedforward type, since data presented in the input layer flows in the forward direction only, i.e. every node only connects to nodes belonging to layers located to the right-hand side of its own layer, as shown in Fig. 2. The computing power of ANNs makes them suitable to efficiently solve small- to large-scale complex problems, which can be attributed to their (i) massively parallel distributed structure and (ii) ability to learn and generalise, i.e., to produce reasonably accurate outputs for inputs not used during the learning (also called training) phase.

Learning
Each connection between two nodes is associated with a synaptic weight (a real value), which, together with each neuron's bias (also a real value), constitute the most common types of neural net unknown parameters to be determined through learning. Learning is nothing else than determining the network's unknown parameters through some algorithm in order to minimise the network's performance measure, typically a function of the difference between predicted and target (desired) outputs. When ANN learning has an iterative nature, it consists of three phases: training, validation and testing. The resulting model should not be applied outside the input variable ranges used for network training. Since there are no objective rules dictating which method per ANN feature guarantees the best performance for a specific problem, a parametric analysis was carried out, yielding performance results for each designed net and thus allowing the selection of the best ANN according to a certain criterion. The best network in each parametric SA is the one exhibiting the smallest average relative error (called performance) for all learning data.
It is worth highlighting that, in this manuscript, whenever a vector is added to a matrix, it means the former is to be added to all columns of the latter (valid in MATLAB).

Dimensional Analysis (feature 2)
The most widely used form of dimensional analysis is Buckingham's π-theorem, which was implemented in this work as described in Bhaskar and Nigam (1990).

Input Dimensionality Reduction (feature 3)
When designing any ANN, it is crucial for its accuracy that the input variables are independent and relevant to the problem (Kasun et al. 2016). There are two types of dimensionality reduction, namely (i) feature selection (a subset of the original set of input variables is used), and (ii) feature extraction (transformation of the initial variables into a smaller set). In this work, dimensionality reduction is never performed when the number of input variables is less than six. The implemented methods are described next.
Linear Correlation
The linear (Pearson) correlation between any two input variables X and Y reads

$r(X,Y) = \dfrac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$

where (i) Var(X) and Cov(X, Y) are the variance of X and the covariance of X and Y, respectively, and (ii) X̄ and Ȳ are the mean values of each variable. In this work, cases where |r| ≥ 0.99 indicate that one of the variables in the pair must be removed from the ANN modelling. The one to be removed is the one appearing less often in the remaining pairs (X, Y) where |r| ≥ 0.99. Once a variable is selected for removal, all pairs (X, Y) involving it must be disregarded in the subsequent steps of variable removal.
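To make the selection rule concrete, the following Python sketch (the paper's own implementation was written in MATLAB; all function and variable names here are hypothetical) removes one variable from each pair with |r| ≥ 0.99, discarding the variable appearing less often among the flagged pairs:

```python
import numpy as np

def drop_correlated_inputs(X, threshold=0.99):
    """X: (P patterns x Q variables). Return indices of the variables to keep,
    removing one variable from each pair with |r| >= threshold."""
    r = np.corrcoef(X, rowvar=False)              # Pearson correlation matrix (Q x Q)
    Q = r.shape[0]
    pairs = [(i, j) for i in range(Q) for j in range(i + 1, Q)
             if abs(r[i, j]) >= threshold]        # highly correlated pairs
    removed = set()
    while pairs:
        counts = {}                               # appearances in the remaining pairs
        for i, j in pairs:
            counts[i] = counts.get(i, 0) + 1
            counts[j] = counts.get(j, 0) + 1
        i, j = pairs[0]
        victim = i if counts[i] <= counts[j] else j   # remove the less frequent one
        removed.add(victim)
        pairs = [p for p in pairs if victim not in p] # disregard pairs involving it
    return [q for q in range(Q) if q not in removed]
```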

Auto-Encoder
This feature extraction technique itself uses a 3-layer feedforward ANN called an auto-encoder (AE). After training, the hidden layer output (y2p), for the presentation of each problem's input pattern (y1p), is a compressed vector (Q2 x 1) that can be used to replace the original input layer by a (much) smaller one, thus reducing the size of the ANN model. In this work, Q2 = round(Q1/2) was adopted. Concerning the learning algorithm used for all AEs, no L2 weight regularization was employed, which was the only default specification not adopted in MATLAB's 'trainAutoencoder(…)'.
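For illustration only: the authors relied on MATLAB's trainAutoencoder, whereas the sketch below is a minimal NumPy analogue of a 3-layer AE (sigmoid encoder, linear decoder, plain gradient descent on the reconstruction MSE, no L2 regularization), with the compressed dimension Q2 = round(Q1/2) adopted in this work; the learning rate and epoch count are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, epochs=2000, lr=0.01, seed=0):
    """X: (P x Q1) input data. Returns an encoder mapping to Q2 = round(Q1/2)."""
    rng = np.random.default_rng(seed)
    P, Q1 = X.shape
    Q2 = int(round(Q1 / 2))                      # compressed dimension, as in the paper
    W1 = rng.normal(0, 0.1, (Q1, Q2)); b1 = np.zeros(Q2)
    W2 = rng.normal(0, 0.1, (Q2, Q1)); b2 = np.zeros(Q1)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)                 # hidden (encoded) output y2p
        Xhat = H @ W2 + b2                       # linear reconstruction of the input
        dXhat = 2.0 * (Xhat - X) / P             # gradient of the MSE loss
        dW2, db2 = H.T @ dXhat, dXhat.sum(0)
        dZ = (dXhat @ W2.T) * H * (1.0 - H)      # backprop through the sigmoid
        dW1, db1 = X.T @ dZ, dZ.sum(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return lambda Xnew: sigmoid(Xnew @ W1 + b1)  # encoder replacing the original inputs
```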

Orthogonal and Sparse Random Projections
This is another feature extraction technique, aiming to reduce the dimension of the input data Y1 (Q1 x P) while retaining the Euclidean distance between data points in the new feature space. This is attained by projecting all data along the (i) orthogonal or (ii) sparse random matrix A (Q1 x Q2, Q2 < Q1), as described by Kasun et al. (2016).
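A minimal sketch of both variants follows; the exact construction of A in Kasun et al. (2016) may differ, so the QR-based orthogonalization and the Achlioptas-type sparse entries should be read as plausible assumptions rather than the paper's exact recipe:

```python
import numpy as np

def orthogonal_projection(Y1, Q2, seed=0):
    """Project Q1-dimensional patterns (rows of Y1, P x Q1) onto Q2 < Q1 dimensions
    using a random matrix with orthonormal columns (via QR decomposition)."""
    rng = np.random.default_rng(seed)
    A, _ = np.linalg.qr(rng.normal(size=(Y1.shape[1], Q2)))   # (Q1 x Q2), A.T @ A = I
    return Y1 @ A

def sparse_projection(Y1, Q2, seed=0):
    """Sparse alternative: entries +/-sqrt(3) with prob. 1/6 each, 0 with prob. 2/3,
    which approximately preserves pairwise Euclidean distances."""
    rng = np.random.default_rng(seed)
    A = rng.choice([np.sqrt(3), 0.0, -np.sqrt(3)], p=[1/6, 2/3, 1/6],
                   size=(Y1.shape[1], Q2)) / np.sqrt(Q2)
    return Y1 @ A
```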

Training, Validation and Testing Datasets (feature 4)
Four distributions of data (methods) were implemented, namely pt-pv-ptt = {80-10-10, 70-15-15, 60-20-20, 50-25-25}, where pt, pv and ptt are the percentages of training, validation and testing patterns, respectively. The training patterns are defined first (steps 1 and 2 of the implemented procedure, which enforce a minimum number of training patterns); then, 3) in order to select the validation patterns, randomly select pv / (pv + ptt) of those patterns not belonging to the previously defined training dataset. The remainder defines the testing dataset.
It might happen that the actual distribution pt-pv-ptt is not equal to the one imposed a priori (before step 1), due to the minimum required number of training patterns specified in step 2.
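The following sketch illustrates the allocation logic for one imposed distribution; it omits the extra constraints of steps 1 and 2 (e.g., the minimum required number of training patterns), so it is an approximation of the implemented procedure rather than the authors' code:

```python
import numpy as np

def split_indices(P, pt=0.7, pv=0.15, ptt=0.15, seed=0):
    """Randomly allocate P pattern indices into training/validation/testing
    subsets according to the imposed pt-pv-ptt percentages."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(P)
    n_train = int(round(pt * P))                      # steps 1-2: training patterns
    train, rest = idx[:n_train], idx[n_train:]
    n_val = int(round(len(rest) * pv / (pv + ptt)))   # step 3: validation share
    return train, rest[:n_val], rest[n_val:]          # remainder = testing set
```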

Input Normalization (feature 5)
The progress of training can be impaired if the training data defines a region that is relatively narrow in some dimensions and elongated in others, which can be alleviated by normalizing each input variable across all data patterns. The implemented techniques are the following.

Linear Max Abs
Lachtermacher and Fuller (1995) proposed a simple normalization technique in which each input variable is divided by the maximum absolute value it takes over all patterns, mapping it into [-1, 1].

Nonlinear
Proposed by Pu and Mesbahi (2006), although in the context of output normalization, the only nonlinear normalization method implemented for input data is the one described in that reference.

Output Transfer Functions (feature 6)
The Hyperbolic Tangent function is also of sigmoid type, being defined as $\tanh(s) = (e^{s} - e^{-s})/(e^{s} + e^{-s})$.

Output Normalization (feature 7)
Normalization can also be applied to the output variables so that, for instance, the amplitude of the solution surface at each variable is the same. Otherwise, training may tend to focus (at least in the earlier stages) on the solution surface with the greatest amplitude (Flood and Kartam 1994a). Normalization ranges not including the zero value might be a useful alternative, since convergence issues may arise due to the presence of many small (close to zero) target values (Mukherjee et al. 1996). Four output normalization methods were implemented: the first three follow the same equation, whereas the fourth is the one described by eq. (6).
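As an illustration, the sketch below implements the Linear Max Abs input scaling and a generic linear mapping into a range excluding zero (the [0.2, 0.8] bounds are an assumed example, not necessarily the paper's values):

```python
import numpy as np

def linear_max_abs(Y, axis=1):
    """Scale each variable (row of Y, Q x P) by its maximum absolute value,
    mapping all values into [-1, 1]."""
    scale = np.max(np.abs(Y), axis=axis, keepdims=True)
    scale[scale == 0] = 1.0                       # guard against all-zero variables
    return Y / scale, scale                       # keep the scale for later inversion

def linear_range(Y, lo=0.2, hi=0.8, axis=1):
    """Generic linear mapping of each variable into [lo, hi]; ranges excluding
    zero avoid the many near-zero targets mentioned above."""
    ymin = Y.min(axis=axis, keepdims=True)
    ymax = Y.max(axis=axis, keepdims=True)
    span = np.where(ymax > ymin, ymax - ymin, 1.0)
    return lo + (hi - lo) * (Y - ymin) / span
```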

Multi-Layer Perceptron Network (MLPN)
This is a feedforward ANN exhibiting at least one hidden layer. Fig. 2 depicts a 3-2-1 MLPN (3 input nodes, 2 hidden neurons and 1 output neuron), where units in each layer link only to some nodes located ahead. At this point, it is appropriate to define the concepts of partially-connected (PC) and fully-connected (FC) ANNs. In this work, a FC feedforward network is characterized by having each node connected to every node in any layer placed forward; any other type of network is said to be PC (e.g., the one in Fig. 2). According to Wilamowski (2009), PC MLPNs are less powerful than MLPNs where connections across layers are allowed, the latter usually leading to smaller networks (fewer neurons).
The output vector of layer l is obtained from the output of the previous layer as

$y_l = \varphi_l\!\left(W_{(l-1)\,l}^{\,T}\, y_{l-1} + b_l\right)$

where φl is the transfer function used for all neurons in layer l, W(l-1)l is the weight matrix linking layers l-1 and l, and bl is the bias vector of layer l.
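A minimal sketch of this forward propagation for an adjacent-layers-only (PC) network could read as follows (the transfer choices per layer are illustrative):

```python
import numpy as np

def logistic(s):
    return 1.0 / (1.0 + np.exp(-s))

def identity(s):
    return s

def mlp_forward(y1, weights, biases, transfers):
    """Forward pass of a feedforward MLPN with adjacent-layer connections only.
    weights[l] is W (Q_l x Q_{l+1}); biases[l] is b (Q_{l+1},); transfers[l] is
    the layer's activation function."""
    y = y1
    for W, b, phi in zip(weights, biases, transfers):
        y = phi(W.T @ y + b)                      # y_l = phi_l(W^T y_{l-1} + b_l)
    return y
```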

Radial-Basis Function Network (RBFN)
Although having similar topologies, the RBFN and the MLPN behave very differently due to distinct hidden neuron models: unlike the MLPN, the RBFN has hidden neurons that behave differently from the output neurons, responding to the distance between the incoming signal and a center vector rather than to a weighted sum (Xie et al. 2011). Lastly, according to the implementation carried out for initialization purposes (described in 3.3.12), (i) the RBF center vectors per hidden layer (one per hidden neuron) are initialized as integrated in a matrix (termed the RBF center matrix), having the same size as a weight matrix linking the previous layer to that specific hidden layer, and (ii) the RBF widths (one per hidden neuron) are initialized as integrated in a vector (called the RBF width vector) with the same size as a hypothetical bias vector.
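The sketch below illustrates one plausible hidden-neuron model consistent with this description (a Gaussian response to the Euclidean distance between the previous layer's output and each center, scaled by the neuron's width); the exact functional form used in the paper is an assumption here:

```python
import numpy as np

def rbf_hidden_layer(y_prev, centers, widths):
    """Output of one RBF hidden layer. `centers` is the RBF center matrix
    (sized like a weight matrix: Q_prev x Q_hidden); `widths` is the RBF
    width vector (sized like a bias vector: Q_hidden,)."""
    d = np.linalg.norm(centers - y_prev[:, None], axis=0)   # distances to centers
    return np.exp(-(d / widths) ** 2)                        # Gaussian responses
```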

Connectivity (feature 10)
For this ANN feature, three methods were implemented, namely (i) adjacent layers: only connections between adjacent layers are allowed, (ii) adjacent layers + input-output: only connections between (ii1) adjacent layers and (ii2) the input and output layers are allowed, and (iii) fully-connected (all possible feedforward connections).
Hidden Transfer Functions (feature 11)
Besides eqs. (7)-(9), defined in 3.3.6, the ones defined next were also implemented as hidden transfer functions.
During software validation, it was observed that some hidden node outputs could be infinite or NaN (not-a-number in MATLAB; e.g., 0/0 = Inf/Inf = NaN), due to numerical issues concerning some hidden transfer functions and/or their calculated input. In those cases, it was decided to convert infinite values to unitary values and NaNs to zero (the only exception was the bipolar sigmoid function, where NaNs were converted to -1). Another implemented trick was to convert possible NaN inputs of the Gaussian function to zero.
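A direct NumPy transcription of this sanitization trick might read as follows (mapping -Inf to -1 is an assumption about what 'unitary values' means for negative infinities):

```python
import numpy as np

def sanitize(y, nan_value=0.0):
    """Replace non-finite hidden-node outputs, as described above:
    +/-Inf -> +/-1, NaN -> nan_value (use nan_value=-1.0 for the
    bipolar sigmoid)."""
    y = np.where(np.isposinf(y), 1.0, y)
    y = np.where(np.isneginf(y), -1.0, y)
    return np.where(np.isnan(y), nan_value, y)
```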

Identity-Logistic
In Gunaratnam and Gero (1994), issues associated with flat spots at the extremes of a sigmoid function were eliminated by adding a linear function to the latter. In all transfer function definitions, || … || denotes the Euclidean distance.

Parameter Initialization (feature 12)
The initialization of (i) weight matrices (Qa x Qb, Qa and Qb being the node numbers in the layers a and b being connected, respectively), (ii) bias vectors (Qb x 1), (iii) RBF center matrices (Qc-1 x Qc, c being the hidden layer the matrix refers to), and (iv) RBF width vectors (Qc x 1) is independent and in most cases randomly generated. For each ANN design carried out in the context of each parametric analysis combo, and whenever the parameter initialization method is not 'Mini-Batch SVD', ten distinct simulations varying the (randomly generated) initialization values are carried out, in order to find the best solution. The implemented initialization methods are described next.

Rand [-lim, lim]
This function is based on the proposal in Waszczyszyn (1999) and generates random numbers with uniform distribution in [-lim, lim], lim being layer-dependent and defined as a function of the node numbers in layers a and b, where a and b refer to the initial and final layers integrating the matrix being initialized, and L is the total number of layers in the network. In the case of a bias or RBF width vector, lim is always taken as 0.5.

SVD
Although Deng et al. (2016) proposed this method for a 3-layer network, it was implemented in this work regardless of the number of hidden layers.

Mini-Batch SVD
Based on Deng et al. (2016), this scheme is an alternative version of the former SVD method. Here, the training data is split into min{Qb, Pt} chunks (or subsets) of equal size Pti = max{floor(Pt / Qb), 1}, where floor rounds a decimal argument down to the previous integer (or yields the argument itself if it is already an integer), each chunk being aimed to derive Qbi = 1 hidden node.

Learning Algorithm (feature 13)
The most popular learning algorithm is error back-propagation (BP), a first-order gradient method. Second-order gradient methods are known to provide higher training speed and accuracy (Wilamowski 2011); the most employed is the Levenberg-Marquardt (LM) algorithm. All these traditional schemes were implemented using the MATLAB toolbox (The MathWorks, Inc. 2017).
Concerning the LM scheme ('trainlm' in MATLAB), the only learning parameter set differently from its default value was the abovementioned (ii).

Extreme Learning Machine (ELM, mb ELM, I-ELM, CI-ELM)
Besides these traditional learning schemes, iterative and time-consuming by nature, four extreme learning machine (ELM)-based algorithms were also implemented, namely the (batch) ELM, the mini-batch ELM (mb ELM), the incremental ELM (I-ELM), and the convex incremental ELM (CI-ELM).

Performance Improvement (feature 14)
A simple and recursive approach to improving ANN accuracy is the Neural Network Composite (NNC), as described in Beyer et al. (2006). In this work, a maximum of 10 extra ANNs were added to the original one, until the maximum error stopped improving between successive NNC solutions. Later in this manuscript, a solution given by a single neural net may be denoted as ANN, whereas the other possible solution is called NNC.

Training Mode (feature 15)
Wilson and Martinez (2003) suggested that, if one wants to use mini-batch training with the same stability as online training, a rough estimate of the suitable learning rate to be used in learning algorithms such as BP is η = ηonline / √cs, where cs is the chunk size and ηonline is the online learning rate; their proposal was adopted in this work. Based on the proposal of Liang et al. (2006), the constant chunk size (cs) adopted for all chunks in mini-batch mode reads cs = min{mean(hn) + 50, Pt}, hn being a vector storing the number of hidden nodes in each hidden layer at the beginning of training, and mean(hn) the average of all values in hn.
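Both rules translate directly into code; the sketch below assumes hn and Pt as defined above:

```python
import numpy as np

def minibatch_hyperparams(hn, Pt, eta_online):
    """Chunk size and learning rate for mini-batch training, per the rules above:
    cs = min{mean(hn) + 50, Pt} and eta = eta_online / sqrt(cs)."""
    cs = int(min(np.mean(hn) + 50, Pt))   # hn: hidden nodes per hidden layer
    eta = eta_online / np.sqrt(cs)        # Wilson and Martinez (2003) estimate
    return cs, eta
```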

Network Performance Assessment
Several types of results were computed to assess network outputs, namely (i) the maximum error, (ii) the percentage of errors greater than 3%, and (iii) the performance, which are defined next. All of them are based on the relative error between target and predicted outputs, reading

$e_{qp} = \left| \dfrac{d_{qp} - y_{qLp}}{d_{qp}} \right| \times 100\ [\%]$ (25)

where (i) dqp is the q-th desired (or target) output when pattern p within iteration i (p = 1, …, Pi) is presented to the network, and (ii) yqLp is the net's q-th output for the same data pattern. Moreover, the denominator in eq. (25) is replaced by 1 whenever |dqp| < 0.05 (dqp in the numerator keeps its real value). This exception to eq. (25) aims to reduce the apparent negative effect of large relative errors associated with target values close to zero. Even so, this trick may still lead to (relatively) large solution errors while apparently excellent results are depicted as regression plots (target vs. predicted outputs).

Maximum Error
This variable measures the maximum relative error, as defined by eq. (25), among all output variables and learning patterns.

Percentage of Errors > 3%
This variable measures the percentage of relative errors, as defined by eq. (25), among all output variables and learning patterns, that are greater than 3%.

Performance
In functional approximation problems, network performance is defined as the average relative error, as defined in eq. (25), among all output variables and data patterns being evaluated (e.g., training, all data).
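The three assessment quantities can be computed together; the sketch below implements eq. (25), including the small-target exception, with d and y holding targets and predictions over all outputs and patterns:

```python
import numpy as np

def relative_errors(d, y):
    """Relative errors (%) per eq. (25), with the denominator replaced by 1
    whenever |target| < 0.05 (the numerator keeps the real target value)."""
    denom = np.where(np.abs(d) < 0.05, 1.0, d)
    return np.abs((d - y) / denom) * 100.0

def assess(d, y, tol=3.0):
    """Return the maximum error, the percentage of errors > tol (%), and the
    performance (average relative error), over all outputs and patterns."""
    e = relative_errors(np.asarray(d, float), np.asarray(y, float))
    return e.max(), 100.0 * np.mean(e > tol), e.mean()
```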

Software Validation
Several benchmark datasets/functions were used to validate the developed software, involving low- to high-dimensional problems and small to large volumes of data. Due to paper length limits, the validation results are not presented herein, but they were made public by Researcher (2018b).

Parametric Analysis Results
Aiming to reduce the computing time by cutting down the number of combos to be run (note that all features combined lead to hundreds of millions of combos), the whole parametric simulation was divided into nine parametric SAs, where in each one feature 7 takes only a single value. This measure aims to make the performance ranking of all combos within each 'small' analysis more reliable: (i) the first two SAs addressed the learning algorithms (feature 13), with the 1st SA involving the traditional (non-ELM-based) schemes and SA 2 the ELM-based counterpart, (ii) the 3rd-7th SAs combined all possible methods from features 3, 4, 6 and 7 and, concerning all other features, adopted the methods integrating the best combination from the aforementioned first SA, (iii) the 8th SA combined all possible methods from features 11, 12 and 14 and, concerning all other features, adopted the methods integrating the best combination (results compared after post-processing) among the previous five sub-analyses, and lastly (iv) the 9th SA combined all possible methods from features 9, 10 and 15 and, concerning all other features, adopted the methods integrating the best combination from the previous analysis.
Summing up the ANN feature combinations for all parametric SAs, a total of 475 combos were run for this work.
The ANN feature methods used in the best combo from each of the abovementioned nine parametric sub-analyses are specified in Tab. 5 (the numbers represent the method numbers as in Tabs. 2-4).

Proposed ANN-Based Model
The proposed model is the one, among the best from all parametric SAs, exhibiting the lowest maximum error (from SA 9, with 5 layers and a distribution of nodes per layer of 8-11-11-11-1). Concerning connectivity, the network is fully-connected, and the hidden and output transfer functions are all Logistic (eq. (7)) and Identity (eq. (10)), respectively. The network was trained using the LM algorithm (1500 epochs). After design, the average network computing time concerning the presentation of a single example (including data pre/post-processing) is 6.93E-05 s (see Fig. 7). It is worth recalling that, in this manuscript, whenever a vector is added to a matrix, it means the former is to be added to all columns of the latter (valid in MATLAB).

Input Data Preprocessing
For future use of the proposed ANN to simulate a new input dataset Y1,sim (8 x Psim matrix) concerning Psim patterns, the same data preprocessing (if any) performed before training must be applied. That preprocessing is defined by the methods used for ANN features 2, 3 and 5 (respectively methods 2, 6 and 1; see Tab. 2), which should be applied after all (eventual) qualitative variables in the input dataset are converted to numerical form (using feature 1's method). Next, the necessary preprocessing to be applied to Y1,sim, concerning features 2, 3 and 5, is fully described.

Dimensional Analysis and Dimensionality Reduction
Since dimensional analysis (d.a.) and dimensionality reduction (d.r.) were not carried out, one has

$\{Y_{1,sim}\}^{after\ d.a.} = \{Y_{1,sim}\}^{after\ d.r.} = Y_{1,sim}$

so that only the input normalization (feature 5, method 1) remains to be applied in order to obtain the preprocessed dataset {Y1,sim}n after.

ANN-Based Analytical Model
Once the preprocessed input dataset {Y1,sim}n after (8 x Psim matrix) is determined, the next step is to present it to the proposed ANN to obtain the predicted output dataset {Y5,sim}n after (1 x Psim vector), which will be given in the same preprocessed format as the target dataset used in learning. In order to convert the predicted outputs to their 'original format' (i.e., without any transformation due to normalization or dimensional analysis; the only transformation remaining will be the (eventual) qualitative variables written in their numeric representation), some postprocessing is needed, as described in detail in 3.7.3. Next, the mathematical representation of the proposed ANN is given, so that any user can implement it to determine {Y5,sim}n after, thus dispelling the idea that ANNs are 'black boxes'. Since the network is fully-connected, each layer receives contributions from all previous layers:

$Y_2 = \varphi_2\!\left(W_{1\text{-}2}^{\,T}\,Y_1 + b_2\right)$
$Y_3 = \varphi_3\!\left(W_{1\text{-}3}^{\,T}\,Y_1 + W_{2\text{-}3}^{\,T}\,Y_2 + b_3\right)$
$Y_4 = \varphi_4\!\left(W_{1\text{-}4}^{\,T}\,Y_1 + W_{2\text{-}4}^{\,T}\,Y_2 + W_{3\text{-}4}^{\,T}\,Y_3 + b_4\right)$
$Y_5 = W_{1\text{-}5}^{\,T}\,Y_1 + W_{2\text{-}5}^{\,T}\,Y_2 + W_{3\text{-}5}^{\,T}\,Y_3 + W_{4\text{-}5}^{\,T}\,Y_4 + b_5$ (31)

where φ2 = φ3 = φ4 is the Logistic function (eq. (7)) and Y1 = {Y1,sim}n after. Arrays Wj-s and bs are stored online in Developer (2018b), aiming to avoid an overlong article and to ease the model's implementation by any interested reader.
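The following Python sketch implements these relations; the weight/bias containers mirror the Wj-s and bs arrays stored in Developer (2018b), but their file format is not specified here, so loading them is left to the reader:

```python
import numpy as np

def logistic(s):
    return 1.0 / (1.0 + np.exp(-s))

def proposed_ann(Y1, W, b):
    """Eq. (31) for the fully-connected 8-11-11-11-1 network: every layer
    receives contributions from ALL previous layers. W[(j, s)] is the weight
    array linking layer j to layer s (Qj x Qs); b[s] is layer s's 1-D bias
    vector. Hidden layers (2-4) use the Logistic function; the output layer
    (5) is linear (Identity). Y1: preprocessed inputs (8 x Psim)."""
    Y = {1: Y1}
    for s in range(2, 6):
        net = sum(W[(j, s)].T @ Y[j] for j in range(1, s)) + b[s][:, None]
        Y[s] = net if s == 5 else logistic(net)   # Identity at the output layer
    return Y[5]                                   # (1 x Psim), still preprocessed
```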

Output Data Postprocessing
In order to transform the output dataset obtained from the proposed ANN, {Y5,sim}n after (1 x Psim vector), to its original format (Y5,sim), i.e. without the effects of the dimensional analysis and/or output normalization (possibly) applied in target dataset preprocessing prior to training, the postprocessing addressed next must be performed.
Once {Y5,sim}n after is obtained, the following relations hold between it and its non-normalized ({Y5,sim}after d.a.) and original (Y5,sim) formats (given just after the dimensional analysis stage and free of any preprocessing effects, respectively).