Abstract

Accurate and reliable prediction of the Perfobond Rib Shear Strength Connector (PRSC) is considered a major issue in structural engineering. Selecting the variables that most strongly influence PRSC is also an important step toward economical and more accurate predictive models. This study investigates the capacity of a deep learning neural network (DLNN) for predicting the shear strength of PRSC. The proposed DLNN model is validated against support vector regression (SVR), artificial neural network (ANN), and M5 tree models. In a second scenario, each AI model is hybridized with a genetic algorithm (GA), a robust bioinspired optimization approach, to select the most relevant predictors for PRSC. Hybridizing AI models with GA as a selector tool is an attempt to achieve the best prediction accuracy with the fewest possible input parameters. The quantitative analysis shows that the GA-DLNN model required only 7 input parameters and yielded the best prediction accuracy, with the highest correlation coefficient (R = 0.96) and the lowest root mean square error (RMSE = 0.03936 kN). By contrast, the other comparable models (GA-M5Tree, GA-ANN, and GA-SVR) required 10 input parameters to reach a relatively acceptable level of accuracy. Employing GA as a feature selection technique improves the precision of almost all hybrid models by removing redundant variables that decrease model efficiency.

1. Introduction

Steel-concrete composite/hybrid systems have found wide application in engineering works owing to recent advancements in structural engineering. In these systems, the shear connector is an important component that ensures the development of composite action by facilitating shear transfer between the concrete slab and the steel profile [1–3]. On site, conventional shear connectors (i.e., Nelson studs) are beneficial owing to their high level of automation; however, they are prone to certain problems, especially in structures that are subjected to stress [3–5]. Compared with other connectors, the Nelson stud exhibits relatively low resistance, which can lead to the design of girders with partial interaction. For this reason, many studies have focused on improving the shear connector for hybrid composite systems [6, 7], with the first work dating back to the 1980s. The development of Perfobond, another form of connector with higher resistance, was reported by Leonhardt, Andra, and Partners in 1987, when working on the third bridge that crossed the Caroni River in Venezuela [3, 6, 8]. A study by Vianna et al. [9] compared the economic costs of manufacturing steel girders with different types of connectors and found that Perfobond connectors are more cost-efficient for use in steel-concrete composites. Later, Vianna et al. reported another study on Perfobond and T-Perfobond rib shear connectors in terms of their ductility, resistance, and collapse modes [3]. The results showed that PRSC is both structurally efficient and economical in terms of shear transfer in hybrid and composite structures. Other studies focused on the numerical and parametric evaluation of PRSC on 40 pushout samples [2, 10]. The samples consisted of a simple perforated plate PRSC with several holes and transverse rebars and had varying concrete compressive strengths.
The study reported two major findings, as it involved both the finite element (FE) method and regression analysis for predicting the PRSC shear capacity [11, 12]. Another experimental study on the structural response of PRSC was performed by [4]. It reported that the resistance of PRSC increases with the number of holes and, on this basis, suggested that the resistance and ductility of PRSC can be increased by passing reinforcement bars through the holes while reducing the upward displacement. Rodrigues and Laím focused on the influence of the number of holes, rib holes, and transverse reinforcement, as well as the doubled PRSC, at both ambient and high temperatures [6]. Higher temperatures significantly affected the load-carrying capacity of PRSC, especially that of the doubled PRSC. The study further showed that transverse reinforcement bars, when present in rib holes, reduce the load-carrying capacity of PRSC, especially at high temperatures. A parametric study on PRSC shear strength based on the FE method has been reported [13]. The model developed in that study was verified against experimental pushout tests, and a mathematical model for estimating the shear capacity of PRSC was derived from the results. The shear behavior of PRSC under both cyclic and static loadings has been investigated by [14]. In the static tests, the shear capacity of pure concrete-based specimens was about 65% of that of specimens with both a concrete end-bearing zone and concrete dowels. Specimens with transverse rebars in holes exhibited a shear capacity about two times that of specimens without transverse rebars. In the cyclic tests, samples without transverse rebars showed a significant decrease in residual shear capacity compared with their static shear capacity.
In contrast, specimens with transverse rebars exhibited a residual shear capacity that was almost the same as their static shear capacity.

A parametric study has been reported on circular-hole and long-hole PRSC [15]. The study established a relationship between the failure mode of both long-hole and circular-hole PRSC and the concrete failure. It was also reported that increasing both the height and the diameter increases the shear stiffness of PRSC. The dynamic characteristics of steel-concrete decks with PRSC have been investigated in [16, 17]. The study considered PRSC with both normal-weight high-strength concrete and lightweight high-strength concrete. The characteristics considered were the natural frequencies, frequency response functions, and damping ratio; these were evaluated using a nondestructive approach. The experimental natural frequencies were also compared with those obtained from the FE model. The first mode, with a damping ratio of almost 0.5%, was found to be the most effective mode for both concrete types. These studies make it evident that several factors influence the structural behavior of PRSC. Such factors include the end-bearing force, rib spacing, rib arrangement, and concrete compressive strength, as well as the area and yield strength of the transverse rebars [18]. Having identified the factors that govern the behavior of PRSC, it becomes necessary to predict the resistance of PRSC so that decision makers can implement it properly [19, 20]. Analytical methodologies for quantifying the PRSC resistance have been introduced; however, such methods have certain limitations [21].

According to several studies published in the literature, the behavior of PRSC is affected by various contributing factors, including the area and the yield strength of transverse rebars, the end-bearing force, the concrete compressive strength, the rib spacing, and the rib arrangement. In addition, several analytical and empirical models have been developed to predict the resistance of PRSC; however, they provided undesirable predictions with increased calculated errors.

Advancements in technology have made computer-aided methods promising alternatives for modelling several structural engineering problems. The most prominent among them are artificial intelligence (AI) models, which are easy to apply, convenient, and strong predictive models [22–25]. AI models are beneficial because they can solve nonlinear, stochastic, and nonstationary problems that classical regression models may not be able to address [26–29]. Several AI models have been developed to determine the actual relationship between the predictors and Perfobond rib shear strength. For instance, Köroğlu et al. [30] investigated the genetic programming (GP) model for ultimate shear capacity prediction in composite beams with profiled steel sheeting. The study compared the accuracy of building codes in predicting the ultimate shear capacity of the composite beams with that of the proposed GP model, based on the employed test data. The results showed that the new GP model predicted the ultimate shear capacity of the composite beams more accurately than the building codes. Another study by Ali [31] focused on predicting the shear strength of channel shear connectors in a composite beam consisting of concrete and steel sections, using the adaptive neuro-fuzzy inference system (ANFIS) and linear regression (LR), which are nonlinear and linear modelling tools, respectively. ANFIS gave more accurate and precise predictions than LR. Although there have been several explorations of AI models for shear strength-related structural and material problems [32, 33], the capacity of the existing approaches is mainly influenced by the structure of the model used and the selection of input parameters.
Accordingly, introducing a novel approach capable of discovering the complex relationship between predictors and target is very important for increasing prediction precision [34]. Moreover, combining that approach with an algorithm that selects the most important input parameters, so that the Perfobond Rib Shear Strength Connector is predicted with high accuracy, is very significant for structural engineering.

The scope of research on computer-aided methods is still limited, and the exploration of newly developed AI models remains an ongoing research motivation. The DLNN model is a newly explored AI version that has proved to be a reliable machine learning model for solving nonlinear regression problems [35–39], yet it has not been developed for PRSC shear strength prediction. The current research implements the deep learning model for predicting PRSC shear strength. Before the prediction step, a genetic algorithm is used with the DLNN model to select the most important input parameters, which are then introduced to the adopted approach. The proposed DLNN model is validated against several well-established machine learning models, including M5Tree, ANN, and SVR. The investigation is extended by integrating the genetic algorithm as a robust nature-inspired optimization algorithm for input parameter selection. The obtained results are assessed and discussed comprehensively.

2. Experimental Dataset Description

To evaluate the shear strength capacity of a Perfobond connector (see Figure 1), a dataset of 90 records related to the shear connector of steel-concrete structures was used. These records were collected from eight databases published in the literature [1–5, 10, 40]. The input variables were as follows: concrete compressive strength; area of concrete dowels; number of rib holes (n); area of cross reinforcement bars and yield stress of reinforcement bars; area of cross reinforcement bars and tensile strength of cross rebar; area of the connector at the end-bearing zone; ratio of the thickness of the concrete slab to the connector height; connector height; contact area between the connector and concrete; and coefficient of end-bearing force. These parameters were entered into the hybrid model to predict the shear strength of the Perfobond connector. The descriptive statistics of the experimental dataset are shown in Table 1. The data were separated into two groups: a training dataset (85%) and a testing dataset (15%).
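The 85%/15% partition can be sketched as follows. This is a minimal illustration only: the source does not state whether the split was random or how the 76.5 records implied by 85% of 90 were rounded, so the shuffling, the fixed seed, and the truncation to 76 training records are all assumptions, and `split_indices` is a hypothetical helper name.

```python
import random


def split_indices(n_records, train_fraction=0.85, seed=42):
    """Return (train, test) index lists for a random hold-out split.

    Assumption: records are shuffled before splitting and the training
    count is truncated to an integer; the paper does not specify either.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)      # reproducible shuffle
    n_train = int(n_records * train_fraction)  # 90 * 0.85 -> 76
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(90)
print(len(train_idx), len(test_idx))
```

With 90 records this yields 76 training and 14 testing indices; rounding up instead would give 77 and 13.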

3. Methodology Overview

3.1. Artificial Neural Network (ANN)

In the last few decades, artificial networks such as neural networks, social networks, and other algorithms have become established. The main merit of these technologies is their ability to predict data and deal with complex systems. A neural network is a mathematical model that supports decision making by mimicking the neurons in the biological brain [41]. It is constructed by connecting several layers of units called neurons. The feedforward neural network with the backpropagation learning algorithm is widely used by researchers. Backpropagation is common in neural network applications because it trains the network by supervised learning [42]. In this algorithm, the predicted value is compared with the original value to compute the error between them, and the weights in the network are then modified to reduce the error to a small amount. In the structure of the algorithm, N represents the input neurons, H is the number of hidden layers in the neural network, and M is the output variable. A feedforward neural network contains at least an input layer, a hidden layer, and an output layer. In this network, information is transferred in one direction, from the input layer through the hidden layer to the output layer, without loops. The numbers of input variables and predicted labels correspond to the numbers of neurons in the input and output layers, respectively. Neurons in the hidden layers apply a nonlinear transformation to the input variables. The value of each hidden layer is calculated from the input variables and the weights between the layers, and the value of the output layer is computed from the hidden-layer values in the same way.
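For reference, a feedforward network with one hidden layer is commonly written in the form below. This is the standard textbook formulation, not a reproduction of the source equations; the symbols $x_i$, $h_j$, $y_k$, $w_{ij}$, $v_{jk}$, $b_j$, $c_k$, and the activation $f$ are assumed notation.

```latex
h_j = f\!\left(\sum_{i=1}^{N} w_{ij}\, x_i + b_j\right),
\qquad
y_k = f\!\left(\sum_{j} v_{jk}\, h_j + c_k\right),
\quad k = 1, \dots, M
```

Here $x_i$ are the $N$ inputs, $h_j$ are the hidden-neuron outputs, and $y_k$ are the $M$ network outputs.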

To design the network, the numbers of nodes and hidden layers must be chosen. Various studies state that one or two hidden layers are enough to achieve good prediction performance [43, 44]. The performance of the training process depends on a good selection of the network inputs. During this process, the neural network models the relations between inputs and outputs. In every iteration, the weights and biases are modified so as to decrease the error between actual and predicted outcomes. This error is computed from d, the real value, and y, the estimated value obtained from the algorithm. In this study, one hidden layer with a sigmoid activation function was used owing to its validity in the regression process. Figure 2 shows the structure of the neural network.
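The iterative weight update described above can be sketched with a small pure-Python network. This is an illustrative sketch, not the authors' implementation: the squared-error form E = ½(y − d)², the linear output neuron, the learning rate, the number of hidden neurons, and the toy samples are all assumptions.

```python
import math
import random

# Hypothetical toy data: (inputs, target) pairs, not taken from the paper.
TOY_SAMPLES = [([0.0, 0.0], 0.2), ([0.0, 1.0], 0.5),
               ([1.0, 0.0], 0.6), ([1.0, 1.0], 0.9)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_one_hidden_layer(samples, n_hidden=3, lr=0.1, epochs=2000, seed=0):
    """Train a one-hidden-layer network by backpropagation (online updates).

    Returns the summed squared error 0.5 * (y - d)^2 recorded in each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, d in samples:
            # Forward pass: sigmoid hidden layer, linear output (assumption).
            h = [sigmoid(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            y = sum(w2[j] * h[j] for j in range(n_hidden)) + b2
            err = y - d
            total += 0.5 * err * err
            # Backward pass: gradient of the error w.r.t. each weight.
            for j in range(n_hidden):
                grad_hidden = err * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_hidden * x[i]
                b1[j] -= lr * grad_hidden
            b2 -= lr * err
        history.append(total)
    return history
```

Calling `train_one_hidden_layer(TOY_SAMPLES)` returns the per-epoch error, which should decrease as the weights and biases are adjusted.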

3.2. Deep Learning Neural Networks (DLNN)

Recent studies of neural network applications are based on the concept of deep learning. The structure of a deep neural network is an extension of the classical neural network, with extra hidden layer(s) added to the network. Deep learning was introduced by Hinton et al., who proposed the layerwise greedy-learning method. In this method, the neural network is pretrained layer by layer with an unsupervised learning technique before the training process. Deep learning is popular for two reasons: (i) the growth of large volumes of technical data, which can solve the problem of overfitting, and (ii) the assignment of nonrandom initial values to the neural network through unsupervised pretraining [45]. Thus, better performance can be reached after the training phase. There are various types of deep learning tools; in this study, the backpropagation neural network was used.

This approach, with multiple hidden layers trained via backpropagation with the gradient descent algorithm, is used in many types of application. The network contains an input layer, an output layer, and a large number of neurons arranged in hidden layers. The algorithm works through the connections between the first layer and the hidden layers, which yield new variables that are transferred to the output layer; the output layer then predicts the result of the process. What distinguishes deep learning is the nonlinear relations between the multiple layers in the network, which give it the ability to deal with different nonlinear functions. Such a deep network can recognize the complex patterns involved in a complicated process. Figure 3 describes the general structure of a deep neural network with input, output, and multiple hidden layers. Mathematically, each layer applies the activation function f to the product of its weight matrix and the output of the previous layer, plus the bias b. The input variables are denoted as layer 0, and L represents the output layer. In this research, the hyperbolic tangent function is used as the activation function owing to its ability to obtain better performance for the problem studied.
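The layerwise computation can be sketched as a^(l) = f(W^(l) a^(l−1) + b^(l)) for l = 1, …, L, with a^(0) the input vector. The snippet below is a minimal sketch under that assumed notation; the layer sizes and weight values in `toy_layers` are hypothetical, and applying tanh to the output layer as well is an assumption, since the source does not say whether the output neuron is linear.

```python
import math


def forward(x, layers):
    """Propagate x through a list of (weights, biases) layers using tanh.

    weights is a list of rows, one row of input weights per neuron.
    """
    activation = list(x)  # layer 0 is the input itself
    for weights, biases in layers:
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


# Hypothetical network: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
toy_layers = [
    ([[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]], [0.0, 0.1, -0.1]),
    ([[0.3, -0.1, 0.2], [0.2, 0.2, -0.4]], [0.05, 0.0]),
    ([[0.6, -0.3]], [0.0]),
]
prediction = forward([0.5, 0.8], toy_layers)
```

Because tanh is bounded, the single value in `prediction` lies strictly between −1 and 1, so targets would need to be scaled accordingly if the output layer also uses tanh.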

3.3. M5 Rule Model

The M5 rule algorithm was developed by Holmes et al. [35] to forecast numeric and nominal data. The M5 rule is built on the M5 tree, using trees to construct the model. The popular M5 Model Tree (MT) technique works on a combined classification and regression principle. The MT, proposed by Quinlan [46], divides the complete domain into many subdomains, and a multiple linear regression model is developed for each of them. In this way, nonlinear input-output relationships are approximated by a number of linear models.

Rule generation is based on the partial and regression tree (PART) model developed by [47]. The algorithm works by iteratively constructing models and choosing, at each iteration, the rule that gives a good result. In the training phase, the M5 model is applied, and the best leaf is then chosen as a rule. The process continues until all instances are tested and covered by the rules. The main merit of this approach is that the algorithm builds full trees and uses a small amount of the dataset at the testing phase [48, 49]. In the first stage of developing the MT model, a decision tree is built following a division criterion. Depending on the criterion used for dividing the domain, a number of variants of model trees are available; the one that follows standard deviation reduction (SDR) as the criterion is known as the M5 Model Tree [42, 43]. The SDR quantifies the reduction in error at each node when attributes are tested. It is computed from the number of training samples, the number of training samples in the ith subdomain, and the standard deviations (SDs) of the total samples and of the ith subdomain sample. The resulting model for each subdomain is a linear regression equation in which O is the output, expressed through the linear regression coefficients and the inputs. The computation procedure is illustrated in Figure 4, which shows the division into a number of subdomains followed by the development of a different model for each, considering two variables as inputs.
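In the commonly cited form of the M5 criterion, SDR = σ(T) − Σ_i (|T_i| / |T|) σ(T_i), where T is the set of samples reaching a node and T_i is the ith subset produced by a candidate split. The sketch below assumes this form and the population standard deviation; the source does not state which SD estimator was used, and `sdr` is a hypothetical helper name.

```python
from statistics import pstdev


def sdr(parent, subsets):
    """Standard deviation reduction of a candidate split.

    parent: target values reaching the node.
    subsets: list of target-value lists, one per subdomain after the split.
    Assumption: population standard deviation (pstdev) is used.
    """
    n = len(parent)
    weighted_sd = sum(len(s) / n * pstdev(s) for s in subsets)
    return pstdev(parent) - weighted_sd


# A split that separates low from high targets removes all spread here.
reduction = sdr([1, 1, 5, 5], [[1, 1], [5, 5]])
```

The split with the largest SDR is the one chosen at a node; a split that leaves each subset as mixed as the parent gives an SDR near zero.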

The partitioning process should continue until the variation in the class values of all the instances associated with a node becomes negligible. The models are then refined by the "pruning" and "smoothing" processes, which may help to alleviate "overfitting" and abrupt changes between individual subclasses [50]. A complete theoretical description of M5 model trees is available in the literature [43, 51]. This method does not require any control parameter settings; moreover, its application results in more user-friendly linear models [45, 48].

3.4. Support Vector Regression (SVR)

Support vector regression was developed by [52] as an algorithm that uses a hyperplane to separate a dataset and calculates the distance between the hyperplane and the nearest data point. In recent years, the SVR algorithm has been used intensively by many researchers to solve different engineering problems and has shown better prediction than other machine learning algorithms [53, 54]. SVR estimates the error in the regression process by computing the distance from the SVR margin. In the mathematical expression of the SVR model, M denotes the training dataset, and x and y represent the input and output variables. In the SVR function applied to the training dataset, a weight vector acts on a mapping of the input space into a high-dimensional feature space, and b represents the scalar bias. In the regression problem, the error deviation is controlled by an optimization problem that includes slack variables and a term balancing the regularization against the empirical error. To optimize the SVR model, Lagrange multipliers and sequential minimal optimization were used. Figure 5 illustrates SVR with the ε-insensitive loss function. In this study, the prediction problem is characterized by a nonlinear relationship between input and output variables, for which the nonlinear mapping of SVR is suitable. In the SVR model, four kernel functions can be used for nonlinear mapping during the training phase: linear, sigmoid, polynomial, and radial basis functions [53]. The radial basis function is applied here owing to its efficiency and ability to deal with complex regression problems [55].
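For reference, the standard ε-insensitive SVR formulation is usually written as below. This is the textbook form and may differ in notation from the source equations; the symbols $w$, $\varphi$, $b$, $C$, $\varepsilon$, $\xi_i$, $\xi_i^{*}$, and $\gamma$ are assumed notation, with $M$ taken here as the number of training pairs.

```latex
f(x) = w^{\top}\varphi(x) + b

\min_{w,\,b,\,\xi,\,\xi^{*}} \;
  \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{M}\left(\xi_i + \xi_i^{*}\right)

\text{subject to}\quad
  y_i - f(x_i) \le \varepsilon + \xi_i,\quad
  f(x_i) - y_i \le \varepsilon + \xi_i^{*},\quad
  \xi_i,\ \xi_i^{*} \ge 0,\quad i = 1,\dots,M

K(x, x') = \exp\!\left(-\gamma\,\lVert x - x'\rVert^{2}\right)
```

The last line is the radial basis function kernel, which replaces the inner product $\varphi(x)^{\top}\varphi(x')$ in the dual problem; $C$ sets the trade-off between the flatness of $f$ and the tolerated deviations beyond $\varepsilon$.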

3.5. Hybridized Genetic Algorithm (GA) with AI Models

The genetic algorithm is an evolutionary algorithm used to optimize solutions in complicated systems by drawing on biological selection [56]. GA has been employed in various research areas such as pattern recognition, image processing, and control systems [57]. In several studies in engineering and science applications, GA has proved more reliable for feature selection than other selection tools [58, 59]. The efficiency of GA can be attributed to its ability to explore the search space and to concentrate on global optimization, which leads to a better investigation of the search space. The main idea is to apply the mechanisms of natural selection, such as the creation of chromosomes, crossover, and mutation, to solve complex problems. These operations are employed to reduce the features, which are encoded as binary strings [60, 61].

There are three main phases in natural selection (see Figure 6). First, crossover is used to produce offspring; then, mutation may occur in the generated individuals; finally, the fittest individual is selected. The first step in the genetic algorithm is population initialization, in which many individuals are generated randomly. The fittest individuals are then chosen to produce the offspring. The same principle can be applied to a search space: many candidate solutions to the problem under study are produced, and the best solution is selected from them. In this study, the genetic algorithm is employed to choose the most highly correlated variables, and the process begins with two variables. Then, the ANN, DLNN, SVR, and M5 rule models are applied [62, 63].
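A GA-based feature selector of this kind can be sketched as follows, with each chromosome a binary string that marks which inputs are kept. This is a generic illustration, not the authors' configuration: the population size, rates, truncation selection, elitism, and especially `toy_fitness` (which stands in for the validation score of a trained model) and `USEFUL` are hypothetical.

```python
import random

USEFUL = {0, 2, 5}  # hypothetical set of informative feature indices


def toy_fitness(mask):
    """Stand-in for model accuracy: reward useful features, penalise size."""
    hits = sum(1 for i, gene in enumerate(mask) if gene and i in USEFUL)
    return hits - 0.1 * sum(mask)


def run_ga(fitness, n_features, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.05, seed=1):
    """Evolve binary feature masks; return (best_mask, best_fitness_history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]   # truncation selection
        children = [ranked[0][:]]          # elitism: keep the best unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = mother[:cut] + father[cut:]
            else:
                child = mother[:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history


best_mask, fitness_history = run_ga(toy_fitness, n_features=10)
```

In a real study, the fitness function would train one of the AI models on the selected columns and return its validation accuracy. Because the best individual is copied unchanged into each new generation, the best fitness never decreases.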

3.6. Performance Metrics

To select the best predictive modelling approach, eight statistical indicators are used in this study: root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), relative root mean squared error (RRMSE), relative error (RE), mean relative error (MRE), Nash–Sutcliffe efficiency (NSE), and BIAS [64–66]. These measures are computed from the observed and predicted values of PRSC capacity, the average value of PRSC capacity, and the total number of experimental samples. They are commonly used to assess the efficiency of AI models. Error statistics such as RMSE and MAE evaluate the forecasting error between actual and predicted values, whereas measures such as NSE quantify the degree of correlation between the predicted and actual points.
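Several of these indicators can be sketched with their commonly used definitions. The exact expressions in the source were not recoverable, so the forms below are assumptions: RRMSE is taken as RMSE divided by the observed mean, MAPE is returned as a fraction rather than a percentage, and BIAS as the mean of predicted minus observed.

```python
import math


def rmse(obs, pred):
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    # Returned as a fraction; multiply by 100 for a percentage.
    return sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)


def rrmse(obs, pred):
    # Assumption: RMSE normalised by the mean observed value.
    return rmse(obs, pred) / (sum(obs) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def bias(obs, pred):
    # Assumption: positive values indicate over-prediction.
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)
```

For example, with observations `[1, 2, 3, 4]` and predictions that are each one unit too high, RMSE, MAE, and BIAS all equal 1 and NSE equals 0.2.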

4. Results and Discussion

This section first describes the results obtained from the standard AI models using all available data. The second scenario incorporates the genetic algorithm as a tool for selecting the most significant input parameters for the adopted AI models. In both scenarios, the performance of each model is assessed quantitatively, using different statistical criteria, and visually, using different plots and figures.

4.1. First Scenario: Applying the Standard Models

The motivation of the current study is to accurately predict the shear strength of PRSC using different AI modelling approaches, including DLNN, SVR, ANN, and M5 Tree. In this scenario, all the predictive models were established on the ten predictors described in Section 2. The collected experimental samples were divided into two sets when developing the standalone AI models. The majority of the samples (85%) were used for model construction, whereas the rest were used for validation. To evaluate the accuracy of each modelling technique separately, ten statistical metrics were used, including correlation measures and error measures. The results simulated by the four predictive models for both the training and testing stages are given in Table 2. During the training stage, all models except DLNN yielded unpromising accuracy; DLNN provided the best prediction accuracy, with the highest NSE (0.957) and the lowest RMSE (0.047 kN), MAE (0.033 kN), and RMSRE (0.914). However, the testing phase is the most important stage in evaluating the accuracy of predictive models. According to Table 2, the superiority of DLNN over the other AI models is also evident during the testing phase: DLNN generated the most accurate predicted shear strength values, with the lowest RMSE (0.045 kN), MAE (0.020 kN), and RRMSE (0.092) and the highest NSE (0.888).

To evaluate the performance of each developed model during the testing phase more rigorously, several graphical visualizations were produced, including scatter plots, relative error plots, and a Taylor diagram. The scatter plot is an important tool for evaluating the variance between the predicted and the actual shear strength values. Based on Figure 7, the DLNN model presented less scatter and recorded a higher correlation coefficient (R = 0.96) than the other comparable models (R of 0.95, 0.95, and 0.94 for M5Tree, ANN, and SVR, respectively). Moreover, among all AI models, the DLNN approach produced the lowest relative error percentages (see Figure 8). Figure 8 clearly indicates that, except for one sample (sample 12), the relative error of the DLNN predictions is within ±20%, indicating a success rate of 92%, whereas for the other AI methods multiple samples exceed a relative error limit of even 40%.
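The success rate quoted above is the share of testing samples whose relative error falls inside the ±20% band. A minimal sketch of that calculation follows; the sign convention (predicted minus observed, divided by observed) and the sample values are assumptions for illustration, not data from the paper.

```python
def relative_errors(observed, predicted):
    """Signed relative error of each prediction, in percent."""
    return [100.0 * (p - o) / o for o, p in zip(observed, predicted)]


def share_within_band(observed, predicted, band_percent=20.0):
    """Fraction of samples whose absolute relative error is within the band."""
    errors = relative_errors(observed, predicted)
    inside = sum(1 for e in errors if abs(e) <= band_percent)
    return inside / len(errors)


# Hypothetical values: three of four predictions fall inside +/-20%.
rate = share_within_band([100, 100, 100, 100], [110, 95, 130, 100])
```

In this toy case `rate` is 0.75, since only the third prediction (30% too high) falls outside the band.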

For better visual comparison, a Taylor diagram was produced because it summarizes different statistical measures (correlation coefficient and standard deviation) in one figure, thereby facilitating the selection of the most accurate model.

Taylor diagrams are polar plots that present the similarity between observed and predicted data based on the correlation coefficient and standard deviation in a 2D plane. From Figure 9, it is evident that the point corresponding to the DLNN predictions is the closest to the point corresponding to the observed dataset, indicating the best performance by the standalone DLNN model; that is, DLNN generated predicted values that were more accurate and closer to the actual ones. Based on this result, the DLNN model showed better generalization capability than the other AI models during the training and testing phases. Conversely, the SVR approach exhibited the lowest prediction accuracy among all the AI predictive models.
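The geometry of the Taylor diagram rests on the law-of-cosines relation between the centered root mean square difference, the two standard deviations, and the correlation coefficient. The symbols $E'$, $\sigma_p$, $\sigma_o$, and $R$ are assumed notation for the centered RMS difference, the standard deviations of the predicted and observed series, and their correlation.

```latex
E'^{2} = \sigma_p^{2} + \sigma_o^{2} - 2\,\sigma_p\,\sigma_o\,R
```

A model point therefore lies close to the observation point only when its standard deviation matches the observed one and its correlation is high, which is why distance to the reference point serves as a single visual measure of skill.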

4.2. Second Scenario: Applying the Hybrid Models

This section investigates the capability of GA, as a bioinspired algorithm, to assist the four AI models in selecting the best combination of input parameters that significantly affect the PRSC. Although AI models can efficiently learn from the behavior of the datasets, it is essential to minimize model complexity. Thus, any improvement in model performance achieved with a minimal number of input parameters can be considered economical in the modelling process. With this objective, GA was used to develop hybrid models with different AI methods in this study. Accordingly, eight model combinations were developed for each hybrid method (GA-ANN, GA-SVR, GA-DLNN, and GA-M5Tree). These combinations were designated M1 to M8, with the number of input parameters varying from 2 to 9, as presented in Tables 3–6.

The prediction performance of the hybrid models over the training and testing phases is summarized in Tables 7–10. The most remarkable observation is that GA improved the performance of most of the predictive models in comparison with the pure AI models from the first scenario. For instance, the hybrid GA-M5Tree-M8 model recorded better prediction accuracy than the standard M5Tree model, with RMSE and MAE reduced by 8.55% and 3.77%, respectively, during the testing phase. The robustness of GA in selecting the optimal input parameters can be clearly seen in the fact that the GA-M5Tree-M6 model (with seven parameters) achieved slightly higher prediction accuracy than the standard M5Tree model (with ten input variables). Furthermore, GA-ANN-M6 (with 7 input variables) performed better than both the best standalone DLNN and GA-M5Tree-M6. The GA-SVR models were slightly improved in comparison with standard SVR. Moreover, all eight GA-SVR models showed the lowest accuracy, and none of them could outperform the standard DLNN. In addition, the GA-SVR models scored the highest relative error values (ranging from 20.25% to 25.44%) in comparison with the other modelling approaches. Although all GA-SVR models performed with the lowest accuracy among the hybrid models in this scenario, they also showed lower prediction accuracy than the standard SVR models from the first scenario. Therefore, it can be concluded that there was no specific advantage in hybridizing SVR with GA for this problem and dataset. In contrast, the hybridization of GA with DLNN provided better predictions than all the standalone models from the first scenario.

Among the different hybrid DLNN models, the GA-DLNN-M6 model performed very well (with NSE of 0.914, RMSE of 0.039 kN, MAPE of 0.052, and MAE of 0.021 kN). Furthermore, smaller bias values were obtained for GA-DLNN-M6, GA-DLNN-M7, and GA-DLNN-M8 (−0.0004, −0.0004, and −0.01) than for the standalone DLNN model (−0.011). It was evident that GA can improve performance when hybridized with DLNN, ANN, and the M5 Model Tree while using fewer input parameters for this dataset. Moreover, the GA-DLNN-M6 model is considered very simple compared with the other models because it needs fewer input parameters, and it achieves a significant improvement over the standard DLNN: the RMSE and the RMSRE were reduced by 12.62% and 6.06%, respectively, whereas the NSE increased by 2.98%. The superiority of the GA-DLNN model was apparent not only in comparison with the simple models but also in comparison with the other hybrid models (GA-SVR, GA-ANN, and GA-M5Tree). In general, selecting the best input parameters improved the models' performance. These input combinations are considered the most efficient parameters, those that significantly affect the PRSC.

For visual assessment, scatter plots for each hybrid model are shown in Figures 10–13. These figures are important for evaluating the performance of each predictive model. The best model should rely on fewer input variables while producing predicted values that deviate less from the actual ones. The figures show that GA-DLNN-M6 achieved the highest accuracy, with R of 0.96. The other hybrid models were less accurate and, in most cases, required many input parameters to gain slight improvements.

Figure 14 portrays a more concrete and convincing statistical relationship between the forecasted and the actual shear strength using the Taylor diagram. A visual comparison of the four plots shows that the points for the high-end hybrid models M6–M8 lie closer to the point representing the actual data. Among these models, M6 (the 7-input model) lies closest, except in the case of the SVR-based hybrid model. It is also evident that the GA-DLNN-M6 hybrid model produced the predicted values closest to the actual ones. This supports the selection of the model adopted in this study (GA-DLNN-M6), and its assessment is consistent with the other quantitative and visual assessments performed previously. Furthermore, the GA-DLNN-M6 model used fewer input parameters and gave the best accuracy among the comparable AI models, yielding more accurate predictions of PRSC values according to all quantitative and visual assessments. The technique adopted for selecting the most suitable input parameters generally has a significant influence on the performance of predictive models because it removes redundant information and hence supplies sufficient and clean data to the models [67]. Finally, the integration of GA with a deep learning model produced the best model in terms of minimizing forecast errors, according to assessments based on different statistical measures.
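As a reminder of what a Taylor diagram encodes, the sketch below computes the three statistics it places on a single plot: the standard deviations of the observed and predicted series, their correlation coefficient, and the centered root mean square difference. These are linked by the law-of-cosines identity E'^2 = s_o^2 + s_p^2 − 2 s_o s_p R, which is why a model point close to the observation point indicates both high correlation and similar variability. The function name `taylor_statistics` and the sample values are illustrative assumptions, not part of the study.

```python
import numpy as np

def taylor_statistics(observed, predicted):
    """Return (std of observed, std of predicted, correlation, centered RMSD)."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    sigma_obs = obs.std()    # population standard deviation (ddof=0)
    sigma_pred = pred.std()
    r = np.corrcoef(obs, pred)[0, 1]
    # Centered RMS difference: means are removed before comparing the series
    crmsd = np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2))
    return sigma_obs, sigma_pred, r, crmsd

# Illustrative values only (not taken from the study)
obs = np.array([0.20, 0.35, 0.50, 0.65, 0.80])
pred = np.array([0.22, 0.33, 0.52, 0.63, 0.82])

s_o, s_p, r, e = taylor_statistics(obs, pred)
# Check the geometric identity that underlies the diagram
identity_rhs = s_o ** 2 + s_p ** 2 - 2.0 * s_o * s_p * r
print(np.isclose(e ** 2, identity_rhs))
```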

In this study, a novel hybrid modelling framework for Perfobond rib connectors based on several AI methods has been presented. A rigorous sensitivity analysis was performed by considering eight different combination models, in which the input parameters were optimally selected by GA and four AI methods (DLNN, M5Tree, ANN, and SVR) served as prediction tools. GA was very successful in selecting the optimal number and combination of predictors, which considerably reduced model complexity. Increasing the number of input parameters alone does not improve the predictive power of AI models; instead, identifying an appropriate set of predictors is very important. The DLNN displayed excellent generalization capability in capturing the nonlinear relationships between the candidate variables and PRSC. Hybridizing DLNN with GA identified the 7-input model (M6) as the best for shear strength prediction; that is, it found the model with the fewest input parameters that still offered excellent prediction skill for the shear strength of PRSCs. A multitude of statistical performance measures and graphical representations confirmed the robustness of the GA-DLNN hybrid model for predicting the shear strength of Perfobond rib connectors. This approach could help address many complex problems in structural engineering.
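The idea of using GA as an input selector can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: each candidate is a binary mask over the ten inputs, an ordinary least-squares model stands in for the DLNN, and the fitness is the validation RMSE plus a small, arbitrarily chosen penalty per selected input. The GA operators (elitism, one-point crossover, bit-flip mutation), the function names `fitness` and `ga_select`, and all settings are hypothetical choices, and the data are synthetic.

```python
import numpy as np

def fitness(mask, X_tr, y_tr, X_va, y_va, penalty=1e-3):
    """Validation RMSE of a least-squares fit on the selected columns,
    plus a small penalty per selected input (lower is better)."""
    if not mask.any():
        return np.inf
    A_tr = np.column_stack([X_tr[:, mask], np.ones(len(X_tr))])
    A_va = np.column_stack([X_va[:, mask], np.ones(len(X_va))])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    rmse = np.sqrt(np.mean((A_va @ coef - y_va) ** 2))
    return rmse + penalty * mask.sum()

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=20, generations=30,
              mutation_rate=0.1, seed=0):
    """Search binary input masks with elitism, crossover, and mutation."""
    rng = np.random.default_rng(seed)
    n = X_tr.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    pop[0] = True  # start with the all-inputs model in the population

    def score(mask):
        return fitness(mask, X_tr, y_tr, X_va, y_va)

    for _ in range(generations):
        scores = np.array([score(m) for m in pop])
        pop = pop[np.argsort(scores)]      # best candidates first
        children = [pop[0].copy()]         # elitism: keep the current best
        while len(children) < pop_size:
            # Parents are drawn from the better half of the population
            i, j = rng.integers(0, pop_size // 2, size=2)
            cut = rng.integers(1, n)       # one-point crossover position
            child = np.concatenate([pop[i][:cut], pop[j][cut:]])
            flip = rng.random(n) < mutation_rate
            children.append(child ^ flip)  # bit-flip mutation
        pop = np.array(children)
    scores = np.array([score(m) for m in pop])
    return pop[np.argmin(scores)], float(scores.min())

# Synthetic data: 90 samples, 10 inputs, only inputs 0, 3, and 7 matter
rng = np.random.default_rng(42)
X = rng.normal(size=(90, 10))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * X[:, 7] + rng.normal(scale=0.1, size=90)
X_tr, X_va, y_tr, y_va = X[:60], X[60:], y[:60], y[60:]

best_mask, best_score = ga_select(X_tr, y_tr, X_va, y_va)
print("selected inputs:", np.flatnonzero(best_mask))
```

Because the all-inputs mask is placed in the initial population and the best candidate is always carried forward, the selected subset can never score worse than the full ten-input model under this fitness, which mirrors the goal of matching or improving accuracy with fewer predictors.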

4.3. Validating the Proposed Model against Several Models Conducted in the Previous Research Studies

It is important to validate the reliability and accuracy of the suggested GA-DLNN model in predicting PRSC capacity against recognized studies in the literature. Herein, the results obtained by the GA-DLNN model over the testing phase are compared with several predictive models described in the literature. Allahyari et al. [16] developed several ANN-based models trained with the Bayesian Regularization (BR) backpropagation algorithm to predict the capacity of PRSC. Their main challenge was to properly select the best input combinations, for which they adopted classical and statistical methods. Such approaches may select redundant information and decrease the efficiency of the predictive models. In contrast, this study found that the best model was GA-DLNN with 7 input parameters; therefore, we compared the proposed GA-DLNN model with several models established in [16] that depend on 7 to 10 parameters (i.e., 7-BR1 and 8-BR1). The comparison, shown in Figure 15, reveals that the model proposed in this study outperformed the comparable models. Moreover, Oguejiofor and Hosain [2] used empirical models to predict the shear strength capacity and obtained good prediction accuracy, with R² of 0.8577, as shown in Figure 15. According to these comparative analyses, the model adopted in this study shows better prediction performance and a higher R² than the comparable models.

5. Conclusions

Accurate prediction of Perfobond rib shear strength is very important in structural engineering. In this investigation, four AI approaches (ANN, SVR, DLNN, and M5Tree) were developed using 90 experimental samples collected from previous studies. Ten input parameters were used: concrete compressive strength, area of concrete dowels, rib holes number (n), area of cross reinforcement bars and yield stress of reinforcement bars, area of cross reinforcement bars and the tensile strength of cross rebar, area of the connector at the end-bearing zone, the ratio of concrete slab thickness to connector height, connector height, the contact area between the connector and the concrete, and coefficient of end-bearing force. The simulation results revealed that the DLNN model achieved high prediction accuracy and outperformed the comparable models based on ten input variables. Moreover, the SVR models gave the worst predictions during both the training and the testing phases. To reduce the number of input parameters, we hybridized the AI models with a bioinspired optimization approach, the genetic algorithm (GA), to select the best input parameters for each AI model separately. Optimal selection of input combinations can effectively reduce model complexity, thereby improving generalization capability, decreasing computational cost, increasing the quality of predictions, and removing redundant information. The results showed that hybridizing AI models with GA markedly improved the prediction accuracy of the hybrid models (GA-ANN, GA-M5Tree, and GA-DLNN), with the exception of the GA-SVR model. Moreover, GA-DLNN produced higher accuracy than the other hybrid modelling approaches.
The results revealed that the GA-DLNN model required only 7 input parameters to generate the most accurate predictions in comparison with the other hybrid models and with the classical AI models (DLNN, ANN, SVR, and M5Tree), which were developed using ten input parameters. Additionally, the outcomes of this study showed that removing three input parameters (area of the connector at the end-bearing zone, connector height, and concrete slab thickness) improved the prediction accuracy of GA-DLNN. The notable observation in this study is that it is possible to accurately predict the Perfobond rib shear strength with fewer input parameters. This study found that the proper selection of input parameters has a great influence on the performance of AI models. Accordingly, future studies are recommended to make wider use of GA as a feature selection tool and to combine it with different AI models to address the most difficult problems in structural and material engineering.

Abbreviations

PRSC:Perfobond rib shear strength connector
DLNN:Deep learning neural network
SVR:Support vector regression
ANN:Artificial neural network
GA:Genetic algorithm
FE:Finite element method
GP:Genetic programming
ANFIS:Adaptive neuro-fuzzy inference system
LR:Linear regression
:Concrete compressive strength
:Area of concrete dowels
(n):Rib holes number
:Area of cross reinforcement bars and yield stress of reinforcement bars
:Area of cross reinforcement bars and the tensile strength of cross rebar
:Area of the connector at the end-bearing zone
:The ratio between the thickness of the concrete slab to connector height
:Connector height
:The contact area between the connector and concrete
:Coefficient of end-bearing force
(MT):Model tree
PART:Partial and regression tree
SDR:Standard deviation reduction
RMSE:Root mean square error
MAE:Mean absolute error
MAPE:Mean absolute percentage error
RRMSE:Relative root mean squared error
RE:Relative error
MRE:Mean relative error
NSE:Nash–Sutcliffe efficiency.

Data Availability

The data used to support the findings of the study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.