Abstract

Evapotranspiration represents the water requirement of plants during their growing season, and its accurate measurement at the farm is essential for agricultural water planners and managers. Field measurements of evapotranspiration have always been associated with many difficulties that have led researchers to seek a way to remotely measure this component in horticultural and agricultural areas. This study aims to investigate an indirect approach for daily rice crop evapotranspiration (ETc) measurement by machine learning (ML) techniques and the least available climatic variables. For this purpose, daily meteorological variables were obtained from three ground meteorological stations in rice cultivation regions of northern Iran during 2003–2016. The ETc rates were calculated by seven meteorological variables, the FAO-56 Penman-Monteith equation, and the regional calibrated rice crop coefficient and considered as the reference data. The MLs, including Multilayer Perceptron (MLP), Radial Basis Function (RBF), Generalized Regression Neural Network (GRNN), and Group Method of Data Handling (GMDH), were utilized for ETc modeling. Different input combinations were applied, based on the use of minimum effective variables as input. Results showed that the models showed the most accurate performances in the input combination of four climatic variables: sunshine duration, maximum temperature, relative humidity, and wind speed. Investigating the accuracy of models in rice growth phases showed that the least estimation error belonged to the initial growing stage, which increased during the mid-season and late-season growing stages. A comparison of the models showed that the GMDH model performed better against the other competitors. For this model, both the Nash-Sutcliffe (NS) coefficient and R2 were greater than 0.98, and the Root Mean Square Error (RMSE) ranged between 0.214 and 0.234 mm per day in all stations. The current approach showed promising results in rice evapotranspiration modeling by only four common meteorological variables and can be reliably applied for indirect measurement of this variable over the rice farms. The studied approach will have research value for rice and other crops in similar/different climatic conditions.

1. Introduction

Evapotranspiration is one of the key components in the hydrological cycle. This variable is defined at the field level, which indicates the total water output from the soil and plants and represents the water required by a plant during its growing season. Evapotranspiration is affected by climatological and meteorological factors in a region, such as solar radiation, temperature, humidity, and teleconnection patterns. It is also one of the most important variables in studying dryness stress of the plants and monitoring agricultural drought conditions, which are used in the calculations of drought indices like the Standardized Precipitation-Evapotranspiration Index and Palmer Drought Severity Index [1, 2]. Therefore, its accurate measurement at the farm level is crucial and necessary for farmers and managers, as well as planners of agricultural water resources. Its value can be measured by a lysimeter embedded in a field. The lysimeter is sensitive in terms of accuracy so a full-time technician specialized in the field is needed to eliminate the errors and provide accurate measurements of evapotranspiration. It makes working with the lysimeter extremely difficult, and researchers turned to numerical models to solve this problem. These models can estimate the reference crop evapotranspiration (ET0), which refers to the evapotranspiration of grass plants, using meteorological variables such as temperature, humidity, and wind speed. The evapotranspiration of a particular product (ETc) is obtained by multiplying ET0 by the plant coefficient of that plant. Several numerical models such as Hargreaves-Samani, Blaney-Criddle, Thornthwaite, and FAO-56 Penman-Monteith (FAO-56 PM) have been introduced for this purpose, among which the World Meteorological Organization (WMO) model FAO-56 PM can be recommended as the most suitable and a reliable alternative to lysimeter data [3]. Outputs of the FAO-56 PM method are now completely acceptable to be considered as observational ET0 rates in ET0 modeling and generally ET0 related studies [413].

In recent years, machine learning (ML) models have also been used to estimate and predict evapotranspiration. These models are also known as black-box models that can discover complex numerical relationships between input-target variables with high accuracy. In a comparative study in California, Kumar et al. [14] showed that artificial intelligence models such as the Multilayer Perceptron (MLP) can estimate true daily ET0 data (lysimeter) even more accurately than the FAO-56 PM model. The MLP model has also been compared with FAO-56 PM in estimating the daily evapotranspiration of barley crops in an agricultural research station at Shiraz University, Iran [15]. The results were validated by weighing lysimeter and, due to the accuracy and also adaptability with the limitation of climatic data, the superiority of MLP’s performance was reported. Antonopoulos et al. [16] investigated the MLP model in estimating daily evapotranspiration in Greece. They also used FAO-56 PM outputs as a basis and showed that the MLP model has a reliable performance in estimating ET0 with minimal variables. In southeastern Australia, Falamarzi et al. [17] used the FAO-56 PM model as a basis for measurement and they examined the performance of the MLP model in ET0 estimation. They showed that this model is only able to provide an accurate estimate of ET0 with temperature and wind speed variables. Artificial intelligence models such as Support Vector Machine (SVM) were also used to estimate evapotranspiration. This model provided an acceptable estimate of ET0 on a daily scale with the minimum meteorological variables in estimating daily ET0 in central [18], south [19], and southeast China [20], as well as in Florida in USA [21] and in Turkey [22]. SVM also provided good results in monthly ET0 long-term prediction [23]. Using the MLP, Adaptive Neurofuzzy Inference System (ANFIS), and Support Vector Regression (SVR) models, Nourani et al. [24] performed the same work for neighboring countries in three continents of Europe, Asia, and Africa. They also reported the mentioned models as valid.

ET0 estimation by MLs has also been welcomed in Iran. The commonly utilized models in this way were SVM, SVR, Multivariate Adaptive Regression Splines (MARS), and Gene Expression Programming (GEP). Mehdizedeh [25] used MARS and GEP for this purpose in the semiarid, arid, and hyperarid climates of Iran. Mehdizedeh showed that the mentioned models in these climates have acceptable performances in estimating ET0. In similar areas (in terms of climate) with previous research in Iran, Mohammadi and Mehdizadeh [26] also used ML models such as SVR. They calculated the ET0 by the FAO-56 PM model and considered it as a measurement basis. The results revealed that this model could also estimate ET0 with very high accuracy with minimum inputs (including mean temperature, minimum temperature, solar radiation, and wind velocity). Ahmadi et al. [27] also estimated monthly ET0 in arid areas of Iran. They also found ML models to be appropriate in estimating monthly ET0 using minimum inputs. Generalized Regression Neural Network (GRNN) and Radial Basis Function (RBF) neural network are two other MLs that are successfully used for ET0 estimation in different areas around the world. GRNN was developed for Turkey [28, 29], China [30, 31], United States [32, 33], and Algeria [34]. RBF has also been utilized in United States [33, 35, 36], Italy [37], Algeria [34], Turkey [28], India [38], and Serbia [39]. But both have rarely been used in ET0 estimation of Iranian climates; GRNN has not been used and RBF was only reported in a study by Hashemi and Sepaskhah [15]. Some of the previously studied cases in evapotranspiration estimation are illustrated in Table 1.

As shown in Table 1, evapotranspiration has been modeled by artificial intelligence models in different parts of the world. The similarity of these studies is that they used meteorological variables as estimator input, and all of them evaluated these models as efficient in this regard. Also, in most such studies, the models were able to estimate the ET0 variable and have not focused on a special crop’s evapotranspiration. In the literature, there were only two evapotranspiration estimation studies of the crops barley [15] and potato [46], which were examined in Iran and Italy, respectively. In this case, the models MLP, RBF, and GRNN were the most commonly used cases.

Group Method of Data Handling (GMDH) neural network is a polynomial-based ML model, which can solve complex nonlinear function approximation problems. In hydrometeorological studies, GMDH has been well used to estimate and predict the variables such as radiation [47], drought indices [4852], river flow [53, 54], and snow [55, 56], but it has very rarely been used in evapotranspiration estimation studies. The current study aims to examine the performance of the GMDH model for the first time in a special crop’s evapotranspiration estimation case and compare it with the commonly used models such as MLP, RBF, and GRNN. Investigations indicate that ET0 estimation in Iran is mostly done in arid, semiarid, and hyperarid areas. The northern part of Iran alongside the Caspian Sea has a humid climate, which is very important in terms of the cultivation of high-quality rice. So, the present study is an attempt to evaluate the mentioned ML types in estimating the daily evapotranspiration rates of this region. As another novelty, this study focuses on the evapotranspiration of the rice crop, which is a plant with high water requirement and also the main crop with a high cultivation area in the mentioned region. In fact, this study tries to find a numerical approach, for the indirect measurement of the evapotranspiration variable, on the arable lands under rice cultivation.

2. Materials and Methods

2.1. Study Area and Data

This study focuses on the southern areas of the Caspian Sea. Based on the extended De Martonne classification [57], these areas are in the moderate temperature class and the perhumid, humid, semihumid, and Mediterranean rainfall classes. This area in Iran includes three provinces of Gilan, Mazandaran, and Golestan, where most of the agricultural lands are under extensive cultivation of different rice cultivars. To calculate the reference evapotranspiration and then the rice evapotranspiration in these areas, three cities that are capitals of these three provinces were studied. Figure 1 shows the location of this area and the cities studied in it.

The data studied in this study on a daily scale during 2003–2018 were received from the Iranian Meteorological Organization (IRIMO). These data include 10 variables: minimum air temperature (Tmin), maximum air temperature (Tmax), mean air temperature (T), minimum relative humidity (RHmin), maximum relative humidity (RHmax), mean relative humidity (RH), sunshine duration (SSD), precipitation (P), wind speed at a height of 2 meters (U2), and pan evaporation (Epan). The ET0 value on a daily scale was calculated using the FAO-56 PM equation and the variables of Tmin, Tmax, RHmin, RHmax, SSD, U2, and P. Evapotranspiration package in R software was used for this calculation. Then, the growing period of rice in this area, from May to August, was separated from the whole period. Evapotranspiration rates of rice (ETc) at different stages of its growth were calculated according to the rice crop coefficients. For this, the FAO recommended rice crop coefficients and also locally calibrated coefficients exist, which are shown in Figure 2.

The local coefficients of rice are extracted from previously studied cases in southern Caspian Sea regions [5860]. This figure illustrates that the FAO recommended rice crop coefficient in the initial stage of the growth period (May) is within the range of local calibrated coefficients and can be acceptable for this stage. But, for both other stages, the mid-season stage (June-July) and late-season stage (August), the FAO coefficients are far from the local coefficients and significantly smaller, especially for late-season stage. Therefore, using the average value of local calibrated rice crop coefficients to calculate ETc in this region, which is equal to 1.024, 1.390, and 1.094 for initial, mid-season, and late-season stages, respectively, was decided. For modeling, the input-target samples were divided into training and test sections, which include the initial 75% (rice growth periods during 2003–2014) and the final 25% (rice growth periods during 2015–2018) of the study period, respectively. Table 2 presents the statistical characteristics of all data used.

2.2. Machine Learning Models
2.2.1. Multilayer Perceptron (MLP)

This model is the most widely used model of artificial intelligence in the area of numerical modeling in all sciences. These networks typically include a set of sensory units (basic neurons) consisting of an input layer, one to several hidden layers, and an output layer. This method creates a nonlinear mapping between the input-target samples. Input signals from the input layer to the output layer are spread in a forward direction [61]. Along this direction, the desired transfer function is applied to the input variables, and weight () and bias (b) are multiplied by it in each layer. Finally, after optimizing the objective function, the output variables are extracted from the last layer with a linear transfer function. Figure 3 shows a schematic structure of the steps of this model.

To implement this model, different transfer functions such as hyperbolic tangent sigmoid (tansig), logarithm sigmoid (logsig), saturating linear (satlin), and linear (purelin) were examined and tested. The Levenberg-Marquardt (LM) algorithm was also used to train this model. See [62] for more information on the details and mathematical equations of this method.

2.2.2. Radial Basis Function (RBF)

The RBF model, like the MLP, has input, hidden, and output layers. The input layer receives and collects the data and then formulates the input vector X. The hidden layer consists of L number of nodes that apply nonlinear transformations to the input vector and the output layer receives the final response. The output of RBF network is formed by a linear combination of hidden layer responses, defined as the following equation [63]:where ‖-‖ is Euclidean distance norm, k is the maximum hidden layer neuron, ∅j is the response of neuron j in the hidden layer, is the output weight, is the input vector, yj is the output of the output neuron j, is the center and S is the number of output neurons. Figure 4 shows the schematic structure of this model.

In this network, in the hidden layer, Gaussian radius transfer functions have been used, which are an example of the most widely used radial functions in engineering problems. Also, in the output layer, the linear transfer function has been used. The Gaussian radial transfer function is defined as [64]

In this equation, σi is the width i of neuron in the hidden layer, which is the same as the spread parameter. This parameter is defined as follows [65]:

In the above equation, ci and are the centers of radial functions and p is the number of centers of RBF. The parameters of the RBF model include spread and maximum hidden layer neurons, which are optimized by the trial-and-error method in the current study.

2.2.3. Generalized Regression Neural Network (GRNN)

GRNN model was introduced by Specht [66] as a tool for numerical modeling and approximation of functions. The structure of this model is based on the theory of core regression. From this point of view, this network is equivalent to a nonlinear regression relation and does not require repeated training. GRNN is similar to the RBF neural network in other respects, except for its training process in the second layer. Figure 5 shows the general structure of this model.

In short, GRNN has 4 layers: input layer, pattern layer, summation layer, and output layer. The input layer consists of input vectors that have m dimensions. The pattern layer has n dimensions and performs calculations related to the Gaussian transfer function. The summation layer is the sum of n dimensions of the pattern layer, and finally the output layer yields the model outputs [67].

The GRNN model output is obtained using the following equation:

In the above equation, y is the output of the model and σ is the spread parameter. Di is also a scalar function obtained from the following equation:where x is the input corresponding to y and xi is the input corresponding to yi. The only parameter of the GRNN model is the spread parameter, which is optimized by trial-and-error method in the present study.

2.2.4. Group Method of Data Handling (GMDH)

In general, Volterra-Kolmogorov-Gabor (VKG) polynomial can be used to model complex systems that include a set of several input variables and one target variable:where represents the input vectors, y represents the output vectors of the model, and are polynomial coefficients. VKG polynomials are approximated using quadratic polynomials. These quadratic polynomials are based on binary combinations of network inputs [68]. The GMDH algorithm was also introduced using this idea as a learning method to approximate complex [69]. The GMDH neural network has a multilayer structure and feedforward and consists of a set of neurons that are created by linking different input pairs by a quadratic polynomial. Figure 6 shows the three-layer structure of this model.

In this network, each layer consists of one or more processor units, each of which has two inputs and one output. These units practically play the role of the components of the model and are assumed as a quadratic polynomial:

The unknown parameters of the GMDH algorithm are the polynomial coefficients of the above equation. To calculate the output value for each input vector , the sum of squares of the error should be minimized.

To find the minimum error, the partial derivative of the above equation is used. By placing equation (7) in this partial derivative, a matrix equation (Aa = y) is obtained. In this equation, , and and matrix A is as follows:

One method to solve this matrix equation (Aa = y) is to use the Singular Value Decomposition (SVD) method. If the SVD method is used, will be calculated using the following equation:

In this equation, is transposed into matrix A. Using this solution method, unknown can be computed under any circumstances. If matrix () is not invertible, the Tikhonov method will be used to solve the equation. Its main reference [69] is suggested for complete information on the details of this model. The adjusting screws of this model include the number of layers and the number of neurons in the layers, which were optimized by trial-and-error method in this study.

2.3. Measuring the Accuracy of Estimates

Performance evaluation criteria are used to evaluate the accuracy of the estimates provided by the models. These criteria require observational data and simulated data for calculation. The evaluation criteria used in this study include Root Mean Square Error (RMSE), Nash-Sutcliffe (NS), coefficient of determination (R2), Normalized Root Mean Square Error (NRMSE), and Mean Absolute Error (MAE), and the equations of these criteria are described as follows:

In the above equations, Oi is the observed value on day i; Ei is the estimated value on day i; is the mean of observed values; is the mean of estimated values, Omax and Omin are maximum and minimum observational data, respectively, and n is the number of days under study. The closer the values of RMSE, NRMSE, and MAE are to zero and the closer the values of NS and R2 are to one, the more accurate the model estimates are. The NRMSE also has qualitative classes of model accuracy. According to it, if the NRMSE value is greater than 0.3, the performance of the model is considered poor; if its value is between 0.2 and 0.3, the performance of the model is considered moderate; if its value is between 0.1 and 0.2, the performance of the model is considered good, and if its value is less than 0.1, the performance of the model is considered excellent [53, 70].

In this study, coding in MATLAB software was used to implement machine learning models. Minitab and Excel software were used to evaluate models and draw graphs. Figure 7 provides a general flowchart of evapotranspiration estimation processes and evaluation of current models.

3. Results

3.1. Modeling and Evaluation

In this section, after calculating the evapotranspiration of rice crop (ETc) in the growing period (May to August), the model inputs should be determined first for estimation. For this purpose, a correlation matrix between ETc and 10 meteorological variables was used, the results of which are presented in Figure 8.

To investigate the relations between the variables (Figure 8), the Spearman correlation test was used. The results of this test show that there is a significant correlation between ET0 and all meteorological variables at the level of 0.01 in all three stations. However, the use of all variables can be incorrect in two ways: (1) There is similarity to the FAO-56 PM model. (2) Some inputs (like Tmin or U2) have poor correlation despite the significant correlation, which ultimately leads to reduced model accuracy. Therefore, in this section, we decided to reduce the number of input variables. The FAO-56 PM model used 7 variables (Tmin, Tmax, RHmin, RHmax, SSD, P, and U2) to calculate ET0. Therefore, a maximum of four variables are reasonable to be considered as input combinations to evaluate the ML models under the condition of limited climatic data availability. The highest correlation belongs to the SSD variable, which is selected as a fixed input. Among the temperature variables and among the relative humidity variables, the Tmax variable and the RH variable had the highest correlation with ETc in all three stations. Therefore, these two variables are considered as fixed input along with SSD and form the input scenario of S1. The three variables of P, U2, and Epan are also added separately to S1 and form the other 4 scenarios. These scenarios are shown in Table 3.

The identified input scenarios (Table 3) for each model in each station were tested during the growing period of rice. In each scenario from each station, the parameters of the models were optimized by trial-and-error method and the best arrangement of each model was selected by RMSE and NS criteria. The optimal parameters of the models as well as the evaluation results of the estimates are shown in Table 4.

Since the validity of a model is determined in the test section, this section also deals with the testing phase. Table 4 shows that the NS value in the models is above 0.84, indicating that the accuracy of the models used in estimating rice evapotranspiration is very high. The most accurate performance at the Rasht station belongs to scenario S3 and the GMDH model, which achieved RMSE = 0.220 and NS = 0.986 by arranging 5 layers and 6, 15, 50, 50, and 1 neuron in layers 1 to 5. The poorest performance of this station belongs to the GRNN model in scenario S1, which reached RMSE = 0.602 and NS = 0.894 with spread = 2.95. In Gorgan and Sari stations, the weakest and most accurate models were similar to those in the Rasht station (GRNN and GMDH, respectively).

The evaluation criteria for these two cities are as follows. For Sari station, the strongest performance was RMSE = 0.214 and NS = 0.986 (GMDH, S3) and the poorest performance was RMSE = 0.641 and NS = 0.878 (GRNN, S4). For Gorgan station, the strongest performance was RMSE = 0.234 and NS = 0.990 (GMDH, S3) and the poorest performance was RMSE = 0.898 and NS = 0.847 (GRNN, S1).

Among the scenarios used, in most cases, scenario S3 provided the best performance, and scenario S1 provided the poorest performance. Radar charts were used to examine this issue graphically (Figure 9).

In Figure 9, the overestimated and underestimated days for each model were separated in each of the 16 input scenarios (in accordance with Table 3). Then, the MAE value was calculated separately for them and this was done for all three stations. In general, it is clear that the estimates provided by the models were more in the underestimation state than in the overestimation state. Also, the lowest estimation errors occurred in all four models for scenario S3, which includes SSD, RH, Tmax, and U2 inputs. Another important point is that, in this scenario (S3), the underestimation and overestimation of the models are at the same size. For example, in examining the different scenarios of the MLP model for the Gorgan station, the MAE in the MLP2 (S2) scenario overestimation is equal to 0.422 and in the underestimation state it is 0.613 . In the same station, in the MLP3 (S3) scenario, the overestimation is equal to MAE = 0.197 and the underestimation is equal to MAE = 0.173 . This issue is clearly seen for scenario S3 in all three stations and all the models used, indicating that the models are more balanced with this input combination (SSD, RH, Tmax, and U2) and are better compatible with it.

3.2. Comparing the Models

Scatter plots are used to examine the correlation between ETc estimates and observations (Figure 10). In this chart, the output of the best scenario from each model (scenario S3) at each station was selected and graphically plotted against the observed ETc values.

As the values of R2 indicate (R2 >  95%), all models had a very good performance in ETc estimation. In other words, models tested recently in this area (GRNN, RBF, and GMDH) are also considered suitable for this purpose. A very small slope between the 1 : 1 line and the fitted estimates-observation regression line also confirms this issue. Comparison between models shows that, in all stations, the GMDH model, with a slight advantage over MLP, is recognized as the best estimator of rice evapotranspiration. The R2 value for the GMDH model in Rasht, Sari, and Gorgan stations is 98.76%, 98.83%, and 99.04%, respectively. Among the utilized models, the poorest correlation belongs to GRNN outputs, as it has the lowest value of R2 in all three stations (98.06% for Rasht, 95.02% for Sari, and 97.56% for Gorgan). Also, the scatter of the points around the fitted regression line is relatively weak in the GRNN model. For example, in the Sari station, the points on the GRNN’s graph are more scattered around their regression line, but they are much more concentrated in the MLP and GMDH models.

3.3. Investigating the Accuracy of Models during Different Phases of Rice Growth

In this section, the vegetative phases of the rice plant were separated from each other in the test period, and then the value of NRMSE was calculated between the observed ETc of each station and the estimates of the models in each phase separately. Figure 11 presents the results as area charts.

As shown in area charts, the NRMSE is less than 0.1 in states, indicating that the quality of performance of the models is excellent. Examination of the trend of errors during the growing period of rice shows that the least number of errors occurs in the initial phase, and the accuracy of estimates is somewhat reduced in mid-season phase and late-season phase, respectively. In all rice vegetative phases, GRNN and RBF models are relatively weaker and this is confirmed in all three stations. In the initial phase, especially in Rasht and Sari stations, the GMDH model has the lowest normalized error (0.023 for Rasht and 0.017 for Sari) and is in a better condition compared to MLP (0.030 for Rasht and 0.024 for Sari).

In addition, the superiority of the GMDH model over the others is confirmed with a slight difference in the mid-season stage and late-season stage. To obtain a better understanding of the accuracy of the models during the growing period, one of the rice growing periods in 2018 as a sample is examined graphically (Figure 12). In this section, the most accurate (GMDH) and least accurate (GRNN) outputs of the model, along with their observational values, are shown on time series plots. In this chart, the intensity and weakness of the estimates-observation overlap are clear and the performances of the two models are comparable.

4. Discussion

Several studies have investigated the accuracy of machine learning models in evapotranspiration modeling [16, 17, 2527, 71]. All of these studies have found MLs suitable for ET0 modeling with minimum input variables, so their results are consistent with those of this study. Mohammadi and Mehdizadeh [26] used the SVR AI model in combination with the Whale Optimization Algorithm (WOA) and reached an average RMSE = 0.265 at their best input scenario. However, in this study, the GMDH model, which is a nonhybrid model, could reach an average accuracy of RMSE = 0.222 . In investigating MLP in Greece, Antonopoulos and Antonopoulos [16] reached a maximum accuracy of RMSE = 0.574 . Also, in investigating MLP hybrid model in Australia, Falamarzi et al. [17] reached the maximum accuracy of RMSE = 1.03 . Comparing these two studies [16, 17] with this study suggests that the results of MLP in the present study are evaluated with much greater accuracy (0.222 < RMSE < 0.232 ). The RBF model was used in combination with Particle Swarm Optimization (PSO) in a study conducted by Petković et al. [71] in Serbia in which R2 = 97.13% was reached. However, in this study, the nonhybrid RBF model showed better performance (R2 = 98.62%). Its reason can be attributed to the study plant. In all of the above studies, the study was based on ET0, which refers to the evapotranspiration from the grass surface. However, this study has focused on the evapotranspiration of rice. Moreover, the reason for this difference can be attributed to climatic differences of Iran with regions such as Greece, Serbia, and Australia. Even in the study conducted by Mohammadi and Mehdizadeh [26] in Iran, the study areas had semiarid, arid, and hyperarid climates of western and southwestern Iran, while this study was done on humid areas of northern Iran.

The reason for the difference between the studied models in terms of accuracy can also be attributed to the number of their parameters. For example, in the GMDH model, the parameters of the number of layers and the number of neurons in each layer can create multiple arrangements for this network. In MLP, in addition to these parameters, the transfer function parameter is also involved. Such models will have higher maneuverability in optimization and can discover complex relationships between input-target samples over a wider space. However, the GRNN model has only one parameter, and the RBF has two parameters for optimization, so the structural space is more limited for them, which can cause these models to achieve less accurate estimates of this variable.

In estimation cases of specific crops’ evapotranspiration by ML models, rice has not been studied comprehensively, but there are two studies on potato and barley that will be discussed in continuation. Yamaç and Todorovic [46] in Italy and Hashemi and Sepaskhah [15] in Iran have used MLP and RBF neural networks in evapotranspiration estimation of potato and barley crops, respectively. Similar to this study, they used climatic variables as the models’ input. The mentioned models in the study by Hashemi and Sepaskhah [15] reached 0.26 < RMSE < 0.31 and 0.91 < R2 < 0.93, and in the study by Yamaç and Todorovic [46] they reached RMSE = 0.24 and R2 = 0.98. Comparison of these studies with this study shows that these models are capable for all three crops, and they can be reliably used for indirect measurement of the crops’ evapotranspiration. However, there is a slight difference in accuracy among them, which can be due to the water needs of the crop during the growing season or may be influenced by the geographical and climatic conditions of the areas under study.

5. Conclusion

Performance evaluation of the GMDH model, which was used for the first time in evapotranspiration estimation studies, shows the excellent accuracy of this model. Models such as the GRNN and RBF were also new for this region and had acceptable performance, but they could not compete with the powerful and conventional model of MLP. However, comparing MLP with GMDH showed that GMDH was slightly superior to MLP in all three stations studied. Therefore, this study proposes GMDH to estimate evapotranspiration by least climatic inputs. On the other hand, this study’s approach was estimating evapotranspiration of rice crop, which yielded very promising results. Thus, estimating the daily evapotranspiration of this plant in rice cultivated areas and also other agricultural products in their unique areas has research value. Also, the use of optimization algorithms (e.g., genetic algorithm, firefly, and particle swarm) in combination with models such as GMDH can significantly increase the accuracy of these models, which is recommended to future researchers in the area of evapotranspiration modeling. However, using lysimeter data in rice fields is recommended to validate the estimates provided to gain a more reliable understanding of artificial intelligence models’ accuracy. The current approach is also valuable for remote measurement of rice (or other crops), without the need for field measurements at the farm level, from the meteorological factors that are easily measurable.

Data Availability

The datasets used and/or analyzed during this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

The design of the study and collection, analysis, and interpretation of data and writing of the manuscript are done by the authors.