BY 4.0 license Open Access Published by De Gruyter February 17, 2022

On the correction of errors in English grammar by deep learning

  • Yanghui Zhong and Xiaorui Yue

Abstract

Using computer programs to correct English grammar can improve the efficiency and quality of error correction and reduce the workload of manual correction. To address the loss-evaluation mismatch in the current mainstream machine translation approach, this study applies deep learning to build a model with high error correction performance. A confrontation (adversarial) learning framework is introduced, in which the model parameters are continuously optimized through the confrontation training of a discriminator and a generator. At the same time, a convolutional neural network is introduced to improve training, so that the generator produces better corrected sentences during confrontation. To verify the performance of the model, the P value, R value, F0.5 value, and MRR value were selected as evaluation indexes. Simulation results on the CoNLL-2014 test set and the Lang-8 test set show that the proposed model significantly outperforms the traditional transformer method and improves the fluency of the corrected sentences. It has good application value.

1 Introduction

English is currently one of the most widely used languages in the world, but for many of its users it is not their mother tongue. Because language habits and cultural backgrounds differ across regions, various grammatical errors often occur when writing and using English [1]. These errors cause considerable trouble for both writers and readers. Relying solely on manual correction, however, involves a heavy workload and can hardly meet the needs of English use in the information age [2]. Traditional English translation software has very limited grammar correction ability, so an efficient technology that can both identify and correct English grammatical errors needs to be developed. Automatic correction by computer programs offers high efficiency and accuracy, and can free many language teachers from simple grammar correction so that they can focus on more difficult and complex teaching tasks [3].

Automatic English grammar error correction uses computer programs to find and correct errors in text, which is of great practical significance. It can greatly reduce teaching costs and can even act as a teacher in many settings. An ideal grammar correction system would recognize all kinds of grammatical errors in a sentence. In recent years, neural network technology has achieved success in natural language processing, which makes it possible to apply it to English grammar correction. English grammar correction has gone through three stages overall: the first was based on hand-written rules, the second on statistical classifiers, and the third, the current mainstream, on machine translation (MT) [4]. Typical MT techniques suffer from problems such as exposure bias and loss-evaluation mismatch, which seriously limit the performance of English grammar correction models. Therefore, a confrontation learning model is proposed to address these problems and improve the performance of the English grammar correction model.

2 The research status of English grammar error correction

At present, deep learning has become a widely used machine learning technology with outstanding performance in fields such as speech recognition, text recognition, and image processing. Zhang et al. presented a comprehensive analysis of the applications of deep learning. Deep learning, including supervised and unsupervised strategies, learns hierarchical representations of multiple features and uses them to classify and recognize patterns. It plays a very important role in big data solutions, and its effect is especially significant when large capacity, high speed, and high accuracy are required [5]. English differs greatly from Chinese in expression and usage standards, which places higher and more specialized requirements on computer-aided teaching tools that process English text. Hu et al. constructed an English grammar correction model with neural network technology and combined it with a logistic regression model to improve the error correction rate; they also proposed an optimized feature representation method for text correction [6]. Grammar learning is very important in the process of learning and writing English. Kao et al. analyzed the grammar learning process of 306 students of Polytechnic University, investigated in particular their views on correcting grammatical errors during learning, and emphasized the need to pay attention to the relationship between grammar learning and writing. They found that students tend to respond negatively when faced with grammar correction [7]. Yang and Yang proposed a computer-based intelligent scoring system for English translation built on natural language processing. In this system, a back-propagation neural network serves as an adaptive learning model, realizing an English writing teaching assistant based on automatic scoring technology [8]. Jónsson et al. studied a neural network-based machine translation system and reported that it can effectively remove noisy fragments from the corpus and performs very well. The conventional transformer model is shown in Figure 1 [9].

Figure 1. General transformer model structure.

Premjith et al. combined deep neural networks with machine translation to improve the appropriate rendering of words through deep neural network learning. They collected sentences from different sources, cleaned them, constructed a parallel corpus, and successfully realized translation with it [10]. A corpus is the key to text data processing in machine translation. Chen et al. argue that the traditional frequency-based approach produces large deviations in word ranking. They proposed a corpus-based method that combines the Hirsch index to deal with this deviation, and simulation results show that it has advantages over the traditional method [11]. Dhyani and Kumar built a neural machine translation model with a deep learning algorithm and adopted a bidirectional recurrent neural network so that long sentences can be translated; the learning rate of the model is also provided [12]. Naghshnejad et al. noted that grammatical error handling includes two parts, error detection and error correction, and proposed a deep learning method to realize it. Its process mainly includes a data preparation stage, a model learning stage, and a final error correction stage [13]. Raheja and Alikaniotis proposed an adversarial learning method for grammatical error correction, in which a discriminator judges the grammatical errors in English sentences, the corresponding correct sentences are obtained through training, different grammar types are compared, and further adjustment is finally made with the policy gradient method to obtain better results [14].

From the above analysis of deep learning algorithms, machine translation, and English error correction, we can see that (1) the current mainstream translation model has difficulty achieving good English grammar error correction: it tends to modify the sentence as a whole and struggles to recognize and distinguish specific errors within it; (2) the application of deep learning to English grammar error correction is still very limited, although deep learning algorithms, especially neural networks, can be trained to identify English grammatical errors better and to improve accuracy continuously through learning; and (3) manual correction of English grammar is time-consuming and laborious, and may still not reach the ideal result. Therefore, this paper proposes a confrontation network model based on deep learning for English grammar error correction and trains it by confrontation learning.

3 Construction of English grammar error correction model based on deep learning algorithm

3.1 An analysis of the need for error correction in English grammar

English grammatical error correction means identifying the grammatical problems in a written sentence and correcting them to obtain a correct sentence. An analysis of everyday English grammar problems shows that the common grammatical errors mainly involve nouns, verbs, prepositions, noun number (singular and plural), and subject-predicate agreement. Because grammatical errors are complex, correcting them automatically is quite difficult. The common difficulties in automatic English grammar error correction are as follows:

  1. A single complete sentence may contain more than one error. Multiple errors are harder for the machine to recognize, and a judgment based on the whole sentence is easily wrong.

  2. Grammar use involves many types of errors, and different types may interact to form new, complex grammatical errors. Even low-frequency errors may combine into a variety of grammatical errors.

  3. Many words are correct in isolation but wrong in the specific context of the text. A judgment must therefore take the context into account, which undoubtedly increases the difficulty of recognizing English grammatical errors.

The correction of English grammar should not be limited to closed-class grammatical errors such as articles, prepositions, and verb forms but should be extended to open-class errors such as word order, collocation, and word choice. Common English grammatical errors can be divided into five levels: textual structure, semantics, pragmatics, syntax, and vocabulary. In MT-based approaches, grammatical error correction is not limited to particular local errors in the text but is treated as a monolingual translation task at the sentence level. Such traditional approaches have two obvious shortcomings. First, sequence-to-sequence models suffer from exposure bias: when the model fails to output the correct prediction at some time step, the error affects all subsequent time steps and it becomes difficult to return smoothly to the correct track. Second, the training loss is determined by prediction performance at word granularity, whereas the evaluation indicators for this type of model are usually computed at the phrase or sentence level, so the loss and the evaluation are mismatched.

3.2 Basic principles of generative confrontation network learning model

To solve the problem of English grammar error correction, this study proposes a confrontation network model based on deep learning. The generative confrontation network (more commonly known as the generative adversarial network, GAN) is a product of the current upsurge in artificial intelligence. Artificial intelligence is currently divided into a perception stage and a cognitive stage. In the cognitive stage, machines have their own understanding of the world, but this understanding is an internal representation that cannot be measured directly, and the confrontation network can deepen it. As a typical learning model in deep learning, the neural network can, given large amounts of data and sufficient computing power, solve the data training problem to a certain extent. The confrontation network builds on the idea of a game: training is carried out as a contest between a generator and a discriminator, so that learning is continuously optimized to obtain the best model parameters. The general calculation flow of the generative confrontation network is shown in Figure 2.

Figure 2. General calculation flow of countermeasure network.

In the confrontation learning framework, a binary classification model serves as the discriminator. During confrontation learning, the generator and the discriminator are trained jointly and improve each other. At the beginning of training, manually annotated corrected sentences are taken as positive samples, and the corrected sentences produced by the iteratively updated generator are taken as negative samples. The discriminator improves its discrimination ability by distinguishing the two kinds of samples. The output of the discriminator is fed back to the generator, which continuously improves its own parameters so as to fool the discriminator with corrected sentences of higher quality. The overall learning framework of the confrontation network model is shown in Figure 3.

Figure 3. General learning framework of confrontation network model.

It can be seen from Figure 3 that in the confrontation learning framework, the generator adopts the encoder-decoder model of the sequence-to-sequence framework, and the discriminator is a binary classification model based on a convolutional neural network. The training process can be understood as follows. The generator produces the corrected sentence one time step at a time according to a stochastic policy. Once a complete corrected sentence has been generated, it is passed to the discriminator together with the uncorrected source sentence, and the discriminator judges the pair. The output of the discriminator is fed back to the generator as a probability value, which encourages the generator to produce correct corrections. Confrontation learning thus consists of constantly adjusting the parameters of the generator in order to obtain a larger probability value.

Let the generator $G$ be trained on a parallel corpus $(X, Y)$. The original uncorrected erroneous sentence is denoted by $x$, and the corrected target sentence generated by the generator is denoted by $y$.

At time step $t$, the state $s$ of the generator is the prefix sequence generated so far, $(y_1, y_2, \ldots, y_{t-1})$, and the next word $y_t$ is chosen by the stochastic policy of the generator. The state transition itself is deterministic: once the action $a = y_t$ has been chosen, the state moves from $y_{1:t-1}$ to $y_{1:t}$ with probability 1.
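The word-by-word generation described above can be sketched as follows. This is only an illustration under stated assumptions: `sample_correction`, `toy_policy`, and the `<bos>`/`<eos>` markers are hypothetical names, and the toy policy stands in for the paper's sequence-to-sequence generator, which is not reproduced here.

```python
import random
from typing import Callable, Dict, List

BOS, EOS = "<bos>", "<eos>"  # hypothetical sentence boundary markers


def sample_correction(
    policy: Callable[[List[str], List[str]], Dict[str, float]],
    source: List[str],
    max_len: int,
    rng: random.Random,
) -> List[str]:
    """Sample a corrected sentence word by word from a stochastic policy."""
    state = [BOS]  # the initial state holds only the start marker
    for _ in range(max_len):
        dist = policy(state, source)  # distribution over the next word
        words = list(dist)
        weights = [dist[w] for w in words]
        action = rng.choices(words, weights=weights, k=1)[0]  # stochastic choice of y_t
        state = state + [action]  # deterministic transition from y_{1:t-1} to y_{1:t}
        if action == EOS:
            break
    return state[1:]


def toy_policy(state: List[str], source: List[str]) -> Dict[str, float]:
    """Hypothetical policy: copy the source, but prefer 'goes' over 'go'."""
    position = len(state) - 1
    if position >= len(source):
        return {EOS: 1.0}
    word = source[position]
    if word == "go":
        return {"goes": 0.9, "go": 0.1}
    return {word: 1.0}


sampled = sample_correction(toy_policy, ["She", "go", "home"], 10, random.Random(0))
print(sampled)
```

Only the choice of the next word is random; appending it to the prefix is the deterministic transition mentioned in the text.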

3.3 Optimization of English grammar error correction model based on deep learning

Let the erroneous source sentence be x, the manually corrected target sentence be y, and the sentence generated by the model generator be y′. Together they define the modeling problem of English grammar correction. As the basic component of the discriminator, a convolutional neural network can give full play to its superior performance in classification tasks. The confrontation network model is therefore optimized and improved on the basis of a convolutional neural network. The improved structure of its discriminator is shown in Figure 4.

Figure 4. Discriminator structure of confrontation network model.

In Figure 4, (x, y) is the input sentence pair of the discriminator network. The first step of the discriminator is to combine the word vectors of x and y into an image-like two-dimensional input representation, whose width and height correspond to the lengths of sentences y and x, respectively. Let position (i, j) pair the ith word of x with the jth word of y. The feature at position (i, j) of the input matrix can then be expressed as

(1) $z_{i,j}^{(0)} = [x_i, y_j]$,

where $x_i$ and $y_j$ are the corresponding word vectors and the superscript $(0)$ marks the input layer.
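As a rough illustration of equation (1), the sketch below builds the image-like input by concatenating word vectors. The function name `build_input_grid` and the toy two-dimensional embeddings are assumptions made for illustration; the paper does not specify its embeddings.

```python
from typing import List

Vector = List[float]


def build_input_grid(x_vecs: List[Vector], y_vecs: List[Vector]) -> List[List[Vector]]:
    """Return grid[i][j] = concatenation of x_vecs[i] and y_vecs[j].

    The grid height is len(x_vecs) and the width is len(y_vecs),
    matching the image-like layout described in the text.
    """
    return [[x_i + y_j for y_j in y_vecs] for x_i in x_vecs]


# Toy embeddings (hypothetical values): 3 source words, 2 target words.
x_vecs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
y_vecs = [[1.0, 1.1], [1.2, 1.3]]
grid = build_input_grid(x_vecs, y_vecs)
print(len(grid), len(grid[0]), len(grid[0][0]))  # height, width, channels
```

Each cell holds one feature vector, so the grid can be treated like an image with as many channels as the two embedding sizes combined.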

A convolution over a 3 × 3 window then captures the correspondence between segments of x and y:

(2) $z_{i,j}^{(1,f)} = \sigma\left( W^{(1,f)} \begin{bmatrix} z_{i-1,j-1}^{(0)} & z_{i-1,j}^{(0)} & z_{i-1,j+1}^{(0)} \\ z_{i,j-1}^{(0)} & z_{i,j}^{(0)} & z_{i,j+1}^{(0)} \\ z_{i+1,j-1}^{(0)} & z_{i+1,j}^{(0)} & z_{i+1,j+1}^{(0)} \end{bmatrix} + b^{(1,f)} \right).$

In equation (2), $W^{(1,f)}$ and $b^{(1,f)}$ are the weights and bias of filter $f$ in the first layer, and $\sigma$ is a nonlinear activation function, here the ReLU:

(3) $\mathrm{ReLU}(x) = \begin{cases} x, & x > 0, \\ 0, & x \le 0. \end{cases}$
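The convolution and activation of equations (2) and (3) can be sketched in plain Python. To keep the example short, it assumes a single scalar channel per cell and zero padding at the borders; neither choice is stated in the paper, and `conv3x3_relu` is a hypothetical name.

```python
from typing import List

Grid = List[List[float]]


def relu(value: float) -> float:
    """Equation (3): pass positive values through, clamp the rest to zero."""
    return value if value > 0 else 0.0


def conv3x3_relu(z0: Grid, weights: Grid, bias: float) -> Grid:
    """Apply one 3 x 3 filter followed by ReLU, in the spirit of equation (2).

    Assumptions of this sketch: one scalar channel per cell and zero
    padding at the borders, so the output has the same size as the input.
    """
    rows, cols = len(z0), len(z0[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = bias
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:  # cells outside count as zero
                        total += weights[di + 1][dj + 1] * z0[r][c]
            out[i][j] = relu(total)
    return out


# A filter whose only non-zero weight is the centre simply applies ReLU cell by cell.
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(conv3x3_relu([[1.0, -2.0], [3.0, -4.0]], identity, 0.0))
```

In a real discriminator each cell is a vector and many filters are learned, but the sliding-window arithmetic is the same.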

After the 3 × 3 convolution, a max pooling operation with a 2 × 2 window is applied:

(4) $z_{i,j}^{(2,f)} = \max\left( z_{2i-1,2j-1}^{(1,f)},\; z_{2i-1,2j}^{(1,f)},\; z_{2i,2j-1}^{(1,f)},\; z_{2i,2j}^{(1,f)} \right).$
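Equation (4) corresponds to non-overlapping 2 × 2 max pooling. A minimal sketch, assuming a single-channel feature map stored as nested lists and dropping any odd trailing row or column (a choice made here, not taken from the paper):

```python
from typing import List

Grid = List[List[float]]


def max_pool_2x2(z1: Grid) -> Grid:
    """Keep the maximum of each non-overlapping 2 x 2 block.

    The paper's indices are 1-based; with 0-based lists, output cell (i, j)
    covers rows 2i and 2i + 1 and columns 2j and 2j + 1.
    """
    rows, cols = len(z1) // 2, len(z1[0]) // 2
    return [
        [
            max(
                z1[2 * i][2 * j],
                z1[2 * i][2 * j + 1],
                z1[2 * i + 1][2 * j],
                z1[2 * i + 1][2 * j + 1],
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]


feature_map = [
    [1.0, 5.0, 2.0, 0.0],
    [3.0, 4.0, 1.0, 7.0],
    [0.0, 0.0, 9.0, 8.0],
    [6.0, 2.0, 3.0, 1.0],
]
print(max_pool_2x2(feature_map))
```

Pooling halves both dimensions, which is why repeating convolution and pooling yields progressively more abstract views of the sentence pair.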

By repeating the above steps, the fragment correspondence between x and y can be captured at different levels of abstraction. The resulting feature maps are flattened, concatenated, and classified by a fully connected layer, so that the discriminator outputs a probability value in the range [0, 1]. Let the initial state of the sentence being generated be $s_0 = (\mathrm{BOS})$, let the complete corrected sentence produced by the generator be $y_{1:T}$, and let the generator produce the next word $y_t$ according to policy $G_\theta$ ($\theta$ denotes the generator parameters). The value of taking this action is expressed by the action value function $R_D^{G_\theta}(y_{1:t-1}, x, y_t)$, and the objective function of the generator is

(5) $H(\theta) = \sum_{y_{1:T}} G_\theta(y_{1:T} \mid x)\, R_D^{G_\theta}(y_{1:T-1}, x, y_T).$

The reward $R_D^{G_\theta}(y_{1:T-1}, x, y_T)$ is based on the probability, output by the discriminator, that the sentence pair is manually annotated. It is calculated as

(6) $R_D^{G_\theta}(y_{1:T-1}, x, y_T) = D(x, y_{1:T}) - b(x, y_{1:T}).$

In equation (6), $b(x, y_{1:T})$ is a baseline for the output probability of the model, which is introduced to reduce the variance of the reward estimate. Here, its value can be set to 0.5.

To give the discriminator a well-defined input for a partially generated prefix, a Monte Carlo search strategy is used to generate the remaining words of the corrected sentence. Because the search space grows exponentially with sentence length, the search is repeated $N$ times and the results are averaged, which reduces the variance of the reward estimate. Let the current state be $(y_{1:t}, x)$, the length of the sequence from the $i$th search be $T_i$, and the continuation sampled with policy $G_\theta$ be $y^i_{t+1:T_i}$. The $N$ searches give

(7) $MC^{G_\theta}\big((y_{1:t}, x), N\big) = \{ y^1_{1:T_1}, \ldots, y^N_{1:T_N} \}.$

The reward of each step is then

(8) $R_D^{G_\theta}(y_{1:t-1}, x, y_t) = \begin{cases} \dfrac{1}{N} \sum_{n=1}^{N} \big( D(x, y^n_{1:T_n}) - b(x, y^n_{1:T_n}) \big), & t < T, \\ D(x, y_{1:t}) - b(x, y_{1:t}), & t = T. \end{cases}$
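The reward estimate of equations (6) to (8) can be sketched as below. The rollout and discriminator here are toy stand-ins with hypothetical names (`toy_rollout`, `toy_discriminator`), and completeness is detected through an assumed `<eos>` marker; only the averaging logic follows the equations.

```python
import random
from typing import Callable, List

EOS = "<eos>"  # hypothetical end-of-sentence marker


def estimate_reward(
    prefix: List[str],
    source: List[str],
    rollout: Callable[[List[str], List[str], random.Random], List[str]],
    discriminator: Callable[[List[str], List[str]], float],
    n_rollouts: int,
    rng: random.Random,
    baseline: float = 0.5,
) -> float:
    """Reward for the latest word of `prefix`, following equation (8)."""
    if prefix and prefix[-1] == EOS:
        # t = T: the sentence is complete, so score it directly (equation (6)).
        return discriminator(source, prefix) - baseline
    # t < T: complete the prefix N times and average the baseline-adjusted scores.
    total = 0.0
    for _ in range(n_rollouts):
        completed = rollout(prefix, source, rng)
        total += discriminator(source, completed) - baseline
    return total / n_rollouts


def toy_rollout(prefix: List[str], source: List[str], rng: random.Random) -> List[str]:
    """Hypothetical rollout: copy the rest of the source, sometimes fixing 'go'."""
    rest = [
        "goes" if word == "go" and rng.random() < 0.5 else word
        for word in source[len(prefix):]
    ]
    return prefix + rest + [EOS]


def toy_discriminator(source: List[str], sentence: List[str]) -> float:
    """Hypothetical discriminator: high probability only if 'goes' is present."""
    return 0.9 if "goes" in sentence else 0.2


reward = estimate_reward(
    ["She"], ["She", "go", "home"], toy_rollout, toy_discriminator, 20, random.Random(1)
)
print(reward)
```

With the baseline of 0.5, rewards can be negative, so poor continuations actively push the generator away from the words that led to them.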

In the training phase of the confrontation learning model, the discriminator and the generator improve each other’s performance. The feedback produced by the discriminator is used to update the generator so that it generates higher quality corrected sentences. These sentences are in turn used to further train the discriminator, which minimizes the loss function

(9) $\min \; -\mathbb{E}_{(x,y) \sim p_{\mathrm{data}}}\big[\log D(x, y)\big] - \mathbb{E}_{(x,y) \sim G_\theta}\big[\log\big(1 - D(x, y)\big)\big].$
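The discriminator loss of equation (9) is a binary cross-entropy over real and generated pairs. A minimal sketch, assuming the expectations are replaced by batch averages and the probabilities are clamped to avoid `log(0)` (both choices made here for illustration):

```python
import math
from typing import List


def discriminator_loss(real_probs: List[float], fake_probs: List[float]) -> float:
    """Batch estimate of the loss in equation (9).

    real_probs: D(x, y) for manually corrected pairs (should approach 1).
    fake_probs: D(x, y') for generator outputs (should approach 0).
    """
    eps = 1e-12  # clamp so that the logarithm is always defined

    def clamp(p: float) -> float:
        return min(max(p, eps), 1.0 - eps)

    real_term = -sum(math.log(clamp(p)) for p in real_probs) / len(real_probs)
    fake_term = -sum(math.log(1.0 - clamp(p)) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term


# An undecided discriminator (all outputs 0.5) scores 2 * ln 2.
print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))
# A discriminator that separates the two classes well has a much lower loss.
print(discriminator_loss([0.95, 0.9], [0.1, 0.05]))
```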

Combined with equation (5), the gradient of the objective function with respect to the generator parameters can be derived as

(10) $\nabla_\theta H(\theta) \approx \sum_{t=1}^{T} \Big( R_D^{G_\theta}(y_{1:t-1}, x, y_t)\, \nabla_\theta \log G_\theta(y_t \mid y_{1:t-1}, x) \Big).$

Finally, a gradient-based optimization algorithm is used to update the generator parameters. Let the learning rate at step $h$ be $\alpha_h$; the parameter update is

(11) $\theta \leftarrow \theta + \alpha_h \nabla_\theta H(\theta).$
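The policy gradient update of equations (10) and (11) can be illustrated on a deliberately tiny policy. The sketch assumes a context-free softmax over a three-word vocabulary whose logits play the role of θ; the paper's generator is instead conditioned on the prefix and the source sentence, so this shows only the form of the update.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Turn logits into probabilities in a numerically stable way."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(
    logits: List[float], actions: List[int], rewards: List[float], lr: float
) -> List[float]:
    """One update theta <- theta + lr * sum_t R_t * grad log pi(y_t).

    For a softmax policy, the gradient of log pi(a) with respect to logit k
    is 1[k == a] - pi(k).
    """
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for action, reward in zip(actions, rewards):
        for k in range(len(logits)):
            indicator = 1.0 if k == action else 0.0
            grad[k] += reward * (indicator - probs[k])
    return [w + lr * g for w, g in zip(logits, grad)]


theta = [0.0, 0.0, 0.0]  # uniform toy policy over a 3-word vocabulary
theta = policy_gradient_step(theta, actions=[1], rewards=[0.4], lr=1.0)
print(softmax(theta))  # word 1 is now more likely than the other two
```

A positive reward raises the probability of the sampled word and a negative reward lowers it, which is how the discriminator's judgment steers the generator.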

Thus, the whole learning process of confrontation training is completed.

4 Model test results

4.1 Index setting

The selected evaluation indexes mainly include precision (P value), recall (R value), and the F0.5 value, which combines the two. Precision is the proportion of the model’s modifications that are correct, and recall is the proportion of the errors requiring modification that the model actually corrects. When the model predicts positive, the case is a true positive (TP) if the sample is actually positive and a false positive (FP) if it is actually negative. When the model predicts negative, the case is a false negative (FN) if the sample is actually positive and a true negative (TN) if it is actually negative. The P value and R value are then given by

(12) $P = \dfrac{TP}{TP + FP},$

(13) $R = \dfrac{TP}{TP + FN}.$

Let the number of sentences to be corrected be $n$, let $g_i$ be the set of manually annotated (gold) corrections for sentence $i$, and let $e_i$ be the set of corrections proposed by the model for sentence $i$. The P and R values of equations (12) and (13) can then be written as

(14) $P = \dfrac{\sum_{i=1}^{n} |g_i \cap e_i|}{\sum_{i=1}^{n} |e_i|},$

(15) $R = \dfrac{\sum_{i=1}^{n} |g_i \cap e_i|}{\sum_{i=1}^{n} |g_i|}.$

In the model, precision and recall constrain each other, so the F0.5 value is selected to evaluate the overall performance of the model:

(16) $F_{0.5} = \dfrac{(1 + 0.5^2) \times R \times P}{R + 0.5^2 \times P}.$
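Equations (14) to (16) can be checked with a short script. The sketch assumes that each sentence's corrections are stored as a set of `(position, original, replacement)` tuples, which is a hypothetical format chosen for illustration, and it returns 0.0 when a denominator is empty, which is a convention adopted here rather than one stated in the paper.

```python
from typing import List, Set, Tuple

Edit = Tuple[int, str, str]  # (token position, original word, corrected word)


def precision_recall_f(
    gold: List[Set[Edit]], system: List[Set[Edit]], beta: float = 0.5
) -> Tuple[float, float, float]:
    """Compute P, R and F_beta over per-sentence edit sets."""
    matched = sum(len(g & e) for g, e in zip(gold, system))  # sum of |g_i ∩ e_i|
    proposed = sum(len(e) for e in system)                   # sum of |e_i|
    required = sum(len(g) for g in gold)                     # sum of |g_i|
    precision = matched / proposed if proposed else 0.0
    recall = matched / required if required else 0.0
    denominator = recall + beta ** 2 * precision
    if denominator == 0.0:
        return precision, recall, 0.0
    f_beta = (1 + beta ** 2) * recall * precision / denominator
    return precision, recall, f_beta


gold = [{(1, "go", "goes")}, {(0, "a", "an"), (3, "in", "on")}]
system = [{(1, "go", "goes")}, {(0, "a", "an")}]
print(precision_recall_f(gold, system))
```

In this toy case every proposed edit is correct but one gold edit is missed, so precision is 1, recall is 2/3, and F0.5 (10/11) sits closer to precision than to recall.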

In English grammar correction, wrongly altering a correct sentence is worse than leaving an error uncorrected. The F0.5 value therefore weakens the contribution of the R value and strengthens the contribution of the P value.

The final output of the model is essentially a scored and ranked list of candidates, so the output ranking also needs quantitative evaluation, which is expressed by the mean reciprocal rank (MRR). Suppose there are $n$ sentences to be corrected and $r_i$ is the rank of the correct correction in the result list for sentence $i$; then MRR is calculated as

(17) $MRR = \dfrac{1}{n} \sum_{i=1}^{n} \dfrac{1}{r_i}.$
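A small worked example of equation (17), assuming 1-based ranks and that every sentence has a correct correction somewhere in its candidate list (the function name is hypothetical):

```python
from typing import List


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Average of 1 / r_i, where r_i is the 1-based rank of the correct correction."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Correct correction ranked 1st, 2nd and 4th for three sentences:
# (1 + 0.5 + 0.25) / 3
print(mean_reciprocal_rank([1, 2, 4]))
```

An MRR of 1 means the correct correction is always ranked first; lower values mean it tends to appear further down the list.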

The optimized algorithm flow is shown in Figure 5.

Figure 5. Optimized algorithm flow chart.

4.2 Comparative analysis of algorithm results

This paper used the CoNLL-2014 test set and the Lang-8 test set to compare the P value, R value, F0.5 value, and MRR of the proposed model with those of the traditional machine translation transformer method. The analysis results are shown in Table 1.

Table 1

Comparative analysis of training results evaluation of CoNLL-2014 test set

| Syntax error correction type | Transformer P value | Transformer R value | Transformer F0.5 value | This paper P value | This paper R value | This paper F0.5 value | F0.5 difference |
|---|---|---|---|---|---|---|---|
| Article | 52.44 | 30.18 | 46.52 | 54.19 | 31.18 | 48.99 | +2.47 |
| Preposition | 31.92 | 6.77 | 17.25 | 35.18 | 7.24 | 18.17 | +0.92 |
| Verb form | 39.41 | 12.55 | 27.97 | 45.82 | 15.18 | 30.54 | +2.57 |
| Noun singular and plural | 30.57 | 15.18 | 25.92 | 38.15 | 21.29 | 32.14 | +6.22 |
| Subject predicate agreement | 61.54 | 30.95 | 50.27 | 68.37 | 35.00 | 59.14 | +8.87 |

Here, five common types of grammatical error are selected for correction: articles, prepositions, verb forms, noun number (singular and plural), and subject-predicate agreement. The choice of a, an, or the for an article can be inferred from the context. For prepositions, the target word itself may be the error, for example when the target word is of while the correct word is to, so adding the target word information may interfere with the model’s selection. Subject-predicate agreement can likewise be inferred from the context. For noun number, the word may be one like water, whose singular and plural forms are the same, or one like person, whose singular and plural forms differ greatly, so giving the model the word information achieves better results. Verb forms also differ by tense or otherwise vary considerably, so the word’s own information can be used to help prediction.

Furthermore, the comparison chart of the evaluation indexes of the two methods is drawn (Figure 6).

Figure 6. Comparison and analysis of accuracy of three methods.

It can be seen from Table 1 and Figure 6 that the proposed confrontation network model based on deep learning achieves better error correction on all five common error types. The largest F0.5 gain over the transformer method is +8.87, for subject-predicate agreement.

Furthermore, the Lang-8 test set was used to compare the two models. Only the overall P value, R value, F0.5 value, and MRR value are compared here; specific grammatical error types are not distinguished. The test results are shown in Table 2.

Table 2

Comparative analysis of evaluation indexes of two algorithm models for Lang-8 test set data

| Algorithm model | P value | R value | F0.5 value | MRR value |
|---|---|---|---|---|
| Transformer | 59.14 | 27.18 | 51.18 | 76.15 |
| Model of this paper | 64.33 | 32.54 | 58.25 | 84.21 |

It can be further seen from Table 2 that the proposed model scores better than the traditional machine translation method on every index.

Because the methods of [13] and [14] are typical and representative in English grammatical error correction, the method of this study is compared with those of the two works. In all, 1,000 groups of data from the CoNLL-2014 data set were selected for testing to verify the performance of the three grammatical error correction methods. The comparative results are shown in Figure 6.

As can be seen from Figure 6, the accuracy of the method proposed in this paper and of the other two methods decreases as the amount of data increases. The decline of the proposed method is smaller, and it achieves higher accuracy on the same data set.

5 Conclusion

The purpose of this study is to address the difficulty that the current mainstream machine translation approach has in meeting the performance requirements of English grammar error correction, and to propose a confrontation network learning model based on a convolutional neural network that achieves better English grammar error correction. The model takes the perspective of deep reinforcement learning: the stochastic policy of the generator is combined with the policy gradient method to solve the gradient back-propagation problem. During confrontation training, the output of the discriminator is fed back to the generator as a probability-valued reward, so that the parameters of both the generator and the discriminator are updated. To analyze the performance of the proposed grammar correction algorithm objectively, the P value, R value, F0.5 value, and MRR value were selected for comprehensive evaluation. Simulation on the CoNLL-2014 test set and the Lang-8 test set shows that the proposed algorithm based on deep learning achieves a better error correction effect. To further improve the error correction ability of the model, data augmentation methods should also be considered.

  1. Conflict of interest: Authors state no conflict of interest.

References

[1] Alik KR, Alik B. Multi-objective evolutionary algorithm using problem-specific genetic operators for community detection in networks. Neural Comput Appl. 2018;30:1–14.

[2] Jiang H. Coastal atmospheric climate and artificial intelligence English translation based on remote sensing images. Arab J Geosci. 2021;14:1–13. doi:10.1007/s12517-021-08918-y

[3] Andayani U, Arisandi D, Hasugian M, Syahputra MF, Siregar B. The English language scientific literature classification based on abstract using rocchio algorithm. J Phys Conf Ser. 2019;1235:012059. doi:10.1088/1742-6596/1235/1/012059

[4] Malik S, Bawa S. A Sanskrit-to-English machine translation using hybridization of direct and rule-based approach. Neural Comput Appl. 2021;33:2819–38. doi:10.1007/s00521-020-05156-3

[5] Zhang Q, Yang LT, Chen Z, Li P. A survey on deep learning for big data. Inf Fusion. 2018;42:146–57. doi:10.1016/j.inffus.2017.10.006

[6] Gaikwad V. English language learners' response to written corrective feedback. Int J Comp Lit Translat Stud. 2021;8:64–79.

[7] Zhou S, Liu W. English grammar error correction algorithm based on classification model. Complexity. 2021;2:1–11. doi:10.1155/2021/6687337

[8] Yang H, Yang Y. Design of English translation computer intelligent scoring system based on natural language processing. J Phys Conf Ser. 2020;1648:022084 (5pp). doi:10.1088/1742-6596/1648/2/022084

[9] Lin N, Chen B, Lin X, Wattanachote K, Jiang S. A framework for Indonesian grammar error correction. ACM Transactions on Asian and Low-Resource Language Information Processing. 2021;20:1–12. doi:10.1145/3440993

[10] Premjith B, Kumar MA, Soman KP. Neural machine translation system for English to Indian language translation using MTIL parallel corpus. J Intell Syst. 2019;28:387–98. doi:10.1515/jisys-2019-2510

[11] Chen L, Chang K. A novel corpus-based computing method for handling critical word-ranking issues: An example of COVID-19 research articles. Int J Intell Syst. 2021;36:3190–216. doi:10.1002/int.22413

[12] Dhyani M, Kumar R. An intelligent Chatbot using deep learning with bidirectional RNN and attention model. Mater Today Proc. 2020;34:817–24. doi:10.1016/j.matpr.2020.05.450

[13] Naghshnejad M, Joshi T, Nair VN. Recent trends in the use of deep learning models for grammar error handling. arXiv. 2009;02358.

[14] Raheja V, Alikaniotis D. Adversarial grammatical error correction. arXiv. 2010;02407. doi:10.18653/v1/2020.findings-emnlp.275

Received: 2021-07-26
Revised: 2021-11-07
Accepted: 2021-12-15
Published Online: 2022-02-17

© 2022 Yanghui Zhong and Xiaorui Yue, published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 2.6.2024 from https://www.degruyter.com/document/doi/10.1515/jisys-2022-0013/html