International Journal of Engineering and Information Systems (IJEAIS) ISSN: 2000-000X Vol. 2 Issue 3, March – 2018, Pages: 27-35 www.ijeais.org

Face Recognition Using DCT and Neural Micro-Classifier Network

Abdellatief Hussien AbouAli
HICI, Computer Science Dept., El-Shorouk Academy, Cairo, Egypt
Email: dr.abdullatif.hussein@sha.edu.eg; aabouali4@gmail.com

Abstract- This study proposes a face recognition methodology based on the neural micro-classifier network. The methodology uses a simple, well-known feature extraction method: the low-frequency coefficients of the discrete cosine transformation (DCT). The micro-classifier network is a deterministic four-layer neural network whose layers are: input, micro-classifier, counter, and output. The network provides a confidence factor, and proper generalization is guaranteed given a proper selection of the training set. The network also allows incremental learning and is more natural than other networks. The proposed face recognition methodology was tested using the standard ORL data set. The experimental results of the methodology showed comparable performance.

Keywords: Neural networks; Classifier; Feature extraction; Image processing; Discrete Cosine transform; DCT.

1. INTRODUCTION
Progress in face recognition is a collaborative effort of researchers in a diversity of fields, such as neuroscience, computer vision, psychology, pattern recognition, digital image processing, and machine learning. These efforts started decades ago, driven by the applications and the demand for them [1-12]. Security systems, robotics, man-machine interfaces, digital cameras, games, entertainment, authentication, intelligence, satellite, reconnaissance, and image indexing applications all use face detection and recognition. Recognizing an acquired image is not a simple comparison against recalled, pre-stored images of known individuals.
Face detection is a lengthy search within an image for blocks that contain face features, possibly tracking them through a video feed. The recognition process includes finding a set of discriminating features, searching for them, extracting them, and properly learning them from training data. The learning process induces discriminating metrics or boundaries. After learning, the induced metrics or boundaries are used to stamp an unknown face with its class ID or to recall its associated data. The difficulties of face recognition originate from the variance among acquired images of the same person's face as a result of scale, translation, illumination, pose, occlusion, clutter, orientation, expression variations, and imaging conditions, as well as from the high commonality among faces over a large set of features and from computational complexity [10].

A face recognition system basically contains detection, feature extraction, learning, and classification. Yang, Kriegman, and Ahuja presented a well-accepted classification of face detection methods in [11]. This classification considers four categories that may overlap: knowledge-based (or rule-based), feature-invariant, template matching, and appearance-based. Knowledge-based methods apply rules to the properties of segmented elements and to the relationships among them; these properties and relationships are then verified against the knowledge base rules to decide whether they could constitute a human face. Feature-invariant methods locate faces using structural features that are not affected by acquisition conditions such as pose, view angle, scale, and orientation. Template matching correlates the image with pre-stored standard faces and face features. Appearance-based methods use training images to induce templates and appearance variances that are then used in the correlation for detection.
The goal of feature selection and extraction is to find a minimal subset of a large set of possible features that, if used properly, leads to minimal recognition error. The factors affecting feature selection include: dimensionality, in-class similarity, inter-class dissimilarity, ease of finding and implementation, computational complexity, and robustness to the challenges mentioned above. Feature selection is a focal point in the recognition process and has therefore gained the attention of the research community. The features presented by the community [4-9][13][36-38] include: moments, Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), Multidimensional Scaling (MDS), Self-organizing map (SOM), Active Shape Models (ASM), Gabor Wavelet Transforms (GWT), and Discrete Cosine Transform (DCT).

Classifiers, according to [38], are built on similarity, probability, or decision boundaries. Similarity classifiers use similarity metrics to measure the closeness to the class members or to the preset class representative(s). Probability-based classifiers use the probability of being in or out of a class. Decision-boundary classifiers find the separating hyper-surfaces between the classes, and hence the polyhedrons that act as the containers of the classes. The boundaries can be obtained through an iterative training process on data sets, using a training vehicle that evolves the surfaces and decision boundaries either implicitly or explicitly, as in the case of neural networks. Alternatively, features can be extracted from the training set through a non-iterative process that builds up the decision boundaries, as in VQ methods. The classifying vehicles in face recognition can be grouped into three categories: i) structural, ii) statistical, and iii) neural networks.
In the structural approach, facial attributes such as the eyes, nose, mouth, and chin, together with their areas, relative distances, and the angles between them, are used [2][15]. In other words, local features as well as relationship features are used. The statistical approach uses image transforms in which features are extracted from the whole face [2-9]. The third approach is based on learning from known individuals, as in neural networks. Learning directly from raw images requires complex network structures and algorithms of high time complexity [10]. Dimension reduction is therefore needed, ahead of or within the learning process, to reduce the complexity and make the learning more focused [16-18].

Artificial neural networks (ANN) are massively parallel, interconnected computing elements that contain adaptable parameters and have associated learning vehicle(s). The training process aims at setting the network parameters so that they, hopefully, realize the proper decision boundaries between the sets. The basic computational element of the network is the neuron. Neurons are organized in layers from input to output; the layers between input and output are called hidden layers. The interconnections between neurons may be unidirectional, bidirectional, or both, and may link neurons of the same layer, of a forward layer, of a backward layer, or combinations of these. The network topology, transfer function, and learning method are the focal points in network design. Network topologies include feed-forward and recurrent [19-20][31-33]. The transfer functions used in neural networks include: linear, sigmoidal, Gaussian, and bi-radial. Network learning can be supervised, unsupervised, or reinforced, and can also be deep or shallow in structure. Widely used networks include the multilayer perceptron, radial basis, Hopfield, and self-organizing maps.
Neural network application areas include: function approximation, classification, forecasting, mapping, security alerts, marketing, and recognition. There are many hardware realizations of neural networks [21]. Neural networks as classifiers, in general, shift the classification burden to a learning process; however, basic questions remain about the proper structure, its evolution, and the ability to generalize correctly [22][23].

The efficiency of face recognition depends on three basic parameters: (i) an efficient feature representation that is invariant with respect to illumination, scaling, rotation, pose, etc.; (ii) a classification technique that maps the feature vectors into their appropriate classes with minimal misclassification; and (iii) proper generalization to unknown cases.

The development of human recognition can be observed when babies start by classifying all males as father and all females as mother. That is, more or less, building a separating hyper-plane between the two sets. This classification ability grows and develops with time, so older babies associate with father and mother only the males and females who share features with them. A finer classification develops with time, in association with names, within a greatly complex and unknown organization. In general, however, one can infer that adding a new person to a human's life requires separating that person from the previously known persons and does not affect the separations among the previously known ones. One can also infer that some sort of feature recall happens during human recognition. That is, some form of feature memorization exists, not just boundaries as in ANN, and it is linked to that great recognition vehicle.

This study proposes a face recognition methodology based on a proposed binary classifier neural network. The binary classifier network is built on learning separating hyper-planes between pairs of classes, as proposed in [24].
The classifier does not require rebuilding its knowledge when new classes are added to the system; rather, it integrates the knowledge of the new class into the network. The generalization ability of the proposed network is guaranteed given a proper selection of the training set. The network structure is deterministic per problem. Moreover, adding members to a class requires rebuilding only the boundaries of that class with the others. The remainder of this paper is organized as follows: Sections 2 to 5 outline the classifier neural network and the face recognition methodology under study. Section 6 contains tests and results. Section 7 is the study conclusion.

2. FACE RECOGNITION METHODOLOGY AND CLASSIFIER NETWORK
The proposed network and recognition methodology operate in three modes: initial learning, recognition, and incremental learning when needed. In the first mode, initial learning, the system learns the separation between the initial set of classes. In the second mode, operational or recognition, an unknown face is presented to the system for classification. In the third mode, incremental learning, new classification abilities are integrated into the current network, or new subjects are added to a class training set.

In the first mode, the sets of faces that are the subject of the initial recognition run through three stages: preprocessing, feature extraction, and then an elementary learning separation process applied to the classes in pairs. In this elementary learning, the learning algorithm simply finds the separating hyper-plane of its two classes if one exists, or a local minimum if not [24]. A successful first mode produces, per class, a set of hyper-planes that separate the class from the others. The elementary learning processes are parallel and have no dependencies among them. Each learning process sets the parameters (weights) of its micro-classifier. The micro-classifier output is similar to that of a flip-flop; that is, its output is the pair $(q, \bar{q})$.
The micro-classifier (MC) assigned to the classification of the two classes $X, Y$ has its output $q$ equal to 1 if and only if the presented pattern is seen to be on the $X$ class side, and vice versa. Figure (1) presents a conceptual view of the micro-classifier.

Figure (1) Micro-Classifier

Figure (2) Classifier Network

In the second mode, an unknown face image runs through the first two stages: preprocessing and feature extraction. The feature vector is applied, in parallel, to all MCs. The MCs are linked to the classes counter array: the MC designated for classification between the two classes $X, Y$ has its outputs $q, \bar{q}$ connected to the $X, Y$ counters respectively. The counter array designates the position of the feature vector with respect to the classes' polyhedrons. The output of the classes counter array is the input to the comparator. The comparator sets the class index, which is the class with the greatest vote, and computes the recognition quality $RQ$.

In the third mode, adding more classification ability to the network, the training set of images for the new classes runs through the first two stages to obtain the feature vectors of the new classes. Then, the system recalls the features of the former classes. A learning process confined to separating the new classes from the former classes is initiated. The outcome of the learning processes is used to add more configurations to the network without affecting the existing ones. The added configurations simply activate and set new MCs and activate the counters of the new classes. Also, as the system runs, the polyhedron of a class whose misclassification rate is above normal can be reset with more training elements, using some agent, in a process similar to the former one. Figure (3) shows the learning process of the proposed system.
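The three modes described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: `VotingNetwork`, `train_pair`, and the toy one-dimensional "classes" are hypothetical names and data, each micro-classifier is treated as an opaque callable, and $RQ$ is taken as the winner's vote divided by the number of classes minus one, following the definition given in Section 3.

```python
class VotingNetwork:
    """Pairwise micro-classifiers feeding a class counter array (illustrative)."""

    def __init__(self, train_pair):
        # train_pair(a, b) is a hypothetical trainer: it returns a callable
        # mc(z) that is True when z is judged to be on class a's side.
        self.train_pair = train_pair
        self.classes = []
        self.mcs = {}  # (a, b) -> micro-classifier for that pair of classes

    def add_classes(self, new_classes):
        """Initial or incremental learning: only pairs with a new class are trained."""
        for c in new_classes:
            for old in self.classes:
                self.mcs[(old, c)] = self.train_pair(old, c)
            self.classes.append(c)

    def classify(self, z, threshold=0.5):
        """Recognition mode: vote, pick the greatest counter, compute RQ."""
        counters = {c: 0 for c in self.classes}
        for (a, b), mc in self.mcs.items():
            counters[a if mc(z) else b] += 1
        winner = max(counters, key=counters.get)
        rq = counters[winner] / (len(self.classes) - 1)
        return (winner if rq >= threshold else None), rq


# Toy setup (made-up data): each class is a point on a line, and the
# stand-in trainer separates two classes by the nearer centre.
centers = {"A": 0.0, "B": 10.0, "C": 20.0}


def train_pair(a, b):
    return lambda z: abs(z - centers[a]) < abs(z - centers[b])


net = VotingNetwork(train_pair)
net.add_classes(["A", "B"])   # initial learning: one micro-classifier
net.add_classes(["C"])        # incremental learning: two more, none retrained
result = net.classify(9.0)
```

Adding class C trains only the pairs (A, C) and (B, C); the existing (A, B) micro-classifier is left untouched, which is the property the incremental mode relies on.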
Figure (3) Classifier initial and incremental training

3. MICRO-CLASSIFIERS
The proposed classifier network is built on the 'micro-classifier' as its basic building block. A micro-classifier is concerned with the classification of two classes out of the set of classes. The micro-classifier training process can be summarized as follows. Assume two finite sets $X, Y \subset R^l$, with $l$ finite:

$$X = \{x_1, x_2, x_3, \ldots, x_M\} \quad \text{and} \quad Y = \{y_1, y_2, y_3, \ldots, y_N\}, \qquad x_i, y_i \in R^l$$

The search is for $x^*, y^* \in R^l$ such that

$$\|x_i - x^*\| < \|x_i - y^*\| \quad \text{and} \quad \|y_i - y^*\| < \|y_i - x^*\|, \qquad \forall x_i \in X,\ y_i \in Y$$

The learning algorithm is based on the study in [24]. The algorithm evolves to representatives, say $x^*, y^*$. Having $x^*, y^*$ for the two sets, the equation of the separating hyper-plane is

$$(x^* - y^*) \cdot p = 0.5\,(x^* - y^*) \cdot (x^* + y^*)$$

where '$\cdot$' is the vector dot product operator and $p$ is a point on the plane. The hyper-plane divides the space of points into an X side and a Y side. An unknown vector $Z$ is considered to be on the X side if

$$(Z - 0.5\,(x^* + y^*)) \cdot (x^* - 0.5\,(x^* + y^*)) > 0$$

and on the Y side otherwise. The micro-classifiers increment the counters of the potential classes from their perspectives. The micro-classifiers act in parallel on their inputs. The second layer acts on the classes' counters $clsc_i$: the vector is assumed to be of class $n$ iff

$$clsc_n > clsc_i \ \ \forall i \neq n \quad \text{and} \quad RQ = clsc_n / (m - 1) \ge T$$

where $m$ is the classes count, $T \in [0.5, 1]$ is the recognition threshold, and $RQ$ is the recognition quality.

4. FEATURES EXTRACTIONS
In general, a failure in feature extraction, or the choice of wrong features, collapses the recognition process. In biometric systems, features can be global, local, or both. A global feature operator applies to the entire image, whereas a local operator uses chunks or windows of the image. In the local case, the window position is used together with the operator outcome.
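The side-of-hyper-plane test that a micro-classifier of Section 3 applies can be sketched as follows. This is a minimal sketch under stated assumptions: the learning algorithm of [24] is not reproduced here, so the class means are used as a stand-in for the representatives $x^*, y^*$, and the helper names (`dot`, `mean_point`, `make_micro_classifier`) and the toy data are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def mean_point(points):
    """Component-wise mean; an assumed stand-in for the representatives of [24]."""
    return [sum(col) / len(points) for col in zip(*points)]


def make_micro_classifier(x_star, y_star):
    """Return q(z), True when z lies on the X side of the bisecting hyper-plane."""
    mid = [0.5 * (a + b) for a, b in zip(x_star, y_star)]
    normal = [a - m for a, m in zip(x_star, mid)]  # x* - 0.5 (x* + y*)

    def q(z):
        # (Z - 0.5 (x* + y*)) . (x* - 0.5 (x* + y*)) > 0  means X side
        return dot([zi - m for zi, m in zip(z, mid)], normal) > 0

    return q


# Toy 2-D classes (made-up data, not from the paper).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Y = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
q = make_micro_classifier(mean_point(X), mean_point(Y))
x_sides = [q(z) for z in X]
y_sides = [q(z) for z in Y]
```

Because the plane is the perpendicular bisector of the segment between the two representatives, the test is equivalent to asking which representative the vector is closer to.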
Feature bases include: colors, edges, corners, textures, statistics, frequency coefficients, and combinations of them. Feature extraction can be done in the spatial and frequency domains [25-26]. The DCT is used effectively in image and video encoding schemes such as JPEG and MPEG. These schemes rely on the DCT because lower frequencies contribute more significantly to image quality than higher ones. This roughly indicates that recalling a face from a subset of low-frequency DCT coefficients is possible, which makes the use of such coefficients as the basis for a classifier legitimate. Figure (4) shows an original face against one recalled from the 1% lowest-frequency coefficients. In the figure, the overall shape of the face, the position of the eyes, the head shape, and parts of the ear and nose are kept to a recognizable extent. So, a small percentage of the coefficients carries a significant amount of the information required for recognition.

Figure (4) A recalled picture from 1% of the holistic 2-D DCT

5. DISCRETE COSINE TRANSFORMATION
The photoreceptors produce over 150 million signals, while the retinal level receives approximately 1 million of these, which are the subject of biological keeping and recognition [42]. Consequently, thinking of forms or transformations is natural. The role of a transformation is to condense high redundancy into a more abstract representation. Generally, human vision and machine processing saturate at a certain level of detail; adding more detail after saturation brings no benefit and could have a negative effect. Widely used transformations include: Fourier, Discrete Cosine Transformation (DCT), Karhunen-Loeve transform (KLT), Legendre moments, Hu moments, and others.
The discrete cosine transformation has been widely used as a means of image abstraction for compression [25] as well as for feature extraction [43][45]. Holistic approaches apply the DCT to the entire image. Local and block-based approaches also use it, which overcomes the computational complexity of the transform [25]. Some studies combine local and holistic transforms for the sake of a more informative abstraction [46-47]. In image processing, the transform has been used both in one dimension (based on row and column scans) and in two dimensions. The two-dimensional discrete cosine transform maps the real domain to the real domain; its outcome is a real matrix of the same dimensions as the original. The DCT has an inverse that can be used to retrieve the original image from the frequency-domain matrix. The matrix elements of lower indices contain the low frequencies of the image. It has been reported that eliminating the highest frequencies is not significantly noticed by human vision and, consequently, does not affect computer vision. This points to the lower frequencies being more informative than the higher ones. The transformation of an image matrix $f(x, y)$ of dimensions $N \times M$ is as follows:

$$C(u, v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} f(x, y) \cos\left(\frac{(2x+1)u\pi}{2N}\right) \cos\left(\frac{(2y+1)v\pi}{2M}\right)$$

where

$$\alpha(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \neq 0 \end{cases}$$

and $\alpha(v)$ is defined in the same way with $M$ in place of $N$. The inverse transformation is:

$$f(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{M-1} \alpha(u)\,\alpha(v)\, C(u, v) \cos\left(\frac{(2x+1)u\pi}{2N}\right) \cos\left(\frac{(2y+1)v\pi}{2M}\right)$$

6. TESTING AND RESULTS
The source images used in the testing process are from the standard ORL database [34]. The most commonly used subset of the ORL database is the famous 40 subjects. Each subject has 10 images in different poses/orientations. The images are gray-scale, with values from 0 to 255. Samples from the dataset are in figure (5).
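The forward and inverse two-dimensional DCT of Section 5 can be written directly from their definitions, which may help when checking an implementation against a library routine. The sketch below is a plain, unoptimized rendering of the orthonormal 2-D DCT-II; the function names are hypothetical, the tiny test image is made up, and the ordering of "lowest frequencies" by $u + v$ in `low_frequency_features` is an assumption, since the paper does not state the ordering it uses.

```python
import math


def alpha(k, n):
    """Normalisation factor: sqrt(1/n) for k == 0, sqrt(2/n) otherwise."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def basis(i, k, n):
    """cos((2i + 1) k pi / (2n)), the 1-D cosine basis term."""
    return math.cos((2 * i + 1) * k * math.pi / (2 * n))


def dct2(f):
    """Forward 2-D DCT of an N x M matrix given as a list of rows."""
    n, m = len(f), len(f[0])
    return [[alpha(u, n) * alpha(v, m) * sum(
                 f[x][y] * basis(x, u, n) * basis(y, v, m)
                 for x in range(n) for y in range(m))
             for v in range(m)] for u in range(n)]


def idct2(c):
    """Inverse 2-D DCT; recovers the image from all of its coefficients."""
    n, m = len(c), len(c[0])
    return [[sum(alpha(u, n) * alpha(v, m) * c[u][v]
                 * basis(x, u, n) * basis(y, v, m)
                 for u in range(n) for v in range(m))
             for y in range(m)] for x in range(n)]


def low_frequency_features(c, k):
    """First k coefficients ordered by u + v (assumed ordering, ties by u)."""
    n, m = len(c), len(c[0])
    order = sorted(((u, v) for u in range(n) for v in range(m)),
                   key=lambda uv: (uv[0] + uv[1], uv[0]))
    return [c[u][v] for u, v in order[:k]]


# Tiny 3 x 4 "image" (made-up values) to exercise the functions.
image = [[float(x * 4 + y) for y in range(4)] for x in range(3)]
coeffs = dct2(image)
restored = idct2(coeffs)
features = low_frequency_features(coeffs, 3)
```

With this normalisation the coefficient $C(0, 0)$ equals the sum of all pixel values divided by $\sqrt{NM}$, which is why it dominates the other coefficients in magnitude. The direct double sum costs on the order of $(NM)^2$ operations, so a fast library transform would be the practical choice for full-size images.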
Figure (1) Samples of the standard testing dataset.

Table (1) contains the coefficients of the lowest 8 frequencies of the first two pictures of the first three subjects.

Table (1): DCT coefficients (all values are multiplied by 1.0e+03; column i-j is picture j of subject i)

Coefficient   1-1       1-2       2-1       2-2       3-1       3-2
DCT(0,0)       6.2691    6.8428    4.4684    4.3182    5.5249    5.5232
DCT(1,0)      -0.1947    0.0471   -0.2885   -0.2588    0.1716    0.0335
DCT(0,1)       0.1710    0.3278   -0.0168    0.2444    0.1937    0.4050
DCT(1,1)       0.0081   -0.3400    0.0554    0.1059    0.2580   -0.0512
DCT(0,2)      -0.4683    0.0296   -0.7195   -0.7008   -0.5569   -0.5703
DCT(2,0)      -0.8877   -0.1159   -0.4006   -0.4265   -0.5430   -0.6127
DCT(2,1)      -0.0659    0.0847    0.0258   -0.0980   -0.0258   -0.0091
DCT(1,2)      -0.1078   -0.1183   -0.0261    0.0671    0.0314    0.0025

Experiment 1: In this experiment, the entire set of 400 images is used in training. The number of DCT coefficients used varies from 1 to 10 of the lowest frequencies. Figure (2) shows the result of the experiment.

Figure (2) DCT coefficients performance

Experiment 2: In this experiment, eight randomly selected images are used in network training and the entire set of 400 images is used in testing. The DCT coefficients used are those of the lowest 25 frequencies. The selection was repeated 100 times. The results of the experiment are as follows: the best cases are 31 trials that ended with 100% recognition; the worst of the remaining 69 trials had 5 misclassifications, i.e. 98.75% correct recognition.

Experiment 3: In this experiment, six randomly selected images out of the set are used in training and the entire set is used in testing. The number of DCT coefficients used varies from three to twenty of the lower frequencies. For each case, fifty trials are run on the 400 images, using different sets for training. Figure (3) presents the results of the experiments: mean, best, and worst.
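The bookkeeping behind the repeated-trial experiments can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the paper does not state how the random selection was drawn, so sampling a fixed number of images per subject is an assumption, and the list of trial percentages passed to `summarize` is made up for the example.

```python
import random


def random_split(n_subjects, images_per_subject, n_train, rng):
    """Pick n_train image indices per subject (assumed per-subject sampling)."""
    return {s: sorted(rng.sample(range(images_per_subject), n_train))
            for s in range(n_subjects)}


def recognition_percent(n_misclassified, n_tested):
    """Percentage of correctly recognised test images."""
    return 100.0 * (n_tested - n_misclassified) / n_tested


def summarize(percents):
    """Mean, best, and worst recognition percent over a set of trials."""
    return {"mean": sum(percents) / len(percents),
            "best": max(percents),
            "worst": min(percents)}


# One trial's training selection for 40 subjects with 10 images each.
split = random_split(40, 10, 8, random.Random(0))

# Five misclassifications out of 400 test images, as in Experiment 2.
worst_case = recognition_percent(5, 400)

# Summary over a few made-up trial results.
summary = summarize([100.0, 98.75, 99.5])
```

The worst case reported in Experiment 2 follows from this arithmetic: 395 correct out of 400 is 98.75%.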
Figure (3) Recognition Percent Against correct recognition

For the case of twenty coefficients, the correct recognition percent over the fifty trials is presented in figure (4). The two figures show that the network has stable and decent generalization ability. Also, the correct classification ability of the network increases at higher rates for smaller numbers of coefficients, which coincides with our former claim that the lower frequencies are the more informative ones. Third, the classification ability of the network with a relatively low number of coefficients, or features, is good. The proposed network architecture is simple compared to that of deep networks, and its training and use are more straightforward [10][16][19]. Moreover, the architecture allows incremental learning and partial retraining when needed.

There are many benchmarks on the ORL database, including: [39], which reports correct recognition percentages from 85.42% to 93.75% using PCA and NPCA; and [40], which reports similar results for PCA and LDA, with recognition percentages from 88.1% to 95.98%. Neural networks were used with Eigen-faces in [41]; when testing on the entire 400 images, 360 were correctly identified with 50 Eigen-faces and 15 hidden-layer neurons.

Figure (4) Correct recognition percent in the 50 trials of 20 DCT coefficients

7. CONCLUSION
In this paper, a face recognition methodology was introduced. The methodology is based on a micro-classifier network that contains four layers: input, micro-classifier, counter, and output. Each micro-classifier is concerned with the classification of two classes of the problem. The micro-classifiers vote for the class neurons of the next layer. The internal layer process of the class neurons evolves to a winner class and a quality factor that set the output layer neurons.
The DCT was used as the feature extraction vehicle, after a median filter for noise removal. The proposed face recognition methodology was tested using the standard ORL data set. The experimental results of the methodology showed comparable performance.

REFERENCES
[1] Chellappa, R., Wilson, C. L., and Sirohey, S., "Human and machine recognition of faces: A survey", Proc. IEEE, 83, 705–740, 1995.
[2] Nese Alyuz, Berk Gokberk, and Lale Akarun, "3-D Face Recognition Under Occlusion Using Masked Projection", IEEE Transactions on Information Forensics and Security, vol. 8, no. 5, May 2013.
[3] Berk Gökberk, M. Okan İrfanoğlu, Lale Akarun, Ethem Alpaydın, "Learning the best subset of local features for face recognition", Pattern Recognition (Elsevier), vol. 40, 2007, pp. 1520–1532.
[4] C. Magesh Kumar, R. Thiyagarajan, S. P. Natarajan, S. Arulselvi, G. Sainarayanan, "Gabor features and LDA based Face Recognition with ANN classifier", Proceedings of ICETECT 2011.
[5] Önsen Toygar, Adnan Acan, "Face recognition using PCA, LDA and ICA approaches on colored Images", Journal of Electrical and Electronics Engineering, vol. 13, 2003.
[6] Issam Dagher, "Incremental PCA-LDA algorithm", International Journal of Biometrics and Bioinformatics (IJBB), vol. 4, issue 2, May 2010.
[7] J. Shermina, V. Vasudevan, "An Efficient Face recognition System Based on Fusion of MPCA and LPP", American Journal of Scientific Research, ISSN 1450-223X, issue 11, 2010, pp. 6-19.
[8] Yun-Hee Han, Keun-Chang Kwak, "Face Recognition and Representation by Tensor-based MPCA Approach", The 3rd International Conference on Machine Vision (ICMV 2010), 2010.
[9] Neerja, Ekta Walia, "Face Recognition Using Improved Fast PCA Algorithm", Proceedings of IEEE Congress on Image and Signal Processing, vol. 1, pp. 554-558, 2008.
[10] Steve Lawrence, C. Lee Giles, Ah Chung Tsoi, Andrew D. Back, "Face Recognition: A Convolutional Neural Network Approach", IEEE Transactions on Neural Networks, vol. 8, Jan 1997.
[11] M.-H. Yang, D. Kriegman, and N. Ahuja, "Detecting faces in images: A survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(1):34–58, January 2002.
[12] K. Pearson, "On lines and planes of closest fit to systems of points in space", Philosophical Magazine, 2(6):559–572, 1901.
[13] M. Turk and A. Pentland, "Eigenfaces for recognition", Journal of Cognitive Neuroscience, 3(1):71–86, 1991.
[14] Abdelatif Hussien A. Ali, "Face Recognition With Pre-moment processing", Journal of the Advances in Computer Science, vol. 8, June 2015.
[15] V. V. Starovoitov, D. I. Samal, D. V. Briliuk, "Three Approaches for Face Recognition", The 6th International Conference on Pattern Recognition and Image Analysis, October 21-26, 2002, Velikiy Novgorod, Russia, pp. 707-711.
[16] Ajoy Kumar Dey, Susmita Saha, Avijit Saha, Shibani Ghosh, "A Method of Genetic Algorithm (GA) for FIR Filter Construction: Design and Development with Newer Approaches in Neural Network Platform", International Journal of Advanced Computer Science and Applications, vol. 1, no. 6, pp. 87-90, 2010.
[17] Lin-Lin Huang, Akinobu Shimizu, Yoshihiro Hagihara, Hidefumi Kobatake, "Face detection from cluttered images using a polynomial neural network", Elsevier Science, 2002.
[18] Li, S. Z. and Lu, J., "Face recognition using the nearest feature line method", IEEE Transactions on Neural Networks, 10(2):439-443, 1999.
[19] Martin T. Hagan, Howard B. Demuth, Mark H. Beale, Orlando De Jesús, "Neural Network Design (2nd Edition)", Amazon, 2015.
[20] Abdellatief H. Abou Ali, "Non-Preemptive Multi-Constrain Scheduling for Multiprocessor with Hopfield Neural Network", International Journal of Computer Science and Information Security, vol. 11, no. 3, pp. 125-130, March 2013.
[21] Esraa Zeki Mohammed and Haitham Kareem Ali, "Hardware Implementation of Artificial Neural Network Using Field Programmable Gate Array", International Journal of Computer Theory and Engineering, vol. 5, no. 5, October 2013.
[22] L. Franco and S. A. Cannas, "Generalization and selection of examples in feedforward neural networks", Neural Comput., vol. 12, pp. 2405-2426, 2000.
[23] Huan Xu and Shie Mannor, "Robustness and generalization", Machine Learning, 86(3):391–423, 2012.
[24] Abdellatief Hussien AbouAli, "Learning Linear Sets Separation", International Journal of Engineering and Information Systems (IJEAIS), vol. 1, issue 9, November 2017, pp. 196-205.
[25] R. C. Gonzalez, R. E. Woods, Digital Image Processing, Addison-Wesley, 2012.
[26] S. Dabbaghchian, M. P. Ghaemmaghami, A. Aghagolzadeh, "Feature Extraction Using Discrete Cosine Transform and Discrimination Power Analysis with a Face Recognition Technology", Pattern Recognition, 43(4), 2010, pp. 1431–1440.
[27] S. Annadurai, A. Saradha, "Face recognition using Legendre Moments", Indian Conference on Computer Vision, Graphics and Image Processing, 2004.
[28] D. Sridhar, I. V. Murali Krishna, "Combined Classifier for Face Recognition using Legendre Moments", Computer Engineering and Applications, vol. 1, no. 2, December 2012.
[29] Rajiv Kapoor, Pallavi Mathur, "Face Recognition Using Moments and Wavelets", International Journal of Engineering Research and Applications, vol. 3, issue 1, January-February 2013, pp. 632-635.
[30] J. Shen, W. Shen and D. Shen, "On Geometric and Orthogonal Moments", Inter. Journal of Pattern Recognition and Artificial Intelligence, vol. 14, no. 7, pp. 875-894, 2000.
[31] F. Roli, G. Giacinto, and G. Vernazza, "Methods for Designing Multiple Classifier Systems", in 2nd International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, Cambridge, UK, Springer-Verlag, vol. 2096, pp. 78, 2001.
[32] G. P. Zhang, "Neural Networks for Classification: A Survey", IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, pp. 451-462, November 2000.
[33] K. Hornik, "Approximation Capabilities of Multilayer Feedforward Networks", Neural Networks, pp. 251–257, 1991.
[34] Collection of Facial Images: Faces94, http://cswww.essex.ac.uk/mv/allfaces/faces94.html
[35] W. A. Porter and W. Liu, "Neural Network Training Enhancement", Circuits Systems Signal Processing, vol. 15, no. 4, 1996, pp. 467-480.
[36] Dilipsinh Bheda, Mahasweta Joshi, Vikram Agrawal, "A Study on Features Extraction Techniques for Image Mosaicing", International Journal of Innovative Research in Computer and Communication Engineering, vol. 2, issue 3, March 2014.
[37] Yan Ke, Sukthankar, R., "PCA-SIFT: a more distinctive representation for local image descriptors", Proceedings of the IEEE Computer Vision and Pattern Recognition Conference (CVPR 2004), vol. 2, 2004.
[38] A. Jain, R. Duin, and J. Mao, "Statistical pattern recognition: A review", IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1):4–37, January 2000.
[39] Ajay Kumar Bansal and Pankaj Chawla, "Performance Evaluation of Face Recognition using PCA and N-PCA", International Journal of Computer Applications, vol. 76, no. 8, August 2013.
[40] Anas Fouad Ahmed, "A comparative study of human faces recognition using principle components analysis and linear discrimination analysis", Journal of Engineering and Sustainable Development, vol. 20, no. 05, Sep. 2016.
[41] Prashant Sharma, Amil Aneja, Amit Kumar, Shishir Kumar, "Face Recognition using Neural Network and Eigenvalues with Distinct Block Processing", International Journal of Scientific & Engineering Research, vol. 2, issue 5, May 2011.
[42] Alice C. Parker, Adi N. Azar, "A hierarchical artificial retina architecture", SPIE Europe Microtechnologies for the New Millennium, 2009, Dresden, Germany.
[43] Sumanpreet Kaur and Rajinder Singh Virk, "DCT Based Fast Face Recognition Using PCA and ANN", International Journal of Advanced Research in Computer and Communication Engineering, vol. 2, issue 5, May 2013.
[44] Hazim Kemal Ekenel and Rainer Stiefelhagen, "Local appearance based face recognition using discrete cosine transform", the 13th Europe conference on signal processing, 2005.
[45] Ziad M. Hafed and Martin D. Levine, "Face Recognition Using the Discrete Cosine Transform", International Journal of Computer Vision, 43(3), 167–188, 2001.
[46] Aman R. Chadha, Pallavi P. Vaidya, M. Mani Roja, "Face Recognition Using Discrete Cosine Transform for Global and Local Features", Proceedings of the 2011 International Conference on Recent Advancements in Electrical, Electronics and Control Engineering (IConRAEeCE).
[47] Abu Naser, S., Zaqout, I., Ghosh, M. A., Atallah, R., & Alajrami, E. (2015). Predicting Student Performance Using Artificial Neural Network: in the Faculty of Engineering and Information Technology. International Journal of Hybrid Information Technology, 8(2), 221-228.
[48] Abu Naser, S. S. (2012). Predicting learners performance using artificial neural networks in linear programming intelligent tutoring system. International Journal of Artificial Intelligence & Applications, 3(2), 65.
[49] Elzamly, A., Hussin, B., Abu Naser, S. S., Shibutani, T., & Doheir, M. (2017). Predicting Critical Cloud Computing Security Issues using Artificial Neural Network (ANNs) Algorithms in Banking Organizations. International Journal of Information Technology and Electrical Engineering, 6(2), 40-45.