UTTAMA: An Intrusion Detection System Based on Feature Clustering and Feature Transformation

Foundations of Science

Abstract

Detecting intrusions and anomalies is becoming more challenging as new attacks continue to emerge. Achieving better accuracy with the benchmark classification algorithms used to identify intrusions and anomalies involves several hidden data mining challenges. Although neglected in many research studies, one of the most important challenges is similarity, or membership, computation. Another challenge that cannot be neglected is the number of features, which contributes to dimensionality. This research proposes a new membership function for similarity computation that can help address feature dimensionality issues. In principle, this work aims to introduce a novel membership function that helps achieve better classification accuracy and, eventually, better intrusion and anomaly detection. Experiments are performed on the KDD dataset with 41 attributes and on the KDD dataset with 19 attributes. CANN and CLAPP are recent approaches to intrusion detection. The proposed classifier is named UTTAMA. UTTAMA performed better than both CANN and CLAPP with respect to overall classifier accuracy. Another promising outcome achieved with UTTAMA is its accuracy on U2R and R2L attacks. The significance of the proposed approach is that its accuracy exceeds that of CLAPP, CANN, SVM, KNN and other existing classifiers.

References

  • Abaei, G., & Selamat, A. (2014a). A survey on software fault detection based on different prediction approaches. Vietnam Journal of Computer Science, 1, 79–95. https://doi.org/10.1007/s40595-013-0008-z.

  • Abaei, G., & Selamat, A. (2014b). A survey on software fault detection based on different prediction approaches. Vietnam J Comput Sci, 1, 79–95. https://doi.org/10.1007/s40595-013-0008-z.

  • Aggarwal, P., & Sharma, S. K. (2015). Analysis of KDD dataset attributes: Class wise for intrusion detection. In 3rd international conference on recent trends in computing 2015 (ICRTC-2015), procedia computer science (vol. 57, pp. 842–851).

  • Aljawarneh, S. (2011). A web engineering security methodology for e-learning systems. Network Security 2011(3), 12–15, ISSN 1353-4858. https://doi.org/10.1016/S1353-4858(11)70026-5.

  • Aljawarneh, S., Radhakrishna, V., Kumar, P. V., & Janaki, V. (2016a). A similarity measure for temporal pattern discovery in time series data generated by IoT. In 2016 international conference on engineering & MIS (ICEMIS), Agadir (pp. 1–4).

  • Aljawarneh, S., Yassein, M. B., & Talafha, W. A. (2017a). A resource efficient encryption algorithm for multimedia big data. Multimedia Tools and Applications, 76, 22703. https://doi.org/10.1007/s11042-016-4333-y.

  • Aljawarneh, S., Yassein, M. B., & Talafha, W. A. (2018). A multithreaded programming approach for multimedia big data: Encryption system. Multimedia Tools and Applications, 77, 10997. https://doi.org/10.1007/s11042-017-4873-9.

  • Aljawarneh, S. A., Alawneh, A., & Jaradat, R. (2017b). Cloud security engineering: Early stages of SDLC. Future Generation Computer Systems 74, 385–392, ISSN 0167-739X. https://doi.org/10.1016/j.future.2016.10.005.

  • Aljawarneh, S. A., Moftah, R. A., & Maatuk, A. M. (2016b). Investigations of automatic methods for detecting the polymorphic worms signatures. Future Generation Computer Systems 60, 67–77, ISSN 0167-739X. https://doi.org/10.1016/j.future.2016.01.020.

  • Aljawarneh, S. A., Radhakrishna, V., & Cheruvu, A. (2017c). Extending the Gaussian membership function for finding similarity between temporal patterns. In 2017 international conference on engineering & MIS (ICEMIS), Monastir (pp. 1–6).

  • Aljawarneh, S. A., & Vangipuram, R. (2018). GARUDA: Gaussian dissimilarity measure for feature representation and anomaly detection in Internet of things. Journal of Supercomputing. https://doi.org/10.1007/s11227-018-2397-3.

  • Aljawarneh, S. A., & Yassein, M. O. B. (2016). A conceptual security framework for cloud computing issues. International Journal of Intelligent Information Technologies. https://doi.org/10.4018/ijiit.2016040102.

  • Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1–127.

  • Biggio, B., Fumera, G., & Roli, F. (2014). Security evaluation of pattern classifiers under attack. IEEE Transactions on Knowledge and Data Engineering, 26(4), 984–996. https://doi.org/10.1109/TKDE.2013.57.

  • Cardoso-Cachopo, A., & Oliveira, A. (2007). Semi-supervised single-label text categorization using centroid-based classifiers. In Proceedings of the ACM symposium on applied computing (pp. 844–851).

  • Cha, S.-H. (2007). Comprehensive survey on distance/similarity measures between probability density functions. International Journal of Mathematical Models and Methods in Applied Sciences, 1(4), 300–307.

  • Chandola, V., Banerjee, A., & Kumar, V. (2009a). Anomaly detection: A survey. ACM Computing Surveys, 41(3), 1–72.

  • Chandola, V., Banerjee, A., & Kumar, V. (2009b). Anomaly detection: A survey. ACM Computing Surveys, 41(3), 15:1–15:58.

  • Chmielewski, A., & Wierzchoń, S. (2007). On the distance norms for detecting anomalies in multidimensional datasets. Zeszyty Naukowe Politechniki Białostockiej, 2, 39–49.

  • Detristan, T., Ulenspiegel, T., Malcom, Y., & Underduk, M. (2003). Polymorphic shell code engine using spectrum analysis. Phrack Issue 0x3d.

  • Dickerson, J. E., & Dickerson, J. A. (2000). Fuzzy network profiling for intrusion detection. In PeachFuzz 2000. 19th international conference of the North American fuzzy information processing society - NAFIPS (cat. no. 00TH8500), Atlanta, GA (pp. 301–306). https://doi.org/10.1109/nafips.2000.877441.

  • Eskin, E., Arnold, A., Prerau, M., Portnoy, L., & Stolfo, S. (2002). A geometric framework for unsupervised anomaly detection: Detecting intrusions in unlabeled data. In D. Barbara & S. Jajodia (Eds.), Applications of data mining in computer security. Dordrecht: Kluwer.

  • Esposito, C., Su, X., Aljawarneh, S. A., & Choi, C. (2018). Securing collaborative deep learning in industrial applications within adversarial scenarios. IEEE Transactions on Industrial Informatics, 14(11), 4972–4981. https://doi.org/10.1109/TII.2018.2853676.

  • Gaffney, J., & Ulvila, J. (2001). Evaluation of intrusion detectors: A decision theory approach. In IEEE symposium on security and privacy (pp. 50–61).

  • Ganapathy, S., Kulothungan, K., Muthurajkumar, S., Vijayalakshmi, M., Yogesh, P., & Kannan, A. (2013). Intelligent feature selection and classification techniques for intrusion detection in networks: a survey. EURASIP Journal on Wireless Communications and Networking, 2013, 271.

  • Gunupudi, R. K., Nimmala, M., Gugulothu, N., & Gali, S. R. (2017). CLAPP: A self-constructing feature clustering approach for anomaly detection. Future Generation Computer Systems 74, 417–429, ISSN 0167-739X.

  • Hidayanto, B. C., Muhammad, R. F., Kusumawardani, R. P., & Syafaat, A. (2017). Network intrusion detection systems analysis using frequent item set mining algorithm FP-max and apriori. Procedia Computer Science 124, 751–758, ISSN 1877-0509.

  • Ibrahimi, K., & Ouaddane, M. (2017). Management of intrusion detection systems based-KDD99: Analysis with LDA and PCA. In 2017 international conference on wireless networks and mobile communications (WINCOM), Rabat (pp. 1–6). https://doi.org/10.1109/wincom.2017.8238171.

  • Imran, A., Aljawarneh, S. A., & Sakib, K. (2016). Web data amalgamation for security engineering: Digital forensic investigation of open source cloud. Journal of Universal Computer Science, 22(4), 494–520.

  • Ji, S.-Y., Jeong, B.-K., Choi, S., & Jeong, D. H. (2016). A multi-level intrusion detection method for abnormal network. Journal of Network and Computer Applications, 62, 9–17. https://doi.org/10.1016/j.jnca.2015.12.004.

  • Karapistoli, E., & Economides, A. A. (2014). ADLU: A novel anomaly detection and location attribution algorithm for UWB wireless sensor networks. EURASIP Journal on Information Security, 2014, 3.

  • Kloft, M., & Laskov, P. (2010). Online anomaly detection under adversarial impact. In Proceedings of the 13th international conference on artificial intelligence and statistics (AISTATS) 2010, Chia Laguna Resort, Sardinia, Italy. Volume 9 of JMLR: W&CP 9.

  • Kruegel, C., Mutz, D., Robertson, W., & Valeur, F. (2003). Bayesian event classification for intrusion detection. In Proceedings of the 19th annual computer security applications conference (ACSAC ’03) (p. 14). IEEE Computer Society, Washington, DC, USA.

  • Kruegel, C., Toth, T., & Kirda, E. (2002). Service specific anomaly detection for network intrusion detection. In ACM symposium on applied computing (SAC).

  • Kumar, G. R., Mangathayaru, N., & Narsimha, G. (2016a). An approach for intrusion detection using novel gaussian based kernel function. Journal of Universal Computer Science, 22(4), 589–604.

  • Kumar, G. R., Mangathayaru, N., & Narsimha, G. (2017). A feature clustering based dimensionality reduction for intrusion detection. IADIS International Journal on Computer Science & Information Systems, 12(1), 26–44.

  • Kumar, G. R., Nimmala, M., & Narsimha, G. (2016b). A novel similarity measure for intrusion detection using gaussian function. Technical Journal of the Faculty of Engineering, 39(2), 173–183.

  • Liang, H., Sun, X., Sun, Y., & Gao, Y. (2017). Text feature extraction based on deep learning: A review. EURASIP Journal on Wireless Communications and Networking, 2017, 211. https://doi.org/10.1186/s13638-017-0993-1.

  • Libralon, G. L., de Leon Ferreira de Carvalho, A. C. P., & Lorena, A. C. (2009). Pre-processing for noise detection in gene expression classification data. Journal of the Brazilian Computer Society, 15, 3. https://doi.org/10.1007/BF03192573.

  • Lin, W.-C., Ke, S.-W., & Tsai, C.-F. (2015). CANN: An intrusion detection system based on combining cluster centers and nearest neighbors. Knowledge-Based Systems 78, 13–21, ISSN 0950-7051.

  • Lippmann, R. P., et al. (2000). Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation.

  • Mukkamala, S., Sung, A., & Abraham, A. (2005). Intrusion detection using an ensemble of intelligent paradigms. Journal of Network and Computer Applications, 28(2), 167–182.

  • Mukkamala, S., & Sung, A. H. (2006). Significant feature selection using computational intelligent techniques for intrusion detection (pp. 285–306). Berlin: Springer.

  • Nagaraja, A., Aljawarneh, S., & Prabhakara, H. S. (2018b). PAREEKSHA: A machine learning approach for intrusion and anomaly detection. In Proceedings of the first international conference on data science, e-learning and information systems (DATA ’18). New York: ACM. https://doi.org/10.1145/3279996.3280032.

  • Nagaraja, A., Mangathayaru, N., Rajashekar, N., & Kumar, T. S. (2016). A survey on routing techniques for transmission of packets in networks. In 2016 international conference on engineering & MIS (ICEMIS), Agadir (pp. 1–6).

  • Nagaraja, A., & Satish Kumar, T. (2018). An extensive survey on intrusion detection- past, present, future. In Proceedings of the fourth international conference on Engineering & MIS 2018 (ICEMIS ’18). New York: ACM. https://doi.org/10.1145/3234698.3234743.

  • Nagaraja, A., Sravan Kiran, V., Prabhakara, H. S., & Rajasekhar, N. (2018a). A membership function for intrusion and anomaly detection of low frequency attacks. In Proceedings of the first international conference on data science, e-learning and information systems (DATA ’18). New York: ACM. https://doi.org/10.1145/3279996.3280031.

  • Nelson, B., & Joseph, A. D. (2006). Bounding an attack’s complexity for a simple learning model. In Proceedings of the first workshop on tackling computer systems problems with machine learning techniques (SysML), Saint-Malo, France.

  • Perdisci, R., Ariu, D., Fogla, P., Giacinto, G., & Lee, W. (2009). McPAD: A multiple classifier system for accurate payload-based anomaly detection. Computer Networks 53(6), 864–881, ISSN 1389-1286. https://doi.org/10.1016/j.comnet.2008.11.011.

  • Portnoy, L., Eskin, E., & Stolfo, S. (2001). Intrusion detection with unlabeled data using clustering. In ACM CSS workshop on data mining applied to security.

  • Radhakrishna, V., Aljawarneh, S., & Cheruvu, A. (2018a). Sequential approach for mining of temporal itemsets. In Proceedings of the fourth international conference on engineering & MIS 2018 (ICEMIS ’18). New York, NY, USA: ACM, Article 33, 6 p.

  • Radhakrishna, V., Aljawarneh, S. A., Kumar, P. V., et al. (2018b). A novel fuzzy gaussian-based dissimilarity measure for discovering similarity temporal association patterns. Soft Computing, 22, 1903. https://doi.org/10.1007/s00500-016-2445-y.

  • Radhakrishna, V., Aljawarneh, S. A., Kumar, P. V., & Janaki, V. (2017a). A novel fuzzy similarity measure and prevalence estimation approach for similarity profiled temporal association pattern mining. Future Generation Computer Systems, ISSN 0167-739X. https://doi.org/10.1016/j.future.2017.03.016.

  • Radhakrishna, V., Aljawarneh, S. A., Kumar, P. V., & Janaki, V. (2017b). ASTRA: A novel interest measure for unearthing latent temporal associations and trends through extending basic gaussian membership function. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-017-5280-y.

  • Radhakrishna, V., Kumar, P. V., Aljawarneh, S. A., & Janaki, V. (2017c). Design and analysis of a novel temporal dissimilarity measure using Gaussian membership function. In 2017 international conference on engineering & MIS (ICEMIS), Monastir (pp. 1–5).

  • Radhakrishna, V., Kumar, P. V., & Janaki, V. (2015). A temporal pattern mining based approach for intrusion detection using similarity measure. In Proceedings of the international conference on engineering & MIS 2015 (ICEMIS ’15). New York, NY, USA: ACM, Article 64, 8 p. https://doi.org/10.1145/2832987.2833077.

  • Radhakrishna, V., Kumar, P. V., & Janaki, V. (2016a). A novel similar temporal system call pattern mining for efficient intrusion detection. Journal of Universal Computer Science, 22(4), 475–493.

  • Radhakrishna, V., Kumar, P. V., & Janaki, V. (2017d). SRIHASS-a similarity measure for discovery of hidden time profiled temporal associations. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-017-5185-9.

  • Radhakrishna, V., Kumar, P. V., & Janaki, V. (2018c). Krishna Sudarsana: A Z-space similarity measure. In Proceedings of the fourth international conference on engineering & MIS 2018 (ICEMIS ’18). New York, NY, USA: ACM, Article 44, 4 p.

  • Radhakrishna, V., Kumar, P. V., Janaki, V., & Aljawarneh, S. (2016b). A similarity measure for outlier detection in timestamped temporal databases. In 2016 international conference on engineering & MIS (ICEMIS), Agadir (pp. 1–5).

  • Sammulal, P., Usha Rani, Y., & Yepuri, A. (2017). A class based clustering approach for imputation and mining of medical records (CBC-IM). IADIS International Journal on Computer Science & Information Systems, 12(1), 61–74.

  • Siddiqui, M. K., & Naahid, S. (2013). Analysis of KDD CUP 99 dataset using clustering based data mining. International Journal of Database Theory and Application, 6(5), 23–34. https://doi.org/10.14257/ijdta.2013.6.5.03.

  • Subudhi, S., & Panigrahi, S. (2018). A hybrid mobile call fraud detection model using optimized fuzzy C-means clustering and group method of data handling-based network. Vietnam Journal of Computer Science. https://doi.org/10.1007/s40595-018-0116-x.

  • Tavallaee, M., Bagheri, E., Lu, W., & Ghorbani, A. (2009). A detailed analysis of the KDD CUP 99 data set. In Submitted to second IEEE symposium on computational intelligence for security and defense applications (CISDA).

  • Tsai, C.-F., Lin, W.-Y., Hong, Z.-F., & Hsieh, C.-Y. (2011a). Distance-based features in pattern classification. EURASIP Journal on Advances in Signal Processing, 2011, 62.

  • Tsai, C.-F., Lin, W.-Y., Hong, Z.-F., Hsieh, C.-Y., et al. (2011b). Distance-based features in pattern classification. EURASIP Journal on Advances in Signal Processing, 2011, 62.

  • Wang, K., & Stolfo, S. (2006). Anagram: A content anomaly detector resistant to mimicry attack. In Recent advances in intrusion detection (RAID).

  • Wang, W., Dunqiang, L., Zhou, X., Zhang, B., & Jiasong, M. (2013). Statistical wavelet-based anomaly detection in big data with compressive sensing. EURASIP Journal on Wireless Communications and Networking, 2013, 269.

  • Wang, Y., et al. (2014). Problems of KDD cup 99 dataset existed and data preprocessing. Applied Mechanics and Materials, 667, 218–225.

  • Weller-Fahy, D. J., Borghetti, B. J., & Sodemann, A. A. (2015). A survey of distance and similarity measures used within network intrusion anomaly detection. IEEE Communications Surveys & Tutorials, 17(1), 70–91.

  • Xue-qin, Z., Chun-hua, G., & Jia-jun, L. (2006). Intrusion detection system based on feature selection and support vector machine. In 2006 first international conference on communications and networking in China, Beijing (pp. 1–5).

  • Yelipe, U., Porika, S., & Golla, M. (2018). An efficient approach for imputation and classification of medical data values using class-based clustering of medical records. Computers & Electrical Engineering, 66, 487–504.

Author information

Corresponding author

Correspondence to Arun Nagaraja.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nagaraja, A., Uma, B. & Gunupudi, R.K. UTTAMA: An Intrusion Detection System Based on Feature Clustering and Feature Transformation. Found Sci 25, 1049–1075 (2020). https://doi.org/10.1007/s10699-019-09589-5

  • DOI: https://doi.org/10.1007/s10699-019-09589-5
