Malware Classification Using Feature Reduction Method and Autoscaling
10 Pages Posted: 3 Oct 2022
Abstract
Today, cloud computing services are revolutionizing businesses and their operations, and the technology has become the basis for the digital transformation of companies. Currently, the most widely offered service model is Infrastructure as a Service (IaaS), which clients use to perform specific computing tasks. The use of IaaS carries several dangers. One typical risk is that a Virtual Machine (VM) becomes infected with malware and the malware spreads to other VMs inside the data center, putting the sensitive data of cloud vendors and customers at risk. Such attacks, which are intended to harm the underlying infrastructure, are not readily visible to human observers. To address these problems with a flexible solution, we propose a framework in which machine learning algorithms identify relevant features in an existing dataset. This paper presents malware classification using the feature reduction method and autoscaling (MCFA), a set of techniques that use static approaches to categorize various malware families. The study uses feature engineering to classify malware families from the bytes and asm files of the publicly available Kaggle Microsoft Malware Classification Challenge (2015) dataset. A convolutional neural network extracts features from the bytes files, and XGBoost extracts relevant and discriminative features from the asm files. The idea is to integrate several feature sets into hybrid features that provide diversity. Finally, the nine malware families are classified by training a multilayer perceptron on the hybrid feature set. The approach achieves a log-loss of 0.02 with a prediction accuracy of 98.91%, and the experimental findings show improvements over the baseline classifier.
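To make the two reported metrics concrete, the following minimal sketch computes multiclass log-loss and accuracy for a nine-class problem in plain Python. It is not the authors' evaluation code: the function names, the helper `confident_row`, and the toy predictions are assumptions introduced purely for illustration. It shows that a classifier assigning probability 0.98 to the correct family on every sample has a log-loss of roughly 0.0202, so a log-loss near 0.02 corresponds to placing about 98% probability on the true family on average (in the geometric-mean sense).

```python
import math

NUM_FAMILIES = 9  # the abstract reports nine malware families


def multiclass_log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log-probability assigned to each sample's true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # Clip to avoid log(0) when a model outputs an exact 0 or 1.
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)


def accuracy(true_labels, predicted_probs):
    """Fraction of samples whose highest-probability class is the true class."""
    hits = 0
    for label, probs in zip(true_labels, predicted_probs):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        hits += int(predicted == label)
    return hits / len(true_labels)


def confident_row(label, p_true=0.98):
    """Hypothetical prediction: p_true on one class, the rest spread evenly."""
    rest = (1.0 - p_true) / (NUM_FAMILIES - 1)
    return [p_true if k == label else rest for k in range(NUM_FAMILIES)]


# Toy data (made up): three samples from families 0, 3, and 8.
labels = [0, 3, 8]
probs = [confident_row(y) for y in labels]

print(multiclass_log_loss(labels, probs))  # roughly 0.0202, i.e. -ln(0.98)
print(accuracy(labels, probs))
```

For comparison, a classifier that outputs a uniform distribution over the nine families would score a log-loss of ln 9, about 2.2, which gives a sense of scale for the reported value.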
Keywords: Machine Learning, Microsoft Malware Classification Challenge (BIG 2015), ASM, Bytes, Deep Learning, Feature Reduction, Autoscaling, Classification