Detecting Cybersecurity Threats Using AI Network

: In the field of cybersecurity, one of the most important tasks is to find a way to automatically and very effectively find online risks. This study shows how artificial intelligence and artificial neural networks can be used in a new way to find online threats. The suggested method turns a large number of security events into individual event profiles and uses a deep learning-based detecting method to greatly improve the ability to find cyber threats. To do this, an AI-SIEM (Artificial Intelligence Security Information and Event Management) system has been created. This system combines event profiling for data preprocessing with different artificial neural network techniques, such as Fully Connected Neural Networks (FCNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) networks. The main goal of this system is to tell the difference between real alerts and false ones, so that security experts can react quickly to new online risks. The writers did a lot of tests using real-world data and two standard datasets (NSLKDD and CICIDS2017) to see how well the system worked. Five common machine-learning methods (Support Vector Machines, k-Nearest Neighbors, Random Forests, Naive Bayes, and Decision Trees) were used in tests to compare how well they worked with known methods. The study's results show that the suggested methods work well for detecting network intrusions, even in real-world situations, where they do a better job than traditional machine-learning approaches.


INTRODUCTION
The use of artificial intelligence (AI) techniques has made learning-based methods for finding cyberattacks much more effective.Several studies have found that these methods work well.But the fact that cyber risks are always changing makes it very hard to protect IT systems from bad things happening in networks.Because network attacks and bad behavior happen all the time, it's very important to come up with effective defense mechanisms and security solutions [1,2,3,4].
Usually, there are two main methods used to find cyber risks and network attacks.An Intrusion Prevention System (IPS) is set up in the business network.Most of the time, signature-based methods are used to look at the protocols and flows of the network.It sends attack reports, which are called security events, to another system, like a Security Information and Event Management (SIEM) system.SIEM systems are widely used to collect and manage the reports that IPSs send out.SIEM stands out as the most common and effective method for examining security events and logs that have been gathered [5].Security experts are very important when it comes to looking into strange alerts.They do this by following rules and limits that have already been set and by studying links between events based on what they know about attack trends.
The use of learning-based methods to find where an attack happened in a big amount of data has been helpful for analysts who need to quickly evaluate many events.Analyst-driven solutions and machine learning-driven solutions are the two main types of information security solutions [10].Analyst-driven solutions are based on rules that security experts, called analysts, have already set up.On the other hand, methods based on machine learning are designed to find unusual or rare patterns, which makes it easier to find new cyber dangers [10].Even though learning-based techniques are useful for finding hacks in systems and networks, there are some problems with the methods that are already in place.
First of all, learning-based methods of spotting need tagged data to train and test models.It is often hard to get a large enough amount of named data for correct model training.Many commercial SIEM systems don't keep labeled data that can be used for supervised learning models, even though labeled data is important for these methods [10].
Second, many of the ways of learning that are suggested by research studies are not easy to use in the real world.Most network protection systems don't have these functions, which makes it hard to use them in real life.In recent intrusion detection study, automation through deep learning technologies has been looked into.Well-known datasets like NSLKDD [11], CICIDS2017 [12], and Kyoto-Honeypot [13] have been used to test performance.But because these standard samples only have a small number of features, they may not work well in the real world.To get around these problems, learning models should be tested with data taken from the real world.

DOMAIN INTRODUCTION
A Deep Neural Network (DNN) is a sophisticated type of neural network characterized by its complexity, specifically having more than two layers.These neural networks leverage advanced mathematical models to process data in intricate ways.Neural networks, in general, are technology designed to mimic human brain activities, particularly pattern recognition and the flow of input through layers of simulated neural connections.
Deep neural networks are typically defined as networks comprising an input layer, an output layer, and at least one hidden layer in between.Each layer serves a unique purpose in a process often referred to as "feature hierarchy," where data is sorted and organized hierarchically.Deep neural networks excel in handling unlabeled or unstructured data.The term "deep learning" is also commonly used to describe these networks, signifying a subset of machine learning that leverages artificial intelligence to classify and organize information in more complex ways than traditional input/output protocols.

Benefits of Deep Neural Networks
Neural networks intentionally incorporate randomness into their design to enhance their ability to learn and approximate complex functions effectively.Randomness is strategically employed in these machine learning algorithms because it often leads to improved performance.
One of the primary uses of randomness in neural networks is in the initial setup of network weights.While randomness can be applied in various aspects, here are a few key areas where it plays a role: 1. Randomness in Initialization: This involves the random assignment of weights in the neural network, which helps prevent the network from getting stuck in suboptimal solutions during training.

Randomness in Regularization:
Techniques like dropout introduce randomness by temporarily removing some neurons during training, preventing overfitting and improving generalization.3. Randomness in Layers: In tasks like natural language processing, randomness is introduced in layers like word embedding to capture the rich semantic relationships between words.

III. RELATED STUDY
This literature review delves into the extensive body of research surrounding the application of deep learning methods for intrusion detection.It provides a comprehensive overview of the key studies and developments in this field, examining the methodologies, models, and datasets used, as well as the performance and limitations of deep learning-based intrusion detection systems.By synthesizing the findings of these studies, this review aims to shed light on the current state of research, identify trends and challenges, and offer insights into the future directions of leveraging deep learning for the protection of digital ecosystems against cyber threats.Through this exploration, we seek to contribute to a deeper understanding of the efficacy and potential of deep learning techniques in the critical domain of intrusion detection.Naseer et al. [1] introduced and trained intrusion detection models employing different deep neural network architectures, including Convolutional Neural Networks (CNNs), Autoencoders, and Recurrent Neural Networks (RNNs).Their models were trained on the NSLKDD training dataset and evaluated on test datasets provided by NSLKDD.Notably, the DCNN and LSTM models achieved commendable performance, with 85% and 89% accuracy, respectively, on the test dataset.
B. Zhang et al. [2] categorized network intrusion detection methods into two types: direct methods employing a single algorithm and combination methods that combine multiple approaches.They proposed a novel detection model based on a directed acyclic graph (DAG) and a belief rule base (BRB).Their results revealed that the DAG-BRB combination model outperformed conventional detection models, demonstrating a higher detection rate when using the KDD 99 dataset.
Wang et al. [3] put forth a hierarchical spatial and temporal features-based intrusion detection system (HAST-IDS), which autonomously learns network traffic features.The approach involves initially learning spatial features using deep Convolutional Neural Networks (CNNs) and subsequently incorporating temporal features via Long Short-Term Memory (LSTM) networks.Experiments were conducted using datasets provided by DARPA and ISCX.
Vinayakumar et al. [15] developed a hybrid intrusion detection system capable of analyzing both network and host-level activities.They harnessed distributed deep learning models, employing Deep Neural Networks (DNNs), to process and analyze extensive real-time data.The selection of the DNN model was informed by comprehensive evaluations against classical machine learning classifiers using benchmark IDS datasets, such as NSLKDD and UNSW-NB15.
Khan et al.
[38] introduced an innovative two-stage deep learning model, built upon a stacked autoencoder with a softmax classifier, designed for efficient network intrusion detection.Extensive experiments were conducted on two widely used public datasets: the benchmark KDD99 and UNSW-NB15 datasets.Remarkably, this study achieved remarkable results, with accuracy rates reaching up to 99.9% for the KDD99 dataset and 89.1% for the UNSW-NB15 dataset.
Liao et al.
[39] proposed a novel algorithm based on the k-Nearest Neighbors (k-NN) classifier method using Term Frequency-Inverse Document Frequency (TF-IDF) for modeling program behavior in intrusion detection, particularly concerning system calls.This approach utilizes the frequencies of system calls to characterize program behavior.They employ text categorization techniques, like TF-IDF, to transform system call data into vectors and calculate the similarity between program system call activities.The authors reported that the TF-IDF-based k-NN classifier demonstrates promise in the domain of intrusion detection, particularly in the context of malware detection.These studies collectively highlight the diverse applications of deep learning techniques in intrusion detection, showcasing impressive results across different datasets and model architectures.

IV. METHODOLOGY a. Existing System
Since there is no wait staff in an automated café, it is difficult for the management to gauge how well the concept and the cuisine are received.Existing rating systems, such as Google and Trip Advisor, only partially address this problem since they only cover a subset of the client's findings.A small percentage of the customers who review the cafe on independent websites really use these rating systems.Clients who leave feeling hopeless or dissatisfied are the primary target audience here.

b. Proposed System
All consumers need to be encouraged to leave a review in order to address the aforementioned issue.This article presents a strategy for a restaurant rating system that encourages as many customers as possible to rate their experience after dining there.The system's ratings are based on facial expression recognition via the use of pre-trained convolutional neural network (CNN) models, making it suitable for usage in unattended dining establishments.The consumer may take a photo of his face while rating the cuisine to indicate how he feels about it.There is far less data and no first-hand accounts of user experience than in a text-based rating c.System Architecture

Module Description
To construct the complete prediction model, it is imperative to execute all the required modules in a stepby-step manner.To initiate the project, simply double-click on the 'run.bat'file, which will open the screen as depicted below.
To upload a dataset, use the 'Upload Train Dataset' button on the aforementioned interface.
After selecting the 'kdd_train.csv'dataset in the previous screen, the next screen will appear.

e. The TF-IDF Algorithm
To transform the raw dataset into TF-IDF values, as seen on the previous screen, click the 'Run Preprocessing TF-IDF Algorithm' button.
Select the 'Generate Event Vector' button on the previous page to generate a vector from TF-IDF including various events.

f. Generate Even Vector
You can see the complete amount of the dataset and how the application is using 80% of it for training (7999 records) and utilizing 20% for testing (2000 records) in the window to the right.After getting your datasets ready for training and testing, you may use the 'Neural Network Profiling' button to create an LSTM or CNN model.

g. Neural Network Profring
The accompanying screen depicts the generation and epoch running of an LSTM model with an initial accuracy of 0.94.It may take some time for the LSTM and CNN training processes to finish before the whole dataset can be run.There are 7999 records in this dataset, and LSTM will loop over all of them to apply filters and construct the model.
Here, we see that the LSTM model has finished all of its iterations and that the CNN model has begun running.

h. SVM Algorithm
Even while CNN's initial accuracy is just 0.72 on this screen, after 10 cycles of filtering and training, it reaches an impressive 0.99, which when multiplied by 100 yields an impressive 99%.So, as you can see in the accompanying GUI screen, CNN is providing more accuracy than LSTM.

KNN Algorithm
The above window displays the FMEA sure values, accuracy, precision, recall, and recall rates of both methods.Next, choose the option to "Run SVM Algorithm" to use the pre-built SVM algorithm.
You can see the results of the SVM algorithm above, and you can start the KNN algorithm by clicking the button labeled "Run KNN Algorithm."

Random Forest Algorithm
KNN algorithm result values are shown on the previous screen; to proceed with Random Forest algorithm, choose 'Run Random Forest Algorithm' button.
The result values of the Random Forest algorithm have been shown; to proceed with the Naive Bayes algorithm, choose the button labeled "Run Naive Bayes Algorithm" from the previous screen.

Naïve Base Algorithm
The result values of the Naive Bayes algorithm are shown on the preceding screen; to proceed with the Decision Tree Algorithm, click the "Run" button.

Decision Tree Algorithm
To see a graph comparing how well different algorithms perform, click the "Accuracy Comparison" button.
From the preceding graph, where the x-axis indicates algorithm names and the y-axis represents accuracy, it is clear that LSTM and CNN are effective methods.Below is a graph showing the results when you click "Precision Comparison Graph." Having looked at the preceding graph, you may now go to the 'Recall Comparison Graph' to see how well CNN is doing.

System Study:
The system study phase involves a thorough analysis of the project's feasibility and the presentation of a business proposal with a preliminary project plan and cost estimates.During this phase, the feasibility of the proposed system is critically examined to ensure it does not pose an undue financial burden on the organization.To conduct a feasibility analysis, it is crucial to have a clear understanding of the system's major requirements.The three primary considerations in feasibility analysis include:

Economic Feasibility:
This study assesses the economic impact that the system will have on the organization.It involves an examination of the funds available for research and development, considering budget limitations.The system's development should align with the budgetary constraints.This goal is achieved by leveraging freely available technologies, with only customized components incurring costs.

Technical Feasibility:
Technical feasibility analysis evaluates whether the system's technical requirements align with the organization's available resources.It is imperative that the system does not place excessive demands on the existing technical infrastructure, as this could strain the client's resources.The ideal system should have modest technical requirements, necessitating minimal or no changes to be made for its implementation.

Social Feasibility:
Social feasibility assessment aims to gauge the level of acceptance of the system by its users.This includes the process of training users to effectively utilize the system.Users should not perceive the system as a threat; rather, they should embrace it as a necessary tool.User acceptance depends on the methods employed to educate users about the system and make them familiar with it.Boosting user confidence is crucial, as it encourages constructive criticism, which is valuable given that users are the ultimate beneficiaries of the system.

Input and Output Design: Input Design:
Input design serves as the vital link between the information system and the user.It encompasses the development of specifications and procedures for data preparation, ensuring that transaction data is in a usable form for processing.Data input can occur through various means, such as reading data from written or printed documents using computerized scanning or having individuals manually enter data into the system.The primary focus of input design is to control the volume of input required, minimize errors, prevent delays, streamline processes, and maintain simplicity.Input design also takes into account security and user-friendliness, ensuring data privacy and ease of use.

Output Design:
Output design is essential for providing users with information that meets their requirements and presents data clearly and effectively.In any information system, the results of processing are communicated to users and other systems through outputs.Output design determines how information is displayed for immediate use and also considers hard-copy outputs.High-quality output design is critical for facilitating user decision-making.When designing computer output, it's essential to: • Organize the design process systematically, ensuring that the right output is developed.
• Choose appropriate methods for presenting information.
• Create documents, reports, or other formats containing information produced by the system.
The objectives of output design include conveying information about past activities, current status, or future projections, signaling important events or triggers, and confirming actions.Well-structured output enhances the system's ability to assist users in making informed decisions.

V. RESULT AND DISCUSSION Datasets
Our data comes from two major commercial systems (ESX1 and ESX-2) that we've given the descriptive names ESX1 and ESX-2.Over a period of 5 months for ESX-1 and 30 days for ESX-2, respectively, security raw events were gathered, with the detecting threat information recorded independently by SOC security experts whenever a network intrusion occurred.Time of threat occurrence, related attacks, attack type, response content, attack IP address, and victim network details are all included in the list of threat detection data.
We looked over 798 cyber threat detection cases in ESX-1 that span the whole collecting window.There were 240 scanning attacks, 547 system hack-king assaults, and 11 worm attacks among the total number of registered cyber threats.Comparatively, there are 941 scans, 3,077 hacks, and 51 worm assaults in ESX-2.SOC analysts manually sorted attacks into these categories.System hacking attacks may be broken down into many types, such as cross-site scripting, distributed denial of service, brute force, and injection.Scanning attacks include trojans, backdoors, and other malware.In all, 4,079 cyber threats were identified.

Data Visualization
t-SNE (t-distributed stochastic neighbor embedding) serves a dual purpose in data analysis.It's not only a common tool for visualizing vector data but also functions as an embedding tool to portray highdimensional data effectively.t-SNE achieves this by transforming high-dimensional data into two-dimensional maps while preserving the neighborhood relationships among data points.In this context, the t-SNE method is employed to visualize datasets like CICIDS 2017 and ESX-2.The t-SNE plots illustrate that both normal and attack data points are closely situated in the same space, making it challenging to distinguish between them as either normal or malicious.Despite the clustering of normal and attack data in the t-SNE plots, it's evident that they aren't linearly separable.Deep learning approaches are commonly utilized for handling high-dimensional data with non-linear characteristics, which is one of the reasons why deep learning methods are chosen for cyber threat detection.

Experimental Results
The experimental results yield two significant conclusions.First, the proposed machine learning models show promise as effective tools for network intrusion detection.When evaluated using well-established benchmark datasets such as NSLKDD and CICIDS 2017, these models perform on par with conventional machine learning techniques.This implies that the methods integrated into the AI-SIEM system have the potential for learning-based network intrusion detection.Second, when conventional learning-based methods that perform well on benchmark datasets are applied in real-world scenarios, their overall accuracy tends to be less reliable.However, the accuracy of the three EP-ANN models proposed in this study does not exhibit significant degradation, even when handling substantial amounts of data and lacking benchmark dataset features, as seen in the results for ESX-2.In contrast, the accuracy of conventional methods decreases from approximately 0.90 to 0.85 in such real-world scenarios.These findings underscore the robustness and practicality of the EP-ANN models in real-world network intrusion detection applications.

Epoch
In the context of neural networks, an epoch is a single iteration over the whole training set.A single forward pass and a single reverse pass are the iterations.The effectiveness of a classification system may be summarized with the help of a confusion matrix.Count numbers summarize the number of accurate and wrong predictions and are further broken down by class.This solves the mystery of the muddled matrix.The proportion of correctly identified cyber threats is the classification accuracy.Large numbers are difficult for computers to store.Instead, computers only have a certain amount of storage capacity for each individual number.Therefore, difficulties may arise when the number of time units that have passed since a system's epoch surpasses the maximum number that can fit in the space allocated for the time representation.Overflow may cause unpredictable behavior in certain systems, but in others, the time value will be reset to zero and the computer will treat the current time as if it were the epoch time again.

VI. Conclusion
In the realm of cybersecurity, Intrusion Detection Systems (IDS) and skilled network security auditors play a crucial role in promptly identifying and mitigating threats.These systems and experts are entrusted with the responsibility of sifting through a multitude of security events to pinpoint the most critical threats in real-time.Their primary objective is to discern the key elements of an attack and translate them into IDS signatures.Threat detection stands as a cornerstone in the cybersecurity strategy of IT organizations, particularly those reliant on cloud infrastructure.It encompasses the ability of IT entities to swiftly and accurately recognize threats directed at the network, applications, or other assets within the network.
The proposed approach revolves around the transformation of a large volume of collected security events into individual event profiles.It employs a deep learning-based detection method to enhance the identification of cyber threats.In this context, an AI-Security Information and Event Management (AI-SIEM) system has been developed, incorporating event profiling for data preprocessing and harnessing various artificial neural network techniques such as Fully Connected Neural Networks (FCNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM).The primary objective of this system is to distinguish between true positive and false positive alerts, thereby enabling security analysts to respond swiftly to digital threats.
In conclusion, this paper introduces an AI-SIEM system that leverages event profiles and artificial neural networks.The innovation lies in the ability to condense vast amounts of data into event profiles and apply deep learning techniques for more effective cyber-threat detection.The AI-SIEM system empowers security analysts to efficiently handle significant security alerts, facilitating the comparison of long-term security data.By reducing false positive alerts, it enables rapid responses to cyber threats embedded within a multitude of security events.Performance evaluation involved benchmark datasets (NSLKDD, CICIDS2017), as well as real-world datasets.Results indicate that this technology outperforms traditional machine learning methods in terms of accurate classifications.

Future Work
Looking ahead, future work will concentrate on improving early threat prediction through multiple deep learning approaches aimed at identifying long-term patterns in historical data.Additionally, efforts will be directed towards enhancing the precision of labeled datasets for supervised learning, involving the meticulous labeling of raw security events by SOC analysts over several months.The creation of purposebuilt test beds, encompassing big data platforms and AI-SIEM systems, will continue to play a role in performance evaluations.Furthermore, the collection of real-world Intrusion Prevention System (IPS) data over an extended period will provide valuable insights in the context of cybersecurity defense mechanisms

Figure 1 :
Figure 1: System Design d.MODULES The proposed algorithms consist of the following modules: Data Parsing: In this module, the input dataset is processed to create a raw data event model.TF-IDF: This module is used to convert the raw data into an event vector that contains normal and attack signatures.Event Profiling Stage: The processed data is divided into training and testing sets based on profiling events.Deep Learning Neural Network Model: Within this module, Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) algorithms are applied to both the training and testing data to generate a predictive model.The generated trained model is then applied to the test data to calculate prediction scores, Recall, Precision, and FMEA (Failure Modes and Effects Analysis) score.An algorithm that learns