Using Machine Learning to Improve Endpoint Security

The threat landscape is as dangerous as ever. Machine learning endpoint security can help protect your most vulnerable devices: your endpoints. Learn more about how machine learning tools can help improve your endpoint security.

What Is Meant By Endpoint Security?

Endpoint security is the approach organizations take to protect their network from threats introduced through the endpoint devices that access it. Endpoints can be laptops, desktops, and even smartphones. Today’s digital resources, combined with the increase in remote workers, open a multitude of entry points through which hackers can reach your corporate network. This is why endpoint security is a vital part of your security strategy.

Machine learning endpoint security is a software tool that organizations use to monitor their endpoints. Managed Detection and Response (MDR) is an outsourced service in which security analysts monitor your endpoints on a 24/7 basis.

What is the Difference Between Endpoint Security and Antivirus?

Traditional antivirus programs are simpler and more limited in scope than machine learning endpoint security services such as Managed Detection and Response (MDR). Antivirus can be seen as one part of an MDR system.

Antivirus is generally a single program that serves basic purposes like scanning, detecting, and removing viruses and different types of malware.

Endpoint security systems, on the other hand, serve a much larger role. Endpoint security combines many security tools, such as firewalls, whitelisting tools, and monitoring tools, to provide comprehensive protection against digital threats. It usually runs on a client-server model and protects the various endpoints of an enterprise’s digital network.

Hence, machine learning endpoint security solutions are better suited to the modern enterprise, as traditional antivirus on its own has become obsolete as a tool for providing total security.

Read more at Traditional Antivirus vs. EDR (Endpoint Detection and Response)

What is Machine Learning in Security?

Machine learning is the use of statistics to find patterns in large amounts of data. Many platforms use machine learning and artificial intelligence to improve their algorithms, which in turn improves the overall user experience. Machine learning endpoint security looks for unusual patterns in user behavior to detect potential malware attacks.

According to SentinelOne, there are two main approaches for AI-based malware detection on the endpoint right now: looking at files and monitoring behaviors. The former approach uses static features — the actual bytes of the file and information collected by parsing file structures. Static features are things like PE section count, file entropy, opcode histograms, function imports and exports, and so on. These features are similar to what an analyst might look at to see what a file is capable of.
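As a rough illustration of the static features mentioned above, the sketch below computes file entropy and a byte histogram from raw bytes. The function names and the feature record layout are assumptions made for this example, not SentinelOne's implementation, and PE-specific features such as section counts or imports would need a real parser that is not shown here.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def byte_histogram(data: bytes) -> list:
    """Normalized frequency of each of the 256 possible byte values."""
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero for empty input
    return [counts.get(value, 0) / total for value in range(256)]


def static_features(data: bytes) -> dict:
    """Collect a few file-level features into one record (hypothetical names)."""
    histogram = byte_histogram(data)
    return {
        "size": float(len(data)),
        "entropy": byte_entropy(data),
        "zero_byte_ratio": histogram[0],
    }


if __name__ == "__main__":
    repetitive = b"hello world " * 100   # low-entropy, text-like content
    uniform = bytes(range(256)) * 4      # every byte value equally often
    print(static_features(repetitive))
    print(static_features(uniform))
```

Packed or encrypted content tends to push entropy toward 8 bits per byte, which is one reason entropy often appears in static feature sets. On its own it is a weak signal and would normally be one input among many.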

With enough data, the learning algorithm can generalize or “learn” how to distinguish between good and bad files. This means a well-built model can detect malware that wasn’t in the training set. This makes sense because you’re “teaching” software to do the job of a malware analyst. Traditional, signature-based detection, by contrast, generally requires getting a copy of the malware file and creating signatures, which users would then need to download, sometimes several times a day, to be protected.
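The contrast between exact signatures and a learned model can be sketched with toy data. This example assumes the simplest possible signature, a SHA-256 hash of a known file, and a nearest-centroid classifier over two made-up features. The labels `text_like` and `packed_like`, the synthetic byte samples, and all function names are hypothetical stand-ins, not real malware data or a real product's model.

```python
import hashlib
import math
import random
from collections import Counter


def entropy_bits(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (non-empty input)."""
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def features(data: bytes) -> tuple:
    """Two toy static features: normalized entropy and printable-ASCII ratio."""
    printable = sum(32 <= b < 127 for b in data) / len(data)
    return (entropy_bits(data) / 8.0, printable)


def centroid(vectors: list) -> tuple:
    """Component-wise mean of equally sized feature vectors."""
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def nearest_label(model: dict, vector: tuple) -> str:
    """Label of the centroid closest to the vector (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], vector))


def build_demo(seed: int = 0):
    """Build synthetic training data, a hash signature set, a model, and an unseen sample."""
    rng = random.Random(seed)
    text_like = [bytes(rng.choice(b"abcdefghij klmnop") for _ in range(2048))
                 for _ in range(5)]
    packed_like = [bytes(rng.randrange(256) for _ in range(2048))
                   for _ in range(5)]
    # "Signatures" here are exact hashes of the known packed-like samples.
    signatures = {hashlib.sha256(sample).hexdigest() for sample in packed_like}
    model = {
        "text_like": centroid([features(s) for s in text_like]),
        "packed_like": centroid([features(s) for s in packed_like]),
    }
    unseen = bytes(rng.randrange(256) for _ in range(2048))  # never seen in training
    return signatures, model, unseen


if __name__ == "__main__":
    signatures, model, unseen = build_demo()
    print("exact signature match:", hashlib.sha256(unseen).hexdigest() in signatures)
    print("model label:", nearest_label(model, features(unseen)))
```

The unseen sample has no matching hash, yet it lands near the same centroid as the samples it resembles. Real detection models use far richer features and far more data, and high entropy alone does not make a file malicious.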

The other type of AI-based approach is training a model on how programs behave. The real trick here is how you define and capture behavior. Monitoring behavior is a tricky, complex problem, and you want to feed your algorithm robust, informative, context-rich data which captures the essence of a program’s execution. To do this, you need to monitor the operating system at a very low level and, most importantly, link individual behaviors together to create full “storylines”. For example, if a program executes another program, or uses the operating system to schedule itself to execute on boot up, you don’t want to consider these different, isolated executions, but a single story.
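One way to picture the "storyline" idea is to group low-level events by the process tree they belong to. The sketch below assumes a hypothetical `Event` record with a process ID and parent process ID, and links events only through parent-child relationships. Capturing scheduled or boot-time executions as part of the same story would need additional links that are not modeled here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    pid: int
    parent_pid: Optional[int]  # None when the parent was not observed
    action: str


def build_storylines(events: list) -> dict:
    """Group event actions by the root process of their observed process tree."""
    parent_of = {}
    for event in events:
        parent_of.setdefault(event.pid, event.parent_pid)

    def root(pid: int) -> int:
        # Walk up through observed parents; stop at an unobserved parent or a cycle.
        seen = {pid}
        while True:
            parent = parent_of.get(pid)
            if parent is None or parent not in parent_of or parent in seen:
                return pid
            seen.add(parent)
            pid = parent

    stories = defaultdict(list)
    for event in events:
        stories[root(event.pid)].append(event.action)
    return dict(stories)


if __name__ == "__main__":
    events = [
        Event(100, None, "start dropper.exe"),
        Event(200, 100, "start payload.exe"),
        Event(300, None, "start notepad.exe"),
        Event(200, 100, "register run-at-boot"),
    ]
    for root_pid, actions in build_storylines(events).items():
        print(root_pid, actions)
```

Here the child process's actions are attributed to the same story as the process that launched it, rather than being judged as isolated executions.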

Training AI models on behavioral data is similar to training static models, but with the added complexity of the time dimension. In other words, instead of evaluating all features at once, you need to consider cumulative behaviors up to various points in time. Interestingly, if you have good enough data, you don’t need an AI model to convict an execution as malicious. For example, if a program starts executing with no user interaction, then tries to register itself to start when the machine boots, then starts listening to keystrokes, you could say it’s very likely a keylogger and should be stopped. These types of expressive “heuristics” are only possible with a robust behavioral engine.
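The keylogger example can be written as a small rule evaluated over cumulative behavior. This is a minimal sketch: the behavior names (`process_start`, `user_interaction`, `register_autostart`, `capture_keystrokes`) are hypothetical labels, and the rule is deliberately simplified so that any user interaction clears the process.

```python
from typing import Optional

# Ordered behaviors that, absent user interaction, suggest a keylogger.
KEYLOGGER_PATTERN = ("register_autostart", "capture_keystrokes")


def first_conviction_index(behaviors: list) -> Optional[int]:
    """Return the index at which cumulative behavior matches the rule, else None."""
    seen_interaction = False
    stage = 0  # how many pattern steps have been observed so far, in order
    for index, behavior in enumerate(behaviors):
        if behavior == "user_interaction":
            seen_interaction = True
        elif not seen_interaction and behavior == KEYLOGGER_PATTERN[stage]:
            stage += 1
            if stage == len(KEYLOGGER_PATTERN):
                return index
    return None


if __name__ == "__main__":
    silent = ["process_start", "register_autostart", "capture_keystrokes"]
    interactive = ["process_start", "user_interaction",
                   "register_autostart", "capture_keystrokes"]
    print(first_conviction_index(silent))
    print(first_conviction_index(interactive))
```

Because the rule is checked after each new behavior, a verdict is available at the earliest point in time where the evidence is complete, which reflects the time dimension described above.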

How Do You Evaluate AI Solutions?

This question comes up a lot, and understandably so. I’ve written about this before in What Matters with Machine Learning. Essentially, since AI is so new, people don’t know the right questions to ask, and there’s a lot of marketing hype distorting what’s truly important.

The important thing to remember is that AI is essentially teaching a machine, so you shouldn’t care how it was taught. Instead, you should only care how well it has learned. For example, instead of asking what training algorithm was used (e.g. neural network, SVM, etc.), ask how the performance was tested and how well it did. They should probably be using k-fold cross-validation to check whether the model is overfitting or generalizing well, and they should optimize for precision to avoid false positives. Of course, raw model performance alone won’t tell you how well the product works, because the model is probably just one component in a suite of detection mechanisms.
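To make the evaluation terms concrete, here is a minimal pure-Python sketch of k-fold cross-validation that reports precision per fold. The threshold "model", the toy scores, and all function names are assumptions for illustration only. A real evaluation would shuffle or stratify the data and would typically use an established library.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def precision(y_true: list, y_pred: list) -> float:
    """True positives divided by all predicted positives (0.0 if none predicted)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def fit_threshold(xs: list, ys: list) -> float:
    """Toy 'training': midpoint between the mean score of each class."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def cross_validated_precision(xs: list, ys: list, k: int) -> list:
    """Train on k-1 folds, measure precision on the held-out fold, for each fold."""
    results = []
    for train, test in k_fold_indices(len(xs), k):
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        predictions = [1 if xs[i] > threshold else 0 for i in test]
        results.append(precision([ys[i] for i in test], predictions))
    return results


if __name__ == "__main__":
    # Hypothetical maliciousness scores and labels (1 = malicious, 0 = benign).
    scores = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.15, 0.85, 0.25, 0.75]
    labels = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
    print(cross_validated_precision(scores, labels, k=5))
```

Each fold is scored on data the model did not see during fitting, so a large gap between training and held-out performance is a sign of overfitting. Precision is the metric that drops when benign files are wrongly flagged.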

Another important consideration is training data quality. Consider, for example, two people trying to learn advanced calculus. The first person practices by solving 1,000,000 highly similar problems from the first chapter of the book. The second person practices by solving only 100 problems, but makes sure that those 100 problems are similar to, and more difficult than, questions on practice tests. Which person do you think will learn calculus better? Likewise for AI, you shouldn’t bother asking how many features or training samples are used. Instead, ask how data quality is measured and how informative the features are. With machine learning, it’s garbage in, garbage out, and it’s important to ensure training data are highly varied, unbiased, and similar to what’s seen in the wild.

Can Attackers Hide from AI Detection?

Since static and dynamic AI are very different, adversaries must use different evasion techniques for each one. However, it should be noted that since AI is still fairly new, many attackers have not fully adapted and are not actively seeking to evade AI solutions specifically. They still rely heavily on traditional evasion techniques such as packing, obfuscation, droppers and payloads, process injection, and tampering with the detection products directly.

If attackers want to avoid static AI detection, they essentially must change how their compiled binary looks, and since it’s impossible to know how they should change it a priori, they’ll have to try a bunch of variations of source code modification, compilation options, and obfuscation techniques until they find one that isn’t detected. This is a lot of work, and it scales up with the number of products they’re trying to avoid.

What is Next-Generation Endpoint Security?

It was once believed that antivirus was enough to protect your endpoints. Endpoint security has since taken over as the better technology for that job. Endpoint Detection and Response (EDR) was formerly known as Endpoint Threat Detection and Response (ETDR) and is sometimes referred to as Next-Generation Anti-Virus (NG AV).

The industry vernacular then moved to Managed Detection and Response, or MDR. At Cybriant, we call our MDR service Managed Detection and Remediation because our team will walk you through the remediation process, which is a valuable step in prevention. The next step in endpoint security is XDR. The X in XDR stands for the multiple data sources that feed into prevention and detection.
