Data has fundamentally changed the way that companies operate, redefining business models, creating new industries and opening up additional revenue streams, writes AWS Technical Evangelist, Ian Massingham. In the last five years alone internet users have increased by more than 82 percent and Gartner anticipates that data volume is set to grow 800 percent by 2022, with 80 percent of it residing as unstructured data.
While there is no denying that this growth is a huge opportunity for businesses across the globe, the flood of unstructured data also represents an evolving and ever greater challenge, especially for cybersecurity teams.
Whether it’s IoT devices, web services, logs, videos, user chats, mobile apps, photos or streaming data now flooding the network, every source still needs to be investigated and assessed for the risk it poses.
Machine learning and AI have long been lauded as the solution for extracting value from unstructured data and for assessing and evolving security postures across a business. But this is easier said than done and, first and foremost, requires the ability to understand unstructured data.
The Interpretation of Unstructured Data
Historically, one of the key problems for businesses trying to extract the true value from their data has been that the overwhelming majority of unstructured data isn’t set up for machine processing. Fundamentally, computers have not been able to understand context: emotions, speakers’ accents, and other details that humans take for granted have not been analysed or captured. This represents a major challenge as businesses look to utilise content such as images, audio, videos, e-mails, spreadsheets, and word processing documents to inform business decisions.
It is here that machine learning and artificial intelligence can make a marked difference, with the ability to derive insights from this multitude of sources.
Using AWS technologies, AI systems, devices and programmes such as chatbots are now able to recognise, interpret, process, and simulate human emotions. Utilising machine learning, conversational IVRs (interactive voice response systems) and chatbots can route customers to the right service flow faster and more accurately by factoring in emotions and tone of voice, bringing together the unstructured and structured data sets. As a result, when the system detects an angry or disgruntled user, that user can be routed to a specific channel that is better placed to defuse the situation.
As businesses capitalise on machine learning to understand unstructured data, it is no surprise that this has been mirrored within the cybersecurity sector.
Smarter Detection
As cybersecurity strategies evolve to protect against a fast-changing threat landscape, hackers are developing increasingly sophisticated methods to bypass these protections. Using machine learning to automate their attacks, hackers are making breaches more and more difficult to detect. With more than 40% of businesses experiencing cyber attacks in the last 12 months, businesses must beat cyber criminals at their own game by using machine learning to better protect their data, employees and crucially their brand.
Traditional cybersecurity solutions are able to detect attacks by aggregating information such as directories, URLs, parameters, and acceptable user inputs. However, this approach is no longer enough, because new data, applications and code are constantly broadening the threat of attack.
Enter AI and machine learning, which enable a smarter approach to threat detection. These technologies can understand patterns of behaviour across business databases and applications. By understanding what ‘normal’ looks like, they can swiftly pinpoint anomalies that could indicate an attack. This is a complex task that could not be undertaken manually, but a critical one that can enhance a business’s cyber defence strategy.
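To make the idea of learning what ‘normal’ looks like concrete, here is a minimal sketch in Python. It is a simple statistical stand-in, not the models used by any AWS service: it summarises past activity as a mean and standard deviation, then flags new observations that sit far from that baseline. The login counts, the function names and the threshold of three standard deviations are all illustrative assumptions.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarise 'normal' as the mean and sample standard deviation of past counts."""
    return mean(history), stdev(history)


def find_anomalies(baseline, observations, threshold=3.0):
    """Return (index, value) pairs lying more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: anything that differs from it is unusual.
        return [(i, v) for i, v in enumerate(observations) if v != mu]
    return [
        (i, v)
        for i, v in enumerate(observations)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical hourly login counts for one account (invented for illustration).
normal_hours = [42, 38, 45, 40, 41, 39, 44, 43, 37, 41]
new_hours = [40, 43, 310, 39]

baseline = build_baseline(normal_hours)
print(find_anomalies(baseline, new_hours))  # prints [(2, 310)]
```

A production system would typically learn many such baselines at once, per user, per application and per time of day, which is where machine learning takes over from a hand-written rule like this one.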
Machine learning technology also processes and organises data quickly and effectively, meaning that security teams are able to assess threats within the context of comprehensive, well-organised insights, rather than being inundated with an overwhelming amount of information. This is vital in helping teams focus their investigations on genuine threats rather than on false positives. Furthermore, machine learning-driven analysis ensures that any attacks that could be obscured by the flood of security events don’t go unnoticed and can be mitigated quickly and seamlessly.
Automated reasoning tools today provide functionality to customers through AWS services such as: Config, Inspector, GuardDuty, Macie, Trusted Advisor, and the storage service S3. As an example, customers using the S3 web-based console receive alerts – via SMT-based reasoning – when their S3 bucket policies are possibly misconfigured.
AWS Macie uses the same engine to find possible data exfiltration routes. The service behind this functionality regularly receives tens of millions of calls daily. GuardDuty uses automated reasoning to detect anomalous account and network activities. For example, GuardDuty will alert you if it detects remote API calls from a known malicious IP address, which may indicate compromised AWS credentials. GuardDuty also detects direct threats to your AWS environment that indicate a compromised instance, such as an Amazon EC2 instance sending encoded data within DNS queries.
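As a rough illustration of what "encoded data within DNS queries" can look like, the Python sketch below applies a common heuristic: a query name whose longest label is both unusually long and high in character entropy is treated as suspicious. This is not how GuardDuty works internally, which the article does not describe. The function names, the sample domain names and both thresholds are assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Entropy, in bits per character, of the character distribution in `text`."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encoded(query_name, min_length=30, min_entropy=3.5):
    """Flag a DNS name whose longest label is both long and high-entropy."""
    labels = query_name.rstrip(".").split(".")
    longest = max(labels, key=len)
    return len(longest) >= min_length and shannon_entropy(longest) >= min_entropy


# Hypothetical query names; the last one mimics a base32-style encoded payload.
queries = [
    "www.example.com",
    "mail.example.org",
    "k3qz7fwa2mhx5cvt6bney4gdjuopilrs.example.net",
]

for name in queries:
    print(name, looks_encoded(name))
```

A fixed threshold like this will misfire on some legitimate traffic, such as content delivery networks that use long generated hostnames, which is one reason a learned model of normal behaviour is preferable to a single rule.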
The Future Is Machine Learning
There is no denying that machine learning is a clear point of investment for businesses and a vital component of any cybersecurity strategy. As businesses across the globe look to protect against the latest wave of threats, having the ability to make sense of unstructured data and have clear insight into security incidents and the risks posed to a business has never been more important.
In the modern world, without machine learning, it would be impossible for security professionals to gather, organise and act on the sheer magnitude of security events that occur on a day-to-day basis. While security professionals will always have an important role to play in deciding how to tackle these events, going forward the role of machine learning will be to distil the large amounts of data into information these professionals can act on, gathered in one clear hub.
By using machine learning to automate attack detection and response and to streamline unstructured data, companies will have a quick and robust cyber defence system that evolves with their business. In doing so, businesses can put their customers at the centre of their processes and protect against threats before, during and after an attack.
Ian Massingham leads Technical and Developer Evangelism at Amazon Web Services and has been working with cloud computing technologies since 2008. He and his team around the world work with developers and other technical end-users at AWS customers of all sizes, from start-ups to large enterprises, to increase awareness and adoption of AWS cloud services amongst developers.