The Defense Advanced Research Projects Agency (DARPA) has tasked BAE Systems' FAST Labs research and development organisation with developing a scalable machine learning system that provides data anonymity to improve data sharing for its Cooperative Secure Learning (CSL) programme, which has many potential applications, including cybersecurity.
Image courtesy BAE Systems / copyright Shutterstock
Security Operations Centres (SOCs) work at the forefront of machine learning technology, analysing large volumes of data to identify emerging cyber threats. Yet standard security log fields such as IP addresses and URLs reveal sensitive details about an organisation's internal infrastructure and services that adversaries can use to craft attacks. The result is limited information sharing among SOCs, leaving each with only pieces of the cyber threat landscape puzzle.
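To illustrate why raw logs cannot simply be shared, and how sanitisation can help, here is a minimal sketch of one common approach: replacing sensitive fields with keyed hashes before a record leaves the organisation. The field names, key, and helper function are hypothetical and not part of PArCEL; this is only a generic example of log sanitisation, not BAE Systems' technique.

```python
import hmac
import hashlib

# Hypothetical per-organisation secret; without it, recipients cannot
# reverse the pseudonyms back to real IPs or URLs.
SECRET_KEY = b"org-private-key"

def pseudonymise(value: str) -> str:
    """Replace a sensitive field with a keyed hash. Records can still be
    correlated across shared logs (same input -> same pseudonym) without
    revealing the underlying value."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

record = {"src_ip": "10.1.2.3", "url": "https://intranet.example/admin", "bytes": 4096}
sanitised = {
    "src_ip": pseudonymise(record["src_ip"]),
    "url": pseudonymise(record["url"]),
    "bytes": record["bytes"],  # non-identifying fields pass through unchanged
}
```

Because the hash is deterministic under the same key, two sanitised records that share a source IP remain linkable for analysis, while an outside recipient sees only opaque tokens.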
To advance research in this area, DARPA asked BAE Systems to develop a scalable machine learning solution that preserves the privacy of the data and the model, enabling Cooperative Secure Learning. By sharing information – while keeping the actual data secure – individual puzzle pieces can be connected to build the complete picture, enabling new cybersecurity models and protections to be created.
“The challenge was immense but very straightforward – to create a system of information sharing for advanced research and security modelling in a way that preserves the security of the data,” said Bernard McShea, principal investigator at BAE Systems’ FAST Labs. “Our approach allowed us to leverage the organization’s extensive experience in machine learning, security and success on previous DARPA programmes.”
BAE Systems' Privacy-preserving Arithmetic Computation for Encrypted Learning solution, known as PArCEL, overcomes common privacy challenges by combining the company's recent research in cooperative learning on encrypted feature embeddings with new network log sanitisation techniques. Unlike approaches that encrypt raw private data, BAE Systems' focus on feature embeddings reduces computational complexity while providing additional protection against the information leakage that discourages sharing.
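The idea of sharing feature embeddings rather than raw records can be sketched as follows. This is a generic hashing-trick embedding for illustration only; the tokenisation, dimension, and function names are assumptions and do not describe PArCEL's actual (encrypted) embedding scheme.

```python
import hashlib
import math

def embed(tokens, dim=8):
    """Map raw log tokens to a fixed-length, unit-norm feature vector via
    the hashing trick. Only this abstract vector, not the raw log line,
    would be shared with cooperating parties."""
    vec = [0.0] * dim
    for tok in tokens:
        h = int(hashlib.sha256(tok.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0  # bucket count for this token's hash slot
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Logs describing similar activity yield overlapping embeddings, so a
# shared model can learn from them without seeing the raw fields.
a = embed(["GET", "/admin", "status:403"])
b = embed(["GET", "/admin", "status:404"])
```

Operating on compact vectors like these, rather than on encryptions of the full raw records, is one reason an embedding-based approach can lower computational cost.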
Work on this approximately $1 million programme, which includes research teammates led by Prof. Khorrami at New York University, builds on BAE Systems’ previous work in cyber hunting, automated defence of cyber datasets, machine learning, and related techniques.