IBM [IBM] on May 10 announced a version of its Watson cognitive technology system that will be trained on the language of cybersecurity in a year-long research project.

Named ‘Watson for Cyber Security,’ this new cloud-based version of the Watson system is learning the nuances of security research findings and discovering patterns and evidence of cyber attacks and threats that may otherwise be missed, IBM said. Starting this fall, the company will work with eight universities to further train Watson on the language of cybersecurity.

The academic partners include the California State Polytechnic University, Pomona; Pennsylvania State University; Massachusetts Institute of Technology (MIT); New York University (NYU); the University of Maryland, Baltimore County (UMBC); the University of New Brunswick; the University of Ottawa; and the University of Waterloo.

Students at the universities will initially help build Watson’s body of knowledge by annotating and feeding the system security reports and data. As they start working with IBM Security experts to learn the nuances of these reports, the students will also “be amongst the first in the world to gain hands-on experience in this emerging field of cognitive security,” IBM said.

The company expects to process up to 15,000 security documents per month over the next training phase as it combines the work of university partners, clients, and IBM experts. The security documents include threat intelligence reports, cybercrime strategies, and threat databases.

IBM said training Watson will also help build the taxonomy for the system. This includes an understanding of hashes, infection methods, and indicators of compromise, and will help identify advanced persistent threats.

The company highlighted that it will use its X-Force research library to feed information and documents to the Watson for Cyber Security program. The library includes 20 years of security research, details on 8 million spam and phishing attacks, and over 100,000 documented vulnerabilities.

IBM noted the current volume of data presented to cybersecurity analysts is immense: the average organization sees over 200,000 pieces of security event data per day; enterprises spend $1.3 million per year on false positives alone; the National Vulnerability Database lists over 75,000 known software vulnerabilities; 10,000 security research papers are published annually; and over 60,000 security blogs are published monthly. Watson is meant to be a tool that can make better use of this plethora of information.

The new Watson system is designed to be the first technology that can offer cognition of security data at scale using Watson’s ability to reason and learn from unstructured data. Such data, which includes blogs, articles, videos, reports, and alerts, accounts for 80 percent of all data on the internet and cannot be processed by traditional security tools. Watson “also uses natural language processing to understand the vague and imprecise nature of human language in unstructured data,” IBM said.

With all of this work and training, Watson for Cyber Security “is designed to provide insights into emerging threats, as well as recommendations on how to stop them, increasing the speed and capabilities of security professionals,” IBM said. The program is also set to include other Watson capabilities like the system’s data mining techniques for outlier detection, graphical representation tools, and techniques for finding connections between related data points in different documents.

The company described an example of Watson’s eventual capabilities: it could find data on an emerging type of malware in a security bulletin and combine it with data from an analyst’s blog on an emerging remediation strategy.

“The volume and velocity of data in security is one of our greatest challenges in dealing with cybercrime. By leveraging Watson’s ability to bring context to staggering amounts of unstructured data, impossible for people alone to process, we will bring new insights, recommendations, and knowledge to security professionals, bringing greater speed and precision to the most advanced cybersecurity analysts, and providing novice analysts with on-the-job training,” Marc van Zadelhoff, general manager of IBM Security, said in a statement.

This announcement is part of a larger cognitive security project aimed at addressing a general cybersecurity skills gap. The company’s effort is designed to use cognitive systems to improve the capabilities of security analysts. Cognitive systems automate the connections between data, emerging threats, and remediation strategies.

The company intends to start beta production deployments using IBM Watson for Cyber Security later in 2016.