Maximizing the Potential of Open Source Threat Intelligence Feeds

May 11, 2017 | Payton Bush

Open source feeds are a popular and abundant source of threat intelligence indicators. These feeds originate from a variety of sources: companies, special projects, honeypots, individual contributors, and more. There are hundreds to choose from, providing a vast reserve of millions of indicators of compromise (IOCs) that can be ingested into security systems.

Open source threat intelligence feeds are appealing for a number of reasons. One of the more obvious is their price: absolutely nothing. This is critical for smaller organizations that lack the resources for more robust sources of intelligence. Cost aside, open source threat intelligence is also appealing because it provides a wide scope of information on different industries, topics, and locations. With the collaborative efforts of many contributors, users can benefit from intelligence without the hassle of contracts and data limits.

Open source threat intelligence is also popular because much of it derives from honeypots, which are decoy entities used to study invasive behaviors. These open- and closed-source applications register anomalies and problematic activity that can then be turned into feeds, software patches, and studies of adversarial behavior. Sharing the results of these honeypots across open source feeds allows interested parties to understand not only how an adversary has approached that particular system but also how they might attack other systems in the future. For those interested in deploying their own honeypots, Anomali manages the Modern Honeynet Project (MHN) to help simplify the process.

Problem

If it sounds a little too good to be true, unfortunately it is. Open source threat intelligence feeds are marked by a few key drawbacks. It’s not uncommon to see information overlaps between feeds, requiring some sort of manual de-duplication process. This is a daunting task considering the sheer quantity of indicators and range of feeds; depending on the format, a new script might be required for each source.
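
As a rough illustration of the per-format scripting described above, here is a minimal Python sketch that reads two feeds in different formats, normalizes each indicator, and merges the duplicates. The feed contents, the feed names, and the `indicator` column name are made up for the example; real feeds vary widely and would need their own parsers.

```python
import csv
import io
import ipaddress


def parse_plain_list(text):
    """Yield indicators from a one-per-line feed, skipping blanks and # comments."""
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            yield line


def parse_csv_feed(text, column):
    """Yield indicators from the named column of a CSV feed with a header row."""
    for row in csv.DictReader(io.StringIO(text)):
        value = (row.get(column) or "").strip()
        if value:
            yield value


def normalize(indicator):
    """Return a canonical form so that equivalent indicators compare equal."""
    value = indicator.strip().lower().rstrip(".")
    try:
        return str(ipaddress.ip_address(value))
    except ValueError:
        return value  # not an IP address: treat it as a domain name


def deduplicate(feeds):
    """Map each normalized indicator to the set of feed names that reported it."""
    seen = {}
    for name, indicators in feeds.items():
        for raw in indicators:
            seen.setdefault(normalize(raw), set()).add(name)
    return seen


# Hypothetical sample data (documentation-range addresses, example domain).
FEED_A = "# sample list\n198.51.100.7\nEvil.Example.COM.\n"
FEED_B = "indicator,type\n198.51.100.7,ip\nevil.example.com,domain\n203.0.113.9,ip\n"

merged = deduplicate({
    "feed_a": parse_plain_list(FEED_A),
    "feed_b": parse_csv_feed(FEED_B, "indicator"),
})
for value, sources in sorted(merged.items()):
    print(value, sorted(sources))
```

Keeping the set of reporting feeds for each indicator, rather than discarding the duplicates outright, preserves the fact that several sources agree, which can be useful later when judging how much to trust an indicator.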

Another persistent issue with open source intelligence is a lack of necessary context. Seeing the presence of an IP address in a particular feed doesn't give much detail about why that IP is considered bad. Knowing that the feed is centered around a specific type of data may be the only indication as to why an IP is malicious. Additional research and enrichment are then necessary to learn more about that IP and, hopefully, to understand what it is related to.

Perhaps the most pressing concern, though, is data quality. There are no standards or accountability, which allows feeds to be laden with false positives and even vulnerable to contamination from adversaries. Threat actors can also check for their presence within these feeds, giving them real-time updates on whether or not they’ve been detected.

Finally, what should be done with the feeds after they have been deduplicated, tagged with contextual information, scored in some way for confidence and seriousness of the threat they represent, and then enriched with additional data? The common choice is to compare the feed data with internal log data to see if there are indications of potentially malicious activity within the organization. The most obvious place for this to occur is in a SIEM if one is available. Comparing millions of indicators against perhaps billions of internal log entries is a daunting task, and it can produce tens or hundreds of thousands of matches. Some kind of scoring is important for knowing where to start if this many matches are to be usable in any way.
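
To make the matching step concrete, the following Python sketch compares a small set of scored indicators against log entries and orders the hits so the most pressing come first. It is only a sketch under stated assumptions: the `dest` log field, the 0-100 `confidence` and `severity` scales, and ranking by their product are all hypothetical choices, not how any particular SIEM works.

```python
from typing import NamedTuple


class Indicator(NamedTuple):
    value: str
    confidence: int  # 0-100, hypothetical scale
    severity: int    # 0-100, hypothetical scale


def match_logs(log_entries, indicators):
    """Return (indicator, log entry) pairs, highest-priority matches first."""
    # A dictionary lookup per log entry avoids comparing every entry
    # against every indicator.
    by_value = {ind.value: ind for ind in indicators}
    hits = []
    for entry in log_entries:
        ind = by_value.get(entry["dest"])
        if ind is not None:
            hits.append((ind, entry))
    # Assumed priority: confidence multiplied by severity.
    hits.sort(key=lambda hit: hit[0].confidence * hit[0].severity, reverse=True)
    return hits


# Hypothetical sample data.
indicators = [
    Indicator("198.51.100.7", confidence=40, severity=30),
    Indicator("203.0.113.9", confidence=90, severity=80),
]
logs = [
    {"src": "10.0.0.5", "dest": "192.0.2.1"},
    {"src": "10.0.0.8", "dest": "198.51.100.7"},
    {"src": "10.0.0.9", "dest": "203.0.113.9"},
]

hits = match_logs(logs, indicators)
for ind, entry in hits:
    print(ind.value, ind.confidence * ind.severity, entry["src"])
```

Without the sort, an analyst would see the matches in log order; with it, the high-confidence, high-severity match surfaces at the top of the queue.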

Solution

Difficulties with data curation shouldn’t prevent security teams from utilizing a valuable source of information. One solution is to deploy a Threat Intelligence Platform (TIP), which is a SaaS or on-premises application that manages the lifecycle of threat intelligence.

A Threat Intelligence Platform resolves five issues common amongst open source feeds:

  1. Different source formats - Information provided in STIX, .csv, PDF, Word, and more is automatically ingested and transformed into one usable stream of information
  2. Duplicate data - All duplicate data is automatically eliminated, saving analysts valuable time (and headaches)
  3. Lack of context - Information from feeds without context is enriched with intelligence such as WHOIS, PassiveDNS, and associations to Actor Groups, Campaigns, TTPs, etc.
  4. Scoring and false positive detection - Indicators are scored for both confidence (fidelity) and criticality making it easy to know which are the most pressing
  5. Integration with SIEMs or other internal systems - Out-of-the-box, easy integrations with internal SIEMs, endpoint security tools, or network security platforms are a way to operationalize all the available open source threat data
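
The article does not say how scores are calculated, so the following Python sketch is purely illustrative of the confidence scoring mentioned in item 4. It assumes a simple heuristic: confidence rises as more independent feeds report an indicator and decays as the last sighting ages. The function name, the formula, and the 30-day half-life are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone


def confidence_score(source_count, last_seen, now, half_life_days=30.0):
    """Return a 0-100 confidence value (hypothetical heuristic)."""
    # Corroboration: 1 source -> 0.5, 2 sources -> 0.75, 3 -> 0.875, ...
    corroboration = 1.0 - 0.5 ** source_count
    # Freshness halves every `half_life_days` since the last sighting.
    age_days = max((now - last_seen).total_seconds() / 86400.0, 0.0)
    freshness = 0.5 ** (age_days / half_life_days)
    return round(100 * corroboration * freshness)


NOW = datetime(2017, 5, 11, tzinfo=timezone.utc)

# Reported by two feeds and seen today.
fresh = confidence_score(2, NOW, NOW)
# Reported by one feed and last seen 30 days ago.
stale = confidence_score(1, NOW - timedelta(days=30), NOW)
print(fresh, stale)
```

A criticality score would need different inputs, such as the type of threat the indicator is associated with, and is left out of this sketch.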

The automation of data ingestion and enrichment allows intelligence from open source feeds to be made as immediately useful as intelligence from other curated feeds. A Threat Intelligence Platform also allows users to leverage open source intelligence not just as a supplement within investigations but as a dependable starting point. Ultimately, open source intelligence feeds can be an important component of information security efforts, but their value depends on how they are curated, ingested, and leveraged with other tools and data.
