After you have watched this Webinar, please feel free to contact us with any questions you may have at email@example.com.
JOE GEHRKE: Just first, quickly, introduction.
My name's Joe Gehrke.
I'm a solutions architect covering the Americas, based in Chicago.
Been doing security for about 20 years, threat intel for the past four or five.
I will say that I am not an analyst, so my background is security integrations.
A lot of what I talk about is going to be focused, obviously, on integrations, but from a process standpoint, I think a lot of you guys probably will know that way better than I do.
So forgive me if I mess up, and just interrupt and let me know.
This is kind of a mix of platform agnostic, and then I was planning to get a little bit deeper into the platform itself.
So if anybody is not a customer, I'll try not to get too deep because it won't be too relevant.
But with that, I think a lot of you are familiar with a browser tab that looks like this, probably. Everybody knows this.
And one of the things that threat intelligence platforms, in general, have not been great at is providing all of that same kind of functionality in a usable way inside the TIP.
Anomali's the same.
And so getting some of these easy ones in there to start to give flows where maybe you can do things like DNS twist, or you can check your internal systems to see if anybody has been calling out to a URL that's been reported by the threat intelligence feed.
All these types of things are pretty common, and what I hope to show today is that the frameworks that are available in Anomali and other TIPs today are extensible, and we should start thinking about enrichments beyond what we traditionally have.
And when I say beyond what we traditionally have, enrichments have usually been things like reputation.
IP, tell me a score.
Give me a file hash, call VirusTotal.
Passive DNS, passive SSL.
In the threat intel world, those are the things that have typically been considered enrichments.
And I think we can take that a lot further.
And these are just a couple examples of ones that I've written based on interactions that I've had with our customers.
Neither of these are traditional, really, in the sense that one, the DNS twist is give me a domain.
We're going to take all the variations of bit squatting, typo squatting, all of that, see if they're alive, see if it's been registered and then if it's alive.
The Defender ATP one is that use case where, if I'm looking at an investigation, say, and an email comes in, I've got a URL that somebody clicked on.
I really don't know if it's malicious.
I'm doing an investigation.
And it turns out it is malicious, and I just want to find out who else in the organization has actually called out to that.
So it's a combination of investigation, threat research, all these different things that can be enabled inside of the platform, and do it in a very rich way.
So for Anomali specifically, one of the challenges has been that there's not always a good way to display this content.
Things like passive DNS are easy because you're given an IP or domain and you're getting back an IP or domain.
Those are things that a TIP was meant to handle, very easy to do.
Some other data is not so easy to do, at least historically.
And that's things like visualization, temporal data.
So example of that would be things like open ports, which is one of the enrichments that we have that tells you which ports have been open on an IP address over time.
And usually that type of data is kind of hard to represent in a platform.
It's hard to represent, and actually, you can't inside of kind of a Maltego-like graph.
Couple things here.
I'm big on not integrating things that just don't make sense.
People always say fewer clicks is good.
A lot of people I know are really, really good at hot keys and doing Control C, Control V, and tabbing around.
And if you're good at that, that's great.
Let's keep it.
I think what I'm trying to identify, and help you guys identify as I go through this, is what makes a good candidate for an enrichment.
And then one step further would be if you have a good candidate, is this something that you want to develop in-house?
Is it something you want to push Anomali to develop?
Is it something you want to push our partners to develop?
So the framework is open to everybody, customers, partners, of course Anomali, to do this development.
So you don't have to wait on one person to do this if you're comfortable doing that.
So a couple of things on a general approach here.
So a lot of times we, as, I think, analysts, see something.
We do it every day, and we think, gosh, this would be really easy if you could just bring it into the platform, bring it into the Explorer graph so I can right click on it.
The value, a lot of times, is pretty clear, but the cost isn't.
A lot of these enrichment services are totally free until you need the API key.
Things like that.
There's all sorts of things I'll go into as far as licensing.
A lot of times we get asked, hey, how come you haven't released the URL scan enrichment?
Or how come you haven't released this enrichment?
And a lot of that also has to do with things like licensing.
So things that are free for you to use through a website oftentimes require a lot of negotiation for us to expose to our customers, even if they remain totally free.
That's a really, really important point.
A couple of other things.
Won't go through this in too much detail.
Not everything has to be external.
So enrichments we typically think about as calling out to an API.
Here's something I'm looking at, a domain or an IP.
Tell me more about it.
DNS twist is an example of one that's totally self-contained.
That is just using algorithms to produce variations of a domain, and then we can pivot further to see if any of those have been registered.
So start to think about things like that that are candidates.
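To make the self-contained idea concrete, here is a minimal sketch of generating domain variations locally, with no external API call. It is not the DNS twist code itself; the function name `typo_variants` and the three variation rules chosen (omission, transposition, single-bit flips) are assumptions for illustration, and a real tool covers many more cases.

```python
import string


def typo_variants(domain):
    """Generate simple look-alike variants of a domain's first label.

    Illustrative only: covers character omission, adjacent transposition,
    and single-bit flips (bitsquatting).
    """
    name, _, tld = domain.partition(".")
    variants = set()

    # Omission: drop one character at a time.
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])

    # Transposition: swap each pair of adjacent characters.
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Bitsquatting: flip one bit of one character, keeping only results
    # that are still valid hostname characters.
    allowed = set(string.ascii_lowercase + string.digits + "-")
    for i, ch in enumerate(name):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            if flipped in allowed:
                variants.add(name[:i] + flipped + name[i + 1:])

    # Remove the original name and any empty label.
    variants.discard(name)
    variants.discard("")
    return sorted(v + "." + tld for v in variants)


if __name__ == "__main__":
    for candidate in typo_variants("example.com")[:10]:
        print(candidate)
```

Each variant could then be handed to a separate step that checks whether the domain is registered and alive, which is the pivot described above.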
It doesn't have to be your traditional hey, VirusTotal exposes an API.
I'm going to query out to them to tell me how many positive scans have been on this file hash.
Specific to the Anomali platform, there are some architectural considerations.
If you're in the cloud, cloud to cloud is easy.
So if a service is exposed by the cloud, great.
If you're on prem, it actually opens up more avenues because you can execute those in your own environment.
So those systems that might respond to an API call or offer more data that are listening internally can actually be queried with an enrichment from an on prem system.
Talk a little bit more about the architecture in a second.
Some of it's pretty obvious, but it's worth noting.
So when I'm thinking about what candidates make for good enrichments, to actually spend the time to develop them, the business ones are ones that at least I don't think about too much.
I mentioned a few of these earlier.
The obvious one is just intuitively-- and I think a lot of us know this-- is it worth the time to do it?
Or am I really, even though it's three clicks, just spending-- it's just not worth it?
API keys cost money.
The security and compliance side is important as well.
There's privacy considerations.
When you start to integrate this into a platform, it becomes a whole lot easier for more people to click buttons that send information to other places.
When you're doing it deliberately by going to the website, you're consciously making that choice to do it.
So you have to think through some of those things.
Then the licensing as well.
So commercial use is a big one.
A lot of these things are open source licenses.
MIT is one of the least restrictive, but they have other types of licenses there.
These are the things that we as a company, if we are going to release an enrichment platform wide, have to be very aware of.
And that's, like I said before, a lot of the reasons why sometimes you'll see some things in there.
Sometimes you'll work in tools that give you that context, and you'll ask why can't we do this?
A lot of times it has to do with licensing and commercial use restrictions for us to expose it to multiple customers.
And then just generally on the support.
Who's going to support this thing if it breaks?
If it's a free service, generally it's going to be Anomali for sure from a support standpoint if we release it as fully supported.
If you build it on your own, that's obviously something that you got to think about who is actually going to support the thing.
On the technical side, again, one that's pretty obvious but often overlooked when you at least first come into a service that you think would be a good candidate-- where is it located?
If we are living in a SaaS environment, we almost universally cannot talk to something that lives on prem.
That's just how it is.
That's how our architecture works today.
That's how most TIPs out there will work today.
If you're on prem, you can talk to on prem, you can talk to cloud, so a little bit more flexibility there.
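The reachability rule just described can be written down as a tiny check. This is only a sketch of the general rule as stated here; the function name and the two location labels are assumptions, and real deployments can have exceptions.

```python
def can_reach(platform_location, service_location):
    """Return True if a platform can call a service, per the rule of thumb.

    Both arguments are either "cloud" or "on_prem" (hypothetical labels).
    """
    if platform_location == "on_prem":
        # An on prem platform can talk to on prem and to cloud services.
        return True
    # A SaaS platform can generally only reach cloud-exposed services.
    return service_location == "cloud"
```

So a SaaS-hosted enrichment pointed at an internal ticketing system would fail this check, while the same enrichment deployed on prem would pass.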
Of course, the data mapping.
We're TIPs, so we generally like to get back information that's compatible with how we store it.
So if a service is returning an IP, a domain, and a URL, a file hash, that's great.
That'll map straight in there.
They always return additional context.
I'll show you examples of that as well.
The most generic thing that most TIPs are going to handle are tags.
So at the very least, if you have returning information, you can slap a tag and associate that with an element you're working on.
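As a rough illustration of that data mapping, the sketch below sorts returned values into well-known observable types and falls back to a tag for anything else. The type names, the regular expressions, and the function `map_value` are assumptions for the example, not how any particular platform does it.

```python
import ipaddress
import re

# Hex string lengths for common file hashes (MD5, SHA-1, SHA-256).
HASH_LENGTHS = {32, 40, 64}


def map_value(value):
    """Map a returned string to (type, value), defaulting to a tag."""
    value = value.strip()

    # IP address (v4 or v6).
    try:
        ipaddress.ip_address(value)
        return ("ip", value)
    except ValueError:
        pass

    # URL with an http or https scheme.
    if re.match(r"^https?://", value, re.IGNORECASE):
        return ("url", value)

    # File hash: all hex digits, with a known length.
    if re.fullmatch(r"[0-9a-fA-F]+", value) and len(value) in HASH_LENGTHS:
        return ("hash", value.lower())

    # Domain: dot-separated labels ending in an alphabetic TLD.
    if re.fullmatch(r"(?:[a-z0-9-]+\.)+[a-z]{2,}", value, re.IGNORECASE):
        return ("domain", value.lower())

    # Anything else becomes a tag on the element being worked on.
    return ("tag", value)
```

The fallback is the important part: whatever the service returns, there is always somewhere to put it.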
In general, though, these should be pretty low volume.
You shouldn't have something that's going to return to you a page of results.
Passive DNS is kind of the exception.
Sometimes you can have thousands of those.
But for ones that I've seen that our customers and our partners are developing, simple is better.
You don't want some service that's going to return to you 1,000 words or 1,000 different things just based on one call.
And then, of course, on the complexity side, I won't go too much into code because I know people don't really like looking at code too much.
I don't either.
But it's a simple framework for getting it into the platform.
And then obviously, you get back to the questions of is it worth spending the time to develop this thing?
So those are the two considerations that I kind of go to when I'm looking at a new source or a customer comes to me in terms of just the business and the technical considerations and the overall value.
So I'm going to go through a really specific example.
I will say that the example I'm going to go through is one that's provided by a customer.
I'm happy to share the information after if you want to ask me so I can connect you with that customer.
But the things, again, that I kind of went through from a technical standpoint was those two business and technical things.
So once I identified that it was a suitable candidate, which is way, way, way more important and way harder than actually developing it, you start to get into those technical considerations.
And those are things here like, all right, does it have an API?
Again, it seems obvious, but everybody these days will say they have an API.
Some, sadly, still use some old SOAP-based APIs, which are incredibly difficult to work with.
And that, for the most part, is something that I would just not do because it's going to change.
Nobody really uses it except for a few folks.
Of course, the data input, output, the alignment.
And then that note I made a mention a second ago on the quantity and the quality of the data.
And then, of course, just the ability to test this environment.
We at Anomali do give customers, partners, anybody access to a test environment.
So if for one reason or another you don't want to test this out in your own system, we can do that in our cloud, test cloud, that is.
So again, getting specific into the Anomali things.
And this is considerations more-- again, it doesn't mean that you're going to be developing this.
But if you're thinking about these things and you happen to know Python, that's good because that's what our framework, our SDK is written in, is Python.
Our examples are going to be in Python.
I think that's in line, I would say, with most threat intel data providers, enrichment providers.
Python's kind of the language of choice these days.
And the way that it works is that whatever it is you're looking at in the platform, that becomes the object we want to enrich.
It could be that domain.
It could be that IP address.
Very soon it's going to be things like a registry key.
Again, in Anomali world, the itypes, basically any itype, you'll be able to right-click on, view the details of, and this is what will make that query.
So there's two parts to this, though, when you do make a query out to that source, or it could be a self-contained code.
It's always required to send back something.
If there's nothing, you can make kind of a message that just says nothing was found.
But when you do send back something, it does have to have one of these types, which is IP domain, the general high level types, or a phrase.
Phrase is just that generic catch-all that says I'm dealing with something that's not of that well-known type.
And then these get passed to the SDK for display.
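The two rules just mentioned (always send something back, and every item needs a known type or falls into the phrase catch-all) might look roughly like this. The set of known types and the dictionary shape are assumptions; the real SDK objects are not shown here.

```python
# Hypothetical list of well-known types; the real list may differ.
KNOWN_TYPES = {"ip", "domain", "url", "hash", "email"}


def build_response(raw_results):
    """Turn (type, value) pairs into a response that is never empty."""
    entities = []
    for kind, value in raw_results:
        # Anything that is not a well-known type becomes a phrase.
        entity_type = kind if kind in KNOWN_TYPES else "phrase"
        entities.append({"type": entity_type, "value": value})

    # Always return something, even when the source had no data.
    if not entities:
        entities.append({"type": "phrase", "value": "Nothing was found"})
    return entities
```

Returning an explicit "nothing found" phrase keeps the analyst from wondering whether the call failed or simply had no hits.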
A couple of things here is these things can be chosen, where you put these in the platform.
So it doesn't have to be in just the Details page or just our Explorer graph, the Maltego-like thing.
It could be one and the other.
It could be both.
The whole idea I mentioned a second ago about temporal data or non-temporal data, time series data, well-suited for looking at the details of an indicator.
Not going to work when you're graphically looking at something to return a time series set of data.
So the one example I'm going to go into in a little bit more detail is publicwww.
This just gives you a couple examples of ones that I've created in order of complexity.
And I just state this because sometimes the concept or the value that they're providing is really straightforward, and it seems like it might be easy, but it's not.
Just that last one on ServiceNow, which is, hey, if I'm looking at an indicator, tell me if I've ever seen that in any incident or ticket over in ServiceNow.
In practice in the code, extremely difficult.
The DNS twist one was pretty easy mostly because that's open source code, and it's reusable, but the code itself is a little bit complex.
And then the one I'll go into, publicwww, is 70 lines of code.
That was all it required to write this integration.
And this was the example that my customer gave.
They are kind of in-- well, kind of-- they're in the payment business, and they had a lot of concern over Magecart.
So the publicwww one is great because it was never really intended to be for security researchers.
You go to that website and it's all about marketing.
And hey, how can I find cool stuff to help improve my search results?
Lots of use cases for what you do when you find those sites out.
That was another great thing.
You can even get an API.
It's limited access, limited number of results, but you can get a free API.
So in action, what this would look like inside of Anomali-- and again, this would apply to any TIP.
This would be really easy to do.
They all have extendable capabilities.
Doesn't really matter which one.
It queried out to the service, and it returned any website that contained a reference to that code.
So that was literally an include statement in the HTML of a web page, and they index it.
Just like Google, they're indexing I don't know how many millions and millions of sites to find this information.
And then we have the additional context, which is they have their ranking system.
Again, think of this like kind of like Google.
If there is multiple results, it would just say that 8,000, or 86,000.
Just says this is the 86,000th most popular website that they've indexed.
So it's pretty low.
Probably not a ton of users go into it.
So let me just actually show you an example here of that in action inside of the platform.
This is what it looks like.
Source code search engine.
Nothing to do with security.
So I hope that when you're thinking about this, these are the types of things that pop up.
And that's why I love this one so much.
And if I just kind of carry on that example here just to show you what it looks like inside of the website, it's doing this.
It's going to say, hey, we've got whatever, one hit, two hits.
What you'll see is there's actually a lot more hits in here if I had a paid key.
So this became a really good candidate for integration, also, from the data that's returned.
It's returning URLs that are malicious, and that's great.
That is really good stuff when you're doing investigations to uncover additional malicious infrastructure that is not otherwise being reported by feeds, which is a lot of times the case with these sort of odd compromised websites that are mom and pop shops that never get fixed.
That's what this thing returns.
We could have one.
We could have had-- This whole thing, though, is just available via a very simple API.
Well, actually I'm going to show you how we deploy the code, too, because it's not deployed.
This might look a little different than some of your ThreatStream instances.
Just to how we deploy these things, these are bundled up pieces of code, all well-documented.
If you register for the SDK, I can give you information on the end on that.
You just bundle the code up in a zipped up or tarred up file.
And I will show you that right now.
So here is mine.
This is not something we expose to our customers in the cloud just because we need to test it for purposes of making sure it's not going to contain some bad code or infinite loops or otherwise cause the availability of the SaaS environment to take a hit.
But right here is what happens here.
And when I was setting this up, I gave things like icons and instructions inside of the code for how this thing would appear.
And the setup is quite simple here.
Take my API key.
You guys can use this.
Really doesn't matter to me.
This is a free API key.
And then all I got to do in here is pop in my key.
Moving too quick.
So here we go, and activate it, and that's it.
So that's how we do this with our partners as well.
As I'm doing that, one thing I do want to point out-- and this is a test instance, but our partners are doing this development as well.
So we involve our partners sometimes, and we kind of both do it.
Sometimes they have in-house developers to do it for them.
What you notice in our platform recently is that a lot of these have been enhanced.
So if you happen to have DomainTools or Farsight today, or Umbrella, or any number of ones, those are using the exact same SDK that we use that you can use as customers to develop this thing.
So it's very extensible to do that.
It also speaks to the point if you do have a partner that you want to see integrated in here, we can help you to engage the partner to get that thing written for you.
Now that I've got that thing available, I'm going to go back to my Explorer and show you what that would look like.
And if you don't know our Explorer graph, this is kind of nice because this is also embedded in our investigations.
This is where you can pop in any URL you want.
It doesn't have to be known to the system.
So you can run these types of enrichments on data that you're just interested in exploring.
I could pop in Google in there and run enrichments on that thing if I wanted to.
But now that I have this guy and I've activated that enrichment, you'll see it here.
I don't have a really popular name that I called with it.
But this thing is making that API call out.
It returns the results.
In this case, it's just one.
But when I do that hover over, we'll have those additional fields.
So you can get pretty good even in the graph view as far as the additional data that it might return when you're hovering over this.
If I was looking at the details of this, it would just be in a table, a fixed table view.
And when you click the button to say hey, tell me what publicwww knows about this thing, it would just have a similar-looking list view that's just a lot more in a static, non-clickable page.
So that's it in action.
I'm going to bounce back here to the code a little bit just to explain how I did that.
There's a couple things that can control this.
And this is the expanded code bundle.
So this is exactly what I just deployed.
It has a couple things.
One is this metadata file.
The metadata file-- I can show an example in a second-- is what describes how it's going to interact with the system, the fields available, what enrichments I want to make it available on, because this one was focused on URLs.
It doesn't work with IPs, so you can't right click on an IP and query out to publicwww.
You can control where you want this thing to appear, what types of observables you want to expose it on.
Of course, you can add pretty things like logos.
And then the actual 70 lines of code is in that source folder.
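For anyone who wants to picture the bundling step mentioned earlier, here is a minimal sketch that packs a bundle directory into a gzipped tar file with Python's standard `tarfile` module. The directory layout and file names used in the usage note (a metadata file plus a source folder) are hypothetical; the SDK documentation defines the real layout.

```python
import tarfile
from pathlib import Path


def bundle(bundle_dir, archive_path):
    """Pack a bundle directory into a .tar.gz and list what went in."""
    bundle_dir = Path(bundle_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps archive paths relative to the bundle's own folder
        # instead of embedding the full path on the build machine.
        tar.add(bundle_dir, arcname=bundle_dir.name)

    # Re-open the archive and return its member names as a sanity check.
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Calling `bundle("publicwww_enrichment", "publicwww_enrichment.tar.gz")` on a folder holding a metadata file and a `src` folder would return the archived paths, which is a quick way to confirm nothing was left out before uploading.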
So back to the metadata file, just really quickly, how easy this thing was to collect credentials.
Sometimes these things require more than just an API key.
It's a user and an API key.
Sometimes it's three things.
It's like a tenant ID, an API key, a user, whatever it is.
You can collect whatever you want by specifying it inside of the metadata field.
That's just then exposed through the UI to be collected in reference to the time it's executed.
There are the transforms that I was just talking about, which is the how does this thing appear?
So this one I got a little lazy and just called it URL to domain.
In reality in production, you would probably just call this code search or something like that on publicwww so you can control that type of display.
And then you also control where it's going to be available.
So we call this pivoting and enrichment.
Enrichment would be when you're looking at the details of a given indicator on the kind of static page, pivoting is the Maltego-like graph.
The Maltego-like graph appears in two spots in our platform, in Investigations and Explore.
And the details, again, would just be in the details of a particular observable.
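To summarize what the metadata controls, here is a hedged sketch of a metadata structure and two small checks against it. Every field name below is an assumption made for illustration; the real metadata file format comes with the SDK and may look quite different.

```python
# Hypothetical metadata layout; the real SDK's field names may differ.
METADATA = {
    "name": "publicwww code search",
    "credentials": [
        # Each entry becomes a field collected through the UI.
        {"name": "api_key", "label": "API key", "required": True},
    ],
    "observable_types": ["url"],          # not offered on IPs
    "views": ["enrichment", "pivoting"],  # details page and graph
}


def missing_credentials(metadata, supplied):
    """List required credential fields the user has not filled in."""
    return [
        field["name"]
        for field in metadata["credentials"]
        if field.get("required") and not supplied.get(field["name"])
    ]


def is_offered(metadata, observable_type, view):
    """True if the enrichment should appear for this type in this view."""
    return (
        observable_type in metadata["observable_types"]
        and view in metadata["views"]
    )
```

A service that needs a tenant ID, a user, and an API key would simply list three entries under the credentials, and the same two checks would still apply.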
Any questions on any of that, because this is kind of getting a little bit into the weeds.
Want to make sure if you have them here, you don't lose it in five minutes as I bore you with a few more bits of code.
And just to repeat, the question was if you develop your own app does it need to be released with a specific license?
The answer is no.
There's a couple of things here.
One is you as a customer are often entitled to usage of a particular service on your own, deploying it privately into your own organization, that we might not deploy system wide.
We do have customers that have written enrichment-- we call them private enrichments-- deployed to our public cloud.
But they are the only organization that can use those.
And that oftentimes means it doesn't really matter what the license is.
If we're going system wide, we just have to reference the license.
And we actually have in the metadata-- I don't believe I have it specified here-- one of our optional elements in that, in addition to things like version, is the license and the license terms, a link to the license terms as well.
Question was does anything that anybody develops, I guess, that ends up in our cloud become the property of Anomali?
I don't know.
There is an SDK agreement.
So you do need to register for the SDK.
And as part of that, there is an agreement.
I will take that one as a follow-up and let you know, but it would be outlined in the agreement when you register.
The code itself.
If you know Python, this should look really familiar.
If you don't, no big deal.
The biggest thing I wanted to make here was that this one was simple.
So the actual code that required you to do this inside of our SDK was probably 20 lines.
And the really important part here is what you see there kind of in that box, which is the adding the entity and adding additional fields.
You can do whatever you want inside of your code to make that call out.
You've got that URL.
Now let's do something with it.
When you get it back, we're going to add that back into the UI by calling these two simple functions.
We're going to add the entity back, which, in this case, happened to be a URL.
And if we had additional context about that URL, we could add as many fields as we wanted.
So it might be tough to see, but these last three lines in the box are those same additional fields that were displayed when I hovered over it to bring back the things like the result number, the rank, and then the direct link if you wanted to go back into publicwww.
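Since the slide may be hard to read, here is a rough stand-in for the pattern: add the entity, then attach additional fields to it. The `ResultBuilder` class and its method names are hypothetical and are not the actual SDK; the input `hits` is assumed to be a list of already-parsed results from the search service.

```python
class ResultBuilder:
    """Hypothetical stand-in for an SDK response object (not the real SDK)."""

    def __init__(self):
        self.entities = []

    def add_entity(self, entity_type, value):
        """Add one result entity and return it so fields can be attached."""
        entity = {"type": entity_type, "value": value, "fields": {}}
        self.entities.append(entity)
        return entity

    @staticmethod
    def add_field(entity, name, value):
        """Attach one additional context field to an entity."""
        entity["fields"][name] = value


def enrich_url(hits):
    """Convert parsed search hits into entities with hover-over fields."""
    builder = ResultBuilder()
    for position, hit in enumerate(hits, start=1):
        entity = builder.add_entity("url", hit["url"])
        # The same three fields described above: result number, rank, link.
        builder.add_field(entity, "Result number", position)
        builder.add_field(entity, "Rank", hit["rank"])
        builder.add_field(entity, "Link", hit["link"])
    return builder.entities
```

Everything before these calls (building the request, sending it, parsing the response) is ordinary Python and can be written however you like.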
So that's the framework.
These are largely, even for our partner developed ones, they're going to look a lot like this.
Some of the complexity that I've seen in a code will come when you start to get fancy with how you display things.
And that's great because partners love doing that.
They love-- if I'm RiskIQ or I'm Umbrella or I'm DomainTools, I want to make sure that the content that I'm displaying inside the platform looks as good as it possibly can.
And that's where we start to get into these things like these widgets, so fancy donut charts and hover text and URLs.
I think one of the really interesting things is that generally, as a TIP, we don't make any link hot.
When you're using the SDK, you can, because oftentimes there are results that come back, and you want to reference an external link to make it easy just to go right into the details from the source.
You can do that really easily here.
So this is kind of one step further as far as if you're looking at details of an observable and you want to display it in a graphical way, there's a lot more options now to do that.
That's what's allowed us, inside of ThreatStream, to start to look at larger volumes of data, time series data.
That open ports example that I used uses the same framework and displays it in a chart.
We use some open source charting libraries that we're compliant with.
So these are just some examples, but there's all sorts of different ways you can display the same information.
A couple of learnings here just to wrap it up.
Again, more on the technical side than anything, you need to know Python if you're going to write this.
And it's not one of those SDKs that's just magic and all of a sudden you don't have to do anything.
For us, this is a new interface.
It's not like there was something before to do this, and so suddenly it's easier because we have an SDK.
This is brand new, so you do need to know Python at some level.
Some of it, that publicwww, it's not complex.
And I'm not a coder, by the way, either.
Intermediate skills are fine.
But you'll quickly realize for some of these complex ones, it might not be something you want to take on.
We can help.
Partners can help.
It is on an older version of Python.
We will eventually get to a newer version of Python.
This will be, obviously, a really deep technical detail.
There are some services that simply will not work using older Python libraries.
So I had somebody come up with a really, really good integration.
I think it was with Akamai for a right click to make some query out to their reputation service.
They have a very complex authentication scheme required for their API.
And the only code that they issued to help with that is written in Python 3.
So it's kind of important.
The note on the SaaS code being QA'd is important.
And then just, again, emphasizing if you are on prem in one of our versions that's come out in the last six months or so, think about other things you might want to integrate with that only listen on prem.
It's a whole universe of options out there.
So with that, any other questions, things I can go back to?
So recipes today are pretty limited.
If you're working inside of an investigation-- we got a bunch of tests up here.
Let me see what we got.
I'm going to go into production.
Today we have a couple phishing recipes.
And phishing recipes is usually what the use case is focused on.
So if you haven't used investigations yet, this is something that we redid, again, probably about six months ago to make it a little bit more usable.
One of the use cases for these, in addition to just manually creating things, is phishing.
So if we have URLs and we have domains, we're going to go through a series of steps to identify hey, are these known to be suspected phishing URLs, do all sorts of other things.
So if you're working inside of one of these guys, and you happen to have a domain or a URL or whatever it might be, you'll see recipes as a right click option.
And then this here, in turn, if I did this, would execute that series of steps in the top right.
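Conceptually, a recipe is just an ordered series of steps run against one observable. The sketch below shows that idea only; the step functions, the sample lookup set, and the keyword list are all made up for illustration and say nothing about how the platform's recipes are implemented.

```python
# Hypothetical local data standing in for a real phishing lookup.
KNOWN_PHISHING = {"login-example.test"}


def check_known_phishing(domain):
    """Step 1: is the domain already known as suspected phishing?"""
    return domain in KNOWN_PHISHING


def check_suspicious_keywords(domain):
    """Step 2: does the domain contain common lure keywords?"""
    return any(word in domain for word in ("login", "verify", "secure"))


# A recipe is an ordered list of (step name, step function) pairs.
PHISHING_RECIPE = [
    ("known phishing", check_known_phishing),
    ("suspicious keywords", check_suspicious_keywords),
]


def run_recipe(observable, steps):
    """Run every step in order and collect (name, result) pairs."""
    return [(name, step(observable)) for name, step in steps]
```

In this picture, each step could be one of the existing enrichments, and the recipe is what strings them together behind a single right click.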
So this is definitely the direction we're going for a lot more things.
Like I said, we haven't exposed this to our customers so that they can create their own recipes yet, but that's what we're doing.
You'll see a lot of other things come out.
We're going to have a compromised credentials recipe, I think, in a couple months.
Berto, the gentleman that was in the video for [? lens ?] yesterday, if you saw him working at the computer, he is a threat intel analyst, luckily.
And he's the one that's actually looking at the recipes and saying hey, I did this for 20 years.
This is what's going to work.
So he's driving a lot of that.
And again, on the customer side, for those of you that are customers, if you have ideas that are recipes for existing enrichments that we already have, let us know.
We're only as good and know as much as we're told.
And we want to make sure, if one of you guys is using it, that we can get everybody to use it.
Yeah, so the question is, is there a directory that lists all the enrichments that are available, maybe what customers have developed?
Generally in the SaaS environment, or any environment, that directory will be this Integrations tab.
And anybody that's an organization administrator will have this Integrations tab available, and this will list all of the enrichments that have been released company wide.
If you need to know, hey, has anybody developed a private enrichment for a customer, that would just kind of be a one off conversation with customer success.
But I can kind of compile that.
And for anything that's, I'd say, a well-known vendor, we can get that to our customer success team and just make sure that they're at least socializing that hey, somebody else has done this so you're not rewriting code.
We would definitely know internally, if you said, hey, we want to start working on some code, we would know right away, hey, no, don't rewrite that, somebody has already done it.
We'll do a more in-depth review and just release it platform wide.
So this one, though, by the way, this is great, too, because there's a lot of new things.
We've started doing these weekly or monthly product updates that list new things.
There's a Help page in here that also has what's new in the platform.
But it's still kind of hard to keep up on what's new and what's been released. If you do come here, you might notice that there are new integrations that have been released.
A lot of these are free.
Coming into that page, take a look at the ones that are free.
You might just want to enable them and look at them.
This is also an interesting place now with our freemium programs where you can, say, sign up for DomainTools or Farsight with a limited or a capped amount of API calls per month, do things like passive DNS.
Otherwise, I know a lot of users that I talk to, this is tough to get to if you don't know where you're looking.
The question was if, as an individual user, I enable one of these things, is that setting organization wide for all users?
Answer is yes.
One setting per organization and one API key per organization.
So you will share the same API key.
That question also before, if there are services that you use that have API rate limits, you just need to be aware that you're using that same single key for every analyst that might be in the platform.
So one thing that wasn't in here was, yeah, you might have to pay for an API key, but you also have to think through is this going to increase the volume that you're entitled to, and is it realistic, because now every analyst might be clicking the button 10 times a day when before, they were going to the website, not using the API at all.
And these are expensive.
A lot of times you use them free through the thing, and you find out how much the API key costs, and you say, oops, well, I'm not going to enable that one.
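A quick back-of-the-envelope check helps with that shared-key question. The sketch below multiplies analysts by clicks per day by working days and compares the total to a monthly quota; the default of 22 working days and any numbers plugged in are assumptions, not figures from a real license.

```python
def monthly_calls(analysts, clicks_per_day, working_days=22):
    """Estimate API calls per month when all analysts share one key."""
    return analysts * clicks_per_day * working_days


def fits_quota(analysts, clicks_per_day, monthly_quota, working_days=22):
    """True if the estimated monthly volume stays within the quota."""
    return monthly_calls(analysts, clicks_per_day, working_days) <= monthly_quota
```

For example, 15 analysts clicking 10 times a day over 22 working days comes to 3,300 calls, so a key capped at 1,000 calls per month would not be enough.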
I mentioned on prem opens up a whole lot of possibilities that aren't available in the cloud.
And the reason why is just architecturally, if we are living in our cloud environment, we only have access to things that we can query.
And we don't offer private DNS services down to your environment.
Customers aren't going to open up, in general, their external environment for API calls coming from the ThreatStream cloud.
So architecturally it just doesn't work, whereas if you're on prem, in most cases where you're not in a super restrictive environment, you can make those API calls freely because you're living on the same network as your other security tools.
So you might have an on prem ticketing system.
It'd be pretty easy or trivial to call that ticketing system because you're on the same network.
If you're in the cloud, you can't get to it because it would require poking holes in the firewall, and the security team is not going to do it.