There is a rise in the number of malware ecosystems that use legitimate internet services as part of a command-and-control (C2) schema, adding yet another layer of abstraction between attackers and network defenders. There is currently little public research on use of legitimate services for C2. The purpose of this paper is to detail the technique, note its rise in prevalence, and suggest experimental methods to detect it.
Backdoor Command-and-Control via “Legitimate Services”
Defenders rely on their ability to identify attacks early, providing enough time for a thoughtful and thorough incident response. Attackers, on the other hand, are constantly evolving their toolsets to subvert existing detection techniques and technologies. One of the most notable trends in the evolution of malware is the rise of command-and-control (C2) channels using so-called “legitimate services,” or simply “legit services.” In this context, and for the purposes of this report, “legit services C2” refers to malware abusing common internet services such as Twitter and GitHub, employing fake users and accounts, and otherwise utilizing such service APIs as part of a C2 schema. In this report we expand on this concept, identify challenges and potential detection techniques, and provide an appendix of examples of what this network traffic may look like.
We would like to state at the beginning of this paper that in describing legitimate services C2 we will be specifying internet service providers that are being abused by malicious actors. We by no means wish to imply that these providers are negligent in policing abuse, nor do we wish to suggest that these services are unsafe for use by the layperson. Each service provider listed is working unremittingly to identify and stop abuse whenever possible.
Before we detail the variety of legitimate services and malware using them, let’s first address how backdoors typically connect and communicate.
Backdoor-Controller Relationship and Nomenclature
For the purposes of this discussion, we will bypass the common lingo for client-server relationships and instead deviate into specific malware vernacular.
In the greater information technology space, the “client” is usually a program/user on a host/workstation (inside your network), and, in that relationship, the client sends numerous requests to the server (outside your network). The server then provides resources back to the client system. This is the standard client-server relationship. Imagine, if you will, the size and the directionality of the network data in the standard client-server relationship. For internet browsing, a client system may send 10 packets (here meaning packets that carry data payloads, not just bare TCP packets) to a server for every 100 packets it receives, making the internal:external sent packets (or data size) a 1:10 ratio. This relationship (and ratio) is reversed in the world of malware.
A malware backdoor is implanted on a compromised host, yet instead of being a “client” and connecting to a “server” to ask for resources, the backdoor itself is the server. Let that sink in for a second. What is typically seen as a “server” and what is usually (inaccurately) referred to as a “C2 server” is often just an attacker-controlled system with a console or “controller” program. The attacker uses the command prompt or controller to connect to the backdoor “implant”, to query and command the implanted/compromised system. Imagine the volume of output if the attacker runs each of the commands “systeminfo,” “ipconfig /all,” and “tasklist.” For this internal reconnaissance effort, the compromised host may send 10 packets for every 1 packet it receives from the attacker controller. What should normally be a 1:10 ratio is reversed, and is now 10:1.
Figure 1. Illustration depicting the difference between a client-server and malware-controller relationship.
Thinking of the compromised system as a host and thinking of the C2 system as a server is a gross mischaracterization of the relationship. When you consider the directionality, the size, and the ratio of data sent from the remote (attacker-controlled) system to internal (implanted) system, you realize that the client-server relationship has indeed been inverted. Sure, we’re mincing words, but the reversal of this standard relationship is important. The distinction in how you think about the flow of network traffic data is helpful as you invent creative ways to detect otherwise invisible backdoors.
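To make the ratio argument concrete, the following is a minimal sketch of how one might flag flows whose direction is inverted. The `Flow` record, the field names, and the thresholds (`min_ratio`, `min_bytes_sent`) are assumptions made for illustration; they are not taken from any particular product, and the IP addresses are documentation placeholders.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """One summarized connection, as seen from the internal host (hypothetical record)."""
    internal_ip: str
    external_ip: str
    bytes_sent: int       # internal -> external
    bytes_received: int   # external -> internal


def sent_to_received_ratio(flow: Flow) -> float:
    """Return bytes sent per byte received; higher means more server-like behavior."""
    # Guard against division by zero for one-way flows.
    return flow.bytes_sent / max(flow.bytes_received, 1)


def inverted_flows(flows, min_ratio=5.0, min_bytes_sent=10_000):
    """Yield flows where the internal host sends far more than it receives."""
    for flow in flows:
        if flow.bytes_sent >= min_bytes_sent and sent_to_received_ratio(flow) >= min_ratio:
            yield flow


flows = [
    # Browsing-like flow: roughly 1:10 sent to received.
    Flow("10.0.0.5", "198.51.100.7", bytes_sent=10_000, bytes_received=100_000),
    # Implant-like flow: roughly 10:1 sent to received.
    Flow("10.0.0.9", "203.0.113.20", bytes_sent=100_000, bytes_received=10_000),
]

suspects = list(inverted_flows(flows))
for flow in suspects:
    print(flow.internal_ip, "->", flow.external_ip, sent_to_received_ratio(flow))
```

Legitimate uploads (backups, file sharing) will also trip a check like this, so the thresholds would need tuning against the resident network before the output is worth anyone's attention.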
Accordingly, rather than saying client/host and C2/server in this article, we will refer to the malware backdoor as an “implant” and the remote attacker system as the “C2 system” or “C2 controller.” Generally we use the terms “backdoor” and “implant” synonymously; however, “backdoor” is used more broadly to describe the entire class of malware with backdoor functionality, and “implant” is used more to specify an instance of a backdoor when installed on a compromised system.
Furthermore, we will refer to the “C2 schema” as the totality of IP addresses, domains, legitimate services, and all the remote systems that are part of the implant’s communications architecture.
When we say simply “C2” we mean command and control.
Two Types of Backdoors
Backdoors are differentiated from other types of malware because they can be interactively controlled from a remote location by a human operator. We break backdoors into two broad categories based on how they receive initial communique: active backdoors and passive backdoors.
Most backdoors are classified as active, meaning that the onus is on the implant to call out to its designated C2 system and tell the world that it is online and ready for instruction. That typically works like this:
- Upon execution, the implant sends data to a preconfigured C2 address (domain, IP, or URL) on some regular interval (such as every 60 seconds or every 5 minutes). We have different definitions for what this is and how it works based on the malware family, but this functionality is sometimes part of a “check-in,” a “heartbeat,” a “keep alive,” or a “beacon.”
- Controller, if up, may conduct a “handshake” to verify that the implant is authorized to check-in. The C2 controller interprets some basic metadata sent in the initial traffic.
- Implant and controller maintain a connection. When attacker/controller is ready, it will send commands, instructions, or other information to the implant.
- Backdoor implant parses the controller information, executes as necessary, and responds as necessary with results.
- This usually requires both the implant and server to be “online” and successfully connected in some way.
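The SOGU backdoor, for example, is considered an “active backdoor” because, upon execution and successful resolution/connection to its C2 address, the backdoor will send a “beacon” or “hello” packet.
Passive backdoors, by contrast, wait for the controller to make first contact. That typically works like this: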
- Upon execution, implant sets up a network listener on a pre-configured port.
- Controller sends “magic packet” or password to implant on defined port when it has a command to be run. The controller does not talk to the implant unless necessary.
- The implant parses the command, follows instructions, and responds if necessary with response data.
- This requires the implant to be online and accessible from the Internet (externally addressable). The implant does not need a preconfigured C2 address. The C2 controller system can be anywhere and does not need to be online. C2 traffic to a passive backdoor implies an active human operator.
If webshells are considered backdoors, then they are passive backdoors. For example, the ASPXSPY webshell (sample on GitHub) makes no outbound communications from the compromised system unless it first receives instructions from an external source.
It is worth noting that passive backdoors often:
- Require implantation on publicly addressable compromised systems (IP or domain)
- Require passwords or “magic” values for access
- Use only low-level APIs and custom binary protocols
- Do not use DDR or legit services in any way
Though it is outside of the scope of this discussion, we would like to note that searching for passive backdoors is an onerous task that begins with stacking network listeners and ends with sorting through a mountain of unsolicited inbound communications to the defended network. Other than hunting for webshells, few organizations have the resources or the tenacity to search for passive backdoors because the juice is not always worth the squeeze.
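As a rough illustration of what “stacking network listeners” can mean in practice, the sketch below counts how many hosts expose each (process, port) listener and surfaces the rare ones. The input format, the host names, and the process names are hypothetical; real listener data would come from whatever endpoint tooling is available.

```python
from collections import Counter


def stack_listeners(listeners):
    """Count how many distinct hosts run each (process, port) listener."""
    # De-duplicate per host so one noisy machine cannot inflate a count.
    seen = {(host, process.lower(), port) for host, process, port in listeners}
    return Counter((process, port) for _, process, port in seen)


def rare_listeners(listeners, max_hosts=1):
    """Return listeners found on at most `max_hosts` hosts, rarest first."""
    counts = stack_listeners(listeners)
    rare = [(key, n) for key, n in counts.items() if n <= max_hosts]
    return sorted(rare, key=lambda item: (item[1], item[0]))


# Hypothetical listener inventory: (host, process, listening port).
observed = [
    ("ws01", "svchost.exe", 135),
    ("ws02", "svchost.exe", 135),
    ("ws03", "svchost.exe", 135),
    ("ws03", "msupdate.exe", 8443),
]

for (process, port), host_count in rare_listeners(observed):
    print(f"{process} listening on {port} seen on {host_count} host(s)")
```

A listener that appears on one machine out of thousands is not necessarily malicious, but it is a far shorter list to review than every open port in the enterprise.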
The distinction between passive and active systems is important in terms of how we think about what we’re looking for and the types of network traffic we would expect for these systems. Active backdoors almost always beacon, and they typically employ a multitude of communications protocols. They also utilize different APIs and different C2 schemas. Passive backdoors do not typically use legit services for C2, often use low-level custom binary protocols, and usually simply lie in wait for someone to come knocking on the correct door with the correct password.
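Because active backdoors tend to beacon on a regular interval, one simple way to look for them is to measure how evenly spaced a host's connections to a given destination are. The sketch below uses the coefficient of variation of the gaps between connections; the function names and the `max_jitter` threshold are assumptions chosen for illustration, and implants that randomize their sleep time would need a looser threshold or a different approach.

```python
from statistics import mean, pstdev


def interval_jitter(timestamps):
    """Return the coefficient of variation of the gaps between connections.

    Returns None when there are too few events to say anything useful.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 3:
        return None
    average = mean(gaps)
    if average == 0:
        return None
    return pstdev(gaps) / average


def looks_like_beacon(timestamps, max_jitter=0.1):
    """Flag connection series whose spacing is suspiciously regular."""
    jitter = interval_jitter(timestamps)
    return jitter is not None and jitter <= max_jitter


# Connection times in seconds since the first observation (made-up data).
beacon_times = [0, 60, 121, 180, 241, 300]   # roughly every 60 seconds
human_times = [0, 5, 9, 200, 210, 900]       # bursty, irregular browsing

print(looks_like_beacon(beacon_times), looks_like_beacon(human_times))
```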
High-level vs Low-level APIs
It is also important to consider the types of APIs that backdoors use when communicating. Windows has dozens of network APIs across all layers of the OSI model. For the purposes of this discussion we will classify APIs into two groups, high-level and low-level.
Most types of backdoors use high-level APIs. When doing static analysis of a backdoor sample you may see imports of wininet.dll or urlmon.dll and imported functions such as HttpOpenRequest, URLDownloadToFile, and FtpCommand.
While there is little flexibility to modify or add to the communications protocols involved, these libraries make networking simple and efficient because there are lots of built-in functionalities. For example, if an HTTP connection is established, certain HTTP functions will apply system defaults in areas where some headers aren’t specified. These libraries may also allow for automatic proxy checking and easy proxy authentication.
When malware authors are feeling frisky and require more flexibility, they may use low-level APIs such as Winsock, with libraries such as ws2_32.dll and functions such as socket, recv, and bind.
Low-level APIs have their benefits and drawbacks. Once a socket is created, every bit of the protocol is manually created. One might implement binary TCP or HTTP over a socket, however it requires additional laborious programming to make that happen. The flexibility in designing one’s own C2 protocol introduces opportunities for errors, and makes things like proxy identification a pain.
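The high-level versus low-level split can be applied during static triage by bucketing a sample's imported libraries. The sketch below assumes the import list has already been extracted by some other tool; the two DLL sets contain only the libraries named in this section and are nowhere near exhaustive, and the sample names are made up.

```python
# Libraries named in the text; a real list would be considerably longer.
HIGH_LEVEL_DLLS = {"wininet.dll", "urlmon.dll"}
LOW_LEVEL_DLLS = {"ws2_32.dll"}


def classify_network_imports(imported_dlls):
    """Bucket a sample's imported DLL names into high-level and low-level network APIs."""
    names = {name.lower() for name in imported_dlls}
    return {
        "high_level": sorted(names & HIGH_LEVEL_DLLS),
        "low_level": sorted(names & LOW_LEVEL_DLLS),
    }


# Hypothetical import lists, as a PE parsing tool might report them.
sample_a = ["KERNEL32.dll", "WININET.dll", "ADVAPI32.dll"]
sample_b = ["KERNEL32.dll", "WS2_32.dll", "urlmon.dll"]

print(classify_network_imports(sample_a))
print(classify_network_imports(sample_b))
```

Samples that resolve their imports at runtime will not show up in a static import table, so an empty result here says little on its own.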
Introduction to Legitimate Services C2
Organizations and vendors across the world tend to coin their own names for the same things. With respect to the topic at hand, some organizations will call this “alternative C2,” “bravo channel C2,” or any number of other terms. The dominant nomenclature for this technique is “legitimate services C2.” Legit services C2 comes in many forms, but can be broken into two categories: Legit Services C2 “Dead Drop Resolving” (DDR) and Legit Services “Full C2.”
Legit Services C2 DDR
Dead drop resolving is a technique where a backdoor is configured to look at a web resource to extract its true C2 address. This typically works in the following manner:
- Implant executes and reaches out to a web URL and scrapes HTML text
- Implant looks for specific tags/markers/delimiters in the text and extracts an encoded value
- Implant decodes encoded value, which contains commands and further config information such as a true C2 domain or IP address
- Implant initiates connection to true C2 address and awaits further instructions
Figure 2. Legit Services C2 DDR
The name “dead drop resolving” is an allusion to traditional intelligence tradecraft, where one agent may secretly cache valuable information for another in a seemingly public place, allowing a security buffer between the dropping party and the receiving party.
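For an analyst who has captured a suspected dead drop page, the extraction steps described above can be reproduced offline to recover the hidden value. The sketch below is an assumption-laden illustration: the markers `@@BEGIN@@` and `@@END@@` and the use of Base64 are hypothetical stand-ins, since each malware family uses its own delimiters and encoding, and the decoded address is a documentation placeholder.

```python
import base64
import re

# Hypothetical delimiters; real families use their own markers and encodings.
START_MARKER = "@@BEGIN@@"
END_MARKER = "@@END@@"


def extract_dead_drop(html, start=START_MARKER, end=END_MARKER):
    """Return the decoded value hidden between two markers, or None if absent or invalid."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    match = re.search(pattern, html, flags=re.DOTALL)
    if match is None:
        return None
    try:
        # Base64 is only an assumption here; swap in the family's real decoder.
        return base64.b64decode(match.group(1).strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None


# Build a fake profile page carrying an encoded placeholder address.
encoded = base64.b64encode(b"203.0.113.20:443").decode("ascii")
sample_page = f"<html><p>About me: {START_MARKER}{encoded}{END_MARKER}</p></html>"

print(extract_dead_drop(sample_page))
```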
Legit Services Full C2
Full C2 using legitimate services means that the backdoor and the controller system never talk directly, but rather they communicate through a middle party, using a legitimate service such as GitHub or Twitter to pass communications back and forth. This typically works in the following manner:
- Implant executes and uses hard-coded credentials to connect to a legit service.
- Implant scrapes account page or uses API to search for recent “comments” or “posts” or “updates” and looks for special encoded text within these portions.
- Implant decodes encoded value, which contains commands and other information. If the attacker is not ready for interactive access, the information will often contain configuration updates, sleep commands, or instructions on when the implant should call back to the legit service for updates.
- Controller connects to legit service and uses hard-coded credentials to apply updates and enter encoded values into account page or other data storage areas (attacker may also do this manually in some small capacity).
- Controller (or attacker) monitors the service pages and changes data to add new commands.
Figure 3. Legit Services Full C2
Using legit services for full C2 requires more effort on the part of the malware authors, who must create backdoors and controllers with credentials necessary to use the service APIs.
Further Considerations for Use of Legit Services C2 DDR
Malware code families that use legit services for C2 DDR will typically use high-level APIs and HTTP compliant requests for initial communications, and then switch to low-level APIs and more non-traditional (custom) C2 protocols once the backdoor has acquired its true C2 address.
The implications of this typicality are twofold. Foremost, the initial C2 DDR communications in plaintext HTTP may represent a better (and potentially the only) network detection opportunity for these backdoors, especially if they switch to custom encrypted traffic post DDR. If you’re wondering “...but I thought the legit services would be encrypted?” you’d be correct. However, some of the backdoors using legit services for C2 DDR are relying on the encryption of the service provider, and using regular HTTP APIs without making HTTPS mandatory for communication. Believe it or not, sometimes the SSL handshake will fail, and in that case the connection would revert to HTTP and thus make HTTP-based methodology detections possible based on the initial GET requests.
In addition to the network detection aspects, there may be ways to identify legit services backdoor binaries based on the variety and volume of runtime imports.
Furthermore, backdoors that use legit services for C2 DDR are more likely to:
- Be classified as “active” and perform “beaconing” to pre-configured web resources
- Use high-level APIs such as HTTP to perform DDR
- Use low-level APIs and custom binary protocols for post-DDR C2
Backdoors that use legit services for Full C2 are more likely to:
- Use high-level Windows APIs and HTTP protocols to scrape legit service accounts and pages
- Contain hard-coded account credentials for legit services
Notable Legit Services Being Abused
The number of legitimate services used for C2 is nigh unlimited. The following list is a sample of high profile and otherwise prolific services that have been observed in backdoor C2 schemas in the last few years.
Figure 4. A variety of legitimate services seen abused for C2
Notable Malware Families Using Legit Services C2
There are likely hundreds of distinct malware code families that have used (or are currently using) legitimate services for C2.
The table below describes a few notable malware code families that have been observed using legitimate services for C2. This is but a fraction of the total code families and legit services used for C2. In terms of the observed legit services used, these examples may indicate attacker preferences (in configuring samples) rather than capabilities of the backdoors or functionalities of the legit services themselves.
Please note that for aspects of C2, many malware families are fully customizable and possess the ability to communicate in multiple ways. This is especially true for code families that are actively developed. Accordingly, the following details are not expected to be permanently comprehensive.
Table 1. Notable malware code families observed using legit services C2
|Code Family||AKAs||Use Type||Services Used||Notes|
|SOGU||Kaba, Gulpix, PlugX, Thoper, Destory||DDR Only||Microsoft Answers, Microsoft Technet, Google Code, Pastebin, GitHub||Used by a multitude of China-based APT actors|
|BLACKCOFFEE|| ||DDR Only||Microsoft Technet||APT17|
|WHISTLETIP|| ||DDR Only||Microsoft Social|| |
|BARLAIY||POISONPLUG||DDR Only||Microsoft Answers, Microsoft Technet, Pastebin, GitHub|| |
|BELLHOP||ggldr||Full C2||Google Docs||Associated with CARBANAK (and possibly FIN7)|
|HAMMERTOSS||HammerDuke, NetDuke||Full C2||Twitter, GitHub||Twitter -> URL -> Git Stego Image w Cmds and Creds|
|LOWBALL|| ||Full C2||Dropbox||Purportedly associated with threat actor admin@338|
Motivations and Technology Drivers for Abusing Legit Services
Use of legitimate services for some form of C2 dates back to at least 2009. While there have been a few incidents of botnets and worms using legit services for C2, at the time of this writing, the technique is usually employed only by so-called Advanced Persistent Threat (APT) actors and state-sponsored (enabled or tolerated) threat groups. This technique may have started with traditional state-sponsored groups such as APT1, but has since branched out into many other groups and backing nations, as well as gaining traction in the international criminal underground.
There are several driving forces for the adoption of this C2 technique. First, when using legit services for C2 the malware network traffic becomes nearly impossible to identify because it mimics the behavior of legitimate network traffic. This is in part driven by the “open workplace,” bring-your-own-device and telecommuting movements.
Modern enterprises that give their employees latitude in internet usage (allowing social media and unfettered web access) are ultimately providing an auspicious cover (and opportunity) for emerging threat actor tactics. Social media and encrypted cloud services are everywhere. Many users rely on Google Docs, OneDrive, and Dropbox to get their work done, regardless of whether these services are offered by or systemically endorsed by their employer. With legit services C2, the threat actor activity is blending in with the noise — and defenders have few ways to differentiate the good from the evil.
While use of legit services C2 is on the rise, common malware such as ransomware and botnet implants rarely use this technique. Some publicly available penetration testing frameworks such as Empire and PowerSploit have been implemented to use legitimate services like Pastebin to redirect or download malicious text code; however, this is a trivial use of the service and it does not come with the benefits of other, more comprehensive legit services.
There is a barrier to entry in using legit services C2 because it may require more complicated programming for malware tools. It may also induce overhead in managing legit services accounts. This requires thoughtful organization and nuanced management of the infrastructure, which is more diverse in nature than traditional C2 schemas.
If there is a rise in the use of legitimate services C2, it must be happening for a significant reason. Attackers, like any goal-oriented people, are driven by convenience, cost, and operational security.
Advantages of Using Legit Services for C2
- It is easy to hide your C2 inside communications that are believed to be good.
  - Nobody questions network traffic to Google, Microsoft, Twitter or GitHub.
- It is easy to register new accounts on these services.
  - There is minimal vetting to create new accounts and personas for most public cloud services and social media services.
  - Authorities such as Twitter, Microsoft and GitHub claim to be cracking down on account abuse, but it is still easy to sign up for a new account and remain undetected. Service providers may be able to programmatically reduce abuse by “bot” accounts; however, those controlled by human attackers are much more difficult to tease out of the haystack.
- It is easy to get a “web page” up somewhere on the publicly accessible internet.
  - Never in our history has it been so easy to get public data up somewhere.
  - Image and text “paste” and “dump” sites make this simple.
- It is easy to usurp encryption for your C2 protocols.
  - Why set up C2 servers with encryption and build encryption into your malware if all you need to do is use a legit service and adopt its SSL certificate?
  - This is worth it for the convenience alone, but it provides the added benefit of a publicly endorsed SSL stream that makes the C2 traffic nearly undetectable.
- It is easy to adapt and transform when the situation becomes complicated.
  - You can reconfigure implants in the moment without waiting for DNS updates.
  - You can reuse implants across attacks without reusing DNS or IP addresses.
- You can reduce likelihood of burning your C2 infrastructure (better OPSEC).
  - Putting a major service provider in the middle of your C2 schema makes it difficult to detect and block your malware communications.
  - No more hard-coding malware with your IP addresses and domains. When your operation is done, you simply take down your legit services pages and nobody will ever know your IP addresses.
  - Never register a domain or SSL certificate again! Attribution for cyber threats is primarily based on registrants, domains, and IP addresses. Legit services C2 places an immense layer of anonymity between attackers and their victims.
- You can reduce overhead and increase ROI and other business metrics.
  - If switching to legit services C2 means you succeed in your attack mission more often, and also spend less time and money retooling, then it is a smart investment.
Challenges Presented to Defenders
As mentioned in the previous paragraphs, using legitimate services for C2 has many benefits that are in turn challenges for defenders.
Legit services are difficult to block — If you are a large international enterprise, you know how difficult it can be to remediate compromised systems around the world. Sometimes you may simply implement IP or DNS blocking at an egress point, or potentially just sinkhole the malware C2 address. However, with legitimate services this may become impossible. Do you have the technical capability to block a full Uniform Resource Identifier (URI)? And if so, does the service actually utilize a unique URI for the C2 landing page or rely on being “logged in” and dynamically delivering the content? Can you risk blocking parts of Google or Twitter?
Legit services are often encrypted and innately difficult to inspect (difficult to monitor/ enforce for misuse) — SSL decrypting is expensive and not always possible at enterprise scale, so the malware hides its communications inside of the encrypted traffic, making it difficult, if not impossible, to identify the evil traffic at all (unless you locate the malware on the endpoint). Even if you do identify evil via the endpoint, if you do not know the profile pages or exact location in the legit service that is being used, you may never be able to extract the encoded information or identify further C2 addresses, commands, and responses that are stored on these services pages — making the effectiveness of your incident response negligible.
Use of legit services subverts domain and certificate intelligence — Many companies buy indicator feeds for reputation filtering and indicator blacklisting, yet many of these feeds are based on newly generated and newly registered domains, certificates, and IP addresses connected thereto. Using legit services for C2 will circumvent all of this for obvious reasons.
Use of legit services complicates clustering and attribution — A huge amount of threat intelligence is based on clustering IPs, domains, email addresses, and other registrant information in order to form groups and serve as the basis for attribution. Anyone that tells you different is full of it. With a shift to legit services C2, it is possible to move away from domain registration because you can consider the legit service account as the initial C2 address. No longer will truly sophisticated attackers continue to register SSL certificates or use self-signed SSL certificates for C2 schemas, which we all know has historically played a big part in tracking and clustering threat activity. Furthermore, backdoor binaries no longer require hard-coding real C2 addresses, so even if you find a sample on an endpoint, you may never be able to trace that to an attacker IP address if it has been removed from the legit service.
Experimental Detection Methodologies
Detecting legit services C2 is fundamentally difficult, thus the methodologies we are discussing are experimental and will likely yield a number of false positives when first tested. We recommend evaluating these methodologies on a small subset of your network, then spending some time whitelisting and tuning the logic before deploying to an entire enterprise.
For detecting evil traffic to legit services, we can apply the simple thesis that browsers are smart and malware is stupid. Browsers have undergone decades of development to optimize network usage with things like caching, cookies, and session memory. Even though some network traffic is encrypted, there will be differences in how a piece of malware communicates with a legitimate service. We suggest exploring the following five experimental methodologies to detect malware using legitimate services for C2.
1. Non-browser non-app process network connections to legitimate services
Endpoint products from vendors such as Vector8, Tanium, Crowdstrike, Carbon Black and Mandiant (ask about “HIP”) may be able to inspect system data and trigger on non-browser process network connections. Using these triggers, you may be able to tease out network connections to IP ranges for things like Microsoft Technet or GitHub or Twitter, thereby identifying source processes of interest for further investigation.
For this method, we recommend creating specific rules for each legitimate service. If you’re looking for legit services C2 to GitHub, create a rule for non-browser process connections to GitHub IP space that are also non-app processes such as Git.exe and GitHub Desktop. Purely as an example, the abstract detection logic might look something like this:
Source Process NOT (firefox.exe OR chrome.exe OR iexplore.exe) AND TCP Connection To (netblock is 22.214.171.124/22 OR AS is AS36459) AND Source Process NOT (git.exe OR github.exe OR git-bash.exe OR git-cmd.exe OR git*.exe OR gitgui.exe OR githubdesktop.exe)
You could also tune this logic to look for traffic sourced from designated parts of your enterprise, such as source subnets or lists of computers that should never be talking to legitimate services. At first, this will probably generate a fair number of false positives for apps and social media programs, but after tuning this is a viable way to find lots of weird and potentially unwanted things communicating with these services.
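The abstract rule above translates fairly directly into code. The following sketch assumes endpoint telemetry that yields a process name and a remote IP per connection; the function name is hypothetical, and the netblock is a reserved documentation range standing in for the service's real address space, which you would need to look up and keep current.

```python
import fnmatch
import ipaddress

# Process name patterns taken from the example rule above.
BROWSERS = ("firefox.exe", "chrome.exe", "iexplore.exe")
EXPECTED_APPS = ("git.exe", "github.exe", "git*.exe", "githubdesktop.exe")

# Placeholder range (reserved for documentation), NOT the service's real netblock.
SERVICE_NETBLOCKS = [ipaddress.ip_network("192.0.2.0/24")]


def matches_any(name, patterns):
    """Return True when the process name matches any wildcard pattern."""
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def is_suspicious_connection(process_name, remote_ip):
    """Flag non-browser, non-app processes that talk to the service's address space."""
    name = process_name.lower()
    if matches_any(name, BROWSERS) or matches_any(name, EXPECTED_APPS):
        return False
    address = ipaddress.ip_address(remote_ip)
    return any(address in block for block in SERVICE_NETBLOCKS)


print(is_suspicious_connection("updater.exe", "192.0.2.10"))
print(is_suspicious_connection("chrome.exe", "192.0.2.10"))
```

Matching on process name alone is easy for an attacker to defeat by renaming the binary, so in practice a check like this would be paired with path or signer information where the endpoint product exposes it.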
2. Unique or low DDR page response traffic sizes from legitimate services
When pieces of malware make HTTP GET requests to profile pages or other legit services pages, they’re typically doing a dirty download of HTML to a temp file, which is then later processed for the encoded DDR text to find the real C2 address. This is fundamentally different from how a browser views a page because a browser will download and render additional linked content from the page. This opens up an opportunity to “fingerprint” near-default page sizes for legit services profiles.
As a small experiment, we simulated malware GETs of new profile pages on a few legit services to get an idea of the network flow differences.
Table 2. Experiment data depicting the differences between browsing and raw page downloads
|Profile/Page||Base Page (Raw)||Get Page HTTPS||Browsing Page|
|MS Technet 1||41kb||50kb||72kb|
|MS Technet 2||41kb||49kb||72kb|
|MS Social 1||27kb||35kb||69kb|
|MS Social 2||27kb||25kb||69kb|
Even with the TLS/SSL encryption overhead, the raw HTTPS GETs of the profile pages were significantly smaller than browsing the pages in Chrome. This suggests the possibility of identifying the abuse of legit services using fingerprint sizes to identify abnormally small network flows.
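One way to act on these measurements is to compare observed response sizes against the raw HTTPS GET sizes recorded for each service. The sketch below takes its ranges from the “Get Page HTTPS” column of Table 2; the function name and the tolerance are assumptions, and the fingerprints would need to be re-measured regularly because page sizes change as the services change.

```python
# Observed raw HTTPS GET sizes in kilobytes, from the "Get Page HTTPS" column of Table 2.
RAW_GET_RANGES_KB = {
    "MS Technet": (49, 50),
    "MS Social": (25, 35),
}


def matches_raw_get(service, response_kb, tolerance_kb=2):
    """Return True when a response size falls inside the raw-GET fingerprint range."""
    low, high = RAW_GET_RANGES_KB[service]
    return low - tolerance_kb <= response_kb <= high + tolerance_kb


# A 50 KB Technet response resembles a raw GET; a 72 KB one resembles full browsing.
print(matches_raw_get("MS Technet", 50))
print(matches_raw_get("MS Technet", 72))
```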
In the opening paragraphs we discussed the reversal of the standard client-server relationship in terms of directionality and ratios of data sent and received. We can expand further on this concept by thinking about this for legit services C2 as well. For example, if a backdoor is using GitHub for full C2, it might be searching a page or a project for encoded instructions and then providing responses to those commands in the form of comments. In this example, the backdoor would be sending a high volume of data to GitHub over a long period of time and receiving very little from the page or project itself, likely less than 1kb per command received. This might characterize the natural flow of committing or uploading to GitHub, but a time graph of active malware C2 versus a legitimate upload will show entirely different patterns. There are opportunities here for profiling or “fingerprinting” standard traffic and looking for things that deviate from the norm.
Consider the following anecdote. In one intrusion where the threat actor deployed malware using legit services C2, a keen defender noticed suspicious network flows to GitHub IP space where the average response size from GitHub was between 11000 and 15000 bytes. These sizes, mind you, are significantly smaller than the average web page size for a profile or project page, leading the incident responder to suspect non-browser traffic, and potentially something malicious. With this behavior in mind, he crafted a netflow query to search for the top flows with bytes in a designated size range. Using this netflow query, the responders identified additional machines worth investigating, ultimately locating the backdoors using the GitHub API for C2.
nfdump -R %NFDUMPFILES% -t YYYY/MM/dd.00:00:00-YYYY/MM/dd.23:59:59 "(src net 126.96.36.199/24 or src net 188.8.131.52/24) and (bytes > 9000 and bytes < 17000)" -s dstip/bytes -n 25
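For teams without nfdump at hand, the same filter-and-rank logic can be approximated over exported flow records. This is only a sketch of the idea behind the query above, not a replacement for it: the tuple layout and function name are assumptions, and the source network is a reserved documentation range rather than the service's real address space.

```python
import ipaddress
from collections import defaultdict

# Placeholder for the service's address space (documentation range, not real).
SERVICE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def top_destinations(flows, min_bytes=9000, max_bytes=17000, limit=25):
    """Rank internal destinations by bytes received in mid-sized flows from the service."""
    totals = defaultdict(int)
    for src_ip, dst_ip, byte_count in flows:
        source = ipaddress.ip_address(src_ip)
        from_service = any(source in network for network in SERVICE_NETWORKS)
        if from_service and min_bytes < byte_count < max_bytes:
            totals[dst_ip] += byte_count
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Made-up flow records: (source IP, destination IP, bytes).
flows = [
    ("192.0.2.5", "10.0.0.9", 12000),
    ("192.0.2.5", "10.0.0.9", 13000),
    ("192.0.2.6", "10.0.0.4", 11000),
    ("192.0.2.6", "10.0.0.7", 70000),     # too large: looks like normal browsing
    ("198.51.100.1", "10.0.0.8", 12000),  # right size, but not from the service
]

for dst_ip, total in top_destinations(flows):
    print(dst_ip, total)
```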
3. High certificate exchange frequencies to legitimate services
Once again, with the assumption that backdoors are stupid, we can consider that every time a backdoor calls out to its DDR page there will be an SSL handshake and certificate exchange. To illustrate the difference in cert exchange frequency, we compare on the left 20 minutes of malware GETs and on the right 20 minutes of browsing to a Microsoft Social profile. The malware does indeed prove to be stupid.
Figure 5. (left) 20 minutes of malware GETs to a Microsoft Social Profile; (right) 20 minutes of browsing to a Microsoft Social profile
One might implement detection logic for this behavior by creating Snort or Suricata rules looking for “social.microsoft.com” in an SSL certificate where the certificate is presented more than, say, 50 times per day to the same internal IP address (implying that the internal system was making an unusually high number of GETs to Microsoft Social). This is just an example, of course. This methodology of measuring certificate exchange numbers can be applied to any of the legit services as long as they are encrypted.
The following Snort rules demonstrate the logic that may be used to identify systems emanating unusually voluminous requests for SSL certificates:
alert tcp $HOME_NET any -> $EXTERNAL_NET 443 (msg:"TLS Client Hello - Microsoft Social"; content:"|00 00 00 19 00 17 00 00 14|social|2e|microsoft|2e|com"; threshold: type both, track by_src, count 30, seconds 86400; sid:13370001; rev:1;)
alert tcp $HOME_NET any -> $EXTERNAL_NET 443 (msg:"TLS Client Hello - Google Docs"; content:"docs.google.com"; ssl_state:client_hello; threshold: type both, track by_src, count 30, seconds 86400; sid:13370002; rev:1;)
This same logic can be deployed on both client-side and server-side certificate exchanges. We have specified the client side here because it represents traffic that is “closer to the malware”. Naturally, the specific logic and format of these rules would depend on the implementation of Snort, which may include certain traffic preprocessors and special configurations for IP ranges and ports. Moreover, the thresholds for alerting would need to be fine-tuned and tailored to the resident network. Still, we hope that this helps illustrate the potential for methodology detections based only on network traffic.
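For environments that log TLS handshakes rather than alerting inline, the same thresholding can be approximated offline. The following Python is a rough sketch under stated assumptions: the events are hypothetical (epoch timestamp, internal source IP, server name) tuples already extracted from whatever TLS logging is available, and the function name and default threshold of 30 per day are our own illustrative choices mirroring the count in the rules above.

```python
from collections import Counter
from datetime import datetime, timezone


def frequent_handshakes(events, watched_names, threshold=30):
    """Count TLS handshakes per (UTC day, source IP, server name) and return
    the combinations that meet or exceed the daily threshold."""
    counts = Counter()
    for ts, src_ip, server_name in events:
        name = server_name.lower()
        if name in watched_names:
            day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
            counts[(day, src_ip, name)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    base = 1_500_000_000  # arbitrary epoch second used only for illustration
    # A backdoor that beacons every 30 seconds for 20 minutes: 40 handshakes.
    beacons = [(base + 30 * i, "10.0.0.5", "social.microsoft.com") for i in range(40)]
    # A person browsing the same service: a handful of handshakes.
    browsing = [(base + 300 * i, "10.0.0.9", "social.microsoft.com") for i in range(3)]
    for key, n in frequent_handshakes(beacons + browsing, {"social.microsoft.com"}).items():
        print(key, n)
```

Bucketing by calendar day is a simplification; a sliding 24-hour window would track the Snort threshold semantics more closely.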
4. Bulk processing samples for suspicious DNS calls to legitimate services
There is an argument to be made for malware “hunting” in bulk by processing large amounts of samples in a sandbox. For example, if you purchased some type of “sandboxing” appliance that “detonates” inbound binaries for malware analysis, you could implement special rules for those evil-looking binaries that make DNS requests to legitimate services and legit services APIs. For example, if you see a sample make a call to “api-dropboxusercontent.dropbox.com,” even if the sample is not deemed “malicious,” it may merit more scrutiny before you allow such a binary into your enterprise environment. When you do find samples like this that are malicious, you can create tactical detection rules or blacklists on a per-sample, per-family or per-functionality basis. (We maintain a philosophical objection to detection by file hash, which we view as trite and ineffective, but we do acknowledge that in this case there is a small use case for hash blacklisting).
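A triage rule of this kind can be sketched in a few lines. The Python below is illustrative only: it assumes a hypothetical sandbox export that maps each sample identifier to the DNS names it queried, and the watchlist contents, function names, and report format are our assumptions rather than the interface of any particular appliance.

```python
def matches_watchlist(qname, watchlist):
    """True if qname equals a watched domain or is a subdomain of one."""
    name = qname.rstrip(".").lower()
    return any(name == domain or name.endswith("." + domain) for domain in watchlist)


def triage_samples(sandbox_reports, watchlist):
    """Map each sample to the watched names it queried, omitting samples
    with no hits so that only candidates for extra scrutiny remain."""
    flagged = {}
    for sample_id, queries in sandbox_reports.items():
        hits = sorted({q.rstrip(".").lower() for q in queries
                       if matches_watchlist(q, watchlist)})
        if hits:
            flagged[sample_id] = hits
    return flagged


if __name__ == "__main__":
    # Hypothetical watchlist of legit-service API hostnames.
    watchlist = {"api-dropboxusercontent.dropbox.com", "api.github.com"}
    # Hypothetical sandbox output: sample name -> observed DNS queries.
    reports = {
        "sample_a.exe": ["www.example.com", "API-dropboxusercontent.dropbox.com."],
        "sample_b.exe": ["www.example.org"],
    }
    print(triage_samples(reports, watchlist))
```

A hit here is a reason to look closer, not a verdict; plenty of benign software talks to the same APIs.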
5. Bulk processing samples for morphology matching
Morphology means the study of the form and structure of things and the relationships between things. In our little slice of the cyber universe, “morphology matching” means identifying similarities between files with the purpose of identifying unknown malicious files based on relationships to known-good and known-bad files. Pretty simple, right? Though this is not specific to legit services C2, we see a lot of promise in the method of bulk processing large amounts of samples using “binary similarity” or “morphology” malware technologies. This detection methodology would focus on file attributes rather than network behavior.
There are several names and buzzwords for describing this concept and how it might work in practice. The basic premise of this idea is that you take a piece of malware and perform static attribute analysis, disassembly, and dynamic analysis to extract a list of tool marks and features such as: opcodes, byte patterns, symbolic functions, system calls, code blocks, processor instructions, debug symbols, unique strings, and run-time behaviors. You combine all of these features into a special pattern (a/k/a “morphology”) that describes the attributes that are unique to this single sample and the entire family of malware. Then you compare the “morphology pattern” to a database of other patterns to help you identify or classify your unknown file as malicious or non-malicious. At the highest level it does not seem like rocket science, but engineers and data scientists have been working on this problem for decades and still haven’t nailed down a great solution.
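To make the comparison step concrete, here is a deliberately simplified Python sketch. It assumes the features have already been extracted into sets of strings, and it uses Jaccard similarity as a stand-in scoring function; this is our choice for illustration, not a description of how any commercial product works. The feature labels, family names, and the 0.5 cutoff are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(unknown_features, reference_db, min_score=0.5):
    """Return (label, score) for the closest reference pattern, or
    (None, best_score) if nothing is similar enough."""
    best_label, best_score = None, 0.0
    for label, features in reference_db.items():
        score = jaccard(unknown_features, features)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= min_score:
        return best_label, best_score
    return None, best_score


if __name__ == "__main__":
    # Hypothetical reference patterns built from previously analyzed files.
    reference_db = {
        "family_x": {"str:beacon", "api:InternetOpenA", "api:HttpSendRequestA", "op:xor-loop"},
        "benign_tool": {"api:CreateFileW", "api:ReadFile", "str:usage"},
    }
    unknown = {"str:beacon", "api:InternetOpenA", "api:HttpSendRequestA", "api:CreateFileW"}
    print(classify(unknown, reference_db))
```

Real systems weight features, tolerate packing and recompilation, and search far larger databases, which is where most of the difficulty lies.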
We believe that by bulk processing samples for morphology matching, you may be able to identify samples using legit services C2 that would otherwise not be detected by standard endpoint or network detection methodologies. At this time, there are few commercial or open source technologies that offer morphology matching with a solid reference database. We can, however, highlight two notable examples. VxClass by Zynamics (acquired by Google in 2011; unfortunately, VxClass is no longer for sale) performed disassembly of binaries and used “bioinformatics algorithms to classify malware into ‘family trees’ based on a matrix of similarity values”. More recently we can note products from Intezer, a Tel Aviv startup that describes its technology as “DNA mapping for software”. The Intezer technology “dissects any given file or binary into thousands of small fragments, and then compares them to Intezer’s Genome Database, which contains billions of code pieces (‘genes’) from legitimate and malicious software offering an unparalleled level of understanding of any potential threat.” We’ve used only the community edition, but in our opinion, Intezer has the most promising commercial offerings in the malware morphology space. We have yet to determine if deploying such technology in large scale security operations would indeed find malicious files based only on code similarity or “gene” matching.
Data Points on Rise of Legit Services C2
In March 2017, I co-presented “Middle-out Network Analysis: Finding Evil with a Low Signal-to-Noise Ratio” at BSides Canberra, Australia. In this presentation we discussed the rise of legit services C2 and detailed a longitudinal study of malware capabilities that we called “Evolution of Malware,” all of which was publicly presented under the auspices of Mandiant and FireEye.
In short, we studied ten years of malware capabilities to get a better understanding of how things like C2 protocols were changing over time. The data set comprised all the malware samples that underwent reverse engineering by Mandiant’s incident response and intelligence teams from 2006 to 2016. Analysis of these samples revealed several trends, including the rise in “marketshare” of samples that used legit services C2.
Though these trends do not necessarily reflect what’s happening in the world at large, after looking at the stats from this data set, we can indeed see that Mandiant/FireEye is seeing an increase in the amount of malware using legit services C2 each year. This research was presented publicly at BSides Canberra in early 2017. (See Appendix D for further details.)
The biggest takeaway from all of this is that, for malware involved in some of the most notable cyber attacks, the biggest data breaches, and the most world-shaking intrusions... the use of legit services for C2 is on the rise. This has several implications.
Chart A. This graph depicts the percentage of samples analyzed annually that are capable of legit services C2. From 2008 to 2011, legit services C2 samples represented approximately 3% of the annual submissions, later rising to 6% in 2014 and 9% in 2016. According to this data set, the percentage of annual samples using legit services C2 tripled in the last decade.
Chart B. This graph depicts the raw number of unique families seen “in the wild” using legit services C2. In 2006 there were 4 distinct malware families using legit services for C2, rising to 26 unique families active in 2016.
Key Points and Conclusions
- Backdoors that use legit services C2 are almost always active, and perform beaconing to legit services websites and APIs.
- Backdoors that use legit services C2 for DDR often switch from high-level Windows APIs and HTTP protocols to low-level Windows APIs and custom protocols that are very difficult, if not impossible, to detect with traditional network detection rules.
- We have observed a rise in abuse of legitimate services for attack operations, and we expect this trend to continue from both the actor group perspective and the malware code family perspective.
- We advise pentesters and red teamers to look into this technology as well, both in terms of maturing the attack infrastructure and becoming undetectable/unattributable.
- We advise defenders to explore experimental methodologies for detecting these C2 techniques.
- We recommend security researchers everywhere begin to measure, study, and develop methods to mitigate these C2 techniques.
Threat actors are in the business of conducting network intrusion operations, which are costly in resources. For long term mission success, they must use their resources wisely, adapt when necessary, and invest in strategies that allow them to scale and operate undetected for long periods of time.
If the threat actors (or if a development “quartermaster” in the supply chain) create attack tooling that is easily found by detection systems, not only would it ruin the success of any current intrusion operations, but it would also cost time and money before the mission could resume. This is why security researchers hasten to publicly unveil attack technologies, effectively forcing the attackers to retool, and ideally driving up the cost of conducting an attack.
Threat actors need to deploy malware and maintain access to their implants in victim networks. It is easier to hide and maintain access to these implants if they can be controlled through the conduit of legitimate services. We are detailing use of legitimate services C2 because threat actors are investing in this technique and we expect the use of this technique to increase in the coming years.
We hope that this paper has been interesting, useful, or at least fun to read. There are few resources on legit services C2, and, to date, longitudinal research on the general topic of malware C2 capabilities has been shaky at best. This paper is admittedly no better, but we hope it serves to illustrate the type of study necessary for defenders to do what they do best: find evil and solve crime.