A Beginner’s Guide to Tracking Malware Infrastructure

February 9, 2024

Building queries for malware infrastructure can be a valuable step in the security lifecycle. Unfortunately, few resources explain how to get started or which indicators can be used to build queries. Today we aim to fill this gap by demonstrating approachable, high-value methods that can be used to hunt for malware infrastructure.

What Is Query Building For Malware Infrastructure?

Query building is the process of observing suspicious or known malicious infrastructure and creating queries to identify the configuration pattern that the creator of the infrastructure has used. Since threat actors often re-use the same or similar configuration across multiple deployed servers, there is often a pattern that can be used to identify multiple servers from a single initial indicator.

A well-built query allows an analyst to identify additional servers related to the actor’s infrastructure. The analyst can then proactively block, investigate or perform any additional actions needed to limit compromise and gather intelligence.

Why Build Queries On Malware Infrastructure?

Building queries on malware infrastructure can be a highly efficient means of obtaining IOCs for blocking and hunting.

The traditional way of enumerating malware infrastructure involves obtaining a large set of unique malware samples and extracting individual IOCs from each file.

This can be a highly tedious and technical process requiring a dedicated reverse engineer to deconstruct a sample, develop and test a Yara hunting rule, acquire new samples, and then develop and apply a configuration extractor to obtain individual IOCs.

This reverse engineering capability involves a significant amount of technical know-how which most teams outsource to threat intelligence feeds. Outsourcing to threat intelligence feeds can be effective and there are good paid feeds available, but they are often expensive and can vary significantly in quality and timeliness.

Benefits of Building Infrastructure Queries

By developing your own infrastructure queries for hunting, you can build a far larger list of malware IOCs from far fewer malware samples, with less technical expertise and at lower overall cost. You can also leverage queries to expand on alerting from your own environment, allowing you to establish a list of IOCs related to known malware impacting your organisation.

Using the techniques shown in this post, you can potentially identify dozens of current malware IOCs and infrastructure with only a single available sample or alert.

What Are The Indicators That We Can Use?

A single malicious IP address contains a great deal of information that can be used to identify additional servers. This is due to unique patterns related to the software and configuration deployed by an actor.

Since threat actors often re-use the same software and configurations across multiple instances of malicious infrastructure, a single pattern can be used to identify other servers.

Some of the most common indicators that threat actors will re-use are:

Certificate Information – Fields inside of TLS and SSL certificates. Hardcoded values are often re-used.
Server Headers – Actors deploying custom software may forget to change default headers that contain indicators.
Data in HTTP Responses – Custom software often contains unique values in HTTP responses.
Location, ASN and Hosting Providers – Actors re-using hosting providers for infrastructure. Similar servers may be hosted at the same ASN.
JA3 Hashes – Actors deploying uncommon software configurations can be fingerprinted by JA3 signatures.
Port Configurations – Actors will often leave the same ports open across infrastructure.
Regular Expressions – Actors may deploy unique values with highly similar structure that can be captured with Regular Expressions.

Now that we’ve covered the key concepts, let’s dive in with some examples.

Hunting Infrastructure with TLS Certificates

Threat actors and malware developers utilise TLS certificates to encrypt communications and establish connections between a target host and malicious infrastructure.

For many reasons, actors rarely deploy unique certificates for each deployed sample. This results in values within a single TLS certificate being present on numerous other servers, which introduces simple patterns that can be signatured and queried.

Example 1: Hunting AsyncRAT with TLS Certificates

The AsyncRAT malware family contains a hardcoded TLS certificate left by the developer. This certificate has a hardcoded subject common name value of AsyncRAT Server.

Take for example the IP address 91.109.176[.]4. Querying this IP in Censys Search confirms a subject common name (CN) value of AsyncRAT Server on port 8808.

Tracking AsyncRat with TLS certificates

By expanding the host information and locating the exact field where the AsyncRAT Server value is stored (services.tls.certificates.leaf_data.subject_dn), we can build a query to locate additional servers.

In this case, either the subject_dn or issuer_dn field can be used as they both contain the same hardcoded value.

AsyncRAT certificate leaf

By searching for AsyncRAT Server in either of these fields, we can locate an additional 110 servers with the same certificate value.
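As a rough illustration of the query described above, the sketch below assembles a search string that looks for the hardcoded value in either certificate field. The field names come from this post; the `field: "value"` form and the `or` operator are assumptions about the Censys Search syntax, and `any_field_query` is a hypothetical helper, not part of any Censys tooling.

```python
# Field names are taken from the article. The query syntax used here
# (field: "value", clauses joined with "or") is an assumption and should
# be checked against the current Censys Search documentation.
CERT_FIELDS = (
    "services.tls.certificates.leaf_data.subject_dn",
    "services.tls.certificates.leaf_data.issuer_dn",
)


def any_field_query(fields, value):
    """Build a query that matches `value` in any of the given fields.

    Hypothetical helper: no escaping is done, so `value` should not
    contain double quotes.
    """
    return " or ".join(f'{field}: "{value}"' for field in fields)


asyncrat_query = any_field_query(CERT_FIELDS, "AsyncRAT Server")
print(asyncrat_query)
```

The resulting string can be pasted into the search bar. Keeping the fields in a tuple makes it easy to reuse the same helper for other hardcoded certificate values.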

AsyncRat servers in Censys Search view

Example 2: Hunting Cobalt Strike with TLS Certificates

The infamous Cobalt Strike toolkit can also be tracked using TLS Certificate values.

This is primarily due to a default subject common name of Major Cobalt Strike.

Take for example the IP address 23.98.137[.]196 with the following certificate on port 50050.

Cobalt Strike with TLS Certificates Censys Search view

There are multiple hardcoded values here that can be utilised, but for the sake of simplicity we will leverage the issuer common name of Major Cobalt Strike.

We can expand the detailed host view again to determine the exact field name: services.tls.certificates.leaf_data.issuer.common_name.

Querying this value returns 236 results. The chance of legitimate software containing “Major Cobalt Strike” is very low, so these are likely all active Cobalt Strike deployments.

Major Cobalt Strike Leaf Data Censys Search View

Hunting Infrastructure with HTTP Response Titles

Developers of malware control servers often leave unique and identifying strings in web page data.

Most commonly these can be found in the HTML Titles and HTTP Bodies.

It’s useful to note that these values can be changed, but many actors do not go to this effort and leave the identifying strings intact.

Example 1: Mythic C2 Framework

The Mythic C2 framework is often utilised by threat actors and contains a default HTML Title of Mythic.

Looking at the IP 89.223.66[.]195, we can confirm the Mythic string present in the HTML title.

Mythic C2 Censys Search View

Querying for the Mythic string inside of services.http.response.html_title, we can locate a total of 75 servers.

Many of these servers have already been marked as C2 servers by the Censys platform. (You can locate other C2s with the query labels:C2)

Mythic C2 Servers Censys Search view

Hunting Infrastructure with Service Banners

Threat actors and malware developers often leave identifying strings inside of service banners. These are often left intentionally by the author to limit misuse of the software.

These can be identified, queried, and tracked using similar methods to the HTML Titles.

Note that service banners may not be displayed by default, and you may need to open the detailed host view to see them.

Example 1: Havoc C2 Framework

Take for example the Havoc C2 framework. Havoc is an open source C2 framework developed by C5pider that has been leveraged by threat actors due to its high quality implementation of modern offensive techniques.

With default settings, the Havoc Team Server contains the X-Havoc string inside of the service banner. This has been left intentionally by the author to limit misuse of the software.

Havoc C2 framework Censys Search view

Searching for services.banner with the X-Havoc string returns a total of 71 results.

A string like this is specific and unlikely to produce false positives, so there is a strong chance that these are all active deployments of Havoc.

X Havoc services banner Censys Search view

Example 2: Hunting DarkComet with Service Banners

A second example of hunting with service banners can be seen with DarkComet malware.

Looking at the IP address 187.135.84[.]89, we can observe a unique-looking service banner of BF7CAB464EFB on port 2000.

DarkComet Censys Search view

We can search services.banner with the value BF7CAB464EFB to identify a total of 25 servers.

Dark Comet C2 Hosts Censys Search view

Summary – Hunting Infrastructure with Service Banners

Threat actors often work in a hurry and avoid changing default strings in custom or open source software. When investigating a host, be sure to check all service banners for unique or interesting strings.

These default strings are great indicators for building queries and you should absolutely use them to your advantage.

If you’d like to see for yourself, here is a prebuilt query for both Havoc and DarkComet.
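The two banner searches can also be kept side by side in a small lookup table. The following is only a sketch under assumptions: the services.banner field and both strings come from the examples above, while the `field: "value"` syntax, the `or` operator and the helper names are illustrative and are not the prebuilt query referenced in this post.

```python
# Banner strings observed in the two examples above. The query syntax is an
# assumption; verify it against the Censys Search documentation.
BANNER_FIELD = "services.banner"

BANNER_STRINGS = {
    "Havoc": "X-Havoc",
    "DarkComet": "BF7CAB464EFB",
}


def banner_clause(value):
    """Return a single clause matching `value` in the service banner."""
    return f'{BANNER_FIELD}: "{value}"'


# One query per family, useful when the results should stay separated.
per_family = {name: banner_clause(s) for name, s in BANNER_STRINGS.items()}

# One combined query that returns hosts matching either banner string.
combined = " or ".join(per_family.values())

for name, query in per_family.items():
    print(f"{name}: {query}")
print(f"combined: {combined}")
```

Keeping one entry per family makes it simple to add further banner strings as they are discovered during an investigation.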

Hunting with Locations and ASN Providers

Threat actors often re-use the same hosting providers when deploying multiple servers for malicious purposes.

Often this is done to avoid takedowns and investigations. Other times it is done because hosting providers in an actor’s home country are easier to obtain.

Regardless of the reasons, actors often re-use providers and we can use this to our advantage to locate malicious servers and to fine-tune existing queries on other indicators.

Example 1: Amadey Bot Servers

Let’s look at the IP address 185.215.113[.]68. This IP address has a relatively unique HTTP response body of none.

Amadey Bot Servers Censys Search view

Homing in with the query services.http.response.body="none", we have an initial result of 84 servers. Many of these servers are located in the United States and do not appear to be malicious in nature.

Amadey Bot Servers Response Body None Censys Search View

Returning to the initial results on the host page for 185.215.113[.]68, we can see that the server is hosted in Moscow with an Autonomous System Number of 51381.

We can add this number as an additional filter to narrow our query results.

Amadey Bot ASN search

By adding autonomous_system.asn=51381 to our search, we have now limited our search to only 4 results. Querying these results in VirusTotal shows that they are all related to Amadey Bot malware.
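The narrowing step above amounts to joining the original clause and the ASN filter with a boolean AND. A minimal sketch follows, reusing the two clauses exactly as written in this post; the lowercase `and` operator, the parentheses and the `narrow` helper are assumptions made for illustration.

```python
def narrow(base_query, *filters):
    """AND extra filters onto a base query, wrapping each part in parentheses.

    Hypothetical helper. The parentheses keep each clause intact if one of
    them already contains an "or".
    """
    parts = [base_query, *filters]
    return " and ".join(f"({part})" for part in parts)


base = 'services.http.response.body="none"'
asn_filter = "autonomous_system.asn=51381"

print(narrow(base, asn_filter))
# (services.http.response.body="none") and (autonomous_system.asn=51381)
```

The same helper accepts any number of additional filters, so further indicators can be layered on if the result set is still too broad.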

Summary – Hunting with Locations and Hosting Providers

Threat actors often utilise the same hosting providers when deploying infrastructure. The hosting provider is not necessarily an indicator in itself, but it can be combined with other indicators to produce a highly effective query.

If investigating an IOC returns too many search results, try adding the physical location of the server or the ASN to narrow your search.

You can experiment with the Amadey example using this prebuilt query.

Hunting Infrastructure with Open Directories

Threat actors will often host malware and supporting software on open directories which are exposed to the public internet. These directories are generally deployed so that malware can easily retrieve additional files to facilitate exploitation.

To locate open directories, we can search for common directory titles like Directory Listing or Index Of.

Alternatively we can search for pre-labelled servers with the open-dir tag provided by Censys.

Hunting infrastructure with open directories in Censys Search

Hunting malware infrastructure with open directories

Example 1: Hunting Open Directories with Common File Names

Searching for open directories alone can return hundreds of thousands of results, many of which are benign.

To identify malicious cases, we can combine the search with a specific file name related to suspicious software.

Take for example nc.exe which is a common file name for the netcat tool.

Common names open directories Censys Search

Investigating one of the first returned addresses of 123.57.56[.]129, we can observe an open directory containing nc.exe as well as references to Hacktools and a .bat script with foreign characters. This information is enough to assume that the IP is highly suspicious.

With these indicators identified, we can attempt to retrieve the files to perform additional analysis and confirm maliciousness, or we can use the new file names to refine the query and identify additional servers.

Hunting infrastructure with common file names

For example we can leverage the newly identified string of hacktools to create a new query.

We can do this with labels:open-dir and response bodies containing hacktools.

In this case only a single result is returned, but this demonstrates the concept of using initial results to establish new queries. This query could easily be re-run at a later date to identify new instances of the suspicious server.

Open directory hack tools

Example 2: Open Directories Containing Procdump.exe

To demonstrate the concept further, we can search for open directories containing references to the process dumping tool procdump.exe.

Process dumping tool open directories

One of the results is a server hosting procdump.exe, beacon.exe and shell.exe.

We can use these results to identify new strings and pivot to open directories containing beacon.exe.

This identifies a new server with IP 62.204.41[.]104. This server contains references to beacon.exe but not the initial netcat or procdump, highlighting the usefulness of building new queries based on initial results.

Summary – Hunting Open Directories

Open directory hunting can be a useful means to hunt for suspicious servers. This is particularly useful when dealing with Downloader malware that has called out to a server with an open directory.

When building queries, use the pre-built open-dir tag and leverage known file names. Then add new file names and strings based on your results.

We’ve included some prebuilt queries here for beacon.exe, procdump.exe and nc.exe.
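The pivoting workflow from this section can be sketched as a small query builder: start with one known file name, then rebuild the query as new names turn up in the results. The open-dir label, the file names and the services.http.response.body field come from this post; the operators and the `open_dir_query` helper are assumptions and are not the prebuilt queries referenced above.

```python
BODY_FIELD = "services.http.response.body"


def open_dir_query(file_names):
    """Build a query for open directories whose body mentions any file name.

    Hypothetical helper; the "labels:open-dir and (...)" layout is an
    assumption about how the clauses would be combined.
    """
    names = " or ".join(f'{BODY_FIELD}: "{name}"' for name in file_names)
    return f"labels:open-dir and ({names})"


# Initial hunt with a single known file name.
known_files = ["nc.exe"]
print(open_dir_query(known_files))

# Pivot: add file names discovered while reviewing the first results.
known_files += ["procdump.exe", "beacon.exe"]
print(open_dir_query(known_files))
```

Saving the list of file names alongside the query makes it straightforward to re-run the hunt later with the same indicators.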

Incorporating Regular Expressions Into Hunting Queries

Advanced threat actors will often avoid using hardcoded values across multiple servers.

When unique values are deployed, this is often done via scripting and automated programs. This means that even though the values are unique, the “structure” of the values is often repeated and can be signatured using regular expressions.

Let’s take a look at some examples of unique values that can be signatured with regular expressions. Note that Regular Expression searching is a paid feature of Censys Search.

Example 1: Catching Qakbot Servers With Regular Expressions

A great example of unique values with the same structure is with Qakbot.

Qakbot uses an automated system to deploy and refresh unique TLS certificate values across servers. But these values have a similar structure which can be queried with regular expressions.

We can see two such certificates in the below screenshots.

Catching Qakbot servers with Regex

Observing the two certificate values, we can see that they are wildly different in their values. However, they follow a similar pattern which can be caught with regular expressions.

We can verify this by copying out the values and bringing them into CyberChef. This allows us to prototype a regular expression and confirm that we can match on both values.

Cyberchef screenshot

Repeating this process for both the issuer and subject fields of the TLS certificate, we can develop a query that catches a total of 64 servers.

To verify that the results are matching as intended, we can generate a Censys report on the returned results. This allows us to list all returned certificate values in an easy-to-read list.

Below is a short snippet of the results, confirming that the query is working as intended.

Catching BianLian Servers with Regular Expressions

Example 2: Catching BianLian Servers with Regular Expressions

The BianLian malware is another example of unique TLS certificate values containing identical structure and formatting.

Below we can observe two certificates. Both the subject and issuer contain only the “C”, “O” and “OU” fields. (This contrasts with a “typical” certificate, which would also contain the “ST” and “L” fields.)

Each value contains exactly 16 characters with no spacing or use of special characters.

Bringing the two issuer values to CyberChef, we can prototype a regular expression that captures the values from both certificates.
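The same prototyping can be done locally. The sketch below encodes the structure described above (only C, O and OU, each exactly 16 characters with no spaces or special characters) as a Python regular expression. The sample values are invented for illustration, and both the alphanumeric character class and the "C=..., O=..., OU=..." layout are assumptions about how the distinguished name is rendered; this is not the expression used in the post.

```python
import re

# Assumed layout of the rendered issuer DN: "C=<16>, O=<16>, OU=<16>".
# [A-Za-z0-9] is an assumption based on "no spacing or special characters".
BIANLIAN_DN = re.compile(
    r"C=[A-Za-z0-9]{16}, O=[A-Za-z0-9]{16}, OU=[A-Za-z0-9]{16}"
)

# Invented test values: one with the described structure and two without.
samples = {
    "C=AbCdEfGh12345678, O=ZyXwVuTs87654321, OU=QwErTyUi13572468": True,
    "C=US, ST=California, L=San Francisco, O=Example Inc, CN=example.com": False,
    "C=AbCdEfGh1234567, O=ZyXwVuTs87654321, OU=QwErTyUi13572468": False,
}

results = {dn: bool(BIANLIAN_DN.fullmatch(dn)) for dn in samples}

for dn, matched in results.items():
    print(f"{'match' if matched else 'no match':9} {dn}")
```

Using `fullmatch` rather than `search` ensures that certificates with extra fields, such as ST or L, are rejected instead of partially matched.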

BianLian with Regular Expressions in CyberChef

We can now search on services.tls.certificate.parsed.issuer_dn with our regular expression. This returns a total of 24 servers.

BianLian Regex Certificates

We can generate another search report on the services.tls.certificate.parsed.issuer_dn field to confirm that the results are matching as intended.

Example 3: Viper Servers with TLS Certificates and Regular Expressions

To demonstrate one more example, we can take a look at an IP of 139.155.90[.]81 which was marked as Viper Malware on ThreatFox.

We can view the TLS certificate information in Censys, which shows subject and issuer fields that are exactly 8 characters in length and contain only lowercase letters and numbers.

Viper TLS Handshake

Bringing one of the values into CyberChef, we can again prototype a regular expression to match on the identified structure.

We can then search for this regular expression in the services.tls.certificates.leaf_data.issuer_dn field. This returns a total of 1593 results.

Viper TLS Certificates Leaf Data

Generating another search report verifies that many of these results contain the same TLS structure as the initial server.
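With a result set this large, it can help to check exported values programmatically. The sketch below assumes the issuer values from a report have been copied into a list of strings. It then separates the values that follow the 8-character lowercase alphanumeric structure from those that do not. The sample values are invented, and the "K=v, K=v" layout and helper names are assumptions.

```python
import re

# Each attribute value is expected to be exactly 8 lowercase letters/digits.
VALUE_PATTERN = re.compile(r"[a-z0-9]{8}")


def dn_values(dn):
    """Return the attribute values of a 'K=v, K=v' style DN string.

    Simplification: values that themselves contain ", " are not handled.
    """
    return [part.split("=", 1)[1] for part in dn.split(", ") if "=" in part]


def matches_structure(dn):
    """True if the DN has at least one value and all values fit the pattern."""
    values = dn_values(dn)
    return bool(values) and all(VALUE_PATTERN.fullmatch(v) for v in values)


# Invented values standing in for a copied report column.
report = [
    "C=ab12cd34, O=zx98yw76, CN=q1w2e3r4",
    "CN=example.com, O=Example Ltd",
    "CN=a1b2c3d4",
]

hits = [dn for dn in report if matches_structure(dn)]
misses = [dn for dn in report if not matches_structure(dn)]

print(f"{len(hits)} of {len(report)} values match the expected structure")
for dn in misses:
    print(f"review manually: {dn}")
```

Listing the misses separately keeps the likely false positives visible, which matters when a short pattern such as this one matches many hosts.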

Same TLS structure as initial server

Summary – Hunting Infrastructure with Regular Expressions

There will be cases where hardcoded values won’t be enough to hunt infrastructure.

Many of these situations can be handled by identifying the structure of seemingly unique values and incorporating regular expressions into your queries.

If you’re unfamiliar with regular expressions, there is a great free resource over at regexone that will help you get started.

How Can I Get Started?

All of the queries (excluding regular expressions) can be performed with a Censys Search Community account. You can sign up today and begin threat hunting, gathering intelligence, and building up lists of IOCs.

To obtain initial IOCs, we recommend using public IOC repositories like ThreatFox and URLHaus and starting your hunt from there. We can also recommend leveraging pre-built queries like those shared by drb_ra and Michael Koczwara.
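IOCs from public repositories are often shared in defanged form, as with the addresses in this post (for example 91.109.176[.]4). A minimal sketch for turning such a value back into a searchable address is shown below. The `ip: <address>` query form is an assumption about Censys Search syntax, and both helper names are hypothetical.

```python
import ipaddress


def refang(indicator):
    """Undo the common "[.]" defanging and trim surrounding whitespace."""
    return indicator.strip().replace("[.]", ".")


def host_query(defanged_ip):
    """Return a host lookup query for a defanged IP address.

    ipaddress.ip_address raises ValueError if the refanged value is not a
    valid IP, which catches copy and paste mistakes early.
    """
    ip = ipaddress.ip_address(refang(defanged_ip))
    return f"ip: {ip}"


print(host_query("91.109.176[.]4"))
print(host_query("185.215.113[.]68 "))
```

Validating the address before building the query avoids running a search on a malformed indicator.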

Conclusion

We’ve now looked at several indicators that can be used to identify malicious infrastructure. You can and should use all of these to your advantage when investigating an IP address or performing a threat hunt.

Threat actors vary in quality and sophistication, and some will be more difficult to track than others. In many cases, however, you can track actors using only the techniques shown here today.

About the Author

Matthew

Embee Research

Matthew (aka @embee_research) is a security researcher based out of Melbourne, Australia. Matthew has a passion for all things malware, burritos and creating educational cyber content.
