Investigating Windows Systems with Dissect – IWS Chapter 2

I recently found out about the Dissect toolset by Fox-IT/NCC Group, which abstracts out a lot of the target format and filesystem to streamline accessing particular artifacts. I’m curious how easy it is to use and its limitations, since it seems very portable and easy to install. To practice, I’m using several different images from the book Investigating Windows Systems by Harlan Carvey. In a later post, I’ll use DFIR Challenge 7 from Ali Hadi.

Trying Out the Demo

First, I decided to try using the demo instance of Dissect to play around with some features before installing. My first test is one of backwards compatibility using a Windows XP image. In this case, I used WinXP2.E01:

The Dissect demo GUI.

No automatic recognition of OS and other host information yet, but I haven’t interacted with it via the shell so far. My first step was to use a couple commands in the shell to check these details:

The results of my first commands came through quickly.

This is a good sign! So continuing with the scenario, we’re interested in malware. I’ve read Harlan’s approach to investigating this image, and I’m interested in rapid-triage type approaches. In this case I’ll want to look at persistence mechanisms, including Run keys, Services, Scheduled Tasks, Startup items, KnownDLLs and anything else I have access to. Granted, I’m not expecting the kind of coverage I’d get with RegRipper on unconventional persistence techniques.

Unfortunately, here seems to be where the demo, at least in terms of rendering things in the top pane, fell flat (at least for this image). When choosing several functions from the drop-down menu nothing happened. So back to the shell I went:

Run Keys

Using the runkeys command quickly outputs a list of autostart extensibility points, a couple of which look suspicious. But the number of Run keys recovered is rather small. I noticed no RunOnce keys were present, so I took a look at the Dissect source code to see what keys were supported. I’m pretty okay with the list they have. In this case I find it suspicious that a Run key is named RPC Drivers, since generally drivers are loaded into the kernel as part of a service and you generally don’t need programs to run at login in order to do anything with them. These keys stick out especially:

<windows/registry/run hostname='REG-OIPK81M2WC8' domain=None ts=2004-06-18 23:49:49.937500+00:00 name='RPC Drivers' path='C:/WINDOWS/System32/inetsrv/rpcall.exe' key='HKEY_LOCAL_MACHINE\\Software\\Microsoft\\Windows\\CurrentVersion\\Run' regf_hive_path='sysvol/windows/system32/config/SOFTWARE' regf_key_path='$$$PROTO.HIV\\Microsoft\\Windows\\CurrentVersion\\Run' username=None user_id=None user_group=None user_home=None>
<windows/registry/run hostname='REG-OIPK81M2WC8' domain=None ts=2004-06-18 23:49:49.937500+00:00 name='RPC Drivers' path='C:/WINDOWS/System32/inetsrv/rpcall.exe' key='HKEY_CURRENT_USER\\Software\\Microsoft\\Windows\\CurrentVersion\\Run' regf_hive_path='sysvol/Documents and Settings/vmware/ntuser.dat' regf_key_path='$$$PROTO.HIV\\Software\\Microsoft\\Windows\\CurrentVersion\\Run' username='vmware' user_id='S-1-5-21-1123561945-606747145-682003330-1004' user_group=None user_home='%SystemDrive%\\Documents and Settings\\vmware'>

Another interesting piece of information we get is the username associated with the last key, vmware. This gives us an indication that this particular user might have been infected. You might also note that the timestamps for both entries are the same: 2004-06-18 23:49:49. The path to the executable rpcall.exe is also interesting, since inetsrv looks like an IIS server directory.
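The "same entry in two hives" observation is easy to automate once the records are exported. Here is a minimal sketch, assuming the Run key records have already been reduced to plain dictionaries; the hive shorthand and the third entry are made up for illustration, and none of this is Dissect's own API:

```python
from collections import defaultdict

# Entries modeled on the two records above, keeping only the fields used here.
# "hive" is a made-up shorthand for the root of the record's key field.
entries = [
    {"name": "RPC Drivers", "path": "C:/WINDOWS/System32/inetsrv/rpcall.exe",
     "hive": "HKLM", "ts": "2004-06-18 23:49:49.937500+00:00"},
    {"name": "RPC Drivers", "path": "C:/WINDOWS/System32/inetsrv/rpcall.exe",
     "hive": "HKCU", "ts": "2004-06-18 23:49:49.937500+00:00"},
    # Hypothetical unrelated entry, not taken from the image.
    {"name": "ExampleTray", "path": "C:/Program Files/Example/tray.exe",
     "hive": "HKLM", "ts": "2004-01-01 00:00:00+00:00"},
]


def duplicated_autostarts(entries):
    """Return {(name, lowercased path): sorted hives} for entries seen in more than one hive."""
    hives = defaultdict(set)
    for entry in entries:
        hives[(entry["name"], entry["path"].lower())].add(entry["hive"])
    return {key: sorted(found) for key, found in hives.items() if len(found) > 1}


for (name, path), where in duplicated_autostarts(entries).items():
    print(f"{name} -> {path} registered in {', '.join(where)}")
```

A duplicate across hives is only a hint, not proof of anything, but it is a cheap way to surface entries worth a second look.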

Checking the Hash

The next thing I wanted to do for triage purposes was checking the hash of this executable. I poked around for a bit by running “help” in the shell:

To calculate the hash of a particular file, we can just run hash <filepath>:

hash "C:/WINDOWS/System32/inetsrv/rpcall.exe"
MD5:	a183965f42bda106370d9bbcc0fc56b3
SHA1:	5d5a53182e73742acb027bb3a3abc1472d02dde9
SHA256:	776b26c9c516e1cd60871097e586026f73bc0f0c210582d1b2ea1ae7c954b2be
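If you export the file from the image instead, the same three digests can be computed with nothing but the Python standard library. This is a small sketch of that idea and not how Dissect's hash command is implemented; hash_file is a name I made up:

```python
import hashlib


def hash_file(path, chunk_size=1024 * 1024):
    """Return MD5, SHA1 and SHA256 hex digests of a file, reading it in chunks."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

Calling hash_file on the exported rpcall.exe returns a dictionary keyed by algorithm name, which is handy when you want to paste a single value into a search box.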

Pivoting on this, we can see that someone has uploaded it to VirusTotal for analysis, and it’s being widely detected as malicious:

While the detection names are rather generic and may be low confidence, by clicking on the Behavior tab we can see a sandbox run. In addition to the Run keys we expected, I saw that many keys under HKCU\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\DisallowRun were written to:

A variety of Registry keys were written under Policies\Explorer\DisallowRun.

Googling this registry key tree, I found that it’s a technique for disallowing the execution of certain programs (mostly antivirus products in this case). In addition, the first key in the screenshot is written to create a firewall exception for the worm. These actions match activity described in reports on several worm variants.

Information Filtering

Now that we have some situational awareness with targeted artifacts, what I want to do is test Dissect’s ability to filter larger amounts of data. Event logs, MFT and Prefetch are what I’m hoping for here. So how do we filter?

The answer, after some digging, is the command rdump. We can pipe the result of a command to rdump and do all sorts of filtering. For example, with prefetch! Unfortunately, at this point I needed to officially install Dissect locally, since the demo doesn’t seem to support piping to sort or rdump.

The prefetch information came quickly and had a surprising amount of detail. In addition to the name of the executable that may have run, the prefetch records also included a list of loaded libraries, which is great for investigating DLL hijacking incidents:

Snippet of all Prefetch records in the image.

But there were more than just DLLs: the list includes .nls files, .log files, ocx libraries and others.

Issues with PowerShell/Windows

Now, to try filtering by the filename field I followed the docs and tried this command:

PS> target-query.exe -f prefetch .\WinXP2.E01 | rdump.exe -s '"rpcall" in r.filename.lower()'

Here I’m searching for prefetch records that contain the keyword “rpcall.”

However, I got the error ERROR RecordReader('-'): Unknown file format, not a RecordStream. After this point I did some troubleshooting and ran into a number of issues in both PowerShell and the Command Prompt. The authors behind Dissect were very helpful in explaining the following:

  1. PowerShell does not support putting binary data (like our record streams) in a pipe. It will try to interpret it as text. Thus, it is easier to use the normal command prompt.
  2. rdump.exe -s '"rpcall" in r.filename.lower()' (as it says in the docs) will not work with the Command Prompt (cmd.exe); you’ll need to use rdump.exe -s "'rpcall' in r.filename.lower()". This is due to how rdump.exe was compiled here (apparently an artifact of compilation for Windows). So in this case, you need double quotes on the outside and single quotes on the inside (for strings within the statement).
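Both pitfalls above come from the shell sitting between the two tools, so one way to sidestep them might be to let Python wire up the pipe itself. This is a rough sketch under assumptions: build_pipeline and run_pipeline are names I made up, the executable names default to target-query and rdump (pass the .exe names on Windows), and I have not verified how the compiled Windows binaries parse the selector when launched this way:

```python
import subprocess


def build_pipeline(image, function, selector,
                   query_cmd="target-query", rdump_cmd="rdump"):
    """Build argv lists for: target-query -f <function> <image> | rdump -s <selector>."""
    return ([query_cmd, "-f", function, image],
            [rdump_cmd, "-s", selector])


def run_pipeline(image, function, selector, **kwargs):
    """Run both tools with a binary pipe between them and return rdump's output as text."""
    query_argv, rdump_argv = build_pipeline(image, function, selector, **kwargs)
    # Arguments are passed as lists, so no shell re-interprets the quotes.
    query = subprocess.Popen(query_argv, stdout=subprocess.PIPE)
    rdump = subprocess.Popen(rdump_argv, stdin=query.stdout, stdout=subprocess.PIPE)
    query.stdout.close()  # drop our copy of the pipe so target-query notices if rdump exits early
    output, _ = rdump.communicate()
    query.wait()
    return output.decode(errors="replace")
```

Something like run_pipeline("WinXP2.E01", "prefetch", "'rpcall' in r.filename.lower()") should then hand the record stream from one process to the other as raw bytes, without PowerShell trying to treat it as text.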

If this was a bit confusing, I apologize, but in summary: I recommend installing Dissect on Linux in a Python virtual environment, whether that’s in a separate Ubuntu virtual machine (maybe the SIFT VM) or on Windows Subsystem for Linux in your Windows VM. For the latter I recommend WSL 1, as the nested virtualization required for WSL 2 broke countless times on VirtualBox. Installing on Linux lets you follow the Dissect documentation without these issues, and a Python virtual environment avoids dependency conflicts. But since I figured out how to get piping and commands working on Windows, I’ll continue the walkthrough there. Back to the challenge!

Again, But in the Command Prompt

After trying the following, I got the output I expected:

target-query.exe -f prefetch .\WinXP2.E01 | rdump.exe -s "'rpcall' in r.filename.lower()"
Prefetch records where the executing file contains ‘rpcall.’

The cool thing about the Prefetch output from Dissect and its linked files is that we can look not only at the DLLs loaded (which can indicate things about the functionality of the malware), but also at accessed files that are not DLLs, simply by adding to our Python condition for the filter:

target-query.exe -f prefetch .\WinXP2.E01 | rdump.exe -s "'rpcall' in r.filename.lower() and not r.linkedfile.lower().endswith('.dll')"
Filtering the previous Prefetch records for non-DLL linked files.
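The selector is just a Python expression evaluated with r bound to each record. The following sketch imitates that idea on stand-in objects so the condition can be tried without an image; it is not rdump's actual implementation, and the first and last sample rows are hypothetical:

```python
from types import SimpleNamespace

# Stand-in records with only the two fields the selector touches.
records = [
    # Hypothetical DLL row, not copied from the image.
    SimpleNamespace(filename="RPCALL.EXE",
                    linkedfile="/DEVICE/HARDDISKVOLUME1/WINDOWS/SYSTEM32/NTDLL.DLL"),
    SimpleNamespace(filename="RPCALL.EXE",
                    linkedfile="/DEVICE/HARDDISKVOLUME1/DOCUMENTS AND SETTINGS/VMWARE/COOKIES/INDEX.DAT"),
    # Hypothetical row for an unrelated executable.
    SimpleNamespace(filename="NOTEPAD.EXE",
                    linkedfile="/DEVICE/HARDDISKVOLUME1/WINDOWS/NOTEPAD.EXE"),
]


def select(records, expression):
    """Yield the records for which the expression is true with `r` bound to the record."""
    code = compile(expression, "<selector>", "eval")
    for r in records:
        # eval is acceptable for your own selectors; never feed it untrusted input.
        if eval(code, {"__builtins__": {}}, {"r": r}):
            yield r


selector = "'rpcall' in r.filename.lower() and not r.linkedfile.lower().endswith('.dll')"
for hit in select(records, selector):
    print(hit.linkedfile)
```

Lowercasing both sides matters here because the Prefetch paths come back in upper case.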

How interesting! We can see that the last 3 files linked to this prefetch that are not DLLs are related to Internet Explorer:

/DEVICE/HARDDISKVOLUME1/DOCUMENTS AND SETTINGS/VMWARE/LOCAL SETTINGS/TEMPORARY INTERNET FILES/CONTENT.IE5/INDEX.DAT
/DEVICE/HARDDISKVOLUME1/DOCUMENTS AND SETTINGS/VMWARE/COOKIES/INDEX.DAT
/DEVICE/HARDDISKVOLUME1/DOCUMENTS AND SETTINGS/VMWARE/LOCAL SETTINGS/HISTORY/HISTORY.IE5/INDEX.DAT

While accessing these files is not conclusive evidence of stealing cache, history or cookies, it gives a potential thread to pull in the malware analysis and may be a part of networking functionality.

Other File Artifacts

Now that we know how to filter using rdump, we should check out noisy evidence sources like the MFT. The following query took a bit longer, probably on the order of a minute and a half. For comparison, the queries before this took about 5 seconds:

target-query.exe -f mft .\WinXP2.E01 | rdump.exe -s "'rpcall' in r.path.lower()"
MFT entries with “rpcall” in the path.

We can see the four timestamps for Birth, MFT Change, Modification, and Access are each different for the malicious file, whereas for the Prefetch records all four are the same. That lines up with intuition. I wonder what else is in that same directory?

I didn’t find anything else searching the directory, but I did notice something cool that I hadn’t spotted in the previous query:

There are 2 different types of timestamps output by the plugin.

Upon closer inspection, we have records of two types, filesystem/ntfs/mft/std and filesystem/ntfs/mft/filename, referring to the $STANDARD_INFORMATION and $FILE_NAME timestamps respectively. As we might expect, the $STANDARD_INFORMATION timestamps (especially the C timestamp) reflect metadata changes, whereas the $FILE_NAME timestamps are all aligned at the last move or copy action. This definitely aligns with my intuition.
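To compare the two sets side by side, the rows for one file can be paired up by timestamp type. This is a minimal sketch with made-up values; the field names kind, ts_type and ts are my own simplification and not necessarily what the plugin emits:

```python
from datetime import datetime, timedelta, timezone

base = datetime(2004, 6, 18, 23, 49, 49, tzinfo=timezone.utc)

# Hypothetical rows for a single file: "std" stands for the
# filesystem/ntfs/mft/std records, "filename" for filesystem/ntfs/mft/filename.
rows = [
    {"kind": "std", "ts_type": "B", "ts": base},
    {"kind": "std", "ts_type": "C", "ts": base + timedelta(hours=1)},
    {"kind": "filename", "ts_type": "B", "ts": base},
    {"kind": "filename", "ts_type": "C", "ts": base},
]


def si_fn_deltas(rows):
    """Return {ts_type: std time minus filename time} for one file's rows."""
    by_kind = {"std": {}, "filename": {}}
    for row in rows:
        by_kind[row["kind"]][row["ts_type"]] = row["ts"]
    shared = by_kind["std"].keys() & by_kind["filename"].keys()
    return {t: by_kind["std"][t] - by_kind["filename"][t] for t in sorted(shared)}


for ts_type, delta in si_fn_deltas(rows).items():
    print(ts_type, delta)
```

A non-zero delta simply shows where the two attributes have drifted apart since the last move or copy.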

Conclusion

I checked for other forms of persistence and didn’t find much else going on in this image. Chapter 2 in the book (I encourage reading it, it’s short) goes into some time anomalies in this image, but I was mostly focused on targeted artifact searching capabilities.

I’ve been impressed! I started this with an XP image expecting more hiccups in artifact extraction, but I successfully used the plugins info, evt, prefetch, userassist, mft, and runkeys with no issues. Unfortunately, the following were not supported on this XP image: shimcache (this ShimCache version is not implemented) and tasks (I saw that C:\Windows\Tasks, the directory for tasks in the legacy Task Scheduler, isn’t in the list of paths in the plugin source).

But the project is open source and I’m excited to see it develop! This could make for a very fast and flexible triage tool for answering specific questions and whipping up particular artifact timelines. Thanks to the Fox-IT squad for making such a cool tool open-source.

Threat Intel #1

It’s been a while since I posted, but now that papers and final projects are done, I can get back at it. Last week I started an awesome internship and will be doing a lot of DFIR work. In order to not burn out, I’ll be taking it easy with the research and blogging after hours. But I am getting exposed to more communities and cool info, which encourages me to research and post more.

For example, one of my coworkers got some threat intel from a group he’s in and sent it over to me to have a look at. It was a base64-encoded Powershell script, which decodes into a lightweight downloader. In this post I’ll use it as an example of how I do some quick threat tracking. So let’s start with the decoded payload.

The (poorly obfuscated) downloader script.
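Getting from the base64 blob to something readable takes only a few lines of Python. This is a sketch under assumptions: it guesses UTF-16LE when the decoded bytes contain nulls (the encoding PowerShell's -EncodedCommand uses) and UTF-8 otherwise, the function names are made up, and the sample script in the usage lines is a harmless placeholder rather than the real payload:

```python
import base64
import re

# Deliberately loose URL pattern: stop at whitespace, quotes, brackets and parentheses.
URL_RE = re.compile(r"https?://[^\s'\"<>()]+", re.IGNORECASE)


def decode_powershell(blob):
    """Base64-decode a PowerShell payload and return it as text."""
    raw = base64.b64decode(blob)
    # Null bytes suggest UTF-16LE; otherwise assume UTF-8.
    encoding = "utf-16-le" if b"\x00" in raw else "utf-8"
    return raw.decode(encoding, errors="replace")


def extract_urls(script):
    """Return the URLs found in the decoded script, in order of appearance."""
    return URL_RE.findall(script)


# Placeholder script using reserved example domains.
sample = "$urls = 'http://example.com/a','https://example.org/b'"
blob = base64.b64encode(sample.encode("utf-16-le")).decode()
print(extract_urls(decode_powershell(blob)))
```

Heavier obfuscation (string concatenation, character replacement) will defeat a simple regex like this, so treat it as a first pass.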

So now that we have some second stage URLs, I like to pivot to VirusTotal (VT), using their search function to see if the URL has already been scanned.

VT results for the first URL.

And it has been, so that saves me a little time. We get intel that this is a compromised site helping the bad guys serve malware, as often happens with WordPress sites involved in infections. Next, let’s get the hash of the downloaded file from VT.

Searching the associated hash from VT.

As we can see, this malware has a high detection rate, so it’s no 0-day. The Behavior tab on VT is pretty valuable, but there’s an analysis service popular with malware analysts that can do even better: Let’s take the hash to Any Run to see if the file has been analyzed. If not, we might have to do some VM work to get the sample.

I didn’t find anything by searching the hash, but I was able to pivot off of the IP to find a report that was already run.

Right in the middle is the submission we’ll look into. I could’ve made one myself, but why reinvent the wheel?

And if we open that submission, we get a taste of a beautiful, yet functional UI:

If we look closer at that network activity section, it’s already alerting us to the fact that the malware is being served out of an open directory. And it’s never been easier to pivot to the sample. All we have to do is click on the packet where the executable is downloaded…

And we get the above window. We can see headers, resources, sections and imports from here. We could submit it for analysis, but since we now have a hash for the executable, let’s try using that to pivot.

Nice, it’s already been run for us.

So we click in. Now, VT already told us this is likely Emotet, an extremely common polymorphic trojan, but if you want to get into the details about what happens at the registry and filesystem level, Any Run gives you that in the window on the right side.

A nice process tree.

Clicking on any of the spawned processes in the tree gives you a more granular look at what happened, similar to procmon. With the little icons, you can easily see if the child processes use the network, drop executables, or engage in traditional persistence techniques. Let’s take a closer look at PID 1300.

Now we can see the associated filesystem and network events. Any Run gives this process an extremely suspicious rating due to its IOCs, and if we look at the network activity, it seems to be beaconing out to Argentinian C2s. They didn’t respond, but there is a response from Singapore (looks like a droplet from DigitalOcean). Let’s look at that exchange.

So here’s the response from Singapore. It’s identified as a FLIC FLI video. I’ve never heard of it, but apparently it’s like a GIF? This is kind of where the trail ends. The file doesn’t open with FLIC viewers and doesn’t seem to have a way of executing. Other compromised hosts in the original intel file are down, so that’s pretty much the end of this investigation! It was a lot of screenshots but overall pretty quick triage. Video could definitely be a better format for this series; I’ll strongly consider that.

Still working on my honeypots and finding opendir malware to analyze. My next post might be about those topics, or on one of the forensics challenges I’ve found online.

As always, thanks for reading.

Shorter Post: Goals for the Next 2 Weeks

So, now that the semester will be starting soon, I want to get some action items off the ground while I have (some) time. Mainly:

  1. Get my own honeypot(s) working so I can see what is out in the wild, especially what’s attacking MITnet (the 18.X.X.X subnet). I’m planning to use nepenthes or Dionaea, low interaction honeypot tools, on an old iPod touch with the MobileTerminal emulator on it. I might use my Raspberry Pi as an HTTP server for uploads from the iPod, since it’s a bit weak/slow for constant access. If the iPod can’t handle nepenthes, I can run the honeypot on the Pi, but I would prefer to have some use for this old thing. I’ve been wanting to do this for a bit, and I’m not sure if it’s possible, but it’s worth a try.
  2. Perform full analysis of a malware sample from current trackers. It’s time to look at some malware from the wild rather than from books and courses (although that malware was, of course, originally in the wild).

Stay tuned for the blog post follow-up for these goals!