Tuesday, August 22, 2017

Beyond Getting Started

I blogged about getting started in the industry back in April (as well as here and here), and after having recently addressed the question on an online forum again, I thought I'd take things a step further.  Everyone has their own opinion as to the best way to 'get started' in the industry, and if you look wide enough and far enough, you'll start to see how those who post well thought out articles have some elements in common.

In the beginning...
We all start learning through imitation and repetition, because that's how we are taught.  Here's the process, follow the process.  This is true in civilian life, and it's even more true in the military.  You're given some information as to the "why", and then you're given the "how".  You do the "how", and you keep doing the "how" until you're getting the "how" right.  Once you've gotten along for a bit with the "how", you start going back to the "why", and sometimes you find out that based on the "why", the "how" that you were taught is pretty darned good.   Based on a detailed understanding of the "why", the "how" was painstakingly developed over time, and it's just about the best means for addressing the "why".

In other cases, some will start to explore doing the "how" better, or different, questioning the "why".  What are the base assumptions of the "why", and have they changed?  How has the "why" changed since it was first developed, and does that affect the "how"?

This is where critical thinking comes into play.  Why am I using this tool or following this process?  What are my base assumptions?  What are my goals, and how does the tool or process help me achieve those goals?  The worst thing you could ever do is justify following a process with the phrase, "...because this is how we've always done it."  That statement clearly shows that neither the "why" nor the "how" is understood, and you're just going through the motions.

Years ago, when I had the honor and the pleasure of working with Don Weber, he would regularly ask me "why"...why were we doing something and why were we doing it this way?  This got me to consider a lot about the decisions I was making and the actions I was taking as a team leader or lead responder, and I often found that my decisions were based not just on the technical aspects of what we were doing, but also the business aspects and the impact to the client.  I did not take offense at Don's questions, and actually appreciated them.

Learn to program
Lots of folks say it's important to learn a programming language, and some even go so far as to specify the particular language.  Thirty-five years ago, I started learning BASIC, programming on an Apple IIe.  Later, it was PASCAL, then MatLab and Java, and then Perl.  Now it seems that Python is the "de facto standard" for DFIR work...or is it?  Not long before NotPetya rocked the world, the folks at RawSec posted an article regarding carving EVTX records, and released a tool written in Go.  If you're working on Windows systems or in a Windows environment, PowerShell might be your programming language of choice...it all depends on what you want to do.

There is a great deal of diversity on this topic, and I'd suggest that the programming language you choose should be based on your needs.  The main point is that learning to program helps you see big problems as a series of smaller problems, some of which must performed in a serial fashion.  What we learn from programming is how to break bigger problems into smaller, logical steps.

Engage in the community
Within the DFIR "community", there's too much "liking" and retweeting, and not enough doing and asking of questions, nor actively engaging with others.  Not long ago, James Habben posted an excellent article on his blog on "being present", and he made a lot of important points that we can all learn from.  Further, he put a name to something that I've been aware of for some time; when presenting at a conference, there's often that one person how completely forgets that they're in a room full of other people, and kidnaps and dominates the presenter's time.  There are also those who attend the presentation (or training session) who spend the majority of their time engaged in something else entirely.

Rafal Los recently posted a fascinating article on the SecurityWeek web site.  I found his article well-considered and insightful, and extremely relevant.  It's also something I can relate to...like others, I get connection requests on LinkedIn from folks who've done nothing more than clicked a button.  I also find that after having accepted most connection requests, I never hear from the requester again.  I find that if I write a blog post (like this one) and share the link on Twitter and LinkedIn, I'll get "likes" and retweets, but not much in the way of comments.  If I ask someone what they "like" about the article...and I have done this...more often than not the response is that they didn't actually read it; they wanted to share it with their community.  Given that, there is no difference between having worked on writing and publishing the article, and not having done so.

Engaging in the community is not only a great way to learn, but also a great way to extend the community itself.  A friend recently asked me which sandbox I use for malware analysis, and why.  For me to develop a response beyond just, "I don't", I really had to think about the reasons why I don't use a sandbox.   I learned a little something from the engagement, just as I hope my friend did, as well.

An extension of engaging in the community is to write your own stuff.  Share your thoughts.  Instead of clicking "like" on a link to a blog post, add a comment to the post, or ask a question in the comments.  Instead of just clicking "like" or retweeting, share your reasons for doing so.  If it takes more than 140 characters to do so, write a blog post or comment, and share *that*.

I guess the overall point is this...if you're going to ask the question, "how do I get started in the DFIR industry?", the question itself presupposes some sort of action.  If you're just going to follow others, "like" and retweet all the things, and not actually read, engage, and think critically, then you're not really going to 'get started'.

Friday, August 18, 2017

Updates, New Stuff

Specificity
The folks at Talos recently posted an interesting article, "On Conveying Doubt".  While some have said that this article discusses "conveying doubt" (which it does), it's really about specificity of language.  Too often in our industry, while there is clarity of thought, there's a lack of specificity in the language we use to convey those thoughts; this includes all manner of communication methods; not only reports, but presentations and blog posts.  After all, it doesn't matter how good you or your analysis may be if you cannot communicate your methodology and findings.

Ransomware
Ransomware is a pretty significant issue in the IT community, particularly over the past 8 months or so.  My first real encounter with ransomware, from a DFIR perspective, started last year with a couple of Samas ransomware cases that came in; from a perspective external to my employer, this resulted in two corporate blog posts, one of which was written by a colleague/friend, and ended up being one of the most quoted and referenced blog posts published.  Interestingly enough, a lot of aspects about ransomware, in general, have continued to evolve since then, at an alarming rate.

Vitali Kremez recently posted an interesting analysis of the GlobeImposter ".726" ransomware.  From an IR perspective, where one has to work directly with clients, output from IDAPro and OllyDbg isn't entirely useful in most cases.

However, in this case, there are some interesting aspects of the ransomware that Vitali shared, specifically in section VI.(b).; that is, the command line that the ransomware executes.

Before I go any further, refer back to Kevin's blog post (linked above) regarding the evolution of the Samas ransomware.  At first, the ransomware included a copy of a secure deletion tool in one of it's resource sections, but later versions opted to reduce the overall size of the executable.  Apparently, the GlobeImposter ransomware does something similar, relying on tools native to the operating system to run commands that 'clean up' behind itself.  From the embedded batch file, we can see that it deletes VSCs, deletes user's Registry keys related to lateral movement via RDP, and then enumerates and clears *all* Windows Event Logs.  The batch file also includes the use of "taskill" to stop a number of processes, which is interesting, as several of them (i.e., Outlook, Excel, Word) would immediately alert the user to an issue.

FortiNet recently published an analysis of a similar variant.

Online research (including following #globeimposter on Twitter) indicates that the ransomware is JavaScript-based, and that the IIV is via spam email (i.e., "malspam").  If that's the case (and I'm not suggesting that it isn't), why does the embedded command line include deleting Registry keys and files associated with lateral movement via RDP?

Marco Ramilli took a look at some RaaS ransomware that was apparently distributed as an email attachment.  He refers to it as "TOPransomware", due to a misspelling in the ransom note instructions that tell the victim to download a TOP browser, as opposed to a TOR browser.  This is interesting, as some of Samas variants I've seen this year include a clear-text misspelling in the HTML ransom note page, setting the font color to "DrakRed".

I also ran across this analysis of the Shade ransomware, and aside from the analysis itself, I thought that a couple of comments in the post were very interesting.  First, "...has been a prominent strand from around 2014."  Okay, this one is new to me, but that doesn't really say much.  After all, as a consultant, my "keyhole" view is based largely on the clients who call us for assistance.  I don't claim to have seen everything and know everything...quite the opposite, in fact.  But this does illustrate the differences in perspective.

Second, "...spread like many other ransomware through email attachments..."; this is true, many other ransomware variants are spread as email attachments.  The TOPransomware mentioned previously was apparently discovered as a .vbs script attached to an email.  However, it is important to note that NOT ALL ransomware gets in via email.  This is an important distinction, as all of the protection mechanisms that you'd employ against email-borne ransomware attacks are completely useless against variants such as Samas and LeChiffre, which do not propagate via email.  My point is, if you purchase a solution that only protects you from email-borne attacks, you're still potentially at risk for other ransomware attacks.  Further, I've also seen where solutions meant to protect against email-borne ransomware attacks do not work when a user uses Chrome to access their AOL email, and the system/infrastructure gets infected that way.

On August 7, authorities in the Ukraine arrested a man for distributing the NotPetya malware (not ransomware, I know...) in order to assist tax evaders.  According to the article, he isn't thought to be the author of the destructive malware, but was instead found to be distributing a copy of the malware via his social media account so that companies could presumably infect themselves and not pay taxes.

Recently, this Mamba ransomware analysis was posted; more than anything else, this really highlights one of the visible gaps in this sort of analysis.  As the authors found the ransomware through online searches (i.e., OSINT), there's no information as to how this ransomware is distributed.  Is it as an email attachment?  What sort?  Or, is it deployed via more manual means, as has been observed with the Samas and LeChiffre ransomware?

The Grugq recently posted thoughts as to how ransomware has "changed the rules".  I started in the industry, specifically in the private sector, 20 yrs ago this month...and I have to say, many/most of the recommendations as to how to protect yourself against and recover from ransomware were recommendations from that time, as well.  What's old is new again, eh?

Cryptocurrency Miners
Cylance recently posted regarding cryptocurrency mining malware.  Figure 7b of the post provides some very useful information in the way of a high fidelity indicator..."stratum+tcp".  If you're monitoring process creation on endpoints and threat hunting, looking for this will provide an indicator on which you can pivot.  Clearly, you'll want to determine how the mining software got there, and performing that root cause analysis (RCA) will direct you to their access method.

Web Shells
The PAN guys had a fascinating write-up on the TwoFace web shell, which includes threat actor TTPs.  In the work I've done, I've most often found web shells on perimeter systems, and that initial access was exploited to move laterally to other systems within the network infrastructure.  I did, however, once see a web shell placed on an internal web server; that web shell was then used to move laterally to an MS SQL server.

Speaking of web shells, there's a fascinating little PHP web shell right here that you might want to check out.

Lifer
Retired cop Paul Tew recently released a tool he wrote called "lifer", which parses LNK files.  One of the things that makes tools like this really useful is use cases; Paul comes from an LE background, so he very likely has a completely different usage for such tools, but for me, I've used my own version of such tools to parse LNK files reportedly sent to victims of spam campaigns.

Want to generate your own malicious LNK files?  Give LNKup a try...

Supply Chain Compromises
Supply chain compromises are those in which a supplier or vendor is compromised in order to reach a target.  We've seen compromises such as this for some time, so it's nothing new.  Perhaps the most public supply chain compromise was Target.

More recently, we've recently seen the NotPetya malware, often incorrectly described as "ransomware".  Okay, my last reference to ransomware (for now...honest) comes from iTWire, in which Maersk was able to estimate the cost of a "ransomware" attack.  I didn't include this article in the Ransomware section above because if you read the article, you'll see that it refers to NotPetya.

Here is an ArsTechnica article that discusses another supply chain compromise, where a backdoor was added to a legitimate server-/network-management product.  What's interesting about the article is that it states that covert data collection could have occurred, which I'm sure is true.  The questions are, did it, and how would you detect something like this in your infrastructure?

Carving for EVTX
Speaking of NotPetya, here's an interesting article about carving for EVTX records that came out just days before NotPetya made it's rounds.  If you remember, NotPetya included a command line that cleared several Windows Event Log files

The folks who wrote the post also included a link to their GitHub site for the carver they developed, which includes not only the source code, but pre-compiled binaries (they're written in Go).  The site includes a carver (evtxdump) as well as a tool to monitor Windows Event Logs (evtxmon).

Wednesday, August 02, 2017

Document Metadata

Okay, yeah, so I've been blogging a lot over the past couple of months about extracting document metadata as part of gathering threat intelligence.


This handler diary provided analysis of "malspam pushing Emotet", and this follow up post illustrated how to conduct static analysis of the document itself.  I have used several of the tools mentioned, but had not yet heard of "vipermonkey", and open-source VBA emulator.  Used in conjunction with oledump.py, you can really get a lot of traction with respect to static analysis of the malicious document.

While the second handler diary post focuses on analysis of the malicious macro, what neither post does is illustrate the document metadata. Below is the output of wmd.pl, run against a sample downloaded from VT:

C:\Perl>wmd.pl d:\cases\maldoc\maldoc
--------------------
Statistics
--------------------
File    = d:\cases\maldoc\maldoc
Size    = 215040 bytes
Magic   = 0xa5ec (Word 8.0)
Version = 193
LangID  = Russian

Document has picture(s).

Document was created on Windows.

Magic Created : MS Word 97
Magic Revised : MS Word 97

--------------------
Summary Information
--------------------
Title        : sdf
Subject      : df
Authress     : admin
LastAuth     : admin
RevNum       : 2
AppName      : Microsoft Office Word
Created      : 26.07.2017, 11:51:00
Last Saved   : 26.07.2017, 11:51:00
Last Printed :

--------------------
Document Summary Information
--------------------
Organization : home

...and from oledmp.pl:

C:\Perl>oledmp.pl -f d:\cases\maldoc\maldoc -l
Root Entry  Date: 26.07.2017, 11:51:59  CLSID: 00020906-0000-0000-C000-000000000046
    1 F..   55949                      \Data
    2 F..    7359                      \1Table
    3 F..    4148                      \WordDocument
    4 F.T    4096                      \ SummaryInformation
    5 F.T    4096                      \ DocumentSummaryInformation
    6 D..       0 26.07.2017, 11:51:59 \Macros
    7 D..       0 26.07.2017, 11:51:59 \Macros\VBA
    8 FM.   88908                      \Macros\VBA\ThisDocument
    9 F..     532                      \Macros\VBA\__SRP_2
   10 F..     156                      \Macros\VBA\__SRP_3
   11 FM.    8137                      \Macros\VBA\zjUb2S
   12 FM.    8877                      \Macros\VBA\cvDTF
   13 FM.    4906                      \Macros\VBA\FX9UL
   14 F..   15451                      \Macros\VBA\_VBA_PROJECT
   15 F..     739                      \Macros\VBA\dir
   16 F..    1976                      \Macros\VBA\__SRP_0
   17 F..     198                      \Macros\VBA\__SRP_1
   18 F..      98                      \Macros\PROJECTwm
   19 F..     476                      \Macros\PROJECT
   20 F.T     114                      \ CompObj

We can see that the dates displayed by both tools line up, and we can use oledmp.pl to further list the contents (raw, or hex) of the various streams.  

So, how can any of this be of value, and why does any of this matter?  Well, at BlackHat last week, Allison Wikoff spent a great deal of her time being interviewed about some really fantastic research that she'd conducted on the "Mia Ash" persona (here is the original SecureWorks posting of the results of her research).

From the Wired article:
Eventually, Ash sent the staffer an email with a Microsoft Excel attachment for a photography survey. She asked him to open it on his office network, telling him that it would work best there. After a month of trust-building conversation, he did as he was told. The attachment promptly launched a malicious macro...

So, this really illustrates the dedication of these threat actors...they establish a persona, including social media "pocket litter", and spend time developing a relationship with their target.  As a very small part of her research, Allison took a look at the metadata embedded within the Excel spreadsheet, and found that the user information referred to "Mia Ash".  This further illustrated the depths to which the threat actors would go in order to make the persona appear authentic; not only did they populate multiple social media sites and create a "history" for the persona, but they also ensured that the metadata in the documents sent to intended victims included the 'right' contents to support the persona.  That's right, it's exactly the way it sounds...the metadata embedded in the spreadsheet specifically referred to "Mia Ash" as the authorized user of the MS Office products.

I know what you're going to say..."yeah, but that stuff can be changed/modified...".  Yes, it can...but the point is, how often is that actually done?  Look at the above listed output from wmd.pl...does it look as if any effort was put into modifying the metadata that populated the Word97 file?

Something I've said about Windows systems and DFIR work is that as the versions of Windows have been developed, the amount of information that is automatically recorded as malware or an adversary interacts with the endpoint environment has increased significantly.  In many cases, this seems to be overlooked when it comes to developing threat intelligence for some reason; in spam and phishing campaigns, a lot of the different artifacts are examined...the contents of the email (headers, body, etc.), attachment macros, second-stage downloads, etc.  But what is often missed is document metadata embedded in the attachment; Word docs, Excel spreadsheets, and even LNK shortcut files can all be rich in valuable information.  One such example is looking at time stamps...when an email was sent, when a document was created, when a binary was compiled, etc., and lining all of those up to illustrate just how organized and planned out an attack appears to be.

Sunday, July 09, 2017

Tools

Today's post is a mish-mash of tools and techniques that I've seen or used recently...

Hindsight is a great free, open source tool for parsing a user's Chrome browser data.  I've used it a number of times to great effect; in one instance, I was able to show that a system became infected with ransomware when the user used Chrome to access their AOL email, where they downloaded and launched the malicious attachment.  The tool is very easy to use, and all you need to do is either point it at the user's "Default" folder (within the Chrome path), or extract the sqlite3 files and run it locally against the data.

Joe Gray over at AlienVault published an interesting article on data carving; this has always been an interesting DFIR topic, ranging from file carving to carving for individual records.  In the wake of the recent NotPetya attacks, Willi's EVTXtract might come in handy for some.  Another tool that I've run against decompressed hibernation files, pagefiles, and unallocated space is bulk_extractor, specifically when looking for indications of network communications.  My point is that if you're going to go carving, sometimes it's a good idea to first think about what it is you're carving for, and then seek an suitable approach to performing the carving.

Not "new" by any stretch, but Yogesh's research into Windows8/8.1 search history is still very relevant for a number of reasons.  For one, it illustrates the continued use of the LNK file format (which is actually pretty pervasive throughout the Windows platform...), telling us that not all of the stuff we learned from previous versions of Windows needs to be thrown out the door.  Second, Yogesh's finding that the retention mechanism for search terms changed between Windows 8 and 8.1 illustrates how quickly things can change on Windows systems.  I mean, look at what the Volatility folks have had to deal with!  ;-)

I ran across the Network Usage View tool from NIRSoft recently...that's a pretty interesting capability.  The write-up for the tool indicates that it gets it's data by reading the SRUDB.DAT database on Win8 and Win10 systems.  This is potentially a pretty valuable data source for DFIR work and analysis. In case you haven't seen it, Yogesh has a pretty fascinating presentation available on SRUM Forensics that is worth checking out.

I saw on Twitter recently that there's a Python-based tool available available now for diff'ing Registry hive files.  I completely agree with those who've commented that this is some great functionality to have available, and has a great deal of potential...but this functionality has been around from quite some time already, via other sources.  For example, James McFarlane's Parse::Win32Registry Perl module distribution includes a script that implements this functionality.  Another tool that allows you to diff two Registry hive files is RegShot.  I agree that this is great functionality to have available, particularly if you want to see what differences exist between a hive file extracted from a VSC or found in the RegBack folder, with one in the config folder.

Speaking of the Registry, I saw this paper from DFRWS 2008 that discusses recovering deleted data from Registry hive files.  My first real encounter with this sort of information was via Jolanta Thomassen's dissertation paper on the topic, and the regslack tool she provided to go along with it.  Since then, other tools (RegRipper plugins del.pl and del_tln.pl) have implemented similar functionality, largely due to the demonstrated value of this functionality.

Jason Hale posted a while back (2 yrs) on the DeviceContainers key on Windows systems, and I ran across his post again recently.  What he found is pretty interesting...I'll have to dig into it a bit more and see what else is available out there.  Jason's research seems to provide a pretty good idea of what can be derived from the key data, so this may be well worth developing a RegRipper plugin, even if just to research what's available in various hives.

I was working on some analysis recently, and was facing an issue where a good number of NTUSER.DAT files had been recovered from an image, all of which had been extracted from the image and placed in folder paths.  While there were a lot of these files, I was only interested in one Registry key (pertinent to the case), a key for which a RegRipper plugin did not exist.  So, I modified an existing plugin to give me information about the key in question, if it did exist, and then wrote a DOS batch file to iterate through all of the folders, running the new plugin (via rip.exe) against the hive file.  A few minutes of development and testing, and I had a repeatable, documented process in place and functioning, providing a capability that had not been in my hands just a few moments before.  My point in sharing this is to illustrate what can be achieved through simple problem definition, and the use of open sources to develop a solution.  I have the batch file I used, so it's pretty much self-documenting, and I pasted the command line from the batch file into my case notes.

pundup - Python script from herrcore to extract contents of McAfee *.bup files.  Even in 2017, there are a great deal of systems (and infrastructures) without any real endpoint monitoring capability employed, and sometimes you need to dig around a bit to get some really useful information about an incident.  One place you can look is AV detections (via the logs), and as such, any available quarantined files may provide even greater insight into the incident.  Further, if the system is running an older version of Windows, and you don't have an Amcache.hve file collecting process execution artifacts (like SHA-1 hashes), having the actual EXE itself to document, hash, and analyze would be very beneficial.

AppCompatProcessor - I ran across this little open source gem recently (note: according to the readme, this does not currently run on Windows); this tool runs through either AppCompatCache or AmCache data and allows you to...well, do a LOT with the data. It's well worth a look; just reading through the main page, I can easily see that a lot of what I include in my own workflow is used as pivot points, and then to expand the data.  For example, I tend to look for things like "$Recycle",

SysMon View - this is a really interesting approach to filtering and visualizing data collected by Sysmon on a Windows system.  Unfortunately, the only time I see Sysmon in use is on my own test systems; it does not seem to have been widely adopted by members of the corporate community who call for IR assistance.  I do think that this is a great approach to making better use of the data, though.

LimaCharlie - from refractionPOINT, described as an open source, cross-platform endpoint sensor.  There isn't a great deal of information available via the web page, but there are a few tweets available.

SideNote:
Speaking of endpoint agents, SANS recently conducted a product review of CrowdStrike's Falcon platform...you can get the PDF report here.
/SideNote

Invoke-Phant0m - The description states that the script "walks thread stacks of Event Log Service process (spesific[sic] svchost.exe) and identify Event Log Threads to kill Event Log Service Threads. So the system will not be able to collect logs and at the same time the Event Log Service will appear to be running."  Given the recent release of tools that claim to be able to remove individual Windows Event Log records, this is an interesting approach.  However, the biggest issue with the released tools is the inability to validate findings; while some on Twitter (I've been pointed to tweets) have claimed success, the actual EXE and process used haven't been shared to the point of allowing others to validate the findings.  To say, "...just start with this DLL..." does not provide a means of validation.

Of course, if you're not able to remove individual records, Inslainity provided another approach, albeit one that can be validated.  I've tested another approach to removing specific ranges of event records from the Security Event Log, using a method that can be scaled to all logs, but is much more insidious if you don't.

The folks at Javelin Networks have come up with an in-memory PowerShell script that can peek into consoles and provide detailed information about what's being done.  The description states that the script "extracted the content of the following command-line shells: PowerShell, CMD, Python, Wscript, MySQL Client, and some custom shells such as Mimikatz console. In some cases, the tools might be helpful to extract encrypted shells like the one used in PowerShell Empire Agent."

News
Adam (Hexacorn) has published yet another article demonstrating means of persistence and EDR bypass.  If nothing else, this is an excellent example of why endpoints of all versions (not just Windows) need to be instrumented to monitor and record process creation events, including full command lines.

Pete James over at Precision Discovery have a fascinating blog post in which he discusses records left behind in an often-overlooked Windows Event Log file.  I can't say that I've ever had a case where I've needed to know which Office files had been accessed by a user, but if you're tracking such artifact categories, then this is a good one to include.

Speaking of Office products, Will Knowles at MWRLabs published a blog post on using Office add-ins for persistence.

Here are a couple of links I've had sitting around for a while that I really haven't dug into...
Javelin Networks - CLI Powershell
NViso - Hunting Malware with Metadata
FireEye - Shim Databases used for Persistence

Thursday, June 15, 2017

Analyzing Documents

I've noticed over time that a lot of the write-ups that get posted online regarding malware or downloaders delivered via email attachments (i.e., spear phishing campaign) focus on what happens after the malicious payload is activated...the URL reached to, the malware downloaded, etc.  However, few seem to dig into the document itself, and there's a great deal can be gleaned from those documents, that can add to the threat intel picture.  If you're not looking at everything involved in the incident, if you're not (as Jesse Kornblum said) using all the parts of the buffalo, then you're very likely missing critical elements of the threat intel picture.

Here's an example from MS...much of the information in the post focuses on the embedded macro and the subsequent decrypted Cerber executable file, but there's nothing available regarding the document itself.

Keep in mind that different file formats (LNK, OLE, etc.) will contain different information.  And what I'm referring to here isn't about running through analysis steps that take a great deal of time; rather, what I'm going to show you are a few simple steps you can use to derive even more information from the attachment/wrapper documents themselves.

I took a look at a couple of documents (Doc 1, Doc 2) recently, and wanted to share my process and see if others might find it useful.  Both of these OLE-format documents have hashes available (or you can download and compute the hashes yourself), and they were also found on VirusTotal:

VirusTotal analysis for Doc 1
VirusTotal analysis for Doc 2

The VT analysis for both files includes a comment that the file was used to deliver PupyRAT.

Tools
Tools I'll be using for this analysis include my own oledmp.pl and wmd.pl.

Doc 1 Analysis
Running oledmp.pl against the file, we see:

Fig. 1: Doc 1 oledmp.pl output

















That's a lot of streams in this OLE file.  So, one of the first things we see is the dates for the Root Entry and the 'directories' (MS has referred to the OLE file format as "a file system within a file", and they're right), which is 1 Jan 2017.  According to VT, the first time this file was submitted was 1 Jan 2017, at approx. 20:29:43 UTC...so what that tells us is that it's likely that one of the first folks to receive the document submitted it less than 14 hrs after the file was modified.

Continuing with oledmp.pl, we can view the contents of the various streams in a hex dump format, but we see that stream number 20 contains a macro.  Using oledmp.pl with the argument "-d 20", we can view the contents of the stream in hex dump format.  In the output we see what appear to be 2 base64-encoded Powershell commands, one that downloads PupyRAT to the system, and another that appears to be shell code.  Copying and decoding both of the streams gives us the command that downloads PupyRAT, as well as a second command that appears to be some form of shell code.  Some of the variable names ($Qsc, $zw5) appear to be unique, so searching for those via Google leads us to this Hybrid-Analysis write-up, which provides some insight into what the shell code may do.

Interestingly enough, the same search reveals that, per this Reverse.IT link, both encoded Powershell commands were used in another document, as well.

Moving on, here's an excerpt of the output from wmd.pl, when run against this document:

--------------------
Summary Information
--------------------
Title        :
Subject      :
Authress     : Windows User
LastAuth     : Windows User
RevNum       : 2
AppName      : Microsoft Office Word
Created      : 01.01.2017, 06:51:00
Last Saved   : 01.01.2017, 06:51:00
Last Printed :

Notice the dates...they line up with the previously-identified dates (see fig.1)


Doc 2 Analysis
Following the same process we did with doc 1, we can see very similar output from oledmp.pl with doc 2:

Fig. 2: Doc 2 oledmp.pl output















One of the first things we can see is that this document was created within about 24 hrs of doc 1.

In the case of doc 2, stream 16 contains the data we're looking for...extracting and decoding the base64-encoded Powershell commands, we see that the commands themselves (PupyRAT download, shell code) are different.  Conducting a Google search for the variables used in the shell code command, we find this Hybrid-Analysis write-up, as well as this one from Reverse.IT.

Here's an excerpt of the output from wmd.pl, when run against this document:

--------------------
Summary Information
--------------------
Title        : HealthSecure User Registration Form
Subject      : HealthSecure User Registration Form
Authress     : ArcherR
LastAuth     : Windows User
RevNum       : 2
AppName      : Microsoft Office Word
Created      : 02.01.2017, 06:49:00
Last Saved   : 02.01.2017, 06:49:00
Last Printed : 20.06.2013, 06:27:00

--------------------
Document Summary Information
--------------------
Organization : ACC

Remember, this is a sample pulled down from VirusTotal, so there's no telling what happened with the document between the time it was created and submitted to VT.  I made the 'authress' information bold, in order to highlight it.

Summary
While this analysis may not appear to be of significant value, it does form the basis for developing a better intelligence picture, as it goes beyond the more obvious aspects of what constitutes most analysis (i.e., the command to download PupyRAT, as well as the analysis of the PupyRAT malware itself) in phishing cases.  Valuable information can be derived from the document format used to deliver the malware itself, regardless of whether it's an MSOffice document, or a Windows shortcut/LNK file.  When developing an intel picture, we need to be sure to use all the parts of the buffalo.

Wednesday, May 31, 2017

Use of LNK Files...Again

I've discussed LNK file before...here, and more recently, here.  Windows shortcut files are an artifact on Windows systems that have been available for so long that there are likely a great many analysts who understand the category this artifact falls into, but may not know a great deal about the nature of the artifact itself.  That is to say that it's likely that a good many training and certification courses tend to gloss over some of the more esoteric aspects of the Windows shortcut file as an artifact.

Most analysts likely understand the Windows shortcut, or "LNK" file to be an artifact of "user knowledge of files", as the most commonly understood activity that leads to the automatic creation of these files is a user double-clicking on a file via the Windows Explorer shell.

We also know that understanding the format of these files can be very important, as well.  The LNK file format is used as the basis for individual streams within JumpLists, and there are several Registry keys that contain value data that follows the LNK file format.  In fact, LNK files are compose, at least in part, of shell items...so now we're getting into the format of the individual building blocks of many artifacts on Windows systems.

Let's take a step back for a moment, and consider the most common means of "producing" Windows shortcut files.  From a standpoint of developing and interpreting evidence, let's say that a user plugs a USB device into a laptop, opens the contents of the device in Windows Explorer, and then double-clicks an MS Word document.  At that point, a shortcut file is created in the user's Recent folder that contains not only a series of shell items that comprise the path to the file, but also a string that refers to the file, as well.  However, the LNK file also contains information about that system on which it was created, and it's this aspect of the file format that is missed or forgotten, due the fact that in most cases, this embedded information isn't very interesting.  However, if an LNK file is not created as an artifact of, say, a malware infection, but is instead used by an attacker to infect other systems, then the LNK file will contain information about the system on which it was developed, which would NOT be the victim system.  This is when the contents of the LNK file become very interesting.

TrendMicro's TrendLabs recently published a blog article that gives a very good example of when these files can be extremely interesting.  The article describes the reported increase in use of LNK files as weaponized attachments by the group identifed as "APT 10".  I find this absolutely fascinating, not because something that has been part of Windows from the very beginning has been turned against users and employed as an attack vector, but because of the information and metadata that seems to be left on the floor because these files simply are not parsed or dug into...at least not to any extent that is discussed publicly.  It seems that most organizations may miss or discount the value of Windows shortcut file contents, and not incorporate them into their overall intelligence picture.

This JPCERT blog post describes "Evidence of Attackers' Development Environment Left in Shortcut Files", and this NViso blog post discusses, "Tracking Threat Actors Through .LNK Files".  These are great at describing what can be done, providing illustration and example.

Sunday, April 09, 2017

Getting Started

Not long ago, I gave some presentations at a local high school on cybersecurity, and one of the questions that was asked was, "how do I get started in cybersecurity?"  Given that my alma mater will establish a minor in cybersecurity this coming fall, I thought that it might be interesting to put some thoughts down, in hopes of generating a discussion on the topic.

So, some are likely going to say that in today's day and age, you can simply Google the answer to the question, because this topic has been discussed many times previously.  That's true, but it's as blessing as much as it is a curse; there are many instances in which multiple opinions are shared, and at the end of the thread, there's no real answer to the question.  As such, I'm going to share my thoughts and experience here, in hopes that it will start a discussion that others can refer to.  I'm hoping to provide some insight to anyone looking to "get in" to cybersecurity, whether you're an upcoming high school or college graduate, or someone looking to make a career transition.

During my career, I've had the opportunity to be a "gate keeper", if you will.  As an incident responder, I was asked to vet resumes that had been submitted in hopes of filling a position on our team.  To some degree, it was my job to receive and filter the resumes, passing what I saw as the most qualified candidates on to the next phase.  I've also worked with a pretty good number of analysts and consultants over the years.

The world of cybersecurity is pretty big and there are a lot of roads you can follow; there's pen testing, malware reverse engineering, DFIR, policy, etc.  There are both proactive and reactive types of work.  The key is to pick a place to start.  This doesn't mean that you can't do more than one...it simply means that you need to decide where you want to start...and then start. Pick some place, and go from there.  You may find that you're absolutely fascinated by what you're learning, or you may decide that where you started simply is not for you.  Okay, no problem.  Pick a new place and start over.

When it comes to reviewing resumes, I tend to not focus on certifications, nor the actual degree that someone has.  Don't get me wrong, there are a lot of great certifications out there.  The issue I have with certifications is that when most folks return from the course(s) to obtain the certification, there's nothing that holds them accountable for using what they learned.  I've seen analysts go off to a 5 or 6 day training course in DFIR of Windows systems, which cost $5K - $6K (just for the course), and not know how to determine time stomping via the MFT (they compared the file system last modification time to the compile time in the PE header).

I am, however, interested to see that someone does have a degree.  This is due to the fact that having a degree pretty much guarantees a minimum level of education, and it also gives insight into your ability to complete tasks.  A four (or even two) year degree is not going to be a party everyday, and you're likely going to end up having to do things you don't enjoy.

And why is this important?  Well, the (apparently) hidden secret of cybersecurity is that at some point, you're going to have to write.  That's right. No matter what level of proficiency you develop at something, it's pretty useless if you can't communicate and share it with others.  I'm not just talking about sharing your findings with your team mates and co-workers (hint, "LOL" doesn't count as "communication"), I'm also talking about sharing your work with clients.

Now, I have a good bit of experience with writing throughout my career.  I wrote in the military (performance reviews, reports, course materials, etc.), as part of my graduate education (to include my thesis), and I've been writing almost continually since I started in infosec.  So...you have to be able to write.  A great way to get experience writing is to...well...write.  Start a blog.  Write something up, and share it with someone you trust to actually read it with a critical eye, not just hand it back to you with a "looks good".  Accept that what you write is not going to be perfect, every time, and use that as a learning experience.

Writing helps me organize my thoughts...if I were to just start talking after I completed my analysis, what came out of my mouth would not be nearly as structured, nor as useful, as what I could produce in writing.  And writing does not have to be sole source of communications; I very often find it extremely valuable to write something down first, and then use that as a reference for a conversation, or better yet, a conference presentation.

So, my recommendations for getting started in the cybersecurity field are pretty simple:
1. Pick some place to start.  If you have to, reach to someone for advice/help.
2. Start. If you have to, reach to someone for advice/help.
3. Write about what you're doing. If you have to, reach to someone for advice/help.

There are plenty of free resources available that provide access to what you need to get started; online blog posts, pod casts/videos, presentations, books (yes, books online and in the library), etc.  There are free images available for download, as part of DFIR challenges (if that's what you're interested in doing).  There are places you can go to find out about malware, download samples, or even run samples in virtual environments and emulators.  In fact, if you're viewing this blog post online, then you very likely have everything you need to get started.  If you're interested in DFIR analysis or malware RE, you do not need to have access to big expensive commercial tools to conduct analysis...that's just an excuse for paralysis.

There is a significant reticence to sharing in this "community", and it's not simply isolated to folks who are new to the field.  There are a lot of folks who have worked in this industry for quite a while who will not share experiences or findings.  And there is no requirement to share something entirely new, that no one's seen before.  In fact, there's a good bit of value in sharing something that may have been discussed previously; it shows that you understand it (or are trying to), and it can offer visibility and insight to others ("oh, that thing that was happening five years ago is coming back...like bell bottoms...").

The take-away from all of this is that when you're ready to put your resume out there and apply for a position in cybersecurity, you're going to have some experience in the work, have visible experience writing that your potential employer can validate, and you're going to know people in the field.

Understanding What The Data Is Telling You

Not long ago, I was doing some analysis of a Windows 2012 system and ran across an interesting entry in the AppCompatCache data:

SYSVOL\Users\Admin\AppData\Roaming\badfile.exe  Sat Jun  1 11:34:21 2013 Z

Now, we all know that the time stamp associated with entries in the AppCompatCache is the file system last modification time, derived from the $STANDARD_INFORMATION attribute.  So, at this point, all I know about this file is that it existed on the system at some point, and given that it's now 2017 it is more than just a bit odd, albeit not impossible, that that is the correct file system modification date.

Next stop, the MFT...I parsed it and found the following:

71516      FILE Seq: 55847 Links: 1
[FILE],[BASE RECORD]
.\Users\Admin\AppData\Roaming\badfile.exe
    M: Sat Jun  1 11:34:21 2013 Z
    A: Mon Jan 13 20:12:31 2014 Z
    C: Thu Mar 30 11:40:09 2017 Z
    B: Mon Jan 13 20:12:31 2014 Z
  FN: msiexec.exe  Parent Ref: 860/48177
  Namespace: 3
    M: Thu Mar 30 11:40:09 2017 Z
    A: Thu Mar 30 11:40:09 2017 Z
    C: Thu Mar 30 11:40:09 2017 Z
    B: Thu Mar 30 11:40:09 2017 Z
[$DATA Attribute]
File Size = 1337856 bytes

So, this is what "time stomping" of a file looks like, and this also helps validate that the AppCompatCache time stamp is the file system last modification time, extracted from one of the MFT record attributes.  At this point, there's nothing to specifically indicate when the file was executed but now, we have a much better idea of when the file appeared on the system. The bad guy most likely used the GetFileTime() and SetFileTime() API calls to perform the time stomping, which we can see by going to the timeline:

Mon Jan 13 20:12:31 2014 Z
  FILE                       - .A.B [152] C:\Users\Admin\AppData\Roaming\
  FILE                       - .A.B [56] C:\Windows\explorer.exe\$TXF_DATA
  FILE                       - .A.B [1337856] C:\Users\Admin\AppData\Roaming\badfile.exe
  FILE                       - .A.B [2391280] C:\Windows\explorer.exe\

Fortunately, the system I was examining was Windows 2012, and as such, had a well-populated AmCache.hve file, from which I extracted the following:

File Reference: da2700001175c
LastWrite     : Thu Mar 30 11:40:09 2017 Z
Path          : C:\Users\Admin\AppData\Roaming\badfile.exe
Company Name  : Microsoft Corporation
Product Name  : Windows Installer - Unicode
File Descr    : Windows® installer
Lang Code     : 1033
SHA-1         : 0000b4c5e18f57b87f93ba601e3309ec01e60ccebee5f
Last Mod Time : Sat Jun  1 11:34:21 2013 Z
Last Mod Time2: Sat Jun  1 11:34:21 2013 Z
Create Time   : Mon Jan 13 20:12:31 2014 Z
Compile Time  : Thu Mar 30 09:28:13 2017 Z

From my timeline, as well as from previous experience, the LastWrite time for the key in the AmCache.hve corresponds to the first time that badfile.exe was executed on the system.

What's interesting is that the Compile Time value from the AmCache data is, in fact, the compile time extracted from the header of the PE file.  Yes, this value is easily modified, as it is simply a bunch of bytes in the file that do not affect the execution of the file itself, but it is telling in this case.

So, on the surface, while it may first appear as if the badfile.exe had been on the system for four years, it turns out that by digging a bit deeper into the data, we can see that wasn't the case at all.

The take-aways from this are:
1.  Do not rely on a single data point (AppCompatCache) to support your findings.

2.  Do not rely on the misinterpretation of a single data point as the foundation of your findings.  Doing so is more akin to forcing the data to fit your theory of what happened.

3. The key to analysis is to know the platform you're analyzing, know your data...no only what is available, but it's context.

4.  During analysis, always look to artifact clusters.  There will be times when you do not have access to all of the artifacts in the cluster, so you'll want to validate the reliability and fidelity of the artifacts that you do have.

Saturday, April 08, 2017

Understanding File and Data Formats

When I started down my path of studying techniques and methods for computer forensic analysis, I'll admit that I didn't start out using a hex editor...that was a bit daunting and more than a little overwhelming at the time.  Sure, I'd heard and read about those folks who did, and could, conduct a modicum of analysis using a hex editor, but at that point, I wasn't seeing "blondes, brunettes, and redheads...".  Over time and with a LOT of practice, however, I found that I could pick out certain data types within hex data.  For example, within a hex dump of data, over the years my eyes have started picking out repeating patterns of data, as well as specific data types, such as FILETIME objects.

Something that's come out of that is the understanding that knowing the structure or format of specific data types can provide valuable clues and even significant artifacts.  For example, understanding the structure of Event Log records (binary format used for Windows NT, 2000, XP, and 2003 Event Logs) has led to the ability to parse for records on a binary level and completely bypass limitations imposed by using the API.  The first time I did this, I found several valid records in a *.evt file that the API "said" shouldn't have been there.  From there, I have been able to carve unstructured blobs of data for such records.

Back when I was part of the IBM ISS ERS Team, an understanding of the structure of Windows Registry hive files led us to being able to determine the difference between credit card numbers being stored "in" Registry keys and values, and being found in hive file slack space.  The distinction was (and still is) extremely important.

Developing an understanding of data structures and file formats has led to findings such as Willi Ballenthin's EVTXtract, as well as the ability to parse Registry hive files for deleted keys and values, both of which have proven to be extremely valuable during a wide variety of investigations.

Other excellent examples of this include parsing OLE file formats from Decalage, James Habben's parsing Prefetch files, and Mari's parsing of data deleted from SQLite databases.

Other examples of what understanding data structures has led to includes parsing Windows shortcuts/LNK files that were sent to victims of phishing campaigns.  This NViso blog post discusses tracking threat actors through the .lnk file they sent their victims, and this JPCert blog post from 2016 discusses finding indications of an adversary's development environment through the same resource.

Now, I'm not suggesting that every analyst needs to be intimately familiar with file formats, and be able to parse them by hand using just a hex editor.  However, I am suggesting that analysts should at least become aware of what is available in various formats (or ask someone), and understand that many of the formats can provide a great deal of data that will assist you in your investigation.

Sunday, March 26, 2017

Links, Updates

LNK Attachments
Through my day job, we've seen a surge in spam campaigns lately where Windows shortcuts/LNK files were sent to the targets as email attachments.  A good bit of the focus has been the embedded commands within the LNK files, and how those commands have been obfuscated in order to avoid detection or analysis.  There is some great work being done in this area (discovery, analysis, etc.) but at the same time, a good bit of data and some potential intelligence is being left "on the floor", in that the LNK files themselves are not being parsed for embedded data.

DFIR analysts are probably most often familiar with LNK files being used to either maintain malware persistence when infected, or to indicate that a user opened a file on their system.  In instances such as these, the MS API for creating LNK files embeds information from the local system within the binary contents of the LNK file.  However, if a user is sent an LNK file, that file must have been created on another system all together...which means that unless a specific script was used to create the LNK file on a non-Windows system, or to modify the embedded information, we can assume that the embedded information (MAC address, system NetBIOS name, volume serial number, SID) was from the system on which the LNK file was created.

I'd blogged on this before (yes, eight years ago), and while researching that blog, had found this reference to %LnkGet% at Microsoft.

I recently ran across this fascinating write-up from NVISO Labs regarding the analysis of an LNK file shipped embedded within an old-style (re: OLE) MSWord document.  While the write-up contains hex excerpts of the LNK file, and focuses on the command used to launch bitsadmin.exe to download malware, what it does not do is extract embedded artifacts (SID, system NetBIOS name, MAC address, volume serial number) from the binary contents of the LNK file.

I know...so what?  Who cares?  How can you even use this information?  Well, if you're maintaining case notes in a collaboration portal, you can search for various findings across engagements, long after those engagements have closed out or analysts have left (retired, moved on, etc.), developing a "bigger picture" view of activity, as well as maintaining intelligence from those engagements.  For example, keeping case notes across engagements will allow a perhaps less experienced analyst see what's been done on previous engagements, and illustrate what further work can be done (just-in-time learning).  Of course then there's correlating multiple engagements with marked similarities (intel gathering).  Or, something to consider is that there are Windows Event Log records that include the NetBIOS name of the remote system when a login occurs, and you might be able to correlate that information with what's seen embedded in LNK files (intel collection/development).

MS-SHLLINK: Shell Link Binary File Format

AutoRun - ServiceStartup
Adam's found yet another autostart location within the Registry, this one the ServiceStartup key in the Windows Registry.  I haven't seen a system with this key populated, but it's definitely something to look out for, as this would make a great RegRipper plugin, or a great addition to the malware.pl plugin.

However, while I was looking around at a Software hive to see if I could find a ServiceStartup key with any values, I ran across the following key that looked interesting:

HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\WSMAN\AutoRestartList

Digging into the key a bit, I found a couple of interesting articles (here, and here).  This is something I plan to keep a close eye on, as it looks as if it could be very interesting.

Analysis
The guys from Carbon Black recently shared some really fascinating analysis regarding "a malicious Excel document was used to create a PowerShell script, which then used the Domain Name System (DNS) to communicate with an Internet Command and Control (C2) server."

This is really fascinating stuff, not only to see stuff you've never seen before, but to also see how such things can be discovered (how would I instrument or improve my environment to detect this?), as well as analyzed.

Imposter Syndrome
I was catching up on blog post reading recently, and came across a blog post that centered on how the imposter syndrome caused one DFIR guy to sabotage himself.  Not long after reading that, I read James' post about his experiences at BSidesSLC, and his offer to help other analysts...and it sounded to me like James was offering a means for DFIR folks who are interested to overcome their own imposter syndrome.

On a similar note, not too long ago, I was asked to give some presentations on cyber-security at a local high school, and during one of the sessions, one of the students had an excellent question...how does someone with no experience apply for an entry-level position that requires experience?  To James' point, I recommended that the students with an interest in cyber-security start working on projects and blogging about them, sharing what they did, and what they found.  There are plenty of resources available, including system images that can be downloaded and analyzed, malware that can be downloaded and analyzed, etc.  Okay, I know folks are going to say, "yeah, but others have already done that..."...really?  I've looked at a bunch of the images that are available for download, and one of the things I don't see is responses...yes, there are some, but not many.  So, download the image, conduct and document your analysis, and write it up.  So what if others have already done this?  The point is you're sharing your experiences, findings, how you conducted your analysis, and most importantly, you're showing others your ability to communicate through the written word.  This is very important, because pretty much every job in the adult world includes some form of written communications, either through email, reports, etc.

Here's what I would look for if I were looking to fill a DFIR analysis position...even if the image had already been analyzed by others, I'd look for completeness of analysis based on currently available information, and how well it was communicated.  I'd also look for things like, was any part of the analysis taken a step or two further?  Did the "analyst" attempt to do all of the work themselves, or was there collaboration?  I'd want to see (or hear, during the interview) justification for the analysis steps along the way, not to second guess, but in order to understand thought processes.  I'd also want to see if there was a reliance on commercial tools, or if the analyst was able to incorporate (or better yet, modify) open source tools.

Not all of these aspects are required, but these are things I'd look for.  These are also things I'd look at when bringing new analysts onto a team, and mentoring them.

PowerShell
posted recently about some interesting Powershell findings that pertain to Powershell version 5.  I found this while examining a Windows 10 system, but if someone has updated the Powershell installation on their Windows 7 and 8 systems, they should also be able to take advantage of the artifacts.

However, something popped up recently that simply reiterated the need to clearly identify and understand the version of Windows that you're examining...Powershell downgrade attacks.

This SecurityAffairs article provides an example of how Powershell has been used.

Wednesday, March 15, 2017

Incorporating AmCache data into Timeline Analysis

I've had an opportunity to examine some Windows 10 systems lately, and recently got a chance to examine a Windows 2012 server system.  While I was preparing to examine the Windows 2012 system, I extracted a number of files from the image, in order to incorporate the data in those files into a timeline for analysis.  I also grabbed the AmCache.hve file (I've blogged about this file previously...), and parsed it using the amcache.pl RegRipper plugin.  What I'm going to do in this post is walk through an example of something I found after the initial analysis,

From the ShimCache data from the system, I found the following reference:

SYSVOL\Users\Public\Downloads\badfile.exe  Fri Jan 13 11:16:40 2017 Z

Now, we all know that the time stamp associated with the entry in the ShimCache data is the file system last modification time of the file (NOT the execution time), and that if you create a timeline, this data would be best represented by an entry that includes "M..." to indicate the context of the information.

I then looked at the output of the amcache.pl plugin to see if there was an entry for this file, and I found the following:

File Reference    : 720000150e1
LastWrite           : Sun Jan 15 07:53:53 2017 Z
Path                    : C:\Users\Public\Downloads\badfile.exe
Company Name : FileManger
Product Name    : Fileppp
File Descr          : FileManger
Lang Code         : 0
SHA-1               : 00002861a7c280cfbb10af2d6a1167a5961cf41accea
Last Mod Time  : Fri Jan 13 11:16:40 2017 Z
Last Mod Time2: Fri Jan 13 11:16:40 2017 Z
Create Time       : Sun Jan 15 07:53:26 2017 Z
Compile Time    : Fri Jan 13 03:16:40 2017 Z

We know from Yogesh's research that the "File Reference" is the file reference number from the MFT; that is, the sequence number and the MFT record number.  In the above output, the "LastWrite" entry is the LastWrite time for the key with the name referenced in the "File Reference" entry.  You'll also notice some additional information that could be pretty useful...some of it (Lang Code, Product Name, File Descr) were values that I added to the plugin today (I also updated the plugin repository on GitHub, as well).

You'll also notice that there are a few time stamps, in addition to the key LastWrite time.  I thought that it would be interesting to see what effect those time stamps would have on a timeline; so, I wrote a new plugin (amcache_tln.pl, also uploaded to the repository today) that would allow me to add data to my timeline.  After adding the AmCache.hve time stamp data to my timeline, I went looking for

Sun Jan 15 07:53:53 2017 Z
  AmCache       - Key LastWrite   - 720000150e1:C:\Users\Public\Downloads\badfile.exe
  REG                 User - [Program Execution] UserAssist - C:\Users\Public\Downloads\badfile.exe (1)


Sun Jan 15 07:53:26 2017 Z
  AmCache       - ...B  720000150e1:C:\Users\Public\Downloads\badfile.exe
  FILE              - .A.B [286208] C:\Users\Public\Downloads\badfile.exe


Fri Jan 13 11:16:40 2017 Z
  FILE               - M... [286208] C:\Users\Public\Downloads\badfile.exe
  AmCache       - M...  720000150e1:C:\Users\Public\Downloads\badfile.exe

Fri Jan 13 03:16:40 2017 Z
  AmCache       - PE Compile time - 720000150e1:C:\Users\Public\Downloads\badfile.exe

Clearly, a great deal more analysis and testing needs to be performed, but this timeline excerpt illustrates some very interesting findings.  For example, the AmCache entries for the M and B dates line up with those from the MFT.

Something else that's very interesting is that the AmCache key LastWrite time appears to correlate to when the file was executed by the user.

For the sake of being complete, let's take the parsed MFT entry for the file:

86241      FILE Seq: 114  Links: 1
[FILE],[BASE RECORD]
.\Users\Public\Downloads\badfile.exe
    M: Fri Jan 13 11:16:40 2017 Z
    A: Sun Jan 15 07:53:26 2017 Z
    C: Fri Feb 10 11:37:25 2017 Z
    B: Sun Jan 15 07:53:26 2017 Z
  FN: badfile.exe  Parent Ref: 292/1
  Namespace: 3
    M: Sun Jan 15 07:53:26 2017 Z
    A: Sun Jan 15 07:53:26 2017 Z
    C: Sun Jan 15 07:53:26 2017 Z
    B: Sun Jan 15 07:53:26 2017 Z
[$DATA Attribute]
File Size = 286208 bytes

We know we have the right file...if we convert the MFT record number (86241) to hex, and prepend it with the sequence number (also converted to hex), we get the file reference number from the AmCache.hve file.  We also see that the creation date for the file is the same in both the $STANDARD_INFORMATION and $FILE_NAME attributes from the MFT record, and they're also the same as the value extracted from the AmCache.hve file.

There definitely needs to be more research and work done, but it appears that the AmCache data may be extremely valuable with respect to files that no longer exist on the system, particularly if (and I say "IF") the key LastWrite time corresponds to the first time that the file was executed.  Review of data extracted from a Windows 10 system illustrated similar findings, in that the key LastWrite time for a specific file reference number correlated to the same time that an "Application Popup/1000" event was recorded in the Application Event Log, indicating that the application had an issue; four seconds later, events (EVTX, file system) indicated an application crash.  I'd like to either work an engagement where process creation information is also available, or conduct testing and analysis of a Win2012 or Win10 system that has Sysmon installed, as it appears that this data may indicate/correlate to a program execution finding.

Now, clearly, the AmCache.hve file can contain a LOT of data, and you might not want it all.  You can minimize what's added to the timeline by using the reference to the "Public\Downloads" folder, for example, as a pivot point.  You can run the plugin and pipe the output through the find command to get just those entries that include files in the "Public\Downloads" folder in the following manner:

rip -r amcache.hve -p amcache_tln | find "public\downloads" /i

An alternative is to run the plugin, output all of the entries to a file, and then use the type command to search for specific entries:

rip -r amcache.hve -p amcache_tln > amcache.txt
type amcache.txt | find "public\downloads" /i

Either one of these two methods will allow you to minimize the data that's incorporated into a timeline and create overlays, or simply create micro-timelines solely from data within the AmCache.hve file.

Oh, and hey...more information on language ID codes can be found here and here.

Addendum: Additional Sources
So, I'm not the first one to mention the use of AmCache.hve entries to illustrate program execution...others have previously mentioned this artifact category:
Digital Forensics Survival Podcast episode #020
Willi's amcache.py parser

Saturday, February 25, 2017

Powershell Stuff, Windows 10

Source: InterWebs
I recently had some Windows 10 systems come across my desk, and as part of my timeline creation process, I extracted the Windows Event Log files of interest, and ran them through my regular sequence of commands.  While I was analyzing the system timeline, I ran across some interesting entries, specifically "Microsoft-Windows-Powershell/4104" events; these events appeared to contain long strings of encoded text.  Many of the events were clustered together, up to 20 at a time, and as I scrolled through these events, I saw strings (not encoded) that made me think that I was looking at activity of interest to my exam.  Further, the events themselves were clustered 'near' other events of interest, to include indications that a Python script 'compiled' as a Windows executable had been run in order to steal credentials.  So, I sort of figured that this was something worth looking into, so during a team call, I posed the question of, "...has anyone seen this..." to the group, and got a response; one of my teammates pointed me toward Matt Dunwoody's block-parser.py script.  My own research had revealed that FireEye, CrowdStrike, and even Microsoft had talked about these events, and what they mean.

From the FireEye blog post (authored by Matt Dunwoody):

Script block logging records blocks of code as they are executed by the PowerShell engine, thereby capturing the full contents of code executed by an attacker, including scripts and commands. Due to the nature of script block logging, it also records de-obfuscated code as it is executed. For example, in addition to recording the original obfuscated code, script block logging records the decoded commands passed with PowerShell’s -EncodedCommand argument, as well as those obfuscated with XOR, Base64, ROT13, encryption, etc., in addition to the original obfuscated code. Script block logging will not record output from the executed code. Script block logging events are recorded in EID 4104. Script blocks exceeding the maximum length of an event log message are fragmented into multiple entries. A script is available to parse script block logs and reassemble fragmented blocks (see reference 5).

While not available in PowerShell 4.0, PowerShell 5.0 will automatically log code blocks if the block’s contents match on a list of suspicious commands or scripting techniques, even if script block logging is not enabled. These suspicious blocks are logged at the “warning” level in EID 4104, unless script block logging is explicitly disabled. This feature ensures that some forensic data is logged for known-suspicious activity, even if logging is not enabled, but it is not considered to be a security feature by Microsoft. Enabling script block logging will capture all activity, not just blocks considered suspicious by the PowerShell process. This allows investigators to identify the full scope of attacker activity. The blocks that are not considered suspicious will also be logged to EID 4104, but with “verbose” or “information” levels.

While the blog post does not specify what constitutes 'suspicious', this was a pretty good description of what I was seeing.  This Windows Powershell blog post contains information similar to Matt's post, but doesn't use the term "suspicious", instead stating that "PowerShell automatically logs script blocks when they have content often used by malicious scripts."  So, pretty interesting stuff.

After updating my Python installation with the appropriate dependencies (thinks to Jamie for pointing me to an install binary I needed, as well as some further assistance with the script), I ran the following command:

block-parser.py -a -f blocks.txt Microsoft-Windows-Powershell%4Operational.evtx

That's right...you run the script directly against a .evtx file; hence, the need to ensure that you have the right dependencies in place in your Python installation (most of which can be installed easily using 'pip').  The output file "blocks.txt" contains the decoded script blocks, which turned out to be very revealing.  Rather than looking at long strings of clear and encoded text, broken up across multiple events, I could now point to a single, unified file containing the script blocks that had been run at a particular time, really adding context and clarity to my timeline and helping me build a picture of the attacker activity, providing excellent program execution artifacts.  The decoded script blocks contained some 'normal', non-malicious stuff, but also contained code that performed credential theft, and in one case, privilege escalation code, both of which could be found online, verbatim.

It turns out that "Microsoft-Windows-Powershell/4100" events are also very interesting, particularly when they follow an identified 4104 event, as these events can contain error messages indicating that the Powershell script didn't run properly.  This can be critical in determining such things as process execution (the script executed, but did it do so successfully?), as well as the window of compromise of a system.

Again, all of this work was done across what's now up to half a dozen Windows 10 Professional systems.  Many thanks to Kevin for the nudge in the right direction, and to Jamie for her help with the script.

Additional Reading
Matt's DerbyCon 2016 Presentation

Saturday, February 04, 2017

Updates, Links

VSCs
Not long ago, I blogged about a means for accessing files within VSCs, which was based on a tweet that I had seen.  However, I could not get the method described in the tweet to work, nor could others.

Dan/4n6k updated his blog post to include a reference to volrest.exe, which is part of the Windows 2003 Resource Kit (free download).  This is a great tool...it is part of the Win2003 resource kit but works on Win10...who knew?

In my earlier blog post, I had tried to make a copy of the System and SAM hives from a VSC; however, I kept receiving errors indicating that the files could not be found.  So, I tried using volrest.exe to see if there were any previous versions of files (in this case, folders) available in my profile folder:









Okay, so, are there any previous versions of the System and SAM files available?








Hhhhmm...that might explain why I wasn't able to copy the files in my tests...there were no previous versions of the files available.

Malware
The ISC handler's diary recently had a really good write-up regarding malware analysis regarding a malicious Office document, "fileless" malware, and UAC bypass.  This is a really good write-up of what the malware does, from start to finish, and provides not only individual indicators, but by providing them in the manner that they're shared, provides a view of the behavior of the malware, as well.  This can be extremely useful for detection, by looking to the individual indicators and seeing how you would detect them, perhaps even in a more general case than what is shared.

Not long ago, I remember reading something that stated that one variant of similar malware was using the same sort of UAC bypass technique, and it was changing the Registry value back to the original value after the "exploit" had completed.  This is why timeline analysis is so important, particularly when coupled with process creation monitoring.  Activity such as this happens far too quickly for a VSC to be created, so you won't have any historical artifacts available. However, the LastWrite time for the key would serve as an indicator, much like the Win32ClockProvider key, or the Policies\Secrets key (per the Volatility team).

Here is another write-up that walks through a similar issue (macro, Powershell, UAC bypass...)...

Ransomware
Not long after sharing some thoughts on ransomware, I ran across this little gem that struck (kind of) close to home.  Assuming that this was, in fact, the case (i.e. 70% of available CCTV cameras taken down by ransomware...), what does this tell us about the implications of something like this?  The ransom wasn't paid, and the systems were 'recovered', but at what cost, in manpower and time?

RegRipper Plugin Updates
I've updated a couple of the RegRipper plugins; maybe the most notable update is to the userassist.pl plugin.  Specifically, I removed the alerts function, and added printing of decoded value names that do not have time stamp values.

One of my co-workers reached to me recently and asked about some differences in the output between the plugin, and what XWays produces.  I took a look at it...he'd provided the NTUSER.DAT...and as I was going over the output, I remembered that when I had first written the plugin, I had it output only those entries that had associated time stamps.  Apparently, he'd seen something of value, so I modified the plugin to output a list of the value names whose data did not contain a time stamp.

I did not modify the userassist_tln.pl plugin, for obvious reasons.