“Detecting the Elusive: Active Directory Threat Hunting”
Sean Metcalf, Trimarc CTO
BSides Charm (Baltimore, MD)
Transcript (courtesy of Trimarc)
This is “Detecting the Elusive: Active Directory Threat Hunting”, and I am Sean Metcalf. I’m the founder of Trimarc, a Security Company, a Microsoft-Certified Master (MCM) in Active Directory. There’s about 100 in the world. I’m also a Microsoft MVP. I’ve spoken about Active Directory attack and defense at a number of conferences. I’m a security consultant and researcher, and as we just found out, I run ADSecurity.org where I post a lot of interesting security information about the Microsoft platform. So what are we going to talk about?
I call this Active Directory Threat Hunting. Threat hunting has a lot of different connotations or ideas behind it. But I like to boil that down to what really do we care about? We need to log the events that will actually detect unusual or malicious or anomalous activity and flow that into our SIEM tool, whatever that may be and then configure some sort of alerting on that system so we can get the data we need to figure out what’s going on. Working backwards from that, what are some of the data sources we actually need. We need some PowerShell logs to figure out what’s going on with PowerShell because a lot of attackers are using PowerShell. They’re obfuscating PowerShell, so how do we figure out when they’re doing that? And how do we look at attacker activity in Active Directory? How are they moving around? How do we get information about what they’re doing in Windows systems? And how can we detect a common attack method called Kerberoasting. That’s what we’re going to get into.
I’m pretty sure everyone is familiar with something like this, right? Any SOC analysts here? Any people that look at these screens? Show of hands. All right. I’m sure that when you get in in the morning and you look at the screen, you might as well be looking at this, right? Where’s Wally, the European version of Where’s Waldo? This is one of the toughest Wally pictures that I’ve seen. I thought this is a good metaphor. We’re looking at the event alert data that our SIEM is flooding us with, and we might as well be looking at this trying to figure out where Wally is. How do we detect the attacker among all of the noise? How do we dig into this? We need some pointers. We need some help. We need to go from this to this, and ultimately, ideally to this. It’s a bit of a cheat, I think. It’s just his head, right?
We need to get to this point where we can see exactly where the attacker activity is or at least something that is completely unusual or anomalous in our environment. How do we figure that out? We’ve got to make sure we’re getting the right data, like I said. PowerShell system security events. There are specific event IDs that we need. Is this logging data properly flowing from our systems into our SIEM tool? And are we seeing these events? We have our SIEM tool collecting it. We have some sort of agent or whatever that’s sending it to there, but how do we know if those events stop flowing from sources? How do we know when those events are no longer coming from systems that we expect them to? And how do we correlate this to something that is useful? How do we get the events that we really need to see or the alerts that we need to see?
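The question of knowing when events stop flowing can be sketched as a simple "last seen" check. This is a minimal sketch, not something from the talk: the function name, the one-hour gap, and the idea that your SIEM can export the newest event time per host are all assumptions.

```python
from datetime import datetime, timedelta

def silent_sources(last_seen, expected_hosts, now, max_gap=timedelta(hours=1)):
    """Return expected hosts whose newest event is older than max_gap, or never seen.

    last_seen: dict of host name -> datetime of the newest event received (assumed SIEM export).
    expected_hosts: set of hosts that should be forwarding events.
    """
    silent = []
    for host in sorted(expected_hosts):
        seen = last_seen.get(host)
        if seen is None or now - seen > max_gap:
            silent.append(host)
    return silent
```

Run on a schedule, a check like this turns "the logs went quiet" into an alert of its own rather than something you discover during an incident.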
It all starts with what is normal, right? Have we baselined our environment? Do we know what we would expect to see? If we have set up a Splunk query and we get a ton of data or results in it, how do we filter that down to what really matters because we need to figure this part out in order to figure this part out. We need to go from normal to anomalous, the unusual things, the suspicious things, the things that we know will be bad in our environment. I said I had about 5 minutes of fluff. Let’s move onto the good stuff.
Enabling this logging provides tremendous insight into what’s going on in your environment, but the logging is only step one. We need to flow this data into our SIEM, into our central event management system. We want to understand what kind of command prompt tools or command line tools are being run in our environment. We want to see PowerShell commands that are being run. We want to have a good understanding of what sort of activity is there and flow it into our SIEM tool because, ultimately, the battle is being fought on our endpoints, on our workstations, and most organizations are not pulling events from their workstations. Nod if you agree. Yeah, I see a lot of head nods. I won’t have people raise hands on that.
How do we get the data from our workstations into our SIEM? A lot of times people say, “I don’t want to push agents out to all those systems. I don’t want to put another agent on a computer that already has 5 or 6 different agents because the users are already complaining about all these agents and the contention for resources that they already have.” Microsoft Windows Event Forwarding (WEF) is a great tool to do that. You don’t need an agent.
Definitely look into Sysmon. Sysmon is a good way to get enhanced auditing of Windows system activity. It’s part of the Sysinternals suite; Microsoft bought the company a number of years ago. Mark Russinovich was one of the founders. They (Microsoft) brought him in and made him a technical fellow. He’s now the CTO of Azure at Microsoft. There’s an install that configures a Windows service with a device driver for 32 and 64-bit versions of Windows, and the configuration is stored in the registry. But effectively there are some key components that it monitors. There are a number of things here like process activity with hashes, image loads, driver loads, file creation time changes, network connections. The one I really like about Sysmon is network connections. We can also look for injection into WinLogon and LSASS. We can look at raw disk reads from the file system. For example, Invoke-NinjaCopy is a PowerSploit tool that can actually pull the NTDS.dit file off the domain controller while it’s running because it’s doing raw access reads off of that file system.
But like I said, I really like looking at Sysmon for what applications are actually connecting on the network, which ones are connecting to the internet, because things like Notepad probably shouldn’t. We want to ignore most of the Microsoft-signed image loads. There are some exceptions. Casey Smith aka @SubTee has identified a number of Microsoft-signed binaries that do some very interesting things. Some of these actually enable you to connect to the internet, pull down code, and execute it. This bypasses whitelisting, and oftentimes the attacker doesn’t need to use PowerShell or any other command line utility because of these tools.
With Sysmon, we go ahead and run the command at the top. We can push it out via group policy. We can push it out via your standard application deployment tool. We can look at the configuration using Sysmon -c. We want to know where Swiper is, right? We want to know when he’s sneaking around and what is happening. In Sysmon, we can get an event like this: Notepad connected to the network. It connected to destination 220.127.116.11. That’s interesting. That’s on the internet. That’s not on my internal network. I’m a 10-Dot. This actually resolves to githubusercontent.com, and in this instance, I reached out and, using Notepad, I connected to GitHub and I opened up Invoke-Mimikatz in Notepad. Now, I could go through and modify that, and I could save it somewhere and run it or figure out a number of different ways to execute it on the system. Without something that’s monitoring what applications are connecting to the network, you’re probably going to miss this.
Windows Event Forwarding, I mentioned it a little bit ago. It’s critical to get our workstation logs into our SIEM. How do we get them there? Microsoft has this built into Windows (via WEF). There’s no agent. It uses WinRM with Kerberos in order to pull that data back. You can configure a Windows event collector on a server or technically a workstation. I would use a server. Then you can use group policy to configure your clients to send their events there. You don’t need all of them, just 10-20 event IDs. Getting this information specifically for PowerShell and specific events that I’m going to mention in a bit will give you tremendous insight into what’s going on in your environment, especially on your workstations. Now, there is a bit of an initial learning curve. There’s a great reference here at the bottom, aka.ms/wef, with some pointers on how to configure WEF in your environment. So when you configure a WEF collector, you can point your clients to one WEF collector, but there’s not any load balancing on that or fault tolerance yet.
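The Notepad example can be sketched as a filter over Sysmon network connection events (event ID 3). This is only a sketch under assumptions: the events are taken to be already parsed into dicts carrying the Sysmon `Image` and `DestinationIp` fields, and the watchlist of programs that should not talk to the internet is hypothetical and would need tuning for a real environment.

```python
import ipaddress
import ntpath

# Hypothetical watchlist: programs that normally have no reason to reach the internet.
UNEXPECTED_TALKERS = {"notepad.exe", "calc.exe", "mspaint.exe"}

def is_suspicious_connection(event):
    """Flag a parsed Sysmon event ID 3 record when a watchlisted
    program connects to a publicly routable address."""
    image = ntpath.basename(event["Image"]).lower()
    destination = ipaddress.ip_address(event["DestinationIp"])
    return image in UNEXPECTED_TALKERS and destination.is_global
```

A connection from Notepad to a 10-dot address stays quiet; the same process reaching a public address gets flagged.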
Let’s talk about PowerShell. We know that attackers are using PowerShell. They’re doing a lot of interesting things with PowerShell, so we need to log PowerShell and see what’s going on because this is what they’re doing. They’re sending out these malicious Word documents that have macros. These macros are running PowerShell code or doing other things on the system because macros are code–third-party code from untrusted sources running on your network, running on your systems. This gives an attacker an initial foothold. Well, the last document I showed you isn’t that interesting from a perspective of “I wouldn’t click on that,” or “Why would I click on that,” but what about one of these? What about the one that says it’s protected by McAfee or the one that says it’s confidential with the DocuSign logo on it? Or the one that says Microsoft Office Enterprise Protection? Or the one that says Office 365 Improve Point? These are fairly compelling, and it’s likely that many of your users would click on enable content when they got one of these. These are in the wild. These are live samples of what attackers are using. You can’t blame the users for opening these. If you’re in contracts, and you get a document like this that says DocuSign confidential encrypted, they’re probably going to open it because a lot of contracts are using DocuSign. Why they’re getting a Word document via email? I don’t know, but people are busy. They have a job to get done. We need to better control macros in our environment, though even if we do control macros, we need to make sure we’re controlling OLE. OLE is a way to embed an object that includes code into an email so when the user double clicks on it and opens it, it actually executes this code. It looks like a Word document, but it’s really code. We can disable this in Outlook via regkey, and I have a URL at the bottom. By the way, all these slides will be available probably tonight or tomorrow on ADSecurity.org or on TrimarcSecurity.com. 
I have a link on the last slide where the presentation will be. [Slides are here: Slides (PDF)]
Module logging: who has PowerShell module logging enabled or some sort of PowerShell logging? Okay, a few. That’s typically what we see with our customers: there is a handful of people that are logging PowerShell. You definitely want to log PowerShell because, again, this is what attackers are using. They love PowerShell. PowerShell gives them tremendous capability without a whole lot of issues or compatibility concerns because pretty much all versions of Windows now have PowerShell on them. Logging is enhanced in v4. You need v3 in order to get PowerShell module logging, but PowerShell version 5 is really where it’s at. You definitely want to deploy PowerShell version 5 on all your workstations. Windows 10 has it built in. And look to deploy it on your servers because we have something called script block logging, which basically logs the smallest component of a PowerShell script, the script block. In doing so, it can actually provide some more information about what the code is and what it’s trying to do because that’s effectively the code that’s delivered to the engine to be interpreted, to be executed. A lot of the older-style obfuscation gets removed before it gets logged. Whereas with module logging you might get a lot of that obfuscation, so you probably wouldn’t get a lot of good insight. System-wide transcripts are pretty impressive and, I think, very useful because you set up a share on your network, a write-once share, so that whenever someone runs anything from PowerShell, it gets logged to a system-wide transcript file that gets put on that share, and you can pull the data from the transcript files on that share. So you can actually see what the person is typing and running in that transcript file. I have found interesting activity in the transcript file that I could not find within the PowerShell events on systems.
The Antimalware Scan Interface (AMSI) in Windows 10 is really great. This is a tool that provides the capability on a Windows 10 system, when someone runs PowerShell code, be it in memory, downloaded from the internet, a file on the system, or just typed in manually, when it gets delivered to the PowerShell engine, PowerShell actually kicks it over to AMSI, and when there’s a registered anti-malware solution that supports AMSI, it can give AMSI a thumbs up or a thumbs down. If it’s a thumbs down, the PowerShell engine will not run it. This works for PowerShell and the other supported scripting tools: CScript, JScript, WSH, VBScript. If the anti-malware solution understands and is watching the AMSI event flow, then it can be blocked.
Here’s the problem though. Our vendors are not listening. They don’t seem to care. Please, please, please reach out to your antivirus vendor and say, “We’re going to be deploying Windows 10. We want you to support AMSI.” Microsoft Defender supports it, and AVG is on board. ESET is on board as well. The big ones say, “We do memory scraping.” Okay, that’s nice, but how do you know you’re not missing something? And you’re letting it run. AMSI can block it.
With PowerShell, a lot of customers say, “You know what? We’re just going to block PowerShell, and that way we won’t have to worry about it.” Well, there are some benefits in actually locking PowerShell.exe down so potentially code won’t run. But let’s talk about running PowerShell code from an executable, from a compiled binary, because Microsoft provides this nice sample code here on the right on MSDN on how to execute PowerShell code from an executable. Ben Ten actually put out a tool called Not PowerShell. It looks like the PowerShell console, but it’s NPS.exe and it just calls the PowerShell engine and then runs the PowerShell code. We can do this through .NET from any executable using PowerShell ps = PowerShell.Create().
The tool I do want to mention, which I think is very interesting, is PS>Attack. PSAttack.exe is a self-contained custom PowerShell console (written by Jared Haight, @jaredhaight) which includes many of the popular offensive PowerShell tools, and what’s interesting about it is they’re encrypted within that executable. When you run PS>Attack they get decrypted and loaded directly into memory. So guess what, antivirus vendors? You’re not going to see that. AMSI can catch this. So it gets decrypted, loaded into memory, and we have a number of interesting tools we can run, like Invoke-Mimikatz. PowerShell.exe is not running on this system. What we’re running is PSAttack.exe, which is calling the PowerShell engine.
Another interesting thing is that PowerShell has different language modes. Constrained Language Mode is a way you can lock down PowerShell to its core elements, and things like Invoke-Mimikatz don’t work in Constrained Language Mode. Microsoft made the design decision that any executable that’s calling the PowerShell engine with PowerShell code uses the Full Language Mode. So on this system we have Constrained Language Mode enabled, but using PSAttack, we can still run Invoke-Mimikatz.
The other interesting thing about this is on Windows 7. It had PowerShell version 2 by default. We then put PowerShell version 3, version 4, or version 5 on it. Running PSAttack actually runs the PowerShell code connecting to the PowerShell v2 engine. There are no advanced security features in PowerShell v2. There’s no module logging. There’s no script block logging. There’s no transcription. That means when we run Invoke-Mimikatz within PSAttack it’s connecting to that PowerShell v2 engine, and it doesn’t show up in the operational log. All that logging just isn’t there. We could do this normally just by calling PowerShell with -v 2 and then running Get-Process, which won’t show up in our log, but if we open up PowerShell normally and we run Get-Service, it’ll be there.
How do we detect this sort of thing? If we have a tool where we can look at the processes on a system and see what modules actually get loaded, we can look for System.Management.Automation, which is the PowerShell engine. It’s that DLL. We can see here PowerShell, PowerShell, PSAttack. Well, I could have renamed PSAttack, but it’s more interesting when I don’t. When we dig into this we can look at the modules that are there, and there’s System.Management.Automation.ni.dll, which is just the native image file of that DLL, which is called by PSAttack. How do we detect this if our nice, new PowerShell logging isn’t working? We look at the old PowerShell logging. This is the engine started, the engine stopped. You can see here it says the host name and the host version of that tool, and then at the bottom it says “engine version = 2.” Well, now we know that it called PowerShell version 2. If we had PowerShell v3, v4, v5 deployed everywhere and we see this, we know that this is something that’s a little suspicious. We need to look into why this is happening. It starts with getting these logs back from our endpoints and our servers and getting them into Splunk or our SIEM tool or whatever else we have for central logging.
We can detect this, like I said, event 400, 800, the standard PowerShell logging, by looking for an engine version that’s less than our standard deployment of PowerShell, looking for System.Management.Automation hosted in these nonstandard processes. This is not a PSAttack problem or a PowerShell problem because PSAttack is calling PowerShell. We could just as easily drop an executable that does everything we want it to and have it do those calls. We don’t necessarily need PowerShell to do that when we can run any executable on our system that we want.
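The engine-version check can be sketched against the text of a classic “Windows PowerShell” event 400 (engine started). A minimal sketch, assuming the message has been pulled into a string that contains an `EngineVersion=` line as those events normally do, and assuming a hypothetical baseline of version 5.0 as the standard deployment.

```python
import re

def engine_version(message):
    """Extract (major, minor) from the EngineVersion= line of an event 400 message."""
    match = re.search(r"EngineVersion=(\d+)\.(\d+)", message)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))

def is_engine_downgrade(message, baseline=(5, 0)):
    """True when the engine that started is older than the deployed baseline (assumed 5.0)."""
    version = engine_version(message)
    return version is not None and version < baseline
```

Pairing a hit with the `HostApplication=` line of the same event shows which executable started the old engine, which is how a renamed console would stand out.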
So you want to remove the PowerShell v2 engine when it’s not being used. You can do this starting in Windows 8 and 2012 and newer. It still requires the .NET Framework 3.5 for someone to actually leverage that PowerShell v2, but a lot of organizations have already deployed this broadly to their systems.
Invoke-Obfuscation is a tool that was released by Daniel Bohannon at DerbyCon last year, and this takes obfuscation to the next level. Effectively what it does is you can feed it script code, you can feed it a script, and it’ll go through and completely obfuscate it, so a lot of the signatures that blue teams have used for a while now to look for malicious PowerShell don’t work anymore. Same thing with antivirus. I took this function from Invoke-Mimikatz, Get-ImageNtHeaders. It’s one of the core components of Invoke-Mimikatz.
We go through here, and we can signature off the function name or off a number of these different things like HDR64_MAGIC, but when we run this through Invoke-Obfuscation, it looks like this, and all the signatures don’t work anymore. This is at a base level within Invoke-Obfuscation. I also took the PowerView tool that Will Schroeder wrote, which is a great way to do AD recon, to get information about the Active Directory environment from PowerShell. Obfuscation basically bypasses AV: when we run Invoke-Mimikatz normally, it doesn’t work because AV catches it. When we run the obfuscated version, it does run. That’s what I meant when I said it could get around antivirus.
Let’s look at the PowerView sample. When I ran PowerView through Invoke-Obfuscation, it looks like this. We’re not going to be able to signature off of that pretty easily because, if I ran it through at a higher level, deeper obfuscation, it almost starts looking like machine code. We’re not going to be able to look for standard English words, phrases, etc. in this jumble.
What do we do to find this? Lee Holmes (@lee_holmes), who wrote all of the security features in PowerShell v5, did this blog post on his site in October, and he pointed out how, when looking at 300-500 PowerShell scripts from the download script library, he saw that the top 9 characters in standard PowerShell scripts were letters. On the right, he looked at a bunch of obfuscated code, and he saw that the top ones are all special characters. By looking at the type and distribution of the characters in the script, we can pretty easily determine whether it’s obfuscated or not.
This leads us to the next question. How do we find it? We get PowerShell v5 out. We enable script block logging. We can look at the length of the PowerShell command because obfuscation is going to add a whole lot of extraneous characters and symbols to it in order to do the obfuscation. Look for brackets, lots of quotes. There’s a lot more of these than you would typically see in your normal PowerShell code. We can look for other things that are like random.
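The character-distribution idea and the length check can be sketched together. This is a rough sketch, not Lee Holmes’s actual method: the function names and both thresholds are assumptions you would tune against a baseline of your own script block logs.

```python
from collections import Counter

def symbol_ratio(script):
    """Fraction of non-whitespace characters that are neither letters nor digits."""
    chars = [c for c in script if not c.isspace()]
    if not chars:
        return 0.0
    symbols = sum(1 for c in chars if not c.isalnum())
    return symbols / len(chars)

def top_characters(script, n=9):
    """The n most frequent non-whitespace characters, case-folded."""
    counts = Counter(c.lower() for c in script if not c.isspace())
    return [c for c, _ in counts.most_common(n)]

def looks_obfuscated(script, max_symbol_ratio=0.3, max_length=5000):
    """Hypothetical thresholds: very long commands or symbol-heavy text get flagged."""
    return len(script) > max_length or symbol_ratio(script) > max_symbol_ratio
```

An ordinary pipeline such as `Get-Process | Where-Object { $_.CPU -gt 100 }` sits well under the ratio, while a string-format reordering trick full of braces and quotes lands well over it.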
Last year when I was here, I talked about PowerShell security. Some of these slides are very familiar to you if you sat in that. I also showed the offensive PowerShell detection cheatsheet, which I’ve updated since then and update regularly, but this is a good way to start looking for malicious offensive PowerShell use in your environment. Pull the PowerShell log data back, put it in your SIEM tool, and use these indicators to look for it. Stuff like kernel32.dll is going to be more noisy, but I looked at all of the 80 PowerShell offensive toolkits and identified what the most common calls were in them, basically the key functionality and what they needed from Windows to work. I got this list, and I’ve used this list to actually detect some interesting behavior at customer sites.
All right, Sean, that’s PowerShell. We’re here to talk about Active Directory, right? So let’s get into auditing attack activity in an environment. Let’s figure out how we can see what’s going on. When we’re talking about Windows logging, originally there were 9 audit settings. In Windows 7 and 2008 R2, we get 53 now. The problem is that a lot of times in organizations that have had Windows over the years and had Active Directory since 2000, 2003, they’re still operating with these original 9 audit settings. The other thing we get with Windows 7 and 2008 R2 is special logon auditing, which is fantastic because there are companies I work with that can’t log all of the logon data from all their systems. That’s a lot of us. But what we can do is we can look in particular at specific groups, specific accounts and audit when they log on and where they log on.
This is the Advanced Audit Policy. As we can see here, there’s a lot of granular data that we can get out of this by configuring success and failure, and this top left configuration within the group policy, we want to make sure that we enable this. It says Audit: Force audit policy subcategory settings Windows Vista or later to override audit policy category settings. If you don’t have that set and you have the standard policy audit setting and the advanced one, Windows is going to use this, which means you’re not getting all of that more granular event ID data. You’re not going to get the things you really need to figure out what’s going on in your environment.
How do you figure this out? Auditpol.exe is a great way to get information about how that system is configured, so you certainly want to run this on one of your domain controllers to see what sort of event log data auditing you actually have and are capturing. I usually recommend these settings for auditing on domain controllers. They’re very similar for auditing on other systems. Again, these slides will be available later. I’m going to call out specifically Kerberos Service Ticket Operations and Special Logon, and I’ll show you in a bit why that matters.
Special logon auditing logs event ID 4964 when a user who is a member of one of these groups that you’ve configured within this special logon group auditing logs on. You can track these logons based on who is in these groups. They’re logged on the system to which the user authenticates, so if a domain admin logs onto a workstation and you have domain admins configured with auditing and you’re flowing that data from that workstation to your SIEM, you’re going to see that logon activity and you’re going to be able to talk to that domain admin and say, “How are you logging onto this workstation? It’s not a good idea.”
Jessica Payne (@jepayneMSFT) did a great article, and I have the link here in the bottom to this article, talking about recommendations how to configure this (Special Groups). These are groups that she recommends and I recommend as well. Create a custom group. Call it whatever you want. Add it to this. Any users that are of interest, maybe they’re executives, maybe they have special powers within the organization that aren’t actually administrators, add them to this special group, and that way you can see where they’re logging on. Pull this event ID from all your systems.
So we configure this. We configure audit special logon success and failure. We need the SIDs for each of our groups in Active Directory, so here we do domain admins, enterprise admins, and special group auditing, which is a group I created and got the SID for. In this group policy, we need to configure a registry key, which is LSA audit special groups, and then we put in the SIDs for the groups that we want to audit. Once we do that and deploy it, these systems will log event ID 4908, and it’ll list the special groups that it’s auditing logons for. On the right, we can see that Luke Skywalker, the domain admin in this Active Directory environment, logged onto this system and he’s also a member of this special group auditing, a couple of different groups, so that’s why he’s getting audited when he logs on.
Someone asked me before this started, “It would be really great to have a list of all the event IDs that we need to get from the domain controllers in Windows.” What do you guys think? Wouldn’t that be useful? Have you found that anywhere? No.
Our SIEM vendors have said, “Hey, log these events. Throw them in this event ID bubble. Let’s get what those are. Let’s capture those.” Has that event ID bubble turned into a crystal ball where we can see malicious activity on our networks? What’s that? It gets murkier! Yeah, exactly. Let’s talk about the event IDs that matter on domain controllers.
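As a sketch of the registry piece: to my understanding the special groups setting is a single string of semicolon-separated SIDs (the `SpecialGroups` value under `HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Audit`), so treat that path as an assumption to verify against Microsoft’s documentation. The helper name and the SIDs in the usage note are made up for illustration.

```python
import re

# Loose sanity check for SID strings such as S-1-5-21-<domain>-512.
SID_PATTERN = re.compile(r"^S-1-\d+(-\d+)+$")

def special_groups_value(sids):
    """Build the semicolon-delimited string for the special groups registry value."""
    for sid in sids:
        if not SID_PATTERN.match(sid):
            raise ValueError("not a SID: " + repr(sid))
    return ";".join(sids)
```

For example, `special_groups_value(["S-1-5-21-1111111111-2222222222-3333333333-512", "S-1-5-21-1111111111-2222222222-3333333333-519"])` yields one string covering Domain Admins and Enterprise Admins in a hypothetical domain; the SID of a custom auditing group would be appended the same way.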
Kerberos Auth Ticket was requested. This is the initial Kerberos login: Kerberos service ticket request, custom special group logon tracking, logon failure, SID history added or attempted, DSRM account password change attempt (This is the local administrator account on domain controllers), ACLs set on admin accounts, Kerberos policy changing, attempt to reset account password specifically for admins or sensitive accounts, and then a number of group modifications and Active Directory modifications. The bottom one we can use to monitor for group policy changes. This is a really good idea, especially to see who is changing group policies at the domain level, the domain controllers OU, and where our accounts are, where our workstations are, where our servers are. But more than domain controllers, we want to also audit specific event IDs on all Windows systems, so someone cleared the event log. Local security authority modification: This is the logon system. This is what stores the logged on credentials in LSASS, that protected memory space. Explicit credential logon, handle to an object requested. This is typically accessing the local SAM on a Windows system or the Active Directory database on a domain controller. Special group auditing, account password change, new service was installed, new scheduled task was created or a scheduled task was modified. This is how attackers are persisting on networks today. If we’re not monitoring this, we’re missing it. Then of course, modifications of the local group on a Windows system, so modification of administrators. Attackers love creating another user account on a system and adding it to the local administrators group. Why? Because it’s a great way to persist and very few companies are monitoring for this.
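The descriptions above can be sketched as a forwarding filter. The talk names the events but the numbers were on the slide, so the IDs below are my own mapping from Microsoft’s security auditing documentation and cover only part of the list; verify them before relying on this. The function name is hypothetical.

```python
# Assumed mapping of the descriptions in the talk to Windows Security event IDs.
DOMAIN_CONTROLLER_EVENTS = {
    4768: "Kerberos authentication ticket (TGT) requested",
    4769: "Kerberos service ticket requested",
    4625: "Logon failure",
    4765: "SID History added to an account",
    4766: "Attempt to add SID History failed",
    4794: "Attempt to set the DSRM administrator password",
    4780: "ACL set on accounts in administrators groups",
    4713: "Kerberos policy changed",
    4724: "Attempt to reset an account password",
    5136: "Directory service object modified",
}

ALL_SYSTEM_EVENTS = {
    1102: "Audit log cleared",
    4648: "Logon attempted with explicit credentials",
    4661: "Handle to an object requested",
    4964: "Special groups assigned to a new logon",
    4723: "Attempt to change an account password",
    4697: "Service installed",
    4698: "Scheduled task created",
    4702: "Scheduled task updated",
    4732: "Member added to a security-enabled local group",
}

def should_forward(event_id, is_domain_controller):
    """Decide whether an event belongs in the set we send to the SIEM."""
    if event_id in ALL_SYSTEM_EVENTS:
        return True
    return is_domain_controller and event_id in DOMAIN_CONTROLLER_EVENTS
```

Keeping the two lists separate mirrors the split in the talk: a short list for every Windows system, plus the directory-specific events that only make sense on domain controllers.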
In newer versions of Windows, with 8.1 and 2012 R2 or newer, we get LSASS auditing. Basically anything that tries to inject or connect, plug into LSA, which again is our key authentication component. Mimikatz does this. We can enable auditing to see which tools or components, plugins, drivers, etc. connect into LSASS and are communicating with it. Keep in mind though, before you do this, you do want to test because, if you have a security product that shims LSASS to get information about what’s happening on that Windows system, it will cause problems. So be careful, but you definitely want to test these out and use these advanced features, looking at drivers that fail to load, connecting to LSA. You can protect it and say only signed binaries can connect to LSA and plug into it.
A note about logon types in event ID 4624. This is the most numerous event ID you will find in any network. System logon type zero occurs sometimes, and depending on the specific activity attackers use, you may see this logged. The ones in bold are the ones you definitely want to pull. You don’t need all 4624s, but the logon types in bold you definitely want. Batch and service: a scheduled task started or a scheduled task ran and a service started, and what the account name was that was used to start those or run those. New credentials, which is RunAs /NetOnly, and remote interactive, which is RDP.
Let’s talk about RunAs /NetOnly. People have heard of using RunAs /NetOnly to put honey tokens into systems. I’m not talking about that. Let’s look at an interesting situation that Jessica Payne with Microsoft again pointed out. This is a way to lock down your workstations so domain admins cannot log into them, which makes sense. You prevent them from logging on interactively. They can’t connect. They can’t RDP. They can’t run batch jobs or services. Even if they log on as a regular user account, they cannot do RunAs. It makes sense. However, with RunAs /NetOnly, it just loads the credential into memory, and that credential is leveraged when it’s called. This means that this DA account can run the Active Directory Users and Computers MMC after logging on as a regular user, which is pretty interesting. This is a way that a domain admin could bypass these security controls because maybe they don’t understand why the security controls are there in the first place. We also have to make sure the admins understand why these security controls are there because, by doing this, the DA has just loaded their credentials into memory, which could be pulled by something like Mimikatz. To mitigate this, obviously you can block RunAs from specific groups that are logged onto that system.
Let’s talk about password spraying. Password spraying is really interesting because it’s automated password guessing. There’s a list of passwords we’re going to try. We start with the first one. We use that first password to authenticate as every single user in Active Directory. We run through the list. Lockout threshold is 5, so we can try 4 for every user. Then we wait for 30 minutes, 31 minutes, and then we try again. This works a lot of the time because users have bad passwords. We can connect to an SMB share or a network service, so let’s start with connections to the PDC’s netlogon share. We run it through. We get a bunch of passwords. 4625 logon failure, most organizations are logging that, so we should be able to see that if we’re looking for a whole bunch at a time. We can also look at the user attribute of when that password was last attempted. We can see here it’s within the same timeframe. That’s unusual. Let’s say, instead of connecting to SMB, we connect to the LDAP service on a domain controller. What happens? No more 4625s. Wow! Where did they go? A lot of organizations are monitoring for 4625s, but if we connect to the LDAP service for password spraying, you wouldn’t see this. You have to get 4771s. Kerberos preauthentication failed: That’s not very useful. Why would I log that? We password spray. We see 4771s. Last bad password attempt shows up, and we look at this event, 4771: failure code is 0x18. That means bad password.
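Pulling only the logon types you care about from the flood of 4624s can be sketched as below. The numeric logon type values are the standard Windows ones; which types were bold on the slide is not in the transcript, so the keep-set here is an assumption based on the four types named in the talk, and the dict keys are hypothetical.

```python
LOGON_TYPES = {
    2: "Interactive",
    3: "Network",
    4: "Batch",
    5: "Service",
    7: "Unlock",
    8: "NetworkCleartext",
    9: "NewCredentials",      # RunAs /NetOnly
    10: "RemoteInteractive",  # RDP
    11: "CachedInteractive",
}

# Assumed keep-set: the types called out by name in the talk.
KEEP_TYPES = {4, 5, 9, 10}

def interesting_logons(events):
    """Keep 4624 events whose logon type is in KEEP_TYPES and label them."""
    kept = []
    for event in events:
        if event["event_id"] == 4624 and event["logon_type"] in KEEP_TYPES:
            labeled = dict(event)
            labeled["logon_type_name"] = LOGON_TYPES[event["logon_type"]]
            kept.append(labeled)
    return kept
```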
We also get 4648s on the workstation or the system that the attacker is running password spraying on. We’ll see a bunch of these where Joe User logged on and attempted to use the credentials for Alexis Phillips or Christopher Kelley or whoever and a bunch of those within seconds of each other. That’s pretty unusual.
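The workstation-side view can be sketched the same way: count how many different target accounts a single logged-on user tried explicit credentials for within the same minute. The field names (`host`, `subject_user`, `target_user`) are hypothetical normalized names, and the threshold is an assumption to tune.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def explicit_credential_fanout(events, min_targets=10):
    """Return {(host, subject, minute): distinct target count} for 4648 events
    where one subject used credentials for many different accounts.

    Fixed one-minute buckets keep this simple; a burst that straddles a
    minute boundary is split across two buckets and may be undercounted.
    """
    targets = defaultdict(set)
    for event in events:
        if event["event_id"] != 4648:
            continue
        minute = event["time"].replace(second=0, microsecond=0)
        targets[(event["host"], event["subject_user"], minute)].add(event["target_user"])
    return {key: len(users) for key, users in targets.items() if len(users) >= min_targets}
```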
One of the methods attackers use is something that I call SPN Scanning, which is basically asking Active Directory what are the services that are using Kerberos in the environment because, in order for Kerberos authentication to work, a service has to have a service principal name associated with it, and that service principal name associates that service account with a service running on a server. It looks like this. We can do SPN Scanning, and we can get information on all the SQL servers in the environment because they have SPNs registered. We get the port number. We can get the instance name. We know that the SQL service account is associated with it. Again, we can do this just by asking the domain controller. We don’t need to do any kind of port scanning.
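To make the structure of a service principal name concrete, here is a small parser sketch. It assumes the common `serviceclass/host:port/servicename` layout, where the port and the trailing service name are optional; the host names in the usage example are made up.

```python
from typing import NamedTuple, Optional


class ServicePrincipalName(NamedTuple):
    service_class: str
    host: str
    port_or_instance: Optional[str]
    service_name: Optional[str]


def parse_spn(spn: str) -> ServicePrincipalName:
    """Split an SPN string into service class, host, port/instance, and service name."""
    parts = spn.split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"not a service principal name: {spn!r}")
    host, _, port = parts[1].partition(":")
    service_name = parts[2] if len(parts) == 3 else None
    return ServicePrincipalName(parts[0], host, port or None, service_name)


def sql_servers(spns):
    """Pick out SQL Server SPNs (service class MSSQLSvc) from a list of SPN strings."""
    return [parse_spn(s) for s in spns if s.lower().startswith("mssqlsvc/")]
```

For example, `parse_spn("MSSQLSvc/adsdb01.example.com:1433")` yields the service class `MSSQLSvc`, the host, and the port, which is exactly the inventory the talk describes getting without any port scanning.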
At the bottom, all we need to do is get a list of all of the users in the domain that have a service principal name. Those are service accounts. That’s useful because then we can do something called Kerberoasting. We get a list of all the service accounts in the organization because they have service principal names. We pull a service principal name from each of them. We request a service ticket using RC4 encryption for each of those. We get a service ticket encrypted with RC4, which means that it’s encrypted with that service account’s NTLM password hash. It’s interesting, right? The cool thing about this is we do this as a user, and we never connect to any of those servers. We just ask the domain controller.
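Defenders can run the same query to inventory which accounts are exposed. The sketch below shows one commonly used LDAP filter for user objects with an SPN, plus an in-memory equivalent over simplified records. The record layout is hypothetical (real directory entries carry a full distinguished name in `objectCategory`), and no directory connection is made here.

```python
# LDAP filter for user (not computer) objects that have at least one SPN.
SERVICE_ACCOUNT_FILTER = "(&(objectCategory=person)(objectClass=user)(servicePrincipalName=*))"


def accounts_with_spn(entries):
    """In-memory equivalent of the filter above, over simplified dict records.

    Each record is assumed to have sAMAccountName, objectCategory
    ("person" or "computer"), and a servicePrincipalName list.
    """
    return [
        entry["sAMAccountName"]
        for entry in entries
        if entry.get("objectCategory") == "person" and entry.get("servicePrincipalName")
    ]
```

Every account this returns is a candidate for the long-password and AES settings discussed later in the talk.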
It looks like this. We use a PowerShell command at the top, which enables us to request the service ticket for this service principal name, and at the bottom, we run klist and we can see that we got that service ticket. In this instance, it’s adsdb01. It’s a SQL server, and it’s RC4. We can use Mimikatz to pull this out of the user memory space, but since it is user memory space, the user has access to this ticket. We can use a PowerShell tool in order to do it or something else to just pull it out, save it as a file, offload it to our attacker system on the internet somewhere, and run Kerberoast, which is what Tim Medin released at DerbyCon a few years ago. Since then, most of the password crackers have been updated for this. You can use the GPUs in your heavy-duty password cracking rig to crack service account passwords offline. If humans have created these, generally they can be cracked. If your minimum account password length for your domain is the default, which is 7, we’re going to crack all of these probably because most people go with whatever the minimum is. Even if it’s 10 or 12, it’s very likely it can be cracked.
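A rough way to see why password length matters for offline cracking is to compare brute-force keyspaces. The sketch assumes passwords drawn uniformly at random from the 95 printable ASCII characters, which is an upper bound: human-chosen passwords are far more predictable than this arithmetic suggests.

```python
import math

PRINTABLE_ASCII = 95  # assumption: uniform random choice over printable ASCII


def keyspace_bits(length, alphabet_size=PRINTABLE_ASCII):
    """Size of the brute-force keyspace in bits: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


def relative_work(short_length, long_length, alphabet_size=PRINTABLE_ASCII):
    """How many times larger the longer keyspace is than the shorter one."""
    return alphabet_size ** (long_length - short_length)


if __name__ == "__main__":
    for length in (7, 12, 25):
        print(f"{length:2d} characters: about {keyspace_bits(length):.0f} bits")
```

Each extra random character multiplies the attacker's work by the alphabet size, which is why system-generated 25-character service account passwords sit in a different class from a 7-character minimum.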
How do we detect this? About a year and a half ago I put up a blog post on potential detection: looking on the network for TGS-REQ packets with RC4, and searching for excessive 4769 events. Recently I decided to look at this again and ask how can we detect this. I put up a post on the Trimarc Security website as well as on AD Security, and this is the nuts and bolts of it: Look at event ID 4769 with specific ticket options as well as ticket encryption type 0x17, which is RC4. We filter out service accounts. We filter out computers in the service name and the account name fields, keeping in mind that inter-forest tickets use RC4 by default. ADFS tends to use it as well. By doing this, we can take those 4769 events, which we may have millions of, and take it down to maybe a thousand or so. At that point, we’re really digging into what’s going on in our environment. When an attacker uses something like this sample code to Kerberoast, getting service tickets for all those different service principal names in the organization’s domain, and then runs klist, all of these are RC4. When we look here, we see RC4, RC4, RC4, RC4. All they’re doing is requesting RC4 service tickets, and we can look at that by using these indicators. We see Joe User at 9:36 requested the service ticket for Citrix VDI, Microsoft BizTalk, Business Objects, Microsoft’s AGPM group policy management console, and four different SQL servers all within a couple of seconds of each other. That is extremely suspicious. I don’t know what Joe is doing. Maybe he’s trying to work overtime or figure out some security stuff, but he’s up to no good.
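The filtering steps above can be sketched as a small function over normalized 4769 events. The talk only says "specific options"; the ticket options value `0x40810000` used here is an assumption, as are the event field names, the window, and the threshold. Treat it as a starting point to adapt, not as the exact rule from the blog post.

```python
from collections import defaultdict
from datetime import datetime, timedelta

RC4_HMAC = "0x17"                      # ticket encryption type for RC4
ASSUMED_TICKET_OPTIONS = "0x40810000"  # assumption: not stated in the talk


def is_kerberoast_candidate(event, known_service_accounts=frozenset()):
    """Apply the filters: 4769, RC4, expected options, no computers, no service accounts.

    known_service_accounts holds lowercase account names without the realm.
    """
    if event["event_id"] != 4769:
        return False
    if event["ticket_encryption_type"].lower() != RC4_HMAC:
        return False
    if event["ticket_options"].lower() != ASSUMED_TICKET_OPTIONS:
        return False
    service = event["service_name"].lower()
    if service == "krbtgt" or service.endswith("$"):
        return False  # tickets for krbtgt or for computer accounts
    requester = event["account_name"].lower().split("@", 1)[0]
    if requester.endswith("$") or requester in known_service_accounts:
        return False  # requests made by computers or by known service accounts
    return True


def suspicious_requesters(events, window=timedelta(seconds=10), min_services=5,
                          known_service_accounts=frozenset()):
    """Return {account: distinct service count} for accounts that requested RC4
    tickets for many different services inside one sliding window."""
    per_account = defaultdict(list)
    for event in events:
        if is_kerberoast_candidate(event, known_service_accounts):
            per_account[event["account_name"]].append((event["time"], event["service_name"]))

    flagged = {}
    for account, requests in per_account.items():
        requests.sort()
        start = 0
        for end in range(len(requests)):
            while requests[end][0] - requests[start][0] > window:
                start += 1
            services = {name for _, name in requests[start:end + 1]}
            if len(services) >= min_services:
                flagged[account] = max(flagged.get(account, 0), len(services))
    return flagged
```

Inter-forest and ADFS traffic, mentioned above as legitimate RC4 users, would need their own exclusions before a rule like this is quiet enough to alert on.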
“All right, Sean, but what if the attacker decided to request these but did one a day? We would really never see that, right? How would we know that Joe is doing something he shouldn’t?” I figured we could set up a honeypot account where we created a new service principal name that doesn’t exist. I made it up, it’s not associated with an application, and no one should ever request it. We set AdminCount to 1, which makes it look like it could be an account with domain-level rights. When the attacker goes through and says, “I want to know all of the user accounts that have AdminCount set to 1 and a service principal name. Give them to me,” we see the Kerberos honeypot. Hopefully they don’t actually look at the service principal name they requested a ticket for because it’s a trap. We can see that it’s an RC4 ticket. Pretty standard. Then we go back to our indicator in our search, and we see that Kerberos Honeypot is there. Now we know that Joe is up to no good. He is requesting service tickets for a system that does not exist on this network, and if we just look for that specific service name, which we don’t have to name Honeypot, we can call it something else, we can see that it happened. From millions to thousands to one: one guaranteed high-fidelity event. Attackers are doing this.
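Because nobody should ever request a ticket for the honeypot, the check needs no threshold and no time window: any match is an alert. A minimal sketch, with a hypothetical honeypot account name and hypothetical event field names:

```python
# Hypothetical honeypot service account name; pick something that blends in.
HONEYPOT_SERVICE_NAMES = {"svc-legacysql"}


def honeypot_hits(events, honeypot_names=HONEYPOT_SERVICE_NAMES):
    """Return (time, requesting account, client address) for every 4769 event
    whose service name is a honeypot account. Any hit at all is worth an alert."""
    names = {name.lower() for name in honeypot_names}
    return [
        (event["time"], event["account_name"], event["client_address"])
        for event in events
        if event["event_id"] == 4769 and event["service_name"].lower() in names
    ]
```

This catches the one-ticket-a-day attacker that a volume-based rule would miss.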
But there’s more. In the newer versions of Windows, when you have them on your domain controllers, you get these new checkboxes: “This account supports Kerberos AES 128/256,” right? If you set this on a service account that has a service principal name and your application supports AES Kerberos, which most of the Windows ones do, this changes the game because now when the attacker runs the standard Kerberoasting PowerShell script, they get an AES ticket. This isn’t going to work 100% of the time because this basically tells the DC the account prefers AES: give them an AES ticket. There’s a way to say, “I really need an RC4 ticket. Give that to me,” but that’s another indicator. That’s other activity that you can look for.
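Those checkboxes correspond to bits in the account's `msDS-SupportedEncryptionTypes` attribute. The sketch below decodes that bitmask and flags RC4 ticket requests for services that advertise AES, which is the extra indicator mentioned above. The event field names and the lookup table of account settings are hypothetical inputs you would build from your own directory and logs.

```python
# Bit values of the msDS-SupportedEncryptionTypes attribute.
ENC_TYPE_FLAGS = {
    0x01: "DES-CBC-CRC",
    0x02: "DES-CBC-MD5",
    0x04: "RC4-HMAC",
    0x08: "AES128-CTS-HMAC-SHA1-96",
    0x10: "AES256-CTS-HMAC-SHA1-96",
}
AES_MASK = 0x08 | 0x10


def decode_supported_enc_types(value):
    """List the encryption types enabled in a msDS-SupportedEncryptionTypes value."""
    return [name for bit, name in sorted(ENC_TYPE_FLAGS.items()) if value & bit]


def rc4_requests_for_aes_accounts(events, supported_enc_types):
    """Return (requester, service) for 4769 events that issued an RC4 ticket (0x17)
    for a service account that advertises AES support.

    supported_enc_types maps lowercase service account names to attribute values.
    """
    hits = []
    for event in events:
        if event["event_id"] != 4769 or event["ticket_encryption_type"].lower() != "0x17":
            continue
        flags = supported_enc_types.get(event["service_name"].lower(), 0)
        if flags & AES_MASK:
            hits.append((event["account_name"], event["service_name"]))
    return hits
```

A hit is a lead rather than proof: an account that still allows RC4 alongside AES can legitimately issue RC4 tickets to older clients.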
The key to this is not necessarily saying I need to detect all bad activity, right? Attacker’s dilemma / defender’s dilemma. We’ve heard this several times. The attacker can just throw things at it, and once they get through, then they’re good. The defender has to be right 100% of the time. This is false. Do not think that. It’s not the way it works. When the attacker gets on the network and they’re moving around, they have to be right 100% of the time. They have to avoid all of your traps and tricks and logging that you have on your network. If you configured those and you’re alerting on these things and you’re pulling what you need, they have to trip over one single tripwire that you’ve set up, and then you know they’re there.
QUESTION: Can Group Managed Service Accounts (GMSAs) mitigate Kerberoasting?
ANSWER: If you use Group Managed Service Accounts (GMSAs), absolutely. I’ve talked before about mitigation of that. Specifically, we’re talking about detecting, but yes, long passwords for service accounts, 20-30 characters are great, very difficult to crack. Nation-states are probably in that area, but most of us are not defending against nation-states. We’re defending against ransomware and other types of things. But yes, absolutely, you want long, complex passwords for service accounts. Ideally, a system creates it like Group Managed Service Accounts where AD actually manages the passwords for those accounts. We want to make sure we can protect those. Set up a fine grained password policy, create a new group called All Service Accounts, add all service accounts into that group, and apply that fine grained password policy to that group. Then you can tell all of your service accounts they have to be 25 characters long whenever they get changed.
In conclusion, we’ve had this event ID bubble. We’ve thrown all these event IDs in. We haven’t gotten a lot of benefit out of it, but we can track this activity if we’re looking at the right things because most attackers are following the same procedures. They run through the same parts of the playbook. We want them to go past the first page, further back into their playbook to figure out how to avoid the logging we have configured. We can detect Kerberoasting, which is a huge help to us because Kerberoasting is something a lot of attackers are using.
I just want to call out and thank Jessica Payne (@jepayneMSFT) for her resources. She also helped me identify some of the key event IDs. I also have friends at Mandiant, so Devon Kerr (@_devonkerr_) was a help there as well. Slides will be on Presentations.ADSecurity.org. Thank you very much for your time. That has been mine, and we’ll open up for questions.
So I think we have a few minutes for questions. Please raise your hand. Yes?
QUESTION: How effective is the Microsoft ATA tool at detecting attacker activity?
ANSWER: Microsoft Advanced Threat Analytics (ATA) has pretty decent detection of specific types of attacks, and the user behavior analysis component of it is maturing. If you want to talk about it in more detail, I’ll be at the Trimarc Security booth after this. I don’t sell products, but yeah, I’ll be happy to talk in more detail off mic.
Any other questions? Great. Well, like I said, I’ll be at the Trimarc Security booth in the vendor area after this if you have any questions. Thanks very much. Appreciate it.
Transcript for Sean Metcalf’s talk at BSides Charm (Baltimore, MD) in April 2017.
Copyright © 2017 Trimarc
Content may not be reproduced or used without express written permission.