
sRDI – Shellcode Reflective DLL Injection

During our first offering of “Dark Side Ops II – Adversary Simulation” at Black Hat USA 2017, we quietly dropped a piece of our internal toolkit called sRDI. Shortly after, the full project was put on GitHub (https://github.com/monoxgas/sRDI) without much explanation. I wanted to write a quick post discussing the details and use cases behind this new functionality.

A Short History

Back in ye olde times, if you were exploiting existing code, or staging malicious code into memory, you used shellcode. For those rare few who still have the skill to write programs in assembly, we commend you. As the Windows API grew up and gained popularity, people found sanctuary in DLLs. C code and cross compatibility were very appealing, but what if you wanted your DLL to execute in another process? Well, you could try writing the file to memory and dropping a thread at the top, but that doesn’t work very well on packed PE files. The Windows OS already knows how to load PE files, so people asked nicely and DLL Injection was born. This involves starting a thread in a remote process to call “LoadLibrary()” from the WinAPI. This will read a (malicious) DLL from disk and load it into the target process. So you write some cool malware, save it as a DLL, drop it to disk, and respawn into other processes. Awesome!…well, not really. Anti-virus vendors caught on quick, started flagging more and more file types, and performing heuristic analysis. The disk wasn’t a safe place anymore!

Finally in 2009, our malware messiah Stephen Fewer (@stephenfewer) released Reflective DLL Injection. As demonstrated, LoadLibrary is limited to loading DLLs from disk. So Mr. Fewer said “Hold my beer, I’ll do it myself”. With a rough copy of LoadLibrary implemented in C, this code could now be included in any DLL project. The developer would export a new function called “ReflectiveLoader” from the (malicious) DLL. When injected, the injector would locate the offset of this function and drop a thread on it. ReflectiveLoader walks back through memory to locate the beginning of the DLL, then unpacks and remaps everything automatically. When complete, “DLLMain” is called and you have your malware running in memory.

Years went by and very little was done to update these techniques. Memory injection was well ahead of its time and allowed all the APTs and such to breeze past AV. In 2015, Dan Staples (@_dismantl) released an important update to RDI, called “Improved Reflective DLL Injection“. This aimed to allow an additional function to be called after “DLLMain” and support the passing of user arguments into said additional function. Some shellcode trickery and a bootstrap placed before the call to ReflectiveLoader accomplished just that. RDI now functions more and more like the legitimate LoadLibrary: we can load a DLL, call its entry point, and then pass user data to another exported function. By the way, if you aren’t familiar with DLLs or exported functions, I recommend you read Microsoft’s overview.

Making shellcode great again

Reflective DLL injection is being used heavily by private and public toolsets to maintain that “in-memory” street cred. Why change things? Well…

  • RDI requires that your target DLL and staging code understand RDI. So you need access to the source code on both ends (the injector and injectee), or use tools that already support RDI.
  • RDI requires a lot of code for loading in comparison to shellcode injection. This compromises stealth and makes stagers easier to signature/monitor.
  • RDI is confusing for people who don’t write native code often.
  • Modern APT groups have already implemented more mature memory injection techniques, and our goal is to better emulate real-world adversaries.

The list isn’t long, but it was reason enough to write a new version of RDI with simplicity and flexibility in mind. So what did we do?

  1. To start, we read through some great research by Matt Graeber (@mattifestation) on converting primitive C code into shellcode. We rewrote the ReflectiveLoader function and converted the entire thing into a big shellcode blob. We now have a basic PE loader as shellcode.
  2. We wanted to maintain the advantages of Dan Staples’ technique, so we modified the bootstrap to hook into our new shellcode ReflectiveLoader. We also added some other tricks, like a pop/call sequence to allow the shellcode to get its current location in memory and maintain position independence.
  3. Once our bootstrap primitives were built, we implemented a conversion process into different languages (C, PowerShell, C#, and Python). This allows us to hook our new shellcode and a DLL together with the bootstrap code in any other tool we needed.

Once complete, the blob looks something like this:

When execution starts at the top of the bootstrap, the general flow looks like this:

  1. Get current location in memory (Bootstrap)
  2. Calculate and setup registers (Bootstrap)
  3. Pass execution to RDI with the function hash, user data, and location of the target DLL (Bootstrap)
  4. Un-pack DLL and remap sections (RDI)
  5. Call DLLMain (RDI)
  6. Call exported function by hashed name (RDI) – Optional
  7. Pass user-data to exported function (RDI) – Optional

With that all done, we now have conversion functions that take in arbitrary DLLs and spit out position-independent shellcode. Optionally, you can specify arbitrary data to be passed to an exported function once the DLL is loaded (as Mr. Staples intended). On top of that, if you are performing local injection, the shellcode will return a memory pointer that you can use with GetProcAddressR() to locate additional exported functions and call them. Even with this explanation, the process can seem confusing to anyone without experience in the original RDI project, shellcode, or PE files, so I recommend you read the existing research, then head over to the GitHub repository and dig into the code: https://github.com/monoxgas/sRDI

Okay, so what?

“You can now convert any DLL to position independent shellcode at any time, on the fly.”

This tool is mainly relevant to people who write/customize malware. If you don’t know how to write a DLL, I doubt most of this applies to you. With that said, if you are interested in writing something more than a PowerShell script or Py2Exe executable to perform red-teaming, this is a great place to start.

Use case #1 – Stealthy persistence

  • Use server-side Python code (sRDI) to convert a RAT to shellcode
  • Write the shellcode to the registry
  • Setup a scheduled task to execute a basic loader DLL
  • Loader reads shellcode and injects (<20 lines of C code)

Pros: Neither your RAT nor your loader needs to understand RDI or be compiled with it. The loader can stay small and simple to avoid AV.

Use case #2 – Side loading

  • Get your sweet RAT running in memory
  • Write DLL to perform extra functionality
  • Convert the DLL to shellcode (using sRDI) and inject locally
  • Use GetProcAddressR to lookup exported functions
  • Execute additional functionality X-times without reloading DLL

Pros: Keep your initial tool more lightweight and add functionality as needed. Load a DLL once and use it just like any other.

Use case #3 – Dependencies

  • Read existing legitimate API DLL from disk
  • Convert the DLL to shellcode (using sRDI) and load it into memory
  • Use GetProcAddress to lookup needed functions

Pros: Avoid monitoring tools that detect LoadLibrary calls. Access API functions without leaking information. (WinInet, PSApi, TlHelp32, GdiPlus)

Conclusion

We hope people get good use out of this tool. sRDI has been a member of the SBS family for almost 2 years now and we have it integrated into many of our tools. Please make modifications and create pull requests if you find improvements.

We’d love to see people start pushing memory injection to higher levels. With recent AV vendors promising more analytics and protections against techniques like this, we’re confident threat actors have already implemented improvements and alternatives that don’t involve high level languages like PowerShell or JScript.

@monoxgas


XSS Using Active Directory Automatic Provisioning

We recently tested a web application that had implemented Azure Active Directory automatic provisioning through the System for Cross-domain Identity Management (SCIM). Azure Active Directory can automatically provision users and groups to any application or identity store that is fronted by a Web service with the interface defined in the SCIM 2.0 protocol specification. Azure Active Directory can send requests to create, modify and delete assigned users and groups to this Web service, which can then translate those requests into operations upon the target identity store.

An interesting capability, but the real question is: “Can we exploit the application in some way if we already have access to the Azure panel?”

First thing to test is the limitations on the various fields. Let’s test the user’s display name, first name, and last name:

Well well well, looks like we can add basically any character we want to these fields. The max length of the first and last names is 64 characters each, and the display name is 256 characters.

64 characters is enough to import a JavaScript source from elsewhere, so that’s one of the things we’ll try. Here’s our new malicious user:

Once we’ve synced our new user with the target application, let’s take a look back at our vulnerable application’s source to view the results:

Our user’s first and last names are inserted into the page source without HTML encoding, which results in two separate XSS injection points. One pops an alert, while the other imports an entire .js file from a shortened URL to display a modal login prompt used to steal user credentials.

Just goes to show that you shouldn’t trust Microsoft to do filtering for you.

For all web applications, we recommend performing input filtering AND output encoding to ensure client-side security. In this case, Azure was not performing input filtering for the application, and the input was blindly trusted when generating the page content. At its core, XSS is dangerous because malicious HTML or JavaScript is placed in the page content without proper encoding. Despite this, we see many instances where input filtering is the only protection implemented by the application. Encoding output and filtering input are complementary defense-in-depth controls that improve application security.
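As a minimal illustration of what output encoding looks like (using Python's standard library here; whatever language the vulnerable application is written in will have an equivalent):

```python
from html import escape

# A first name pulled from the identity store; Azure passed it
# through unfiltered, so treat it as attacker-controlled.
first_name = '<script>alert(1)</script>'

# Encode at output time, just before the value lands in the page.
# The payload is rendered as inert text instead of executing.
safe = escape(first_name, quote=True)
print(safe)  # &lt;script&gt;alert(1)&lt;/script&gt;
```

Had the application encoded the provisioned names this way when generating page content, both injection points above would have been dead ends.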

This was a quick blog post, but we recommend people consider any other areas where “trusted” data is used such as SCIM.

If you want more information on how the SCIM technology functions, or want to test this out yourself, Microsoft provides some excellent documentation:

https://docs.microsoft.com/en-us/azure/active-directory/active-directory-scim-provisioning


dataLoc: A POC Tool for Finding Payment Cards Stored in MSSQL

In this blog I’ll be introducing dataLoc, a tool for locating payment cards in MSSQL databases without requiring the presence of keywords. dataLoc is useful for anyone who would like to check their database for payment card numbers in unexpected places, including DBAs, pen-testers, auditors, and others.

dataLoc Overview

At its core, dataLoc functions by using the filtering methods discussed here: https://blog.netspi.com/identifying-payment-cards-at-rest-going-beyond-the-key-word-search/

dataLoc is not an injection or attack tool. It requires a direct connection to a database along with valid user credentials. The user account requires full read access, as well as the ability to create and drop temp tables.

For those of you that are in a hurry to get started, the dataLoc source and binaries are available on GitHub:

https://github.com/NetSPI/DataLoc

Dependencies

dataLoc is a portable, standalone executable. Most systems will already have the native SQL driver the tool relies on, but if you find that your system doesn’t, it’s included with the SQL Server Native Client. https://docs.microsoft.com/en-us/sql/relational-databases/native-client/sql-server-native-client

Configuration

The tool is intended to be easy to use. To scan for payment card numbers, provide a remote host, enter a set of credentials (or enable Windows auth), click “Connect”, and then “Scan”.


If you’d like to do targeted scanning, you can narrow the focus to a specific database, table, or even column by selecting the database from the drop-down and then clicking on the table or column you’re interested in.


General

If you decide to customize some of the more advanced settings, you may want to enable the use of an INI file so your changes persist. To keep scan times reasonable, you may also want to enable the per-column timeout and set a reasonable cap of 1 to 10 minutes. Most columns are processed within a few seconds.


Scoring

The scoring system is used to generate a confidence rating for each potential finding. The lower the number, the more likely the item is to be a false positive. This tool is a simple proof of concept, so it’s highly likely you would benefit from tuning the scoring system to your environment.


Scoring is broken up into several sections.

  • Luhn Valid – By default a base score of 50 is assigned for all Luhn valid matches.  Anything that fails Luhn validation is discarded.
  • Alpha Delimiters – A letter exists somewhere inside the number sequence Ex: 411a1111111111111
  • Card + CVV – Match is followed by 3 digits Ex: 4111111111111111 123
  • Phone Number – The match looks like it could be part of a phone number Ex: 1-4111111111111111
  • Keywords – The text visa, card, etc. exists in the cell containing the match Ex: visa 4111111111111111
  • Negative Keywords – Triple A membership numbers “aaa” are 16 digits and Luhn valid.
  • Delimiters – The number of delimiters and the types. Ex Count:4 Types:2: 411-111-111-111/1111
  • IIN Check – Does match contain a known IIN
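dataLoc's actual weights live in its source and INI settings; as a rough illustration of how factors like those above might combine into a confidence score, here is a toy version (the adjustments are made up for the sketch, not dataLoc's values):

```python
def score_match(number, cell_text):
    """Toy confidence score for a Luhn-valid candidate number.
    Base score of 50 for passing Luhn; the remaining adjustments
    are illustrative stand-ins for the factors listed above."""
    score = 50                          # Luhn valid (failures were discarded)
    text = cell_text.lower()
    if any(c.isalpha() for c in number):
        score -= 25                     # alpha delimiters inside the sequence
    if any(k in text for k in ('visa', 'card', 'amex', 'discover')):
        score += 20                     # supporting keyword in the cell
    if 'aaa' in text:
        score -= 30                     # negative keyword (AAA membership)
    if text.count('-') >= 4:
        score -= 10                     # heavily delimited / phone-number-like
    return score
```

Tuning these weights to your own data is exactly the kind of customization the tool expects.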

Known Issues

  • The script is single threaded. Once you start a scan, the GUI will become unresponsive until it completes its run.
  • The only way to stop a scan early is to kill the application.
  • dataLoc was tested exclusively on Windows 10. There may be issues with the GUI on anything older.

Feel free to submit a ticket to the GitHub repository if something doesn’t work as expected.  I’d love some constructive feedback.

References

https://blog.netspi.com/identifying-payment-cards-at-rest-going-beyond-the-key-word-search/

https://github.com/NetSPI/DataLoc


Identifying Payment Cards at Rest – Going Beyond the Key Word Search

In this blog, I’ll be discussing an approach for locating payment card numbers stored in MSSQL databases without relying on key words for data discovery.

To overcome the impracticality of pulling an entire database over the wire for advanced analysis, we’ll focus on using MSSQL’s native capability to filter out items that can’t contain cardholder data. This will greatly reduce the need for local checks.

The Pattern in the Number

Before we can begin, we need to understand what we’re looking for. For this exercise the focus will be on four card types: Visa, MasterCard, American Express, and Discover. These cards all have known lengths and starting sequences.

Card Type          Length (digits)   Starting Sequence
Visa               16                4
MasterCard         16                50-55, 222100-272099
American Express   15                34 or 37
Discover           16                6011 or 65

*Visa issued 13 digit cards in the past, but those cards are no longer valid.

*MasterCard started issuing 2-series BINs (222100-272099) January 2017.  The code examples below have not been updated to support these numbers.

https://www.mastercard.us/en-us/issuers/get-support/2-series-bin-expansion.html
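The length and starting-sequence rules in the table above (including the 2-series BINs) can be expressed as a small classifier. This is a sketch of the table, not dataLoc's code:

```python
def card_type(number):
    """Classify a candidate number by the length/prefix rules above.
    Delimiters are stripped first; returns None on no match."""
    n = ''.join(ch for ch in number if ch.isdigit())
    if len(n) == 16:
        if n.startswith('4'):
            return 'Visa'
        if 50 <= int(n[:2]) <= 55 or 222100 <= int(n[:6]) <= 272099:
            return 'MasterCard'                 # includes 2-series BINs
        if n.startswith(('6011', '65')):
            return 'Discover'
    if len(n) == 15 and n[:2] in ('34', '37'):
        return 'American Express'
    return None
```

For example, `card_type('2221001234567890')` classifies as MasterCard via the 2-series range, which the SQL examples below do not yet cover.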

The first 6 digits of each of these cards make up the IIN (Issuer Identification Number) also known as the BIN (Bank Identification Number). Card issuers don’t generally provide official lists of IINs, but several community driven efforts to catalog this information exist. A good example can be found here: https://www.stevemorse.org/ssn/List_of_Bank_Identification_Numbers.html

The next 1-5 digits are known as the account range. The account range is followed by the customer identification number (CIN) and the check digit. Although the account range and CIN are going to be unknowns, the check digit is generated using a mathematical formula, and thus can be validated.

Using 4111 1111 1111 1111 as an example:

IIN      Account Range & Customer Identification Number   Check Digit
411111   111111111                                        1

For more ideas about how to leverage the check digit, I recommend reading this post made by Karl Fosaaen. https://blog.netspi.com/cracking-credit-card-hashes-with-powershell/
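The check digit is validated with the Luhn algorithm: walking from the right, double every second digit, subtract 9 from any doubled value over 9, sum everything, and require the total to be divisible by 10. In Python:

```python
def luhn_valid(number):
    """Luhn check-digit validation. Non-digit delimiters are ignored."""
    digits = [int(d) for d in number if d.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:        # every second digit from the right
            d *= 2
            if d > 9:
                d -= 9        # equivalent to summing the two digits
        total += d
    return total % 10 == 0
```

The test number above passes, while changing its check digit from 1 to 2 fails, which is what makes the check useful for discarding random digit runs.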

MSSQL Filtering

We can’t positively identify payment cards with MSSQL’s native pattern matching capability, but we can prove several negatives that will allow us to eliminate tables, columns, and even individual cells that don’t contain payment cards.

The first thing we need to do is query the MSSQL server for a list of available databases. Filtering out default system databases is the first step in cutting down the amount of content we’ll look at locally.

SQL – List available databases

USE master;
SELECT NAME FROM sysdatabases WHERE 
(NAME NOT LIKE 'distribution') AND (NAME NOT LIKE 'master') AND 
(NAME NOT LIKE 'model') AND (NAME NOT LIKE 'msdb') AND 
(NAME NOT LIKE 'publication') AND (NAME NOT LIKE 'reportserver') AND 
(NAME NOT LIKE 'reportservertempdb') AND (NAME NOT LIKE 'resource') AND 
(NAME NOT LIKE 'tempdb') 
ORDER BY NAME;

Excluded databases

(distribution, master, model, msdb, publication, reportserver, reportservertempdb, resource, tempdb)

The next step is to list the tables that may contain payment card data in the remaining databases, again eliminating system defaults.

The SQL examples provided were created for the AdventureWorks2014 database, made freely available by Microsoft. https://msftdbprodsamples.codeplex.com/releases/view/125550

SQL – List tables

USE AdventureWorks2014;
SELECT '[' + SCHEMA_NAME(t.schema_id) + '].[' + t.name + ']' AS fulltable_name, 
SCHEMA_NAME(t.schema_id) AS schema_name, t.name AS table_name, 
i.rows FROM sys.tables AS t INNER JOIN sys.sysindexes AS i ON t.object_id = i.id AND 
i.indid < 2 WHERE (ROWS > 0) AND (t.name NOT LIKE 'syscolumns') AND 
(t.name NOT LIKE 'syscomments') AND (t.name NOT LIKE 'sysconstraints') AND 
(t.name NOT LIKE 'sysdepends') AND (t.name NOT LIKE 'sysfilegroups') AND 
(t.name NOT LIKE 'sysfiles') AND (t.name NOT LIKE 'sysforeignkeys') AND 
(t.name NOT LIKE 'sysfulltextcatalogs') AND (t.name NOT LIKE 'sysindexes') AND 
(t.name NOT LIKE 'sysindexkeys') AND (t.name NOT LIKE 'sysmembers') AND 
(t.name NOT LIKE 'sysobjects') AND (t.name NOT LIKE 'syspermissions') AND 
(t.name NOT LIKE 'sysprotects') AND (t.name NOT LIKE 'sysreferences') AND 
(t.name NOT LIKE 'systypes') AND (t.name NOT LIKE 'sysusers') 
ORDER BY TABLE_NAME;

Excluded Tables:

(syscolumns, syscomments, sysconstraints, sysdepends, sysfilegroups, sysfiles, sysforeignkeys, sysfulltextcatalogs, sysindexes, sysindexkeys, sysmembers, sysobjects, syspermissions, sysprotects, sysreferences, systypes, sysusers)

Now we’ll list columns for each table, this time filtering on column length and data type. For this example, we’ll focus on the “CreditCard” table.

SQL – List columns

USE AdventureWorks2014;
SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE 
CHARACTER_MAXIMUM_LENGTH > 14 AND 
DATA_TYPE NOT IN ('bigint','binary','bit','cursor','date','datetime','datetime2',
'datetimeoffset','float','geography','hierarchyid','image','int','money','real',
'smalldatetime','smallint','smallmoney','sql_variant','table','time','timestamp',
'tinyint','uniqueidentifier','varbinary','xml') AND 
TABLE_NAME='CreditCard' OR 
CHARACTER_MAXIMUM_LENGTH < 1 AND 
DATA_TYPE NOT IN ('bigint','binary','bit','cursor','date','datetime','datetime2',
'datetimeoffset','float','geography','hierarchyid','image','int','money','real',
'smalldatetime','smallint','smallmoney','sql_variant','table','time','timestamp',
'tinyint','uniqueidentifier','varbinary','xml') AND 
TABLE_NAME='CreditCard' ORDER BY COLUMN_NAME;

Excluded Data Types:

(bigint, binary, bit, cursor, date, datetime, datetime2, datetimeoffset, float, geography, hierarchyid, image, int, money, real, smalldatetime, smallint, smallmoney, sql_variant, table, time, timestamp, tinyint, uniqueidentifier, varbinary, xml)

The last set of server-side filters we’ll apply takes advantage of the weak pattern matching available in MSSQL to eliminate cells that don’t match known card formats.

SQL – Apply MSSQL pattern matching

/* create temp table with appropriate columns and data types */

CREATE TABLE #dataloc (RowNumber INT IDENTITY(1,1), "CardNumber" nvarchar(25));

/* populate temp table with data that matches payment card formats */

INSERT INTO #dataloc 
Select "CardNumber" FROM [Sales].[CreditCard] WHERE "CardNumber" LIKE 
('%4%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%') 
UNION Select "CardNumber" FROM [Sales].[CreditCard] WHERE "CardNumber" LIKE 
('%5%[1-5]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%') 
UNION Select "CardNumber" FROM [Sales].[CreditCard] WHERE "CardNumber" LIKE 
('%3%[47]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%') 
UNION Select "CardNumber" FROM [Sales].[CreditCard] WHERE "CardNumber" LIKE 
('%6%[05]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%[0-9]%')

This SQL copies matching rows to a temp table and adds row numbers for later use.

Local Validation Testing

We now have a temp table filled exclusively with rows of data containing 15-16 digits that loosely match known payment card patterns.

It’s time to start pulling content over the wire for local processing. The row numbers assigned earlier will be used to break the dataset into chunks.

SQL – Querying potential card numbers

SELECT * FROM #dataloc WHERE (RowNumber >=1 AND RowNumber <=4000) ORDER BY RowNumber;
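The chunk bookkeeping itself is simple; a sketch of how the range predicates might be generated client-side (using the 4,000-row chunk size from the query above, and not dataLoc's actual implementation):

```python
def chunk_queries(total_rows, chunk_size=4000):
    """Yield (start, end, sql) tuples covering the temp table in
    chunk_size pieces, using the RowNumber column assigned earlier."""
    start = 1
    while start <= total_rows:
        end = min(start + chunk_size - 1, total_rows)
        sql = ("SELECT * FROM #dataloc WHERE (RowNumber >= %d "
               "AND RowNumber <= %d) ORDER BY RowNumber;" % (start, end))
        yield start, end, sql
        start = end + 1
```

Each generated statement is then executed in turn, keeping any single result set small enough to process locally.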

Now that we have full cell data, it’s time to use regex to extract all potential payment card numbers. It’s entirely possible that some cells will contain multiple payment cards.

American Express: (?<![0-9])3\D{0,4}(4|7)(\D{0,4}\d){13}[^0-9]
Discover:         (?<![0-9])6\D{0,4}(5(\D{0,4}\d){14}(\D?|$)|0\D{0,4}1\D{0,4}1(\D{0,4}\d){12})[^0-9]
MasterCard:       (?<![0-9])5\D{0,4}[0-5](\D{0,4}\d){14}[^0-9]
Visa:             (?<![0-9])4(\D{0,4}\d){15}[^0-9]

If you’re familiar with regex, you may have noticed that these patterns exclude matches that are immediately preceded or followed by a digit. Although this will ultimately result in some false negatives, extensive real-world testing has shown that these matches are almost always false positives.
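As an illustration, here is the Visa pattern applied with Python's re module. Note that the trailing [^0-9] requires a non-digit after the match, so a number at the very end of a cell needs the kind of end-of-string alternative the Discover pattern carries:

```python
import re

# Visa pattern from the table above: lookbehind and trailing [^0-9]
# reject candidates that butt up against other digits.
VISA = re.compile(r'(?<![0-9])4(\D{0,4}\d){15}[^0-9]')

cell = 'note: cust card 4111-1111-1111-1111 on file'
m = VISA.search(cell)
number = re.sub(r'\D', '', m.group(0))   # strip delimiters for Luhn checking
print(number)  # 4111111111111111
```

A candidate like `64111111111111111` is correctly rejected by the lookbehind, at the cost of missing a real card fused to another digit.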

The matches extracted using the above regex are then subjected to Luhn validation checks. Any regex match that fails the Luhn check is immediately thrown out.

Luhn Validation, invented in 1954 by Hans Peter Luhn – https://en.wikipedia.org/wiki/Luhn_algorithm

Conclusion

Testing has shown that this approach has the potential to provide higher accuracy than a keyword search. It also enables us to locate payment card numbers in unexpected places, such as free-form notes fields or repurposed legacy columns. But this accuracy comes at a price: database performance loads and scan times are significantly greater than those generated by the keyword search offered by PowerUpSQL, making the best approach dependent on your specific use case.

In the next few weeks I’ll be releasing a tool that serves as a proof of concept for the search method discussed in this post. In the meantime, if you’d like to read more about locating sensitive data with Scott Sutherland’s PowerUpSQL, you can do that here: https://blog.netspi.com/finding-sensitive-data-domain-sql-servers-using-powerupsql/
