Magecart Malware Obfuscation Techniques Revealed

Eggy Peggy was a language the girls at school used when they wanted to hide what they were saying to each other from eavesdropping boys like me. “Meggary heggad egga leggittle leggamb” is what “Mary had a little lamb” would have sounded like in their weird tongue (not that they ever had cause to mention that exact phrase).

What they were doing is called obfuscation, which means to hide the meaning of something by altering it. Crucially it doesn’t change the meaning, just how it looks. In this case, by simply adding “egg” in front of each vowel sound the girls rendered their conversation much less easy to understand to anyone not in the know. To make matters worse, there were other versions too, with slightly different rules, so they could switch between them.

Obfuscation techniques are as old as the hills and today they’re more popular than ever. That’s because JavaScript now runs in almost every web browser in the world. Its accessibility and versatility have made it the de facto choice for building web pages and web apps, and it is widely used for server-side programming too.

But that’s created a problem. With so many web apps now tasked with handling private customer data and money, JavaScript obfuscation techniques have become an essential part of web security, helping owners to protect their websites from revealing their business logic algorithms, trade secrets, and intellectual property, as well as protecting their checkout pages from web skimming attacks that can steal customer payment information and PII.

You may have wondered why they don’t just keep potentially vulnerable code out of harm’s way on back-end servers. Well, that would be unfeasible in some situations: some mobile apps don’t have a backend, some code needs to be hosted client-side to log user-experience analytics, and then there’s the performance slowdown from repeated server calls that would leave users cursing.

So, JavaScript must stay where it can be hacked, and that’s why developers use obfuscation techniques to protect it. The only trouble is that those same techniques are also available to malicious actors, who can use them to hide their malware, and that’s what Magecart attackers have been doing for some time. They don’t use Eggy Peggy to hide their malicious processes, but they do have plenty of other devious tricks up their sleeves.

What is Magecart?

If you weren’t aware, Magecart is the name given both to the consortium of hacker groups that target online shopping cart systems (mostly those built on the Magento shopping software) and to the type of supply chain attack that they use to do this. These web skimming attacks are designed to steal customer payment card information and sensitive personal data via online shopping checkout forms, and obfuscation techniques have helped them to avoid detection, sometimes for months. Magecart attacks first appeared around 2010 and they’ve been successful against many high-profile targets, including British Airways, Ticketmaster UK, Hanna Andersson, and many others.

The ongoing threat is so serious that version 4.0 of the Payment Card Industry Data Security Standard (PCI DSS v4.0) now includes stringent requirements around defending against so-called Magecart-style attacks (6.4: Public-facing web applications are protected against attacks, and 11.6: Unauthorized changes on payment pages are detected and responded to).

Obfuscating Code

Obfuscation techniques applied to code make it look meaningless to the naked eye, or else they hide it completely (even from malware detectors), but once it’s executed its true nature is revealed. That’s one of the key points to remember about obfuscation techniques. Just as the spoken language obfuscator Eggy Peggy preserved the meaning of “Mary had a little lamb” while rendering it opaque, code obfuscators don’t alter how a program functions. They simply make it harder for humans and applications such as malware detectors to understand what it does.

Code can be obfuscated by adding unnecessary logic and complicated, circuitous constructs. This distracts the reader (be they human or machine) by hiding the code’s nature within labyrinths of difficult syntax that take a lot of time and effort to understand. If programs like antivirus tools can’t locate the signatures that they rely on for identification, then it’s easier for malware to slip through their defenses unchecked.

Instruction pattern/flow transformation

This technique swaps normal instructions for more complicated versions that perform the same function but take up more space. An example of this in everyday language would be like using a convoluted term such as, ‘motorized mobile appliance for rapid oxidation events mitigation,’ instead of the simpler ‘fire engine.’

Here’s an example of some JavaScript code before it’s been obfuscated with this technique:

var greeting = 'Hello World';
greeting = 10;
var product = greeting * greeting;

And here’s how it looks after:

They’ll both do the same thing, but the second version looks as clear as mud. The technique has turned the original snippet into something unreadable.
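As a rough illustration of this kind of transformation, here is a hand-written sketch. It is not the output of any particular obfuscator, and the function names `plain` and `transformed` are made up for the example. It wraps the original three statements in a function, then rewrites them with a hex-escaped string, a bit shift in place of the literal 10, and repeated addition in place of the multiplication.

```javascript
// Readable version: the original three statements, wrapped so the result can be returned.
function plain() {
  var greeting = 'Hello World';
  greeting = 10;
  var product = greeting * greeting;
  return product;
}

// Transformed version (illustrative only): same behavior, much harder to read.
function transformed() {
  var _0xa = ['\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64']; // 'Hello World' as hex escapes
  var _0xb = _0xa[0];
  _0xb = 0x5 << 0x1;                  // 5 shifted left by one bit is 10
  var _0xc = 0x0;
  for (var _0xd = _0xb; _0xd > 0x0; _0xd--) {
    _0xc += _0xb;                     // adding 10 ten times replaces 10 * 10
  }
  return _0xc;
}

console.log(plain(), transformed());
```

Both calls return 100, yet a pattern written to match the readable version would find nothing to match in the second.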

Obfuscation scripts

As we know, Magecart attackers typically exploit supply chain vulnerabilities to inject malicious scripts into shopping websites and content management systems, but in mid-2022, Microsoft researchers discovered that they were trying something new. They obfuscated a skimming script by first encoding it in PHP. They then embedded this in an image file, most probably in an attempt to make use of the PHP calls that occur when the index page is loaded on a website.

In another campaign they saw, the attackers used compromised web applications injected with malicious JavaScript that could mimic Google Analytics and Meta Pixel scripts. Some of the skimming scripts even had the means to check to see if the browser’s developer tools were open, to avoid debugging efforts.

Steganography

This technique goes back centuries and is best described as “hiding in plain sight.” Traditional examples include writing with invisible ink or slipping text into an inconspicuous area of a painting, but it’s now employed as a modern obfuscation technique too. A contemporary case was uncovered last year, when JavaScript hidden within CSS documents emerged as the method used in a spate of skimming attacks.

The malicious JavaScript code was hidden using spaces, tabs, and newlines. Space characters were used to divide it into sections. Tabs in each section were taken to mean 1 while newlines were taken to mean 0.

Each clump of 1s and 0s represented an individual ASCII character, so when the 1s and 0s were turned back into regular letters, numbers, and symbols they were restored into executable JavaScript instructions. This was handled by a compromised version of jQuery that was being hosted and run on the victim’s website (though it isn’t known exactly how this was achieved).
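The decoding step described above can be sketched from an analyst’s point of view. This is a minimal example under stated assumptions: groups are separated by single space characters, bits are read most significant first, and the names `decodeWhitespace` and `encodeWhitespace` are hypothetical. The decoder only returns the recovered text and never executes it.

```javascript
// Turn whitespace-encoded text back into characters:
// space separates groups, tab means 1, newline means 0.
function decodeWhitespace(hidden) {
  return hidden
    .split(' ')
    .filter(function (group) { return group.length > 0; })
    .map(function (group) {
      var bits = group.replace(/\t/g, '1').replace(/\n/g, '0');
      return String.fromCharCode(parseInt(bits, 2));
    })
    .join('');
}

// Demo helper (an assumption, used only to build sample input for the decoder).
function encodeWhitespace(text) {
  return text.split('').map(function (ch) {
    return ch.charCodeAt(0).toString(2).replace(/1/g, '\t').replace(/0/g, '\n');
  }).join(' ');
}

var sample = encodeWhitespace('console.log(1)');
console.log(/^[ \t\n]+$/.test(sample));           // the payload is nothing but whitespace
console.log(decodeWhitespace(sample));            // the recovered text
```

To a person skimming the file, the encoded payload looks like trailing blank space, which is why a scanner has to decode suspicious runs of whitespace rather than just search for script keywords.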

The CSS documents were retrieved from a URL that would have been flagged as suspicious had it been noticed, but again, it was hidden. The attackers used a fake URL that looked like a typical Facebook CDN endpoint as a distraction from the real one, along with a number array to represent specific indexes. They could then pick characters from the decoy to assemble their malicious URL and call for the CSS documents.
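The index-array trick is easy to follow with a small sketch. Everything here is an assumption made for illustration: the decoy and target addresses are invented placeholders, and `assembleFromDecoy` and `indexesFor` are hypothetical helper names rather than anything recovered from the real attack.

```javascript
// Rebuild a string by picking characters out of a decoy at the given positions.
function assembleFromDecoy(decoy, indexes) {
  return indexes.map(function (i) { return decoy.charAt(i); }).join('');
}

// Demo helper: find one position in the decoy for every character of the target.
function indexesFor(decoy, target) {
  return target.split('').map(function (ch) {
    var position = decoy.indexOf(ch);
    if (position === -1) {
      throw new Error('decoy has no "' + ch + '" character');
    }
    return position;
  });
}

var decoy = 'https://connect.example-cdn.test/en_US/sdk.js';  // made-up decoy address
var target = 'https://cdn.example.test/x.css';                  // made-up hidden address
var indexes = indexesFor(decoy, target);

console.log(indexes.length === target.length);                  // one index per character
console.log(assembleFromDecoy(decoy, indexes) === target);      // true
```

The hidden address never appears as a literal in the script, so a search for suspicious URLs finds only the harmless-looking decoy. Resolving the number array against the decoy is what exposes the real destination.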

To gather the sensitive user data, the attackers installed event listeners to record what users typed, stored it locally, and transmitted it to a server they controlled. They then covered their tracks by removing everything they’d used. Once again, it’s the layering of methods that makes this type of attack so hard to unpick.

Inserting dummy code

Dummy code is extra instructions that run but have no effect on the result. Filling a script with it doesn’t change the way the application works, but it makes the script harder to reverse-engineer because the real logic is buried among irrelevant operations.
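Here is a small hand-made sketch of the idea, with invented function names. The second function computes the same total as the first, but carries extra statements that execute on every pass without ever influencing the value that is returned.

```javascript
// Clean version: add up a list of prices.
function totalPrice(prices) {
  var total = 0;
  for (var i = 0; i < prices.length; i++) {
    total += prices[i];
  }
  return total;
}

// Same logic padded with dummy code (illustrative only).
function totalPriceWithDummyCode(prices) {
  var total = 0;
  var noise = 7;                          // dummy: never used in the result
  var scratch = [];                       // dummy: filled but never read back
  for (var i = 0; i < prices.length; i++) {
    noise = (noise * 31 + i) % 1000;      // dummy arithmetic that does run
    scratch.push(String(noise).length);   // dummy bookkeeping
    total += prices[i];                   // the only line that matters
  }
  return total;
}

console.log(totalPrice([1, 2, 3]) === totalPriceWithDummyCode([1, 2, 3])); // true
```

A reader has to trace `noise` and `scratch` all the way through before being able to conclude that they are irrelevant, and that wasted effort is the whole point.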

Opaque predicate insertion

This involves adding conditional branches whose outcome is already known to the obfuscator, so one side of each branch can never run. Unlike the previous technique, the extra code takes up space but is never actually executed. Its job is to confuse the reader with redundant statements, usually lots of extra if-then-else branches.
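A minimal sketch of an opaque predicate follows, using made-up function names. It relies on a simple fact: the product of two consecutive integers is always even, so the condition is always true and the second branch is dead code, even though that is not obvious at a glance.

```javascript
// Clean version.
function applyDiscount(price) {
  return price * 0.9;
}

// Same logic guarded by an opaque predicate (illustrative only).
function applyDiscountOpaque(price) {
  var n = Math.floor(price);
  // n * n + n equals n * (n + 1), a product of consecutive integers, so it is
  // always even. (Holds for values small enough to stay exact in a JS number.)
  if ((n * n + n) % 2 === 0) {
    return price * 0.9;      // the real logic, always taken
  }
  // Dead branch: never reached, present only to mislead the reader.
  return price * 1.5 + n;
}

console.log(applyDiscount(100) === applyDiscountOpaque(100)); // true
```

Real obfuscators tend to use predicates that are far harder to reason about than this one, and they attach plausible-looking code to the dead branch so that an analyst cannot dismiss it quickly.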

Arithmetic code obfuscation

This takes simple arithmetic and logical code and replaces it with more complex equivalents that again make it hard to unravel. For example, this snippet calculates the sum and average of 50 numbers:

int i = 1, sum = 0, avg = 0;
while (i <= 50)          /* loop over the numbers 1 to 50 */
{
    sum += i;            /* running total */
    avg = sum / i;       /* running (integer) average */
    i++;
}

But if we add a conditional variable, it becomes much more difficult to work out what the code is doing, because analyzing the function would require knowledge of the initial input.

In this snippet, the conditional variable ‘sneaky’ makes the code structure more complex and so a lot harder to understand:

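int i, sum, avg;
int sneaky = 1;          /* decides which block runs next */
while (sneaky != 0)
{
    switch (sneaky)
    {
        case 1:          /* initialization */
        {
            i = 1; sum = 0; avg = 0;
            sneaky = 2;
            break;
        }
        case 2:          /* loop test */
        {
            if (i <= 50)
                sneaky = 3;
            else
                sneaky = 0;
            break;
        }
        case 3:          /* loop body */
        {
            sum += i; avg = sum / i; i++;
            sneaky = 2;
            break;
        }
    }
}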
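The snippets above restructure the loop’s control flow. The arithmetic itself can also be swapped for a more complex equivalent. As a hedged sketch with invented function names, and assuming the inputs are 32-bit integers, the addition below is replaced by an identity built from bitwise operations: the XOR gives the sum without carries, and the AND, doubled, adds the carries back in.

```javascript
// Plain addition.
function plainAdd(a, b) {
  return a + b;
}

// Equivalent for 32-bit integers: a + b === (a ^ b) + 2 * (a & b).
function obfuscatedAdd(a, b) {
  return (a ^ b) + 2 * (a & b);
}

// Check the identity over a small range of positive and negative integers.
var allEqual = true;
for (var a = -50; a <= 50; a++) {
  for (var b = -50; b <= 50; b++) {
    if (plainAdd(a, b) !== obfuscatedAdd(a, b)) {
      allEqual = false;
    }
  }
}
console.log(allEqual); // true
```

Nesting a few substitutions like this inside one another quickly turns a one-line calculation into an expression that is tedious to simplify by hand.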

Code transposition obfuscation

This technique shuffles routines and branches within the code in a random fashion without affecting its execution. Malware writers like to use this method to evade detection by antivirus software, because the reordered code no longer matches the patterns that scanners look for.
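One simple way to picture transposition in JavaScript is sketched below, with invented names. It leans on the fact that function declarations are hoisted, so helper routines can be written in any order, even after the `return` statement, without changing the result. Real tools typically go further and reorder individual blocks connected by jumps.

```javascript
// Readable version: statements appear in the order they are used.
function orderSummary(order) {
  var subtotal = order.price * order.quantity;
  var shipping = order.express ? 15 : 5;
  var label = order.name.toUpperCase();
  return label + ':' + (subtotal + shipping);
}

// Transposed version (illustrative only): the same steps, split into
// routines and shuffled. Hoisting makes the order of declarations irrelevant.
function orderSummaryTransposed(order) {
  return join(label(), total());
  function total() { return shipping() + subtotal(); }
  function join(a, b) { return a + ':' + b; }
  function subtotal() { return order.quantity * order.price; }
  function label() { return order.name.toUpperCase(); }
  function shipping() { return order.express ? 15 : 5; }
}

var sampleOrder = { name: 'book', price: 20, quantity: 2, express: false };
console.log(orderSummary(sampleOrder));           // BOOK:45
console.log(orderSummaryTransposed(sampleOrder)); // BOOK:45
```

Each reshuffle produces a file with a different layout and a different hash, which is why signatures tied to a fixed layout are easy to sidestep this way.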

The Hunter technique (with obfuscation)

In truth, we’re not sure if this is a widespread technique, but it exists and it has to be called something. Malwarebytes discovered this two-stage attack and called it Hunter because the code can be found under that name on GitHub. First, it injects code into the website’s source. That code calls out to a remote URL, which loads the skimmer during checkout.

The function(h,u,n,t,e,r) signature found in the code helped the researchers to identify it. To decode the obfuscated string, they wrote out the content of the eval call, which gave them a line of JavaScript pointing to a URL.

This URL contained more code that was obfuscated by Hunter. The de-obfuscated code revealed what looked like HTML code with forms denoting credit card fields. This was the skimmer itself, and it introduced credit card fields to the form that wouldn’t normally be there. The skimmer could then steal the credit card data, encode it, store it inside a cookie, and finally exfiltrate it via a POST request.
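The idea of writing out what would have been passed to eval, rather than running it, can be sketched as follows. This is not Hunter’s actual decoding routine: the shift-based packing, the function names `pack` and `makeLoader`, and the placeholder URL are all assumptions chosen to keep the example short.

```javascript
// Stand-in for a packer: hide source text as shifted character codes.
function pack(source, shift) {
  return source.split('').map(function (ch) {
    return ch.charCodeAt(0) + shift;
  });
}

// Stand-in for a packed loader: it rebuilds the source and hands it to `run`.
// In a malicious script, `run` would be eval.
function makeLoader(packed, shift) {
  return function (run) {
    var source = packed.map(function (code) {
      return String.fromCharCode(code - shift);
    }).join('');
    return run(source);
  };
}

var stageOne = makeLoader(
  pack('var stage2 = "https://example.invalid/placeholder.js";', 3), 3
);

// Analyst's approach: pass a function that records the text instead of executing it.
var revealed = stageOne(function (source) { return source; });
console.log(revealed);
```

Because the unpacking logic runs but the payload does not, the analyst gets to read the next stage safely, and can repeat the step if that stage turns out to be packed as well.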

Coming Soon: The Reflectiz Script Deobfuscator

The Reflectiz platform provides a range of tools for detecting the malicious code alterations at the core of Magecart malware attacks. We are constantly searching for new ways to assist organizations in enhancing their security posture, and cracking new obfuscation techniques is our foremost goal. That’s why we’ve developed a cutting-edge deobfuscator tool for free use. It will be released in the next few weeks so that you can witness firsthand the power of this tool.