Magecart Malware Obfuscation Techniques Revealed
Eggy Peggy was a language the girls at school used when they wanted to hide what they were saying to each other from eavesdropping boys like me. “Meggary heggad egga leggamb,” is what, “Mary had a little lamb,” would have sounded like in their weird tongue (not that they ever had cause to mention that exact phrase).
What they were doing is called obfuscation, which means to hide the meaning of something by altering it. Crucially it doesn’t change the meaning, just how it looks. In this case, by simply adding “egg” in front of each vowel sound the girls rendered their conversation much less easy to understand to anyone not in the know. To make matters worse, there were other versions too, with slightly different rules, so they could switch between them.
Software developers obfuscate their own code too, to protect it from prying eyes. You may have wondered why they don’t just keep potentially vulnerable code out of harm’s way on back-end servers. Well, that would be unfeasible in some situations: some mobile apps don’t have a back end, some code needs to be hosted client-side to log user-experience analytics, and then there’s the performance slowdown from repeated server calls that would leave users cursing.
What is Magecart?
If you weren’t aware, Magecart is the name given both to the consortium of hacker groups that target online shopping cart systems (mostly those built on the Magento shopping software) and to the type of supply chain attack that they use to do this. These web skimming attacks are designed to steal customer payment card information and sensitive personal data via online checkout forms, and obfuscation techniques have helped them to avoid detection, sometimes for months. Magecart attacks first appeared around 2010 and they’ve been successful against many high-profile targets, including British Airways, Ticketmaster UK, and Hanna Andersson.
The ongoing threat is so serious that the fourth revision of the Payment Card Industry Data Security Standard (PCI DSS v4.0) includes stringent requirements for defending against so-called Magecart-style attacks (6.4: Public-facing web applications are protected against attacks, and 11.6: Unauthorized changes on payment pages are detected and responded to).
Obfuscation techniques applied to code make it look meaningless to the naked eye or else they hide it completely (even from malware detectors) but once it’s executed its true nature is revealed. That’s one of the key points to remember about obfuscation techniques. Just as the spoken language obfuscator Eggy Peggy preserved the meaning of “Mary had a little lamb,” while rendering it opaque, so code obfuscators don’t alter how a program functions, they simply make it harder for humans and applications such as malware detectors to understand what it does.
Code can be obfuscated by adding unnecessary logic and complicated, circuitous constructs. This distracts the reader (be they human or machine) by hiding the code’s nature within labyrinths of difficult syntax that take a lot of time and effort to understand. If programs like antivirus tools can’t locate the signatures (the known code patterns) that they rely on for identification, then it’s easier for malware to slip through their defenses unchecked.
Instruction pattern/flow transformation
This technique swaps normal instructions for more complicated versions that perform the same function but take up more space. An example of this in everyday language would be like using a convoluted term such as, ‘motorized mobile appliance for rapid oxidation events mitigation,’ instead of the simpler ‘fire engine.’
Here’s how a simple snippet looks before the transformation:
var greeting = 'Hello World';
greeting = 10;
var product = greeting * greeting;
And here’s how it looks after:
They’ll both do the same thing, but the second version looks as clear as mud. The technique has turned the original snippet into something unreadable.
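As a rough, hand-written illustration of the kind of result an instruction pattern transformation might produce (this is not the output of any particular obfuscator, and the function names `original` and `transformed` are made up for the example), the three-line greeting snippet can be rewritten so that each instruction is swapped for a longer equivalent:

```javascript
// Plain version, matching the three-line snippet in the text.
function original() {
  var greeting = 'Hello World';
  greeting = 10;
  var product = greeting * greeting;
  return product;
}

// Hand-obfuscated equivalent: same result, longer and harder to read.
function transformed() {
  var _a = ['Wor', 'Hel', 'ld', 'lo '];
  var _g = _a[1] + _a[3] + _a[0] + _a[2]; // reassembles 'Hello World'
  _g = 0x14 >> 1;                         // 20 shifted right once is 10
  var _p = 0;
  for (var _i = 0; _i < _g; _i++) {       // repeated addition stands in for _g * _g
    _p += _g;
  }
  return _p;
}
```

Both functions return the same product; only the path taken to reach it differs.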
As we know, Magecart attackers typically exploit supply chain vulnerabilities to inject malicious scripts into shopping websites and content management systems, but in mid-2022, Microsoft researchers discovered that they were trying something new. They obfuscated a skimming script by first encoding it in PHP. They then embedded this in an image file, most probably in an attempt to make use of the PHP calls that occur when the index page is loaded on a website.
The CSS documents were retrieved from a URL that would have been flagged as suspicious had it been noticed, but again, it was hidden. The attackers used a fake URL that looked like a typical Facebook CDN endpoint as a distraction from the real one, along with a number array to represent specific indexes. They could then pick characters from the decoy to assemble their malicious URL and call for the CSS documents.
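The index-array trick described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the attackers’ actual code: the decoy string, the target string, and the helper names `toIndexes` and `fromIndexes` are all hypothetical, and the `.example` domains are reserved names that resolve nowhere.

```javascript
// Hypothetical decoy that merely looks like a CDN endpoint.
const decoy = "https://static.facebook-cdn.example/en_US/sdk.js";

// For each character of the target, record where it first occurs in the decoy.
function toIndexes(source, target) {
  return Array.from(target, (ch) => {
    const pos = source.indexOf(ch);
    if (pos === -1) throw new Error(`decoy lacks character: ${ch}`);
    return pos;
  });
}

// Rebuild the hidden string by picking characters out of the decoy.
function fromIndexes(source, indexes) {
  return indexes.map((i) => source[i]).join("");
}

// Only the decoy and a list of numbers would appear in the script,
// so a search for the real address finds nothing.
const hidden = toIndexes(decoy, "sdk.example/a.css");
const rebuilt = fromIndexes(decoy, hidden);
```

For defenders, the takeaway is that scanning a script for known-bad URLs is not enough when the URL only exists after the code runs.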
To gather the sensitive user data, they installed event listeners to record what users typed, store it locally, and transmit it to a server they controlled, then they covered their tracks by removing everything they’d used. Once again, it’s the layering of methods that makes this type of attack so hard to unpick.
Inserting dummy code
The dummy code makes it harder to reverse-engineer the script. Filling it with extra instructions doesn’t change the way the application works but makes the job of understanding the logic behind it a lot more difficult.
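A minimal sketch of the idea, assuming a made-up `discount` function as the real logic: the padded version runs extra instructions that compute a throwaway value, yet it returns exactly the same result.

```javascript
// The real logic: apply a percentage discount to a price.
function discount(price, percent) {
  return price - (price * percent) / 100;
}

// Same logic padded with dummy code. The 'noise' value is computed
// and updated, but it never feeds into the returned result.
function discountPadded(price, percent) {
  let noise = price ^ 0x5a5a;            // dummy: arbitrary bit twiddling
  const filler = [3, 1, 4, 1, 5];
  for (const n of filler) {
    noise = (noise + n) % 97;            // dummy: busywork loop
  }
  const result = price - (price * percent) / 100;
  noise += result;                       // dummy: reads the result, changes nothing
  return result;
}
```

Anyone reverse-engineering `discountPadded` has to work out that `noise` is irrelevant before the real calculation becomes clear.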
Opaque predicate insertion
This involves adding conditional branches whose outcome is fixed and known to the obfuscator in advance, so unlike the dummy code in the previous technique, the code behind them takes up space but will never actually be run. Its job is to confuse the reader with redundant statements, usually lots of extra if-then-else branches.
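Here is a small sketch of an opaque predicate, with hypothetical function names. It relies on the fact that the product of two consecutive integers is always even, so the guarded branch can never execute, though a reader or a simple scanner has to prove that before ignoring it.

```javascript
// The real check.
function isCheckoutPath(path) {
  return path.endsWith("/checkout");
}

// Same check behind an opaque predicate.
function isCheckoutPathOpaque(path) {
  const n = path.length;
  // n * (n + 1) is a product of consecutive integers, so it is always even
  // and this condition is always false.
  if ((n * (n + 1)) % 2 !== 0) {
    // Dead branch: plausible-looking code that never runs.
    return path.split("").reverse().join("") === "tuokcehc/";
  }
  return path.endsWith("/checkout");
}
```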
Arithmetic code obfuscation
This takes simple arithmetic and logical code and replaces it with more complex equivalents that again make it hard to unravel. For example, this snippet calculates the sum and average of 50 numbers:
int i=1, sum=0, avg=0;
while (i <= 50)
{
  sum+=i; avg=sum/i; i++;
}
But if we add a conditional variable, it becomes much more difficult to work out what the code is doing, because analyzing the function would require knowledge of the initial input.
In this snippet, the conditional variable ‘sneaky’ makes the code structure more complex and so a lot harder to understand:
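int i, sum, avg, sneaky = 1;
while (sneaky != 0)
{
  switch (sneaky)
  {
    case 1: i=1; sum=0; avg=0; sneaky = 2; break;          /* initialize */
    case 2: if (i <= 50) sneaky = 3; else sneaky = 0; break; /* loop test */
    case 3: sum+=i; avg=sum/i; i++; sneaky = 2; break;     /* loop body */
  }
}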
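The arithmetic substitution itself can also be shown in isolation. The sketch below assumes small non-negative integers (JavaScript’s bitwise operators work on 32-bit values) and uses hypothetical function names; it replaces `a + b` with the equivalent bitwise identity `(a ^ b) + 2 * (a & b)`.

```javascript
// Plain addition.
function addPlain(a, b) {
  return a + b;
}

// Obfuscated addition: XOR adds the bits without carries,
// AND shifted left by one supplies the carries.
function addObfuscated(a, b) {
  return (a ^ b) + ((a & b) << 1);
}

// Summing 1..50 with the obfuscated operator gives the same total.
function sumTo50() {
  let sum = 0;
  for (let i = 1; i <= 50; i++) {
    sum = addObfuscated(sum, i);
  }
  return sum;
}
```

Chaining several such identities quickly turns a one-line calculation into something that has to be simplified by hand or by tooling before it makes sense.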
Code transposition obfuscation
This technique shuffles routines and branches within the code in a random fashion without affecting its execution. Malware writers like to use this method to evade detection by antivirus software.
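A minimal sketch of transposition, using invented names: the three steps of `plain` are written out of order in `transposed` and stitched back together with explicit labels, so the text of the program no longer matches the order in which it runs.

```javascript
// Steps in natural reading order: A, then B, then C.
function plain(values) {
  let acc = 0;
  for (const v of values) acc += v; // step A: sum the inputs
  acc *= 2;                          // step B: double the sum
  return acc + 1;                    // step C: add one
}

// The same steps written in the order C, A, B. Each block returns
// the label of the block that should run next (or null to stop).
function transposed(values) {
  const blocks = {
    c: (s) => { s.result = s.acc + 1; return null; },
    a: (s) => { for (const v of s.values) s.acc += v; return "b"; },
    b: (s) => { s.acc *= 2; return "c"; },
  };
  const state = { values, acc: 0, result: undefined };
  let label = "a";
  while (label !== null) {
    label = blocks[label](state);
  }
  return state.result;
}
```

Because the byte sequence of the shuffled version differs from the original, a scanner matching on a fixed pattern of instructions may miss it even though the behavior is identical.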
The Hunter technique (with obfuscation)
In truth, we’re not sure if this is a widespread technique, but it exists and it has to be called something. Malwarebytes discovered this two-stage attack and called it Hunter because the code can be found under that name on GitHub. First, the attackers inject code into the website’s source. That code calls out to a remote URL, which loads their skimmer during checkout.
This URL contained more code that was obfuscated by Hunter. The de-obfuscated code revealed what looked like HTML code with forms denoting credit card fields. This was the skimmer itself, and it introduced credit card fields to the form that wouldn’t normally be there. The skimmer could then steal the credit card data, encode it, store it inside a cookie, and finally exfiltrate it via a POST request.
Coming Soon: The Reflectiz Script Deobfuscator
The Reflectiz platform provides a range of tools for detecting the malicious code alterations at the core of Magecart malware attacks. We are constantly searching for new ways to assist organizations in enhancing their security posture, and cracking new obfuscation techniques is our foremost goal. That’s why we’ve developed a cutting-edge deobfuscator tool for free use. It will be released in the next few weeks so that you can witness firsthand the power of this tool.