Trojan Source

Nikhil Gowda
Nov 10, 2021

According to new research, virtually all compilers (programs that convert human-readable source code into computer-executable machine code) are vulnerable to an insidious attack in which an adversary can introduce targeted vulnerabilities into any software without being detected. The flaw, uncovered by researchers at the University of Cambridge, affects most compilers and many software development environments. Multiple firms were involved in the vulnerability disclosure, and some are now delivering fixes. The issue stems from a component of the Unicode digital text encoding standard, which allows computers to exchange text regardless of the language used. Currently, Unicode defines about 143,000 characters across 154 scripts (in addition to many non-script character sets, such as emojis).

The flaw is in Unicode's bidirectional (Bidi) algorithm, which manages the display of text in mixed scripts with differing display orders, such as Arabic, which is read right to left, and English, which is read left to right. Computer systems must be able to resolve conflicting text directionality in a deterministic manner, and the default ordering given by the Bidi algorithm may not be sufficient in some cases. A "Bidi override" can be used to force left-to-right text to display right to left, and vice versa. According to the Cambridge researchers, “Bidi override control characters enable modifying the display ordering of groups of characters in various situations.”
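To make these control characters concrete, here is a minimal Python sketch that lists Unicode's directional formatting characters together with the Bidi class and name reported by the standard `unicodedata` module. The list covers embeddings and isolates as well as the overrides discussed above. The names `BIDI_CONTROLS` and `describe` are illustrative choices, not taken from the research.

```python
import unicodedata

# Unicode directional formatting characters: embeddings, overrides,
# isolates and their terminators. The constant name is illustrative.
BIDI_CONTROLS = "\u202A\u202B\u202C\u202D\u202E\u2066\u2067\u2068\u2069"


def describe(ch: str) -> str:
    """Return 'U+XXXX <Bidi class> <character name>' for one character."""
    code_point = f"U+{ord(ch):04X}"
    return f"{code_point} {unicodedata.bidirectional(ch)} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in BIDI_CONTROLS:
        print(describe(ch))
```

None of these characters is visible when printed on its own, which is why they are easy to overlook in rendered text.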

Bidi overrides allow even single-script characters to be shown in a different order than their logical encoding. According to the researchers, this characteristic has already been exploited to conceal the file extensions of malware distributed over email. The issue for source code is that most programming languages allow Bidi overrides in comments and strings. Comments are a problem because compilers and interpreters disregard all text inside them, including control characters. String literals are a problem because most programming languages allow any characters in them, control characters included.
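The file-extension trick can be sketched in a few lines of Python. The attachment name below is hypothetical and chosen only for illustration. A Bidi-aware display would be expected to show it as something like "invoice_exe.pdf", while the name as stored still ends in ".exe".

```python
import os

# U+202E RIGHT-TO-LEFT OVERRIDE: text after it is displayed right to left.
RLO = "\u202E"

# Hypothetical attachment name, for illustration only. The characters after
# the override are stored as "fdp.exe" but displayed in reverse order.
disguised_name = "invoice_" + RLO + "fdp.exe"


def logical_extension(filename: str) -> str:
    """Return the extension of the name as stored, ignoring display order."""
    return os.path.splitext(filename)[1]


def contains_override(filename: str) -> bool:
    """Report whether the name contains a right-to-left override."""
    return RLO in filename


if __name__ == "__main__":
    print(logical_extension(disguised_name))
    print(contains_override(disguised_name))
```

Checking the stored name rather than the displayed one is what exposes the real extension.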

While both comments and strings have syntax-specific semantics marking their start and finish, these bounds are not respected by Bidi overrides, according to the research paper, which named the vulnerability "Trojan Source." By using Bidi override characters exclusively within comments and strings, an attacker can smuggle them into source code in a way that most compilers accept. A human code reviewer can have a hard time detecting such an attack because the rendered source code looks perfectly acceptable.
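Since a reviewer cannot rely on what the editor renders, one simple defence is to search the raw text for these characters. The following is a minimal sketch of such a check, not the researchers' own tooling. The function name `find_bidi_controls` and the sample snippet are assumptions made for illustration.

```python
import unicodedata
from typing import List, Tuple

# Bidi classes of the directional formatting characters
# (embeddings, overrides, isolates and their terminators).
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def find_bidi_controls(source: str) -> List[Tuple[int, int, str]]:
    """Return (line, column, character name) for each Bidi control found.

    Lines and columns are 1-based and count code points, not bytes.
    """
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES:
                findings.append((line_no, col_no, unicodedata.name(ch)))
    return findings


if __name__ == "__main__":
    # Hypothetical snippet with an override hidden inside a comment.
    sample = "is_admin = False\n# check access \u202E later\n"
    for line_no, col_no, name in find_bidi_controls(sample):
        print(f"line {line_no}, column {col_no}: {name}")
```

A check like this could run in a pre-commit hook or a CI step. It flags every occurrence, including legitimate uses in right-to-left text, so findings need human review.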

Because these tactics can be used to launch powerful supply-chain attacks, it is critical for enterprises involved in the software supply chain to establish defences. The vulnerability is genuine, but it also illustrates an even greater weakness: modern code's reliance on constantly changing dependencies and packages. The good news is that the researchers carried out a comprehensive check and found no evidence that anyone had yet exploited the flaw.

As for what needs to be done about Trojan Source, the researchers recommend that governments and businesses that rely on critical software identify their suppliers' security postures, put pressure on them to implement adequate defences, and make sure that any gaps are covered by controls elsewhere in their toolchain. At Cybersecurity Link, we offer a free posture assessment worth $5000. It helps you assess the gaps in your suppliers' and vendors' security and the steps needed to avoid such an attack. To learn more about our gap assessment, please visit: https://www.cybersecuritylink.com.au/gap-analysis.
