What is Punycode?
Punycode is an encoding that converts words that cannot be written in ASCII, like the Greek word for thank you, ‘ευχαριστώ’, into an ASCII representation, like ‘xn--mxahn5algcq2e’, for use in domain names.
What does this actually mean?!
Writing with numbers
As with all things computers, it all boils down to numbers. Every letter, character, or emoji we type has a unique number associated with it so that our computers can process it. ASCII, a character encoding standard, uses 7 bits to encode 128 characters, enough for the English alphabet in upper and lower case, the digits 0-9, and some punctuation and control characters. Where ASCII falls down is that it does not support scripts such as Greek, Hebrew, and Arabic. This is where Unicode comes in: it defines more than 1.1 million possible code points (U+0000 to U+10FFFF), which encodings such as UTF-8 and UTF-32 store in up to 32 bits each. Unicode gives us enough room to support any language and even our ever-growing collection of emojis.
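To make the "it's all numbers" point concrete, here is a small Python sketch that looks up the number behind a few characters. The helper name `describe` and the sample characters are our own choices for illustration.

```python
import unicodedata

def describe(ch):
    """Return (code point, is ASCII?, bytes needed in UTF-8) for one character."""
    return ord(ch), ch.isascii(), len(ch.encode("utf-8"))

# A Latin letter, a Greek letter, and an emoji.
for ch in ["T", "\u03c4", "\U0001F600"]:
    code_point, is_ascii, utf8_len = describe(ch)
    print(f"U+{code_point:04X} {unicodedata.name(ch)}: ASCII={is_ascii}, UTF-8 bytes={utf8_len}")
```

Only the first character fits into ASCII's 7 bits; the other two need Unicode.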
So where does Punycode come in?
Punycode is a way of converting words that cannot be written in ASCII into an ASCII-only encoding of their Unicode characters. Why would you want to do this? The global Domain Name System (DNS), the naming system for any resource connected to the internet, is limited to ASCII characters. With Punycode, you can include non-ASCII characters in a domain name: each label is run through an encoding algorithm known as “Bootstring”, and the result is marked with the xn-- prefix.
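As a minimal sketch of this conversion, Python's standard library ships a `punycode` codec that implements the Bootstring step. The helper names `to_ascii_label` and `to_unicode_label` are ours, and the sketch skips the normalization and validation that full internationalized domain name handling performs before encoding.

```python
ACE_PREFIX = "xn--"  # marks a label as Punycode-encoded

def to_ascii_label(label):
    """Encode one domain label; ASCII-only labels pass through unchanged."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")

def to_unicode_label(label):
    """Decode one domain label if it carries the xn-- prefix."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label

print(to_ascii_label("ευχαριστώ"))  # the Greek word for thank you
print(to_ascii_label("example"))    # already ASCII, so unchanged
```

Encoding and decoding are inverses of each other, so `to_unicode_label(to_ascii_label(x))` gives back `x`.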
How does a Punycode attack work?
Unicode characters can look the same to the naked eye but are actually different characters, so they produce a different web address. Some letters in the Latin alphabet, used by the majority of modern languages, are the same shape as letters in Greek, Cyrillic, and other alphabets, so it’s easy for an attacker to register a domain name that replaces some ASCII characters with lookalike Unicode characters. For example, you could swap a normal T for a Greek tau, τ. The user would see an almost identical T symbol, but the Punycode behind it, read by the computer, is different: a label consisting only of τ is encoded as xn--5xa. Depending on how the browser renders this information in the address bar, these sneaky little characters can be impossible for us humans to identify.
This technique is called a homograph attack. The URL will look legitimate, and the content on the page might appear the same on the face of it, but it’s actually a different website set up to steal the victim’s sensitive data or to infect the user’s device. These attacks use common techniques like phishing, forced downloads, and scams.
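One way to reason about lookalike characters is to ask which alphabets a label draws from. The sketch below is a crude approximation under stated assumptions: Python's standard library does not expose the Unicode script property, so it uses the first word of each letter's official name (LATIN, GREEK, CYRILLIC) as a stand-in. The function names are hypothetical.

```python
import unicodedata

def scripts_in(label):
    """Approximate the set of scripts used by the letters in a label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "GREEK SMALL LETTER TAU" -> "GREEK"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_mixed(label):
    """True when letters from more than one script appear in the same label."""
    return len(scripts_in(label)) > 1

print(sorted(scripts_in("\u03c4witter")))  # Greek tau followed by Latin letters
print(looks_mixed("twitter"))
```

A mixed-script label is a useful warning sign, but it is not proof of an attack, and a label written entirely in one non-Latin script would not be flagged by this check.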
Just Browsing – Is Punycode an issue on all browsers?
Punycode-encoded labels start with xn--, known as the ASCII Compatible Encoding (ACE) prefix, which tells software that the label uses Punycode to represent Unicode characters. By default, many web browsers show this encoded form in the address bar for suspicious domains instead of translating it to Unicode. This is a measure to defend against homograph phishing attacks. However, not all browsers display the Punycode form, leaving visitors none the wiser.
Hackers can exploit browsers that don’t display the prefix to pass off their fake domain names as the websites of legitimate services and steal login credentials, credit card numbers, and other sensitive information from users.
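Because the prefix is plain ASCII, spotting it in a hostname takes very little code. This is a minimal sketch; the function name `punycode_labels` and the sample hostnames are made up for illustration.

```python
def punycode_labels(hostname):
    """Return the labels of a hostname that carry the xn-- prefix."""
    return [label for label in hostname.lower().split(".") if label.startswith("xn--")]

print(punycode_labels("www.xn--5xa.example"))
print(punycode_labels("www.example.com"))
```

An empty list means every label in the hostname is ordinary ASCII.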
In one example, Chinese security researcher Xudong Zheng discovered a loophole that allowed him to bypass this protection by registering the domain name xn--80ak6aa92e.com, which was displayed as “apple.com” by all vulnerable web browsers. At the time, these included Chrome, Firefox, and Opera. Internet Explorer, Microsoft Edge, Apple Safari, Brave, and Vivaldi were not vulnerable.
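To see what hides behind that domain, the sketch below decodes the label with Python's built-in `punycode` codec and prints the official name of each character. The variable names are ours.

```python
import unicodedata

label = "xn--80ak6aa92e"

# Strip the xn-- prefix, then decode the remainder back to Unicode.
decoded = label[len("xn--"):].encode("ascii").decode("punycode")

for ch in decoded:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(decoded == "apple")  # the label only looks like the ASCII word
```

Every character in the decoded label is Cyrillic, so a check that only looks for mixed alphabets within a label would not flag it.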
Our current research shows the following behavior on the two major web browsers Chrome and Safari:
- Chrome – often displays the untranslated punycode with the prefix. When it is not sure whether or not the site is suspicious, it will not translate into Unicode but still allows you to go to the site. When it is sure the site is malicious, it will issue a warning “deceptive site ahead”.
- Safari – most of the time translates the punycode to Unicode characters. When it is sure that the site is malicious, it will issue a warning “deceptive site ahead” but still translate the punycode to Unicode characters.
Do Punycode attacks work on Mobile Apps?
Punycode attacks can take place on both desktop and mobile, as browser developers tend to treat Punycode the same way across all platforms. In short, if a browser displays Unicode to a user on one device, it does so on all platforms. Most current research into Punycode focuses on how browsers treat these domains, but our research goes beyond the browser to demonstrate that the way apps treat Punycode is just as important.
In our testing, we observed that deceptive Punycode domains were not flagged as suspicious by communication and collaboration tools widely used by employees. We tested the following apps on iOS and Android devices: Gmail, Apple Mail, iMessage, Message+, WhatsApp, Facebook Messenger, Skype, and Instagram. Only Facebook Messenger, Instagram, and Skype gave the user an opportunity to identify the Punycode URL, either by showing a preview of the webpage with the xn-- prefix or, in the case of Skype, by not providing a hyperlink for domains using Unicode, meaning users can’t click through from the message. While these apps do not provide the best methods of defense, they at least provide an opportunity to assess suspicious links more closely.
So it seems that, by displaying the deceptive Unicode, the majority of apps are choosing an enhanced user experience over the security needed to catch malicious sites. Some of the responsibility should fall upon the developers of these apps to ensure multiple layers of security are enforced to effectively defend against these attacks.
Why are Punycode attacks a bigger problem on mobile?
Our research into Punycode attacks on mobile identified a number of new malicious domains (listed below). Not only are these sites hosting phishing attacks on domains that are visually deceptive to users, but they are optimized for mobile, meaning hackers are aware of the difficulties faced by mobile users in identifying deceptive URLs. By targeting mobile users, these attacks are resulting in more successful phishing campaigns.
Phishing attacks are generally more difficult to detect on mobile for a number of reasons, and detection becomes nearly impossible when Punycode is introduced and rendered as Unicode:
- Smaller screen size leaves less space to evaluate the legitimacy of a website
- OS design typically hides the already tiny address bar as the user scrolls down to make room for the page content
- Distracted users tend to rush through various pages and notifications
- There is no mouse-over or preview functionality, which prevents the user from seeing or evaluating the link destination before clicking
It’s getting emotional – How do Emoji domains factor in?
In the same way that the special characters of different languages are encoded as Punycode, so too can the ever-growing library of emojis. An emoji domain is literally a domain with an emoji in its name, and Punycode is essential for this.
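As a small illustration, the same built-in Python codec that handles Greek or Cyrillic letters also handles emojis. The emoji chosen here and the variable names are arbitrary.

```python
emoji_label = "\U0001F600"  # the "grinning face" emoji

# Encode the emoji label and add the xn-- prefix used in domain names.
ace_label = "xn--" + emoji_label.encode("punycode").decode("ascii")
print(ace_label)

# Decoding returns the original emoji.
round_trip = ace_label[len("xn--"):].encode("ascii").decode("punycode")
print(round_trip == emoji_label)
```

What the DNS stores is the ASCII form; whether a user sees the emoji or the xn-- form depends on the app displaying it.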
Here’s a recent example identified by Jamf's intelligent machine learning machine, MI:RIAM:
In some of the examples we have seen, the sites display competitions that offer prizes in exchange for sharing a link over WhatsApp, and sometimes they redirect the user to other scam pages when the user hits the back button multiple times. In other cases, the pages immediately redirect to other sites displaying app download advertisements of software updates.
Shortly after discovery and documentation, the content from most of these sites was removed. This is proof of how fast hackers are moving and is consistent with other forms of phishing attacks we are seeing.
Our research shows a new phishing site is created every 20 seconds, and these sites are usually only live for four hours before hackers take them down and move on to create another deceptive domain. It is a clever way to cover their tracks and evade detection.
7 Ways to avoid a Punycode attack
- Be cautious if the site presses you to do something quickly. This is a classic strategy by hackers to rush their potential victims so that they are less likely to notice anything suspicious. Often they will offer a "limited time only" deal, and make it difficult to exit the page with ‘are you sure you want to exit’ pop-ups: these are all tactics to make you stay on their site longer and give them your details.
- If you are being offered a deal, go to the original company site and check whether it’s available there as well. If not, it’s most likely a scam doing its best to mimic the established brand and trick visitors into handing over their details.
- If some of the letters in the address bar look weird, or the website design looks different, retype the address or visit the original company URL in a new tab to compare. Strange-looking letters in the address bar are a key indicator that Punycode is being used to trick you into thinking you are visiting a well-established brand site when in fact you are being taken to a malicious site.
- Use a password manager. It fills in credentials only on the domain they were saved for, which reduces the risk of pasting passwords into dodgy sites.
- Force your browser to display Punycode names. This option is available in Firefox (set network.IDN_show_punycode to true in about:config).
- Click on the padlock to view and inspect the HTTPS certificate.
- Use a mobile security solution. Jamf, for example, uses MI:RIAM’s machine learning and artificial intelligence to monitor all data traffic and to detect and block phishing links such as these.
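If you want to check a link yourself before clicking it, the idea behind "show me the Punycode name" can be sketched in a few lines of Python. The function names `ascii_host` and `uses_punycode` are hypothetical, and the sketch skips the normalization that a real browser applies, so treat it as an illustration rather than a security control.

```python
from urllib.parse import urlsplit

def ascii_host(url):
    """Return the hostname of a URL with any non-ASCII labels Punycode-encoded."""
    host = urlsplit(url).hostname or ""
    labels = []
    for label in host.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)

def uses_punycode(url):
    """True when any label of the hostname needs the xn-- form."""
    return any(label.startswith("xn--") for label in ascii_host(url).split("."))

# A lookalike address written with Cyrillic letters, then a genuine ASCII one.
print(ascii_host("https://\u0430\u0440\u0440\u04cf\u0435.com/login"))
print(uses_punycode("https://www.apple.com/"))
```

A hostname that needs the xn-- form is not automatically malicious, since many legitimate sites use internationalized names, but it is worth a second look when the name on screen appears to be a well-known ASCII brand.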