Replacement for JavaScript escape()?

escape() is defined in section B.2.1.2 of the ECMAScript specification, and the introductory text of Annex B says:

… All of the language features and behaviours specified in this annex have one or more undesirable characteristics and in the absence of legacy usage would be removed from this specification. …

For characters whose code unit value is 0xFF or less (apart from ASCII letters, digits and @*_+-./, which are left untouched), escape() produces a two-digit escape sequence: %xx. In effect, escape() converts a string containing only characters from U+0000 to U+00FF into a percent-encoded string using the Latin-1 encoding.
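
The difference between the two encodings is easy to see on a short Latin-1 string. This is only a sketch; the variable names are made up for illustration:

```javascript
// Illustrative only: escape() is the legacy function from Annex B.
const sample = "åäö"; // every character is at or below U+00FF

// One byte per character, matching the Latin-1 code points E5, E4, F6.
const viaEscape = escape(sample);

// Two bytes per character, because encodeURIComponent() encodes as UTF-8.
const viaUtf8 = encodeURIComponent(sample);

console.log(viaEscape); // %E5%E4%F6
console.log(viaUtf8);   // %C3%A5%C3%A4%C3%B6
```

A receiving application cannot tell from the URI alone which of the two encodings was meant, which is why the UTF-8 form is the safer choice.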

For code units above 0xFF, the four-digit format %uxxxx is used. That format is not allowed within the hfields part (where subject and body are stored) of a mailto: URI, as defined in RFC 6068:

mailtoURI    = "mailto:" [ to ] [ hfields ]
to           = addr-spec *("," addr-spec )
hfields      = "?" hfield *( "&" hfield )
hfield       = hfname "=" hfvalue
hfname       = *qchar
hfvalue      = *qchar
...
qchar        = unreserved / pct-encoded / some-delims
some-delims  = "!" / "$" / "'" / "(" / ")" / "*"
               / "+" / "," / ";" / ":" / "@"

unreserved and pct-encoded are defined in STD66:

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded   = "%" HEXDIG HEXDIG

A percent sign is only allowed if it is directly followed by two hex digits; a percent sign followed by u is not allowed.
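
The grammar above can be turned into a rough validity check. This is a sketch under the assumption that only the qchar rule matters for an hfvalue; the names HFVALUE and isValidHfvalue are hypothetical and not part of any library:

```javascript
// hfvalue = *qchar, with
// qchar   = unreserved / pct-encoded / some-delims
const HFVALUE = /^(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2}|[!$'()*+,;:@])*$/;

// Hypothetical helper: true if the string is a syntactically valid hfvalue.
function isValidHfvalue(value) {
  return HFVALUE.test(value);
}

const latin1Subject = escape("Swedish åäö"); // Swedish%20%E5%E4%F6
const emojiSubject = escape("Emoji 😎");      // Emoji%20%uD83D%uDE0E

console.log(isValidHfvalue(latin1Subject)); // true
console.log(isValidHfvalue(emojiSubject));  // false, because of %u
console.log(isValidHfvalue(encodeURIComponent("Emoji 😎"))); // true
```

Note that the first result only says the URI is syntactically valid; it says nothing about whether the receiver decodes the bytes as Latin-1.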

Using a self-implemented version that behaves exactly like escape() doesn’t solve anything. If you want that behaviour, just continue to use escape(); it won’t be removed anytime soon.


To summarise: your previous usage of escape() generated Latin-1 percent-encoded mailto: URIs if all characters were in the range U+0000 to U+00FF; otherwise it generated an invalid URI (which some applications might still interpret correctly, if they were written with compatibility with JavaScript's escape()/unescape() in mind).

It is more correct (no risk of creating invalid URIs) and more future-proof to generate UTF-8 percent-encoded mailto: URIs using encodeURIComponent(). Don’t use encodeURI(); it does not escape ?, /, …. RFC 6068 requires UTF-8 in many places (but allows other encodings for “MIME encoded words and for bodies in composed email messages”).
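
As a sketch of that recommendation, a small helper could assemble the URI. The name buildMailto and its parameters are hypothetical, and the sketch assumes the address is already a plain ASCII addr-spec that needs no encoding:

```javascript
// Hypothetical helper: builds a mailto: URI with UTF-8 percent-encoded hfields.
// Assumption: `to` is a plain ASCII address and is inserted as-is.
function buildMailto(to, subject, body) {
  const fields = [];
  if (subject) fields.push("subject=" + encodeURIComponent(subject));
  if (body) fields.push("body=" + encodeURIComponent(body));
  return "mailto:" + to + (fields.length ? "?" + fields.join("&") : "");
}

const link = buildMailto("user@example.com", "Emoji 😎", "a & b?");
console.log(link);
// mailto:user@example.com?subject=Emoji%20%F0%9F%98%8E&body=a%20%26%20b%3F

// encodeURI() leaves & and ? alone, so the body would be cut off at the &:
console.log(encodeURI("a & b?")); // a%20&%20b?
```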

Example:

const text_latin1 = "Swedish åäö";
const text_other = "Emoji 😎";

// escape(): Latin-1 percent-encoding, or invalid %uXXXX sequences above U+00FF
document.getElementById('escape-latin-1-link').href = "mailto:?subject=" + escape(text_latin1);
document.getElementById('escape-other-chars-link').href = "mailto:?subject=" + escape(text_other);
// encodeURIComponent(): UTF-8 percent-encoding, always a valid URI
document.getElementById('utf8-link').href = "mailto:?subject=" + encodeURIComponent(text_latin1);
document.getElementById('utf8-other-chars-link').href = "mailto:?subject=" + encodeURIComponent(text_other);

function mime_word(text) {
  const q_encoded = encodeURIComponent(text) // to UTF-8 percent-encoding
    .replace(/[_!'()*]/g, function (c) { return '%' + c.charCodeAt(0).toString(16).toUpperCase(); }) // also encode _ ! ' ( ) *, which encodeURIComponent leaves as-is
    .replace(/%20/g, '_') // MIME Q-encoding uses underscore for space
    .replace(/%/g, '='); // MIME Q-encoding uses equals instead of percent
  return encodeURIComponent('=?utf-8?Q?' + q_encoded + '?='); // add the MIME encoded-word wrapper and escape for the URI
}

// don't use mime_word for the body!
document.getElementById('mime-word-link').href = "mailto:?subject=" + mime_word(text_latin1);
document.getElementById('mime-word-other-chars-link').href = "mailto:?subject=" + mime_word(text_other);

<a id="escape-latin-1-link">escape()-latin1</a><br/>
<a id="escape-other-chars-link">escape()-emoji</a><br/>
<a id="utf8-link">utf8</a><br/>
<a id="utf8-other-chars-link">utf8-emoji</a><br/>
<a id="mime-word-link">mime-word</a><br/>
<a id="mime-word-other-chars-link">mime-word-emoji</a><br/>

For me, the UTF-8 links and the MIME-word links work in Thunderbird. Only the plain UTF-8 links work in the Windows 10 built-in Mail app and in my up-to-date version of Outlook.
