Should I use accented characters in URLs?

There’s no ambiguity here: RFC 3986 says no. URIs cannot contain Unicode characters, only ASCII. How browsers represent encoded characters when displaying a URI is an entirely different matter; for example, some browsers will display a space in a URL instead of ‘%20’. This is how IDN works too: punycoded strings are encoded and … Read more
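As a rough illustration of the point above, here is a minimal Java sketch that turns a host and path containing accented characters into an ASCII-only URL. The class name UrlEncodingSketch and the method toAsciiUrl are hypothetical names chosen for this example; java.net.IDN and java.net.URI are standard JDK classes. It assumes the non-ASCII path characters should be percent-encoded as UTF-8, which is what URI.toASCIIString() does.

```java
import java.net.IDN;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not taken from the original answer.
final class UrlEncodingSketch {

    // Builds an ASCII-only URL from parts that may contain non-ASCII characters.
    static String toAsciiUrl(String scheme, String host, String path) {
        try {
            // Convert an internationalized host name to its Punycode (ACE) form.
            String asciiHost = IDN.toASCII(host);
            // The multi-argument constructor quotes illegal characters such as
            // spaces; toASCIIString() then percent-encodes any remaining
            // non-ASCII characters as UTF-8 octets.
            return new URI(scheme, asciiHost, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build URI", e);
        }
    }

    public static void main(String[] args) {
        // U+00FC is u with diaeresis, U+00E9 is e with acute accent.
        System.out.println(toAsciiUrl("https", "b\u00fccher.example", "/caf\u00e9 menu"));
    }
}
```

With these inputs the host becomes xn--bcher-kva.example and the path becomes /caf%C3%A9%20menu, so what goes over the wire is plain ASCII even if a browser chooses to display the accented form.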

Remove diacritical marks (ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ) from Unicode chars

I have done this recently in Java:

    public static final Pattern DIACRITICS_AND_FRIENDS =
        Pattern.compile("[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+");

    private static String stripDiacritics(String str) {
        str = Normalizer.normalize(str, Normalizer.Form.NFD);
        str = DIACRITICS_AND_FRIENDS.matcher(str).replaceAll("");
        return str;
    }

This will do as you specified: stripDiacritics("Björn") = Bjorn, but it will fail on, for example, Białystok, because the ł character is not a base letter plus a diacritic; it has no decomposition for NFD to split apart. If … Read more
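One possible way to cover letters such as ł is to add an explicit replacement table after the normalization step. The sketch below is only an assumption about how that could look, since the rest of the answer is truncated: the class FoldSketch, the method fold, and the contents of the EXTRA table are all hypothetical, and the table is deliberately tiny rather than complete.

```java
import java.text.Normalizer;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not the original author's code.
final class FoldSketch {

    private static final Pattern MARKS =
        Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    // Letters that NFD does not decompose (illustrative subset only):
    // U+0142/U+0141 l with stroke, U+00F8/U+00D8 o with stroke,
    // U+0111 d with stroke, U+00DF sharp s.
    private static final Map<Character, String> EXTRA = Map.of(
        '\u0142', "l", '\u0141', "L",
        '\u00f8', "o", '\u00d8', "O",
        '\u0111', "d", '\u00df', "ss");

    static String fold(String input) {
        // Step 1: split accented letters into base letter + combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Step 2: drop the combining marks.
        String stripped = MARKS.matcher(decomposed).replaceAll("");
        // Step 3: replace the remaining special letters from the table.
        StringBuilder out = new StringBuilder(stripped.length());
        for (int i = 0; i < stripped.length(); i++) {
            char c = stripped.charAt(i);
            String replacement = EXTRA.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, fold("Björn") gives Bjorn and fold("Białystok") gives Bialystok. Any letter missing from the table passes through unchanged.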

Converting Symbols, Accent Letters to English Alphabet

Reposting my post from How do I remove diacritics (accents) from a string in .NET? This method works fine in Java (purely for the purpose of removing diacritical marks, aka accents). It basically converts all accented characters into their unaccented counterparts followed by their combining diacritics. Now you can use a regex to strip off … Read more

Is there a way to get rid of accents and convert a whole string to regular letters?

Use java.text.Normalizer to handle this for you.

    string = Normalizer.normalize(string, Normalizer.Form.NFD);
    // or Normalizer.Form.NFKD for a more "compatible" decomposition

This will separate all of the accent marks from the characters. Then you just need to throw out every character that is not ASCII, which removes the separated marks:

    string = string.replaceAll("[^\\p{ASCII}]", "");

If your … Read more
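To make the difference between the two normalization forms concrete, here is a small self-contained sketch. The class AsciiFoldSketch and its method toAscii are hypothetical names for this example; the two calls inside are the ones from the excerpt above. It shows that NFKD also expands compatibility characters such as the "fi" ligature (U+FB01), whereas with NFD the ligature stays intact and is then deleted by the ASCII filter.

```java
import java.text.Normalizer;

// Hypothetical wrapper around the two calls quoted in the answer.
final class AsciiFoldSketch {

    static String toAscii(String input, Normalizer.Form form) {
        // Decompose, then delete everything outside the ASCII range.
        String decomposed = Normalizer.normalize(input, form);
        return decomposed.replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        String cafe = "caf\u00e9";   // e with acute accent
        String fin = "\ufb01n";      // "fi" ligature followed by n
        System.out.println(toAscii(cafe, Normalizer.Form.NFD));
        System.out.println(toAscii(fin, Normalizer.Form.NFD));
        System.out.println(toAscii(fin, Normalizer.Form.NFKD));
    }
}
```

Note that the ASCII filter silently drops any character that does not decompose to ASCII, so this approach loses information for text outside the Latin script.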

How do I remove diacritics (accents) from a string in .NET?

I’ve not used this method, but Michael Kaplan describes a method for doing so in his blog post (with a confusing title) that talks about stripping diacritics: Stripping is an interesting job (aka On the meaning of meaningless, aka All Mn characters are non-spacing, but some are more non-spacing than others) static string RemoveDiacritics(string text) … Read more
