The answers might work but are terrible because they rely on Unicode ranges that are unreadable and somewhat “magic”: it’s not always clear where they come from or why they work, and they’re not resilient to new emojis being added to the spec.
Major browsers now support Unicode property escapes, which allow matching emojis based on whether they have the Emoji Unicode property: \p{Emoji} matches an emoji, \P{Emoji} matches a non-emoji.
Note that officially, 0123456789#* and other characters are emojis too, so the property escape you might want to use is not Emoji but rather Extended_Pictographic, which denotes all the characters typically understood as emojis!
Make sure to include the u flag at the end.
console.log(
/\p{Emoji}/u.test('flowers'), // false :)
/\p{Emoji}/u.test('flowers 🌼🌺🌸'), // true :)
/\p{Emoji}/u.test('flowers 123'), // true :(
)
console.log(
/\p{Extended_Pictographic}/u.test('flowers'), // false :)
/\p{Extended_Pictographic}/u.test('flowers 🌼🌺🌸'), // true :)
/\p{Extended_Pictographic}/u.test('flowers 123'), // false :)
)
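To illustrate why the u flag matters, here is a minimal sketch of what happens without it. The variable names are just for illustration. Without the flag, \p is not a property escape at all, so the pattern is read as the literal text p{Emoji}:

```javascript
// Same pattern, with and without the u flag (names are illustrative only).
const withoutFlag = /\p{Emoji}/
const withFlag = /\p{Emoji}/u

console.log(
  withoutFlag.test('🌼'),        // false: no property matching without u
  withoutFlag.test('p{Emoji}'),  // true: the pattern is treated as literal text
  withFlag.test('🌼'),           // true
)
```

So a forgotten flag does not throw; it just silently stops matching emojis.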
This works fine for detecting emojis, but if you want to use the same regex to extract them, you might be surprised by its behavior, since some emojis that appear as one character are actually made of several code points. They’re what we call emoji sequences; more about them in this question.
const regex = /\p{Extended_Pictographic}/ug
const family = '👨\u200D👩\u200D👧' // the "family" emoji: man, woman and girl joined by zero-width joiners (U+200D)
console.log(family.length) // not 1, but 8!
console.log(regex.test(family)) // true, as expected
console.log(family.match(regex)) // not [family], but [man, woman, girl]
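If you do need whole sequences, one possible approach (a sketch, not the only way) is to split the string into grapheme clusters first and then keep the clusters that contain a pictographic character. This assumes a runtime that provides Intl.Segmenter; extractEmojis is a hypothetical helper name, not part of any library.

```javascript
// Sketch: extract whole emoji sequences by segmenting into grapheme clusters.
// Assumes Intl.Segmenter is available; extractEmojis is a made-up helper name.
function extractEmojis(text) {
  const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' })
  const emojis = []
  for (const { segment } of segmenter.segment(text)) {
    // Keep a cluster if any of its code points is Extended_Pictographic.
    if (/\p{Extended_Pictographic}/u.test(segment)) emojis.push(segment)
  }
  return emojis
}

// Man, woman and girl joined by zero-width joiners (U+200D).
const familySequence = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}'
const found = extractEmojis(`flowers 🌼 and ${familySequence}`)
console.log(found.length) // 2: the flower and the whole family sequence
```

The trade-off is that this relies on the runtime's grapheme segmentation rather than on the regex alone, so results can depend on the Unicode version the runtime ships with.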