This repository was archived by the owner on Dec 15, 2022. It is now read-only.

Consider unicode identifiers support #414

Open
@MaximSokolov

Unicode identifiers are already supported in TextMate grammars, notably in language-babel.
I've tested this regex and it seems to work fine (see it in Lightshow):
[$_\\p{L}\\p{Nl}][$\\p{L}\\p{Nl}\\p{Mn}\\p{Mc}\\p{Nd}\\p{Pc}\\x{200C}\\x{200D}]*

function a () { }
function foo123 () { }
function $ () { }
function $$abc$$ () { }
function FOO () { }
function _foo_ () { }
function $foo_foo$ () { }

function π() {  }
function ლ_ಠ益ಠ_ლ() {}
function абв() {}
function d‿d() {} // \p{Pc}
function Oo̶O()  {} // \p{Mn}
function _ැ_() {} // \p{Mc}
function می‌خواهم() {} // \x{200C}
function _ണ്‍_() {} // \x{200D}, valid in ECMAScript 6/Unicode 8.0.0, but not in ES3
function _۴_() {} // \p{Nd}
function () {} // \p{Nl}
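
The pattern above is written in Oniguruma syntax, as used by TextMate-style grammars. As a rough way to try the same character classes outside an editor, here is a minimal sketch that ports it to a JavaScript `RegExp`. It assumes a runtime with Unicode property escapes (the `u` flag, ES2018), replaces `\x{200C}`/`\x{200D}` with `\u200C`/`\u200D`, and adds `^`/`$` anchors so the whole string must match. `isIdentifierName` is a hypothetical helper name, not part of any grammar, and it does not check reserved words.

```javascript
// Port of the proposed pattern to a JavaScript RegExp (assumption: ES2018+ runtime).
// First class: characters allowed at the start of an identifier.
// Second class: characters allowed after the first one.
const IDENTIFIER_RE =
  /^[$_\p{L}\p{Nl}][$\p{L}\p{Nl}\p{Mn}\p{Mc}\p{Nd}\p{Pc}\u200C\u200D]*$/u;

// Hypothetical helper: true if `name` matches the proposed identifier pattern.
function isIdentifierName(name) {
  return IDENTIFIER_RE.test(name);
}

console.log(isIdentifierName("foo123"));    // true
console.log(isIdentifierName("$$abc$$"));   // true
console.log(isIdentifierName("d\u203Fd"));  // true  (U+203F is \p{Pc})
console.log(isIdentifierName("O\u0336o"));  // true  (U+0336 is \p{Mn})
console.log(isIdentifierName("\u0336o"));   // false (a combining mark cannot start a name)
console.log(isIdentifierName("1abc"));      // false (digits are only allowed after the first character)
```

The second class has no literal `_` because U+005F is itself in `\p{Pc}`, so underscores are still accepted after the first character.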


\p{L} matches any kind of letter from any language
\p{Nl} matches a number that looks like a letter, such as a Roman numeral
\p{Mn} matches a character intended to be combined with another character without taking up extra space (e.g. accents, umlauts)
\p{Mc} matches a character intended to be combined with another character that takes up extra space (vowel signs in many Eastern languages)
\p{Nd} matches a digit zero through nine in any script except ideographic scripts
\p{Pc} matches a punctuation character such as an underscore that connects words
\x{200C} matches the zero-width non-joiner (U+200C)
\x{200D} matches the zero-width joiner (U+200D)
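
To see which of these classes a given character falls into, something like the following sketch may help. It again assumes a JavaScript runtime with Unicode property escapes; `categoryOf` and `describeIdentifier` are hypothetical names used only for illustration, and only the categories listed above are checked.

```javascript
// General categories named above, in the order they are listed.
const CATEGORIES = ["L", "Nl", "Mn", "Mc", "Nd", "Pc"];

// One single-character matcher per category, e.g. /^\p{L}$/u.
const MATCHERS = CATEGORIES.map((name) => [
  name,
  new RegExp(`^\\p{${name}}$`, "u"),
]);

// Hypothetical helper: returns the first listed category that matches `ch`,
// "ZWNJ"/"ZWJ" for the two joiner code points, or null otherwise.
function categoryOf(ch) {
  if (ch === "\u200C") return "ZWNJ";
  if (ch === "\u200D") return "ZWJ";
  for (const [name, re] of MATCHERS) {
    if (re.test(ch)) return name;
  }
  return null; // not covered by these classes (this includes "$")
}

// Hypothetical helper: labels every code point of `name` with its category.
function describeIdentifier(name) {
  return Array.from(name, (ch) => [ch, categoryOf(ch)]);
}

console.log(describeIdentifier("d\u203Fd"));
console.log(describeIdentifier("_\u06F4_"));
```

`Array.from` iterates by code point rather than by UTF-16 unit, so characters outside the Basic Multilingual Plane are classified as a whole.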

Refs:
JavaScript variable name validator
Unicode Character Categories
What characters are valid for JavaScript variable names? [Stack Overflow]
