I’ve done some Google searches, but I get results related to encoding strings or files.

Can I write my Node.js JavaScript source code in UTF-8? Can I use non-ASCII characters in comments, strings, or as variable names?

ECMA-262 seems to require UTF-16 encoding, but Node.js won’t run a UTF-16 encoded .js file. It will, however, run UTF-8 source and correctly interpret non-ASCII characters.

So is this by design or by “accident”? Is it specified somewhere that UTF-8 source code is supported?

Asked Apr 12, 2012 at 14:02 by Nate
  • 1 I've never given this a second thought, but I constantly use UTF-8 for everything I do and have never had a problem. – Alex Turpin Commented Apr 12, 2012 at 14:05
  • 1 I expect that it's not so much a Node.js thing, but a V8 thing. – Pointy Commented Apr 12, 2012 at 14:07
  • 1 I was hoping someone could point to, say, Node.js or V8 documentation that says what source encodings are allowed. (Python example: python/dev/peps/pep-0263). Yeah, I can and did futz around and see what works, but I want a more concrete answer. – Nate Commented Apr 12, 2012 at 15:12
  • You're linking to a very old version of the spec (3rd rev. is from 1999, we just hit 6th rev. last June). The current version is here. The requirement is "Unicode" (with, by convention, ASCII being a subset of Unicode, since the first 128 code points in Unicode are the same as the ones ASCII specifies) – Mike 'Pomax' Kamermans Commented Sep 11, 2015 at 17:07
  • Hi @Nate, it seems some years have passed since you asked this question. I'm looking for something like the Python example you mentioned in the comment. Have you found a concrete answer in the meantime? – Daniele Ricci Commented Nov 11, 2021 at 12:42

2 Answers

Reference: http://mathiasbynens.be/notes/javascript-identifiers

Many non-ASCII Unicode characters (letters in particular) are valid in JavaScript variable names, so go ahead and save your source as UTF-8.

I can't find documentation that says that Node treats files as encoded in UTF-8, but it seems that way experimentally:

/* Check in your editor that this Javascript file was saved in UTF-8 */
var nonEscaped = "Планета_Зямля";
var escaped = "\u041f\u043b\u0430\u043d\u0435\u0442\u0430\u005f\u0417\u044f\u043c\u043b\u044f";
if (nonEscaped === escaped) {
  console.log("They match");
}

The above example prints They match.
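
As a further experiment along the same lines, here is a minimal sketch showing non-ASCII characters used as identifiers as well as in strings and comments. It assumes the file is saved as UTF-8. The names `планета`, `π`, and `utf8ByteLength` are made up for illustration, and `Buffer.byteLength` is the standard Node.js call used to count encoded bytes.

```javascript
/* Check in your editor that this JavaScript file was saved in UTF-8 */

// Non-ASCII letters used as identifiers (illustrative names only).
const планета = "Зямля"; // Cyrillic identifier holding a Cyrillic string
const π = Math.PI;       // Greek letter as an identifier

// Hypothetical helper: number of bytes the string occupies when encoded as UTF-8.
function utf8ByteLength(text) {
  return Buffer.byteLength(text, "utf8");
}

console.log(планета.length);          // 5 (UTF-16 code units)
console.log(utf8ByteLength(планета)); // 10 (each Cyrillic letter takes 2 bytes in UTF-8)
console.log(π === Math.PI);           // true
```

If the file were saved in a different encoding, the string contents and byte counts would be the first place a mismatch shows up.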

Non-BMP note:

Note that UTF-8 supports non-BMP code points (U+10000 and onwards), but JavaScript has complications in that case: strings are sequences of UTF-16 code units, so each non-BMP code point is automatically stored as a surrogate pair. This is part of the language:

/* Check in your editor that this Javascript file was saved in UTF-8 */
var nonEscaped = "
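
The surrogate-pair behaviour described above can also be checked with escape sequences alone, so the result does not depend on how the file is saved. This is only a sketch: the code point U+1D306 is chosen arbitrarily as a non-BMP example, and the variable names are illustrative.

```javascript
// U+1D306 lies outside the BMP, so JavaScript stores it as two UTF-16 code units.
const viaCodePoint = String.fromCodePoint(0x1d306);
const viaSurrogates = "\uD834\uDF06"; // the same character written as a surrogate pair
const viaBraces = "\u{1D306}";        // ES2015 code point escape

console.log(viaCodePoint === viaSurrogates);          // true
console.log(viaCodePoint === viaBraces);              // true
console.log(viaCodePoint.length);                     // 2 (code units, not characters)
console.log([...viaCodePoint].length);                // 1 (string iteration walks code points)
console.log(Buffer.byteLength(viaCodePoint, "utf8")); // 4 (bytes in UTF-8)
```

The `length` of 2 is the visible consequence of the conversion: the character is one code point, four bytes in a UTF-8 file, and two code units once it is a JavaScript string.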
