Discussion:
[scala-user] Using 'Unicode Escape' with 5 HEX Chars?
Kevin Meredith
2016-11-17 02:52:23 UTC
Looking at Scala's Lexical Syntax
<http://www.scala-lang.org/files/archive/spec/2.11/01-lexical-syntax.html>:

`UnicodeEscape ::= ‘\’ ‘u’ {‘u’} hexDigit hexDigit hexDigit hexDigit`

It appears that only 4 hex digits may follow `\u`.

Is there no way to use `\u` to write a Unicode character whose code point needs 5 hex digits?
--
You received this message because you are subscribed to the Google Groups "scala-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-user+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Rex Kerr
2016-11-17 04:33:49 UTC
No, because the JVM's strings are sequences of 16-bit Chars (UTF-16 code
units). That means you have to use a surrogate pair (i.e. two Chars) for
everything beyond the BMP (U+0000 to U+FFFF), which is where the code points
that need 5 hex digits live.
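
To make the encoding concrete, here is a minimal sketch of the standard UTF-16 arithmetic that turns a code point above U+FFFF into its pair of Chars. The names SurrogatePairSketch and toSurrogates are made up for illustration and are not part of any library.

```scala
object SurrogatePairSketch {
  // Hypothetical helper: split a supplementary code point
  // (U+10000 to U+10FFFF) into its UTF-16 high and low surrogates.
  def toSurrogates(codePoint: Int): (Char, Char) = {
    require(Character.isSupplementaryCodePoint(codePoint),
      "code point must be in the range U+10000 to U+10FFFF")
    val offset = codePoint - 0x10000             // 20 significant bits remain
    val high = (0xD800 + (offset >> 10)).toChar  // top 10 bits
    val low  = (0xDC00 + (offset & 0x3FF)).toChar // bottom 10 bits
    (high, low)
  }
}
```

For U+1F600 this should give the Chars D83D and DE00, which are the two values you would then write as two consecutive 4-digit escapes.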

--Rex
Daniel Manchester
2016-11-18 02:44:36 UTC
Hi,

For more information on surrogate pairs (referenced by Rex), see: "What is
a 'surrogate pair' in Java?"
<http://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java>.

A Scala example (surrogate pair from FileFormat.Info
<http://www.fileformat.info/info/unicode/char/1f600/index.htm>):

scala> val smiley = "\uD83D\uDE00"
smiley: java.lang.String = 😀
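
If you would rather start from the code point than look up the pair by hand, the JDK's Character and String methods can do the conversion. A small sketch (CodePointSketch and fromCodePoint are illustrative names, not an existing API):

```scala
object CodePointSketch {
  // Character.toChars returns one Char for BMP code points
  // and a surrogate pair (two Chars) for anything above U+FFFF.
  def fromCodePoint(codePoint: Int): String =
    new String(Character.toChars(codePoint))

  def main(args: Array[String]): Unit = {
    val smiley = fromCodePoint(0x1F600)
    println(smiley.length)                              // 2 (UTF-16 code units)
    println(smiley.codePointCount(0, smiley.length))    // 1 (actual character)
    println(Integer.toHexString(smiley.codePointAt(0))) // 1f600
  }
}
```

Note that length counts Chars, not characters, so the one-character smiley reports a length of 2.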

Dan