Unicode u:  _ _ _ Unicode

J datatypes:  char  (1-byte char): an 8-bit value from 0 to 255
              wchar (2-byte char, wide char): a 16-bit value from 0 to 65535
Encodings:    ASCII: values 0 to 127, a subset of U8
              U8 (UTF-8): Unicode code point values in a multibyte encoding

Most u: dyads work with values, not encodings. The ASCII and U8 encodings are used in 7&u: and 8&u: .

The monad u: applies to several kinds of arguments:

Argument    Result
char        same as 2&u:
wchar       copy of argument
integers    same as 4&u:

The inverse of the monad u: is 3&u:
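
The monad cases and the inverse can be tried in a session along the following lines. This is a minimal sketch: the names s, w and c are illustrative only, and the datatype code 2 for char is taken from the usual 3!:0 codes rather than from this page.

```j
   s =: 'abc'            NB. char (1-byte) array
   w =: u: s             NB. monad u: on char acts like 2&u:
   3!:0 s                NB. datatype code of char
2
   3!:0 w                NB. datatype code of wchar
131072
   c =: 3 u: w           NB. the inverse: wchar to integers
   c
97 98 99
   w -: u: c             NB. monad u: on integers acts like 4&u:
1
```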
 
  The dyad u: takes a scalar integer left argument and applies to several kinds of right arguments.

Left  Result         Right
1     char           char:  as is
                     wchar: high-order 8 bits discarded
2     wchar          char:  high-order 8 bits are 0
                     wchar: as is
3     integers       char or wchar
4     wchar          integers in the range -65536 to 65535
5     char           wchar in the range 0 to 255
6     wchar          pairs of chars are converted to wchars
7     char or wchar  U8:    converted to wchar
                     ASCII: as is
                     wchar: if all values <128, convert to ASCII, otherwise as is
                     an empty right argument produces an empty char result
8     U8             wchar: converted to U8
                     char:  as is
                     an empty right argument produces an empty char result

1&u: and 2&u: , 3&u: and 4&u: , and 7&u: and 8&u: are inverse pairs.
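
A short session sketch of the 3/4 and 7/8 inverse pairs follows. The names w and b are illustrative, and the code point 8364 (the euro sign, U+20AC) is chosen only because it needs three bytes in U8.

```j
   w =: 4 u: 97 8364     NB. wchar from code points: 'a' and U+20AC
   3 u: w                NB. 3&u: recovers the integers
97 8364
   b =: 8 u: w           NB. wchar to U8: the result is a char array
   3!:0 b
2
   # b                   NB. 1 byte for 'a' plus 3 bytes for U+20AC
4
   3 u: b                NB. the individual U8 byte values
97 226 130 172
   w -: 7 u: b           NB. 7&u: converts the U8 back to wchar
1
   3!:0 (7 u: 'abc')     NB. pure ASCII is returned as is (char)
2
```

Note that 3&u: applied to b reports byte values, not code points: it works on the values in the array and knows nothing about the encoding.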
 

Examples:
   ] t=: u: 'We the people' 
We the people
   3!:0 t
131072                         NB. the unicode datatype numeric code is 131072

   u: 97 98 99 +/ 0 256 512 1024
aaaa                           NB. 2-byte characters have the same
bbbb                           NB. display as 1-byte characters
cccc 

   'a' = u: 97 + 0 256 512 1024
1 0 0 0

   ] t=: (2 4$'abcdefgh') , u: 'wxyz'
abcd                           NB. 1- and 2-byte characters can be catenated together.
efgh                           NB. The 1-byte characters are promoted.
wxyz
   3!:0 t
131072

