This script reads a text file in any of several encodings and returns its contents as a UTF-8 string, which is the default text encoding in J.
Supported file types are:
- Unicode with a UTF-16 BOM (byte order mark)
- Unicode with a UTF-8 BOM
- UTF-8 Unicode with no BOM
- 8-bit ANSI (ISO-8859-1, also known as Latin-1)
On error, the result is _1.
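The four file types above can be told apart by checking for a BOM first and falling back to Latin-1 only when the bytes are not valid UTF-8. The following is a minimal Python sketch of that idea, not the code in ufread.ijs: the function name `read_as_utf8` is hypothetical, the order of the checks is an assumption, and `-1` stands in for J's `_1`.

```python
import codecs


def read_as_utf8(path):
    """Return the file contents as UTF-8 bytes, or -1 on error (hypothetical helper)."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        return -1

    # UTF-16 with a BOM: the "utf-16" codec uses the BOM to pick the
    # byte order and removes it from the decoded text.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        try:
            return data.decode("utf-16").encode("utf-8")
        except UnicodeDecodeError:
            return -1

    # UTF-8 with a BOM: drop the three BOM bytes and keep the rest
    # (assumption: the remainder is not validated further).
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]

    # No BOM: keep the bytes if they already form valid UTF-8,
    # otherwise treat them as Latin-1 and re-encode.
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        return data.decode("latin-1").encode("utf-8")
```

Latin-1 is tried last because every byte sequence is valid Latin-1, so that check could never fail and would mask the other formats.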
Requires the files script.
Download the script ufread.ijs
For example:
   a=. 'abc',(224+i.5){a.
   a
abcàáâãä
   a fwrite jpath '~temp/t1.txt'
8
   b=. ufread jpath '~temp/t1.txt'
   b
abcàáâãä
   a.i.b
97 98 99 195 160 195 161 195 162 195 163 195 164
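The byte counts in this session follow from the UTF-8 encoding rules: the three ASCII letters stay one byte each, while each of the five accented characters (code points 224 to 228) becomes two bytes, so the 8 bytes written to the file become 13 bytes in the result. A short Python cross-check, offered only as an independent illustration (the variable names are made up for this sketch):

```python
# Same bytes as the J expression 'abc',(224+i.5){a.
latin1 = bytes([97, 98, 99]) + bytes(range(224, 229))

# Interpret the bytes as Latin-1 text, then re-encode as UTF-8.
utf8 = latin1.decode("latin-1").encode("utf-8")

print(len(latin1))  # 8
print(list(utf8))   # [97, 98, 99, 195, 160, 195, 161, 195, 162, 195, 163, 195, 164]
```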
Links
Wikipedia Byte Order Mark
