Here's what I've figured out about how J internally represents nouns.

Details of J's Internal Data Representation

Data in J is stored as a header, which gives the data type, shape and size, followed by the raw data. The header consists of a fixed-length portion giving the basic information, followed by a variable-length portion holding the shape.

So far, this page covers only the simpler of the following types (up to boxed). The type codes come from the documentation for "3!:0 y Type": "The internal type of the noun y, encoded as follows:"

1        boolean
2        literal
4        integer
8        floating point
16       complex
32       boxed
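
As a quick check of the codes above, here is a short session applying 3!:0 to one noun of each type covered so far. The sample nouns are just illustrations I picked; the ] only separates the 0 of 3!:0 from a following numeric list.

```j
   3!:0 ] 1 0 1        NB. boolean
1
   3!:0 'abc'          NB. literal
2
   3!:0 i. 3           NB. integer
4
   3!:0 ] 1.1 2.2      NB. floating point
8
   3!:0 ] 1j2          NB. complex
16
   3!:0 <'abc'         NB. boxed
32
```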

The following remain to be explored:

64       extended integer
128      rational
1024     sparse boolean
2048     sparse literal
4096     sparse integer
8192     sparse floating point
16384    sparse complex
32768    sparse boxed
65536    symbol
131072   unicode

Fixed-length Portion of Header

Byte Positions      Description
0 - 7               Type (see above)
8 - 11              Length (#,): number of elements
12 - 15             Rank (#$): length of shape

Variable-length Portion of Header

Byte Positions      Description
16 -> 15+4*Rank     Shape ($): one 4-byte word per axis; empty when Rank is 0
16+4*Rank -> end    Data
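
To make the layout concrete, here is a minimal sketch that picks the header fields out of a byte list by position. The bytes are copied from the showNum 2 3$'ABC' output in the Character examples on this page. The names bytes, word, hdr, rank and shape are hypothetical helpers of mine, and the sketch assumes 4-byte words stored least-significant byte first, as in those examples.

```j
   NB. bytes copied from the "showNum 2 3$'ABC'" output on this page
   bytes=: 2 0 0 0 0 0 0 0 6 0 0 0 2 0 0 0 2 0 0 0 3 0 0 0 65 66 67 65 66 67 0 0
   word=: 256 #. |.                  NB. 4 bytes, least-significant first -> integer
   ]hdr=: _4 word\ 16 {. bytes       NB. type, (zero word), length, rank
2 0 6 2
   rank=: 3 { hdr
   ]shape=: _4 word\ (4*rank) {. 16 }. bytes
2 3
   shape $ a. {~ (*/shape) {. (16 + 4*rank) }. bytes
ABC
ABC
```

The last line drops the 16+4*Rank header bytes, keeps as many data bytes as the shape calls for, and ignores the trailing padding.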

Examples

In the examples that follow, we assume familiarity with a few basics of representation, e.g. 65 <-> a. i. 'A', etc.

Boolean

   showNum 1                        
   1   0   0   0   0   0   0   0    
   1   0   0   0   0   0   0   0    
   1   0   0   0                    

   showNum 0 1                  
   1   0   0   0   0   0   0   0
   2   0   0   0   1   0   0   0
   2   0   0   0   0   1   0   0

   showNum 9$1 0
   1   0   0   0   0   0   0   0
   9   0   0   0   1   0   0   0
   9   0   0   0   1   0   1   0
   1   0   1   0   1   0   0   0     NB. Notice fullword padding

   showNum 8$1 0
   1   0   0   0   0   0   0   0
   8   0   0   0   1   0   0   0
   8   0   0   0   1   0   1   0
   1   0   1   0   0   0   0   0     NB. but

   showNum 7$1 0
   1   0   0   0   0   0   0   0
   7   0   0   0   1   0   0   0     NB. Padding seems to be slightly
   7   0   0   0   1   0   1   0     NB. excessive in the 8-element case above:
   1   0   1   0                     NB. is this a performance enhancement?
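
One reading of these byte counts, offered as my inference from the examples on this page rather than documented behaviour: the data is followed by at least one zero byte and then rounded up to a multiple of 4 bytes. The hypothetical verb padded below computes the data size under that assumed rule for a given number of one-byte elements.

```j
   padded=: 3 : '4 * >. (1 + y) % 4'   NB. assumed rule: one zero byte, then round up to a word
   padded 7 8 9                        NB. the three boolean examples above
8 12 12
   padded 1 2 5 6                      NB. other boolean and character examples on this page
4 4 8 8
```

If the rule holds, the extra word in the 8-element case is the room needed for that one trailing zero byte.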

Character

   showNum 'A'                      
   2   0   0   0   0   0   0   0    
   1   0   0   0   0   0   0   0    
  65   0   0   0                    

   showNum 'AB'                 
   2   0   0   0   0   0   0   0
   2   0   0   0   1   0   0   0
   2   0   0   0  65  66   0   0

   showNum 2 3$'ABC'
   2   0   0   0   0   0   0   0
   6   0   0   0   2   0   0   0
   2   0   0   0   3   0   0   0
  65  66  67  65  66  67   0   0

Integer: Least-significant Byte First

   showNum i.0                      
   4   0   0   0   0   0   0   0    
   0   0   0   0   1   0   0   0    
   0   0   0   0                    

   showNum i.1                  
   4   0   0   0   0   0   0   0
   1   0   0   0   1   0   0   0
   1   0   0   0   0   0   0   0

   showNum i.2                      
   4   0   0   0   0   0   0   0    
   2   0   0   0   1   0   0   0    
   2   0   0   0   0   0   0   0    
   1   0   0   0                    

   showNum i.3                  
   4   0   0   0   0   0   0   0
   3   0   0   0   1   0   0   0
   3   0   0   0   0   0   0   0
   1   0   0   0   2   0   0   0

   showNum -i. 3
   4   0   0   0   0   0   0   0
   3   0   0   0   1   0   0   0
   3   0   0   0   0   0   0   0     NB. Negative numbers are two's complement.
 255 255 255 255 254 255 255 255
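
A small sketch of reading those data bytes back as signed integers: each group of 4 bytes is first combined least-significant byte first into an unsigned value, and values of 2^31 or more are then shifted down by 2^32. The name u is a hypothetical helper, and the byte list is the data portion of the -i. 3 example above.

```j
   u=: _4 (256 #. |.)\ 0 0 0 0 255 255 255 255 254 255 255 255   NB. unsigned words
   u - 4294967296 * u >: 2147483648                              NB. reinterpret as signed
0 _1 _2
```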

Floating point (IEEE standard representation of numbers)

   showNum 1.1                      
   8   0   0   0   0   0   0   0    
   1   0   0   0   0   0   0   0    
 154 153 153 153 153 153 241  63    

   showNum 1$1.1                
   8   0   0   0   0   0   0   0
   1   0   0   0   1   0   0   0
   1   0   0   0 154 153 153 153
 153 153 241  63                

   showNum 1.1 2.2
   8   0   0   0   0   0   0   0
   2   0   0   0   1   0   0   0
   2   0   0   0 154 153 153 153
 153 153 241  63 154 153 153 153
 153 153   1  64

   showNum 1.1 2.2 3.3
   8   0   0   0   0   0   0   0
   3   0   0   0   1   0   0   0
   3   0   0   0 154 153 153 153
 153 153 241  63 154 153 153 153
 153 153   1  64 102 102 102 102
 102 102  10  64
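
The 8 data bytes of a float can be checked with the 3!:5 conversion foreign, which converts between J floats and their binary form. This sketch assumes a little-endian machine, as in the examples above; on other hardware the byte order would differ.

```j
   _2 (3!:5) 154 153 153 153 153 153 241 63 { a.    NB. 8 bytes -> J float
1.1
   a. i. 2 (3!:5) 1.1                                NB. J float -> 8 bytes
154 153 153 153 153 153 241 63
```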

Boxed

   showNum <'AB'
  32   0   0   0   0   0   0   0     NB. Integer 20 at bytes 16-19 is a pointer:
   1   0   0   0   0   0   0   0     NB. the offset (in bytes from the start)
  20   0   0   0   2   0   0   0     NB. of the first boxed array.
   0   0   0   0   2   0   0   0     NB. Note how the contents of this box match
   1   0   0   0   2   0   0   0     NB. the 'AB' character example above.
  65  66   0   0
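
Following the pointer can be sketched directly on the bytes shown above. The names bytes and ptr are hypothetical helpers, and the byte list is copied from the showNum <'AB' output; dropping ptr bytes from the front leaves exactly the representation of 'AB' from the Character section.

```j
   NB. bytes copied from the "showNum <'AB'" output above
   bytes=: 32 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 20 0 0 0 2 0 0 0 0 0 0 0 2 0 0 0 1 0 0 0 2 0 0 0 65 66 0 0
   ]ptr=: 256 #. |. 4 {. 16 }. bytes    NB. word at bytes 16-19, least-significant first
20
   ptr }. bytes                         NB. the boxed array starts here
2 0 0 0 0 0 0 0 2 0 0 0 1 0 0 0 2 0 0 0 65 66 0 0
```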

   showNum 'AB';0 1 2
  32   0   0   0   0   0   0   0
   2   0   0   0   1   0   0   0
   2   0   0   0  28   0   0   0     NB. The 2 boxes begin at positions 28
  52   0   0   0   2   0   0   0     NB. and 52.
   0   0   0   0   2   0   0   0
   1   0   0   0   2   0   0   0
  65  66   0   0   4   0   0   0
   0   0   0   0   3   0   0   0
   1   0   0   0   3   0   0   0
   0   0   0   0   1   0   0   0
   2   0   0   0

   showNum 2 2$'AB';(i.3);1.1 2.2;<'abcde'
  32   0   0   0   0   0   0   0
   4   0   0   0   2   0   0   0
   2   0   0   0   2   0   0   0
  40   0   0   0  64   0   0   0
  96   0   0   0 132   0   0   0
   2   0   0   0   0   0   0   0
   2   0   0   0   1   0   0   0
   2   0   0   0  65  66   0   0
   4   0   0   0   0   0   0   0
   3   0   0   0   1   0   0   0
   3   0   0   0   0   0   0   0
   1   0   0   0   2   0   0   0
   8   0   0   0   0   0   0   0
   2   0   0   0   1   0   0   0
   2   0   0   0 154 153 153 153
 153 153 241  63 154 153 153 153
 153 153   1  64   2   0   0   0
   0   0   0   0   5   0   0   0
   1   0   0   0   5   0   0   0
  97  98  99 100 101   0   0   0

Definition of showNum

The verb showNum is a basic tool for displaying the internal representation of J nouns.

showNum=: 3 : 0
NB.* showNum: show numeric values of y's internal representation,
NB. formatted x values per row.
   8 showNum y       
:                     
   width=. x         
   vec=. a. i. 3!:1 y
   len=. >.width%~$vec
   leftover=. (len*width)-$vec     NB. Pad last line w/spaces.
   (len,width*4)$;_4{.&.>(":&.>vec),leftover$<' '
)
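
All the examples on this page use the monadic form, which defaults to 8 values per row. The dyadic form takes the row width as its left argument. The calls below are only a usage sketch; I have left out the output because the exact bytes depend on the J version and platform.

```j
   4 showNum 'AB'      NB. 4 bytes per row: one header word per row in the dumps above
   16 showNum i. 3     NB. 16 bytes per row
```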

DevonMcCormick/Data/JInternalRepresentation (last edited 2010-12-22 22:36:05 by DevonMcCormick)