Parsing is the process of analyzing a character stream according to a formal lexical, syntactic and/or semantic grammar, producing an output structure or an evaluation.
Produces a stream of tokens from a stream of input characters. The stream can be a list. Lexing can be done using a sequential machine, regular expressions, or ad hoc splitting. AKA lexing, scanning, tokenizing.
Sequential Machine, AKA finite state machine, finite automaton. Uses a state transition table.
Essays/Word Formation on Lines
Sequential machine for J words with space and line tokens with extensive examples
stripping out unnecessary content from the files to reduce file size (comments, etc).
HTTP header lexer using ;: dyad, and elements of ad hoc parsing
visualizing sequential machines using transition diagrams
JSON style backslash evaluator
JSON tokenizer, with details of producing the sequential machine transition table
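As a rough illustration of how a state transition table drives a lexer, here is a minimal sketch in Python. It is not the J ;: interface and is not taken from any of the pages listed above. The character classes, the states, and the names char_class, TABLE and tokenize are all assumptions made for this example.

```python
# Character classes: the columns of the transition table.
OTHER, LETTER, DIGIT, SPACE = range(4)

def char_class(ch):
    """Map a character to its class (hypothetical classes for this sketch)."""
    if ch.isalpha():
        return LETTER
    if ch.isdigit():
        return DIGIT
    if ch.isspace():
        return SPACE
    return OTHER

# States: the rows of the transition table.
START, IN_WORD, IN_NUMBER = range(3)

# TABLE[state][char_class] = (next_state, emit)
# emit is True when the token collected so far ends just before this character.
TABLE = {
    START:     {OTHER: (START, False), LETTER: (IN_WORD, False),
                DIGIT: (IN_NUMBER, False), SPACE: (START, False)},
    IN_WORD:   {OTHER: (START, True), LETTER: (IN_WORD, False),
                DIGIT: (IN_WORD, False), SPACE: (START, True)},
    IN_NUMBER: {OTHER: (START, True), LETTER: (IN_WORD, True),
                DIGIT: (IN_NUMBER, False), SPACE: (START, True)},
}

def tokenize(text):
    tokens = []
    state = START
    begin = 0  # index where the current word or number started
    # A trailing space acts as a sentinel that flushes the last token.
    for i, ch in enumerate(text + " "):
        cls = char_class(ch)
        next_state, emit = TABLE[state][cls]
        if emit:
            tokens.append(text[begin:i])
        if emit or next_state != state:
            begin = i
        if cls == OTHER:
            tokens.append(ch)  # punctuation is a one-character token here
        state = next_state
    return tokens

print(tokenize("x1 = 42+y"))
```

All lexical knowledge lives in TABLE, so changing the token rules means editing table entries rather than the loop.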
Regular Expressions may use a sequential machine internally, but have an intuitive, standard syntax.
Regular Expressions Lab
Guide to regex library
a lexer based on standard regular expressions and simple token declarations
Scripts/Regular Expressions Substitution
Regular expressions extended for Perl/awk/sed-like substitution
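A lexer built from regular expressions and simple token declarations might look like the following sketch. It uses Python's standard re module rather than the J regex library, and the token names and patterns in TOKEN_SPEC are assumptions chosen for illustration.

```python
import re

# Token declarations: (name, pattern). Earlier entries win on ties.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("OP",     r"[-+*/=()]"),
    ("SKIP",   r"\s+"),
]

# Combine the declarations into one pattern with a named group per token type.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def lex(text):
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":  # whitespace is matched but not reported
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(lex("rate = 3.5*(x+2)"))
```

Adding a token type is a matter of adding one line to TOKEN_SPEC. The order of the declarations matters where two patterns could match at the same position.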
Ad Hoc lexing looks for simple substrings on which to split, possibly iteratively
example of ad hoc splitting for a list of first/initial/last names
has a Lisp S-expression string tokenizer
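Ad hoc splitting can be sketched as follows for a list of names. The input format (entries separated by semicolons, name parts separated by blanks) is an assumption for this example and may differ from the linked page. split_names is a hypothetical helper.

```python
def split_names(text):
    """Split 'First M. Last; First Last' into (first, initial, last) tuples."""
    people = []
    for entry in text.split(";"):   # first level: one entry per person
        parts = entry.split()       # second level: blank-separated name parts
        if not parts:
            continue                # ignore empty entries
        first = parts[0]
        last = parts[-1] if len(parts) > 1 else ""
        initial = parts[1] if len(parts) == 3 else ""
        people.append((first, initial, last))
    return people

print(split_names("John Q. Public; Jane Doe"))
```

There is no grammar and no state here, only two rounds of splitting, which is often enough for regular, line-oriented data.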
Produces a structure from a stream of tokens, or evaluates it. The structure is typically a tree of grammar elements. AKA parsing.
Bottom-up, AKA Shift-reduce. E.g., LR parsers.
JSON shift-reduce parser
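The shift-reduce idea can be illustrated with a hand-written sketch for nested bracketed lists of integers, a small JSON-like subset. This is not a table-driven LR parser and is not the linked JSON parser. The function name parse_nested and the token format (a list of strings) are assumptions, and commas are simply skipped rather than validated.

```python
def parse_nested(tokens):
    """Parse tokens such as ['[', '1', ',', '[', '2', ']', ']'] into nested lists."""
    stack = []
    for tok in tokens:
        if tok == "]":
            # Reduce: pop items back to the matching "[" and replace them
            # with a single list on the stack.
            items = []
            while stack and stack[-1] != "[":
                items.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced ]")
            stack.pop()             # discard the "["
            items.reverse()
            stack.append(items)
        elif tok == ",":
            continue                # separators carry no structure in this sketch
        elif tok == "[":
            stack.append(tok)       # shift an opening bracket
        else:
            stack.append(int(tok))  # shift a number
    if len(stack) != 1 or stack[0] == "[":
        raise ValueError("input is not a single complete value")
    return stack[0]

print(parse_nested(["[", "1", ",", "[", "2", ",", "3", "]", ",", "[", "]", "]"]))
```

The structure is built from the leaves upward: inner lists are completed and pushed back onto the stack before the enclosing list is reduced.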
Top-down, AKA Recursive descent. E.g., LL parsers.
has a tacit recursive-descent parser
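For comparison, a recursive-descent parser has one function per grammar rule. The sketch below is explicit Python, not tacit J, and the small arithmetic grammar and the name parse_expression are assumptions made for this example.

```python
def parse_expression(tokens):
    """Parse a list of string tokens into a nested tuple tree.

    Assumed grammar:
        expr   := term   (("+" | "-") term)*
        term   := factor (("*" | "/") factor)*
        factor := NUMBER | "(" expr ")"
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        tok = peek()
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise ValueError("expected )")
            advance()
            return node
        if tok is None or not tok.isdigit():
            raise ValueError(f"unexpected token {tok!r}")
        return int(advance())

    tree = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse_expression(["1", "+", "2", "*", "(", "3", "-", "4", ")"]))
```

Operator precedence falls out of the call structure: expr calls term, which calls factor, so multiplication binds tighter than addition without any precedence table.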
Ad Hoc parsing alternates splitting and combining substrings on multiple, typically non-recursive, levels
J pretty-print script formatter
Since a lot of parsing is based on ASTs, an introduction to efficient tree handling in J would help. You might look at:
the lab Huffman Coding
Roger's Essays/Huffman Coding
Guides/Strings string and text manipulation resources
programming/2007-November/008869 some initial links
DanBron/Temp/ParseLexExecute implementing J in J
Guides/Language FAQ/J BNF Is there a BNF description of J?
chat/2007-November/000678 J syntax easy to parse? I don't think so