class TOML::Lexer
- TOML::Lexer
- Reference
- Object
Overview
Streaming lexer for TOML v1.0 documents.
Design notes
- Trivia is emitted, not skipped. Whitespace, newlines and comments are returned as tokens because the AST keeps them attached to nodes for byte-identical round-trips.
- Atoms are coarse-grained. Anything that is not a delimiter, string, or trivia is returned as a single `BareKeyOrAtom` token whose `raw` is the verbatim source slice. The parser decides whether it is a bare key, integer, float, datetime, boolean, `inf`, or `nan` based on context. Two consequences:
  - The dot character is always emitted as `Dot`. A float like `1.5` lexes as `Atom("1") Dot Atom("5")`; the parser reassembles it. This keeps the lexer context-free at the cost of a little extra parser work.
  - The colon `:` is part of an atom (so a time value like `07:32:00` is a single `Atom`).
- Brackets are always single-character. `[` and `]` are emitted one at a time even when adjacent. The parser disambiguates `[[products]]` (array-of-tables header) from `[[1, 2]]` (nested array) by position: a line that starts with two consecutive `LBracket` tokens is an AoT header; anywhere else they open two arrays.
- Line endings. `\n` and `\r\n` are both valid newlines. A bare `\r` not followed by `\n` is not a TOML line ending and triggers a `ParseError` if it appears outside a string.
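The two parser-side duties described above (reassembling a float from `Atom Dot Atom`, and telling an AoT header from a nested array by position) might look roughly like the sketch below. It is not the library's parser: `Kind`, `Tok`, `digits?`, `reassemble_float` and `aot_header_at?` are hypothetical names invented for illustration, the token model is a stand-in for the real `Token` type, and treating leading whitespace as still being "the start of a line" is an assumption.

```crystal
# Hypothetical token model for illustration; the real TOML::Token may differ.
enum Kind
  Atom
  Dot
  LBracket
  RBracket
  Whitespace
  Newline
end

record Tok, kind : Kind, raw : String

# True when `s` is a non-empty run of ASCII digits.
def digits?(s : String) : Bool
  !s.empty? && s.chars.all?(&.ascii_number?)
end

# Reassembles Atom Dot Atom starting at index `i` into a Float64.
# Simplified on purpose: signs, exponents and underscores are not handled.
def reassemble_float(toks : Array(Tok), i : Int32) : Float64?
  return nil unless i + 2 < toks.size
  int_part, dot, frac_part = toks[i], toks[i + 1], toks[i + 2]
  return nil unless int_part.kind == Kind::Atom && dot.kind == Kind::Dot && frac_part.kind == Kind::Atom
  return nil unless digits?(int_part.raw) && digits?(frac_part.raw)
  "#{int_part.raw}.#{frac_part.raw}".to_f
end

# True when two consecutive LBracket tokens at index `i` begin a line.
# Assumption: indentation (Whitespace tokens) before the brackets is allowed.
def aot_header_at?(toks : Array(Tok), i : Int32) : Bool
  return false unless i + 1 < toks.size
  return false unless toks[i].kind == Kind::LBracket && toks[i + 1].kind == Kind::LBracket
  j = i - 1
  while j >= 0 && toks[j].kind == Kind::Whitespace
    j -= 1
  end
  j < 0 || toks[j].kind == Kind::Newline
end

float_toks = [Tok.new(Kind::Atom, "1"), Tok.new(Kind::Dot, "."), Tok.new(Kind::Atom, "5")]
puts reassemble_float(float_toks, 0) # 1.5

header = [Tok.new(Kind::LBracket, "["), Tok.new(Kind::LBracket, "[")]
puts aot_header_at?(header, 0) # true
```

Both helpers only look at neighbouring tokens, so the lexer itself never needs to know whether it is inside a value or a header.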
Defined in:
toml/lexer.cr
Constructors
Instance Method Summary
- #eof? : Bool
  Returns true if the lexer has consumed all input.
- #next_token : Token
  Returns the next token.
Constructor Detail
Instance Method Detail
#next_token : Token
Returns the next token. Once the end of input is reached, returns an EOF token at the current position; subsequent calls keep returning EOF.
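Because EOF is returned repeatedly rather than raising, a caller can drain the lexer with a plain loop. The sketch below shows that pattern against a stand-in: `StubLexer`, `StubToken` and `drain` are hypothetical, and the stub models only the documented end-of-input behaviour, not real tokenization or the real constructor.

```crystal
# Hypothetical stand-ins; the real TOML::Lexer and Token types may differ.
record StubToken, kind : Symbol, raw : String

class StubLexer
  def initialize(@toks : Array(StubToken))
    @pos = 0
  end

  # True once every prepared token has been handed out.
  def eof? : Bool
    @pos >= @toks.size
  end

  # Returns the next token; after the input is exhausted, every call returns EOF.
  def next_token : StubToken
    return StubToken.new(:eof, "") if eof?
    tok = @toks[@pos]
    @pos += 1
    tok
  end
end

# Collects tokens until the first EOF token is seen.
def drain(lexer : StubLexer) : Array(StubToken)
  collected = [] of StubToken
  loop do
    tok = lexer.next_token
    break if tok.kind == :eof
    collected << tok
  end
  collected
end

lexer = StubLexer.new([StubToken.new(:atom, "a"), StubToken.new(:newline, "\n")])
puts drain(lexer).size      # 2
puts lexer.next_token.kind  # eof
```

Checking the token kind rather than `#eof?` before each call keeps the loop correct even if a lexer reports end of input only after the EOF token has been produced.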