Values

The grammar productions in this section define lexical syntax, hence no white space is allowed.

Integers

All integers can be written in either decimal or hexadecimal notation. In both cases, digits can optionally be separated by underscores.

𝚜𝚒𝚐𝚗::=𝜖  +1  |  +  +1  |  -  1𝚍𝚒𝚐𝚒𝚝::=𝟶  0  |    |  𝟿  9𝚑𝚎𝚡𝚍𝚒𝚐𝚒𝚝::=𝑑:𝚍𝚒𝚐𝚒𝚝  𝑑|𝙰  10  |    |  𝙵  15|𝚊  10  |    |  𝚏  15𝚗𝚞𝚖::=𝑑:𝚍𝚒𝚐𝚒𝚝𝑑|𝑛:𝚗𝚞𝚖  _?  𝑑:𝚍𝚒𝚐𝚒𝚝10𝑛+𝑑𝚑𝚎𝚡𝚗𝚞𝚖::=:𝚑𝚎𝚡𝚍𝚒𝚐𝚒𝚝|𝑛:𝚑𝚎𝚡𝚗𝚞𝚖  _?  :𝚑𝚎𝚡𝚍𝚒𝚐𝚒𝚝16𝑛+

The allowed syntax for integer literals depends on size and signedness. Moreover, their value must lie within the range of the respective type.

𝚞𝑁::=𝑛:𝚗𝚞𝚖𝑛if 𝑛<2𝑁|𝟶𝚡  𝑛:𝚑𝚎𝚡𝚗𝚞𝚖𝑛if 𝑛<2𝑁𝚜𝑁::=𝑠:𝚜𝚒𝚐𝚗  𝑛:𝚞𝑁𝑠𝑛if 2𝑁1𝑠𝑛<2𝑁1

Uninterpreted integers can be written as either signed or unsigned, and are normalized to unsigned in the abstract syntax.

𝚒𝑁::=𝑛:𝚞𝑁𝑛|𝑖:𝚜𝑁signed1𝑁(𝑖)

Floating-Point

Floating-point values can be represented in either decimal or hexadecimal notation.

𝚏𝚛𝚊𝚌::=𝑑:𝚍𝚒𝚐𝚒𝚝𝑑/10|𝑑:𝚍𝚒𝚐𝚒𝚝  _?  𝑝:𝚏𝚛𝚊𝚌(𝑑+𝑝/10)/10𝚑𝚎𝚡𝚏𝚛𝚊𝚌::=:𝚑𝚎𝚡𝚍𝚒𝚐𝚒𝚝/16|:𝚑𝚎𝚡𝚍𝚒𝚐𝚒𝚝  _?  𝑝:𝚑𝚎𝚡𝚏𝚛𝚊𝚌(+𝑝/16)/16𝚖𝚊𝚗𝚝::=𝑝:𝚗𝚞𝚖  .?𝑝|𝑝:𝚗𝚞𝚖  .  𝑞:𝚏𝚛𝚊𝚌𝑝+𝑞𝚑𝚎𝚡𝚖𝚊𝚗𝚝::=𝑝:𝚑𝚎𝚡𝚗𝚞𝚖  .?𝑝|𝑝:𝚑𝚎𝚡𝚗𝚞𝚖  .  𝑞:𝚑𝚎𝚡𝚏𝚛𝚊𝚌𝑝+𝑞𝚏𝚕𝚘𝚊𝚝::=𝑝:𝚖𝚊𝚗𝚝  (𝙴  |  𝚎)  𝑠:𝚜𝚒𝚐𝚗  𝑒:𝚗𝚞𝚖𝑝10𝑠𝑒𝚑𝚎𝚡𝚏𝚕𝚘𝚊𝚝::=𝟶𝚡  𝑝:𝚑𝚎𝚡𝚖𝚊𝚗𝚝  (𝙿  |  𝚙)  𝑠:𝚜𝚒𝚐𝚗  𝑒:𝚗𝚞𝚖𝑝2𝑠𝑒

The value of a literal must not lie outside the representable range of the corresponding IEEE 754 type (that is, a numeric value must not overflow to ±), but it may be rounded to the nearest representable value.

Note

Rounding can be prevented by using hexadecimal notation with no more significant bits than supported by the required type.

Floating-point values may also be written as constants for infinity or canonical NaN (not a number). Furthermore, arbitrary NaN values may be expressed by providing an explicit payload value.

𝚏𝑁::=(+1):𝚜𝚒𝚐𝚗  𝑞:𝚏𝑁𝚖𝚊𝚐+𝑞|(1):𝚜𝚒𝚐𝚗  𝑞:𝚏𝑁𝚖𝚊𝚐𝑞𝚏𝑁𝚖𝚊𝚐::=𝑞:𝚏𝚕𝚘𝚊𝚝float𝑁(𝑞)if float𝑁(𝑞)|𝑞:𝚑𝚎𝚡𝚏𝚕𝚘𝚊𝚝float𝑁(𝑞)if float𝑁(𝑞)|𝚒𝚗𝚏|𝚗𝚊𝚗𝗇𝖺𝗇(canon𝑁)|𝚗𝚊𝚗:𝟶𝚡  𝑛:𝚑𝚎𝚡𝚗𝚞𝚖𝗇𝖺𝗇(𝑛)if 1𝑛<2signif(𝑁)

Strings

Strings denote sequences of bytes that can represent both textual and binary data. They are enclosed in quotation marks and may contain any character other than ASCII control characters, quotation marks (), or backslash (\), except when expressed with an escape sequence.

𝚜𝚝𝚛𝚒𝚗𝚐::=  (𝑏:𝚜𝚝𝚛𝚒𝚗𝚐𝚎𝚕𝚎𝚖)  𝑏if |𝑏|<232𝚜𝚝𝚛𝚒𝚗𝚐𝚎𝚕𝚎𝚖::=𝑐:𝚜𝚝𝚛𝚒𝚗𝚐𝚌𝚑𝚊𝚛utf8(𝑐)|\  1:𝚑𝚎𝚡𝚍𝚒𝚐𝚒𝚝  2:𝚑𝚎𝚡𝚍𝚒𝚐𝚒𝚝161+2

Each character in a string literal represents the byte sequence corresponding to its UTF-8 Unicode (Section 2.5) encoding, except for hexadecimal escape sequences \𝚑𝚑, which represent raw bytes of the respective value.

𝚜𝚝𝚛𝚒𝚗𝚐𝚌𝚑𝚊𝚛::=𝑐:𝚌𝚑𝚊𝚛𝑐if 𝑐U+20𝑐U+7F𝑐𝑐\|\𝚝U+09|\𝚗U+0A|\𝚛U+0D|\U+22|\U+27|\\U+5C|\𝚞{  𝑛:𝚑𝚎𝚡𝚗𝚞𝚖  }𝑛if 𝑛<𝟶𝚡𝙳𝟾𝟶𝟶𝟶𝚡𝙴𝟾𝟶𝟶𝑛<𝟶𝚡𝟷𝟷𝟶𝟶𝟶𝟶

Names

Names are strings denoting a literal character sequence. A name string must form a valid UTF-8 encoding as defined by Unicode (Section 2.5) and is interpreted as a string of Unicode scalar values.

𝚗𝚊𝚖𝚎::=𝑏:𝚜𝚝𝚛𝚒𝚗𝚐𝑐if 𝑏=utf8(𝑐)

Note

Presuming the source text is itself encoded correctly, strings that do not contain any uses of hexadecimal byte escapes are always valid names.

Identifiers

Indices can be given in both numeric and symbolic form. Symbolic identifiers that stand in lieu of indices start with $, followed by eiter a sequence of printable ASCII characters that does not contain a space, quotation mark, comma, semicolon, or bracket, or by a quoted name.

𝚒𝚍::=$  𝑐:𝚒𝚍𝚌𝚑𝚊𝚛+𝑐|$  𝑐:𝚗𝚊𝚖𝚎𝑐if |𝑐|>0𝚒𝚍𝚌𝚑𝚊𝚛::=𝟶  |    |  𝟿|𝙰  |    |  𝚉|𝚊  |    |  𝚣|!  |  #  |  $  |  %  |  &  |    |    |  +  |  -  |  .  |  /|:  |  <  |  =  |  >  |  ?  |  @  |  \  |  ˆ    |  _  |  `    |  |  |  ˜  

Note

The value of an identifier character is the Unicode codepoint denoting it.

Conventions

The expansion rules of some abbreviations require insertion of a fresh identifier. That may be any syntactically valid identifier that does not already occur in the given source text.