
Syntax

Notes on Syntax

Numeric Literals

Rather than ad hoc syntax for non-decimal bases, Anselm borrows the basics of Ada's very regular system: an integer may optionally be prefixed by the radix (2-36), in which case the number itself is enclosed by colons. For example, the literal 36:ANSELM: denotes the decimal value 644,618,218. The enclosing colons have the clear benefit of allowing unambiguous custom literal suffixes in radices greater than 10. For instance, in 16:1234ABCD:u the trailing u can only be a suffix, never a digit.
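The literal rules above can be sketched as a small Python function. This is only an illustration, not Anselm's actual lexer: the name parse_int_literal, the regular expression, and the assumption that a suffix is an identifier-like run of letters are all mine.

```python
import re

# Assumed grammar: either plain decimal digits, or RADIX:DIGITS: with the
# digits enclosed by colons; both forms may carry an identifier-like suffix.
INT_LITERAL = re.compile(
    r"(?:(?P<radix>[0-9]+):(?P<digits>[0-9A-Za-z]+):|(?P<plain>[0-9]+))"
    r"(?P<suffix>[A-Za-z_][A-Za-z0-9_]*)?"
)

def parse_int_literal(text):
    """Return (value, suffix) for an Anselm-style integer literal."""
    m = INT_LITERAL.fullmatch(text)
    if m is None:
        raise ValueError(f"not an integer literal: {text!r}")
    suffix = m["suffix"] or ""
    if m["plain"] is not None:
        return int(m["plain"]), suffix
    radix = int(m["radix"])
    if not 2 <= radix <= 36:
        raise ValueError(f"radix out of range 2-36: {radix}")
    # int() rejects digits that are not valid in the given radix.
    return int(m["digits"], radix), suffix
```

With this sketch, parse_int_literal("36:ANSELM:") returns (644618218, "") and parse_int_literal("16:1234ABCD:u") returns (0x1234ABCD, "u"). Because the closing colon ends the digits, the suffix never has to be disambiguated from a digit.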

For floats, this syntax has another benefit. We can use the exact same exponent suffix for hexadecimal floating point constants as for decimal ones: 1.5e+10 and 16:1.8:e+10. If the enclosing colons did not exist, then e would be ambiguous for radices 15 or greater, where e is itself a valid digit.
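To make the role of the colons concrete, here is a hedged Python sketch that only splits a float literal into its parts. The function name split_float_literal and the regular expression are hypothetical, and the sketch deliberately does not compute a value, since this page does not say whether the exponent scales by 10 or by the radix.

```python
import re

# Assumed grammar: RADIX:MANTISSA: or a plain decimal mantissa, followed by
# an optional exponent written as e, an optional sign, and decimal digits.
FLOAT_LITERAL = re.compile(
    r"(?:(?P<radix>[0-9]+):(?P<mantissa>[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)?):"
    r"|(?P<plain>[0-9]+(?:\.[0-9]+)?))"
    r"(?:e(?P<exponent>[+-]?[0-9]+))?"
)

def split_float_literal(text):
    """Return (radix, mantissa_text, exponent_text_or_None)."""
    m = FLOAT_LITERAL.fullmatch(text)
    if m is None:
        raise ValueError(f"not a float literal: {text!r}")
    if m["radix"] is None:
        return 10, m["plain"], m["exponent"]
    return int(m["radix"]), m["mantissa"], m["exponent"]
```

Under these assumptions, "16:1.8:e+10" splits into (16, "1.8", "+10"), whereas "16:1.8e:" splits into (16, "1.8e", None): an e inside the colons is a digit, and an e after them is the exponent marker.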

Labels

Labels (which are first-class values in Anselm) have the same syntax as Common Lisp keyword symbols: :symbol.

Expressions

Binding strength (ascending):

  1. Binary operators
  2. Unary operators
  3. Record field selection

Operators

Operators are just functions whose names consist of any string of the following characters: !$%&*+-./<=>?@^|~ (except that the single full stop . is not a valid operator name). Operators may be prefix or infix; there are no postfix operators. Prefix operators are unary/monadic and infix operators are binary/dyadic. Prefix operators must be parenthesized when mixed with infix operators (otherwise the boundaries would be ambiguous). Currently, there are no ternary operators, though this could change.
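The naming rule is simple enough to state as a predicate. The following Python sketch is illustrative only; the names OPERATOR_CHARS and is_operator_name are mine, not part of Anselm.

```python
# The operator alphabet as listed above.
OPERATOR_CHARS = frozenset("!$%&*+-./<=>?@^|~")

def is_operator_name(name):
    """True if name is non-empty, built only from operator characters,
    and is not the single full stop."""
    return name != "" and name != "." and all(c in OPERATOR_CHARS for c in name)
```

So is_operator_name("->") and is_operator_name("..") are both true, while is_operator_name(".") and is_operator_name("a+") are false.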

Operators have no precedence levels and all operators (with one exception) associate to the right. This might seem like an unusual rule, but as a long-time APL/J/K programmer, I find these rules very intuitive. Where associativity matters, right associativity is almost always what you want. There are a few cases where I'd say this is non-negotiably true, such as the arrow (type-valued) operator ->. Another benefit of allowing only right associativity is that the lack of left recursion makes writing a recursive descent parser very straightforward. Anselm's rules are as follows:

  • Monadic/prefix operators bind more tightly than dyadic ones.
  • All dyadic operators associate to the right, with one exception:
  • Operators which begin with the equals sign =, the greater-than sign >, or the less-than sign < do not associate at all.

There are basically no circumstances where you want equality-like operators to associate at all; chaining them is usually just an error. The < and > symbols also trigger non-associativity, because the usual operators formed from those characters, comparisons and bitshifts, also don't usefully associate. Conveniently, because only the first character is considered when determining associativity, this leaves the arrow -> as a right-associative operator for function types.
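As a rough illustration of how little machinery these rules need, here is a recursive descent sketch in Python for infix operators only. It is not Anselm's parser. It assumes whitespace-separated tokens, omits prefix operators and field selection, and interprets "do not associate" as: an operator starting with =, < or > may not take an unparenthesized application of another such operator as its right operand. That interpretation, and every name in the code, is my own assumption.

```python
OPERATOR_CHARS = frozenset("!$%&*+-./<=>?@^|~")
NON_ASSOC_STARTS = "=<>"

def is_op(tok):
    return tok not in ("(", ")") and all(c in OPERATOR_CHARS for c in tok)

def parse(src):
    """Parse src into nested (op, left, right) tuples; atoms stay strings."""
    tokens = src.replace("(", " ( ").replace(")", " ) ").split()
    tree, i, _ = parse_expr(tokens, 0)
    if i != len(tokens):
        raise ValueError(f"unexpected token {tokens[i]!r}")
    return tree

def parse_expr(tokens, i):
    # expr := atom [op expr]; the recursion on the right operand is what
    # makes every operator right-associative, with no precedence table.
    left, i = parse_atom(tokens, i)
    if i < len(tokens) and is_op(tokens[i]):
        op = tokens[i]
        right, i, right_head = parse_expr(tokens, i + 1)
        # right_head is the operator of a bare (unparenthesized) right operand.
        if (op[0] in NON_ASSOC_STARTS and right_head is not None
                and right_head[0] in NON_ASSOC_STARTS):
            raise ValueError(f"{op!r} and {right_head!r} do not associate; "
                             "add parentheses")
        return (op, left, right), i, op
    return left, i, None

def parse_atom(tokens, i):
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[i]
    if tok == "(":
        inner, i, _ = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise ValueError("expected ')'")
        return inner, i + 1  # parentheses hide the inner operator
    if tok == ")" or is_op(tok):
        raise ValueError(f"unexpected token {tok!r}")
    return tok, i + 1
```

With this sketch, parse("a -> b -> c") yields ("->", "a", ("->", "b", "c")), since -> starts with - and is therefore right-associative. parse("a == b == c") raises an error, while parse("(a == b) == c") is accepted.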

Custom operators and precedence levels/associativity are tricky to get right. Some languages, like F#, have a rule where precedence and associativity are determined by the leading characters of the operator. Others, like SML, require fixity declarations. Swift has a very sophisticated system for specifying the precedence and associativity of custom operators. I dislike fixity declarations because they mean that parsing, at the level of building an AST, is no longer context free. The F#-style rules are, I think, more maintainable, but they still require memorizing complex precedence tables, and I think mixed associativity is really difficult to manage cognitively. In fact, I think complex precedence rules can cause code to be over-parenthesized (the opposite of what one might intuitively expect) because programmers want to be extra safe. Fixity declarations are also, in my opinion, very non-modular.