Login

Call-By-Push-Value

Call-By-Push-Value

Call-By-Push-Value (CBPV) is a formal calculus (really a family of calculuses) originally devised by Paul Levy. In essence, CBPV is a sort of metalanguage which subsumes both call-by-value (i.e. "strict") and call-by-name (sometimes equated with "laziness" so this is imprecise) evaluation. For the working semanticist, a practical benefit of CBPV as a formalism is that semantics for a language can be defined once for CBPV rather than twice for CBV and CBV (if semantics for both are desired). Outside of PLT circles however, CBPV has received little uptake or interest as a paradigm for designing concrete programming languages as of this writing (Max New's Fiddle is a notable exception and worth checking out, as is the Levy language which is a part of Andrej Bauer's Programming Languages Zoo).

In essence, CBPV works by distinguishing between values (that which "are") and computations (that which "do") as well as a series of terms and corresponding types which allow one to convert between the two (what nLab terms "shifts").

Advantages

Principled laziness

Because CBPV's original purpose was to be a formalism in which both CBV and CBN could be encoded, it naturally lends itself to expressing CBN-like evaluation patterns. Many otherwise strict languages, such as Scala and Idris, have mechanisms to express lazily-evaluated data. But for my own part, I find these sorts of schemes to be very ad hoc and difficult to informally reason about.

In addition, Anselm is very much an impure language, and lazy evaluation famously does not play well with effects. True call-by-name is expensive, and call-by-need is only equivalent to it if the only effect allowed is non-termination.

Evaluation is never ambiguous

CBN and CBV are often contrasted as the "two" different evaluation strategies but this is a misnomer. In fact, most language display a mixture of behaviors. Haskell's seq can force evaluation, while short-circuiting operators are common in many strict language and are akin to CBN. In reality, CBN and CBV merely represents different extremes on a continuum. Moreover, "strict" evaluation itself doesn't tell the whole story about a concrete language's evaluation semantics. Do expressions evaluate left to right, or right to left? SML specifies the former whereas in OCaml evaluation order is not defined. How are expressions within compound type literals evaluated? In CBPV this can never be unclear because evaluation is never implicit.

A more natural treatment of effects

Anselm has an effects system which strongly militates for using CBPV. In conventional languages, effects are a part of the type signature of function. CBPV affords a much more fine grained and natural semantics: effects are instead of a property of computations, of which functions are but one type.

Custom control structures

Strict languages have to special case e.g. short-circuiting operators. In fact, in strict languages, short-circuiting operators have more in common with control structures than true operators. In either case, such languages generally do not afford the programmer the ability to add new control structures and those that do need macros or other inelegant features to do so. This inflexibility not only limits the expressiveness of the language but also complicates its semantics.

Disadvantages

The primary strength of CBPV is also its greatest weakness as a programming language: because CBPV is very precise about when evaluation occurs, to a human programmer it can appear pedantic and tedious. However, I believe that careful use of implicit "casts" can mitigate most of these issues.

Because CBPV is very precise about where evaluation occurs, a obvious question is whether the use of CBPV as an object language runs the risk of "overspecifying" evaluation and foreclosing on opportunities to optimize programs. I think this is unlikely to be a problem in practice for two reasons. First, it should be possible to recognize the CBV embedding where it occurs and treat those program fragments "strictly". Second, even very sophisticated strictness analyzers such as the Glasgow Haskell Compiler's cannot at all compete with strict languages augmented with lazy types, so it does not seem likely to me that CBPV could do worse than e.g. Haskell. That is to say, anything which gives the compiler even some information about when evaluation can occur that allows for non-pessimistic optimizations strikes me as an improvement over laziness everwhere viz. performance. Nevertheless, this is clearly an open research question.

Finally, and somewhat related to the previous point, I am not aware of any compilers for CBPV-based languages and it is very possible that this will present unforeseen challenges when it comes time to write one for Anselm.

Discussion

Complex Values

In Levy's terminology, "complex values" are values which may contain subterms. Since Anselm has a notion of effects, I think having complex values is a no-brainer: it removes nearly all the tedium associated with explicit shifts because for real world programs a huge portion of functions can be written to be total and so would be able to appear in values without having to be forced.

Of course, what motivates CBPV as a basis for a real-world programming language (and not just a metalanguage or semantic toolkit) is precise control over evaluation that removes the conceptual ambiguity present in most languages. An obvious concern about adding complex values to Anselm is that, while we could just allow any sort of expression to appear in values, we would have to decide on a particular evaluation strategy for values which arguably negates the benefit of adopting CBPV in the first place. Moreover, over-specifying a specific evaluation strategy and/or order requires us to make commitments that can and do interfere with optimizations. This is why Ocaml, though a strict language, does not specify an evaluation order. Because Ocaml allows arbitrary side effects to occur in expressions, the fact that evaluation order is unspecified can result in undefined behavior in user programs. And indeed, that exact same problem would appear in Anselm if we let any sort of expression (that is to say, regardless of effect) appear in complex values.

In the total regime, we can have our cake and eat it too: we can leave evaluation order unspecified and ensure that this never causes undefined behavior. This is because, Anselm's type-and-effect system can enforce that particular computations have no side effects, that is to say are total, and thus reduction is strongly Church-Rosser (see section 2.3 of Turner's paper linked there). What this means in essence is that all possible evaluation strategies are semantically equivalent and we can choose whichever one we want.

As a result, I see little reason to not have complex values. Furthermore, since Anselm is dependently typed, we can naturally restrict dependency to only be allowed on values and not computations (simpliciter) but still allow total "computations" (in the form of complex values) to appear within types. If a programmer has a need where she might want to (for performance reasons) specify an evaluation order of even total computations, she is still free to do with explicit shifts, but the beauty of complex values is that it affords her the ability to communicate precisely where she doesn't care.