[Caml-list] Question on the covariance of a GADT (polymorphic variants involved)mailto:caml-list@inria.fr2021-03-30T14:56:52ZFrançois Pottierfrancois.pottier@inria.fr[Caml-list] Question on the covariance of a GADT (polymorphic variants involved)2020-10-06T11:57:42Zurn:uuid:7d864016-02c8-f306-25b4-96e69129a4d4
```Dear all,

I have a problem convincing OCaml that a GADT is covariant in one of its
parameters.

Here is a simplified version of what I am trying to do:

type 'a t =
| Wine: [> `Wine ] t
| Food: [> `Food ] t
| Box : 'a t * 'a t -> 'a t
| Fridge: [< `Wine | `Food ] t -> [> `Fridge ] t

In this definition, the type parameter 'a is a phantom parameter. It is not
used to describe the type of a value; it is used only to restrict the set of
values of type "t" that can be constructed.

The goal here is to allow storing wine and food (and boxes containing boxes
containing wine and food) in a fridge, but forbid storing a fridge in a
fridge
(or a fridge in a box in a fridge, etc.).

The lower bounds >`Wine and >`Food and >`Fridge are used to keep track
of the
basic objects that are (transitively) contained in a box. The Box
constructor
takes the conjunction of these lower bounds. The Fridge constructor imposes
the upper bound <`Wine|`Food, thereby forbidding a fridge to appear
(transitively) inside a fridge.

You may be wondering why I am using an algebraic data type whose index is
constrained by abusing polymorphic variants, instead of using polymorphic
variants directly. The answer is that is the real type definition I am
working
with is more complex than this: it has more parameters and it is actually a
(real) GADT in these other parameters, so (I think) I cannot turn it into a
polymorphic variant.

Now, the problem is, the OCaml type-checker does not recognize that the type
'a t is covariant in 'a.

Intuitively, it seems clear that this type is covariant in 'a: the
constructors Wine, Food, Fridge impose lower bounds on 'a only, and the
constructor Box uses 'a in a covariant position (if one admits, inductively,
that 'a t is covariant in 'a).

I would like OCaml to accept the fact that t is covariant in 'a because
I have
several functions (smart constructors) that produce values of type 'a t.
When
an application of such a function produces a result of type _'a t, I would
like it to be generalized, so as to obtain 'a t. Currently, this does not
work.

This lack of generalization is not just a minor inconvenience. It leads to
pollution (that is, lower bounds and upper bounds bearing on the same type
variable, when they should normally bear on two distinct type
variables). For
instance, if an object whose type has not been generalized is once placed
*beside* a fridge, then it can no longer be put *inside* a fridge:

let food() = Food
let meat = food()              (* weak type variable appears *)
let _ = Box(meat, Fridge Wine) (* place the meat beside a fridge *)
let _ = Fridge meat            (* meat can no longer be put into
fridge *)
^^^^
Error: This expression has type [> `Food | `Fridge ] t
but an expression was expected of type [< `Food | `Wine ] t
The second variant type does not allow tag(s) `Fridge

An explanation of why OCaml cannot recognize that this type is
covariant, or a
suggestion of a workaround, would be very much appreciated. Thanks!

--
François Pottier
francois.pottier@inria.fr
http://cambium.inria.fr/~fpottier/
```
Olegoleg@okmij.orgRe: [Caml-list] Question on the covariance of a GADT (polymorphic2020-10-06T16:01:19Zurn:uuid:0a80382a-8069-acdd-eff2-963eb9fdef58
```
> type 'a t =
>    | Wine: [> `Wine ] t
>    | Food: [> `Food ] t
>    | Box : 'a t * 'a t -> 'a t
>    | Fridge: [< `Wine | `Food ] t -> [> `Fridge ] t
>
> In this definition, the type parameter 'a is a phantom parameter. It is not
> used to describe the type of a value; it is used only to restrict the set of
> values of type "t" that can be constructed.
>
> The goal here is to allow storing wine and food (and boxes containing
> boxes containing wine and food) in a fridge, but forbid storing a
> fridge in a fridge (or a fridge in a box in a fridge, etc.).

This is indeed a very `popular' problem. On June 21 this year Josh
Berdine posed almost the same question (without wine and food, alas).
On 28 June 2020 I sent the reply with three solutions. Perhaps it
would be easier to point to the code, which also has many comments and
explanations:

http://okmij.org/ftp/tagless-final/subtyping.ml

The gist of the first two solutions is to exploit the fact that
tagless-final is sort of isomorphic to GADTs. That is, lots of things
that can be done with GADTs can be done without (or with GADTs hidden
away). That hiding has benefit of no longer bringing the problems with
variance. No GADTs (at least, not in the part facing the user), no
variance problems, at least, not for the end user. The library author
may use regular ADT, which are also non-problematic. A regular ADT
doesn't have the same typing guarantees -- but the typing is enforced
by the signature (at the module level), so there is no loss.

I was going to write an article for my web site explaining this and a
few earlier answers to the similar problems. Maybe I will eventually
get to it.

```
François Pottierfrancois.pottier@inria.frRe: [Caml-list] Question on the covariance of a GADT (polymorphic2020-10-06T16:45:34Zurn:uuid:739fdcf6-1ce8-ea14-de7b-68f76572c475
```Dear Oleg,

Le 06/10/2020 à 18:05, Oleg a écrit :
> This is indeed a very `popular' problem. On June 21 this year Josh
> Berdine posed almost the same question (without wine and food, alas).
> On 28 June 2020 I sent the reply with three solutions.

Great, thanks for your quick reply!

Now I have to go do my homework and study your solutions :-)
I'll be in touch.

--
François Pottier
francois.pottier@inria.fr
http://cambium.inria.fr/~fpottier/
```
Josh Berdinejosh@berdine.netRe: [Caml-list] Question on the covariance of a GADT (polymorphic2020-10-06T19:05:11Zurn:uuid:a1a62f88-299e-710a-e628-cdacabcf2019
```> On Oct 6, 2020, at 5:05 PM, Oleg <oleg@okmij.org> wrote:
>
>> type 'a t =
>>   | Wine: [> `Wine ] t
>>   | Food: [> `Food ] t
>>   | Box : 'a t * 'a t -> 'a t
>>   | Fridge: [< `Wine | `Food ] t -> [> `Fridge ] t
>>
>> In this definition, the type parameter 'a is a phantom parameter. It is not
>> used to describe the type of a value; it is used only to restrict the set of
>> values of type "t" that can be constructed.
>>
>> The goal here is to allow storing wine and food (and boxes containing
>> boxes containing wine and food) in a fridge, but forbid storing a
>> fridge in a fridge (or a fridge in a box in a fridge, etc.).
>
> This is indeed a very `popular' problem. On June 21 this year Josh
> Berdine posed almost the same question (without wine and food, alas).
> On 28 June 2020 I sent the reply with three solutions. Perhaps it
> would be easier to point to the code, which also has many comments and
> explanations:
>
>        http://okmij.org/ftp/tagless-final/subtyping.ml
>
> The gist of the first two solutions is to exploit the fact that
> tagless-final is sort of isomorphic to GADTs. That is, lots of things
> that can be done with GADTs can be done without (or with GADTs hidden
> away). That hiding has benefit of no longer bringing the problems with
> variance. No GADTs (at least, not in the part facing the user), no
> variance problems, at least, not for the end user. The library author
> may use regular ADT, which are also non-problematic. A regular ADT
> doesn't have the same typing guarantees -- but the typing is enforced
> by the signature (at the module level), so there is no loss.
>
> I was going to write an article for my web site explaining this and a
> few earlier answers to the similar problems. Maybe I will eventually
> get to it.

Francois, your examples is way better than mine! :-)

Apologies for dropping the ball before, I got busy and there was some chaos going on.

But Oleg, thank you very much for digging into this so thoroughly and making such concrete suggestions!

There are some points I'm unsure of. My understanding of `Reducer` is that it is a way of implementing a particular (reduction) function on expressions from this little language, where rather than the "initial" approach of defining an explicit datatype for the expressions and then reducing them, you use constructor functions that eagerly perform the reduction, and extract the result with `observe`.

What I'm unsure of is how this generalizes to additional functions on expressions. For instance, suppose I also wanted to define a function which computed the set of int literals that appear in the (unreduced) expression? One possibility would seem to be to revise your second solution (say) so that `'a Reducer.t` is `'a * int list` instead of `'a`, and revise the constructor functions to compute this list of ints eagerly.

My working assumption is that which operation on the expression will be performed isn't known at the time when they are constructed. Otherwise a variant of the whole `Reducer` module could be defined for each operation, each with its own `'a t` type defined as appropriate.

More pressingly, I think, is what operations such as `map_ints : (int -> int) -> 'a exp -> 'a exp` would look like. With an initial style representation, this could be defined by traversing the abstract syntax and rebuilding it with the updated `int`s. One approach would be to change `Reducer.t` to a sort of abstract syntax, but this seems to me to lead straight back to the original `'a exp` type. This is, as far as I understand, coming up against the usual situation where the "initial" representation is syntactic in nature and the "final" representation is semantic. I don't know how much is known about handling such cases in the final style though.

Does it sound like I simply have not understood your solution? :-)

Cheers, Josh

> On Jun 27, 2020, at 4:36 PM, Oleg <oleg@okmij.org> wrote:
>
> Josh Berdine has posed a problem:
>> As a concrete example, consider something like:
>> ```
>> type _ exp =
>>  | Integer : int -> [> `Integer] exp
>>  | Float : float -> [> `Float] exp
>>  | Add : ([< `Integer | `Float] as 'n) exp * 'n exp -> 'n exp
>>  | String : string -> [> `String] exp
>> ```
>> The intent here is to use polymorphic variants to represent a small
>> (upper semi-) lattice, where basically each point corresponds to a
>> subset of the constructors. The type definition above is admitted, but
>> while the index types allow subtyping ...
>> this does not extend to the base type:
>> ```
>> # let widen_exp x = (x : [`Integer] exp :> [> `Integer | `Float] exp);;
>> Line 1, characters 18-67:
>> 1 | let widen_exp x = (x : [`Integer] exp :> [> `Integer | `Float] exp);;
>>                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> Error: Type [ `Integer ] exp is not a subtype of
>>         ([> `Float | `Integer ] as 'a) exp
>>       The first variant type does not allow tag(s) `Float
>> ```
>> This makes sense since `type _ exp` is not covariant. But trying to
>> declare it to be covariant doesn't fly, yielding:
>> ```
>> Error: In this GADT definition, the variance of some parameter
>>       cannot be checked
>> ```
>
> We demonstrate three solutions to this problem.  The first has an
> imperfection, but is easier to explain. The second removes the
> imperfection. The third one is verbose, but requires the least of
> language features and can be implemented in SML or perhaps even
> Java. All three solutions give the end user the same static
> guarantees, statically preventing building of senseless
> expressions. All three rely on the tagless-final style, in effect
> viewing the problem as a language design problem -- designing a DSL
> with subtyping. It has been my experience that the tagless-final,
> algebraic point of view typically requires less fancy
> typing. Incidentally, the third solution, of explicit coercions, can
> be combined with GADTs and is hence the general solution to the posed
> problem, showing how to do phantom index subtyping with GADTs.
>
> Presenting all three solutions is too long for a message, so one
> should look at the complete code with many comments:
>
>        http://okmij.org/ftp/tagless-final/subtyping.ml
>
> Below is the outline of the first solution, the easiest to
> explain. (The others just add slight variations; the main
> example looks almost the same in all three).
>
> We will think of the problem as a DSL with subtyping.
> Let's define its syntax and the type system, as an OCaml signature:
>
> module type exp = sig
>  type +'a t
>  val int    : int   -> [> `int] t
>  val float  : float -> [> `float] t
>  val add    : ([< `int | `float] as 'n) t -> 'n t -> [> `int | `float] t
>  val string : string -> [> `string] t
> end
>
> The language expressions are indexed with a subset of tags `int,
> `float, `string. The language has the obvious subtyping: [> `int]
> can be weakened to [> `int | `float]. This weakening, and general typechecking
> of our DSL is performed by OCaml's type checker for us. Let's see an
> example
>
> module Ex1(I:exp) = struct
>  open I
>
>  (* We can put expressions of different sorts into the same list *)
>  let l1 = [int 1; float 2.0; add (add (int 3) (float 1.0)) (int 4)]
>  (* val l1 : [> `float | `int ] I.t list *)
>
>  let l2 = string "str" :: l1
>  (* val l2 : [> `float | `int | `string ] I.t list *)
>
>  (* An attempt to build an invalid expression fails, with a good error message:
>  let x = add (int 1) (string "s")
>                      ^^^^^^^^^^^^
>  Error: This expression has type [> `string ] t
>       but an expression was expected of type [< `float | `int > `int ] t
>       The second variant type does not allow tag(s) `string
>  *)
>
>  (* We can define a function to sum up elements of a list of expressions,
>     returning a single expression with addition
>  *)
>  let rec sum l = List.fold_left add (int 0) l
>  (* val sum : [ `float | `int ] I.t list -> [ `float | `int ] I.t *)
>
>  (* We can sum up l1 *)
>  let l1v = sum l1
>  (* val l1v : [ `float | `int ] I.t *)
>
>  (* but, predictably, not l2
>  let l2v = sum l2
>  *)
> end
>
> Now, what we can do with the expressions? We can print/show them
>
> module Show : (exp with type 'a t = string) = struct
>   type 'a t = string
>   ... elided ...
> end
>
> let _ = let module M = Ex1(Show) in M.l1
> (* - : string list = ["1"; "2."; "Add(Add(3,1.),4)"] *)
>
> We can also reduce them. The following is the reducer -- although
> in reality it does the normalization-by-evaluation (NBE)
>
> A subset of expressions: values
>
> module type vals = sig
>  type +'a t
>  val int    : int    -> [> `int] t
>  val float  : float  -> [> `float] t
>  val string : string -> [> `string] t
> end
>
> module Reducer(I:vals) :
>   (sig include exp val observe : ([> `float | `int ] as 'a) t -> 'a I.t end)
>  = struct
>   type 'a t = Unk of 'a I.t | Int of int | Float of float
>   let observe = function
>   | Unk x -> x
>   | Int x -> I.int x
>   | Float x -> I.float x
>
>   let int x    = Int x
>   let float x  = Float x
>   let string x = Unk (I.string x)
>
>   let add x y = match (x,y) with
>   | (Int x, Int y) -> Int (x+y)
>   | (Float x, Float y) -> Float (x+.y)
>   | (Int x, Float y) | (Float y, Int x) -> Float (float_of_int x +. y)
>   (* This is the imperfection. We know that the case cannot happen,
>      but this is not clear from the type system.
>      We should stress that this imperfection does not compromise the guarantees
>      offered to the end user, who never sees the internals of Reducer,
>      and can't even access them.
>    *)
>   | _ -> assert false
> end
>
> (* We can now evaluate exp to values. We need a way to show them though.
>   Luckily, exp is a superset of values, and we already wrote the show
>   interpreter for exp
> *)
> let _ = let module I = Reducer(Show) in
>        let module M = Ex1(I) in List.map I.observe M.l1
> (* - : string list = ["1"; "2."; "8."] *)
>
> The second solutions removes the imperfection in Reduce, making
> Reduce.add total, by making the types more phantom. The third solution
> does the obvious thing: implementing subtyping using explicit
> coercions. In the given example, it doesn't add too much annoyance to
> the user. It does however remove the reliance on variance and so
> is applicable to GADTs in general. One can imagine more solutions
> along the shown lines.
>

```
François Pottierfrancois.pottier@inria.frRe: [Caml-list] Question on the covariance of a GADT (polymorphic variants involved)2020-10-06T19:19:03Zurn:uuid:40e399e1-8779-f4d0-2024-0364b99b5f07
```On 06/10/2020 at 21:04, Josh Berdine wrote:
> Francois, your examples is way better than mine!:-)

Thanks :-)

As far as I understand Oleg's solutions, they all boil down to the same
idea,
which I have considered, but is only half satisfactory: that is, to
publish a
covariant abstract type +'a t which internally is implemented as an
abbreviation for an unparameterized type u. (See the food/wine/fridge
example,
rewritten in this style, below.)

As far as the user is concerned, this works: the relaxed value restriction
works, so in my example, "meat" receives a polymorphic type, and I am happy.

The reason why this solution is only half satisfactory is that inside the
abstraction barrier, we have an unparameterized type u that does not express
the desired data invariant, so we cannot exploit the invariant to prove that
certain branches in the code are dead.

Also, the user cannot perform case analysis, since the type 'a t is
abstract.
This is not a problem for me, but could be a problem in other situations.

Certainly, this is better than nothing, so I may end up adopting this
approach
(thanks again Oleg!).

I am still curious, though, to know why my original type +'a t is not
recognized as covariant by the OCaml type-checker...

--
François Pottier
francois.pottier@inria.fr
http://cambium.inria.fr/~fpottier/

module T : sig

type +'a t

val wine: [> `Wine ] t
val food: [> `Food ] t
val box: 'a t * 'a t -> 'a t
val fridge: [< `Wine | `Food ] t -> [> `Fridge ] t

end = struct

type u =
| Wine: u
| Food: u
| Box : u * u -> u
| Fridge: u -> u

type 'a t = u

let wine = Wine
let food = Food
let box (t1, t2) = Box (t1, t2)
let fridge t = Fridge t

end

open T

let food() = food
let meat = food()              (* the relaxed value restriction works *)
let _ = box(meat, fridge wine) (* place the meat beside a fridge *)
let _ = fridge meat            (* meat can still be put into fridge *)
```
Josh Berdinejosh@berdine.netRe: [Caml-list] Question on the covariance of a GADT (polymorphic variants involved)2020-10-06T20:11:01Zurn:uuid:b69628cb-66d9-a69e-5d89-9dec40434a54
```> On Oct 6, 2020, at 8:18 PM, François Pottier <francois.pottier@inria.fr> wrote:
>
> I am still curious, though, to know why my original type +'a t is not
> recognized as covariant by the OCaml type-checker...

Jacques helpfully gave a precise answer earlier:

> On Jun 22, 2020, at 8:39 AM, Jacques Garrigue <jacques.garrigue@gmail.com> wrote:
>
> The best I can say is that this combination of GADTs and polymorphic variants is not supported.
> There are actually two distinct problems:
> * GADT indices must be invariant
> * refinement of row types (both polymorphic variants and object types) is only partially supported
>  (you can instantiate an existential row only once)
>
> So I wouldn’t suggest investing time in trying to outsmart the type system in this direction.
> The best you could achieve would be discovering a soundness bug (which should of course be promptly reported).
>
> If your goal is just to move the polymorphism from the type itself to its parameter, using normal phantom types (maybe combined with phantom types to allow pattern matching) could help you, even if it may leave a few runtime checks.
>
> Otherwise, you may try to encode your domain in a proper GADT, not using polymorphic variants.
> There are ways to keep some degree of subsumption.
>
> By the way, I’m not sure what you mean by “weaker exhaustiveness” of polymorphic variants.
> I agree that polymorphic variants give weaker typing, it that they infer the declarations, but at least closed pattern-mathcing on them do guarantee exhaustiveness.

Jacques, I am sorry that I failed to address this before. I did not mean to imply that there was a problem, by "weaker exhaustiveness" I only mean that for a pattern match that is missing an intended case, the type error involving the inferred type can end up "far away" from the point of the mistake in the code. One way around this is to add type more annotations so that an incompatibility cannot propagate as far. But this relies on programmer-discipline rather than being irrefutably enforced by the type system.

Cheers, Josh

> Jacques
>
> P.S. Here is an example of specific domain encoding:
>
> type ‘a number = Number of ‘a
>
> type _ exp =
> | Integer : int -> int number exp
> | Float : float -> float number exp
> | Add : ‘a number exp * ‘a number exp -> ‘a number exp
> | String : string -> string exp
>
> or
>
> type number = Number
>
> type _ exp =
> | Integer : int -> number exp
> | Float : float -> number exp
> | Add : number exp * number exp -> number exp
> | String : string -> string exp
>
> Basically, you must choose between what is static and what is dynamic.
>
> On 21 Jun 2020, at 18:10, Josh Berdine <josh@berdine.net> wrote:
>>
>> I wonder if anyone has any suggestions on approaches to use GADTs with a type parameter that is phantom and allows subtyping.
>>
>> (My understanding is that the approach described in
>> "Gabriel Scherer, Didier Rémy. GADTs meet subtyping. ESOP 2013"
>> hasn't entirely been settled on as something that ought to be implemented in the compiler. Right? Is there anything more recent describing alternatives, concerns or limitations, etc.?)
>>
>> As a concrete example, consider something like:
>> ```
>> type _ exp =
>> | Integer : int -> [> `Integer] exp
>> | Float : float -> [> `Float] exp
>> | Add : ([< `Integer | `Float] as 'n) exp * 'n exp -> 'n exp
>> | String : string -> [> `String] exp
>> ```
>> The intent here is to use polymorphic variants to represent a small (upper semi-) lattice, where basically each point corresponds to a subset of the constructors. The type definition above is admitted, but while the index types allow subtyping:
>> ```
>> let widen_index x = (x : [`Integer] :> [> `Integer | `Float])
>> ```
>> this does not extend to the base type:
>> ```
>> # let widen_exp x = (x : [`Integer] exp :> [> `Integer | `Float] exp);;
>> Line 1, characters 18-67:
>> 1 | let widen_exp x = (x : [`Integer] exp :> [> `Integer | `Float] exp);;
>>                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> Error: Type [ `Integer ] exp is not a subtype of
>>        ([> `Float | `Integer ] as 'a) exp
>>      The first variant type does not allow tag(s) `Float
>>
>> ```
>> This makes sense since `type _ exp` is not covariant. But trying to declare it to be covariant doesn't fly, yielding:
>> ```
>> Error: In this GADT definition, the variance of some parameter
>>      cannot be checked
>> ```
>>
>> I'm not wedded to using polymorphic variants here, but I have essentially the same trouble using a hierarchy of class types (with different sets of `unit` methods to induce the intended subtype relation) to express the index lattice. Are there other options?
>>
>> I also tried using trunk's injectivity annotations (`type +!_ exp = ...`) on a lark since I'm not confident that I fully understand what happens to the anonymous type variables for the rows in the polymorphic variants. But that doesn't seem to change things.
>>
>> Is this sort of use case something for which there are known encodings? In a way I'm trying to benefit from some of the advantages of polymorphic variants (subtyping) without the drawbacks such as weaker exhaustiveness and less compact memory representation by keeping the polymorphic variants in a phantom index. Is this approach doomed?
>>
>> Cheers, Josh
>>
>

```
David Allsoppdra-news@metastack.comRE: [Caml-list] Question on the covariance of a GADT (polymorphic2020-10-07T08:11:00Zurn:uuid:f7f009e2-60f9-2812-2be1-3732f2eb757a
```Oleg wrote:
> > type 'a t =
> >    | Wine: [> `Wine ] t
> >    | Food: [> `Food ] t
> >    | Box : 'a t * 'a t -> 'a t
> >    | Fridge: [< `Wine | `Food ] t -> [> `Fridge ] t
> >
> > In this definition, the type parameter 'a is a phantom parameter. It
> > is not used to describe the type of a value; it is used only to
> > restrict the set of values of type "t" that can be constructed.
> >
> > The goal here is to allow storing wine and food (and boxes
> > containing boxes containing wine and food) in a fridge, but forbid
> > storing a fridge in a fridge (or a fridge in a box in a fridge, etc.).
>
> This is indeed a very `popular' problem. On June 21 this year Josh
> Berdine posed almost the same question (without wine and food, alas).
> On 28 June 2020 I sent the reply with three solutions. Perhaps it
> would be easier to point to the code, which also has many comments and
> explanations:
>
>         http://okmij.org/ftp/tagless-final/subtyping.ml
>
> The gist of the first two solutions is to exploit the fact that
> tagless- final is sort of isomorphic to GADTs. That is, lots of things
> that can be done with GADTs can be done without (or with GADTs hidden
> away). That hiding has benefit of no longer bringing the problems with
> variance. No GADTs (at least, not in the part facing the user), no
> variance problems, at least, not for the end user. The library author
> may use regular ADT, which are also non-problematic. A regular ADT
> doesn't have the same typing guarantees -- but the typing is enforced
> by the signature (at the module level), so there is no loss.
>
> I was going to write an article for my web site explaining this and a
> few earlier answers to the similar problems. Maybe I will eventually
> get to it.

It is indeed popular - as are your historic answers, too :)
https://inbox.ocaml.org/caml-list/20130705102138.98516.qmail@www1.g3.pair.co
m/

--dra

```
Olegoleg@okmij.orgRe: [Caml-list] Question on the covariance of a GADT (polymorphic2020-10-07T13:13:23Zurn:uuid:27e6c9f2-3aa2-ab7a-f9a2-8ed1dcfd4be5
```> What I'm unsure of is how this generalizes to additional functions on
> expressions. For instance, suppose I also wanted to define a function
> which computed the set of int literals that appear in the (unreduced)
> expression?  ...
> More pressingly, I think, is what operations such as `map_ints : (int
> -> int) -> 'a exp -> 'a exp` would look like.

Frankly speaking, after re-reading the message I sent back in June, I
found it rather confusing. It didn't say what the main idea was and
the talk about Reducer was too long and unenlightening. Luckily, this
time I have your excellent questions, and so get another chance to
explain the approach, hopefully better.

The basic set up is a little language with int, float and string
literals, and the addition operation. The user should be able to add
(and keep adding) ints and floats, but attempting to add strings
should raise a clear error message. The language expressions should be
first-class, that is, we should be able to put them into a list, among
other things. (I too find Francois' example superb, but I already had
the code for the earlier example).

Here is the definition of the language: its operations. The type
parameter of t enforces the constraint: ints and floats can be
(recursively) added in any combination but strings may not.

module type exp = sig
type +'a t
val int    : int   -> [> `int] t
val float  : float -> [> `float] t
val add    : ([< `int | `float] as 'n) t -> 'n t -> [> `int | `float] t
val string : string -> [> `string] t
end

Here are sample expressions:

module Ex1(I:exp) = struct
open I

(* We can put expressions of different sorts into the same list *)
let l1 = [int 1; float 2.0; add (add (int 3) (float 1.0)) (int 4)]
(* val l1 : [> `float | `int ] I.t list *)

let l2 = string "str" :: l1
(* val l2 : [> `float | `int | `string ] I.t list *)

(* An attempt to build an invalid expression fails, with a good error message:
let x = add (int 1) (string "s")
^^^^^^^^^^^^
Error: This expression has type [> `string ] t
but an expression was expected of type [< `float | `int > `int ] t
The second variant type does not allow tag(s) `string
*)

(* We can define a function to sum up elements of a list of expressions,
returning a single expression with addition
*)
let rec sum l = List.fold_left add (int 0) l
(* val sum : [ `float | `int ] I.t list -> [ `float | `int ] I.t *)

(* We can sum up l1 *)
let l1v = sum l1
(* val l1v : [ `float | `int ] I.t *)

(* but, predictably, not l2
let l2v = sum l2
^^
Error: This expression has type [> `float | `int | `string ] t list
but an expression was expected of type [ `float | `int ] t list
The second variant type does not allow tag(s) `string
*)
end

One doesn't have to use the functor as I just did. First-class modules,
or even records/objects will work too.

It is useful to see the expressions. Hence the first implementation of
the signature exp:

module Show : (exp with type 'a t = string) = struct
type 'a t = string
let int      = string_of_int
let float    = string_of_float
let string x = "\"" ^ String.escaped x ^ "\""
let add x y  = Printf.sprintf "Add(%s,%s)" x y
end

Here is how the sample expressions look like:

let _ = let module M = Ex1(Show) in M.l1
(* - : string list = ["1"; "2."; "Add(Add(3,1.),4)"] *)
let _ = let module M = Ex1(Show) in M.l2
(* - : string list = ["\"str\""; "1"; "2."; "Add(Add(3,1.),4)"] *)
let _ = let module M = Ex1(Show) in M.l1v
(* - : string = "Add(Add(Add(0,1),2.),Add(Add(3,1.),4))" *)

Now, to your questions.

``suppose I also wanted to define a function which computed the set of
int literals that appear in the (unreduced) expression?''

That is easy: we just interpret the same expressions in a different
way: instead of showing them, we compute the set of integers that occur in
their literals.

module IntSet = Set.Make(struct type t = int let compare = compare end)

module CollectInts : (exp with type 'a t = IntSet.t) = struct
type 'a t = IntSet.t
let int x    = IntSet.singleton x
let float x  = IntSet.empty
let string x = IntSet.empty
let add x y  = IntSet.union x y
end

let _ = let module M = Ex1(CollectInts) in List.map IntSet.elements M.l1
(* - : int list list = [; []; [3; 4]] *)
let _ = let module M = Ex1(CollectInts) in IntSet.elements M.l1v
(* - : int list = [0; 1; 3; 4] *)

The second question: ``More pressingly, I think, is what operations such as
`map_ints : (int -> int) -> 'a exp -> 'a exp` would look like. With an
initial style representation, this could be defined by traversing the
abstract syntax and rebuilding it with the updated `int`s.''

In tagless-final, it looks essentially like this, only simpler.
There is no need to write any traversal (a tagless-final term
itself specifies the traversal of itself)

module MapInts(Map:sig val f : int -> int end)(I:exp) = struct
type 'a t = 'a I.t
let int x  = I.int (Map.f x)
let float  = I.float
let string = I.string
let add    = I.add
end

let _ = let module M = Ex1(MapInts(struct let f = succ end)(Show)) in M.l1
(* - : string list = ["2"; "2."; "Add(Add(4,1.),5)"] *)
let _ = let module M = Ex1(MapInts(struct let f = succ end)(Show)) in M.l2
(* - : string list = ["\"str\""; "2"; "2."; "Add(Add(4,1.),5)"] *)
let _ = let module M = Ex1(MapInts(struct let f = succ end)(Show)) in M.l1v
(* - : string = "Add(Add(Add(1,2),2.),Add(Add(4,1.),5))" *)

The central idea is this:

eval (map f exp) === (compose eval (map f)) exp ===
let eval' = compose eval (map f) in eval' exp

Instead of transforming expressions we transform interpreters
(eval). Since evaluation is essentially folding, two foldings may be
fused.

I have updated
http://okmij.org/ftp/tagless-final/subtyping.ml
to include your questions.
```
Olegoleg@okmij.orgRe: [Caml-list] Question on the covariance of a GADT (polymorphic2020-10-07T13:13:37Zurn:uuid:69647779-46e6-cd5c-fff4-6ab3b64c275a
```> As far as I understand Oleg's solutions, they all boil down to the
> same idea, which I have considered, but is only half satisfactory:
> that is, to publish a covariant abstract type +'a t which internally
> is implemented as an abbreviation for an unparameterized type u. (See
> the food/wine/fridge example, rewritten in this style, below.)
Good summary!

> The reason why this solution is only half satisfactory is that inside the
> abstraction barrier, we have an unparameterized type u that does not express
> the desired data invariant, so we cannot exploit the invariant to prove that
> certain branches in the code are dead.

Indeed. That was the `imperfection' that I have referred to in my
message. I have two comments: first, there is the second solution,
which does exploit the invariant. Second, there may be more than one
abstraction barrier... That is, many implementations of the wine_food
signature may be in terms of some `more basic' signature -- or just
the same signature, if we are talking about term transformations (as I
have just replied to Josh Berdine).  Eventually we have to implement
the `most basic signature', and indeed we have to drop the indices and
fall into the `untyped' universe, so to speak. One hopes though that
this `security kernel' is really `most basic' and its operations are
few and primitive, so the correctness could be relatively easily
ascertained.

> Also, the user cannot perform case analysis, since the type 'a t is
> abstract.  This is not a problem for me, but could be a problem in
> other situations.

When I first mentioned tagless-final, in passing at a some conference
in Oxford a decade ago, an old Oxford professor (one of fathers of
SML) exclaimed: ``Isn't it bloody obvious that you cannot do
pattern-matching in this style!?''. I liked his phrase. Now I like to
say that pattern-matching in tagless-final style is not bloody, is not
obvious and is not impossible.

That it is always possible was proven by Boehm and Berarducci (their
proof was one inspiration for the development of Coq, as I heard).
http://okmij.org/ftp/tagless-final/course/Boehm-Berarducci.html
Although their general method works always, often one may design
simpler methods for particular pattern-matching tasks.
```