caml-list - the Caml user's mailing list
From: Yann Régis-Gianas <yrg@irif.fr>
To: "Richard W.M. Jones" <rich@annexia.org>
Cc: Ocaml Mailing List <caml-list@inria.fr>, "Francois.Pottier@inria.fr" <Francois.Pottier@inria.fr>
Subject: Re: [Caml-list] Implementing include "file" statement in menhir
Date: Tue, 21 Jan 2020 08:15:34 +0100
Message-ID: <CAM+Uc3XkjQTgejsTyA5ikEZeB4WZVXwJMXMZN-ONiH1Sss3-hg@mail.gmail.com>
In-Reply-To: <20200121064845.GM27889@rich.annexia.org>

Hello Richard,

If I had to do that, I would use the incremental mode of Menhir. This
mode lets you manage complex lexing without interfering with the
parser.

In that mode, Menhir produces functions that run the parser one step
at a time, and between two steps you can update the state of the
lexer. This technique also solves the circular dependency problem,
since the interaction between the lexer and the parser is implemented
in a third module, which depends on the lexer and the parser while
neither of them depends on it.
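
To make the dependency structure concrete, here is a minimal,
self-contained sketch. The Parser, Lexer and Driver modules below are
toy stand-ins written by hand for illustration; they are not what
Menhir generates. The checkpoint type is a simplification loosely
shaped after the InputNeeded, Accepted and Rejected checkpoints of
Menhir's incremental API (available with --table), which is richer
than this.

```ocaml
(* Toy stand-ins only: these are NOT Menhir's generated modules. *)
module Parser = struct
  type token = INT of int | PLUS | EOF

  (* Simplified checkpoint: the parser either asks for a token or stops. *)
  type 'a checkpoint =
    | InputNeeded of (token -> 'a checkpoint)
    | Accepted of 'a
    | Rejected

  (* Step-by-step recognizer for INT (PLUS INT)* EOF, computing the sum. *)
  let rec expect_int acc =
    InputNeeded (function INT n -> expect_op (acc + n) | _ -> Rejected)

  and expect_op acc =
    InputNeeded
      (function
        | PLUS -> expect_int acc
        | EOF -> Accepted acc
        | INT _ -> Rejected)

  let start = expect_int 0
end

module Lexer = struct
  (* Depends on Parser for the token type, as an ocamllex lexer would. *)
  type state = { mutable words : string list }

  let of_string s =
    { words = List.filter (fun w -> w <> "") (String.split_on_char ' ' s) }

  (* Raises Failure on a word that is neither "+" nor an integer. *)
  let read st =
    match st.words with
    | [] -> Parser.EOF
    | w :: rest ->
        st.words <- rest;
        if w = "+" then Parser.PLUS else Parser.INT (int_of_string w)
end

module Driver = struct
  (* Third module: depends on both; neither Parser nor Lexer depends on it. *)
  let rec run lexer checkpoint =
    match checkpoint with
    | Parser.InputNeeded k ->
        (* Between two steps the driver is free to inspect or update [lexer]. *)
        run lexer (k (Lexer.read lexer))
    | Parser.Accepted v -> Some v
    | Parser.Rejected -> None

  let parse_string s = run (Lexer.of_string s) Parser.start
end
```

Because the loop lives in Driver, the parser never needs to name the
lexer, so the cycle does not arise.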

More precisely, I think that you could intercept the reduction of the
production related to your include directive and enrich your lexer
buffer with the content of the included file on-the-fly.
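
As a rough illustration of enriching the input on the fly, the sketch
below keeps a stack of pending inputs and pushes the included file's
tokens when it meets an include. Note the simplification: it splices
at the token level inside the token supplier, rather than hooking the
reduction through Menhir's API. The token type, the in-memory files
table (standing in for open_in), the tokenize helper and make_supplier
are all hypothetical names introduced for this example.

```ocaml
(* Tokens redefined here so the sketch runs on its own. *)
type token = INCLUDE | STRING of string | WORD of string | EOF

(* Hypothetical in-memory "files", so nothing is read from disk. *)
let files : (string, string) Hashtbl.t = Hashtbl.create 4

(* Toy tokenizer: whitespace-separated words; a word wrapped in double
   quotes becomes a STRING, the word "include" becomes INCLUDE. *)
let tokenize text =
  String.map (fun c -> if c = '\n' then ' ' else c) text
  |> String.split_on_char ' '
  |> List.filter (fun w -> w <> "")
  |> List.map (fun w ->
         let n = String.length w in
         if w = "include" then INCLUDE
         else if n >= 2 && w.[0] = '"' && w.[n - 1] = '"' then
           STRING (String.sub w 1 (n - 2))
         else WORD w)

(* Returns a token supplier for the file [top]. Inputs live on a stack:
   an include pushes the included file's tokens, and an exhausted input
   is popped. Raises Not_found for an unknown file name; there is no
   protection against include cycles. *)
let make_supplier top =
  let stack = ref [ tokenize (Hashtbl.find files top) ] in
  let rec next () =
    match !stack with
    | [] -> EOF
    | [] :: rest ->
        stack := rest;
        next ()
    | (INCLUDE :: STRING name :: tl) :: rest ->
        stack := tokenize (Hashtbl.find files name) :: tl :: rest;
        next ()
    | (tok :: tl) :: rest ->
        stack := tl :: rest;
        tok
  in
  next
```

With a real lexer, the stack would presumably hold Lexing.lexbuf
values instead of token lists, and the driver would decide when to
push one.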

Cheers,

On Tue, Jan 21, 2020 at 7:48 AM Richard W.M. Jones <rich@annexia.org> wrote:
>
> [Resend, apologies if you get this twice, but I sent it
> earlier and that seems to have disappeared.]
>
> I'm writing a parser which needs to have a C-like include directive.
> There's an old thread on this describing a rather complicated way to
> do this for ocamllex:
> https://groups.google.com/forum/#!topic/fa.caml/_v_k4WTQV_Q
>
> I thought I'd have a go at writing an include statement in menhir, and
> I did come up with something which works but it's quite a large hack.
> What I did is documented below, but I wonder if someone can think of a
> simpler way to do this?  Also two related questions:
>
> How do you pass extra parameters to menhir's generated parser
> functions?
>
> Is there a nice way to export values into menhir's generated
> parser.mli file?
>
> ----
>
> The concept behind my include statement uses the following grammar:
>
>   %token INCLUDE
>   %token <string> STRING
>   %start file
>   %%
>   file: list(stmt) ;
>
>   stmt:
>       | INCLUDE STRING
>       {
>         let filename = $2 in
>         let fp = open_in filename in
>         let lexbuf = Lexing.from_channel fp in
>         lexbuf.lex_curr_p <- { lexbuf.lex_curr_p with pos_fname = filename };
>         (* Recursively call Parser.file: *)
>         file Lexer.read lexbuf;
>         close_in fp;
>       }
>       | ... other statements ...
>       ;
>
>
> Unfortunately as written the above code cannot work because it
> introduces a circular dependency between the Parser and the Lexer
> modules (normally the Lexer module depends on the Parser, and so the
> Parser cannot use any functions from the Lexer module).
>
> To break the cycle we have to add:
>
>   %{
>   let lexer_read = ref None
>   %}
>
> and replace Lexer.read with:
>
>   let reader =
>      match !lexer_read with None -> assert false | Some r -> r in
>   file reader lexbuf;
>
> Then to initialize lexer_read, we have to export it by doing this
> hack:
>
>   menhir parser.mly
>   echo 'val lexer_read : (Lexing.lexbuf -> token) option ref' >> parser.mli
>
> and we can set it from the main program.
>
> Rich.
