caml-list - the Caml user's mailing list
* [Caml-list] Implementing include "file" statement in menhir
@ 2020-01-21  6:48 Richard W.M. Jones
  2020-01-21  7:15 ` Yann Régis-Gianas
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Richard W.M. Jones @ 2020-01-21  6:48 UTC (permalink / raw)
  To: caml-list, Francois.Pottier, Yann.Regis-Gianas

[Resend, apologies if you get this twice, but I sent it
earlier and that seems to have disappeared.]

I'm writing a parser which needs to have a C-like include directive.
There's an old thread on this describing a rather complicated way to
do this for ocamllex:
https://groups.google.com/forum/#!topic/fa.caml/_v_k4WTQV_Q

I thought I'd have a go at writing an include statement in menhir, and
I did come up with something which works but it's quite a large hack.
What I did is documented below, but I wonder if someone can think of a
simpler way to do this?  Also two related questions:

How do you pass extra parameters to menhir's generated parser
functions?

Is there a nice way to export values into menhir's generated
parser.mli file?

----

The concept behind my include statement uses the following grammar:

  %token INCLUDE
  %token <string> STRING
  %start file
  %%
  file: list(stmt) ;

  stmt:
      | INCLUDE STRING
      {
        let filename = $2 in
        let fp = open_in filename in
        let lexbuf = Lexing.from_channel fp in
        lexbuf.lex_curr_p <- { lexbuf.lex_curr_p with pos_fname = filename };
        (* Recursively call Parser.file: *)
        file Lexer.read lexbuf;
        close_in fp;
      }
      | ... other statements ...
      ;


Unfortunately as written the above code cannot work because it
introduces a circular dependency between the Parser and the Lexer
modules (normally the Lexer module depends on the Parser, and so the
Parser cannot use any functions from the Lexer module).

To break the cycle we have to add:

  %{
  let lexer_read = ref None
  %}

and replace Lexer.read with:

  let reader =
     match !lexer_read with None -> assert false | Some r -> r in
  file reader lexbuf;

Then to initialize lexer_read, we have to export it by doing this
hack:

  menhir parser.mly
  echo 'val lexer_read : (Lexing.lexbuf -> token) option ref' >> parser.mli

and we can set it from the main program.
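
For illustration, the wiring from the main program could look roughly like the following. This is a minimal single-file sketch: the Parser and Lexer modules here are stand-ins for the generated ones, the token constructors are made up, and failwith replaces the assert false so that a forgotten initialization gives a readable error.

```ocaml
(* Stand-in for the generated parser; only the forward reference matters. *)
module Parser = struct
  type token = INCLUDE | STRING of string | EOF

  (* Filled in by the main program before any parsing starts. *)
  let lexer_read : (Lexing.lexbuf -> token) option ref = ref None

  (* What the INCLUDE action would call instead of Lexer.read. *)
  let reader () : Lexing.lexbuf -> token =
    match !lexer_read with
    | Some r -> r
    | None -> failwith "Parser.lexer_read was not initialized"
end

(* Stand-in for the ocamllex-generated lexer, which depends on Parser. *)
module Lexer = struct
  let read (_ : Lexing.lexbuf) : Parser.token = Parser.EOF
end

(* Main program: tie the knot once, before calling the parser. *)
let () = Parser.lexer_read := Some Lexer.read
```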

Rich.


* Re: [Caml-list] Implementing include "file" statement in menhir
From: Yann Régis-Gianas @ 2020-01-21  7:15 UTC (permalink / raw)
  To: Richard W.M. Jones; +Cc: Ocaml Mailing List, Francois.Pottier

Hello Richard,

if I had to do that, I would use the incremental mode of Menhir:
indeed, this mode allows you to manage complex lexing without
interfering with the parser.

In that mode, Menhir produces a function that executes parsing
step-by-step and between each step, you can easily update the state of
the lexer. Besides, that technique solves the circular dependency
problem since the interaction between the lexer and the parser is to
be implemented in a third module, which only depends on the lexer and
the parser.

More precisely, I think that you could intercept the reduction of the
production related to your include directive and enrich your lexer
buffer with the content of the included file on-the-fly.

Cheers,



* Re: [Caml-list] Implementing include "file" statement in menhir
From: François Pottier @ 2020-01-21  8:55 UTC (permalink / raw)
  To: Richard W.M. Jones, caml-list, Yann.Regis-Gianas


Hello,

On 21/01/2020 07:48, Richard W.M. Jones wrote:
> I'm writing a parser which needs to have a C-like include directive.
> I thought I'd have a go at writing an include statement in menhir, and
> I did come up with something which works but it's quite a large hack.

If it is OK to recognize and obey an "include" directive in every grammatical
context (as opposed to only where a valid "stmt" is expected) then I would
suggest implementing support for "include" at the level of the lexer, so the
parser is entirely unaware of it. (I haven't tried it, though; it might take
some thought to come up with an approach that does not involve horrible side
effects.)
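
One possible shape for such a lexer-level wrapper is sketched below. It is only a sketch under stated assumptions: the token type is hypothetical, and the underlying lexer (read) and the function that opens a file (open_file) are passed in as parameters, so the wrapper is generic in the kind of source it reads. Closing channels and tracking positions are left out.

```ocaml
(* Hypothetical token type; in a real project it comes from the parser. *)
type token = INCLUDE | STRING of string | WORD of string | EOF

(* [with_includes ~read ~open_file top] returns a token supplier that
   transparently splices included files into the token stream.
   [read] is the underlying lexer and [open_file] opens a new source;
   both are parameters so that the wrapper depends on neither module. *)
let with_includes ~(read : 'src -> token) ~(open_file : string -> 'src)
    (top : 'src) : unit -> token =
  (* The innermost file currently being read is at the head of the list. *)
  let stack = ref [ top ] in
  let rec next () =
    match !stack with
    | [] -> EOF
    | src :: rest -> (
        match read src with
        | EOF -> (
            match rest with
            | [] -> EOF (* end of the top-level file *)
            | _ -> stack := rest; next () (* resume the including file *))
        | INCLUDE -> (
            match read src with
            | STRING name -> stack := open_file name :: !stack; next ()
            | _ -> failwith "include: expected a file name")
        | tok -> tok)
  in
  next
```

The parser then never sees INCLUDE at all. The only side effect is the stack of open sources, which stays private to the closure. With ocamllex, 'src would be Lexing.lexbuf and read would be the generated lexer function.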

If on the contrary you wish to consider "include <foo.h>" as a valid statement
(and be able in principle to parse what follows without actually reading the
file foo.h) then I would suggest making Include a constructor in the abstract
syntax tree and deferring the reading of included files to a separate "include
resolution" pass (which is allowed to invoke the parser). This approach should
be conceptually simpler. Disadvantages: 1- it should be slower by a constant
factor; 2- the include resolution pass requires writing a lot of boilerplate
traversal code (but this could be automated using the visitors package).
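
A minimal sketch of such an include resolution pass follows, assuming a flat list of statements. The stmt type and the Assign constructor are invented for illustration, and parse_file stands for whatever function opens a file and runs the parser on it. The check for circular includes is an addition of this sketch, not something the approach requires.

```ocaml
(* Hypothetical abstract syntax: Include is just another statement. *)
type stmt =
  | Include of string
  | Assign of string * int

(* [resolve ~parse_file stmts] replaces every Include by the (recursively
   resolved) statements of the named file. [parse_file] is supplied by
   the caller, so this pass can live outside the parser module. *)
let resolve ~(parse_file : string -> stmt list) (stmts : stmt list) :
    stmt list =
  (* [stack] lists the files currently being expanded, innermost first. *)
  let rec go stack stmts = List.concat (List.map (expand stack) stmts)
  and expand stack = function
    | Include name ->
        if List.mem name stack then failwith ("circular include: " ^ name);
        go (name :: stack) (parse_file name)
    | other -> [ other ]
  in
  go [] stmts
```

With a richer syntax tree, the traversal has to descend into every construct that can contain statements, which is where the boilerplate comes from.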

> How do you pass extra parameters to menhir's generated parser functions?

If you need to parameterize the entire parser, you can use %parameter (see the
manual, section 4.1.2).

If you need to parameterize only certain semantic actions, then you can let
these semantic actions return a closure, as in { fun my_parameter -> ... }
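
To make the closure idea concrete, here is a small sketch written as plain OCaml functions rather than as grammar actions. The env record and both function names are hypothetical. Each function plays the role of one semantic action, and its result is the semantic value that the parser would carry around.

```ocaml
(* Hypothetical parameter that the caller wants to thread through. *)
type env = { include_dir : string }

(* Semantic value of  INCLUDE STRING  if its action were written as
   { fun env -> ... } : a function still waiting for its parameter. *)
let include_action (filename : string) : env -> string list =
 fun env -> [ env.include_dir ^ "/" ^ filename ]

(* Semantic value of  file: list(stmt) : pass the parameter down. *)
let file_action (stmts : (env -> string list) list) : env -> string list =
 fun env -> List.concat (List.map (fun stmt -> stmt env) stmts)
```

The start symbol then has a function type, and the caller applies the parser's result to the actual parameter once parsing has finished.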

> Is there a nice way to export values into menhir's generated parser.mli
> file?

No. If you would like to write some code by hand and make it visible in
an .mli file, then you must write a separate module.

Hope this helps,

--
François Pottier
francois.pottier@inria.fr
http://gallium.inria.fr/~fpottier/


* Re: [Caml-list] Implementing include "file" statement in menhir
From: Jocelyn Sérot @ 2020-01-21 13:04 UTC (permalink / raw)
  To: OCaML Mailing List


Hi,

I implemented an include mechanism at the lexer level using ocamllex for the Caph programming language.
It is here (undocumented, unfortunately; the related code starts at line 128):

https://github.com/jserot/caph/blob/master/compiler/lexer.mll

Just in case it can help.


Jocelyn


