parsing - Pattern Matching in dypgen


I want to handle some ambiguities in dypgen. I found something in the manual and want to know how I can use it. In the manual, point 5.2 "Pattern matching on symbols", there is an example:

    expr:
      | expr OP<"+"> expr { $1 + $2 }
      | expr OP<"*"> expr { $1 * $2 }

OP is matched with "+" or "*", as I understand it. I also found this there:

The patterns can be any Caml patterns (but without the keyword when). For instance this is possible:

    expr: expr<(Function([arg1;arg2],f_body)) as f> expr { action }

So I tried to put some other expressions in there, but I don't understand what happens. If I put printf in there, it outputs the value of the matched string. But if I put (fun x -> printf x) in there, which seems to me the same as printf, dypgen complains about a syntax error and points to the end of the expression. If I put Printf.printf in there, it complains: Syntax error: operator expected. And if I put (fun x -> Printf.printf x) there, it says: Lexing failed with message: lexing: empty token. What do these different error messages mean?

In the end I would like to look something up in a hashtable, to check whether the value is in there, but I don't know if it is possible this way. Is it possible or not?


EDIT: A minimal example derived from the forest example in the dypgen demos.

The grammar file forest_parser.dyp contains:

    {
    open Parse_tree
    let dyp_merge = Dyp.keep_all
    }

    %start main
    %layout [' ' '\t']

    %%

    main : np "." "\n" { $1 }

    np:
      | sg    {Noun($1)}
      | pl    {Noun($1)}

    sg: word <Word("sheep"|"fish")>  {Sg($1)}
    sg: word <Word("cat"|"dog")>     {Sg($1)}
    pl: word <Word("sheep"|"fish")>  {Pl($1)}
    pl: word <Word("cats"|"dogs")>   {Pl($1)}

    /* or try:
        sg: word <printf>  {Sg($1)}
        pl: word <printf>  {Pl($1)}
    */

    word:
      | (['a'-'z' 'A'-'Z']+)    {Word($1)}

forest.ml now has the following print_forest function:

    let print_forest forest =
      (* Print one tree in bracketed form, e.g. N [Sg [fish ] ] *)
      let rec aux1 t = match t with
        | Word x -> print_string x
        | Noun (x) -> (
            print_string "N [";
            aux1 x;
            print_string " ]")
        | Sg (x) -> (
            print_string "Sg [";
            aux1 x;
            print_string " ]")
        | Pl (x) -> (
            print_string "Pl [";
            aux1 x;
            print_string " ]")
      in
      (* One line per parse tree in the forest *)
      let aux2 t = aux1 t; print_newline () in
      List.iter aux2 forest;
      print_newline ()

And parse_tree.mli contains:

    type tree =
      | Word of string
      | Noun of tree
      | Sg of tree
      | Pl of tree

And then you can determine which grammatical number (singular or plural) fish, sheep, cat(s) etc. have.

sheep or fish can be singular or plural; cats and dogs cannot. For the input "fish." the output is:

    fish.
    N [Sg [fish ] ]
    N [Pl [fish ] ]
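To see what the four sg/pl rules do, the same classification can be sketched in plain OCaml, outside dypgen. This is only an illustration under assumptions: `rules` and `readings` are hypothetical names, not part of dypgen or the forest demo. The sketch collects every rule whose pattern accepts the word, which is roughly what keeping all parses amounts to for this tiny grammar.

```ocaml
(* Same tree type as in parse_tree.mli. *)
type tree =
  | Word of string
  | Noun of tree
  | Sg of tree
  | Pl of tree

(* Hypothetical helper: each function mirrors one grammar rule and
   returns Some tree when the rule's pattern accepts the word. *)
let rules : (tree -> tree option) list = [
  (function Word ("sheep" | "fish") as w -> Some (Sg w) | _ -> None);
  (function Word ("cat" | "dog") as w -> Some (Sg w) | _ -> None);
  (function Word ("sheep" | "fish") as w -> Some (Pl w) | _ -> None);
  (function Word ("cats" | "dogs") as w -> Some (Pl w) | _ -> None);
]

(* All readings of a word, each wrapped in Noun like the np rule does. *)
let readings (s : string) : tree list =
  List.filter_map
    (fun rule -> Option.map (fun t -> Noun t) (rule (Word s)))
    rules

let () =
  Printf.printf "fish: %d reading(s)\n" (List.length (readings "fish"));
  Printf.printf "cats: %d reading(s)\n" (List.length (readings "cats"))
```

A word that two rules accept, such as "fish", gets two readings here, just as it gets two trees in the forest; "cats" gets one.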

I know nothing about dypgen, so I tried to figure it out.

Let's see what I found out.

In the parser.dyp file you can define the lexer and the parser, or you can use an external lexer. Here's what I did:

My AST looks like this:

parse_prog.mli

    type f =
      | Print of string
      | Function of string list * string * string

    type program = f list

prog_parser.dyp

    {
      open Parse_prog

      (* let dyp_merge = Dyp.keep_all *)

      let string_buf = Buffer.create 10
    }

    %start main

    %relation pf<pr

    %lexer

    let newline = '\n'
    let space = [' ' '\t' '\r']
    let uident = ['A'-'Z']['a'-'z' 'A'-'Z' '0'-'9' '_']*
    let lident = ['a'-'z']['a'-'z' 'A'-'Z' '0'-'9' '_']*

    rule string = parse
      | '"' { () }
      | _ { Buffer.add_string string_buf (Dyp.lexeme lexbuf);
            string lexbuf }

    main lexer =
      newline | space + -> { () }
      "fun"  -> ANONYMFUNCTION { () }
      lident -> FUNCTION { Dyp.lexeme lexbuf }
      uident -> MODULE { Dyp.lexeme lexbuf }
      '"' -> STRING { Buffer.clear string_buf;
                      string lexbuf;
                      Buffer.contents string_buf }

    %parser

    main : function_calls eof                                      { $1 }

    function_calls:
      |                                                            { [] }
      | function_call ";" function_calls                           { $1 :: $3 }

    function_call:
      | printf STRING                                              { Print $2 } pr
      | "(" ANONYMFUNCTION lident "->" printf lident ")" STRING    { Print $6 } pf
      | nested_modules "." FUNCTION STRING                         { Function ($1, $3, $4) } pf
      | FUNCTION STRING                                            { Function ([], $1, $2) } pf
      | "(" ANONYMFUNCTION lident "->" FUNCTION lident ")" STRING  { Function ([], $5, $8) } pf

    printf:
      | FUNCTION<"printf">                                         { () }
      | MODULE<"Printf"> "." FUNCTION<"printf">                    { () }

    nested_modules:
      | MODULE                                                     { [$1] }
      | MODULE "." nested_modules                                  { $1 :: $3 }

This file is the most important one. As you can see, if I have a function call printf "test", my grammar is ambiguous: it can be reduced to either Print "test" or Function ([], "printf", "test"). But, as I realized, I can give priorities to my rules, so if one has a higher priority it will be the one chosen for the first parsing. (Try to uncomment let dyp_merge = Dyp.keep_all and you'll see all the possible combinations.)

And in my main:

main.ml

    open Parse_prog

    (* Print a module path such as ["Format"] as "Format." *)
    let print_stlist fmt sl =
      match sl with
      | [] -> ()
      | _ -> List.iter (Format.fprintf fmt "%s.") sl

    let print_program tl =
      let aux1 t = match t with
        | Function (ml, f, p) ->
          Format.printf "I can't %a%s(\"%s\")@." print_stlist ml f p
        | Print s -> Format.printf "You want to print : %s@." s
      in
      (* Each parse result is a pair; only the program part is printed *)
      let aux2 (prog, _) =
        List.iter aux1 prog; Format.eprintf "------------@." in
      List.iter aux2 tl

    let input_file = Sys.argv.(1)

    (* Prog_parser is the module generated from prog_parser.dyp *)
    let lexbuf = Dyp.from_channel (Prog_parser.pp ()) (open_in input_file)

    let result = Prog_parser.main lexbuf

    let () = print_program result

And, for example, with the following file:

test

    printf "first print";
    Printf.printf "nested print";
    Format.eprintf "nothing possible";
    (fun x -> printf x) "anonymous print";

If I execute ./myexec test I get the following output:

    You want to print : first print
    You want to print : nested print
    I can't Format.eprintf("nothing possible")
    You want to print : x
    ------------

So, TL;DR: the manual example was just there to show you that you can play with your defined tokens (I never defined a token PRINT, only FUNCTION) and match on them to get new rules.

I hope it's clear, I learned a lot from your question ;-)
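This also suggests why the attempts in the question behave so differently. What goes between < and > is a pattern, not an expression: a bare lowercase name such as printf is a variable pattern that matches anything and binds the value, while (fun x -> ...) and Printf.printf are expressions and not valid patterns. That is my reading; I have not checked how dypgen produces each specific message. A pattern can't call a function, so something like a hashtable lookup would have to live in the action code instead. A minimal sketch in plain OCaml, where the table `numbers` and the function `classify` are made up for the illustration:

```ocaml
(* Hypothetical lookup table: word -> grammatical number. *)
let numbers : (string, string) Hashtbl.t = Hashtbl.create 16

let () =
  List.iter (fun (w, n) -> Hashtbl.add numbers w n)
    [ ("cat", "sg"); ("cats", "pl"); ("fish", "sg"); ("fish", "pl") ]

(* The first case is a constant pattern, like FUNCTION<"printf">.
   The second case is a variable pattern, like word<printf> in the
   question: it accepts any string and binds it to the name [printf].
   The lookup happens in the body, which plays the role of the action. *)
let classify (token : string) : string list =
  match token with
  | "printf" -> [ "keyword" ]
  | printf -> Hashtbl.find_all numbers printf

let () =
  List.iter
    (fun w -> Printf.printf "%s -> [%s]\n" w (String.concat "; " (classify w)))
    [ "printf"; "cat"; "fish"; "dog" ]
```

In a grammar, an action that finds nothing in the table would need some way to reject the parse. I believe the dypgen manual describes raising Dyp.Giveup in an action for this, but treat that as an assumption to check against the manual.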

[EDIT] So, I changed the parser to match what you wanted to see:

    {
      open Parse_prog

      (* let dyp_merge = Dyp.keep_all *)

      let string_buf = Buffer.create 10
    }

    %start main

    %relation pf<pp

    %lexer

    let newline = '\n'
    let space = [' ' '\t' '\r']
    let uident = ['A'-'Z']['a'-'z' 'A'-'Z' '0'-'9' '_']*
    let lident = ['a'-'z']['a'-'z' 'A'-'Z' '0'-'9' '_']*

    rule string = parse
      | '"' { () }
      | _ { Buffer.add_string string_buf (Dyp.lexeme lexbuf);
            string lexbuf }

    main lexer =
      newline | space + -> { () }
      "fun"  -> ANONYMFUNCTION { () }
      lident -> FUNCTION { Dyp.lexeme lexbuf }
      uident -> MODULE { Dyp.lexeme lexbuf }
      '"' -> STRING { Buffer.clear string_buf;
                      string lexbuf;
                      Buffer.contents string_buf }

    %parser

    main : function_calls eof                                      { $1 }

    function_calls:
      |                                                            { [] } pf
      | function_call <Function((["Printf"] | []), "printf", st)> ";" function_calls
                                                                   { (Print st) :: $3 } pp
      | function_call ";" function_calls                           { $1 :: $3 } pf

    function_call:
      | nested_modules "." FUNCTION STRING                         { Function ($1, $3, $4) }
      | FUNCTION STRING                                            { Function ([], $1, $2) }
      | "(" ANONYMFUNCTION lident "->" FUNCTION lident ")" STRING  { Function ([], $5, $8) }

    nested_modules:
      | MODULE                                                     { [$1] }
      | MODULE "." nested_modules                                  { $1 :: $3 }

Here, as you can see, I don't handle the fact that my function is a print when I parse it, but when I put it in my list of functions. So, I match on the algebraic type built by my parser. I hope this example is ok for you ;-) (but be warned, this is extremely ambiguous! :-D)
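The pattern attached to function_call in the grammar above is an ordinary OCaml pattern on the value that the rule returned. As a sketch of the same idea outside dypgen, the rewrite can be done as a plain function over the AST; `normalize` is a hypothetical name for the illustration, not something the parser defines:

```ocaml
(* Same AST as in parse_prog.mli. *)
type f =
  | Print of string
  | Function of string list * string * string

(* Turn every call to printf or Printf.printf into Print, using the same
   or-pattern as the grammar rule; everything else is left untouched. *)
let normalize (calls : f list) : f list =
  List.map
    (function
      | Function ((["Printf"] | []), "printf", st) -> Print st
      | other -> other)
    calls

let () =
  let prog = [
    Function ([], "printf", "first print");
    Function (["Printf"], "printf", "nested print");
    Function (["Format"], "eprintf", "nothing possible");
  ] in
  List.iter
    (function
      | Print s -> Printf.printf "Print %S\n" s
      | Function (ml, fn, p) ->
        Printf.printf "Function %s %S\n" (String.concat "." (ml @ [fn])) p)
    (normalize prog)
```

Doing the match after parsing, as here, avoids the ambiguity entirely; doing it inside the grammar, as in the parser above, keeps everything in one place but leaves two competing rules for the same input.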

