
Time: 2022-10-29 13:42:32

I am new to ANTLR, but have used Flex/Bison before. I want to know whether what I have in mind is possible with ANTLR.


I want to parse a PDDL file using ANTLR and, as the file is parsed, build up my own representation of its contents in a Java class that I wrote (in the actions for the rules?). Once parsing has finished, I want the object representation of the file's contents returned to the Java program so it can run other operations on it.


So essentially, I want to invoke an ANTLR-generated PDDL parser on a PDDL file from inside a Java program and have it return an object that describes the PDDL file to the main Java program.


Is this possible? I have tried looking at the documentation, but haven't found a good answer.


Thanks very much.


2 Answers



So essentially, I want to invoke an ANTLR-generated PDDL parser on a PDDL file from inside a Java program and have it return an object that describes the PDDL file to the main Java program.


Is this possible?



First you need to describe your language in an ANTLR grammar file. The easiest way is to use a combined grammar, from which ANTLR generates both a lexer and a parser for your language. When the language gets more complex, it is better to separate the two, but to start out, a single (combined) grammar file is easier.


Let's pretend the PDDL language is a very simple one: a succession of one or more numbers in hexadecimal (0x12FD), octal (0745) or decimal (12345) notation, separated by white space. This language can be described in the following ANTLR grammar file called PDDL.g:


grammar PDDL;

parse
  :  number+ EOF
  ;

number
  :  Hex
  |  Dec
  |  Oct
  ;

Hex
  :  '0' ('x' | 'X') ('0'..'9' | 'a'..'f' | 'A'..'F')+
  ;

Dec
  :  '0'
  |  '1'..'9' ('0'..'9')*
  ;

Oct
  :  '0' '0'..'7'+
  ;

Space
  :  (' ' | '\t' | '\r' | '\n'){$channel=HIDDEN;}
  ;

In this grammar, parse, number, Hex and so on are all rules. The rules whose names start with a capital letter are lexer rules; the others are parser rules.


From this grammar, you can create a lexer and parser like this:


java -cp antlr-3.2.jar org.antlr.Tool PDDL.g

which produces (at least) the files PDDLParser.java and PDDLLexer.java.


Now create a little test class in which you can use these lexer and parser classes:


import org.antlr.runtime.*;
import java.io.*;
import java.util.*;

public class Main {
    public static void main(String[] args) throws Exception {
        File source = new File("source.txt");
        ANTLRInputStream in = new ANTLRInputStream(new FileInputStream(source));
        PDDLLexer lexer = new PDDLLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        PDDLParser parser = new PDDLParser(tokens);
        // Invoke the start rule; nothing is returned or printed yet.
        parser.parse();
    }
}

where the contents of the source.txt file might look like this:


0xcAfE 0234
66678 0X12 0777

Now compile all .java files:


javac -cp antlr-3.2.jar *.java

and run the main class:


// Windows
java -cp .;antlr-3.2.jar Main

// *nix/MacOS
java -cp .:antlr-3.2.jar Main

If all goes well, nothing is printed to the console.


Now you say you wanted to let the parser return certain objects based on the contents of your source file. Let's say we want our grammar to return a List<Integer>. This can be done by embedding "actions" in your grammar rules like this:


grammar PDDL;

parse returns [List<Integer> list]
@init{$list = new ArrayList<Integer>();}
  :  (number {$list.add($number.value);})+ EOF
  ;

number returns [Integer value]
  :  Hex {$value = Integer.parseInt($Hex.text.substring(2), 16);}
  |  Dec {$value = Integer.parseInt($Dec.text);}
  |  Oct {$value = Integer.parseInt($Oct.text, 8);}
  ;

Hex
  :  '0' ('x' | 'X') ('0'..'9' | 'a'..'f' | 'A'..'F')+
  ;

Dec
  :  '0'
  |  '1'..'9' ('0'..'9')*
  ;

Oct
  :  '0' '0'..'7'+
  ;

Space
  :  (' ' | '\t' | '\r' | '\n'){$channel=HIDDEN;}
  ;

As you can see, you can let rules return objects (returns [Type t]) and you can embed plain Java code by wrapping it in { and }. The code in the @init part of the parse rule is placed at the start of the parse method in the PDDLParser.java file.
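The return type does not have to be a List; a rule can return an instance of a class you wrote yourself, which is what the question asks for. Below is a minimal sketch of such a class. The name ParsedNumbers and its methods are hypothetical, and the sketch assumes the class sits in the default package next to the generated parser, so that a rule header like parse returns [ParsedNumbers result] with @init{$result = new ParsedNumbers();} could refer to it without an extra import.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical user-defined result type that grammar actions could populate.
final class ParsedNumbers {
    private final List<Integer> values = new ArrayList<>();

    // An embedded action such as {$result.add($number.value);} would call this.
    void add(int value) {
        values.add(value);
    }

    // Read-only view for the code that called the parser.
    List<Integer> values() {
        return Collections.unmodifiableList(values);
    }

    // Example of an "other operation" run after parsing has finished.
    int sum() {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }
}
```

The calling program would then receive a ParsedNumbers from parser.parse() instead of a List<Integer> and could work with it like any other object.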


Test the new parser with this class:


import org.antlr.runtime.*;
import java.io.*;
import java.util.*;

public class Main {
    public static void main(String[] args) throws Exception {
        File source = new File("source.txt");
        ANTLRInputStream in = new ANTLRInputStream(new FileInputStream(source));
        PDDLLexer lexer = new PDDLLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        PDDLParser parser = new PDDLParser(tokens);
        // The start rule now returns the list built by the embedded actions.
        List<Integer> numbers = parser.parse();
        System.out.println("After parsing :: "+numbers);
    }
}

and you'll see the following being printed to the console:


After parsing :: [51966, 156, 66678, 18, 511]
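If you want to check those values without generating a parser, the conversions done in the three actions can be reproduced in plain Java. This is only a sketch: the class NumberActions and its methods are hypothetical, and it splits on white space instead of using the generated lexer, so it mirrors the actions, not ANTLR's tokenization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the three alternatives of the number rule.
final class NumberActions {

    static int toInt(String text) {
        if (text.startsWith("0x") || text.startsWith("0X")) {
            // Hex: drop the "0x" prefix, then parse in base 16.
            return Integer.parseInt(text.substring(2), 16);
        }
        if (text.length() > 1 && text.startsWith("0")) {
            // Oct: a leading zero followed by more digits, parsed in base 8.
            return Integer.parseInt(text, 8);
        }
        // Dec: everything else is parsed in base 10.
        return Integer.parseInt(text);
    }

    // Splits on white space (the channel the grammar hides) and converts each word.
    static List<Integer> parseAll(String source) {
        List<Integer> list = new ArrayList<>();
        for (String word : source.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                list.add(toInt(word));
            }
        }
        return list;
    }
}
```

Calling NumberActions.parseAll on the two lines of source.txt yields the same five numbers as the parser run above.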



This is certainly possible, since Antlr is designed to generate parsers that then get invoked as part of a larger system (e.g., a compiler or a static code analyzer).


Start with Terence Parr's The Definitive Antlr Reference: Building Domain-Specific Languages. He's the Antlr author, and also an unusually clear and jargon-free teacher on language processing.


Martin Fowler's Domain-Specific Languages uses Antlr in a lot of its examples. For instance, on page 200 he shows a simple "Hello World" example where a Java program calls Antlr to parse a file of people to greet and emits the greetings while doing so. Here's where the work gets done (page 206):


class GreetingsLoader...
  public void run() {
    try {
      GreetingsLexer lexer = new GreetingsLexer(new ANTLRReaderStream(input));
      GreetingsParser parser = new GreetingsParser(new CommonTokenStream(lexer));
      parser.helper = this;
      parser.script();
      if (hasErrors()) throw new RuntimeException("it all went pear-shaped\n" + errorReport());
    } catch (IOException e) {
      throw new RuntimeException(e);
    } catch (RecognitionException e) {
      throw new RuntimeException(e);
    }
  }
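The line parser.helper = this is the interesting part: the parser keeps a reference to the loader, so its embedded actions can hand results back to ordinary Java code. Here is a minimal sketch of that idea without ANTLR. StubGreetingsParser stands in for the generated parser, and every name in it (GreetingsCollector, recordGreeting, load) is hypothetical and not taken from the book.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated parser: its "actions" only delegate to the helper.
final class StubGreetingsParser {
    GreetingsCollector helper;   // set by the caller, like parser.helper = this
    private final String[] names;

    StubGreetingsParser(String input) {
        String trimmed = input.trim();
        this.names = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Plays the role of the start rule: one helper call per recognized name.
    void script() {
        for (String name : names) {
            helper.recordGreeting(name);
        }
    }
}

// Hypothetical loader that owns the results and passes itself to the parser.
final class GreetingsCollector {
    private final List<String> greetings = new ArrayList<>();

    // Called from the parser's actions.
    void recordGreeting(String name) {
        greetings.add("Hello, " + name);
    }

    List<String> load(String input) {
        StubGreetingsParser parser = new StubGreetingsParser(input);
        parser.helper = this;
        parser.script();
        return greetings;
    }
}
```

Keeping the actions this thin leaves the grammar free of application logic; the object model lives in a normal Java class that can be tested on its own.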

A third good book is Terence's new one on DSLs, Language Implementation Patterns. He describes various ways to use Antlr, for instance to write an abstract syntax tree generator to put into a compiler.




