Can I use an ANTLR-generated lexer/parser to parse a PDDL file and return the data to a Java program?

Time: 2022-10-29 13:42:32

I am new to ANTLR, but have used Flex/Bison before. I want to know whether what I have in mind is possible with ANTLR.

I want to parse a PDDL file using ANTLR and, as the file is parsed, build up my own representation of its contents in a Java class that I wrote (in the actions for the rules?). Once parsing is finished, I want to return that object representation to the Java program so it can run other operations on it.

So essentially, I want to invoke an ANTLR-generated PDDL parser on a PDDL file from inside a Java program and have it return an object that describes the PDDL file to the main Java program.

Is this possible? I have tried looking at the documentation, but haven't found a good answer.

Thanks very much.

2 Answers

#1


8  

So essentially, I want to invoke an ANTLR-generated PDDL parser on a PDDL file from inside a Java program and have it return an object that describes the PDDL file to the main Java program.

Is this possible?

Sure.

First you need to describe your language in an ANTLR grammar file. The easiest way is to use a combined grammar, from which ANTLR generates both a lexer and a parser for your language. When the language gets more complex, it is better to separate the two, but to start out it is easier to use just one (combined) grammar file.

For the sake of the example, let's pretend PDDL is a very simple language: a sequence of one or more numbers in hexadecimal (0x12FD), octal (0745) or decimal (12345) notation, separated by whitespace. This language can be described in the following ANTLR grammar file, called PDDL.g:

grammar PDDL;

parse
  :  number+ EOF
  ;

number
  :  Hex
  |  Dec
  |  Oct
  ;

Hex
  :  '0' ('x' | 'X') ('0'..'9' | 'a'..'f' | 'A'..'F')+
  ;

Dec
  :  '0'
  |  '1'..'9' ('0'..'9')*
  ;

Oct
  :  '0' '0'..'7'+
  ;

Space
  :  (' ' | '\t' | '\r' | '\n'){$channel=HIDDEN;}
  ;

In this grammar, parse, number, Hex and so on are all rules. Those whose names start with a capital letter (Hex, Dec, Oct, Space) are lexer rules; the others (parse, number) are parser rules.
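
To get a feel for what the three number lexer rules accept without running ANTLR at all, here is a rough plain-Java sketch that mirrors them with java.util.regex patterns. The class name TokenKindSketch and the method classify are made up for illustration; this is not what ANTLR generates, and it classifies whole whitespace-separated words, which simplifies what a real lexer does.

```java
import java.util.regex.Pattern;

// Plain-Java sketch (not ANTLR output): mirrors the Hex, Dec and Oct lexer rules.
class TokenKindSketch {
    // Same shapes as the lexer rules in PDDL.g, written as regular expressions.
    private static final Pattern HEX = Pattern.compile("0[xX][0-9a-fA-F]+");
    private static final Pattern DEC = Pattern.compile("0|[1-9][0-9]*");
    private static final Pattern OCT = Pattern.compile("0[0-7]+");

    // Returns the name of the rule that matches the whole word, or "none".
    static String classify(String word) {
        if (HEX.matcher(word).matches()) return "Hex";
        if (DEC.matcher(word).matches()) return "Dec";
        if (OCT.matcher(word).matches()) return "Oct";
        return "none";
    }

    public static void main(String[] args) {
        // The three example numbers from the language description above.
        for (String word : "0x12FD 0745 12345".split("\\s+")) {
            System.out.println(word + " -> " + classify(word));
        }
    }
}
```

Note that a lone 0 is a Dec token, while 0 followed by octal digits is an Oct token, exactly as the two rules are written.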

From this grammar, you can create a lexer and parser like this:

java -cp antlr-3.2.jar org.antlr.Tool PDDL.g

which produces (at least) the files PDDLParser.java and PDDLLexer.java.

Now create a little test class in which you can use these lexer and parser classes:

import org.antlr.runtime.*;
import java.io.*;
import java.util.*;

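public class Main {
    public static void main(String[] args) throws Exception {
        // Wrap the input file in an ANTLR character stream.
        File source = new File("source.txt");
        ANTLRInputStream in = new ANTLRInputStream(new FileInputStream(source));
        // The lexer turns characters into tokens; the parser consumes the token stream.
        PDDLLexer lexer = new PDDLLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        PDDLParser parser = new PDDLParser(tokens);
        // Invoke the start rule of the grammar.
        parser.parse();
    }
}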

where the contents of the source.txt file might look like this:

0xcAfE 0234
66678 0X12 0777

Now compile all .java files:

javac -cp antlr-3.2.jar *.java

and run the main class:

// Windows
java -cp .;antlr-3.2.jar Main

// *nix/MacOS
java -cp .:antlr-3.2.jar Main

If all goes well, nothing is printed to the console.

You said you want the parser to return certain objects based on the contents of your source file. Let's say we want our grammar to return a List<Integer>. This can be done by embedding "actions" in your grammar rules, like this:

grammar PDDL;

parse returns [List<Integer> list]
@init{$list = new ArrayList<Integer>();}
  :  (number {$list.add($number.value);})+ EOF
  ;

number returns [Integer value]
  :  Hex {$value = Integer.parseInt($Hex.text.substring(2), 16);}
  |  Dec {$value = Integer.parseInt($Dec.text);}
  |  Oct {$value = Integer.parseInt($Oct.text, 8);}
  ;

Hex
  :  '0' ('x' | 'X') ('0'..'9' | 'a'..'f' | 'A'..'F')+
  ;

Dec
  :  '0'
  |  '1'..'9' ('0'..'9')*
  ;

Oct
  :  '0' '0'..'7'+
  ;

Space
  :  (' ' | '\t' | '\r' | '\n'){$channel=HIDDEN;}
  ;

As you can see, you can let rules return objects (returns [Type t]) and can embed plain Java code by wrapping it in { and }. The code in the @init block of the parse rule is placed at the start of the parse() method in the generated PDDLParser.java file.
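
If you want to check the conversions done by those actions on their own, the following plain-Java sketch repeats them outside the grammar. The class NumberActionsSketch and its methods are hypothetical names, and splitting on whitespace stands in for the generated lexer; only the Integer.parseInt calls are taken from the actions above.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the conversions performed by the actions in the number rule.
class NumberActionsSketch {
    // Mirrors the three alternatives of the number rule for the text of one token.
    static int toValue(String text) {
        if (text.startsWith("0x") || text.startsWith("0X")) {
            return Integer.parseInt(text.substring(2), 16); // Hex: drop the 0x/0X prefix
        }
        if (text.length() > 1 && text.startsWith("0")) {
            return Integer.parseInt(text, 8);               // Oct: leading zero, radix 8
        }
        return Integer.parseInt(text);                      // Dec
    }

    // Mirrors the parse rule: start with an empty list (@init), then add one value per number.
    static List<Integer> toList(String source) {
        List<Integer> list = new ArrayList<>();
        for (String text : source.trim().split("\\s+")) {
            list.add(toValue(text));
        }
        return list;
    }

    public static void main(String[] args) {
        // Same contents as the source.txt example.
        System.out.println(toList("0xcAfE 0234\n66678 0X12 0777"));
    }
}
```

Like the grammar, this sketch expects at least one number in the input.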

Test the new parser with this class:

import org.antlr.runtime.*;
import java.io.*;
import java.util.*;

public class Main {
    public static void main(String[] args) throws Exception {
        // Same setup as before: file -> character stream -> lexer -> tokens -> parser.
        File source = new File("source.txt");
        ANTLRInputStream in = new ANTLRInputStream(new FileInputStream(source));
        PDDLLexer lexer = new PDDLLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        PDDLParser parser = new PDDLParser(tokens);
        // The start rule now returns the list built by the embedded actions.
        List<Integer> numbers = parser.parse();
        System.out.println("After parsing :: "+numbers);
    }
}

and you'll see the following printed to the console:

After parsing :: [51966, 156, 66678, 18, 511]
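
The same mechanism should work for a class of your own instead of List<Integer>. As a rough sketch, assume a hypothetical result class NumberFile: the parse rule could then declare returns [NumberFile file], create the object in @init and call $file.add(...) in the action, provided the class is visible to the generated parser. The class below is plain Java with no ANTLR dependency, and its main method only simulates what the actions would do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical result class: stands in for "my own representation" of the parsed file.
class NumberFile {
    private final List<Integer> numbers = new ArrayList<>();

    // An embedded action could call this once per recognized number.
    void add(int value) {
        numbers.add(value);
    }

    // Read-only view for the code that runs after parsing.
    List<Integer> numbers() {
        return Collections.unmodifiableList(numbers);
    }

    // Example of an "other operation" run on the parsed result.
    int sum() {
        int total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for what the parser's actions would do while reading source.txt.
        NumberFile file = new NumberFile();
        for (int value : new int[] {51966, 156, 66678, 18, 511}) {
            file.add(value);
        }
        System.out.println(file.numbers() + " sum=" + file.sum());
    }
}
```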

#2


0  

This is certainly possible, since ANTLR is designed to generate parsers that are then invoked as part of a larger system (e.g., a compiler or a static code analyzer).

Start with Terence Parr's The Definitive ANTLR Reference: Building Domain-Specific Languages. He's the ANTLR author, and also an unusually clear and jargon-free teacher of language processing.

Martin Fowler's Domain-Specific Languages uses ANTLR in a lot of its examples. For instance, on page 200 he shows a simple "Hello World" example in which a Java program calls ANTLR to parse a file of people to greet and emits the greetings as it goes. Here's where the work gets done (page 206):

class GreetingsLoader...
  public void run() {
    try {
      GreetingsLexer lexer = new GreetingsLexer(new ANTLRReaderStream(input));
      GreetingsParser parser = new GreetingsParser(new CommonTokenStream(lexer));
      parser.helper = this;
      parser.script();
      if (hasErrors()) throw new RuntimeException("it all went pear-shaped\n" + errorReport());
    } catch (IOException e) {
      throw new RuntimeException(e);
    } catch (RecognitionException e) {
      throw new RuntimeException(e);
    }
  }
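
The interesting line there is parser.helper = this: the parser gets a reference to the loader and calls back into it while parsing. Below is a rough, self-contained sketch of that wiring in plain Java. Everything in it is hypothetical (the names Loader and FakeParser, the one-name-per-line input format, the greeting text); it is not Fowler's code, and FakeParser only stands in for an ANTLR-generated parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the helper pattern; names and input format are illustrative only.
class GreetingsHelperSketch {
    // Plays the role of the loader: collects results and errors reported during parsing.
    static class Loader {
        final List<String> greetings = new ArrayList<>();
        final List<String> errors = new ArrayList<>();

        void greet(String name) { greetings.add("Hello, " + name + "!"); }
        void error(String message) { errors.add(message); }
        boolean hasErrors() { return !errors.isEmpty(); }
    }

    // Stand-in for a generated parser: it holds a helper and calls back into it.
    static class FakeParser {
        Loader helper;

        // Assumed input format: one name per line.
        void script(String input) {
            for (String line : input.split("\n")) {
                String name = line.trim();
                if (name.isEmpty()) {
                    helper.error("empty line");
                } else {
                    helper.greet(name);
                }
            }
        }
    }

    public static void main(String[] args) {
        Loader loader = new Loader();
        FakeParser parser = new FakeParser();
        parser.helper = loader; // same wiring as parser.helper = this in the excerpt
        parser.script("Ann\nBob");
        if (loader.hasErrors()) {
            throw new RuntimeException(String.valueOf(loader.errors));
        }
        System.out.println(loader.greetings);
    }
}
```

Compared with returning a value from the start rule, this keeps the grammar actions down to one-line calls on the helper, with the real logic living in ordinary Java.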

A third good book is Terence's new one on DSLs, Language Implementation Patterns. He describes various ways to use ANTLR, for instance to write an abstract syntax tree generator to put into a compiler.
