javac的词法分析

时间:2022-06-11 13:37:27
 
1、词法分析将Java源文件的字符流转变为对应的Token流。
 
JavacParser类规定了哪些词是符合Java语言规范规定的词,而具体读取和归类不同词法的操作由Scanner类来完成。
 
首先来看一下词法分析:
 /**
    CompilationUnit = [ { "@" Annotation } PACKAGE Qualident ";"] {ImportDeclaration} {TypeDeclaration}
  */
    public JCTree.JCCompilationUnit parseCompilationUnit() {
        int pos = S.pos();
        JCExpression pid = null;
        String dc = S.docComment();
        JCModifiers mods = null;
        List<JCAnnotation> packageAnnotations = List.nil();
        if (S.token() == MONKEYS_AT)
            mods = modifiersOpt();
        if (S.token() == PACKAGE) {
            if (mods != null) {
                checkNoMods(mods.flags);
                packageAnnotations = mods.annotations;
                mods = null;
            }
            S.nextToken();
            pid = qualident();
            accept(SEMI);
        }
        ListBuffer<JCTree> defs = new ListBuffer<JCTree>();
        boolean checkForImports = true;
        while (S.token() != EOF) {
            if (S.pos() <= errorEndPos) {
                // error recovery
                skip(checkForImports, false, false, false);
                if (S.token() == EOF)
                    break;
            }
            if (checkForImports && mods == null && S.token() == IMPORT) {
                defs.append(importDeclaration());
            } else {
                JCTree def = typeDeclaration(mods);
                if (keepDocComments && dc != null && docComments.get(def) == dc) {
                    // If the first type declaration has consumed the first doc
                    // comment, then don't use it for the top level comment as well.
                    dc = null;
                }
                if (def instanceof JCExpressionStatement)
                    def = ((JCExpressionStatement)def).expr;
                defs.append(def);
                if (def instanceof JCClassDecl)
                    checkForImports = false;
                mods = null;
            }
        }
        JCTree.JCCompilationUnit toplevel = F.at(pos).TopLevel(packageAnnotations, pid, defs.toList());
        attach(toplevel, dc);
        if (defs.elems.isEmpty())
            storeEnd(toplevel, S.prevEndPos());
        if (keepDocComments)
            toplevel.docComments = docComments;
        if (keepLineMap)
            toplevel.lineMap = S.getLineMap();
        return toplevel;
    }
可以看到,从源文件的第一个字符开始,按照Java语法规范依次找出package、import、类定义、以及属性和方法等,最后构成一个抽象的语法树。
其实在形成一个Token流时,在词法分析的过程中,可以归纳为三类,分别是:
(1)标识符号:如Token.PLUS、Token.EQ、Token.LBRACE、Token.RBRACE等
(2)Java的保留关键字:
 
数据类型:
Boolean\int\long\short\byte\float\double\char\class\interface
 
流程控制:
if\else\do\while\for\switch\case\default\break\continue\return\try\catch\finally
 
修饰符:      
public   
protected   
private   
final   
void    
static   
strictfp    
abstract    
transient
synchronized    
volatile   
native
 
动作:          
package   
import    
throw   
throws    
extends   
implements   
this   
Super   
instanceof   
new
 
保留字:      
true\false\null\goto\const
 
(3)Token.IDENTIFIER
 
用来表示用户自定义的类名、包名、变量包、方法名等
/**
     * Qualident = Ident { DOT Ident }
     */
    public JCExpression qualident() {
        JCExpression t = toP(F.at(S.pos()).Ident(ident()));
        while (S.token() == DOT) {
            int pos = S.pos();
            S.nextToken();
            t = toP(F.at(pos).Select(t, ident()));
        }
        return t;
    }

  

可以打开Scanner类中的如下属性来进行调试查看:

private static boolean scannerDebug = false;