• 从零写一个编译器(十三):代码生成之遍历AST


    项目的完整代码在 C2j-Compiler

    前言

    在上一篇完成对JVM指令的生成,下面就可以真正进入代码生成部分了。通常现代编译器都是先把生成IR,再经过代码优化等等,最后才编译成目标平台代码。但是时间水平有限,我们没有IR也没有代码优化,就直接利用AST生成Java字节码

    入口

    进行代码生成的入口在CodeGen,和之前解释器一样:先获取main函数的头节点,从这个节点开始,先进入函数定义,再进入代码块

    函数定义节点

    在进入函数定义节点的时候,就要生成一个函数定义对应的Java字节码,即一个静态方法(因为我们对整个C语言文件生成为一个类,main方法为public static main,其它的则是对应的静态方法,结构体则是另外的类)

    • 对于函数定义先从节点中拿到对应的函数命和参数
    • emitArgs是用来处理参数的,根据参数生成相应的Java字节码
    • 如果这个函数是main的话已经是交由前面处理了,逻辑差不多(具体在start.Start中)
    case SyntaxProductionInit.NewName_LP_RP_TO_FunctDecl:
        root.reverseChildren();
        AstNode n = root.getChildren().get(0);
        String name = (String) n.getAttribute(NodeKey.TEXT);
        symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
        generator.setCurrentFuncName(name);
        if (name != null && !name.equals("main")) {
            String declaration = name + emitArgs(symbol);
            generator.emitDirective(Directive.METHOD_PUBBLIC_STATIC, declaration);
            generator.setNameAndDeclaration(name, declaration);
        }
        copyChild(root, root.getChildren().get(0));
        break;
    
    case SyntaxProductionInit.NewName_LP_VarList_RP_TO_FunctDecl:
        n = root.getChildren().get(0);
        name = (String) n.getAttribute(NodeKey.TEXT);
        symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
        generator.setCurrentFuncName(name);
        if (name != null && !name.equals("main")) {
            String declaration = name + emitArgs(symbol);
            generator.emitDirective(Directive.METHOD_PUBBLIC_STATIC, declaration);
            generator.setNameAndDeclaration(name, declaration);
        }
    
        Symbol args = symbol.getArgList();
    
        if (args == null || argsList == null || argsList.isEmpty()) {
            System.err.println("generate function with arg list but arg list is null");
            System.exit(1);
        }
        break;
    

    创建结构体和数组

    数组

    创建结构体和数组的节点在DefGenerate里,可以看到在这里只处理了数组和普通变量,有关结构体的处理是在对结构体第一次使用的时候。顺便提一下代码生成对于赋初值操作是没有进行处理的。

    • 如果是个数组,酒直接调用ProgramGenerator直接生成创建数组的指令
    • 如果是个普通变量,就直接找到它并且赋值为0(这里变量在队列里的位置是根据符号表来计算的,具体可以看上一篇的getLocalVariableIndex方法)
    public class DefGenerate extends BaseGenerate {
        @Override
        public Object generate(AstNode root) {
            int production = (int) root.getAttribute(NodeKey.PRODUCTION);
            ProgramGenerator generator = ProgramGenerator.getInstance();
            Symbol symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
    
            switch (production) {
                case SyntaxProductionInit.Specifiers_DeclList_Semi_TO_Def:
                    Declarator declarator = symbol.getDeclarator(Declarator.ARRAY);
                    if (declarator != null) {
                        if (symbol.getSpecifierByType(Specifier.STRUCTURE) == null) {
                            generator.createArray(symbol);
                        }
                    } else {
                        int i = generator.getLocalVariableIndex(symbol);
                        generator.emit(Instruction.SIPUSH, "" + 0);
                        generator.emit(Instruction.ISTORE, "" + i);
                    }
    
                    break;
    
                default:
                    break;
            }
    
            return root;
        }
    }
    

    结构体

    处理结构体定义的代码在UnaryNodeGenerate,也就是只有在使用到结构体定义时才会进行定义

    • 先拿到当前UNARY的符号,如果instanceof ArrayValueSetter就说明是一个结构体数组,就进入getStructSymbolFromStructArray方法创建一个结构体数组,并返回当前下标的结构体对象
    • 设置当前结构体的作用域范围
    • 对结构体作为类进行定义
    • 然后对读取结构体的域
    • 其实可以忽略指针部分,因为代码生成并没有对指针进行模拟
    case SyntaxProductionInit.Unary_StructOP_Name_TO_Unary:
        child = root.getChildren().get(0);
        String fieldName = (String) root.getAttribute(NodeKey.TEXT);
        Object object = child.getAttribute(NodeKey.SYMBOL);
        boolean isStructArray = false;
    
        if (object instanceof ArrayValueSetter) {
            symbol = getStructSymbolFromStructArray(object);
            symbol.addValueSetter(object);
            isStructArray = true;
        } else {
            symbol = (Symbol) child.getAttribute(NodeKey.SYMBOL);
        }
    
        if (isStructArray) {
            ArrayValueSetter vs = (ArrayValueSetter) object;
            Symbol structArray = vs.getSymbol();
            structArray.addScope(ProgramGenerator.getInstance().getCurrentFuncName());
        } else {
            symbol.addScope(ProgramGenerator.getInstance().getCurrentFuncName());
        }
    
        ProgramGenerator.getInstance().putStructToClassDeclaration(symbol);
    
        if (isSymbolStructPointer(symbol)) {
            copyBetweenStructAndMem(symbol, false);
        }
    
        Symbol args = symbol.getArgList();
        while (args != null) {
            if (args.getName().equals(fieldName)) {
                args.setStructParent(symbol);
                break;
            }
    
            args = args.getNextSymbol();
        }
    
        if (args == null) {
            System.err.println("access a filed not in struct object!");
            System.exit(1);
        }
    
        if (args.getValue() != null) {
            ProgramGenerator.getInstance().readValueFromStructMember(symbol, args);
        }
    
        root.setAttribute(NodeKey.SYMBOL, args);
        root.setAttribute(NodeKey.VALUE, args.getValue());
    
        if (isSymbolStructPointer(symbol)) {
            checkValidPointer(symbol);
            structObjSymbol = symbol;
            monitorSymbol = args;
    
            GenerateBrocasterImpl.getInstance().registerReceiverForAfterExe(this);
        } else {
            structObjSymbol = null;
        }
        break;
    

    一元操作节点

    这个节点和在解释器的有很多相同,除了有对结构体的操作,其它的也是有非常重要的作用

    • 像数字、字符串或者是变量和之前的操作都是把信息传递到父节点,交由父节点处理
    case SyntaxProductionInit.Number_TO_Unary:
        text = (String) root.getAttribute(NodeKey.TEXT);
        boolean isFloat = text.indexOf('.') != -1;
        if (isFloat) {
            value = Float.valueOf(text);
            root.setAttribute(NodeKey.VALUE, value);
        } else {
            value = Integer.valueOf(text);
            root.setAttribute(NodeKey.VALUE, value);
        }
        break;
    
    case SyntaxProductionInit.Name_TO_Unary:
        symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
        if (symbol != null) {
            root.setAttribute(NodeKey.VALUE, symbol.getValue());
            root.setAttribute(NodeKey.TEXT, symbol.getName());
        }
        break;
    
    case SyntaxProductionInit.String_TO_Unary:
        text = (String) root.getAttribute(NodeKey.TEXT);
        root.setAttribute(NodeKey.VALUE, text);
        break;
    
    case SyntaxProductionInit.Unary_LB_Expr_RB_TO_Unary:
        child = root.getChildren().get(0);
        symbol = (Symbol) child.getAttribute(NodeKey.SYMBOL);
    
        child = root.getChildren().get(1);
        int index = 0;
        if (child.getAttribute(NodeKey.VALUE) != null) {
            index = (Integer) child.getAttribute(NodeKey.VALUE);
        }
        Object idxObj = child.getAttribute(NodeKey.SYMBOL);
    
        try {
            Declarator declarator = symbol.getDeclarator(Declarator.ARRAY);
            if (declarator != null) {
                Object val = declarator.getElement((int) index);
                root.setAttribute(NodeKey.VALUE, val);
                ArrayValueSetter setter;
                if (idxObj == null) {
                    setter = new ArrayValueSetter(symbol, index);
                } else {
                    setter = new ArrayValueSetter(symbol, idxObj);
                }
    
                root.setAttribute(NodeKey.SYMBOL, setter);
                root.setAttribute(NodeKey.TEXT, symbol.getName());
    
            }
            Declarator pointer = symbol.getDeclarator(Declarator.POINTER);
            if (pointer != null) {
                setPointerValue(root, symbol, index);
                PointerValueSetter pv = new PointerValueSetter(symbol, index);
                root.setAttribute(NodeKey.SYMBOL, pv);
                root.setAttribute(NodeKey.TEXT, symbol.getName());
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.exit(1);
        }
        break;
    

    赋值操作

    • 如果当前是一个数组,先拿到它的符号和下标
    • 如果不是结构体数组,那么拿到下标直接用readArrayElement生成读取数组元素的指令
    • 如果是一个符号则用getLocalVariableIndex读取这个符号的值
    • 如果是一个常数,则直接生成IPUSH指令
    • 最后进行赋值操作,如果不是对结构体的域进行赋值就直接用getLocalVariableIndex拿到队列位置然后生成ISTORE
    • 如果是对结构体数组的元素的域的赋值,就调用assignValueToStructMemberFromArray生成代码,如果只是结构体就直接调用assignValueToStructMember生成代码
    ProgramGenerator generator = ProgramGenerator.getInstance();
    
    if (BaseGenerate.resultOnStack) {
        this.value = obj;
        BaseGenerate.resultOnStack = false;
    } else if (obj instanceof ArrayValueSetter) {
        ArrayValueSetter setter = (ArrayValueSetter) obj;
        Symbol symbol = setter.getSymbol();
        Object index = setter.getIndex();
        if (symbol.getSpecifierByType(Specifier.STRUCTURE) == null) {
            if (index instanceof Symbol) {
                ProgramGenerator.getInstance().readArrayElement(symbol, index);
                if (((Symbol) index).getValue() != null) {
                    int i = (int) ((Symbol) index).getValue();
                    try {
                        this.value = symbol.getDeclarator(Declarator.ARRAY).getElement(i);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            } else {
                int i = (int) index;
                try {
                    this.value = symbol.getDeclarator(Declarator.ARRAY).getElement(i);
                } catch (Exception e) {
                    e.printStackTrace();
                }
    
                ProgramGenerator.getInstance().readArrayElement(symbol, index);
            }
        }
    } else if (obj instanceof Symbol) {
        Symbol symbol = (Symbol) obj;
        this.value = symbol.value;
        int i = generator.getLocalVariableIndex(symbol);
        generator.emit(Instruction.ILOAD, "" + i);
    } else if (obj instanceof Integer) {
        Integer val = (Integer) obj;
        generator.emit(Instruction.SIPUSH, "" + val);
        this.value = obj;
    }
    
    if (!this.isStructMember()) {
        int idx = generator.getLocalVariableIndex(this);
        if (!generator.isPassingArguments()) {
            generator.emit(Instruction.ISTORE, "" + idx);
        }
    } else {
        if (this.getStructSymbol().getValueSetter() != null) {
            generator.assignValueToStructMemberFromArray(this.getStructSymbol().getValueSetter(), this, this.value);
        } else {
            generator.assignValueToStructMember(this.getStructSymbol(), this, this.value);
        }
    }
    

    最后

    完成这部分后,对下面的代码

    void quicksort(int A[10], int p, int r) {
        int x;
        int i;
        i = p - 1;
        int j;
        int t;
        int v;
        v = r - 1;
        if (p < r) {
            x = A[r];
            for (j = p; j <= v; j++) {
                if (A[j] <= x) {
                    i++;
                    t = A[i];
                    A[i] = A[j];
                    A[j] = t;
                }
            }
            v = i + 1;
            t = A[v];
            A[v] = A[r];
            A[r] = t;
            t = v - 1;
            quicksort(A, p, t);
            t = v + 1;
            quicksort(A, t, r);
        }
    }
    
    void main () {
        int a[10];
        int i;
        int t;
        printf("before quick sort:");
        for(i = 0; i < 10; i++) {
            t = (10 - i);
            a[i] = t;
            printf("value of a[%d] is %d", i, a[i]);
        }
        quicksort(a, 0, 9);
        printf("after quick sort:");
        for (i = 0; i < 10; i++) {
            printf("value of a[%d] is %d", i, a[i]);
        }
    }
    

    则会生成下面的Java字节码

    .class public C2Bytecode
    .super java/lang/Object
    
    .method public static main([Ljava/lang/String;)V
    	sipush	10
    	newarray	int
    	astore	0
    	sipush	0
    	istore	1
    	sipush	0
    	istore	2
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"before quick sort:"
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"
    "
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	sipush	0
    	istore	1
    
    loop0:
    	iload	1
    	sipush	10
    if_icmpge branch0
    	sipush	10
    	iload	1
    	isub
    	istore	2
    	aload	0
    	iload	1
    	iload	2
    	iastore
    	aload	0
    	iload	1
    	iaload
    	istore	3
    	iload	1
    	istore	4
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"value of a["
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	iload	4
    	invokevirtual	java/io/PrintStream/print(I)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"] is "
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	iload	3
    	invokevirtual	java/io/PrintStream/print(I)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"
    "
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	iload	1
    	sipush	1
    	iadd
    	istore	1
    goto loop0
    branch0:
    	aload	0
    	sipush	0
    	sipush	9
    	invokestatic	C2Bytecode/quicksort([III)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"after quick sort:"
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"
    "
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	sipush	0
    	istore	1
    
    loop2:
    	iload	1
    	sipush	10
    if_icmpge branch4
    	aload	0
    	iload	1
    	iaload
    	istore	3
    	iload	1
    	istore	4
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"value of a["
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	iload	4
    	invokevirtual	java/io/PrintStream/print(I)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"] is "
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	iload	3
    	invokevirtual	java/io/PrintStream/print(I)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"
    "
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	iload	1
    	sipush	1
    	iadd
    	istore	1
    goto loop2
    branch4:
    	return
    .end method
    .method public static quicksort([III)V
    	sipush	2
    	newarray	int
    	astore	6
    	sipush	0
    	istore	5
    	sipush	1
    	istore	5
    	aload	6
    	iload	5
    	sipush	1
    	iastore
    	aload	6
    	sipush	1
    	iaload
    	istore	10
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"before quick sort: "
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	iload	10
    	invokevirtual	java/io/PrintStream/print(I)V
    	getstatic	java/lang/System/out Ljava/io/PrintStream;
    	ldc	"
    "
    	invokevirtual	java/io/PrintStream/print(Ljava/lang/String;)V
    	sipush	0
    	istore	9
    	sipush	0
    	istore	3
    	iload	1
    	sipush	1
    	isub
    	istore	3
    	sipush	0
    	istore	4
    	sipush	0
    	istore	7
    	sipush	0
    	istore	8
    	iload	2
    	sipush	1
    	isub
    	istore	8
    	iload	1
    	iload	2
    if_icmpge branch1
    
    	aload	0
    	iload	2
    	iaload
    	istore	9
    	iload	1
    	istore	4
    
    loop1:
    
    	iload	4
    	iload	8
    if_icmpgt ibranch1
    
    	aload	0
    	iload	4
    	iaload
    	iload	9
    if_icmpgt ibranch2
    
    	iload	3
    	sipush	1
    	iadd
    	istore	3
    	aload	0
    	iload	3
    	iaload
    	istore	7
    	aload	0
    	iload	3
    	aload	0
    	iload	4
    	iaload
    	iastore
    	aload	0
    	iload	4
    	iload	7
    	iastore
    ibranch2:
    
    	iload	4
    	sipush	1
    	iadd
    	istore	4
    goto loop1
    
    ibranch1:
    
    	iload	3
    	sipush	1
    	iadd
    	istore	8
    	aload	0
    	iload	8
    	iaload
    	istore	7
    	aload	0
    	iload	8
    	aload	0
    	iload	2
    	iaload
    	iastore
    	aload	0
    	iload	2
    	iload	7
    	iastore
    	iload	8
    	sipush	1
    	isub
    	istore	7
    	aload	0
    	iload	1
    	iload	7
    	invokestatic	C2Bytecode/quicksort([III)V
    	iload	8
    	sipush	1
    	iadd
    	istore	7
    	aload	0
    	iload	7
    	iload	2
    	invokestatic	C2Bytecode/quicksort([III)V
    branch1:
    
    	return
    .end method
    
    .end class
    

    小结

    这篇的代码生成和之前解释器的思路很相似,都是根据AST和对应的产生式来执行或者生成代码。

    其实主要的思路是很清晰的,只是其中有太多细节容易让人太过纠结。这个系列算作是我自己的学习笔记,到这也有十三篇了,下一篇可能写写总结就正式结束了。

    欢迎Star!

  • 相关阅读:
    node连接数据库(express+mysql)
    Apache与Tomcat有什么关系和区别
    tomcat部署简单的html静态网页
    Vue项目webpack打包部署到服务器
    windows下vue+webpack前端开发环境搭建及nginx部署
    DIV 边框重叠 像素变粗 解决办法
    基于iscroll实现下拉和上拉刷新
    css图片居中(水平居中和垂直居中)
    nodejs分页类
    事件--键盘事件
  • 原文地址:https://www.cnblogs.com/secoding/p/11391239.html
Copyright © 2020-2023  润新知