當(dāng)前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

C语言编译器开发之旅（二）：解析器

發(fā)布時間：2024/3/13 编程问答 28 豆豆

生活随笔收集整理的這篇文章主要介紹了 C语言编译器开发之旅（二）：解析器小編覺得挺不錯的,現(xiàn)在分享給大家,幫大家做個參考.

解析器將識別的語言定義一個語法。我么這里采用BNF描述：

expression: number
| expression ‘*’ expression
| expression ‘/’ expression
| expression ‘+’ expression
| expression ‘-’ expression
;

number: T_INTLIT
;
我們都知道BNF定義的語法是遞歸定義的，那么我們也需要一個遞歸函數(shù)去解析輸入的表達式。在我們現(xiàn)有的語法元素可以構(gòu)成的表達式中第一個語法元素始終為數(shù)字，否則就是語法錯誤。其后可能是一個運算符，或者只有一個數(shù)字。那么我們可以用如下偽代碼表示我們的遞歸下降解析函數(shù)：

function expression() {
Scan and check the first token is a number. Error if it’s not
Get the next token
If we have reached the end of the input, return, i.e. base case

Otherwise, call expression()
}
讓我們來模擬一次此函數(shù)的運行，輸入為2 + 3 - 5 T_EOF其中T_EOF 是反映輸入結(jié)束的標(biāo)記。

expression0:
Scan in the 2, it’s a number
Get next token, +, which isn’t T_EOF
Call expression()

expression1:Scan in the 3, it's a numberGet next token, -, which isn't T_EOFCall expression()expression2:Scan in the 5, it's a numberGet next token, T_EOF, so return from expression2return from expression1

return from expression0
為了進行語義分析，我們需要代碼來解釋識別的輸入，或者將其轉(zhuǎn)換為另一種格式，例如匯編代碼。在旅程的這一部分，我們將為輸入構(gòu)建一個解釋器。但要實現(xiàn)這一目標(biāo)，我們首先要將輸入轉(zhuǎn)換為抽象語法樹。

抽象語法樹的節(jié)點結(jié)構(gòu)定義如下：

// defs.h
// AST node types
enum {
A_ADD, A_SUBTRACT, A_MULTIPLY, A_DIVIDE, A_INTLIT
};

// Abstract Syntax Tree structure
struct ASTnode {
int op; // “Operation” to be performed on this tree
struct ASTnode *left; // Left and right child trees
struct ASTnode *right;
int intvalue; // For A_INTLIT, the integer value
};
節(jié)點元素op表示該節(jié)點的類型，當(dāng)op的值為A_ADD、A_SUBTRACT等運算符時，該節(jié)點具有左右兩顆子樹，我們將使用op代表的運算符對左右兩棵子樹的值做計算；當(dāng)op的值為A_INTLIT時，代表該節(jié)點是整數(shù)值，是葉節(jié)點，節(jié)點元素intvalue存儲著該整數(shù)的值。

tree.c 中的代碼具有構(gòu)建 AST 的功能。函數(shù)mkastnode()生成一個節(jié)點并返回指向節(jié)點的指針：

// tree.c
// Build and return a generic AST node
struct ASTnode *mkastnode(int op, struct ASTnode *left,
struct ASTnode *right, int intvalue) {
struct ASTnode *n;

// Malloc a new ASTnode
n = (struct ASTnode *) malloc(sizeof(struct ASTnode));
if (n == NULL) {
fprintf(stderr, “Unable to malloc in mkastnode()\n”);
exit(1);
}
// Copy in the field values and return it
n->op = op;
n->left = left;
n->right = right;
n->intvalue = intvalue;
return (n);
}
我們對其進一步封裝出兩個常用的函數(shù)，分別用來創(chuàng)建左子樹與葉節(jié)點：

// Make an AST leaf node
struct ASTnode *mkastleaf(int op, int intvalue) {
return (mkastnode(op, NULL, NULL, intvalue));
}

// Make a unary AST node: only one child
struct ASTnode *mkastunary(int op, struct ASTnode *left, int intvalue) {
return (mkastnode(op, left, NULL, intvalue));
我們將使用 AST 來存儲我們識別的每個表達式，以便稍后我們可以遞歸遍歷它來計算表達式的最終值。我們確實想處理數(shù)學(xué)運算符的優(yōu)先級。這是一個例子。考慮表達式 2 * 3 4 * 5。現(xiàn)在，乘法比加法具有更高的優(yōu)先級。因此，我們希望將乘法操作數(shù)綁定在一起并在進行加法之前執(zhí)行這些操作。

如果我們生成 AST 樹看起來像這樣：

+/ \/ \/ \* */ \ / \ 2 3 4 5

然后，在遍歷樹時，我們會先執(zhí)行 2 * 3，然后是 4 * 5。一旦我們有了這些結(jié)果，我們就可以將它們傳遞給樹的根來執(zhí)行加法。

在開始解析語法樹之前，我們需要一個將掃描到的token轉(zhuǎn)換為AST節(jié)點操作值的函數(shù)，如下：

// expr.c
// Convert a token into an AST operation.
int arithop(int tok) {
switch (tok) {
case T_PLUS:
return (A_ADD);
case T_MINUS:
return (A_SUBTRACT);
case T_STAR:
return (A_MULTIPLY);
case T_SLASH:
return (A_DIVIDE);
default:
fprintf(stderr, “unknown token in arithop() on line %d\n”, Line);
exit(1);
}
}
我們需要一個函數(shù)來檢查下一個標(biāo)記是否是整數(shù)文字，并構(gòu)建一個 AST 節(jié)點來保存文字值。如下：

// Parse a primary factor and return an
// AST node representing it.
static struct ASTnode *primary(void) {
struct ASTnode *n;

// For an INTLIT token, make a leaf AST node for it
// and scan in the next token. Otherwise, a syntax error
// for any other token type.
switch (Token.token) {
case T_INTLIT:
n = mkastleaf(A_INTLIT, Token.intvalue);
scan(&Token);
return (n);
default:
fprintf(stderr, “syntax error on line %d\n”, Line);
exit(1);
}
}
這里的Token是一個全局變量，保存著掃描到的最新的值。

那么我們現(xiàn)在可以寫解析輸入表達式生成AST的方法：

// Return an AST tree whose root is a binary operator
struct ASTnode *binexpr(void) {
struct ASTnode *n, *left, *right;
int nodetype;

// Get the integer literal on the left.
// Fetch the next token at the same time.
left = primary();

// If no tokens left, return just the left node
if (Token.token == T_EOF)
return (left);

// Convert the token into a node type
nodetype = arithop(Token.token);

// Get the next token in
scan(&Token);

// Recursively get the right-hand tree
right = binexpr();

// Now build a tree with both sub-trees
n = mkastnode(nodetype, left, right, 0);
return (n);
}
這只是一個子簡單的解析器，他的解析結(jié)果沒有實現(xiàn)優(yōu)先級的調(diào)整，解析結(jié)果如下：

* / \

2 +
/
3 *
/
4 5
正確的樹狀結(jié)構(gòu)應(yīng)該是這樣的：

+/ \/ \/ \* */ \ / \ 2 3 4 5

我們將在下一節(jié)實現(xiàn)生成一個正確的AST。

那么接下來我們來試著寫代碼遞歸的解釋這顆AST。我們以正確的語法樹為例，偽代碼：

interpretTree:
First, interpret the left-hand sub-tree and get its value
Then, interpret the right-hand sub-tree and get its value
Perform the operation in the node at the root of our tree
on the two sub-tree values, and return this value
調(diào)用過程可以用如下過程表示：

interpretTree0(tree with +):
Call interpretTree1(left tree with *):
Call interpretTree2(tree with 2):
No maths operation, just return 2
Call interpretTree3(tree with 3):
No maths operation, just return 3
Perform 2 * 3, return 6

Call interpretTree1(right tree with *):
Call interpretTree2(tree with 4):
No maths operation, just return 4
Call interpretTree3(tree with 5):
No maths operation, just return 5
Perform 4 * 5, return 20

Perform 6 + 20, return 26
這是在interp.c 中并依據(jù)上述偽代碼寫的功能：

// Given an AST, interpret the
// operators in it and return
// a final value.
int interpretAST(struct ASTnode *n) {
int leftval, rightval;

// Get the left and right sub-tree values
if (n->left)
leftval = interpretAST(n->left);
if (n->right)
rightval = interpretAST(n->right);

switch (n->op) {
case A_ADD:
return (leftval + rightval);
case A_SUBTRACT:
return (leftval - rightval);
case A_MULTIPLY:
return (leftval * rightval);
case A_DIVIDE:
return (leftval / rightval);
case A_INTLIT:
return (n->intvalue);
default:
fprintf(stderr, “Unknown AST operator %d\n”, n->op);
exit(1);
}
}
這里還有一些其他代碼，比如調(diào)用 main() 中的解釋器：

scan(&Token); // Get the first token from the input
n = binexpr(); // Parse the expression in the file
printf("%d\n", interpretAST(n)); // Calculate the final result
exit(0);
USB Microphone https://www.soft-voice.com/
Wooden Speakers https://www.zeshuiplatform.com/
亞馬遜測評 www.yisuping.cn
深圳網(wǎng)站建設(shè)www.sz886.com

總結(jié)

以上是生活随笔為你收集整理的C语言编译器开发之旅（二）：解析器的全部內(nèi)容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網(wǎng)站內(nèi)容還不錯，歡迎將生活随笔推薦給好友。

上一篇：如何自学英语
下一篇： 20165220 我期望的师生关系

3atv精品不卡视频,97人人超碰国产精品最新,中文字幕av一区二区三区人妻少妇,久久久精品波多野结衣,日韩一区二区三区精品

编程问答

C语言编译器开发之旅（二）：解析器

總結(jié)