Compiler Principles: ANTLR Tutorial
I. Installing and configuring ANTLR
Before installing and configuring ANTLR, make sure a Java environment is already installed.
1. Download ANTLR 4
Download page: https://www.antlr.org/download/
Under "Tool and Java runtime lib", download antlr-4.7.2-complete.jar.
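2. Create the batch files
In the directory that contains antlr-4.7.2-complete.jar, create two bat files, antlr4.bat and grun.bat, so that the directory holds the jar and both bat files.
In antlr4.bat, write:
java org.antlr.v4.Tool %*
In grun.bat, write:
java org.antlr.v4.gui.TestRig %*
3. Configure environment variables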
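Steps (Windows 10): Settings -> System -> About -> Advanced system settings (top right) -> Environment Variables -> System variables.
Add the full path of antlr-4.7.2-complete.jar (the jar file itself, not only its folder) to the system variable CLASSPATH, and keep "." in the list so that classes in the current directory are still found. To call antlr4 and grun from any directory, the folder that holds the two bat files also has to be on PATH.
The ANTLR environment is now configured.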
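A quick way to confirm the CLASSPATH setting is to ask the JVM whether it can locate the class that antlr4.bat launches. The following is a minimal sketch, not part of ANTLR; the class name ClasspathCheck and the helper isOnClasspath are made up for this illustration.

```java
// Hypothetical helper (not part of ANTLR): checks whether a class is visible on the classpath.
class ClasspathCheck {
    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // org.antlr.v4.Tool is the class that antlr4.bat starts
        String toolClass = "org.antlr.v4.Tool";
        System.out.println(toolClass + " found: " + isOnClasspath(toolClass));
        System.out.println("CLASSPATH = " + System.getenv("CLASSPATH"));
    }
}
```

Compile it with javac ClasspathCheck.java and run java ClasspathCheck from the same directory (this assumes "." is part of CLASSPATH). If the jar is configured correctly, the first line should report that the class was found.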
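II. Using ANTLR
1. Write the .g4 file
The .g4 file is the input from which ANTLR generates the lexer rules and parser rules; it is how the grammar of the language is written down. A complete grammar is the foundation of the whole compiler-principles experiment.
Below is the C grammar file used in my experiment, named MyCGrammer.g4.
It is adapted from an existing C grammar, whose BSD licence header is kept at the top of the file.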
/*
 [The "BSD licence"]
 Copyright (c) 2013 Sam Harwell
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.

 THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

/** C 2011 grammar built from the C11 Spec */
grammar MyCGrammer;

primaryExpression
    : tokenId //Identifier
    | tokenConstant //Constant
    | tokenStringLiteral //StringLiteral+
    | '(' expression ')'
    | genericSelection
    | '__extension__'? '(' compoundStatement ')' // Blocks (GCC extension)
    | '__builtin_va_arg' '(' unaryExpression ',' typeName ')'
    | '__builtin_offsetof' '(' typeName ',' unaryExpression ')'
    ;

tokenId : Identifier ;
tokenConstant : Constant ;
tokenStringLiteral : StringLiteral+ ;

genericSelection
    : '_Generic' '(' assignmentExpression ',' genericAssocList ')'
    ;

genericAssocList
    : genericAssociation
    | genericAssocList ',' genericAssociation
    ;

genericAssociation
    : typeName ':' assignmentExpression
    | 'default' ':' assignmentExpression
    ;

postfixExpression
    : primaryExpression #postfixExpression_pass
    | postfixExpression '[' expression ']' #postfixExpression_arrayaccess
    | postfixExpression '(' argumentExpressionList? ')' #postfixExpression_funcall
    | postfixExpression '.' Identifier #postfixExpression_member
    | postfixExpression '->' Identifier #postfixExpression_point
    | postfixExpression '++' #postfixExpression_
    | postfixExpression '--' #postfixExpression_
    | '(' typeName ')' '{' initializerList '}' #postfixExpression_pass
    | '(' typeName ')' '{' initializerList ',' '}' #postfixExpression_pass
    | '__extension__' '(' typeName ')' '{' initializerList '}' #postfixExpression_pass
    | '__extension__' '(' typeName ')' '{' initializerList ',' '}' #postfixExpression_pass
    ;

argumentExpressionList
    : assignmentExpression
    | argumentExpressionList ',' assignmentExpression
    ;

unaryExpression
    : postfixExpression #unaryExpression_pass
    | '++' unaryExpression #unaryExpression_
    | '--' unaryExpression #unaryExpression_
    | unaryOperator castExpression #unaryExpression_
    | 'sizeof' unaryExpression #unaryExpression_pass
    | 'sizeof' '(' typeName ')' #unaryExpression_pass
    | '_Alignof' '(' typeName ')' #unaryExpression_pass
    | '&&' Identifier #unaryExpression_pass
    ;

unaryOperator
    : '&' | '*' | '+' | '-' | '~' | '!'
    ;

castExpression
    : unaryExpression #castExpression_pass
    | '(' typeName ')' castExpression #castExpression_
    | '__extension__' '(' typeName ')' castExpression #castExpression_
    ;

multiplicativeExpression
    : castExpression #multiplicativeExpression_pass
    | multiplicativeExpression '*' castExpression #multiplicativeExpression_
    | multiplicativeExpression '/' castExpression #multiplicativeExpression_
    | multiplicativeExpression '%' castExpression #multiplicativeExpression_
    ;

additiveExpression
    : multiplicativeExpression #additiveExpression_pass
    | additiveExpression '+' multiplicativeExpression #additiveExpression_
    | additiveExpression '-' multiplicativeExpression #additiveExpression_
    ;

shiftExpression
    : additiveExpression #shiftExpression_pass
    | shiftExpression '<<' additiveExpression #shiftExpression_
    | shiftExpression '>>' additiveExpression #shiftExpression_
    ;

relationalExpression
    : shiftExpression #relationalExpression_pass
    | relationalExpression '<' shiftExpression #relationalExpression_
    | relationalExpression '>' shiftExpression #relationalExpression_
    | relationalExpression '<=' shiftExpression #relationalExpression_
    | relationalExpression '>=' shiftExpression #relationalExpression_
    ;

equalityExpression
    : relationalExpression #equalityExpression_pass
    | equalityExpression '==' relationalExpression #equalityExpression_
    | equalityExpression '!=' relationalExpression #equalityExpression_
    ;

andExpression
    : equalityExpression #andExpression_pass
    | andExpression '&' equalityExpression #andExpression_
    ;

exclusiveOrExpression
    : andExpression #exclusiveOrExpression_pass
    | exclusiveOrExpression '^' andExpression #exclusiveOrExpression_
    ;

inclusiveOrExpression
    : exclusiveOrExpression #inclusiveOrExpression_pass
    | inclusiveOrExpression '|' exclusiveOrExpression #inclusiveOrExpression_
    ;

logicalAndExpression
    : inclusiveOrExpression #logicalAndExpression_pass
    | logicalAndExpression '&&' inclusiveOrExpression #logicalAndExpression_
    ;

logicalOrExpression
    : logicalAndExpression #logicalOrExpression_pass
    | logicalOrExpression '||' logicalAndExpression #logicalOrExpression_
    ;

conditionalExpression
    : logicalOrExpression ('?' expression ':' conditionalExpression)?
    ;

assignmentExpression
    : conditionalExpression #assignmentExpression_pass
    | unaryExpression assignmentOperator assignmentExpression #assignmentExpression_
    ;

assignmentOperator
    : '=' | '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
    ;

expression
    : assignmentExpression #expression_
    | expression ',' assignmentExpression #expression_pass
    ;

constantExpression
    : conditionalExpression
    ;

declaration
    : declarationSpecifiers initDeclaratorList? ';'
    | staticAssertDeclaration
    ;

declarationSpecifiers
    : declarationSpecifier+
    ;

declarationSpecifiers2
    : declarationSpecifier+
    ;

declarationSpecifier
    : storageClassSpecifier
    | typeSpecifier
    | typeQualifier
    | functionSpecifier
    | alignmentSpecifier
    ;

initDeclaratorList
    : initDeclarator
    | initDeclaratorList ',' initDeclarator
    ;

initDeclarator
    : declarator
    | declarator '=' initializer
    ;

storageClassSpecifier
    : 'typedef'
    | 'extern'
    | 'static'
    | '_Thread_local'
    | 'auto'
    | 'register'
    ;

typeSpecifier
    : 'void' #typeSpecifier_
    | 'char' #typeSpecifier_
    | 'short' #typeSpecifier_
    | 'int' #typeSpecifier_
    | 'long' #typeSpecifier_
    | 'float' #typeSpecifier_
    | 'double' #typeSpecifier_
    | 'signed' #typeSpecifier_
    | 'unsigned' #typeSpecifier_
    ;

structOrUnionSpecifier
    : structOrUnion Identifier? '{' structDeclarationList '}'
    | structOrUnion Identifier
    ;

structOrUnion
    : 'struct'
    | 'union'
    ;

structDeclarationList
    : structDeclaration
    | structDeclarationList structDeclaration
    ;

structDeclaration
    : specifierQualifierList structDeclaratorList? ';'
    | staticAssertDeclaration
    ;

specifierQualifierList
    : typeSpecifier specifierQualifierList?
    | typeQualifier specifierQualifierList?
    ;

structDeclaratorList
    : structDeclarator
    | structDeclaratorList ',' structDeclarator
    ;

structDeclarator
    : declarator
    | declarator? ':' constantExpression
    ;

enumSpecifier
    : 'enum' Identifier? '{' enumeratorList '}'
    | 'enum' Identifier? '{' enumeratorList ',' '}'
    | 'enum' Identifier
    ;

enumeratorList
    : enumerator
    | enumeratorList ',' enumerator
    ;

enumerator
    : enumerationConstant
    | enumerationConstant '=' constantExpression
    ;

enumerationConstant
    : Identifier
    ;

atomicTypeSpecifier
    : '_Atomic' '(' typeName ')'
    ;

typeQualifier
    : 'const'
    | 'restrict'
    | 'volatile'
    | '_Atomic'
    ;

functionSpecifier
    : ('inline'
    | '_Noreturn'
    | '__inline__' // GCC extension
    | '__stdcall')
    | gccAttributeSpecifier
    | '__declspec' '(' Identifier ')'
    ;

alignmentSpecifier
    : '_Alignas' '(' typeName ')'
    | '_Alignas' '(' constantExpression ')'
    ;

declarator
    : pointer? directDeclarator gccDeclaratorExtension*
    ;

directDeclarator
    : Identifier #directDeclarator_pass
    | '(' declarator ')' #directDeclarator_pass
    | directDeclarator '[' typeQualifierList? assignmentExpression? ']' #directDeclarator_array
    | directDeclarator '[' 'static' typeQualifierList? assignmentExpression ']' #directDeclarator_array
    | directDeclarator '[' typeQualifierList 'static' assignmentExpression ']' #directDeclarator_array
    | directDeclarator '[' typeQualifierList? '*' ']' #directDeclarator_array
    | directDeclarator '(' parameterTypeList ')' #directDeclarator_func
    | directDeclarator '(' identifierList? ')' #directDeclarator_func
    ;

gccDeclaratorExtension
    : '__asm' '(' StringLiteral+ ')'
    | gccAttributeSpecifier
    ;

gccAttributeSpecifier
    : '__attribute__' '(' '(' gccAttributeList ')' ')'
    ;

gccAttributeList
    : gccAttribute (',' gccAttribute)*
    | // empty
    ;

gccAttribute
    : ~(',' | '(' | ')') // relaxed def for "identifier or reserved word"
      ('(' argumentExpressionList? ')')?
    | // empty
    ;

nestedParenthesesBlock
    : ( ~('(' | ')')
      | '(' nestedParenthesesBlock ')'
      )*
    ;

pointer
    : '*' typeQualifierList?
    | '*' typeQualifierList? pointer
    | '^' typeQualifierList? // Blocks language extension
    | '^' typeQualifierList? pointer // Blocks language extension
    ;

typeQualifierList
    : typeQualifier
    | typeQualifierList typeQualifier
    ;

parameterTypeList
    : parameterList
    | parameterList ',' '...'
    ;

parameterList
    : parameterDeclaration
    | parameterList ',' parameterDeclaration
    ;

parameterDeclaration
    : declarationSpecifiers declarator
    | declarationSpecifiers2 abstractDeclarator?
    ;

identifierList
    : Identifier
    | identifierList ',' Identifier
    ;

typeName
    : specifierQualifierList abstractDeclarator?
    ;

abstractDeclarator
    : pointer
    | pointer? directAbstractDeclarator gccDeclaratorExtension*
    ;

directAbstractDeclarator
    : '(' abstractDeclarator ')' gccDeclaratorExtension*
    | '[' typeQualifierList? assignmentExpression? ']'
    | '[' 'static' typeQualifierList? assignmentExpression ']'
    | '[' typeQualifierList 'static' assignmentExpression ']'
    | '[' '*' ']'
    | '(' parameterTypeList? ')' gccDeclaratorExtension*
    | directAbstractDeclarator '[' typeQualifierList? assignmentExpression? ']'
    | directAbstractDeclarator '[' 'static' typeQualifierList? assignmentExpression ']'
    | directAbstractDeclarator '[' typeQualifierList 'static' assignmentExpression ']'
    | directAbstractDeclarator '[' '*' ']'
    | directAbstractDeclarator '(' parameterTypeList? ')' gccDeclaratorExtension*
    ;

initializer
    : assignmentExpression
    | '{' initializerList '}'
    | '{' initializerList ',' '}'
    ;

initializerList
    : designation? initializer
    | initializerList ',' designation? initializer
    ;

designation
    : designatorList '='
    ;

designatorList
    : designator
    | designatorList designator
    ;

designator
    : '[' constantExpression ']'
    | '.' Identifier
    ;

staticAssertDeclaration
    : '_Static_assert' '(' constantExpression ',' StringLiteral+ ')' ';'
    ;

statement
    : labeledStatement
    | compoundStatement
    | expressionStatement
    | selectionStatement
    | iterationStatement
    | jumpStatement
    | ('__asm' | '__asm__') ('volatile' | '__volatile__') '(' (logicalOrExpression (',' logicalOrExpression)*)? (':' (logicalOrExpression (',' logicalOrExpression)*)?)* ')' ';'
    ;

labeledStatement
    : Identifier ':' statement
    | 'case' constantExpression ':' statement
    | 'default' ':' statement
    ;

compoundStatement
    : '{' blockItemList? '}'
    ;

blockItemList
    : blockItem
    | blockItemList blockItem
    ;

blockItem
    : declaration
    | statement
    ;

expressionStatement
    : expression? ';'
    ;

selectionStatement
    : 'if' '(' expression ')' statement ('else' statement)? #selectionStatement_if
    | 'switch' '(' expression ')' statement #selectionStatement_switch
    ;

iterationStatement
    : 'while' '(' expression ')' statement #iterationStatement_while
    | 'do' statement 'while' '(' expression ')' ';' #iterationStatement_dowhile
    | 'for' '(' expression? ';' expression? ';' expression? ')' statement #iterationStatement_for
    | 'for' '(' declaration expression? ';' expression? ')' statement #iterationStatement_forDeclared
    ;

jumpStatement
    : 'goto' Identifier ';' #jumpStatement_goto
    | 'continue' ';' #jumpStatement_continue
    | 'break' ';' #jumpStatement_break
    | 'return' expression? ';' #jumpStatement_return
    | 'goto' unaryExpression ';' #jumpStatement_ // GCC extension
    ;

compilationUnit
    : translationUnit? EOF
    ;

translationUnit
    : externalDeclaration
    | translationUnit externalDeclaration
    ;

externalDeclaration
    : functionDefinition
    | declaration
    | ';' // stray ;
    ;

functionDefinition
    : declarationSpecifiers? declarator declarationList? compoundStatement
    ;

declarationList
    : declaration
    | declarationList declaration
    ;

functionCall
    : tokenId '(' argumentExpressionList? ')' #functionCall_
    ;

Auto : 'auto';
Break : 'break';
Case : 'case';
Char : 'char';
Const : 'const';
Continue : 'continue';
Default : 'default';
Do : 'do';
Double : 'double';
Else : 'else';
Enum : 'enum';
Extern : 'extern';
Float : 'float';
For : 'for';
Goto : 'goto';
If : 'if';
Inline : 'inline';
Int : 'int';
Long : 'long';
Register : 'register';
Restrict : 'restrict';
Return : 'return';
Short : 'short';
Signed : 'signed';
Sizeof : 'sizeof';
Static : 'static';
Struct : 'struct';
Switch : 'switch';
Typedef : 'typedef';
Union : 'union';
Unsigned : 'unsigned';
Void : 'void';
Volatile : 'volatile';
While : 'while';

Alignas : '_Alignas';
Alignof : '_Alignof';
Atomic : '_Atomic';
Bool : '_Bool';
Complex : '_Complex';
Generic : '_Generic';
Imaginary : '_Imaginary';
Noreturn : '_Noreturn';
StaticAssert : '_Static_assert';
ThreadLocal : '_Thread_local';

LeftParen : '(';
RightParen : ')';
LeftBracket : '[';
RightBracket : ']';
LeftBrace : '{';
RightBrace : '}';

Less : '<';
LessEqual : '<=';
Greater : '>';
GreaterEqual : '>=';
LeftShift : '<<';
RightShift : '>>';

Plus : '+';
PlusPlus : '++';
Minus : '-';
MinusMinus : '--';
Star : '*';
Div : '/';
Mod : '%';

And : '&';
Or : '|';
AndAnd : '&&';
OrOr : '||';
Caret : '^';
Not : '!';
Tilde : '~';

Question : '?';
Colon : ':';
Semi : ';';
Comma : ',';

Assign : '=';
// '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
StarAssign : '*=';
DivAssign : '/=';
ModAssign : '%=';
PlusAssign : '+=';
MinusAssign : '-=';
LeftShiftAssign : '<<=';
RightShiftAssign : '>>=';
AndAssign : '&=';
XorAssign : '^=';
OrAssign : '|=';

Equal : '==';
NotEqual : '!=';

Arrow : '->';
Dot : '.';
Ellipsis : '...';

Identifier
    : IdentifierNondigit
      ( IdentifierNondigit
      | Digit
      )*
    ;

fragment IdentifierNondigit
    : Nondigit
    | UniversalCharacterName
    //| // other implementation-defined characters...
    ;

fragment Nondigit
    : [a-zA-Z_]
    ;

fragment Digit
    : [0-9]
    ;

fragment UniversalCharacterName
    : '\\u' HexQuad
    | '\\U' HexQuad HexQuad
    ;

fragment HexQuad
    : HexadecimalDigit HexadecimalDigit HexadecimalDigit HexadecimalDigit
    ;

Constant
    : IntegerConstant
    | FloatingConstant
    //| EnumerationConstant
    | CharacterConstant
    ;

fragment IntegerConstant
    : DecimalConstant IntegerSuffix?
    | OctalConstant IntegerSuffix?
    | HexadecimalConstant IntegerSuffix?
    | BinaryConstant
    ;

fragment BinaryConstant
    : '0' [bB] [0-1]+
    ;

fragment DecimalConstant
    : NonzeroDigit Digit*
    ;

fragment OctalConstant
    : '0' OctalDigit*
    ;

fragment HexadecimalConstant
    : HexadecimalPrefix HexadecimalDigit+
    ;

fragment HexadecimalPrefix
    : '0' [xX]
    ;

fragment NonzeroDigit
    : [1-9]
    ;

fragment OctalDigit
    : [0-7]
    ;

fragment HexadecimalDigit
    : [0-9a-fA-F]
    ;

fragment IntegerSuffix
    : UnsignedSuffix LongSuffix?
    | UnsignedSuffix LongLongSuffix
    | LongSuffix UnsignedSuffix?
    | LongLongSuffix UnsignedSuffix?
    ;

fragment UnsignedSuffix
    : [uU]
    ;

fragment LongSuffix
    : [lL]
    ;

fragment LongLongSuffix
    : 'll' | 'LL'
    ;

fragment FloatingConstant
    : DecimalFloatingConstant
    | HexadecimalFloatingConstant
    ;

fragment DecimalFloatingConstant
    : FractionalConstant ExponentPart? FloatingSuffix?
    | DigitSequence ExponentPart FloatingSuffix?
    ;

fragment HexadecimalFloatingConstant
    : HexadecimalPrefix HexadecimalFractionalConstant BinaryExponentPart FloatingSuffix?
    | HexadecimalPrefix HexadecimalDigitSequence BinaryExponentPart FloatingSuffix?
    ;

fragment FractionalConstant
    : DigitSequence? '.' DigitSequence
    | DigitSequence '.'
    ;

fragment ExponentPart
    : 'e' Sign? DigitSequence
    | 'E' Sign? DigitSequence
    ;

fragment Sign
    : '+' | '-'
    ;

fragment DigitSequence
    : Digit+
    ;

fragment HexadecimalFractionalConstant
    : HexadecimalDigitSequence? '.' HexadecimalDigitSequence
    | HexadecimalDigitSequence '.'
    ;

fragment BinaryExponentPart
    : 'p' Sign? DigitSequence
    | 'P' Sign? DigitSequence
    ;

fragment HexadecimalDigitSequence
    : HexadecimalDigit+
    ;

fragment FloatingSuffix
    : 'f' | 'l' | 'F' | 'L'
    ;

fragment CharacterConstant
    : '\'' CCharSequence '\''
    | 'L\'' CCharSequence '\''
    | 'u\'' CCharSequence '\''
    | 'U\'' CCharSequence '\''
    ;

fragment CCharSequence
    : CChar+
    ;

fragment CChar
    : ~['\\\r\n]
    | EscapeSequence
    ;

fragment EscapeSequence
    : SimpleEscapeSequence
    | OctalEscapeSequence
    | HexadecimalEscapeSequence
    | UniversalCharacterName
    ;

fragment SimpleEscapeSequence
    : '\\' ['"?abfnrtv\\]
    ;

fragment OctalEscapeSequence
    : '\\' OctalDigit
    | '\\' OctalDigit OctalDigit
    | '\\' OctalDigit OctalDigit OctalDigit
    ;

fragment HexadecimalEscapeSequence
    : '\\x' HexadecimalDigit+
    ;

StringLiteral
    : EncodingPrefix? '"' SCharSequence? '"'
    ;

fragment EncodingPrefix
    : 'u8'
    | 'u'
    | 'U'
    | 'L'
    ;

fragment SCharSequence
    : SChar+
    ;

fragment SChar
    : ~["\\\r\n]
    | EscapeSequence
    | '\\\n' // Added line
    | '\\\r\n' // Added line
    ;

ComplexDefine
    : '#' Whitespace? 'define' ~[#]*
      -> skip
    ;

// ignore the following asm blocks:
/*
    asm
    {
        mfspr x, 286;
    }
 */
AsmBlock
    : 'asm' ~'{'* '{' ~'}'* '}'
      -> skip
    ;

// ignore the lines generated by c preprocessor
// sample line : '#line 1 "/home/dm/files/dk1.h" 1'
LineAfterPreprocessing
    : '#line' Whitespace* ~[\r\n]*
      -> skip
    ;

LineDirective
    : '#' Whitespace? DecimalConstant Whitespace? StringLiteral ~[\r\n]*
      -> skip
    ;

PragmaDirective
    : '#' Whitespace? 'pragma' Whitespace ~[\r\n]*
      -> skip
    ;

Whitespace
    : [ \t]+
      -> skip
    ;

Newline
    : ( '\r' '\n'?
      | '\n'
      )
      -> skip
    ;

BlockComment
    : '/*' .*? '*/'
      -> skip
    ;

LineComment
    : '//' ~[\r\n]*
      -> skip
    ;

2. Generate the lexer and parser with ANTLR
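Open a command line in the directory that contains MyCGrammer.g4.
Enter:
antlr4 MyCGrammer.g4 -visitor
(-visitor generates the visitor classes, which are not produced by default. ANTLR offers two ways to traverse a parse tree, listeners and visitors; for this step it makes little difference whether the visitor is generated.)
ANTLR then writes the generated sources into that directory: MyCGrammerLexer.java, MyCGrammerParser.java, the listener and visitor classes (MyCGrammerListener.java, MyCGrammerBaseListener.java, MyCGrammerVisitor.java, MyCGrammerBaseVisitor.java), plus the .tokens and .interp files.
Next, compile them. On the command line, enter:
javac MyCGrammer*.java
The lexer and parser for C are now ready.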
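The two traversal styles behind the -visitor flag can be illustrated without ANTLR at all. The sketch below uses a hypothetical hand-built tree (Node, Listener, Visitor and SumVisitor are invented names, not ANTLR's generated classes) to show the difference: with a listener, a walker drives the traversal and fires enter/exit callbacks; with a visitor, the visiting code itself decides which children to descend into and can return a value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini tree for illustration only; these are NOT ANTLR's generated classes.
class TraversalSketch {
    /** A tree node: a leaf carries a number, an inner node adds up its children. */
    static final class Node {
        final String name;
        final int value;
        final List<Node> children = new ArrayList<>();
        Node(String name, int value) { this.name = name; this.value = value; }
        Node add(Node child) { children.add(child); return this; }
    }

    /** Listener style: callbacks fired by a walker, no return values. */
    interface Listener {
        void enter(Node n);
        void exit(Node n);
    }

    /** The walker owns the traversal order (depth-first) and calls the listener. */
    static void walk(Node n, Listener listener) {
        listener.enter(n);
        for (Node c : n.children) walk(c, listener);
        listener.exit(n);
    }

    /** Visitor style: the visitor controls the descent and returns a result. */
    interface Visitor<T> {
        T visit(Node n);
    }

    static final class SumVisitor implements Visitor<Integer> {
        public Integer visit(Node n) {
            if (n.children.isEmpty()) return n.value;   // leaf: its own value
            int sum = 0;
            for (Node c : n.children) sum += visit(c);  // explicit descent into children
            return sum;
        }
    }

    /** Builds the tree for "1 + 2". */
    static Node sample() {
        return new Node("add", 0).add(new Node("num", 1)).add(new Node("num", 2));
    }

    /** Records the enter/exit events produced by the walker. */
    static List<String> trace(Node root) {
        List<String> events = new ArrayList<>();
        walk(root, new Listener() {
            public void enter(Node n) { events.add("enter " + n.name); }
            public void exit(Node n) { events.add("exit " + n.name); }
        });
        return events;
    }

    public static void main(String[] args) {
        Node root = sample();
        System.out.println(trace(root));                  // listener: sequence of events
        System.out.println(new SumVisitor().visit(root)); // visitor: computed value
    }
}
```

In the generated code the same split appears as MyCGrammerBaseListener (callbacks per rule) and MyCGrammerBaseVisitor (visit methods per rule that return a value), which is why later stages that need to compute something from the tree tend to favor the visitor.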
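3. Testing
On the command line, enter:
grun MyCGrammer compilationUnit -tokens
Then type a piece of C code and finish the input with Ctrl+Z (followed by Enter on Windows). This prints the lexical analysis result, that is, the token stream, for the code.
On the command line, enter:
grun MyCGrammer compilationUnit -gui
Again type a piece of C code and finish with Ctrl+Z. This displays the parse tree for the code.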
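The available grun options are:
-tokens: prints the token stream.
-tree: prints the parse tree in LISP format.
-gui: displays the parse tree visually in a dialog box.
-ps file.ps: generates a visual representation of the parse tree in PostScript and stores it in file.ps.
-encoding encodingname: specifies the encoding of the input file, for cases where the current locale cannot read the input correctly.
-trace: prints the rule name and the current token on entering and leaving each rule.
-diagnostics: turns on diagnostic messages during parsing.
-SLL: uses a faster but slightly weaker parsing strategy.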
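III. Packaging the lexer and parser
Everything so far was done on the command line. To use the lexer and parser inside a project, the generated files need to be packaged.
In the directory with the generated files, create two folders, lib and MyCGrammer.
Copy the downloaded antlr-4.7.2-complete.jar into the lib folder.
Open the directory in IDEA (I used IntelliJ IDEA Community Edition 2021.1; other versions should be similar, or look up the packaging steps for your IDE).
Find the generated .java files and add this line at the top of each:
package MyCGrammer;
Then move them into the MyCGrammer folder.
Click File -> Project Structure -> Artifacts -> JAR -> From modules with dependencies -> copy to...
Then click Build -> Build Artifacts.
The jar is generated in the corresponding out directory.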
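Once the jar is built, project code can call the generated classes directly instead of going through grun. The following is a minimal sketch under these assumptions: both the packaged jar and antlr-4.7.2-complete.jar are on the project's classpath, the generated classes live in the package MyCGrammer as set up above, and ParseDemo is a made-up class name.

```java
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;

import MyCGrammer.MyCGrammerLexer;
import MyCGrammer.MyCGrammerParser;

// Hypothetical driver class; requires the ANTLR runtime and the packaged parser on the classpath.
class ParseDemo {
    public static void main(String[] args) {
        // Source text to analyze (any C snippet the grammar accepts)
        CharStream input = CharStreams.fromString("int main() { return 0; }");

        // Lexical analysis: characters -> tokens
        MyCGrammerLexer lexer = new MyCGrammerLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        // Syntax analysis: tokens -> parse tree, starting from the compilationUnit rule
        MyCGrammerParser parser = new MyCGrammerParser(tokens);
        ParseTree tree = parser.compilationUnit();

        // Same LISP-style text that grun prints with -tree
        System.out.println(tree.toStringTree(parser));
    }
}
```

This mirrors what grun MyCGrammer compilationUnit -tree does, with the start rule chosen by the method called on the parser.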
———————————————————————————————————————————
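That is the whole ANTLR tutorial. Later posts will build on it to perform lexical analysis, syntax analysis, intermediate code generation and target code generation for C.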