A1

附录 I

Here is a complete grammar for Lox. The chapters that introduce each part of the language include the grammar rules there, but this collects them all into one place.

这里是完整的 Lox 语法。包括这里以及其它章节所介绍的语法规则,均汇总到这里。

A1 . 1Syntax Grammar

The syntactic grammar is used to parse the linear sequence of tokens into the nested syntax tree structure. It starts with the first rule that matches an entire Lox program (or a single REPL entry).

programdeclaration* EOF ;

A1 . 1 . 1Declarations

A program is a series of declarations, which are the statements that bind new identifiers or any of the other statement types.

declarationclassDecl
               | funDecl
               | varDecl
               | statement ;

classDecl"class" IDENTIFIER ( "<" IDENTIFIER )?
                 "{" function* "}" ;
funDecl"fun" function ;
varDecl"var" IDENTIFIER ( "=" expression )? ";" ;

A1 . 1 . 2Statements

The remaining statement rules produce side effects, but do not introduce bindings.

statementexprStmt
               | forStmt
               | ifStmt
               | printStmt
               | returnStmt
               | whileStmt
               | block ;

exprStmtexpression ";" ;
forStmt"for" "(" ( varDecl | exprStmt | ";" )
                           expression? ";"
                           expression? ")" statement ;
ifStmt"if" "(" expression ")" statement
                 ( "else" statement )? ;
printStmt"print" expression ";" ;
returnStmt"return" expression? ";" ;
whileStmt"while" "(" expression ")" statement ;
block"{" declaration* "}" ;

Note that block is a statement rule, but is also used as a nonterminal in a couple of other rules for things like function bodies.

A1 . 1 . 3Expressions

Expressions produce values. Lox has a number of unary and binary operators with different levels of precedence. Some grammars for languages do not directly encode the precedence relationships and specify that elsewhere. Here, we use a separate rule for each precedence level to make it explicit.

expressionassignment ;

assignment     → ( call "." )? IDENTIFIER "=" assignment
               | logic_or ;

logic_orlogic_and ( "or" logic_and )* ;
logic_andequality ( "and" equality )* ;
equalitycomparison ( ( "!=" | "==" ) comparison )* ;
comparisonterm ( ( ">" | ">=" | "<" | "<=" ) term )* ;
termfactor ( ( "-" | "+" ) factor )* ;
factorunary ( ( "/" | "*" ) unary )* ;

unary          → ( "!" | "-" ) unary | call ;
callprimary ( "(" arguments? ")" | "." IDENTIFIER )* ;
primary"true" | "false" | "nil" | "this"
               | NUMBER | STRING | IDENTIFIER | "(" expression ")"
               | "super" "." IDENTIFIER ;

A1 . 1 . 4Utility rules

In order to keep the above rules a little cleaner, some of the grammar is split out into a few reused helper rules.

functionIDENTIFIER "(" parameters? ")" block ;
parametersIDENTIFIER ( "," IDENTIFIER )* ;
argumentsexpression ( "," expression )* ;

A1 . 2Lexical Grammar

The lexical grammar is used by the scanner to group characters into tokens. Where the syntax is context free, the lexical grammar is regularnote that there are no recursive rules.

NUMBERDIGIT+ ( "." DIGIT+ )? ;
STRING"\"" <any char except "\"">* "\"" ;
IDENTIFIERALPHA ( ALPHA | DIGIT )* ;
ALPHA"a" ... "z" | "A" ... "Z" | "_" ;
DIGIT"0" ... "9" ;