CIS 400 LECTURE 4
COMPILATION
Source language
Tokens
· Lexical rules
· Syntax rules
· Semantic rules
Compiler organization: phases
· Lexical analysis
· Syntactic analysis
· Semantic analysis
· Optimization
· Code generation
· Peephole optimization
Pass: one complete run through the source code. A compiler may need multiple
passes to check syntax, etc.
Symbol table: a record kept for every identifier. Some compilers keep it
at run time, some do not.
Lexical analyzer (or "scanner")
· Looks for variables and special words
· Creates entries in the symbol table
· "Lex" is a UNIX tool (a scanner generator) that translates a BNF-like specification into scanner code
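The scanner's job described above can be sketched in a few lines of Python (not lex itself). The token class names (NUMBER, IDENT, OP, KEYWORD), the keyword set, and the function name scan are assumptions made for illustration; they do not come from the lecture.

```python
import re

# Hypothetical token classes; names and patterns are illustrative only.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[-+*/()=]"),
    ("SKIP",   r"\s+"),
]
KEYWORDS = {"if", "while", "begin", "end"}  # assumed "special words"
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def scan(source):
    """Return (tokens, symbol_table) for the source string."""
    tokens, symbol_table = [], {}
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue                      # whitespace produces no token
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"              # special word, not a variable
        elif kind == "IDENT":
            # First sighting of a variable creates its symbol table entry.
            symbol_table.setdefault(text, {"name": text})
        tokens.append((kind, text))
    return tokens, symbol_table
```

For example, scan("while x1 = y + 42") yields one KEYWORD token, two IDENT tokens, two OP tokens, and one NUMBER token, and leaves entries for x1 and y in the symbol table.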
Syntactic analyzer
· Parser (generates a parse tree)
· In UNIX, the parser generated by "yacc" takes the token output of the "lex" scanner and uses it to build a parse tree
Optimizer
· Performs common sub-expression elimination (e.g. in the expression a[i+j] = b[i+j] * c[i+j], the (i+j) is calculated once into a new variable, d, such that: a[d] = b[d] * c[d])
· Strength reduction (e.g. L = Length(str1 || str2) is changed to the simpler-to-compute expression L = Length(str1) + Length(str2))
· Loop optimization
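As a rough illustration of common sub-expression elimination, the sketch below works on a list of (operator, operand 1, operand 2, destination) tuples. The operator names ("add", "index", "mul", "store"), the temporaries t1..t6, and the function name are hypothetical. It assumes straight-line code in which no variable or array is reassigned between the repeated computations; a real optimizer has to verify that.

```python
PURE = {"add", "mul", "index"}   # assumed side-effect-free operators

def eliminate_common_subexpressions(quads):
    """quads: list of (op, arg1, arg2, dest). Returns a shorter list."""
    seen = {}    # (op, arg1, arg2) -> temporary already holding that value
    alias = {}   # eliminated temporary -> temporary that replaces it
    out = []
    for op, a1, a2, dest in quads:
        # Rewrite operands that refer to an eliminated temporary.
        a1, a2 = alias.get(a1, a1), alias.get(a2, a2)
        key = (op, a1, a2)
        if op in PURE and key in seen:
            alias[dest] = seen[key]      # reuse the earlier result
            continue
        if op in PURE:
            seen[key] = dest
        out.append((op, a1, a2, dest))
    return out

# a[i+j] = b[i+j] * c[i+j], with i+j computed three times
quads = [
    ("add", "i", "j", "t1"),
    ("index", "b", "t1", "t2"),
    ("add", "i", "j", "t3"),
    ("index", "c", "t3", "t4"),
    ("mul", "t2", "t4", "t5"),
    ("add", "i", "j", "t6"),
    ("store", "t5", "t6", "a"),   # a[t6] = t5
]
optimized = eliminate_common_subexpressions(quads)
```

After the pass only one "add" remains, and every index uses t1, which plays the role of d in the example above.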
-84 + 19 * 100 parsed:
Intermediate form: "quadruple statement format"
Operator    | Operand 1 | Operand 2 | Destination
Unary minus | Lit(84)   | Nil       | Temp1
Multiply op | Lit(19)   | Lit(100)  | Temp2
Add         | Temp1     | Temp2     | Temp3
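One way such quadruples could be produced is a post-order walk over a tree for the expression. In the sketch below the tree is built by hand as nested tuples; the tags "neg", "mul", "add" and the function name emit_quadruples are assumptions for illustration, while the operator labels and Lit/Nil/Temp names follow the table.

```python
def emit_quadruples(tree):
    """tree: an int literal, ("neg", x), ("mul", x, y) or ("add", x, y)."""
    labels = {"neg": "Unary minus", "mul": "Multiply op", "add": "Add"}
    quads = []

    def walk(node):
        if isinstance(node, int):
            return f"Lit({node})"
        op, *children = node
        args = [walk(child) for child in children]   # operands first
        if len(args) == 1:
            args.append("Nil")                       # unary: no operand 2
        dest = f"Temp{len(quads) + 1}"               # next free temporary
        quads.append((labels[op], args[0], args[1], dest))
        return dest

    walk(tree)
    return quads

# -84 + 19 * 100, with unary minus applied to the literal 84
tree = ("add", ("neg", 84), ("mul", 19, 100))
quads = emit_quadruples(tree)
```

The resulting list has the same three rows, in the same order, as the table.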
Sample expression: (x + y) * 2 + x
Grammar:
expression::= term | expression + term
term::= factor | term * factor
factor::= x | y | z | (expression)
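A minimal recursive-descent sketch for this grammar follows. It is not what yacc generates (yacc builds a table-driven parser); it is only meant to show how the three rules nest. The left-recursive rules are written as loops, which accept the same strings and keep "+" and "*" left-associative. The tree is simplified to operator nodes only, and the function name parse is an assumption. Since the grammar as written has no numeric factor, the usage example uses z where the sample expression has 2.

```python
def parse(text):
    """Parse text under the grammar above; return a nested-tuple tree."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expression():            # expression ::= term | expression + term
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():                  # term ::= factor | term * factor
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():                # factor ::= x | y | z | (expression)
        tok = peek()
        if tok in ("x", "y", "z"):
            advance()
            return tok
        if tok == "(":
            advance()
            node = expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For instance, parse("(x + y) * z + x") places the "+" of the whole expression at the root, with the product as its left child.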
Operator stack: arithmetic and grouping operators
Operand stack: intermediate values and results
Evaluating an expression
· During the scan, identifiers are evaluated and their values are pushed onto the operand stack
· A left parenthesis is pushed onto the operator stack
· "+" or "*" requires a check of the operator stack. If the top token is "+" or "*", pop it along with the operand stack's top two values, apply it to them, and place the result back on the operand stack. Finally, the new "+" or "*" is placed on the operator stack
· A right parenthesis causes popping, evaluation, and pushing until a left parenthesis is found
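The steps above can be sketched with two Python lists acting as the stacks. One thing here goes beyond the notes and is an assumption: the stacked operator is reduced only when its precedence is at least that of the incoming one, so that "*" arriving over "+" is not reduced too early. The names PRECEDENCE, evaluate, and env (a dictionary giving identifier values) are hypothetical, and the tokenizer simply skips characters it does not recognize.

```python
import re

PRECEDENCE = {"+": 1, "*": 2}   # assumed: "*" binds tighter than "+"

def evaluate(text, env):
    """Evaluate text using an operator stack and an operand stack."""
    operators, operands = [], []

    def reduce_top():
        # Pop one operator and two operands, push the result back.
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append(left + right if op == "+" else left * right)

    for tok in re.findall(r"\d+|[A-Za-z_]\w*|[()+*]", text):
        if tok == "(":
            operators.append(tok)
        elif tok == ")":
            while operators and operators[-1] != "(":
                reduce_top()
            if not operators:
                raise SyntaxError("unmatched ')'")
            operators.pop()                  # discard the "("
        elif tok in PRECEDENCE:
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]):
                reduce_top()
            operators.append(tok)
        elif tok.isdigit():
            operands.append(int(tok))        # literal value
        else:
            operands.append(env[tok])        # identifier: push its value
    while operators:
        if operators[-1] == "(":
            raise SyntaxError("unmatched '('")
        reduce_top()
    return operands.pop()

result = evaluate("(x + y) * 2 + x", {"x": 3, "y": 4})   # (3 + 4) * 2 + 3
```

With x = 3 and y = 4 the sample expression evaluates to 17.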
Operator precedence function: dummy operators
$ ≡ beginning of string
# ≡ end of string