CHAPTER 3
DESCRIBING SYNTAX AND SEMANTICS
INTRODUCTION
·
The
study of programming languages can be divided into the examination of
syntax and semantics
·
Syntax
- is the form of expressions, statements, and program units
·
Semantics
- is the meaning of those expressions, statements, and program
units
·
In
a well-designed programming language, semantics should follow directly
from syntax
·
Describing
syntax is easier than describing semantics
THE GENERAL PROBLEM OF DESCRIBING SYNTAX
·
Lexemes
- the lowest level of syntactic unit
·
The
lexemes of a programming language include its identifiers, literals,
operators and special words
·
Token
of a language is a category of its lexemes
LANGUAGE
RECOGNIZERS
- Languages can be
defined in two ways: by recognition and by generation
- A language generator is
a device that can be used to generate the
sentences of a language
FORMAL
METHODS OF DESCRIBING SYNTAX
- John Backus and Noam
Chomsky invented a notation that is most widely used
for describing programming language syntax
CONTEXT FREE GRAMMERS
- Chomsky described 4
classes of grammars that define 4 classes of
languages. Two of these grammar classes, context-free and regular
turned
out to be useful for describing the syntax of programming languages
- The tokens of
programming languages can be described by regular grammars
ORIGINS
OF BACKUS -NAUR FORM (BNF)
- BNF is a very natural
notation for describing syntax
- Chomsky's context-free
languages is almost same as BNF's context-free
grammars
- meta-language is a
language that is used to describe another language
- BNF is meta-language
for programming languages
- The abstractions in
BNF, or grammar are called non-terminals
- The lexemes and tokens
of the rules are called terminals
- A BNF description, or
grammar, is simply a collection of rules
PARSE TREES
·
Most
attractive feature of grammars is that they describe the hierarchical
syntactic structure of sentences of the languages they define. These
hierarchical structures are called parse trees
·
A
grammar that generates a sentence for which there are two or more
distinct parse trees is said to be ambiguous
·
Syntactic
ambiguity of language structures is a problem because compilers
often base the semantics of those structures on their syntactic form
SYNTAX
GRAPHS
- A graph is a collection
of nodes, some of which are connected by lines,
called edges
- A directed graph is one
in which the edges are directional; they have
arrowheads on one end to indicate a direction
- The information in BNF
rules can be represented in a directed graph, such
graphs are called syntax graphs. These graphs use rectangles for
non-terminals and circles for terminals
GRAMMARS
AND RECOGNIZERS
- One of the most widely
used of the syntax analyzer generators is named
yacc - yet another compiler compiler
- Syntax analyzers for
programming languages, which are often called
parsers, construct parse trees for given programs
- The 2 broad classes of
parsers are top-down, in which the tree is built
from the root downward to the leaves, and bottom-up, in which the parse
tree is built from the leaves upward to the root.
RECURSIVE
DECENT PARSING
- Context-free grammar
can serve as the basis for the syntax analyzer, or
parser, of a compiler
- A simple kind of
grammar-based top-down parser is named recursive decent
- Parsing is the process
of tracing a parse tree for a given input string
- The basic idea of a
recursive decent parser is that there is a subprogram
for each non-terminal in the grammar
ATTRIBUTE
GRAMMARS
- An attribute grammar is
a device used to describe more of the structure of
a programming language than is possible with a context-free grammar
- An attribute grammar is
an extension to a context-free grammar
STATIC
SEMANTICS
- The static semantics of
a language is only indirectly related to the
meaning of programs execution; rather, it has to do with the legal form of
programs (syntax rather than semantics)
BASIC CONCEPTS
- Attributes which are
associated with grammar symbols, are similar to
variables in the sense that they can have values assigned to them
- Attribute computation
functions are associated with grammar rules to
specify how attribute values are computed
- Predicate functions
which state some of the syntax and static semantic
rules of the language, are associated with grammar rules
- Intrinsic attributes
are synthesized attributes of leaf nodes whose values
are determined outside the parse tree
DESCRIBING
THE MEANING OF PROGRAMS: DYNAMIC SYMANTICS
- Dynamic semantics are
the meaning of the expressions, statements and
program units
- programmers need to
know precisely what statements of a language do
- Compile writers
determine the semantics of a language for which they are
writing compilers from English descriptions
OPERATIONAL
SEMANTICS
- Operational semantics
the idea is to describe the meaning of a program by
executing its statements on a machine, either real or simulated
- Operational semantics
provides an effective means of describing semantics
for language users and language implementers, as long as the descriptions
are kept simple and informal
- Operational semantics
depends on algorithms, not mathematics
AXIOMATIC
SEMANTICS
- Axiomatic semantics
defined in conjunction with the development of a
method to prove the correctness of a program
- Axiomatic semantics is
based on mathematical logic. The logical
expressions are called predicates, or assertions. An assertion
immediately
following a statement describes a new constraints on those variables after
execution of the statement. These assertions are called the
precondition
and post-condition. Developing an axiomatic description or proof of
a given
program requires that every statement in the program have both a
precondition and a post-condition.
DENOTATIONAL SESMANTICS
·
Denotation
semantics is the most rigorous widely known method for
describing the meaning of programs. It is based on recursive function
theory.
·
The
fundamental concept of denotation semantics is to define for each
language entity both a mathematical object and a function that maps
instances of that entity onto instances of the mathematical object.