CHAPTER 5
DATA TYPES
INTRO
- The two most common structured data types are arrays and records.
- These and other data types are specified by type operators, or constructors, which are used to form type expressions.
- It is convenient, both logically and concretely, to think of variables in terms of descriptors.
- A descriptor is the collection of the attributes of a variable.
- In an implementation, a descriptor is a collection of memory cells that store variable attributes.
- The word "object" is often associated with the value of a variable and the space it occupies.
PRIMITIVE DATA TYPES
- Data types that are not defined in terms of other types are called primitive data types.
- The primitive data types of a language are used, along with one or more type constructors, to provide the structured types.
NUMERIC TYPES
INTEGER
- The most common primitive numeric data type is integer.
- Many computers support several sizes of integers, and these capabilities are reflected in some programming languages.
- For example, Ada allows these: short integer, integer, and long integer.
- An integer is represented by a string of bits, with one of the bits, typically the leftmost, representing the sign.
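As a rough illustration of the sign bit, the sketch below renders integers as fixed-width bit strings. It assumes two's-complement representation, which these notes do not specify but which most current hardware uses. The helper name `to_bits` is made up for this example.

```python
def to_bits(value, width=8):
    """Return the two's-complement bit string of value in width bits."""
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {width} bits")
    # Masking maps a negative value onto its two's-complement bit pattern.
    return format(value & ((1 << width) - 1), f"0{width}b")


positive = to_bits(5)    # "00000101": leftmost bit is 0
negative = to_bits(-5)   # "11111011": leftmost bit is 1
```

In this encoding the leftmost bit is 1 exactly when the value is negative.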
FLOATING-POINT
- Floating-point data types model real numbers, but the representations are only approximations for most real values.
- On most computers, floating-point numbers are stored in binary, which exacerbates the problem: even a simple decimal fraction such as 0.1 has no exact binary representation.
- Floating-point values are represented as fractions and exponents.
- Most newer computers use the IEEE 754 standard floating-point format.
- Most languages provide two floating-point types, usually called float and double.
- A float is typically stored in 4 bytes of memory.
- A double typically occupies twice as much storage.
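The approximation problem and the two storage sizes can be seen in a short sketch. It assumes a Python implementation whose `float` is an IEEE double, which is the usual case. The helper `to_single` is a hypothetical name introduced here.

```python
import struct

# 0.1 and 0.2 have no exact binary representation, so the sum carries
# a small rounding error and does not compare equal to 0.3.
total = 0.1 + 0.2
is_exact = (total == 0.3)
close_enough = abs(total - 0.3) < 1e-9

# Storage sizes of IEEE single and double precision (standard layout).
float_size = struct.calcsize("<f")
double_size = struct.calcsize("<d")


def to_single(x):
    """Round a double to single precision and convert it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]
```

Here `is_exact` is false while `close_enough` is true, which is why floating-point values are usually compared with a tolerance. Rounding 0.1 to single precision changes its value, while 0.5 (a power of two) survives unchanged.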
DECIMAL
- Most larger computers that are designed to support business systems applications have hardware support for decimal data types.
- Decimal data types store a fixed number of decimal digits, with the decimal point at a fixed position in the value. These types are essential to COBOL.
- Decimal types have the advantage of storing decimal values precisely. Their disadvantages are that the range of values is restricted because no exponents are allowed, and that their representation in memory is wasteful.
- Decimal types are stored very much like character strings, using binary codes for the decimal digits.
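A small sketch of both points follows. Python's `decimal.Decimal` is used only as a stand-in for a hardware decimal type (it is not a fixed-point COBOL-style type), and `to_packed_bcd` is a hypothetical helper that packs two decimal digits per byte, one common way of coding digits in binary.

```python
from decimal import Decimal

# Binary floating point cannot hold 0.10 or 0.20 exactly; decimal can.
binary_sum = 0.10 + 0.20
decimal_sum = Decimal("0.10") + Decimal("0.20")


def to_packed_bcd(digits):
    """Pack a string of decimal digits two per byte (4 bits per digit)."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a non-empty string of digits 0-9")
    if len(digits) % 2:
        digits = "0" + digits  # pad to a whole number of bytes
    return bytes(
        (int(digits[i]) << 4) | int(digits[i + 1])
        for i in range(0, len(digits), 2)
    )
```

Each 4-bit group can hold 16 patterns but only 10 are used, which is one way to see why the representation wastes memory.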
BOOLEAN TYPES
- Boolean types are perhaps the simplest of all types; they were introduced in ALGOL 60.
- The range of values has only two elements: TRUE and FALSE.
- Boolean types are often used to represent switches or flags in programs.
CHARACTER TYPES
- Character data are stored in computers as numeric codings.
- The most commonly used coding is ASCII.
- A 16-bit character set named Unicode was developed as an alternative.
- Java was the first widely used language to use Unicode.
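To make "numeric coding" concrete, the sketch below maps characters to their codes and back, and counts how many 16-bit units a string needs when stored as UTF-16. The helper name `utf16_units` is an assumption of this example.

```python
code_of_upper_a = ord("A")   # numeric code of a character
char_of_97 = chr(97)         # character for a numeric code


def utf16_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-be")) // 2


e_acute = "\u00e9"           # a character outside the ASCII range
```

ASCII and Unicode agree on the codes of the ASCII characters, so "A" is 65 in both. The character `e_acute` has code 233, which ASCII cannot express but a single 16-bit unit can.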
CHARACTER STRING TYPES
- A character string type is one in which the values consist of sequences of characters.
- Character strings are used to label output, and the input and output of all kinds of data are often done in terms of strings.
DESIGN ISSUES
- Should strings be a special kind of character array or a primitive type?
- Should strings have static or dynamic length?
STRINGS AND THEIR OPERATIONS
- If strings are not defined as a primitive type, string data is usually stored in arrays of single characters and referenced as such in the language.
- In Ada, STRING is a type that is predefined to be single-dimensioned arrays of CHARACTER elements.
- Character string catenation in Ada is an operation specified by the "&" operator.
Ex. name1 := name1 & name2;
- C and C++ use char arrays to store character strings.
- Some of the most commonly used library functions for character strings in C and C++ are strcpy, which copies strings; strcat, which catenates one given string onto another; and strcmp, which lexicographically compares two strings.
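The catenation and comparison operations above can be sketched in Python. The function `strcmp_like` is a hypothetical re-creation of the ordering that C's strcmp reports (negative, zero, or positive), not the C library routine itself, and the sample names are invented.

```python
def strcmp_like(a, b):
    """Compare two strings lexicographically by character code.

    Returns a negative number if a sorts before b, zero if they are
    equal, and a positive number if a sorts after b.
    """
    for x, y in zip(a, b):
        if x != y:
            return ord(x) - ord(y)
    # One string is a prefix of the other: the shorter one sorts first.
    return len(a) - len(b)


name1 = "Ada"
name2 = " Lovelace"
name1 = name1 + name2   # catenation, like name1 := name1 & name2 in Ada
```

Only the sign of the result is meaningful, which matches how strcmp results are normally used.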
STRING LENGTH OPTIONS
- There are several design choices regarding the length of string values.
- First, the length can be static and specified in the declaration. Such a string is called a static length string.
- The second option is to allow strings to have varying length up to a declared and fixed maximum set by the variable's definition. These are called limited dynamic length strings.
- The third option is to allow strings to have varying length with no maximum. These are called dynamic length strings.
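The middle option can be sketched as a small wrapper class. Python's own strings are dynamic length, so `LimitedString` is a hypothetical class written only to show what a limited dynamic length string has to enforce.

```python
class LimitedString:
    """A string whose length may vary up to a fixed, declared maximum."""

    def __init__(self, max_length, value=""):
        self.max_length = max_length   # fixed when the variable is defined
        self.value = ""
        self.assign(value)

    def assign(self, new_value):
        """Store new_value, rejecting anything longer than the maximum."""
        if len(new_value) > self.max_length:
            raise ValueError(
                f"length {len(new_value)} exceeds maximum {self.max_length}"
            )
        self.value = new_value

    def __len__(self):
        return len(self.value)
```

For example, `LimitedString(5, "abc")` has current length 3 and accepts `"abcde"`, but assigning `"abcdef"` raises `ValueError`. A static length string would fix the length itself rather than just the maximum.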
USER DEFINED ORDINAL TYPES
- An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers.
ENUMERATION TYPES
- An enumeration type is one in which all of the possible values, which become symbolic constants, are enumerated in the definition.
- Ex. in Ada: type DAYS is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
- Enumeration types improve both readability and reliability.
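For comparison with the Ada declaration, here is a minimal sketch using Python's `enum` module. The numbering from 1 to 7 and the helper `is_weekend` are choices made for this example.

```python
from enum import Enum


class Day(Enum):
    MON = 1
    TUE = 2
    WED = 3
    THU = 4
    FRI = 5
    SAT = 6
    SUN = 7


def is_weekend(day):
    """Readable test that uses the symbolic constants, not raw integers."""
    return day in (Day.SAT, Day.SUN)
```

Readability improves because code names `Day.SAT` rather than 6, and reliability improves because a value outside the enumeration, such as `Day(8)`, is rejected with `ValueError`.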
SUBRANGE TYPES
- A subrange type is a contiguous subsequence of an ordinal type.
- Ex. 12..14 is a subrange of integer type.
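Python has no subrange types, so the sketch below only imitates the range check that a language with subranges would apply on assignment. `make_subrange` and `twelve_to_fourteen` are hypothetical names.

```python
def make_subrange(low, high):
    """Return a checker for the subrange low..high (both ends inclusive)."""

    def check(value):
        if not low <= value <= high:
            raise ValueError(f"{value} is outside {low}..{high}")
        return value

    return check


twelve_to_fourteen = make_subrange(12, 14)
```

Calling `twelve_to_fourteen(13)` returns 13, while `twelve_to_fourteen(15)` raises `ValueError`.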
ARRAY TYPES
- An array is a homogeneous aggregate of data elements in which an individual element is identified by its position in the aggregate, relative to the first element.
- Specific elements of an array are referenced by means of a two-level syntactic mechanism, where the first part is the aggregate name, and the second part is a possibly dynamic selector consisting of one or more items known as subscripts or indexes.
- If all of the indexes in a reference are constants, the selector is static; otherwise it is dynamic.
- A static array is one in which the subscript ranges are statically bound and storage allocation is static (done before run time). The advantage is efficiency.
- A fixed stack-dynamic array is one in which the subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution. The advantage is space efficiency.
- A stack-dynamic array is one in which the subscript ranges are dynamically bound and the storage allocation is dynamic. The advantage is flexibility.
- A heap-dynamic array is one in which the binding of subscript ranges and storage allocation is dynamic and can change any number of times during the array's lifetime. The advantage is flexibility.
- Arrays in C can have only one subscript, but arrays can have arrays as elements, thus supporting multi-dimensional arrays. This is an example of orthogonality.
- A slice of an array is some substructure of that array.
- There are two common ways in which multi-dimensional arrays can be mapped to one dimension in memory: row major order and column major order.
- In row major order, the array is stored by rows.
- In column major order, the array is stored by columns.
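The two storage orders can be sketched as offset calculations on a small two-dimensional array. The sketch assumes zero-based subscripts; the function names and the sample matrix are made up for illustration.

```python
def row_major_offset(i, j, n_cols):
    """Offset of element (i, j) when the array is stored row by row."""
    return i * n_cols + j


def column_major_offset(i, j, n_rows):
    """Offset of element (i, j) when the array is stored column by column."""
    return j * n_rows + i


matrix = [[1, 2, 3],
          [4, 5, 6]]
n_rows, n_cols = 2, 3

# Flatten the same matrix in each order.
row_major = [matrix[i][j] for i in range(n_rows) for j in range(n_cols)]
column_major = [matrix[i][j] for j in range(n_cols) for i in range(n_rows)]

# Slices of the row-major storage: a row is contiguous, a column is strided.
second_row = row_major[1 * n_cols:2 * n_cols]
second_column = row_major[1::n_cols]
```

In row major order the flattened array is `[1, 2, 3, 4, 5, 6]`, and in column major order it is `[1, 4, 2, 5, 3, 6]`. The same element `matrix[1][0]` therefore sits at offset 3 in the first layout and offset 1 in the second.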
ASSOCIATIVE ARRAYS
- An associative array is an unordered collection of data elements that are indexed by an equal number of values called keys. Perl provides associative arrays as a built-in type.
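Python's built-in `dict` plays the same role as a Perl associative array, so it can serve for a quick sketch. The names and salary figures below are invented sample data.

```python
# Each element is stored together with the key that indexes it.
salaries = {"Ana": 75000, "Bo": 57000}

salaries["Cy"] = 55750      # add a new key/value pair
salaries["Ana"] = 76000     # replace the value stored under a key
del salaries["Bo"]          # remove an element by key

has_bo = "Bo" in salaries   # membership is tested on keys
```

After these operations the array holds two elements, keyed by "Ana" and "Cy", and `has_bo` is false.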
RECORD TYPES
- A record is a heterogeneous aggregate of data elements in which
the individual elements are identified by names
- Records and arrays are closely related, and it is interesting to compare them. Arrays are used when all the data values have the same type and are processed in the same way. Records are used when the collection of data values is heterogeneous and the different fields are not processed in the same way. Also, the fields of a record often need not be processed in a particular sequential order.
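The contrast can be sketched with a Python dataclass standing in for a record and a list standing in for an array. `Employee` and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    # Heterogeneous fields, each identified by name rather than position.
    name: str
    hourly_rate: float
    is_manager: bool


worker = Employee(name="Ana", hourly_rate=32.5, is_manager=False)
weekly_pay = worker.hourly_rate * 40    # fields are selected by name

# An array, by contrast: same-typed elements processed uniformly by position.
hours = [8, 7, 9, 8, 8]
total_hours = sum(hours)
```

The record's fields are read individually and in no particular order, while the array's elements are all handled by the same operation.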
UNION TYPES
- A union is a type that may store different type values at
different times during program execution
- Fortran, C and C++
provide union constructs
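A C-style union can be imitated from Python with the standard `ctypes` module, which is enough to show one storage location holding values of different types at different times. The class and field names are assumptions of this sketch, and it assumes a platform where `c_float` is a 4-byte IEEE single, which is the common case.

```python
import ctypes


class IntOrFloat(ctypes.Union):
    # Both fields occupy the same 4 bytes of storage.
    _fields_ = [("as_int", ctypes.c_uint32), ("as_float", ctypes.c_float)]


value = IntOrFloat()

value.as_float = 1.0
bits_of_one = value.as_int              # the float's bit pattern read as an int

value.as_int = 0
float_after_int_store = value.as_float  # storing an int overwrote the float
```

Nothing records which field was stored last, so reading the other field simply reinterprets the bits; that absence of type checking is the main reliability concern with this kind of union.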
SET TYPES
- A set type is one whose variables can store unordered
collections of distinct values from some ordinal type called its base type. Set types are often used to model
mathematical sets.
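Python's built-in `set` can model this. In the sketch below an enumeration serves as the base type; `Color` and the particular sets are invented for illustration.

```python
from enum import Enum


class Color(Enum):
    RED = 1
    YELLOW = 2
    GREEN = 3
    BLUE = 4


warm = {Color.RED, Color.YELLOW}
primary = {Color.RED, Color.YELLOW, Color.BLUE}

both = warm & primary            # intersection
either = warm | {Color.GREEN}    # union
only_primary = primary - warm    # difference
no_duplicates = {Color.RED, Color.RED}
```

The values are distinct and unordered: `no_duplicates` contains a single element, and two sets with the same members compare equal regardless of how they were written.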
POINTER TYPES
- A pointer type is one in which the variables have a range of
values that consists of memory addresses and a special value, nil.
- Pointers have been designed for two distinct uses. First, pointers provide some of the power of indirect addressing, which is heavily used in assembly language programming. Second, pointers provide a method of dynamic storage management. A pointer can be used to access a location in the area where storage is dynamically allocated, which is usually called a heap.
- Variables that are dynamically allocated from the heap are called heap-dynamic variables.
- Languages that provide a pointer type usually include two fundamental pointer operations: assignment and dereferencing.
- A dangling pointer is a pointer that contains the address of a
heap-dynamic variable that has been deallocated.
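Python does not expose raw pointers, so the sketch below is only a toy model: a hypothetical `ToyHeap` class hands out integer "addresses", and plain variables holding those addresses act as pointers. Unlike most real implementations, this model detects a dangling dereference instead of silently misbehaving.

```python
class ToyHeap:
    """Toy heap: maps made-up integer addresses to stored values."""

    NIL = None   # the special pointer value that addresses nothing

    def __init__(self):
        self._cells = {}
        self._next_address = 1

    def allocate(self, value):
        """Create a heap-dynamic variable and return its address."""
        address = self._next_address
        self._next_address += 1
        self._cells[address] = value
        return address

    def deallocate(self, address):
        """Release the heap-dynamic variable at address."""
        del self._cells[address]

    def dereference(self, address):
        """Follow a pointer to the value it addresses."""
        if address not in self._cells:
            raise LookupError("nil or dangling pointer")
        return self._cells[address]

    def is_dangling(self, address):
        """True if address is non-nil but no longer refers to a live cell."""
        return address is not ToyHeap.NIL and address not in self._cells


heap = ToyHeap()
p = heap.allocate(42)
q = p                              # pointer assignment: q and p share an address
value_via_q = heap.dereference(q)  # dereferencing
heap.deallocate(p)                 # q still holds the old address
```

After the deallocation through `p`, the copy `q` is a dangling pointer: it still contains the address of a heap-dynamic variable that no longer exists.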