CIS 350 Machine Problem 2
Winter 1997
In this assignment will provide you with the
opportunity to compare the relative efficiencies of several
search algorithms. You will need to use a programming
language which supports the use of pointer variables and
allows access to some type of TIME function. You may work
with a partner if you wish, but you will each need to turn
in separate lab write-ups (I only need one set of
programming listings and output).
You should begin your work by writing a program which
is capable of repeatedly processing the same set of data
using four search algorithms: sequential search of an
unordered list, sequential search of an ordered list, and
two hashing collision resolution algorithms of your choice
(you will need to use buckets and chaining to handle
collisions). All linked lists are to be implemented using
pointer variables and dynamic allocation of storage.
Your program should read the two data sets housed in
the skye data file ~cis350/mp2.data. Each data set consists
of 150 three digit numbers and is terminated by the sentinel
number 0.
The trials will consist of measuring the time and the
number of probes required to search and insert (50 and 150)
numbers from the first data set in the list or hash table
using each algorithm and then measuring the time and the
number of probes required to locate and remove (50 and
150) numbers from the second data set from the list or the
hash table using each algorithm. For each of the your
algorithms you are to have two trials, one using 50 numbers
and one using all 150 numbers. You may modify the data sets,
if you can provide a good rationale.
Any time you compare an data value which doesn’t match
your target key - you count it as a probe. Your hash tables
should use 75 buckets. Some of the numbers in the second
data set will not be found during your "search and remove"
operation.
The output from your program must include the times
requested above and the average number of probes for each
trial, along with appropriate headings indicating which
trial the times have come from. As always some intermediate
output should be included to make grading and testing
easier. To avoid biasing the timing operations though, all
algorithms will need to include the same type of output.
You may use any data structures you wish to house the
test data sets, as long as your algorithm only uses abstract
operations to manipulate them. You will need to turn in a 3
to 5 page memo discussing your data structure implementation
decisions (based on complexity arguments) and program
features/limitations, along with the program source listing,
compiler diagnostics, and program output.