CIS 350 Machine Problem 2
Fall 2000
In this assignment will provide you with the opportunity to compare the relative efficiencies of several search algorithms. You will need to use a programming language which supports the use of pointer variables and allows access to some type of TIME function. You may work with a partner if you wish, but you will each need to turn in separate lab write-ups (I only need one set of programming listings and output).
You should begin your work by writing a program which is capable of repeatedly processing the same set of data using four search algorithms: sequential search of an unordered list, sequential search of an ordered list, and two hashing collision resolution algorithms of your choice (you will need to use buckets and chaining to handle collisions). All linked lists are to be implemented using pointer variables and dynamic allocation of storage.
Your program should read the two data sets housed in the elvis data file search.txt (found by following link from this page. Each data set consists of 150 three digit numbers and is terminated by the sentinel number 0.
The trials will consist of measuring the time and the number of probes required to search and insert (50 and 150) numbers from the first data set in the list or hash table using each algorithm and then measuring the time and the number of probes required to locate and remove (50 and 150) numbers from the second data set from the list or the hash table using each algorithm. For each of the your algorithms you are to have two trials, one using 50 numbers and one using all 150 numbers. You may modify the data sets, if you can provide me with a good rationale.
Any time you compare an data value which doesn’t match your target key - you count it as a probe. Your hash tables should use 75 buckets. Some of the numbers in the second data set will not be found during your "search and remove" operation.
The output from your program must include the times requested above and the average number of probes for each trial, along with appropriate headings indicating which trial the times have come from. As always some intermediate output should be included to make grading and testing easier. To avoid biasing the timing operations though, all algorithms will need to include the same type of output.
You may use any data structures you wish to house the test data sets, as long as your algorithm only uses abstract operations to manipulate them. You must implement your data structuresas C++ classes. You will need to turn in a 3 to 5 page memo discussing your data structure implementation decisions (based on complexity arguments) and program features/limitations, along with the program source listing, compiler diagnostics, and program output.
Assigned: 10-12-00
Date due: 11-02-00