Help for the Hashing Animation Tool
|
Introduction
| Sample Instructions | Additional
Display Information | Algorithms | Operating
Requirements | Suggestions |
|
|
Introduction: |
|
|
The purpose of this applet is to demonstrate, and
allow comparisons between, five hashing algorithms:
After the user enters specifications for the hash
table and numerical input, the applet will animate the storage or retrieval
of the data. As the demo runs, the user may display the pseudocode for
the algorithm. In this mode, lines of code are highlighted in synchronization
with the animation to demonstrate the algorithm logic. The user may
alternatively elect to display statistics that are continuously updated
as the algorithm executes.
Tabular setup and statistics summaries are provided
for up to five runs. These allow the user to compare the performance
of different algorithms or the same algorithm with different settings.
In addition, the applet will also display plots of average time and
average probes as a function of table load.
|
|
|
Sample
Instructions: |
|
|
-
When the applet loads in the browser it displays
an animated title screen and a toolbar along the bottom. To display
the Setup page for the demo, select an algorithm from the drop-down
menu or click the Setup button if you wish to use Linear Probing.
-
The Setup page shows the
current settings for the selected algorithm. If you wish to change
a setting, click the Change button for the appropriate section,
enter the new setting(s) and click the OK button. Instructions for
each input are provided in the Notes display. When you are finished,
click the Execute button on the toolbar.
-
The Execution page displays
the hash table, hash function, execution progress and the statistics
table. To view the algorithm pseudocode instead of the statistics,
click the Source Code button. Click the Statistics button if you
wish to toggle back.
-
To begin the demo click either the Step or Run
button. If the Statistics table is showing, the Step button will
execute one table search or insertion. If the pseudocode is displayed,
it will execute the action corresponding to the highlighted line
of code. Click Step again for the next action or click the Run button
to animate the demo to completion. To adjust the speed of the animation
or disable the sound, click the Options button and set the controls
in the pop-up dialog box. Clicking the Pause button while a demo
is running halts the execution and allows you to switch back to
Step mode if desired. Clicking Abort resets the execution display
back to its start state. After executing one or more demos, click
the Results button.
-
The Results page displays
tabular summaries of up to five completed runs. Here you can compare
the performance of different algorithms or settings. To delete a
run, click on any of its summary fields to select it and click the
Delete button. Clicking the Undelete button restores any deleted
summaries one at a time if there is room to display them. Next,
click the Plot button.
- The Plot page allows you
to plot Average Time and Average Probes as functions of Average Table
Load Factor for storage and as functions of numbers searched/total
numbers for retrieval. To plot a run, click on its summary row with
the mouse. The summary text will change color and will match its corresponding
data on the graph. This page allows three possible views of the two
plots: two small plots, one large time plot or one large probes plot.
Click either the Time, Probes or Both button on the toolbar to display
the desired view.
You are not restricted to performing the steps in
the order outlined above. Any button or control that is enabled may
be used at any time.
|
|
|
Additional
Display Information: |
|
|
This section provides additional information about
each of the applet displays.
|
|
Setup
Page |
Each algorithm is setup individually. When the Setup
page is selected it will display the default settings for the current
algorithm or the settings entered the last time the Setup page for that
algorithm was modified.
The range limits on the individual settings are
interrelated and some of them are strictly for display purposes. For
example, more one-slot buckets will fit on the screen than three-slot
buckets. Therefore, if you increase the slots per bucket from one to
three the applet may automatically reduce the number of buckets to ensure
the table will fit on the screen. When this happens, the Notes area
will display a message indicating that an adjustment was made and a
notification sound will play (if sound is enabled). A similar adjustment
will also occur if you make changes that reduce the capacity of the
table to below that of the current data size setting.
There are two options that are only available for
Retrieve demos. Both are found in the Execution mode section of the
page. The first allows you to specify the percentage of searches for
data that will be successful. The second is a quick load option. For
Retrieve demos, you may watch or step through the loading of the table
prior to running the actual data retrieval. To skip this portion of
the demo, select the quick load checkbox. The table will then be loaded
with data prior to being displayed.
|
Execution
Page |
Table Display
|
The hash table uses different colors to indicate
various states and actions. An empty bucket slot is gray. A slot that
contains a value is purple. Links to other buckets are cyan. A link
displaying a diagonal line indicates that the value of that pointer
is null. Nodes of linked-list chains are show in orange.
When a bucket or node is searched, either to find
an empty slot or to find a certain value, the border of the bucket
or node is highlighted in yellow. A successful search for a value
is indicated by displaying the slot or node containing the value in
green.
Due to display limitations, nodes of linked-list
chains over a certain length will not be visible. In these cases,
the last visible node will have a dashed link arrow to indicate there
are additional nodes in the chain. This arrow will be shown in yellow
to indicate when these additional nodes are being searched and it
will be shown in green when a search is successful.
The text above of table indicates whether a store
or retrieve demo is selected. It also indicates the current search
status of the algorithm or whether the demo has finished executing.
|
Hash Function
Display
|
The upper right portion of the Execution page indicates
the current hash function. In all cases division hashing is used and
the equation will be:
home address = value mod (number
of buckets)
This display is updated as each value is stored
or retrieved.
|
Progress Display
|
The progress display is immediately below the hash
function. This display indicates how many values out of the data size
specified during setup remain to be processed. The progress display
is the only portion of the Execution page updated during a quick load.
|
Statistics Display
|
The following statistics are continuously updated
as the demo executes:
- Time - the minutes, seconds, and centiseconds
that have elapsed since the demo began. This data is obtained from
your computer system's clock and is not adjusted to account for
animation speed or delays introduced by using the Step and Pause
buttons.
- Average Probes - the average number of bucket
and node searches per value that the algorithm attempted to store
or retrieve.
- Load Factor - the number of values in the table
divided by the table capacity. The table capacity equals: buckets
x slots + size of separate overflow.
- Searches - if a value was successfully stored
or retrieved from the table it counts as a successful search. When
the algorithm finishes the sum of the successful and unsuccessful
searches will equal the data size.
|
Pseudocode Display
|
This portion of the page shows the pseudocode for
the current algorithm. As the demo runs the lines of code are highlighted
in synchronization with the animation to demonstrate the algorithm
logic. See the Algorithms section for more
information about the individual collision resolution algorithms.
|
Results
Page |
Any demo that is allowed to run to completion creates
a summary listing that will be displayed on the Results page. Please
note that the results displayed for a retrieve demo do not include any
of the probes, searches, etc. that occurred while the table was being
loaded.
The Results Page displays summaries for up to five
runs. If additional demos are run, the summary displayed in column one
will be lost, the remaining summaries shifted down and the new summary
added to column five.
You may delete any of the displayed summaries to
make room on the page for new runs. To do this click any of the summary
data for the runs you wish to remove and click the Delete button. Clicking
the Undelete button restores deleted summaries one at a time if there
is room to display them. Up to five deleted summaries may be restored.
|
Plot
Page |
The Plot page allows you to plot Average Time and
Average Probes for up to five demos. Average Time refers to the average
number of seconds that elapsed for each value the algorithm attempted
to store or retrieve. Similarly, Average Probes is the average number
of bucket and node searches per value that the algorithm attempted to
store or retrieve. The interpretation of the x-axis depends on if the
algorithm was storing or retrieving data. For storage runs, the plots
are functions of Average Table Load Factor. Retrieve demos, however,
plot data as functions of the current number of values searched over
the total number to search (the data size).
|
Options Dialog |
The Options dialog is a pop-up window that is displayed
when the Options button is clicked on the toolbar. This dialog allows
you to enable or disable sound and to adjust the algorithm execution
speed.
Sounds are played under the following circumstances:
to indicate an input error on the Setup page, to indicate the number
of buckets or data size was automatically reduced, and to indicate successful
and unsuccessful searches.
The animation delay is controlled with a slider and
sets the number of milliseconds the algorithm pauses after each action.
Setting the delay to zero results in maximum execution speed. The execution
will not appear instantaneous however, since the applet still requires
a certain amount of time to refresh the elements displayed on the screen.
|
|
Algorithms: |
|
|
This section provides additional information about
the five collision resolution algorithms the applet demonstrates. Recall
that all of the algorithms use the same division hash function to determine
the home address. The collision resolution algorithms dictate the actions
taken when a value cannot be stored in its home address because it is
already full. Similarly, the approach taken when searching for data
depends on the algorithm used to store the data initially.
|
|
Linear
Probing
|
If a collision occurs while trying to store a value
at its home address, this algorithm moves forward through the table
searching each bucket one by one until an empty slot is found. If
the bottom of the table is reached, the search wraps around back to
the beginning of the table and continues until either an empty slot
is found or the search reaches its starting point, in which case the
table is full. For the applet demo, the data size cannot exceed the
table capacity; so all storage attempts will be successful.
The retrieve algorithm also performs a wrap-around
search starting at the home address. The search is considered unsuccessful
if the algorithm encounters an empty slot or wraps around to the home
address before finding the desired value.
|
Quadratic
Probing
|
This algorithm is similar to linear probing except
a different formula is used to calculate which address is searched
next. This formula is:
address = home address + i*i where
"i" is a zero-based search counter.
If the value of an address exceeds the address
of the bottom of the table, the search wraps around from address 0.
There are variations of this algorithm that, with
certain restrictions on the table and input, guarantee each bucket
will be searched. The applet's implementation does not impose these
restrictions. Instead, the maximum number of searches it will perform
for an empty bucket is equal to the number of buckets in the table.
When this condition is reached the storage is considered unsuccessful.
Retrieval is similar except that a search is also considered unsuccessful
if an empty slot is encountered prior to finding the value searched
for.
|
Bucket
Chaining
|
Storage using the bucket chaining algorithm requires
two passes through the input data set. On the first pass, the home
address for a value is determined and the data is inserted in the
table if there is an available slot at this address. If there is a
collision however, the data is stored in a temporary data structure
until the second pass.
The second pass works as follows:
For each value that wasn't stored on the first pass, the algorithm
checks the link of the value's home address. If the link is null,
a wrap around linear probing search is done for a completely empty
bucket. If an empty bucket is found, the value is inserted and the
link of the home address is set to the address where the value was
inserted. If the link at the home address is not null, the algorithm
searches for an empty slot in the bucket pointed to by the link. If
there is an empty slot the value is inserted. If this bucket is also
full and has a link that is not null, the next bucket in the chain
is searched. If the chain ends with a full bucket, the algorithm performs
a wrap around linear probing search for a completely empty bucket.
If an empty bucket is found, the value is inserted and the link of
the last bucket in the chain is set to the address where the value
was inserted. Any wrap around search that returns to its origin is
considered unsuccessful.
The retrieve algorithm starts at the home address
and searches each bucket in the chain. The search is unsuccessful
if an empty slot or a full bucket with a null link is encountered
prior to finding the value searched for.
|
Linked
List Chaining
|
This algorithm always stores a data value at its
home address. Data is inserted in the bucket slots until the bucket
is full. Subsequent values hashed to this address are added as new
nodes in the link list pointed to by the bucket. For the algorithm
demonstrated by the applet, new nodes are added to the front of the
list.
The retrieve algorithm first searches the slots
of the bucket at the home address. If the value is not present, the
algorithm searches each node of the linked list one by one. The search
is unsuccessful if the end of the chain is reached prior to finding
the value searched for.
|
Chaining
with Separate Overflow
|
This algorithm requires two hash tables. The hash
function is based on the number of buckets in the primary hash table.
All home addresses calculated by the hash function refer to the primary
table. Data is inserted in the overflow table only if the home address
in the primary table is full. The overflow table used in the applet
demo is limited to one slot per bucket.
If a collision occurs while trying to store a
value at its home address, the algorithm checks the link at this bucket.
If the link is null, the algorithm performs a linear probing search
of the overflow table for an empty bucket starting at the first bucket
in the table. If an empty bucket is found, the value is inserted and
the link of the home address is set to the address in the overflow
table where the value was inserted.
If the link at the home address is not null, the
algorithm searches the associated chain in the overflow table until
the end of the chain is reached. At this point the algorithm performs
a wrap around linear probing search for an empty bucket. If an empty
bucket is found, the value is inserted and the link of the last bucket
in the chain pointed to this bucket's address. Any wrap around search
that returns to its origin is considered unsuccessful.
The retrieve algorithm starts at the home address
and searches each bucket in the chain. The search is unsuccessful
if a full bucket with a null link is encountered prior to finding
the value searched for.
|
|
Operating
Requirements: |
|
|
The Hashing Animation Tool is implemented as a Java
applet. In order to view the applet a Java enabled web browser that
supports Java 1.1, such as Internet Explorer 4+ or Netscape Navigator
4+, is required. In addition, due to the dimensions of the applet a
monitor resolution of at least 1027 x 768 is recommended. Finally, the
file size for the applet is fairly large. Users with dial-up connections
may need to wait a minute or so for the applet to finish loading.
|
|
|
Suggestions
for running the applet: |
|
|
When running Storage demos, the first insertions
into the table are not very interesting since there will probably be
few collisions. Adjust the animation delay to its minimum value, click
Run and let the demo execute until you begin to see collisions. After
that, either click Pause and switch to Step mode or increase the animation
delay so that you can observe how the algorithm works.
As stated in the Execution Page section, the time
statistic is obtained from your computer system's clock and is not adjusted
to account for animation speed or delays introduced by using the Step
and Pause buttons. If you wish to compare the execution times for different
algorithms only use the Run button and use the same animation delay
setting for each demo.
Finally, when you are ready to browse away to a
new web page abort the demo if it is running. If you don't and are running
Netscape, you may have to wait a few extra seconds before the browser
will let you leave the page.
|
|
|
Introduction
| Sample Instructions | Additional
Display Information | Algorithms | Operating
Requirements | Suggestions |
|