A simple Hash Table implementation from first principles
The concept of soft delete: say we have elements at indexes 3, 5, 7 and 10, and we delete a key whose hash lands on index 5. We do not "hard delete" it, because that would leave an empty slot between 3 and 7, 10, and a lookup probing past index 5 would stop at the gap and miss keys that are actually in the table.
Instead we mark the item at index 5 with soft_deleted=true and do not free its pointer for now.
When we next resize, we run a hard delete for every item that had soft_deleted=true and move only the remaining live pointers into the resized table.
To address the same problem we also keep two counters: active_key_count, which tells how many live keys/items are actually in the hash table, and used_key_count, which tells how many slots are occupied at all (active keys + soft-deleted keys).
So a soft_delete_item operation decreases only the active_key_count; used_key_count stays the same until the tombstones are hard-deleted on resize.
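As a rough sketch of this bookkeeping (in C, assuming a pointer-array table; the struct layout and the name ht_item are assumptions, while soft_deleted, active_key_count, used_key_count and soft_delete_item come from the description above):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot and table layout -- only the soft_deleted flag and the
 * two counters are taken from the description above, the rest is assumed. */
typedef struct {
    char *key;
    char *value;
    bool  soft_deleted;          /* tombstone flag, pointer is kept alive  */
} ht_item;

typedef struct {
    ht_item **items;             /* slot pointers, NULL means truly empty  */
    size_t    size;              /* number of slots                        */
    size_t    active_key_count;  /* live keys only                         */
    size_t    used_key_count;    /* live keys + soft-deleted keys          */
} hash_table;

/* Soft delete: mark the slot instead of freeing it, so later probes can
 * still walk past it.  Only active_key_count goes down; used_key_count
 * stays the same until the next resize hard-deletes the tombstones. */
void soft_delete_item(hash_table *ht, size_t index)
{
    ht_item *item = ht->items[index];
    if (item != NULL && !item->soft_deleted) {
        item->soft_deleted = true;
        ht->active_key_count--;
    }
}
```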
This also lets us handle the load factor in a new way, since the table's fullness can now be measured with either counter.
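The exact change is only hinted at above, but one plausible reading (an assumption, not something stated in the text) is that the grow check measures fullness with used_key_count, because soft-deleted slots still occupy probe space:

```c
/* Assumed reading: the load factor that triggers a grow counts tombstones
 * too, so it is based on used_key_count rather than active_key_count. */
double load_factor(const hash_table *ht)
{
    return (double)ht->used_key_count / (double)ht->size;
}
```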
While doing a search, we check whether the slot the key maps to (for the current table size) holds a soft_deleted item; if it does, we don't take that value and keep searching for the same key on the next probe attempt.
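A lookup sketch under the same assumptions (linear probing and the small hash helper below are assumptions; the "skip soft-deleted slots and keep probing" rule is the part taken from the text):

```c
#include <string.h>

/* Minimal djb2-style hash, only here so the sketch is self-contained. */
static size_t hash(const char *key)
{
    size_t h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h;
}

ht_item *search_item(hash_table *ht, const char *key)
{
    size_t index = hash(key) % ht->size;
    for (size_t probes = 0; probes < ht->size; probes++) {
        ht_item *item = ht->items[index];
        if (item == NULL)
            return NULL;                  /* real gap: key is not in the table       */
        if (!item->soft_deleted && strcmp(item->key, key) == 0)
            return item;                  /* live item with the matching key         */
        index = (index + 1) % ht->size;   /* soft-deleted or other key: next attempt */
    }
    return NULL;
}
```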
While doing an insert, the pointers of soft-deleted items are not freed yet, so we can simply continue with our normal approach.
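An insert sketch under one reading of "continue with our normal approach": a soft-deleted slot still holds a valid pointer, so the probe loop treats it like any occupied slot and moves on. This assumes the table always keeps at least one empty slot (resizing holds the load below the grow threshold) and leaves out updating an existing key:

```c
void insert_item(hash_table *ht, ht_item *item)
{
    size_t index = hash(item->key) % ht->size;
    while (ht->items[index] != NULL)          /* occupied or soft-deleted: keep probing */
        index = (index + 1) % ht->size;
    ht->items[index] = item;
    ht->active_key_count++;
    ht->used_key_count++;
    /* a full implementation would check the load factor here and grow when needed */
}
```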
The basic idea of amortized analysis is that some individual operations may be expensive, but when the expensive ones are spread over all operations, the average cost per operation stays small. For this hash table, amortized analysis shows up when we resize.
Say we start with a hash table of size 1. To insert a second element we need to create a new table of twice the size (2), copy the existing element over, and then insert the new one; under the counting used below this costs 4 operations (2 for allocating the doubled array, 1 for copying the element, 1 for inserting the new element). Doubling like this, the cost per insertion goes as follows:
| no. of elements | size | operations for this insertion |
|---|---|---|
| 1 | 1 | 1 (single insertion into the size-1 table) |
| 2 | 2 | 2+1+1 = 4 (2 ops for the size-2 table, 1 op for copy, 1 op for insert) |
| 3 | 4 | 4+2+1 = 7 (4 ops for the size-4 table, 2 ops for copy, 1 op for insert) |
| 4 | 4 | 1 (space already available, only 1 op for insert) |
| 5 | 8 | 8+4+1 = 13 (8 ops for the size-8 table, 4 ops for copy, 1 op for insert) |
| 6 | 8 | 1 (space already available, only 1 op for insert) |
| 7 | 8 | 1 (space already available, only 1 op for insert) |
Here the cheap operations happen far more often than the expensive ones, which brings the cost of n insertions down from the naive O(n^2) estimate to roughly O(n) in total, i.e. O(1) amortized per insertion.
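Adding up the operations column above: the first 7 insertions cost 1 + 4 + 7 + 1 + 13 + 1 + 1 = 28 operations, about 4 per insertion. That ratio stays bounded as n grows, because under this counting each doubling to a table of size s costs s (allocation) + s/2 (copy) operations, and the doubled sizes 2 + 4 + 8 + ... + n sum to less than 2n, so all the resize work over n insertions is O(n) on top of the n cheap inserts.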
When we want to increase the size, we do it by doubling.
Would it help to shrink as soon as the number of elements drops below half the array size?
No. Suppose we have the elements 1,2,3,4,5,6,7,8, so the size would be 16. If we remove 8 we would immediately shrink the size to 8, leaving the elements 1,2,3,4,5,6,7 in a table of size 8.
But one more insertion would force us to double the size again, so we would be paying for expensive resizes far too frequently.
Should we shrink at 1/4 instead?
Suppose we have 16 elements, 1 through 16, so the size would be 32.
Remove the last 8 elements; we are left with 1,2,3,4,5,6,7,8 in a table of 32 slots.
(Considering the grow threshold is at 0.5:) even if we halve the size now, taking it to 16, we already have 8 elements in a size-16 table, so a single insertion would break the 0.5 threshold and force a grow straight away.
(Although 1/4 is still considered a reasonable shrink point if the grow threshold is higher than 0.5.)
We will take 1/8 as our sweet spot: after shrinking, a few immediate insertions will not cause an upsize. So only when 2 elements remain in a size-16 table (16/8 == 2) does it shrink to size 8.
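Putting the grow and shrink rules together, a resize check might look like the sketch below. The maybe_resize name, the 0.5 grow threshold (taken from the example above), and the minimum size are assumptions; the 1/8 shrink rule and the "hard delete tombstones, move only live pointers" step are from the text.

```c
#include <stdlib.h>

#define MIN_TABLE_SIZE 8   /* hypothetical floor so we never shrink to nothing */

void maybe_resize(hash_table *ht)
{
    size_t new_size;

    if (ht->used_key_count * 2 >= ht->size)            /* load >= 0.5: grow        */
        new_size = ht->size * 2;
    else if (ht->active_key_count * 8 <= ht->size &&   /* live keys <= 1/8: shrink */
             ht->size > MIN_TABLE_SIZE)
        new_size = ht->size / 2;
    else
        return;

    ht_item **old_items = ht->items;
    size_t    old_size  = ht->size;

    ht->items = calloc(new_size, sizeof(ht_item *));
    ht->size  = new_size;
    ht->active_key_count = 0;
    ht->used_key_count   = 0;

    for (size_t i = 0; i < old_size; i++) {
        ht_item *item = old_items[i];
        if (item == NULL)
            continue;
        if (item->soft_deleted) {      /* hard delete the tombstone now            */
            free(item->key);           /* assumes key/value are heap-owned         */
            free(item->value);
            free(item);
        } else {
            insert_item(ht, item);     /* only the live pointers move over         */
        }
    }
    free(old_items);
}
```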