## DSA Tutorial

DSA HOME DSA Intro DSA Simple Algorithm

## Arrays

DSA Arrays DSA Bubble Sort DSA Selection Sort DSA Insertion Sort DSA Quick Sort DSA Counting Sort DSA Radix Sort DSA Merge Sort DSA Linear Search DSA Binary Search

## Linked Lists

DSA Linked Lists DSA Linked Lists in Memory DSA Linked Lists Types Linked Lists Operations

## Stacks & Queues

DSA Stacks DSA Queues

## Hash Tables

DSA Hash Tables DSA Hash Sets DSA Hash Maps

## Trees

DSA Trees DSA Binary Trees DSA Pre-order Traversal DSA In-order Traversal DSA Post-order Traversal DSA Array Implementation DSA Binary Search Trees DSA AVL Trees

## Graphs

DSA Graphs Graphs Implementation DSA Graphs Traversal DSA Cycle Detection

## Shortest Path

DSA Shortest Path DSA Dijkstra's DSA Bellman-Ford

## Minimum Spanning Tree

Minimum Spanning Tree DSA Prim's DSA Kruskal's

## Maximum Flow

DSA Maximum Flow DSA Ford-Fulkerson DSA Edmonds-Karp

## Time Complexity

Introduction Bubble Sort Selection Sort Insertion Sort Quick Sort Counting Sort Radix Sort Merge Sort Linear Search Binary Search

## DSA Reference

DSA Euclidean Algorithm DSA Huffman Coding DSA The Traveling Salesman DSA 0/1 Knapsack DSA Memoization DSA Tabulation DSA Dynamic Programming DSA Greedy Algorithms

## DSA Examples

DSA Examples DSA Exercises DSA Quiz DSA Certificate

# DSA Hash Sets

## Hash Sets

A Hash Set is a form of Hash Table data structure that usually holds a large number of elements.

Using a Hash Set we can search, add, and remove elements really fast.

Hash Sets are used for lookup, to check if an element is part of a set.

Hash Set

0:
{{ el.name }}
1:
{{ el.name }}
2:
{{ el.name }}
3:
{{ el.name }}
4:
{{ el.name }}
5:
{{ el.name }}
6:
{{ el.name }}
7:
{{ el.name }}
8:
{{ el.name }}
9:
{{ el.name }}

Hash Code

{{ sumOfAscii }} % 10 = {{ currHashCode }}

{{ resultText }}0

A Hash Set stores unique elements in buckets according to the element's hash code.

• Hash code: A number generated from an element's unique value (key), to determine what bucket that Hash Set element belongs to.
• Unique elements: A Hash Set cannot have more than one element with the same value.
• Bucket: A Hash Set consists of many such buckets, or containers, to store elements. If two elements have the same hash code, they belong to the same bucket. The buckets are therefore often implemented as arrays or linked lists, because a bucket needs to be able to hold more than one element.

## Finding The Hash Code

A hash code is generated by a hash function.

The hash function in the animation above takes the name written in the input, and sums up the Unicode code points for every character in that name.

After that, the hash function does a modulo 10 operation (% 10) on the sum of characters to get the hash code as a number from 0 to 9.

This means that a name is put into one of ten possible buckets in the Hash Set, according to the hash code of that name. The same hash code is generated and used when we want to search for or remove a name from the Hash Set.

The Hash Code gives us instant access as long as there is just one name in the corresponding bucket.

Unicode code point: Everything in our computers are stored as numbers, and the Unicode code point is a unique number that exist for every character. For example, the character A has Unicode code point 65. Just try it in the simulation above. See this page for more information about how characters are represented as numbers.

Modulo: A mathematical operation, written as % in most programming languages (or $$mod$$ in mathematics). A modulo operation divides a number with another number, and gives us the resulting reminder. So for example, 7 % 3 will give us the reminder 1. (Dividing 7 apples between 3 people, means that each person gets 2 apples, with 1 apple to spare.)

## Direct Access in Hash Sets

Searching for Peter in the Hash Set above, means that the hash code 2 is generated (512 % 10), and that directs us right to the bucket Peter is in. If that is the only name in that bucket, we will find Peter right away.

In cases like this we say that the Hash Set has constant time $$O(1)$$ for searching, adding, and removing elements, which is really fast.

But, if we search for Jens, we need to search through the other names in that bucket before we find Jens. In a worst case scenario, all names end up in the same bucket, and the name we are searching for is the last one. In such a worst case scenario the Hash Set has time complexity $$O(n)$$, which is the same time complexity as arrays and linked lists.

To keep Hash Sets fast, it is therefore important to have a hash function that will distribute the elements evenly between the buckets, and to have around as many buckets as Hash Set elements.

Having a lot more buckets than Hash Set elements is a waste of memory, and having a lot less buckets than Hash Set elements is a waste of time.

## Hash Set Implementation

Hash Sets in Python are typically done by using Python's own set data type, but to get a better understanding of how Hash Sets work we will not use that here.

To implement a Hash Set in Python we create a class SimpleHashSet.

Inside the SimpleHashSet class we have a method __init__ to initialize the Hash Set, a method hash_function for the hash function, and methods for the basic Hash Set operations: add, contains, and remove.

We also create a method print_set to better see how the Hash Set looks like.

### Example

class SimpleHashSet:
def __init__(self, size=100):
self.size = size
self.buckets = [[] for _ in range(size)]  # A list of buckets, each is a list (to handle collisions)

def hash_function(self, value):
# Simple hash function: sum of character codes modulo the number of buckets
return sum(ord(char) for char in value) % self.size

def add(self, value):
# Add a value if it's not already present
index = self.hash_function(value)
bucket = self.buckets[index]
if value not in bucket:
bucket.append(value)

def contains(self, value):
# Check if a value exists in the set
index = self.hash_function(value)
bucket = self.buckets[index]
return value in bucket

def remove(self, value):
# Remove a value
index = self.hash_function(value)
bucket = self.buckets[index]
if value in bucket:
bucket.remove(value)

def print_set(self):
# Print all elements in the hash set
print("Hash Set Contents:")
for index, bucket in enumerate(self.buckets):
print(f"Bucket {index}: {bucket}")

Using the SimpleHashSet class we can create the same Hash Set as in the top of this page:

### Example

class SimpleHashSet:
def __init__(self, size=100):
self.size = size
self.buckets = [[] for _ in range(size)]  # A list of buckets, each is a list (to handle collisions)

def hash_function(self, value):
# Simple hash function: sum of character codes modulo the number of buckets
return sum(ord(char) for char in value) % self.size

def add(self, value):
# Add a value if it's not already present
index = self.hash_function(value)
bucket = self.buckets[index]
if value not in bucket:
bucket.append(value)

def contains(self, value):
# Check if a value exists in the set
index = self.hash_function(value)
bucket = self.buckets[index]
return value in bucket

def remove(self, value):
# Remove a value
index = self.hash_function(value)
bucket = self.buckets[index]
if value in bucket:
bucket.remove(value)

def print_set(self):
# Print all elements in the hash set
print("Hash Set Contents:")
for index, bucket in enumerate(self.buckets):
print(f"Bucket {index}: {bucket}")

# Creating the Hash Set from the simulation
hash_set = SimpleHashSet(size=10)

hash_set.add("Charlotte")
hash_set.add("Thomas")
hash_set.add("Jens")
hash_set.add("Peter")
hash_set.add("Lisa")
hash_set.add("Adele")
hash_set.add("Michaela")
hash_set.add("Bob")

hash_set.print_set()

print("\n'Peter' is in the set:",hash_set.contains('Peter'))
print("Removing 'Peter'")
hash_set.remove('Peter')
print("'Peter' is in the set:",hash_set.contains('Peter'))
print("'Adele' has hash code:",hash_set.hash_function('Adele'))
Run Example ยป

×

## Contact Sales

If you want to use W3Schools services as an educational institution, team or enterprise, send us an e-mail:
sales@w3schools.com

## Report Error

If you want to report an error, or if you want to make a suggestion, send us an e-mail:
help@w3schools.com

W3Schools is optimized for learning and training. Examples might be simplified to improve reading and learning. Tutorials, references, and examples are constantly reviewed to avoid errors, but we cannot warrant full correctness of all content. While using W3Schools, you agree to have read and accepted our terms of use, cookie and privacy policy.

Copyright 1999-2024 by Refsnes Data. All Rights Reserved. W3Schools is Powered by W3.CSS.