# Which Data Structure Is Used in Disjoint Set?


Heather Bennett


The disjoint set data structure is commonly used to solve problems involving partitioning a set of elements into disjoint subsets. It provides efficient operations for merging sets and finding the representative element of each set.

The data structure used in implementing a disjoint set is called a disjoint-set forest. Let’s explore how it works and the key components involved.

## Disjoint-Set Forest

A disjoint-set forest is a collection of trees, where each tree represents a separate disjoint subset. Each tree can be viewed as a set, with one element being designated as the representative element or the root of that tree. The representative element uniquely identifies its corresponding subset.
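As a rough illustration of this structure, the sketch below stores the forest as a dictionary of parent pointers. The names `parent` and `find`, the element labels, and the convention that a root points to itself are assumptions made for the example, not something the structure requires.

```python
# Hypothetical minimal forest: parent[x] == x marks x as a root.
parent = {x: x for x in "abcde"}

def find(x):
    """Follow parent pointers until reaching the root (the representative)."""
    while parent[x] != x:
        x = parent[x]
    return x

# Build two trees by hand: {a, b, c} rooted at "a", and {d, e} rooted at "d".
parent["b"] = "a"
parent["c"] = "b"
parent["e"] = "d"

same_set = find("c") == find("b")        # both walks end at root "a"
different_sets = find("c") != find("e")  # roots "a" and "d" differ
```

Two elements belong to the same subset exactly when their walks end at the same root.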

To implement the disjoint-set data structure efficiently, two techniques are typically combined: union by rank and path compression. Used together, they bring the amortized cost of each merge or find operation down to O(α(n)), where α is the inverse Ackermann function, which is effectively constant for any practical input size.

### Union by Rank

The union by rank technique ensures that when two sets are merged, the tree with a smaller rank is attached below the root of the tree with a larger rank. This approach helps to keep the trees balanced, preventing them from becoming skewed. The rank of each tree denotes an upper bound on its height.

When two trees have equal ranks, either root can be chosen as the new root, and its rank is increased by one. This is the only case in which a rank changes: attaching a lower-rank tree below a higher-rank root leaves the higher rank untouched.
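The merging rule can be sketched as follows. This is a minimal version under stated assumptions: the helper names `make_sets`, `find`, and `union` are hypothetical, elements are the integers `0..n-1`, roots store their own index, and `find` deliberately omits path compression so that only the rank logic is on display.

```python
def make_sets(n):
    """Create n singleton sets; every element starts as a root of rank 0."""
    parent = list(range(n))
    rank = [0] * n
    return parent, rank

def find(parent, x):
    """Plain find (no path compression) to keep the focus on ranks."""
    while parent[x] != x:
        x = parent[x]
    return x

def union(parent, rank, a, b):
    """Merge the sets containing a and b; return the root of the merged set."""
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return ra                  # already in the same set
    if rank[ra] < rank[rb]:
        ra, rb = rb, ra            # make ra the root with the larger rank
    parent[rb] = ra                # lower-rank tree goes below the other root
    if rank[ra] == rank[rb]:
        rank[ra] += 1              # rank grows only when the ranks were equal
    return ra

parent, rank = make_sets(5)
union(parent, rank, 0, 1)   # equal ranks: 0 becomes the root, rank[0] is now 1
union(parent, rank, 2, 3)   # equal ranks: 2 becomes the root, rank[2] is now 1
union(parent, rank, 0, 2)   # equal ranks again: rank[0] is now 2
union(parent, rank, 4, 0)   # rank 0 vs rank 2: 4 goes below 0, ranks unchanged
```

Choosing the first argument's root on a tie is an arbitrary choice in this sketch; the second would work equally well.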

### Path Compression

The path compression technique speeds up later find operations by restructuring the tree during each find. When finding the representative of an element, every element on the path from that element to the root is reattached directly to the root. This flattens the tree, so subsequent finds on any of those elements reach the root in a single step.
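One way to sketch path compression is a two-pass, iterative `find`. The function name and the list-based `parent` array, in which a root stores its own index, are assumptions for illustration; a recursive formulation is also common.

```python
def find(parent, x):
    """Return the root of x and point every node on the path at that root."""
    # First pass: walk up to locate the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: reattach each node on the path directly to the root.
    while parent[x] != root:
        next_x = parent[x]
        parent[x] = root
        x = next_x
    return root

# A deliberately skewed tree: 4 -> 3 -> 2 -> 1 -> 0 (0 is the root).
parent = [0, 0, 1, 2, 3]
root = find(parent, 4)
# After the call, every node on the path points straight at the root:
# parent is now [0, 0, 0, 0, 0]
```

The iterative form avoids hitting Python's recursion limit on very deep trees.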

## Implementation

There are multiple ways to implement a disjoint-set forest, but one of the most common approaches is using an array representation. In this representation, each index of the array corresponds to an element in the set, and the value stored there is the index of that element’s parent. A root is marked by storing its own index; a common variant stores a negative value at roots instead.

Here’s a simple example of the array representation, using the convention that a root stores its own index:

• Initial state: [0, 1, 2, 3, 4]
• After union(1, 3): [0, 3, 2, 3, 4]
• After union(2, 4): [0, 3, 4, 3, 4]

In this example:

• The initial state represents five disjoint subsets with each element being its own representative.
• After merging subsets containing elements ‘1’ and ‘3’, ‘3’ becomes their representative: index 1 now stores 3.
• Similarly, after merging subsets containing elements ‘2’ and ‘4’, ‘4’ becomes their representative: index 2 now stores 4.

This array representation allows us to perform efficient union and find operations by following the parent pointers until we reach a root node. The path compression technique can be applied during find operations to further optimize their performance.
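Putting the pieces together, an array-backed implementation might look like the sketch below. The class name `DisjointSet` and its method names are illustrative, roots are assumed to store their own index, and on a rank tie the second argument's root is kept as the new root (an arbitrary choice).

```python
class DisjointSet:
    """Array-backed disjoint-set forest with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))  # parent[i] == i means i is a root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if they were already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] > self.rank[rb]:
            ra, rb = rb, ra            # ensure rank[ra] <= rank[rb]
        self.parent[ra] = rb           # attach the lower-rank root below rb
        if self.rank[ra] == self.rank[rb]:
            self.rank[rb] += 1
        return True

ds = DisjointSet(5)
ds.union(1, 3)            # parent is now [0, 3, 2, 3, 4]
ds.union(2, 4)            # parent is now [0, 3, 4, 3, 4]
connected = ds.find(1) == ds.find(2)   # False: roots 3 and 4 differ
```

Having `union` return a boolean is a design choice of this sketch; it lets callers detect whether a merge actually happened.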

## Conclusion

The disjoint-set data structure is essential for solving problems that involve partitioning a set into disjoint subsets. The underlying data structure used in implementing this concept is called a disjoint-set forest. It combines union by rank and path compression techniques to provide efficient operations for merging sets and finding the representative element of each set.

By understanding the principles behind the disjoint-set data structure and its implementation details, you can effectively apply it to solve various problems involving set partitioning and connectivity.
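As one example of a connectivity problem, the sketch below counts the connected components of an undirected graph. The function name `count_components` and the edge list are hypothetical. To stay short, it skips union by rank and uses path halving, a lighter variant of path compression that points each visited node at its grandparent.

```python
def count_components(n, edges):
    """Count connected components among nodes 0..n-1 given undirected edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving, not full compression
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv       # merge the two components
            components -= 1
    return components

# Nodes 0-1-2 form one component, 3-4 another, and 5 is isolated.
edges = [(0, 1), (1, 2), (3, 4)]
total = count_components(6, edges)   # 3
```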