# What Is R-Tree Data Structure?

//

Larry Thompson

An R-Tree is a data structure that is used to index spatial data in a database. It is specifically designed for efficient retrieval of objects based on their spatial relationships, such as proximity or intersection. The R-Tree is widely used in applications that involve spatial indexing, including geographic information systems (GIS) and database management systems (DBMS).

## Structure of an R-Tree

The R-Tree is a balanced tree-like structure where each node represents a rectangular region. The root node represents the entire space being indexed, and the leaf nodes represent individual objects or groups of objects.

Each node in the R-Tree can have a variable number of entries, which consist of a rectangle and a pointer to either another internal node or a leaf node. The rectangles in each node are designed to enclose all the rectangles contained within its child nodes.

### Insertion and Splitting

When inserting an object into an R-Tree, the tree is traversed from the root to the appropriate leaf node based on the object’s bounding rectangle. If the leaf node has enough space to accommodate the new object, it is simply added. However, if the leaf node is full, it needs to be split.

The splitting process involves selecting two seeds from the set of rectangles in the overflowing leaf node. These seeds are chosen based on their area enlargement if they were added individually. The remaining rectangles are then assigned to one of the two seeds based on their area enlargement when added to each seed’s bounding rectangle.

• In quadratic split, two seeds are selected based on their pairwise area enlargement. The algorithm then assigns each remaining rectangle to one of these seeds based on their area enlargement when added to each seed’s bounding rectangle.

This process is repeated until all the rectangles are assigned to a seed.

• Splitting by Linear Split:
• In linear split, two seeds are chosen based on the minimum overlap cost of combining them. The algorithm then assigns each remaining rectangle to one of these seeds based on their overlap cost when added to each seed’s bounding rectangle. This process is repeated until all the rectangles are assigned to a seed.

### Searching in an R-Tree

The search operation in an R-Tree involves traversing the tree from the root node and finding all the objects that intersect with a given query region. It eliminates unnecessary checks by examining only those nodes whose bounding rectangles intersect with the query region.

The search process starts at the root node and recursively visits child nodes whose bounding rectangles intersect with the query region. Once a leaf node is reached, all objects within its bounding rectangles are checked for intersection with the query region, and relevant results are returned.