Monday, January 9, 2012

A comparison on B-Tree, R-Tree & Quad Tree Indexing technologies


A comparison on B-Tree, R-Tree & Quad Tree Indexing technologies in the field of Spatial

          The following table illustrates a comparison on Indexing technologies B-Tree, R-Tree and Quad Tree in the field of Spatial. Going through the content, the reader should be in a position to understand that each of the indexing technologies have their own set of strengths and weaknesses and have limitations on the areas they play around. Clearly, B-Tree is ill suited in the field of Spatial, while it gels well, when it is applied on Linear data sets. The reader can take his / her perspective as to which of the two indexing technologies are better, i.e between R-Tree and Quad Tree. I'll leave that decision to the reader based on their need and requirement.



B-Tree
R-Tree
Quad Tree
Definition
In computer science, a B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time.
R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons.
A Quadtree is a tree data structure in which each internal node has exactly four children.
Description
The B-tree is a generalization of a binary search tree in that a node can have more than two children. The B-tree is optimized for systems that read and write large blocks of data. It is commonly used in databases and file system’s.
Similar to the B-tree, the R-tree is also a balanced search tree (so all leaf nodes are at the same height), organizes the data in pages, and is designed for storage on disk (as used in databases)
Quadtrees are most often used to partition a two dimensional space by recursively subdividing it into four quadrants or regions. The regions may be square or rectangular, or may have arbitrary shapes.
Support Spatial Data
The B-tree access method, indexes numeric and character data only.
You cannot use the B-tree access method to index spatial data.
Designed for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons
Quadtrees work on two dimensional space / regions that may be square, rectangular or any arbitrary shapes
Approximation
B-Tree is optimized for Linear data set and cannot handle spatial data and features. Hence, not applicable.
The approximation of geometries cannot be fine tuned. (Spatial uses the minimum bounding
rectangles.)
The approximation of geometries can be fine tuned by setting the tiling level and number of tiles.
Complexity
B-Tree is optimized for Linear data set and cannot handle spatial data and features. Hence, not applicable.
Index creation and tuning are easier.
Tuning is more complex, and setting the appropriate tuning parameter values can affect performance significantly.
Storage
B-Tree is optimized for Linear data set and cannot handle spatial data and features. Hence, not applicable.
Less storage is required
More storage is required.
Workload
B-Tree is optimized for Linear data set and cannot handle spatial data and features. Hence, not applicable.
If your application
workload includes nearest neighbor
queries, R-tree indexes are faster, and you can use the sdo_batch_size
keyword.
If your application
workload includes nearest neighbor
queries, quadtree indexes are slower, and you cannot use the sdo_batch_size
keyword.
Updates
B-Tree is optimized for Linear data set and cannot handle spatial data and features. Hence, not applicable.
Heavy update activity to
the spatial column may
decrease the R-tree index performance until the index is rebuilt.
Heavy update activity
does not affect the
performance of a Quadtree index.
Dimensions supported
B-Tree is optimized for Linear data set and cannot handle spatial data and features. Hence, not applicable.
You can index up to four dimensions.
You can index only two dimensions. If LRS (Linear Referencing System) data is indexed using a spatial Quadtree  index, only the first two dimensions are indexed; the measure dimension and its values are not indexed.
Recommended
B-Tree is optimized for Linear data set and cannot handle spatial data and features. Hence, not applicable.
An R-tree index is
recommended for indexing geodetic data if
ST_WITHIN queries will be used on it.
A quadtree index is not recommended for Indexing geodetic data if ST_WITHIN queries will be used on it.
Whole-Earth Model
B-Tree is optimized for Linear data set and cannot handle spatial data and features. Hence, not applicable.
An R-tree index is required for a whole-Earth index.
A quadtree index cannot be used for a whole-Earth index.



Reference 1: Wikipedia pages of B-Tree, R-Tree and Quad Tree