Clever Geek Handbook
📜 ⬆️ ⬇️

Disjoint set system

The system of disjoint sets ( English disjoint-set , or union – find data structure ) is a data structure that allows you to administer a set of elements, divided into disjoint subsets. Moreover, each subset is assigned its representative - an element of this subset. An abstract data structure is defined by many three operations:{Union,Find,MakeSet} {\ displaystyle \ {\ mathrm {Union}, \ mathrm {Find}, \ mathrm {MakeSet} \}} {\ displaystyle \ {\ mathrm {Union}, \ mathrm {Find}, \ mathrm {MakeSet} \}} .

It is used to store connected components in graphs , in particular, the Kraskal algorithm needs a similar data structure for effective implementation.

Content

Definition

Let beS {\ displaystyle S}   finite set divided into disjoint subsets ( classes)Xi {\ displaystyle X_ {i}}   :

S=X0∪Xone∪X2∪...∪Xk:Xi∩Xj=∅∀i,j∈{0,one,...,k},i≠j{\ displaystyle S = X_ {0} \ cup X_ {1} \ cup X_ {2} \ cup \ ldots \ cup X_ {k}: X_ {i} \ cap X_ {j} = \ varnothing \ quad \ forall i , j \ in \ lbrace 0,1, \ ldots, k \ rbrace, i \ neq j}   .

To each subsetXi {\ displaystyle X_ {i}}   a representative is appointedri∈Xi {\ displaystyle r_ {i} \ in X_ {i}}   . The corresponding system of disjoint sets supports the following operations:

  • MakeSet(x){\ displaystyle \ mathrm {MakeSet} (x)}   : creates for an elementx {\ displaystyle x}   new subset. Assigns the same element as a representative of the created subset.
  • Union(r,s){\ displaystyle \ mathrm {Union} (r, s)}   : combines both subsets belonging to representativesr {\ displaystyle r}   ands {\ displaystyle s}   , and appointsr {\ displaystyle r}   representative of a new subset.
  • Find(x){\ displaystyle \ mathrm {Find} (x)}   : defines forx∈S {\ displaystyle x \ in S}   the subset to which the element belongs and returns its representative.

Algorithmic implementation

The trivial implementation preserves the belonging of elements fromS {\ displaystyle S}   and representativesri {\ displaystyle r_ {i}}   in the index array . In practice, many trees are often used. This can significantly reduce the time required for the Find operation. In this case, the representative is written to the root of the tree, and the remaining elements of the class to the nodes below it.

  • Union(r,s){\ displaystyle \ mathrm {Union} (r, s)}   : hangs the root of a lower tree under the root of a higher tree. If at the same timer {\ displaystyle r}   becomes a descendants {\ displaystyle s}   , both nodes are swapped.
  • Find(x){\ displaystyle \ mathrm {Find} (x)}   : goes fromx {\ displaystyle x}   to the root of the tree and returns it (the root in this case is a representative).

Heuristics

To speed up Union and Find operations, the Union -By-Size , Union-By-Height , Random-Union, and path compression heuristics can be used.

In Union-By-Size heuristic during operationUnion(r,s) {\ displaystyle \ mathrm {Union} (r, s)}   the root of the smaller tree is hung under the root of the larger tree. Thanks to this approach, tree balancing is maintained. Depth of each subtreeT {\ displaystyle T}   cannot exceed the valuelog⁡|T| {\ displaystyle \ log \ left | T \ right |}   . When using this heuristic, the Find operation time in the worst case increases withO(log⁡n) {\ displaystyle O (\ log n)}   beforeO(n) {\ displaystyle O (n)}   . For effective implementation, it is proposed to keep the number of nodes in the tree at the root.

The Union-By-Height heuristic is similar to Union-By-Size , but uses tree height instead of size.

The Random-Union heuristic uses the fact that you can not spend extraO(n) {\ displaystyle O (n)}   memory to save the number of nodes in the tree: it is enough to choose a root randomly - such a solution gives a speed on random queries that is quite comparable with other implementations. Nevertheless, if there are many queries such as “combine a large set with a small one”, this heuristic improves the expectation (that is, the average operating time) by only two times, therefore it is not recommended to use it without the path compression heuristic.

Path compression heuristics used to speed up operationFind(x) {\ displaystyle \ mathrm {Find} (x)}   . With each new search, all the elements that are on the way from the root to the desired element are hung under the root of the tree. In this case, the Find operation will work on averageα(n) {\ displaystyle \ alpha (n)}   whereα {\ displaystyle \ alpha}   Is the inverse of the Ackerman function . This allows you to significantly speed up the work, asα {\ displaystyle \ alpha}   for all values ​​used in practice, takes a value less than 5.

Implementation Example

C ++ implementation:

  const int MAXN = 1000 ;

 int p [ MAXN ], rank [ MAXN ];

 void MakeSet ( int x ) 
 {
     p [ x ] = x ;
     rank [ x ] = 0 ;
 }

 int Find ( int x ) 
 {
     return ( x == p [ x ] ? x : p [ x ] = Find ( p [ x ]) );
 }

 void Union ( int x , int y ) 
 {
     if ( ( x = Find ( x )) == ( y = Find ( y )) )
         return
	
     if ( rank [ x ] < rank [ y ] )
         p [ x ] = y ;
     else {
         p [ y ] = x ;
         if ( rank [ x ] == rank [ y ] )
             ++ rank [ x ];
     }
 }

Free Pascal implementation:

  1 const MAX_N = 1000 ;
 2
 3 var Parent , Rank : array [ 1 .. MAX_N ] of LongInt ;
 four
 5 procedure swap ( var x , y : LongInt ) ;
 6 var tmp : LongInt ;
 7 begin
 8 tmp : = x ; 
 9 x : = y ; 
 10 y : = tmp ;
 11 end ;
 12
 13 procedure MakeSet ( x : LongInt ) ;
 14 begin
 15 Parent [ x ] : = x ;
 16 Rank [ x ] : = 0 ;
 17 end ;
 18
 19 function Find ( x : LongInt ) : LongInt ;
 20 begin
 21 if ( Parent [ x ] <> x ) then
 22 Parent [ x ] : = Find ( Parent [ x ] ) ;
 23 Exit ( Parent [ x ] ) ;
 24 end ;
 25
 26 procedure Union ( x , y : LongInt ) ;
 27 begin
 28 x : = Find ( x ) ;
 29 y : = Find ( y ) ;
 30 if ( x = y ) then exit () ;
 31 if ( Rank [ x ] < Rank [ y ] ) then swap ( x , y ) ;
 32  
 33 Parent [ y ] : = x ;
 34 if ( Rank [ x ] = Rank [ y ] ) then
 35 inc ( Rank [ x ] ) ;
 36 end ;

See also

  • Forest of disjoint sets

Literature

  • Galler, Bernard A., and Michael J. Fisher. “An improved equivalence algorithm.” // Communications of the ACM , 7.5 (1964): 301-303. (eng.)
  • Tarjan, Robert E., and Jan Van Leeuwen. “Worst-case analysis of set union algorithms.” // Journal of the ACM 31.2 (1984): 245–281. (eng.)
  • Thomas Kormen et al. Algorithms: Construction and Analysis = Introduction to Algorithms. - 2nd ed. - M .: "Williams" , 2006. - S. 1296. - ISBN 0-07-013151-1 .

Links

  • Union-Find / Kevin Wayne, Pearson-Addison Wesley
  • Chapter 22: Data Structures For Disjoint Sets / Introduction to Algorithms, Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest
  • Visualizer of the work of some data structures for disjoint sets / ITMO
  • Implementing disjoint sets in the C ++ library collection Boost , 2006
Source - https://ru.wikipedia.org/w/index.php?title=Disjoint_Set_system&oldid=101440906


More articles:

  • Kanchalan (river)
  • Malpighievs
  • Woodpile
  • Ryasne (Lviv)
  • Kunze, Otto
  • Turner Spreading
  • Arthur Henderson
  • hCalendar
  • List of Honored Masters of Sports of the USSR (towns)
  • Rogans

All articles

Clever Geek | 2019