Non-root binary tree

A cladogram in the form of a non-root binary tree, representing the similarity and evolutionary history of the genera of actinobacteria .

A non-root binary tree is a non-root tree in which each vertex has either one or three neighbors.

Definitions

A free tree or a non-root tree is a connected, undirected graph without cycles . Vertices with one neighbor are called tree leaves , the remaining vertices are called internal nodes . The degree of a vertex is the number of its neighbors. In a tree with more than one arc, the leaves have a degree of unity. A non-oriented binary tree is a free tree in which all internal nodes have degree three.

In some applications, it makes sense to distinguish between subspecies of non-root binary trees — a planar embedding of a tree can be fixed by specifying the cyclic order of the edges at each vertex, translating the tree into a flat tree . In computer science, binary trees are often root oriented trees, if they are used as data structures , but when using non-root binary trees in hierarchical clustering and reconstruction of evolutionary trees , undirected trees are more common ^[1] .

In addition, trees can be distinguished in which all vertices have different labels, trees in which only leaves are marked, and trees in which nodes are not marked. A non-root binary tree with n leaves has n - 2 internal nodes, so labels can be taken from the interval from 1 to 2 n - 1 if you need to mark all nodes, or a set of numbers from 1 to n , if you want to mark only leaves ^[1] .

Related Structures

Root binary trees

A non-root binary tree T can be transformed into a root binary tree (that is, a root tree in which each non-leaf node has exactly two children) by selecting the root edge e of the tree T , placing a new root in the middle of the edge e and directing all the edges of the resulting subdivided tree from the root. Back - any root tree can be transformed into a non-root binary tree by removing the root arc, replacing the path between its two descendants with one undirected edge and removing the directions of the arcs in the graph. For this reason, there are exactly 2 n −3 times more root trees with n leaves compared to non-root binary trees with n leaves ^[1] .

Hierarchical clustering

Hierarchical clustering of a set of objects can be formalized as a maximum objects in which no two sets intersect (but some sets may be contained in others). That is, for two sets S and T in the family, either S and T do not overlap, or one set is contained in the other, and no set can be added to the family while preserving this property. If T is a non-root binary tree, it determines the hierarchical clustering of leaves of this tree — for each edge ( u , v ) of T there is a cluster consisting of leaves that are closer to u than to v , and these sets together with the empty set and the set of all leaves form a maximal non-intersecting family. In the opposite direction, from any family of disjoint sets over a set of n elements, it is possible to form a single non-root binary tree that has a node for any triple ( A , B , C ) disjoint sets from the family, covering together all the elements ^[2] .

Evolutionary Trees

According to simple forms of the theory of evolution , the history of life can be reduced to a phylogenetic tree , in which each node describes a genus, the leaves represent the existing species, and the edges represent the parent-offspring relationship between the genera. This tree has a natural orientation of the edges from the parent species to the descendants, and the root is the common ancestor of the species, so this tree is the root tree. However, some methods of recovering binary trees can restore only the nodes and edges of this tree, but not their orientation.

For example, cladistic methods , such as the , use as a data a set of binary properties describing the characteristics of these genera. These methods look for a tree with the given species as leaves, in which the inner nodes are also marked with some signs, and try to minimize the number of cases in which the sign is present on only one vertex of the edge in the tree. Ideally, each attribute can have only one edge with such a case. Changing the root of a tree does not change the number of edges with different characteristics, so methods based on the principles of greatest economy cannot determine the position of the root of a tree and create a non-root tree, often a non-root binary tree ^[3] .

Non-root binary trees can also be formed by reconstructing evolutionary trees based on data definitions for every four genera of leaves, a non-root binary tree describing the evolution of these four species, and methods that use to measure the distance between trees ^[4] .

Decomposition on the branch

Non-root binary trees are also used to determine the decomposition of a graph on a branch of a graph by forming a non-root binary tree, the leaves of which represent the edges of a given graph. That is, decomposition of a graph on a branch can be considered as hierarchical clustering of edges of a graph. Decomposition of a graph into branches and the corresponding numerical value, the branch width, are closely related to the tree width and form the basis for constructing efficient algorithms for dynamic programming on graphs ^[5] .

Enumeration

Due to the use of non-root binary trees in hierarchical clustering, the most natural task of enumerating graphs on non-root binary trees is counting the number of trees with n marked leaves and unlabeled internal nodes. A non-root binary tree with n marked leaves can be formed by connecting the nth leaf to a new node in the middle of any edge of a root binary tree with n - 1 marked leaves. There are 2 n - 5 edges to which the n- th node can be added. Thus, the number of trees with n leaves is greater than the number of trees with n - 1 leaves 2 n - 5 times, so the number of trees with n marked leaves is equal to double factorial

(2n-5)!!={\frac {(2n-4)!}{(n-2)!2^{n-2}}}.

{\ displaystyle (2n-5) !! = {\ frac {(2n-4)!} {(n-2)! 2 ^ {n-2}}}.}

^[6] .

The number of trees with 2, 3, 4, 5, ... marked leaves

1, 1, 3, 15, 105, 945, 10395, 135135, 2027025, 34459425, ... (sequence A001147 in OEIS ).

Alternative names

Non-root binary trees are also called free binary trees ^[7] , cubic trees ^[8] , ternary trees ^[5], and non-root ternary trees ^[9] . However, the name “free binary tree” is also used for non-rooted trees, which can have nodes of degree two ^[10] and for root binary trees with non-oriented descendants ^[11] , and the name “ternary tree” is more often used in the sense of .

Notes

↑ ¹ ² ³ Furnas, 1984 .
↑ See, for example, Epstein's article ( Eppstein 2009 ) on the correspondence between clustering and trees, but the article uses root binary trees, not non-root trees, and therefore includes an arbitrary choice of the root node.
↑ Hendy, Penny, 1989 .
↑ St. John, Warnow, Moret, Vawterd, 2003 .
↑ ¹ ² Robertson, Seymour, 1991 .
↑ Balding, Bishop, Cannings, 2007 .
↑ Czumaj, Gibbons, 1996 .
↑ Exoo, 1996 .
↑ Cilibrasi, Vitanyi, 2006 .
↑ Harary, Palmer, Robinson, 1992 .
↑ Przytycka, Larmore, 1994 .

Literature

DJ Balding, Martin J. Bishop, Christopher Cannings. Handbook of Statistical Genetics. - 3rd. - Wiley-Interscience, 2007. - Vol. 1. - P. 502. - ISBN 978-0-470-05830-5 .
Rudi Cilibrasi, Paul MB Vitanyi. A new quartet tree heuristic for hierarchical clustering. - 2006.
Artur Czumaj, Alan Gibbons. Guthrie's problem: new equivalences and rapid reductions // Theoretical Computer Science. - 1996. - V. 154 , no. 1 . - P. 3–22 . - DOI : 10.1016 / 0304-3975 (95) 00126-3 .
David Eppstein. Square summing up a tree: Sum of subtree clustering and hyperbolic pants decomposition // ACM Transactions on Algorithms. - 2009. - Vol. 5 , no. 3 - p . 1–24 . - DOI : 10.1145 / 1541885.1541890 . - arXiv : cs.CG/0604034 .
Geoffrey Exoo. A simple method for constructing small cubic graphs of girths 14, 15, and 16 // Electronic Journal of Combinatorics. - 1996. - V. 3 , no. 1 .
George W. Furnas. The binary of the unordered trees // Journal of Classification. - 1984. - V. 1 , no. 1 . - p . 187–233 . - DOI : 10.1007 / BF01890123 .
Frank Harary , EM Palmer, RW Robinson. Counting free binary trees admitting a given height // Journal of Combinatorics, Information, and System Sciences. - 1992. - T. 17 . - pp . 175–181 .
Michael D. Hendy, David Penny. Framework for the quantitative study of evolutionary trees // Systematic Biology. - 1989. - V. 38 , no. 4 - p . 297–309 . - DOI : 10.2307 / 2992396 .
Teresa M. Przytycka, Lawrence L. Larmore. The optimal alphabetic tree problem revisited // Proc. 21st International Colloquium on Automata, Languages and Programming (ICALP '94). - Springer-Verlag, 1994. - T. 820. - p. 251–262. - (Lecture Notes in Computer Science). - DOI : 10.1007 / 3-540-58201-0_73 .
Neil Robertson, Paul D. Seymour. Graph minors. X. Obstructions to tree-decomposition // Journal of Combinatorial Theory. - 1991. - V. 52 , no. 2 - p . 153–190 . - DOI : 10.1016 / 0095-8956 (91) 90061-N .
Katherine St. John, Tandy Warnow, Bernard ME Moret, Lisa Vawterd. Performance study of phylogenetic methods: (unweighted) quartet methods and neighbor-joining // Journal of Algorithms. - 2003. - V. 48 , no. 1 . - p . 173–193 . - DOI : 10.1016 / S0196-6774 (03) 00049-X .

[_9fa01cfc665b9b32-1] ¹ ² ³ Furnas, 1984 .

[2] See, for example, Epstein's article ( Eppstein 2009 ) on the correspondence between clustering and trees, but the article uses root binary trees, not non-root trees, and therefore includes an arbitrary choice of the root node.

[_ba3beb1374ac99e8-3] Hendy, Penny, 1989 .

[_67b4943072303968-4] St. John, Warnow, Moret, Vawterd, 2003 .

[_316ca60a66839625-5] ¹ ² Robertson, Seymour, 1991 .

[_91f3e4e0928a7c45-6] Balding, Bishop, Cannings, 2007 .

[_a7938e1e43d09dc6-7] Czumaj, Gibbons, 1996 .

[_c0a082e570f9d1ad-8] Exoo, 1996 .

[_a948c6dfee8c0c7d-9] Cilibrasi, Vitanyi, 2006 .

[_74fa993c08907152-10] Harary, Palmer, Robinson, 1992 .

[_8cba373503d3b5b1-11] Przytycka, Larmore, 1994 .