Module 21. Trees
Learning Objectives
- Understand the trade-offs between different tree data structures in terms of performance and memory usage.
- Define trees and their key components (nodes, edges, root, leaves, branches, subtrees).
- Explain the different types of trees (binary trees, binary search trees, AVL trees, heaps).
- Implement basic tree operations (creating, traversing, searching, inserting, and deleting) using Python.
- Analyze the time and space complexity of tree algorithms.
- Apply trees to solve real-world problems, such as data structures for efficient searching and sorting.
1. Introduction
Trees are a fundamental data structure in computer science that can be used to represent hierarchical relationships between elements. Think of a tree as an upside-down family tree, where the root node is at the top and the leaves are at the bottom.
In Python, trees can be implemented using various techniques, including:
- Using nested lists to represent the hierarchical structure.
- Creating custom classes to define nodes and their relationships.
- Leveraging built-in data structures like dictionaries or collections.
This module will explore the concepts of trees in Python, focusing on their definition, types, operations, and applications.
2. Types of Trees
Binary Trees
- Binary search trees (BSTs): A binary tree where the left child of a node has a value less than the parent, and the right child has a value greater than the parent.
- AVL trees: A self-balancing binary search tree where the heights of the left and right subtrees of any node differ by at most one.
- Heaps: A complete binary tree where either all nodes are greater than or equal to their children (max heap), or all nodes are less than or equal to their children (min heap).
Other Tree Types
- Red-Black trees: a red–black tree is a self-balancing binary search treedata structure noted for fast storage and retrieval of ordered information.
- General trees: Trees where nodes can have an arbitrary number of children.
- N-ary trees: Trees where each node has at most N children.
- Trie: A tree-like data structure used for efficient string searching.
- Suffix tree: A trie that stores all suffixes of a string.
- Decision trees: Trees used in machine learning to make decisions based on a set of rules.
These are just a few examples of the many types of trees that exist. The choice of tree data structure depends on the specific requirements of the problem at hand.
3. Binary Trees
Binary trees are a specific type of tree where each node has at most two children. They are commonly used in computer science due to their simplicity and efficiency.
Key Concepts
- Node: A basic unit of a binary tree, containing a value and pointers to its left and right children.
- Root: The topmost node in a binary tree.
- Leaf: A node with no children.
- Internal node: A node with at least one child.
- Subtree: A connected subset of a binary tree that is itself a binary tree.
Types of Binary Trees
- Full binary tree: Every node has either 0 or 2 children.
- Complete binary tree: All levels are filled except possibly the last, and all nodes on the last level are as far left as possible.
- Perfect binary tree: A full binary tree where all leaves are at the same depth.
Traversal Algorithms
There are three common ways to traverse a binary tree:
- In-order traversal: Visit the left subtree, then the root, then the right subtree.
In in-order traversal, the left child is visited first, followed by the node itself, and then the right child. This can be visualized as Left - Root - Right.
Output:
In-Order Traversal: 4 2 5 1 3 |
- Pre-order traversal: Visit the root, then the left subtree, then the right subtree.
In pre-order traversal, the node is visited first, followed by its left child and then its right child. This can be visualized as Root - Left - Right.
Output:
Pre-Order Traversal: 1 2 4 5 3 |
- Post-order traversal: Visit the left subtree, then the right subtree, then the root.
In post-order traversal, the left child is visited first, then the right child, and finally the node itself. This can be visualized as Left - Right - Root.
Output:
Post-Order Traversal: 4 5 2 3 1 |
3.1 Binary Search Tree
A Binary Search Tree (or BST) is a data structure used in computer science for organizing and storing data in a sorted manner. Each node in a Binary Search Tree has at most two children, a left child and a right child, with the left child containing values less than the parent node and the right child containing values greater than the parent node. This hierarchical structure allows for efficient searching, insertion, and deletion operations on the data stored in the tree.
- Insertion in Binary Search Tree (BST)
Given a BST, the task is to insert a new node in this BST.
How to Insert a value in a Binary Search Tree:
A new key is always inserted at the leaf by maintaining the property of the binary search tree. We start searching for a key from the root until we hit a leaf node. Once a leaf node is found, the new node is added as a child of the leaf node. The below steps are followed while we try to insert a node into a binary search tree:
- Initilize the current node (say, currNode or node) with root node
- Compare the key with the current node.
- Move left if the key is less than or equal to the current node value.
- Move right if the key is greater than current node value.
- Repeat steps 2 and 3 until you reach a leaf node.
- Attach the new key as a left or right child based on the comparison with the leaf node’s value.
Insertion in Binary Search Tree using Recursion:
Below is the implementation of the insertion operation using recursion.
Output:
20 30 40 50 60 70 80 |
Time Complexity: The worst-case time complexity of insert operations is O(h) where h is the height of the Binary Search Tree.
In the worst case, we may have to travel from the root to the deepest leaf node. The height of a skewed tree may become n and the time complexity of insertion operation may become O(n).
Insertion in Binary Search Tree using Iterative approach:
Instead of using recursion, we can also implement the insertion operation iteratively using a while loop. Below is the implementation using a while loop.
Output:
20 30 40 50 60 70 80 |
The time complexity of inorder traversal is O(n), as each node is visited once.
The Auxiliary space is O(n), as we use a stack to store nodes for recursion.
Searching in Binary Search Tree:
The task is to search a node in this BST. For searching a value in BST, consider it as a sorted array. Now we can easily perform search operation in BST using Binary Search Algorithm. (Binary Search Algorithm: below is the step-by-step algorithm for Binary Search: Divide the search space into two halves by finding the middle index “mid”. Compare the middle element of the search space with the key. If the key is found at middle element, the process is terminated. If the key is not found at middle element, choose which half will be used as the next search space. If the key is smaller than the middle element, then the left side is used for next search. If the key is larger than the middle element, then the right side is used for next search. This process is continued until the key is found or the total search space is exhausted.)
Input: Root of the below BST
Output: True
Explanation: 8 is present in the BST as right child of root
Input: Root of the below BST
Output: False
Explanation: 14 is not present in the BST
Output:
19 is Not Found20 is Found |
Time complexity: O(h), where h is the height of the BST.
- Deletion in Binary Search Tree: the task is to delete a node in this BST, which can be broken down into 3 scenarios:
Case1: Delete a Leaf Node in BST
Case2: Delete a Node with Single Child in BST
Deleting a single child node is also simple in BST. Copy the child to the node and delete the node.
Case3: Delete a Node with Both Children in BST
Deleting a node with both children is not so simple. Here we have to delete the node is such a way, that the resulting tree follows the properties of a BST. The trick is to find the inorder successor of the node. Copy contents of the inorder successor to the node, and delete the inorder successor.
Output:
5 10 12 18 |
Time Complexity: O(h), where h is the height of the BST.
3.2 AVL Tree
An AVL tree defined as a self-balancing Binary Search Tree (BST) where the difference between heights of left and right subtrees for any node cannot be more than one. The difference between the heights of the left subtree and the right subtree for any node is known as the balance factor of the node. The AVL tree is named after its inventors, Georgy Adelson-Velsky and Evgenii Landis, who published it in their 1962 paper “An algorithm for the organization of information”.
The above tree is AVL because the differences between the heights of left and right subtrees for every node are less than or equal to 1.
Insertion in an AVL Tree
AVL Tree: AVL tree is a self-balancing Binary Search Tree (BST) where the difference between heights of left and right subtrees cannot be more than one for all nodes.
The above tree is AVL because the differences between the heights of left and right subtrees for every node are less than or equal to 1.
Deletion in an AVL Tree
To make sure that the given tree remains AVL after every deletion, we must augment the standard BST delete operation to perform some re-balancing. Following are two basic operations that can be performed to re-balance a BST without violating the BST property (keys(left) < key(root) < keys(right)).
- Left Rotation: When a node is added into the right subtree of the right subtree, if the tree gets out of balance, we do a single left rotation.
- Right Rotation: If a node is added to the left subtree of the left subtree, the AVL tree may get out of balance, we do a single right rotation.
- Left-Right Rotation: A left-right rotation is a combination in which first left rotation takes place after that right rotation executes.
- Right-Left Rotation: A right-left rotation is a combination in which first right rotation takes place after that left rotation executes.
Insertion at AVL tree:
Output:
Preorder traversal :30 20 10 25 40 50 |
Time Complexity: O(log(n)), For Insertion
Deletion at AVL tree:
Output:
Preorder traversal of the constructed AVL tree is9 1 0 -1 5 2 6 10 11 Preorder traversal after deletion of 101 0 -1 9 5 2 6 11 |
Time Complexity: The rotation operations (left and right rotate) take constant time as only few pointers are being changed there. Updating the height and getting the balance factor also take constant time. So the time complexity of AVL delete remains same as BST delete which is O(h) where h is height of the tree. Since AVL tree is balanced, the height is O(log n). So time complexity of AVL delete is O(log n).