MCS021
Asymptotic Notations
Asymptotic notations are the mathematical notations used to describe the running time of an algorithm when the input tends towards a particular value or a limiting value.
Asymptotic notations are used to express the fastest and slowest possible running times of an algorithm. These are also referred to as the ‘best case’ and ‘worst case’ scenarios respectively.
In asymptotic notation, we derive the complexity with respect to the size of the input (for example, in terms of n).
These notations are important because, without actually computing the cost of running the algorithm, we can estimate its complexity.
For example: In bubble sort, when the input array is already sorted, the time taken by the algorithm is linear i.e. the best case.
But, when the input array is in reverse condition, the algorithm takes the maximum time (quadratic) to sort the elements i.e. the worst case.
When the input array is neither sorted nor in reverse order, then it takes average time. These durations are denoted using asymptotic notations.
There are mainly three asymptotic notations:
-Big-O notation
-Omega notation
-Theta notation
1. Big-O notation: Big-O is the formal method of expressing the upper bound of an algorithm’s running time. It is a measure of the longest amount of time the algorithm can take. The function f(n) = O(g(n)) [read as “f of n is big-oh of g of n”] if and only if there exist positive constants c and n0 such that
f(n) ≤ c·g(n) for all n ≥ n0
Hence, the function g(n) is an upper bound for the function f(n), as g(n) grows at least as fast as f(n).
For Example:
1. 3n+2=O(n) as 3n+2≤4n for all n≥2
2. 3n+3=O(n) as 3n+3≤4n for all n≥3
Hence, the complexity of f(n) can be represented as O(g(n))
2. Omega (Ω) Notation: The function f(n) = Ω(g(n)) [read as “f of n is omega of g of n”] if and only if there exist positive constants c and n0 such that
f(n) ≥ c·g(n) for all n ≥ n0
For Example:
f(n) = 8n^2 + 2n - 3 ≥ 8n^2 - 3
= 7n^2 + (n^2 - 3) ≥ 7n^2 for all n ≥ 2, with g(n) = n^2
Thus, c = 7 and n0 = 2
Hence, the complexity of f (n) can be represented as Ω (g (n))
3. Theta (θ): The function f(n) = θ(g(n)) [read as “f of n is theta of g of n”] if and only if there exist positive constants k1, k2 and n0 such that
k1·g(n) ≤ f(n) ≤ k2·g(n) for all n ≥ n0
For Example:
3n+2 = θ(n) as 3n+2 ≥ 3n and 3n+2 ≤ 4n, for all n ≥ 2
k1=3,k2=4, and n0=2
Hence, the complexity of f (n) can be represented as θ (g(n)).
The Theta notation is more precise than both the Big-O and Omega notations. The function f(n) = θ(g(n)) if and only if g(n) is both an upper and a lower bound of f(n).
Time Complexity vs. Space Complexity
-Time complexity calculates the time required; space complexity estimates the memory space required.
-Time is counted for all statements; memory space is counted for all variables, inputs, and outputs.
-Time complexity is primarily determined by the size of the input data; space complexity is primarily determined by the auxiliary variable size.
-Time complexity is more crucial in terms of solution optimization; space complexity is more essential in terms of solution optimization.
Sparse Matrices
A matrix is sparse if most of its elements are 0. Another definition: a matrix with at most 1/3 of its elements non-zero (roughly 30% of m x n) is known as a sparse matrix.
We store matrices in computer memory to perform operations on them efficiently. If a matrix is sparse, the operations themselves may still be efficient, but the matrix takes up a large amount of memory, and the space used for the zero elements serves no purpose. So we use some other kind of structure to store sparse matrices.
Suppose we have a sparse matrix like below −
As we can see, there are 8 non-zero elements and 28 zero values. This matrix takes 6*6 = 36 memory spaces. If the size of the matrix were larger, the wastage of space would increase.
We can store the information about the sparse matrices using table structure. Suppose we will create a table called X like below −
Here the first column holds the row number, the second one holds the column number, and the third one holds the data present at M[row, col]. Each row of this table is called a triplet, as it has three parameters. The first triplet holds the size information of the matrix: Row = 6 and Column = 6 indicate that the matrix M is a 6 x 6 matrix, and the value field denotes the number of non-zero elements present in the matrix.
This table takes 9 * 3 = 27 units of space, so where is the gain? Well, if the matrix size is 8*8 = 64 but only 8 elements are present, the table X will still take only 27 units of space. There are always 3 fixed columns; only the number of rows varies with the number of non-zero elements. So if there are T non-zero elements, the space complexity of the table is O(3*T) = O(T), whereas for the full matrix it is O(m x n).
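The triplet table above can be built with a short C routine. This is a minimal sketch: the fixed 6 x 6 size, the `to_triplet` name, and the `MAXT` capacity are illustrative choices, not part of the notes.

```c
#include <stdio.h>

#define ROWS 6
#define COLS 6
#define MAXT 40

/* Build the triplet table for a sparse matrix:
   row 0 holds (rows, cols, count of non-zero elements),
   each later row holds (row, col, value) of one non-zero element. */
int to_triplet(int m[ROWS][COLS], int t[MAXT][3]) {
    int k = 1;                       /* slot 0 is reserved for the header triplet */
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            if (m[i][j] != 0) {
                t[k][0] = i; t[k][1] = j; t[k][2] = m[i][j];
                k++;
            }
    t[0][0] = ROWS; t[0][1] = COLS; t[0][2] = k - 1;
    return k;                        /* total triplets, including the header */
}
```

Only the non-zero elements are copied, so the table grows with T, not with m x n.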
Write an algorithm to add two polynomials
Let p and q be the two polynomials represented by the linked list.
1. while p and q are not null, repeat step 2.
2. If the powers of the two terms are equal,
then if the terms do not cancel, insert the sum of the terms into the sum polynomial
Advance p
Advance q
Else if the power of the first polynomial> power of second
Then insert the term from first polynomial into sum polynomial
Advance p
Else insert the term from second polynomial into sum polynomial
Advance q
3. copy the remaining terms from the non empty polynomial into the
sum polynomial.
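The steps above can be sketched in C with a linked list of terms kept in decreasing order of exponent. The `term`, `make_term`, and `add_poly` names are illustrative assumptions:

```c
#include <stdlib.h>

/* One term of a polynomial: coefficient, exponent, next term.
   Lists are kept in decreasing order of exponent, as the algorithm assumes. */
struct term {
    int coef, exp;
    struct term *next;
};

struct term *make_term(int coef, int exp, struct term *next) {
    struct term *t = malloc(sizeof *t);
    t->coef = coef; t->exp = exp; t->next = next;
    return t;
}

/* Add two polynomials p and q, building the sum polynomial. */
struct term *add_poly(struct term *p, struct term *q) {
    struct term head = {0, 0, NULL}, *tail = &head;
    while (p && q) {
        if (p->exp == q->exp) {                 /* equal powers */
            int c = p->coef + q->coef;
            if (c != 0)                         /* skip terms that cancel */
                tail = tail->next = make_term(c, p->exp, NULL);
            p = p->next; q = q->next;           /* advance p and q */
        } else if (p->exp > q->exp) {
            tail = tail->next = make_term(p->coef, p->exp, NULL);
            p = p->next;                        /* advance p */
        } else {
            tail = tail->next = make_term(q->coef, q->exp, NULL);
            q = q->next;                        /* advance q */
        }
    }
    /* copy remaining terms from the non-empty polynomial */
    for (; p; p = p->next) tail = tail->next = make_term(p->coef, p->exp, NULL);
    for (; q; q = q->next) tail = tail->next = make_term(q->coef, q->exp, NULL);
    return head.next;
}
```

For example, (3x^2 + 2) + (x^2 + 4x) yields 4x^2 + 4x + 2.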
Linked List
-Linked List can be defined as a collection of objects called nodes that are randomly stored in the memory.
-A node contains two fields i.e. data stored at that particular address and the pointer which contains the address of the next node in the memory.
-The last node of the list contains pointer to the null.
Uses of Linked List
-The list is not required to be contiguously present in the memory. A node can reside anywhere in the memory, and the nodes are linked together to make a list. This achieves optimized utilization of space.
-The list size is limited only by the memory size and doesn’t need to be declared in advance.
-An empty node cannot be present in the linked list.
-We can store values of primitive types or objects in the singly linked list.
Why use linked list over array?
Till now, we were using the array data structure to organize the group of elements that are to be stored individually in the memory. However, arrays have several limitations which must be known in order to decide which data structure to use throughout the program.
Array contains following limitations:
-The size of the array must be known in advance, before using it in the program.
-Increasing the size of the array is a time-consuming process. It is almost impossible to expand the size of the array at run time.
-All the elements in the array need to be stored contiguously in memory. Inserting an element into the array requires shifting all the elements after it.
A linked list is a data structure that can overcome all the limitations of an array. Using a linked list is useful because,
-It allocates the memory dynamically. All the nodes of linked list are non-contiguously stored in the memory and linked together with the help of pointers.
-Sizing is no longer a problem since we do not need to define its size at the time of declaration. List grows as per the program’s demand and limited to the available memory space.
Singly Linked list
It is the most commonly used linked list in programs. If we are simply talking about a linked list, we usually mean a singly linked list. Each node of a singly linked list contains two parts: one is the data part, and the other is the address part, which contains the address of the next (successor) node. The address part in a node is also known as a pointer.
Suppose we have three nodes, and the addresses of these three nodes are 100, 200 and 300 respectively. The representation of three nodes as a linked list is shown in the below figure:
We can observe in the above figure that there are three different nodes having address 100, 200 and 300 respectively. The first node contains the address of the next node, i.e., 200, the second node contains the address of the last node, i.e., 300, and the third node contains the NULL value in its address part as it does not point to any node. The pointer that holds the address of the initial node is known as a head pointer.
The linked list, which is shown in the above diagram, is known as a singly linked list as it contains only a single link. In this list, only forward traversal is possible; we cannot traverse in the backward direction as it has only one link in the list.
Representation of the node in a singly linked list
struct node
{
    int data;
    struct node *next;
};
In the above representation, we have defined a user-defined structure named a node containing two members, the first one is data of integer type, and the other one is the pointer (next) of the node type.
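Using the node structure above, a minimal sketch of building and traversing a singly linked list might look like this (the `push_front` and `length` helper names are assumptions for illustration):

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Insert a new node at the front of the list and return the new head. */
struct node *push_front(struct node *head, int value) {
    struct node *n = malloc(sizeof *n);
    n->data = value;
    n->next = head;       /* the new node points to the old head */
    return n;
}

/* Count the nodes by following next pointers until NULL. */
int length(struct node *head) {
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Note that the nodes need not be contiguous in memory; only the next pointers link them.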
Doubly linked list
As the name suggests, the doubly linked list contains two pointers. We can define the doubly linked list as a linear data structure in which each node has three parts: one data part and two address parts. In other words, a doubly linked list is a list in which every node holds one data part, a pointer to its previous node, and a pointer to its next node.
Suppose we have three nodes, and the address of these nodes are 100, 200 and 300, respectively. The representation of these nodes in a doubly-linked list is shown below:
As we can observe in the above figure, a node in a doubly-linked list has two address parts; one part stores the address of the next node, while the other stores the previous node’s address. The initial node in the doubly linked list has the NULL value in the address part that would otherwise hold the address of the previous node.
Representation of the node in a doubly linked list
struct node
{
    int data;
    struct node *next;
    struct node *prev;
};
In the above representation, we have defined a user-defined structure named a node with three members, one is data of integer type, and the other two are the pointers, i.e., next and prev of the node type. The next pointer variable holds the address of the next node, and the prev pointer holds the address of the previous node. The type of both the pointers, i.e., next and prev is struct node as both the pointers are storing the address of the node of the struct node type.
Circular linked list
A circular linked list is a variation of a singly linked list. The only difference between a singly linked list and a circular linked list is that in a singly linked list the last node does not point to any node, so its link part contains a NULL value; in a circular linked list, the last node connects back to the first node, so the link part of the last node holds the first node’s address. The circular linked list has no natural starting and ending node: traversal can begin at any node and continue around the circle until it returns to that node. The diagrammatic representation of the circular linked list is shown below:
struct node
{
    int data;
    struct node *next;
};
A circular linked list is a sequence of elements in which each node has a link to the next node, and the last node is having a link to the first node.
Algorithm for Implementing Stack using Arrays:
Algorithm for PUSH() operation in Stack using Array:
Step 1: Start
Step 2: Declare Stack[MAX]; //Maximum size of Stack
Step 3: Check if the stack is full or not by comparing top with (MAX-1)
If the stack is full, Then print “Stack Overflow” i.e, stack is full and cannot be pushed with another element
Step 4: Else, the stack is not full
Increment top by 1 and Set, a[top] = x
which pushes the element x into the address pointed by top.
// The element x is stored in a[top]
Step 5: Stop
Algorithm for POP() operation in Stack using Array:
Step 1: Start
Step 2: Declare Stack[MAX]
Step 3: Push the elements into the stack
Step 4: Check if the stack is empty or not by comparing top with base of array i.e 0
If top is less than 0, then stack is empty, print “Stack Underflow”
Step 5: Else, If top is greater than zero the stack is not empty,
then store the value pointed by top in a variable x=a[top] and decrement top by 1. The popped element is x.
Algorithm for PEEK() operation in Stack using Arrays:
Step 1: Start
Step 2: Declare Stack[MAX]
Step 3: Push the elements into the stack
Step 4: Print the value stored in the stack pointed by top.
Step 5: Stop
In the above algorithm,
We first define a stack of max size
PUSH(): First, we check if the stack is full; if top is equal to (MAX-1), the stack is full and no more elements can be inserted, so we print overflow. Otherwise, we increment the top variable and store the value at the index pointed to by top.
POP(): First, we check if the stack is empty, if the top variable has a value less than 0, it means the stack is empty. So, we can’t delete any more elements and print underflow condition. Otherwise, we decrement the top variable.
PEEK(): The PEEK() function returns the topmost value stored in stack, it’s done by returning the element pointed by the top variable.
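The PUSH(), POP(), and PEEK() steps above can be sketched in C roughly as follows (a minimal sketch; the fixed `MAX` size and the return-code convention for overflow/underflow are illustrative choices):

```c
#define MAX 100

int stack[MAX];
int top = -1;                          /* -1 means the stack is empty */

/* PUSH: return 0 on overflow, otherwise store x at ++top. */
int push(int x) {
    if (top == MAX - 1) return 0;      /* stack overflow */
    stack[++top] = x;
    return 1;
}

/* POP: return 0 on underflow, otherwise copy stack[top] out and decrement top. */
int pop(int *x) {
    if (top < 0) return 0;             /* stack underflow */
    *x = stack[top--];
    return 1;
}

/* PEEK: read the topmost value without removing it. */
int peek(int *x) {
    if (top < 0) return 0;
    *x = stack[top];
    return 1;
}
```

The return codes stand in for the “Stack Overflow” / “Stack Underflow” messages in the algorithm.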
What is a Circular Queue?
A circular queue is similar to a linear queue as it is also based on the FIFO (First In First Out) principle except that the last position is connected to the first position in a circular queue that forms a circle. It is also known as a Ring Buffer.
Operations on Circular Queue
The following are the operations that can be performed on a circular queue:
Front: It is used to get the front element from the Queue.
Rear: It is used to get the rear element from the Queue.
enQueue(value): This function is used to insert the new value in the Queue. The new element is always inserted from the rear end.
deQueue(): This function deletes an element from the Queue. The deletion in a Queue always takes place from the front end.
Applications of Circular Queue
The circular Queue can be used in the following scenarios:
-Memory management: The circular queue provides better memory management. As we have already seen, in a linear queue the memory is not managed very efficiently. But in the case of a circular queue, the memory is managed efficiently by placing new elements in locations that would otherwise remain unused.
-CPU Scheduling: The operating system also uses the circular queue to insert the processes and then execute them.
-Traffic system: In a computer-controlled traffic system, the traffic light is one of the best examples of a circular queue. Each light of the traffic signal comes ON one by one after a fixed interval of time: the red light is ON for one minute, then the yellow light for one minute, and then the green light. After the green light, the red light comes ON again.
Dequeue Operation
The steps of dequeue operation are given below:
First, we check whether the queue is empty or not. If the queue is empty, we cannot perform the dequeue operation.
When an element is deleted, the value of front is incremented by 1 (wrapping around to 0 when it passes the end of the array).
If only one element is left and it is deleted, then front and rear are reset to -1.
Algorithm to delete an element from the circular queue
Step 1: IF FRONT = -1
Write ” UNDERFLOW “
Goto Step 4
[END of IF]
Step 2: SET VAL = QUEUE[FRONT]
Step 3: IF FRONT = REAR
SET FRONT = REAR = -1
ELSE
IF FRONT = MAX -1
SET FRONT = 0
ELSE
SET FRONT = FRONT + 1
[END of IF]
[END OF IF]
Step 4: EXIT
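The deletion algorithm above, together with a matching insertion, can be sketched in C as follows (the `MAX` size of 5 and the return-code convention are illustrative assumptions):

```c
#define MAX 5

int queue[MAX];
int front = -1, rear = -1;             /* -1 means the queue is empty */

/* Insert at the rear, wrapping around the end of the array. */
int enqueue(int x) {
    if ((rear + 1) % MAX == front) return 0;   /* queue is full */
    if (front == -1) front = 0;                /* first element */
    rear = (rear + 1) % MAX;
    queue[rear] = x;
    return 1;
}

/* Mirrors the deletion algorithm: underflow check, read QUEUE[FRONT],
   then either reset the queue or advance FRONT with wrap-around. */
int dequeue(int *x) {
    if (front == -1) return 0;                 /* underflow */
    *x = queue[front];
    if (front == rear)                         /* last element removed */
        front = rear = -1;
    else
        front = (front + 1) % MAX;             /* wraps to 0 at MAX-1 */
    return 1;
}
```

The modulo arithmetic replaces the explicit IF FRONT = MAX - 1 check in the step-by-step algorithm.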
Binary Tree
A binary tree is a tree in which each node can have at most two children. Here, the name ‘binary’ itself suggests ‘two’; therefore, each node can have either 0, 1 or 2 children.
Properties of Binary Tree
-At each level i, the maximum number of nodes is 2^i.
-The height of the tree is defined as the longest path from the root node to a leaf node. The tree shown above has a height equal to 3, so the maximum number of nodes it can contain is (1+2+4+8) = 15. In general, the maximum number of nodes possible in a tree of height h is (2^0 + 2^1 + 2^2 + … + 2^h) = 2^(h+1) - 1.
-The minimum number of nodes possible in a tree of height h is equal to h+1.
-If the number of nodes is minimum, then the height of the tree would be maximum. Conversely, if the number of nodes is maximum, then the height of the tree would be minimum.
If there are ‘n’ number of nodes in the binary tree.
The minimum height can be computed as:
As we know that,
n = 2^(h+1) - 1
n + 1 = 2^(h+1)
Taking log on both the sides,
log2(n+1) = log2(2^(h+1))
log2(n+1) = h + 1
h = log2(n+1) - 1
The maximum height can be computed as:
As we know that,
n = h + 1
h = n - 1
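The two formulas can be checked with a small C sketch (the function names are illustrative; `min_height` finds the smallest h with 2^(h+1) - 1 ≥ n by trial instead of calling log2):

```c
/* Minimum height of a binary tree with n nodes: the smallest h
   such that a full tree of height h, with 2^(h+1) - 1 nodes, fits n. */
int min_height(int n) {
    int h = 0;
    while ((1 << (h + 1)) - 1 < n)   /* (1 << k) computes 2^k */
        h++;
    return h;
}

/* Maximum height: a chain of n nodes, one per level, has height n - 1. */
int max_height(int n) {
    return n - 1;
}
```

For n = 15 this gives a minimum height of 3 and a maximum height of 14, matching the derivation above.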
Recursive preorder traversal of a binary tree
In this traversal, we first process the data stored in the root node, then we process all nodes in the left subtree recursively, and finally process all the nodes in the right subtree recursively. Suppose we are using a function preorder(root) with root as an input parameter. Here are the simple steps of recursive implementation:
First, process the data stored in the root node i.e. process(root->value). We can do various data manipulation operations like printing the node data, arithmetic operations, logical operations, etc.
Then we recursively traverse and process each node in the
left subtree by calling the same function with root->left as input parameter i.e. preorder(root->left).
Then we recursively traverse and process each node in the right subtree by calling the same function with root->right as input parameter i.e. preorder(root->right).
Base case: recursion will terminate and backtrack when it reaches the bottommost end of the binary tree, i.e. root == NULL.
Pseudocode of recursive preorder traversal
class TreeNode
{
int data
TreeNode left
TreeNode right
}
void preorder(TreeNode root)
{
if (root == NULL)
return
process(root->data)
preorder(root->left)
preorder(root->right)
}
Application of Preorder Traversal
We use preorder traversal in a situation when we need to explore all the tree nodes before inspecting any leaves.
-To create a replica or copy of a tree.
-To get the prefix expression of an expression tree.
Recursive inorder traversal of a binary tree
In this traversal, we first process all nodes in the left subtree recursively, process the data stored in the root node, finally process all the nodes in the right subtree recursively. Suppose we are using a function inorder(root) with root as an input parameter. Here are the simple steps to recursive implementation:
Before processing the root node, we recursively traverse and process each node in the left subtree by calling the same function with root->left as input parameter i.e. inorder(root->left).
After processing each node in the left subtree, now we process data stored in the root node i.e. process(root->data).
Finally, we recursively traverse and process each node in the
right subtree by calling the same function with root->right as input parameter i.e. inorder(root->right).
Base case: recursion will terminate and backtrack when it reaches the end of the tree i.e. root == NULL.
Pseudocode of recursive Inorder
class TreeNode
{
int data
TreeNode left
TreeNode right
}
void inorder(TreeNode root)
{
if (root == NULL)
return
inorder(root->left)
process(root->data)
inorder(root->right)
}
Application of Inorder Traversal
It is commonly used in the processing of binary search trees. In BST, there is an order associated with each node: all values in the left subtree ≤ node value ≤ all values in the right subtree. So inorder traversal of BST generates the node’s data in increasing order.
Recursive postorder traversal of a binary tree
Suppose we are using a function postorder(root) with root as an input parameter. We first process all the nodes in the left subtree, then process all the nodes in the right subtree and at the end, we process the root node.
-We first recursively traverse and process each node in the left subtree by calling the same function with root->left as input parameter i.e. postorder(root->left).
-Then we recursively traverse and process each node in the right subtree by calling the same function with root->right as input parameter i.e. postorder(root->right).
-Finally, we process the data stored in the root node i.e. process(root->value).
-Base case: recursion will terminate and backtrack when it reaches the bottommost end of the tree i.e. root == NULL.
Pseudocode of recursive postorder
class TreeNode
{
int data
TreeNode left
TreeNode right
}
void postorder(TreeNode root)
{
if(root == NULL)
return
postorder(root->left)
postorder(root->right)
process(root)
}
Application of Postorder Traversal
We use postorder traversal when we need to explore all the leaf nodes before inspecting any internal nodes.
-Deleting the entire binary tree
-To get the postfix expression of an expression tree.
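The three recursive traversals can be sketched together in runnable C. Here each traversal “processes” a node by appending its value to an output array; the `make` helper and the array-based output are illustrative assumptions:

```c
#include <stdlib.h>

struct TreeNode {
    int data;
    struct TreeNode *left, *right;
};

struct TreeNode *make(int data, struct TreeNode *l, struct TreeNode *r) {
    struct TreeNode *n = malloc(sizeof *n);
    n->data = data; n->left = l; n->right = r;
    return n;
}

/* Each traversal appends visited values to out[] and returns the count. */
int preorder(struct TreeNode *root, int *out, int n) {
    if (root == NULL) return n;          /* base case */
    out[n++] = root->data;               /* root, then left, then right */
    n = preorder(root->left, out, n);
    return preorder(root->right, out, n);
}

int inorder(struct TreeNode *root, int *out, int n) {
    if (root == NULL) return n;
    n = inorder(root->left, out, n);     /* left, then root, then right */
    out[n++] = root->data;
    return inorder(root->right, out, n);
}

int postorder(struct TreeNode *root, int *out, int n) {
    if (root == NULL) return n;
    n = postorder(root->left, out, n);   /* left, then right, then root */
    n = postorder(root->right, out, n);
    out[n++] = root->data;
    return n;
}
```

For a BST with root 2, left child 1 and right child 3, the inorder traversal yields 1, 2, 3 in increasing order, as noted above.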
What is a Binary Search Tree(BST)?
A binary search tree follows some order to arrange the elements. In a Binary search tree, the value of left node must be smaller than the parent node, and the value of right node must be greater than the parent node. This rule is applied recursively to the left and right subtrees of the root.
Let’s understand the concept of Binary search tree with an example.
In the above figure, we can observe that the root node is 40, and all the nodes of the left subtree are smaller than the root node, and all the nodes of the right subtree are greater than the root node.
Similarly, we can see the left child of root node is greater than its left child and smaller than its right child. So, it also satisfies the property of binary search tree. Therefore, we can say that the tree in the above image is a binary search tree.
Suppose if we change the value of node 35 to 55 in the above tree, check whether the tree will be binary search tree or not.
In the modified tree, the root node is 40, and the node 55 lies in the left subtree of the root even though 55 is greater than 40. So the tree no longer satisfies the property of a binary search tree. Therefore, the modified tree is not a binary search tree.
Advantages of Binary search tree
Searching an element in the binary search tree is easy, as we always have a hint about which subtree contains the desired element.
As compared to array and linked lists, insertion and deletion operations are faster in BST.
Algorithm to search an element in Binary search tree
Search (root, item)
Step 1 – if (item = root → data) or (root = NULL)
return root
else if (item < root → data)
return Search(root → left, item)
else
return Search(root → right, item)
END if
Step 2 – END
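The search algorithm above translates almost directly into C (a minimal sketch using the node structure from the earlier sections):

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left, *right;
};

/* Recursive BST search following the algorithm above:
   returns the matching node, or NULL if the item is absent. */
struct node *search(struct node *root, int item) {
    if (root == NULL || item == root->data)
        return root;
    if (item < root->data)
        return search(root->left, item);    /* item can only be on the left */
    return search(root->right, item);       /* item can only be on the right */
}
```

Each comparison discards one whole subtree, which is where the speed of BST search comes from.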
File Organization
File organization ensures that records are available for processing. It is used to determine an efficient file organization for each base relation.
For example, if we want to retrieve employee records in alphabetical order of name, sorting the file by employee name is a good file organization. However, if we want to retrieve all employees whose marks are in a certain range, a file ordered by employee name would not be a good file organization.
Types of File Organization
There are three types of organizing the file:
1. Sequential access file organization
2. Direct access file organization
3. Indexed sequential access file organization
Sequential access file organization
Storing and sorting records in contiguous blocks within files on tape or disk is called sequential access file organization. In sequential access file organization, all records are stored in sequential order, arranged in the ascending or descending order of a key field. A sequential file search starts from the beginning of the file, and records can be added only at the end of the file. In a sequential file, it is not possible to add a record in the middle of the file without rewriting the file.
Advantages of sequential file
-It is simple to program and easy to design.
-A sequential file makes the best use of storage space.
Disadvantages of sequential file
-Searching a sequential file is a time-consuming process.
-It has high data redundancy.
-Random searching is not possible.
Direct access file organization
Direct access file is also known as random access or relative file organization.
In a direct access file, all records are stored on a direct access storage device (DASD), such as a hard disk. The records are placed randomly throughout the file. The records do not need to be in sequence because they are updated directly and rewritten back to the same location. This file organization is useful for immediate access to large amounts of information. It is used in accessing large databases. It is also called hashing.
Advantages of direct access file organization
-Direct access file helps in online transaction processing system (OLTP) like online railway reservation system.
-In direct access file, sorting of the records are not required.
-It accesses the desired records immediately.
-It updates several files quickly.
-It has better control over record allocation.
Disadvantages of direct access file organization
-Direct access file does not provide back up facility.
-It is expensive.
-It has less storage space as compared to sequential file.
Indexed sequential access file organization
Indexed sequential access file combines both sequential file and direct access file organization.
In an indexed sequential access file, records are stored on a direct access device such as a magnetic disk and ordered by a primary key. The file can have multiple keys; the key by which the records are ordered, which can be alphanumeric, is called the primary key. The data can be accessed either sequentially or randomly using the index. The index is stored in a file and read into memory when the file is opened.
Advantages of Indexed sequential access file organization
-In indexed sequential access file, sequential file and random file access is possible.
-It accesses the records very fast if the index table is properly organized.
-The records can be inserted in the middle of the file.
-It provides quick access for sequential and direct processing.
-It reduces the degree of the sequential search.
Disadvantages of Indexed sequential access file organization
-Indexed sequential access file requires unique keys and periodic reorganization.
-Indexed sequential access file takes longer time to search the index for the data access or retrieval.
-It requires more storage space.
-It is expensive because it requires special software.
-It is less efficient in the use of storage space as compared to other file organizations.
Quicksort
The Quicksort algorithm is based on the divide-and-conquer approach. Overall, the quicksort algorithm follows three main steps:
-Pick an element from the array as a pivot
-Partition the problem set by moving smaller elements to the left of the pivot and larger elements to its right
-Repeat the above steps on each partition
On average, as well as in the best case, the time complexity of the Quicksort algorithm is O(n log n). The worst-case time complexity is O(n^2) if we keep selecting a bad pivot; this can happen, for example, when the array is already sorted in ascending or descending order and the first or last element is always chosen as the pivot.
Quicksort is usually implemented as an unstable sort, with a best-case space complexity of O(log n) and a worst-case space complexity of O(n) for the recursion stack.
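The three steps can be sketched in C using the Lomuto partition scheme with the last element as the pivot (one of several common pivot choices, assumed here for simplicity):

```c
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition: smaller elements end up left of the pivot,
   larger elements right of it; returns the pivot's final index. */
int partition(int a[], int lo, int hi) {
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot)
            swap(&a[++i], &a[j]);
    swap(&a[i + 1], &a[hi]);
    return i + 1;
}

void quicksort(int a[], int lo, int hi) {
    if (lo < hi) {
        int p = partition(a, lo, hi);
        quicksort(a, lo, p - 1);        /* repeat on the left partition */
        quicksort(a, p + 1, hi);        /* repeat on the right partition */
    }
}
```

With this pivot choice, an already-sorted input produces the unbalanced partitions behind the O(n^2) worst case described above.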
Heapsort
Heapsort is a comparison-based sorting method based on the binary heap data structure. The binary heap structure allows the Heapsort algorithm to take advantage of the heap’s properties:
-Every node’s value must be greater (or smaller) than all values stored in its children
-It’s a complete tree, which means it has the smallest possible height
We can summarise Heapsort into four main steps:
-Build a min (or max) heap from the input array
-At this point, the smallest item is stored at the root of the heap. We’ll remove the element from the root node, and store the rightmost leaf in the root node.
-Heapify the root of the tree
-Repeat steps 2 and 3 while the size of the heap is greater than 1
Heapify is the process of arranging the nodes in the correct order so that they follow the heap property.
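The four steps above, in their max-heap form, can be sketched in C as follows (`heapify` here sifts a value down until the subtree rooted at that index satisfies the heap property):

```c
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Sift the value at index i down until the max-heap property holds
   for the subtree rooted at i; the heap occupies a[0..n-1]. */
void heapify(int a[], int n, int i) {
    int largest = i, l = 2 * i + 1, r = 2 * i + 2;
    if (l < n && a[l] > a[largest]) largest = l;
    if (r < n && a[r] > a[largest]) largest = r;
    if (largest != i) {
        swap(&a[i], &a[largest]);
        heapify(a, n, largest);
    }
}

void heapsort(int a[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)   /* step 1: build a max-heap */
        heapify(a, n, i);
    for (int i = n - 1; i > 0; i--) {      /* steps 2-4, repeated */
        swap(&a[0], &a[i]);                /* move the root (max) to the end */
        heapify(a, i, 0);                  /* re-heapify the shrunken heap */
    }
}
```

Using a max-heap and moving the maximum to the end yields an ascending sort, mirroring the min-heap description above.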
Differences between BFS and DFS
BFS vs DFS
-Full form BFS stands for Breadth First Search.
DFS stands for Depth First Search.
-Technique
It is a vertex-based technique to find the shortest path in a graph.
It is an edge-based technique because the vertices along the edge are explored first from the starting to the end node.
-Definition
BFS is a traversal technique in which all the nodes of the same level are explored first, and then we move to the next level.
DFS is also a traversal technique in which traversal is started from the root node and explore the nodes as far as possible until we reach the node that has no unvisited adjacent nodes.
-Data Structure
The queue data structure is used for the BFS traversal.
The stack data structure is used for the DFS traversal.
-Backtracking
BFS does not use the backtracking concept.
DFS uses backtracking to traverse all the unvisited nodes.
-Number of edges
BFS finds the shortest path having a minimum number of edges to traverse from the source to the destination vertex.
In DFS, a greater number of edges are required to traverse from the source vertex to the destination vertex.
-Optimality
BFS traversal is optimal for those vertices which are to be searched closer to the source vertex.
DFS traversal is optimal for those graphs in which solutions are away from the source vertex.
-Speed
BFS is slower than DFS.
DFS is faster than BFS.
-Suitability for decision tree
It is not suitable for the decision tree because it requires exploring all the neighboring nodes first.
It is suitable for the decision tree. Based on the decision, it explores all the paths. When the goal is found, it stops its traversal.
-Memory efficient
It is not memory efficient as it requires more memory than DFS.
It is memory efficient as it requires less memory than BFS.
Kruskal’s Algorithm
In this article, we will discuss Kruskal’s algorithm. Here, we will also see the complexity, working, example, and implementation of the Kruskal’s algorithm.
But before moving directly towards the algorithm, we should first understand the basic terms such as spanning tree and minimum spanning tree.
Spanning tree – A spanning tree is a subgraph of an undirected connected graph that includes all of the graph’s vertices and is itself a tree.
Minimum Spanning tree – A minimum spanning tree can be defined as the spanning tree in which the sum of the weights of the edges is minimum. The weight of the spanning tree is the sum of the weights given to the edges of the spanning tree.
Kruskal’s Algorithm is used to find the minimum spanning tree for a connected weighted graph. The main target of the algorithm is to find the subset of edges by using which we can traverse every vertex of the graph. It follows the greedy approach that finds an optimum solution at every stage instead of focusing on a global optimum.
How does Kruskal’s algorithm work?
In Kruskal’s algorithm, we start from edges with the lowest weight and keep adding the edges until the goal is reached. The steps to implement Kruskal’s algorithm are listed as follows –
First, sort all the edges from low weight to high.
Now, take the edge with the lowest weight and add it to the spanning tree. If the edge to be added creates a cycle, then reject the edge.
Continue to add the edges until we reach all vertices, and a minimum spanning tree is created.
The applications of Kruskal’s algorithm are –
Kruskal’s algorithm can be used to layout electrical wiring among cities.
It can be used to lay down LAN connections.
Algorithm
Step 1: Create a forest F in such a way that every vertex of the graph is a separate tree.
Step 2: Create a set E that contains all the edges of the graph.
Step 3: Repeat Steps 4 and 5 while E is NOT EMPTY and F is not spanning
Step 4: Remove an edge from E with minimum weight
Step 5: IF the edge obtained in Step 4 connects two different trees, then add it to the forest F
(for combining two trees into one tree).
ELSE
Discard the edge
Step 6: END
Time Complexity
The time complexity of Kruskal’s algorithm is O(E log E), which is equivalent to O(E log V), where E is the number of edges and V is the number of vertices.
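The algorithm above can be sketched in C with a simple union-find structure to detect whether an edge connects two different trees. The fixed `parent` array size and the `kruskal` signature are illustrative assumptions; the function returns the total weight of the minimum spanning tree:

```c
#include <stdlib.h>

/* An edge (u, v) with a weight. */
struct edge { int u, v, w; };

static int parent[64];                  /* supports up to 64 vertices (assumed cap) */

/* Find the root of x's tree, with path compression. */
int find(int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

static int cmp(const void *a, const void *b) {
    return ((const struct edge *)a)->w - ((const struct edge *)b)->w;
}

/* Kruskal: sort edges by weight, then add each edge whose endpoints
   lie in different trees (skipping it would-be-cycle edges). */
int kruskal(struct edge *edges, int m, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)          /* Step 1: forest of single-vertex trees */
        parent[i] = i;
    qsort(edges, m, sizeof *edges, cmp); /* sort edges from low weight to high */
    for (int i = 0; i < m; i++) {
        int ru = find(edges[i].u), rv = find(edges[i].v);
        if (ru != rv) {                  /* connects two different trees */
            parent[ru] = rv;             /* combine the two trees into one */
            total += edges[i].w;
        }                                /* else: discard the edge (cycle) */
    }
    return total;
}
```

For a triangle with edge weights 1, 2 and 3, the two lightest edges are kept and the heaviest is discarded, giving an MST weight of 3.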
Red-Black tree is a self-balancing binary search tree in which each node contains an extra bit for denoting the color of the node, either red or black.
A red-black tree satisfies the following properties:
Red/Black Property: Every node is colored, either red or black.
Root Property: The root is black.
Leaf Property: Every leaf (NIL) is black.
Red Property: If a red node has children then, the children are always black.
Depth Property: For each node, any simple path from this node to any of its descendant leaf has the same black-depth (the number of black nodes).
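These properties can be checked mechanically. The sketch below uses an invented tuple representation of nodes; it computes the black-height of a valid subtree and rejects trees that violate the red or depth property:

```python
# Sketch of a red-black property checker (node representation is invented:
# a node is a (color, key, left, right) tuple; None stands for a black NIL leaf).
RED, BLACK = "red", "black"

def check(node):
    """Return the black-height of the subtree, or raise ValueError if the
    red property or the depth property is violated."""
    if node is None:                 # leaf property: NIL leaves count as black
        return 1
    color, _, left, right = node
    if color == RED:                 # red property: red nodes have black children
        for child in (left, right):
            if child is not None and child[0] == RED:
                raise ValueError("red node with red child")
    lh, rh = check(left), check(right)
    if lh != rh:                     # depth property: equal black-height everywhere
        raise ValueError("unequal black-heights")
    return lh + (1 if color == BLACK else 0)

def is_red_black(root):
    if root is not None and root[0] != BLACK:   # root property
        return False
    try:
        check(root)
        return True
    except ValueError:
        return False

# A small valid tree: black root 10 with red children 5 and 15.
tree = (BLACK, 10, (RED, 5, None, None), (RED, 15, None, None))
```

A red root, a red node with a red child, or paths with different numbers of black nodes would all make `is_red_black` return False.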
Algorithm to insert a node
Following steps are followed for inserting a new element into a red-black tree:
-Let y be the leaf (i.e., NIL) and x be the root of the tree.
-Check if the tree is empty (i.e., whether x is NIL). If yes, insert newNode as the root node and color it black.
-Else, repeat the following steps until a leaf (NIL) is reached.
Compare newKey with rootKey.
If newKey is greater than rootKey, traverse through the right subtree.
Else traverse through the left subtree.
-Assign the parent of the leaf as the parent of newNode.
-If leafKey is greater than newKey, make newNode the leftChild.
-Else, make newNode the rightChild.
-Assign NULL to both the leftChild and rightChild of newNode.
-Assign the RED color to newNode.
-Call InsertFix-algorithm to maintain the property of red-black tree if violated.
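The descent-and-color steps above (everything before InsertFix) can be sketched as follows. The class and function names are invented, and the InsertFix rebalancing pass is deliberately omitted, so this performs only the plain BST insertion and initial coloring:

```python
# Sketch of red-black insertion before rebalancing (InsertFix omitted).
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED                  # new nodes start out red
        self.left = self.right = self.parent = None

def insert(root, new_key):
    if root is None:                      # empty tree: newNode becomes the black root
        node = Node(new_key)
        node.color = BLACK
        return node
    x, parent = root, None
    while x is not None:                  # compare newKey with each key on the way down
        parent = x
        x = x.right if new_key > x.key else x.left
    node = Node(new_key)
    node.parent = parent                  # parent of the leaf becomes newNode's parent
    if parent.key > new_key:
        parent.left = node                # leafKey > newKey: newNode is leftChild
    else:
        parent.right = node               # else: newNode is rightChild
    # InsertFix(root, node) would run here to restore the properties if violated.
    return root

root = insert(None, 10)
root = insert(root, 5)
root = insert(root, 15)
```

After these three insertions the root 10 is black and its children 5 and 15 are red, so no fixup would be needed; deeper insertions can create red-red violations that InsertFix must repair.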
Red-Black Tree Applications
-To implement finite maps
-To implement Java packages: java.util.TreeMap and java.util.TreeSet
-To implement Standard Template Libraries (STL) in C++: multiset, map, multimap
-In Linux Kernel
Splay Tree
Splay tree is another variant of a binary search tree. In a splay tree, the most recently accessed element is placed at the root of the tree. A splay tree is defined as follows…
Splay Tree is a self-adjusting Binary Search Tree in which every operation on an element rearranges the tree so that the element is placed at the root position of the tree.
In a splay tree, every operation leaves the accessed element at the root of the tree. All the operations in a splay tree involve a common operation called “Splaying”.
Splaying an element is the process of bringing it to the root position by performing suitable rotation operations.
In a splay tree, splaying an element rearranges the elements in the tree so that the splayed element is placed at the root of the tree.
Splaying automatically brings more frequently used elements closer to the root of the tree, so that any operation on those elements is performed quickly.
Every operation on a splay tree performs the splaying operation. For example, the insertion operation first inserts the new element using the ordinary binary search tree insertion process, then splays the newly inserted element so that it is placed at the root of the tree. The search operation in a splay tree is simply a binary search for the element followed by splaying the found element so that it is placed at the root of the tree.
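A minimal sketch of splaying, assuming a plain recursive implementation with zig (single rotation), zig-zig, and zig-zag cases; the node class and the example keys are invented:

```python
# Hedged sketch of the splaying operation (recursive variant; names invented).
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(y):                      # lift y's left child above y
    x = y.left
    y.left, x.right = x.right, y
    return x

def rotate_left(x):                       # lift x's right child above x
    y = x.right
    x.right, y.left = y.left, x
    return y

def splay(root, key):
    """Bring the node with `key` (or the last node on its search path)
    to the root using zig / zig-zig / zig-zag rotations."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:           # zig-zig: splay deeper, rotate twice
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:         # zig-zag: rotate child the other way
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    else:                                 # mirror image of the cases above
        if root.right is None:
            return root
        if key > root.right.key:
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        elif key < root.right.key:
            root.right.left = splay(root.right.left, key)
            if root.right.left is not None:
                root.right = rotate_right(root.right)
        return root if root.right is None else rotate_left(root)

def search(root, key):                    # search = BST search + splay to the root
    return splay(root, key)

# Left-leaning chain 100-50-40-30-20; searching 20 splays it to the root.
root = Node(100, Node(50, Node(40, Node(30, Node(20)))))
root = search(root, 20)
```

After the search, 20 sits at the root and the chain has been roughly halved in depth, which is the source of the splay tree's amortized efficiency.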
B Tree
B Tree is a specialized m-way tree that is widely used for disk access. A B-Tree of order m can have at most m-1 keys and m children per node. One of the main reasons for using a B tree is its capability to store a large number of keys in a single node, keeping the height of the tree relatively small even for large key values.
A B tree of order m has all the properties of an m-way tree. In addition, it has the following properties.
Every node in a B-Tree contains at most m children.
Every node in a B-Tree except the root node and the leaf nodes contains at least ⌈m/2⌉ children.
The root node must have at least 2 children, unless it is a leaf.
All leaf nodes must be at the same level.
It is not necessary that all the nodes contain the same number of children, but each internal node other than the root must have at least ⌈m/2⌉ children.
Operations
Searching :
Searching in a B Tree is similar to searching in a binary search tree. For example, suppose we search for the key 49 in a B Tree whose search path passes the keys 78, then 40 and 56, then 45. The process looks like this:
-Compare 49 with the root key 78. Since 49 < 78, move to the left sub-tree.
-Since 40 < 49 < 56, traverse the right sub-tree of 40.
-Since 49 > 45, move to the right and compare again.
-Match found; return.
Searching in a B tree depends upon the height of the tree. The search algorithm takes O(log n) time to search any element in a B tree.
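The search walk above can be sketched as a multiway descent. The node class below is invented, and the sample tree mirrors the worked example (78 at the root; 40 and 56 on the next level; then 45; 49 in a leaf):

```python
# Sketch of B-tree search (node class and sample tree are invented).
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                       # sorted keys held in this node
        self.children = children or []         # empty list means leaf node

def search(node, key):
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1                                 # find the first key >= target
    if i < len(node.keys) and node.keys[i] == key:
        return node                            # match found in this node
    if not node.children:                      # reached a leaf without a match
        return None
    return search(node.children[i], key)       # descend into the i-th subtree

# Tree shaped like the worked example: 78 -> [40, 56] -> [43, 45] -> [47, 49].
leaf = BTreeNode([47, 49])
mid = BTreeNode([43, 45], [BTreeNode([41, 42]), BTreeNode([44]), leaf])
left = BTreeNode([40, 56], [BTreeNode([35, 38]), mid, BTreeNode([60, 70])])
root = BTreeNode([78], [left, BTreeNode([80, 90])])
```

Each step examines one node and descends one level, so the number of node visits is bounded by the height of the tree, giving the O(log n) search cost stated above.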
Inserting
Insertions are done at the leaf node level. The following algorithm needs to be followed in order to insert an item into B Tree.
-Traverse the B Tree in order to find the appropriate leaf node at which the node can be inserted.
-If the leaf node contains fewer than m-1 keys, insert the element in increasing order.
-Else, if the leaf node already contains m-1 keys, follow these steps:
1.Insert the new element in the increasing order of elements.
2.Split the node into two nodes at the median.
3.Push the median element up to its parent node.
4.If the parent node also already contains m-1 keys, split it too by following the same steps.
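Steps 1 to 3 for a single full leaf can be sketched in isolation; the order m = 4 and the key values below are invented:

```python
# Sketch of splitting a full B-tree leaf (order m = 4, so at most m - 1 = 3 keys).
import bisect

def split_full_leaf(keys, new_key):
    """Insert new_key into a leaf that already holds m - 1 keys, split the
    node at the median, and return (left_keys, median, right_keys). The
    median is the element that gets pushed up into the parent node."""
    bisect.insort(keys, new_key)            # step 1: insert in increasing order
    mid = len(keys) // 2
    median = keys[mid]                      # step 3: median goes up to the parent
    return keys[:mid], median, keys[mid + 1:]   # step 2: split at the median

left, median, right = split_full_leaf([10, 20, 40], 30)
```

Inserting 30 into the full leaf [10, 20, 40] yields [10, 20, 30, 40]; the median 30 is pushed up, leaving leaves [10, 20] and [40] on either side.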
Deletion
Deletion is also performed at the leaf nodes. The node which is to be deleted can either be a leaf node or an internal node. Following algorithm needs to be followed in order to delete a node from a B tree.
-Locate the leaf node.
-If the leaf node contains more than the minimum number of keys (⌈m/2⌉ − 1), delete the desired key from the node.
-If the deletion would leave the leaf node with fewer than the minimum number of keys, restore the minimum by borrowing an element from the right or left sibling.
1.If the left sibling contains more than the minimum number of keys, push its largest element up to the parent and move the intervening parent element down to the node from which the key was deleted.
2.If the right sibling contains more than the minimum number of keys, push its smallest element up to the parent and move the intervening parent element down to the node from which the key was deleted.
-If neither sibling contains more than the minimum number of keys, create a new leaf node by merging the two leaf nodes together with the intervening element of the parent node.
-If the parent is then left with fewer than the minimum number of children, apply the same process to the parent as well.
If the node which is to be deleted is an internal node, replace it with its in-order successor or predecessor. Since the successor or predecessor always lies in a leaf node, the process reduces to deleting a key from a leaf node.
Application of B tree
B trees are used to index data and provide fast access to the actual data stored on disks, since accessing a value stored in a large on-disk database is a very time-consuming process.
Searching an un-indexed and unsorted database containing n key values needs O(n) running time in worst case. However, if we use B Tree to index this database, it will be searched in O(log n) time in worst case.
AVL Tree
AVL Tree can be defined as a height-balanced binary search tree in which each node is associated with a balance factor, calculated by subtracting the height of its right sub-tree from the height of its left sub-tree.
A tree is said to be balanced if the balance factor of each node is between -1 and 1; otherwise, the tree is unbalanced and needs to be balanced.
Balance Factor (k) = height (left(k)) – height (right(k))
If balance factor of any node is 1, it means that the left sub-tree is one level higher than the right sub-tree.
If balance factor of any node is 0, it means that the left sub-tree and right sub-tree have equal height.
If balance factor of any node is -1, it means that the left sub-tree is one level lower than the right sub-tree.
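The balance-factor rule can be sketched directly from the definition. The node class and example trees below are invented, and heights are counted in nodes (an empty subtree has height 0):

```python
# Sketch of computing AVL balance factors (invented node class and examples).
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(node):                 # height in nodes; an empty subtree has height 0
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):         # height(left subtree) - height(right subtree)
    return height(node.left) - height(node.right)

def is_balanced(node):            # AVL condition: every balance factor in {-1, 0, 1}
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_balanced(node.left) and is_balanced(node.right))

balanced = Node(20, Node(10), Node(30))             # balance factor 0 at the root
skewed = Node(10, None, Node(20, None, Node(30)))   # right chain: factor -2
```

The right-leaning chain 10-20-30 has balance factor -2 at its root, so it violates the AVL property and would trigger a rotation.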
AVL Rotations
We perform rotations in an AVL tree only when the balance factor of some node is other than -1, 0, or 1. There are basically four types of rotations, which are as follows:
L L rotation: Inserted node is in the left subtree of left subtree of A
R R rotation : Inserted node is in the right subtree of right subtree of A
L R rotation : Inserted node is in the right subtree of left subtree of A
R L rotation : Inserted node is in the left subtree of right subtree of A
Where node A is the node whose balance Factor is other than -1, 0, 1.
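The two single rotations can be sketched as follows (an LL imbalance is fixed by a right rotation, an RR imbalance by a left rotation; LR and RL compose these two). The names and the example chain are invented:

```python
# Sketch of the single AVL rotations (invented names and example).
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(a):              # fixes an LL imbalance at node a
    b = a.left
    a.left = b.right              # b's right subtree moves under a
    b.right = a                   # b becomes the new subtree root
    return b

def rotate_left(a):               # fixes an RR imbalance at node a
    b = a.right
    a.right = b.left              # b's left subtree moves under a
    b.left = a
    return b

# LL case: 10, 20, 30 inserted in decreasing order form a left chain under 30.
root = Node(30, Node(20, Node(10)))
root = rotate_right(root)         # 20 becomes the root with children 10 and 30
```

After the rotation, the left chain 30-20-10 becomes a balanced tree rooted at 20; the double rotations (LR, RL) first rotate the child and then apply one of these single rotations at A.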
Operations on AVL tree
Insertion
Insertion in AVL tree is performed in the same way as it is performed in a binary search tree. However, it may lead to violation in the AVL tree property and therefore the tree may need balancing. The tree can be balanced by applying rotations.
Deletion
Deletion can also be performed in the same way as it is performed in a binary search tree. Deletion may also disturb the balance of the tree therefore, various types of rotations are used to rebalance the tree.