In this blog, we will cover the two major algorithms used to find the minimum spanning tree of a graph, namely Kruskal’s and Prim’s algorithms. We will also walk through the steps for finding a minimum spanning tree with the help of an example.
What is a Spanning Tree?
To understand what a spanning tree is, one must be familiar with undirected graphs. An undirected graph is a graph whose edges have no direction. Connections between nodes (or vertices) are bidirectional, which means that if node A is connected to node B, then node B is also connected to node A. In other words, it is a graph with V vertices and E edges, where each edge connects two different vertices and can be traversed in either direction.
Now let’s understand what a spanning tree is. A spanning tree is a subgraph of a connected, undirected graph that includes every vertex of the graph, is itself connected, and contains no cycles. Every spanning tree of a given graph has the same number of edges, which is always one less than the number of vertices. In other words, a spanning tree consists of (n-1) edges, where ‘n’ denotes the number of vertices or nodes in the graph.
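The definition above can be turned into a small check. The following is a minimal sketch, assuming vertices are labelled 0 to n-1 and edges are given as pairs; the function name `is_spanning_tree` is made up for illustration. It only checks the edge count and connectivity, and it does not verify that the edges belong to some original graph.

```python
def is_spanning_tree(num_vertices, tree_edges):
    """Return True if tree_edges has n - 1 edges and connects all n vertices.

    Assumption: vertices are the integers 0 .. num_vertices - 1.
    A connected graph with exactly n - 1 edges cannot contain a cycle,
    so these two checks together are enough.
    """
    if len(tree_edges) != num_vertices - 1:
        return False

    # Build an undirected adjacency list.
    adjacency = {v: [] for v in range(num_vertices)}
    for u, v in tree_edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Depth-first search from vertex 0 to see which vertices are reachable.
    seen = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)

    return len(seen) == num_vertices


print(is_spanning_tree(4, [(0, 1), (1, 2), (2, 3)]))  # a path through all 4 vertices
print(is_spanning_tree(4, [(0, 1), (1, 2), (0, 2)]))  # a triangle that misses vertex 3
```

The first call describes a valid spanning tree of a four-vertex graph, while the second has the right number of edges but forms a cycle and leaves one vertex unreachable.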
Formula for calculating the number of spanning trees of a complete graph (Cayley’s formula):
n^(n-2)
where ‘n’ is the number of vertices in the graph
For example, if a complete graph has 4 vertices (n = 4), then it has 4^(4-2) = 16 possible spanning trees, which is also the maximum for any graph with 4 vertices.
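For small graphs, the formula can be verified by brute force. The sketch below is only an illustration, with made-up helper names: it tries every set of n-1 edges of a complete graph and counts the ones that contain no cycle. This takes exponential time, so it is only practical for very small n.

```python
from itertools import combinations


def forms_tree(num_vertices, edge_subset):
    """Return True if the n - 1 given edges contain no cycle (so they form a spanning tree)."""
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent pointers to the representative of x's component.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edge_subset:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # both endpoints already connected: this edge closes a cycle
        parent[root_u] = root_v
    return len(edge_subset) == num_vertices - 1


def count_spanning_trees_complete(num_vertices):
    """Count spanning trees of the complete graph by checking every (n - 1)-edge subset."""
    all_edges = list(combinations(range(num_vertices), 2))
    return sum(
        forms_tree(num_vertices, subset)
        for subset in combinations(all_edges, num_vertices - 1)
    )


for n in range(2, 6):
    print(n, count_spanning_trees_complete(n), n ** (n - 2))
```

For n = 4 the count is 16, matching 4^(4-2).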
Let’s understand how spanning trees are derived with the help of the following example, in which we have a connected, undirected graph with four vertices that is not complete:
Note that the above graph is not a complete graph. Thus, we can’t apply the formula for the number of spanning trees (n^(n-2)) here. The above graph has the following four spanning trees:
What is a Minimum Spanning Tree?
A Minimum Spanning Tree (MST) is a spanning tree of a connected, weighted graph whose total edge weight is the minimum among all possible spanning trees of that graph. In other words, it is the cheapest set of edges that still connects all the vertices without forming a cycle.
Let’s understand, with the help of one example, how we can find the minimum spanning tree.
In the above example of spanning trees of a weighted graph, if we calculate the sum of the edge weights for each tree, the total cost of each would be:
- Graph(A) – (4 + 5 + 2) = 11
- Graph(B) – (5 + 2 + 1) = 8
- Graph(C) – (5 + 4 + 1) = 10
- Graph(D) – (4 + 1 + 2) = 7
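The totals above can be reproduced with a few lines of Python. This is just the arithmetic from the list; the dictionary name `tree_edge_weights` is made up for illustration, and the edge weights are copied from the example.

```python
# Edge weights of each candidate spanning tree, as listed in the example.
tree_edge_weights = {
    "A": [4, 5, 2],
    "B": [5, 2, 1],
    "C": [5, 4, 1],
    "D": [4, 1, 2],
}

# Total cost of each spanning tree is the sum of its edge weights.
total_cost = {name: sum(weights) for name, weights in tree_edge_weights.items()}

# The minimum spanning tree is the candidate with the smallest total.
minimum_tree = min(total_cost, key=total_cost.get)

print(total_cost)
print(minimum_tree)
```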
As you can see, Graph(D) has the minimum sum of edge weights, which means it connects all the vertices of the graph at the minimum possible cost. Hence, Graph(D) is the minimum spanning tree of the following graph:
Minimum Spanning Tree Algorithms
To find the minimum spanning tree of a graph, there are primarily two algorithms. Both produce a minimum spanning tree, but through different approaches. Kruskal’s algorithm first sorts the edges and then adds them one by one whenever the choice is safe (that is, it does not form a cycle), while Prim’s algorithm starts from a single vertex and grows the tree by repeatedly adding the lowest-weight edge that connects the tree to a vertex not yet in it. The choice between these algorithms often depends on the characteristics of the graph (dense or sparse) and the available data structures for efficient implementation. Let’s understand the workings of both algorithms thoroughly:
Kruskal’s Algorithm
Kruskal’s algorithm starts by sorting all the edges of the graph in ascending order of weight. Then, it iterates through these edges in that order and adds each one to the MST if it doesn’t form a cycle. It uses the disjoint-set data structure to check whether adding an edge would form a cycle in the MST. These steps are repeated until all vertices are included in the MST (minimum spanning tree), that is, until (n-1) edges are added, where ‘n’ is the number of vertices.
Suppose there is a connected graph G with ‘n’ vertices and ‘m’ edges. Then, by following these steps, we can build the minimum spanning tree:
- Arrange all the edges in non-decreasing order based on their weights.
  - Sort the edges by weight using any efficient sorting algorithm such as quicksort or mergesort, which takes O(m log m) time.
- Create disjoint sets for each vertex in the graph.
  - Initialize a disjoint-set data structure with one set per vertex, which takes O(n) time. It is used to check whether the two endpoints of an edge are in different sets (not part of the same connected component); each check takes nearly constant amortized time, O(α(n)).
- Iterate through the sorted edges:
  - Start iterating through the sorted edges.
  - For each edge, add it to the MST if it doesn’t create a cycle, and merge the sets of its two endpoints.
  - Repeat until ‘n-1’ edges are added to the MST.
- After adding ‘n-1’ edges, we have a minimum spanning tree of the graph: it contains ‘n-1’ edges and has the minimum total weight among all possible spanning trees.
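The time complexity of Kruskal’s algorithm is O(m log m + mα(n)), where ‘m’ is the number of edges and ‘n’ is the number of vertices. Here the m log m term comes from sorting the edges, and α(n) is the inverse Ackermann function (nearly constant), which is the amortized cost of each disjoint-set operation. Since sorting dominates, this is commonly stated as O(m log m).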
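The steps of Kruskal’s algorithm can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name, the `(weight, u, v)` edge format, and the example graph are all assumptions. The example graph is a hypothetical four-vertex cycle whose weights (1, 2, 4, 5) were chosen to match the totals in the earlier example; the actual layout of that figure is not known.

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, mst_edges) for a connected, undirected graph.

    Assumptions: vertices are 0 .. num_vertices - 1 and
    edges is a list of (weight, u, v) tuples.
    """
    # Disjoint-set (union-find) structure: one set per vertex to begin with.
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x):
        # Find the representative of x's set, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the sets of a and b; return False if they were already joined.
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False
        if rank[root_a] < rank[root_b]:
            root_a, root_b = root_b, root_a
        parent[root_b] = root_a
        if rank[root_a] == rank[root_b]:
            rank[root_a] += 1
        return True

    mst_edges = []
    total_weight = 0
    # Consider edges in non-decreasing order of weight.
    for weight, u, v in sorted(edges):
        if union(u, v):  # endpoints were in different sets, so no cycle is formed
            mst_edges.append((u, v, weight))
            total_weight += weight
            if len(mst_edges) == num_vertices - 1:
                break  # a spanning tree has exactly n - 1 edges
    return total_weight, mst_edges


# Hypothetical example graph: a 4-cycle with edge weights 4, 5, 2, and 1.
example_edges = [(4, 0, 1), (5, 1, 2), (2, 2, 3), (1, 3, 0)]
total, tree = kruskal(4, example_edges)
print(total, tree)  # total is 7
```

On this graph the algorithm skips only the heaviest edge (weight 5), giving a total of 7, the same cost as Graph(D) in the earlier example.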
Prim’s Algorithm
Prim’s algorithm begins from an arbitrary vertex and then constructs the MST from this vertex.
It iteratively adds the lowest-weight edge that connects the current MST to a vertex that is not yet in the MST. This process continues until all vertices are part of the MST. It maintains a priority queue or a min-heap to efficiently select the next edge to add to the MST based on its weight.
Let’s understand Prim’s algorithm step by step, assuming a weighted, connected graph with the following properties:
(G) – Connected, weighted graph with ‘n’ number of vertices.
(V) – Set of vertices.
(E) – Set of edges.
(key[v]) – Array to store the minimum weight of the edge connecting vertex(v) to the MST.
(parent[v]) – Array to store the parent of each vertex in the MST.
Here are the steps that are followed by Prim’s algorithm:
- Pick any vertex from the graph to start the algorithm. This will be the initial point for building the spanning tree.
- Create a set to keep track of the vertices that are included in the spanning tree. Initially, this set contains only the starting vertex.
- Build a priority queue to store all the edges connected to the selected vertices. The priority is based on edge weight, with the smallest weight having the highest priority.
- Repeat the following steps until all vertices are included in the spanning tree:
  - Pick the edge with the smallest weight from the priority queue.
  - Check whether adding this edge would form a cycle, that is, whether the vertex at its other end is already in the set of selected vertices. If it is not, add the edge to the spanning tree; otherwise, discard the edge.
  - Add the vertex at the other end of the chosen edge to the set of selected vertices.
  - Update the priority queue with all the edges connected to the newly added vertex.
- The algorithm terminates when the set of selected vertices contains all the vertices from the original graph.
- The final result is a minimum spanning tree that connects all vertices with the minimum possible total edge weight.
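The steps of Prim’s algorithm can be sketched in Python with the standard `heapq` module as the priority queue. This is a minimal illustration under stated assumptions: the function name, the adjacency-list format, and the example graph are made up for this sketch. It keeps edges in the heap (often called the "lazy" variant) rather than maintaining the key[v] and parent[v] arrays described earlier. The example graph is a hypothetical four-vertex cycle with weights 1, 2, 4, and 5, chosen to match the totals in the earlier example.

```python
import heapq


def prim(adjacency, start):
    """Return (total_weight, mst_edges) for a connected, undirected graph.

    Assumption: adjacency maps each vertex to a list of (weight, neighbour) pairs,
    with every undirected edge listed in both directions.
    """
    visited = {start}  # vertices already in the spanning tree
    # Priority queue of candidate edges (weight, from_vertex, to_vertex).
    heap = [(weight, start, neighbour) for weight, neighbour in adjacency[start]]
    heapq.heapify(heap)

    mst_edges = []
    total_weight = 0
    while heap and len(visited) < len(adjacency):
        weight, u, v = heapq.heappop(heap)  # smallest-weight candidate edge
        if v in visited:
            continue  # both ends already in the tree: adding it would form a cycle
        visited.add(v)
        mst_edges.append((u, v, weight))
        total_weight += weight
        # Add the edges leaving the newly added vertex to the queue.
        for next_weight, neighbour in adjacency[v]:
            if neighbour not in visited:
                heapq.heappush(heap, (next_weight, v, neighbour))
    return total_weight, mst_edges


# Hypothetical example graph: a 4-cycle with edge weights 4, 5, 2, and 1.
example_graph = {
    0: [(4, 1), (1, 3)],
    1: [(4, 0), (5, 2)],
    2: [(5, 1), (2, 3)],
    3: [(2, 2), (1, 0)],
}
total, tree = prim(example_graph, start=0)
print(total, tree)  # total is 7
```

Starting from vertex 0, the sketch picks the edges of weight 1, 2, and 4 in that order and never uses the edge of weight 5, so the total is 7, the same cost as Graph(D) in the earlier example.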
Applications of Minimum Spanning Tree Algorithm
The Minimum Spanning Tree (MST) algorithm has numerous practical real-world applications in various fields. Here are some of the most popular cases of application in various business sectors:
- Telecommunication Networks: MSTs are used in designing efficient communication networks, like telephone or internet networks, to establish reliable connections among different locations that minimize the overall cost or distance of laying cables.
- Transportation and Logistics: In logistics and transportation, MST algorithms assist in optimizing routes for delivery trucks, minimizing travel costs, and streamlining supply chain management by establishing the most efficient connections between different locations.
- Circuit Design: Electronic circuit design benefits from MSTs by optimizing the layout of components on a circuit board, reducing wire length, and minimizing manufacturing costs while maintaining connectivity.
- Computer Networking: In computer networks, MSTs help in designing efficient network protocols, optimizing routing, and ensuring reliable data transmission among devices in a network.
- Wireless Sensor Networks: MSTs are utilized in wireless sensor networks for optimal data collection, routing, and energy conservation by minimizing the total distance data needs to travel.
- Image Segmentation: In image processing, MSTs assist in image segmentation, where pixels with similar properties are grouped efficiently to help in object recognition and analysis.
- Oil and Gas Pipelines: MST algorithms are applied in laying pipelines for oil and gas transportation, optimizing the network to reduce construction costs and improve efficiency.
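- Spanning Tree Protocol (STP) in Computer Networks: The Spanning Tree Protocol applies the spanning tree concept to Ethernet networks. It elects a root bridge and keeps only the lowest-cost path from each switch to that root, blocking the remaining redundant links so that the topology stays loop-free while those links remain available as backups. The resulting tree is a spanning tree, though not necessarily a minimum spanning tree.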
Conclusion
The Minimum Spanning Tree (MST) algorithm has a wide range of practical applications in various industries. It optimizes network connectivity in telecommunications and streamlines processes in logistics and circuit design. MST algorithms efficiently minimize costs while establishing optimal connections between points. Understanding their application is significant for engineers and computer scientists aiming for optimal network design and performance.