Apache Solr Cloud Architecture

Understanding Solr Architecture

SolrCloud is the name for Solr deployed with high availability and fault tolerance built in. It provides distributed indexing and searching capabilities.

The most important features of SolrCloud are:

  • Central configuration for the entire cluster
  • Automatic load balancing and failover for queries
  • ZooKeeper integration for cluster coordination and configuration

Apache Solr Cloud Architecture

SolrCloud is a set of optional Solr features that simplifies horizontal scaling of a search index through sharding and replication. What mainly sets it apart from standalone Solr is the distributed indexing side. A single Solr server is fast and feature-rich, but it can only scale as far as one machine allows.

What counts is the number of query requests and the number of updates coming in. Once a single server reaches its maximum capacity, we have to add another server, and this is the case SolrCloud was designed for. In outline, we place some of the documents on the other server (sharding), and the extra server also allows the data to be replicated.

Leader: A node that can accept writes without consulting another node. If every node is a leader, any node can accept an update request right away, without the latency of consulting another node first.

Is everyone a Leader?

  • Favors write availability
  • Challenges optimistic locking
  • Challenges consistency

Favors write availability: Only a single node needs to be up for our write to be accepted. The downsides of this approach are the two challenges that follow.

Challenges optimistic locking: Optimistic locking is more difficult when everybody is a leader. If our nodes get partitioned, both sides of the partition keep accepting writes. With optimistic locking, we send an update together with the version of the document we are trying to update, and the server has to confirm whether the update can be applied or not.

If we have separate partitions, it is not easy to get that confirmation back immediately, because the other partition may hold a newer version of the document we are trying to update, and the conflict only shows up once the partitions come back together.

Challenges consistency: When every node is a leader, what we get is an eventual consistency model. A primary reason to offer optimistic locking is that it is a form of transaction: we can perform atomic updates on a per-document basis, rather than a transaction that involves multiple documents.
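
The version check behind optimistic locking can be sketched in a few lines of Python. This is a minimal in-memory illustration, not Solr code: the `VersionedStore` class, its `update` method, and the `VersionConflict` exception are hypothetical names made up for the example. Solr's own optimistic concurrency works in a similar spirit through the `_version_` field on each document.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version does not match the stored one."""


class VersionedStore:
    """Hypothetical single-node document store with per-document versions."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, fields)

    def get(self, doc_id):
        return self._docs[doc_id]

    def update(self, doc_id, fields, expected_version):
        # A document that does not exist yet is treated as version 0.
        current_version, _ = self._docs.get(doc_id, (0, None))
        if expected_version != current_version:
            raise VersionConflict(
                f"{doc_id}: expected v{expected_version}, found v{current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, dict(fields))
        return new_version


store = VersionedStore()
v1 = store.update("doc-1", {"title": "first"}, expected_version=0)

# Two clients both read version v1, then both try to write.
v2 = store.update("doc-1", {"title": "client A"}, expected_version=v1)
try:
    store.update("doc-1", {"title": "client B"}, expected_version=v1)
    conflict = False
except VersionConflict:
    conflict = True
```

On a single node the check is simple because there is one authoritative version. When two partitions each accept writes, each side keeps its own idea of the current version, which is why the check cannot be answered reliably until the partitions rejoin.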

Collections: A collection is made up of one or more Solr cores, and each core is hosted by a single Solr instance. In this example, the collection consists of shard1 and shard2, and each shard is placed on two Solr instances.

ZooKeeper: It is a vital part of SolrCloud. If ZooKeeper fails, the whole cluster becomes unusable. As a distributed coordination service, it provides leader election, cluster state management, and centralized configuration. We can use the embedded ZooKeeper for testing.

Collection: A search index distributed across multiple nodes. Each collection has a name, a shard count, and a replication factor.

Replication factor: The number of copies of each document in a collection.
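
As an illustration, the name, shard count, and replication factor map onto the parameters of the `CREATE` action in Solr's Collections API. The sketch below only builds the request URL and does not contact a server. The host `localhost:8983`, the collection name `articles`, and the helper `create_collection_url` are assumptions chosen for the example.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE request URL (nothing is sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Assumed values for the example: a local node and a collection called "articles".
url = create_collection_url("http://localhost:8983/solr", "articles", 2, 2)
print(url)
```

With two shards and a replication factor of 2, the collection ends up with four cores in total.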

Shard: A logical slice of a collection. Sharding lets one logical Solr index be split across several nodes.

Every shard has a name, a leader, a hash range, and a replication factor. Each shard contains one Leader Core and zero or more Replica Cores. Each document is assigned to exactly one shard per collection using a hash-based document routing strategy.
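
Hash-based routing can be sketched as follows. This is a simplified illustration rather than Solr's actual router: it uses `zlib.crc32` as a stand-in hash function and splits the 32-bit hash space into equal ranges, and the `Shard` class and the `make_shards` and `route` helpers are hypothetical names.

```python
import zlib
from dataclasses import dataclass

HASH_SPACE = 2 ** 32  # crc32 yields an unsigned 32-bit value


@dataclass(frozen=True)
class Shard:
    name: str
    hash_start: int  # inclusive
    hash_end: int    # inclusive


def make_shards(num_shards):
    """Split the hash space into num_shards contiguous, non-overlapping ranges."""
    step = HASH_SPACE // num_shards
    shards = []
    for i in range(num_shards):
        start = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == num_shards - 1 else start + step - 1
        shards.append(Shard(f"shard{i + 1}", start, end))
    return shards


def route(doc_id, shards):
    """Return the one shard whose hash range contains the document id's hash."""
    h = zlib.crc32(doc_id.encode("utf-8"))
    for shard in shards:
        if shard.hash_start <= h <= shard.hash_end:
            return shard
    raise ValueError("hash ranges do not cover the hash space")


shards = make_shards(2)
for doc_id in ["doc-1", "doc-2", "doc-3"]:
    print(doc_id, "->", route(doc_id, shards).name)
```

Because the ranges do not overlap and together cover the whole hash space, the same document id always lands on the same single shard, no matter which node computes the route.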

Replica: A core that stores a copy of a Leader Core's index. Each replica is implemented as a Solr core. Replica Cores and other Leader Cores are responsible for forwarding a Solr document to the appropriate Leader Core.
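
The forwarding behaviour can be pictured with a small in-memory model. Everything here is hypothetical (the `Core` class, its `receive` method, and the one-shard setup). It only shows the flow in which a replica hands a document to its leader, and the leader indexes it and copies it to its replicas.

```python
class Core:
    """Hypothetical core: either the leader of a shard or one of its replicas."""

    def __init__(self, name):
        self.name = name
        self.leader = None   # set on replicas; None means this core is the leader
        self.replicas = []   # populated on the leader only
        self.index = {}      # doc_id -> document

    def add_replica(self, replica):
        replica.leader = self
        self.replicas.append(replica)

    def receive(self, doc):
        """Entry point for a client update arriving at any core."""
        if self.leader is not None:
            # A replica does not index first; it forwards to its leader.
            return self.leader.receive(doc)
        # The leader indexes the document, then copies it to every replica.
        self.index[doc["id"]] = doc
        for replica in self.replicas:
            replica.index[doc["id"]] = doc
        return self.name


leader = Core("shard1_leader")
replica = Core("shard1_replica1")
leader.add_replica(replica)

# The client happens to send the update to the replica.
handled_by = replica.receive({"id": "doc-1", "title": "hello"})
```

A fuller model would add the hash-based routing step, so that a core belonging to a different shard forwards the document to the leader of the shard that owns it.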

About the Author

Technical Research Analyst - Big Data Engineering

Abhijit is a Technical Research Analyst specialising in Big Data and Azure Data Engineering. He has 4+ years of experience in the Big data domain and provides consultancy services to several Fortune 500 companies. His expertise includes breaking down highly technical concepts into easy-to-understand content.