Apache Solr is a standalone full-text search platform that indexes documents and serves searches, including across multiple websites, using XML and HTTP. Built on a Java library called Lucene, Solr supports a rich schema specification that offers flexibility in dealing with different document fields. It also provides an extensive search plugin API for developing custom search behavior.
The solrconfig.xml file contains the configuration for the data directory.
The schema.xml file contains the definitions of the field types and fields of documents.
Supported by the Apache Software Foundation, Apache Lucene is a free, open-source, high-performance text search engine library written in Java, originally by Doug Cutting. Lucene facilitates full-featured searching, highlighting, indexing and spellchecking of documents in various formats like MS Office docs, HTML, PDF, text docs and others.
When a user runs a search in Solr, the search query is processed by a request handler. SolrRequestHandler is a Solr plugin that defines the logic to be executed for any request. The solrconfig.xml file can declare several handlers, often a number of instances of the same SolrRequestHandler class with different configurations.
Also known as the Lucene query parser, the Solr standard query parser enables users to specify precise queries through a robust syntax. However, it is less tolerant of syntax errors than parsers such as DisMax, which is designed to avoid throwing syntax errors on user input.
A field type includes four types of information:
As the name suggests, faceting is the arrangement and categorization of all search results based on their index terms. Faceting makes searching smoother because users can narrow the results down to exactly what they are looking for.
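As a rough illustration of how a client might ask Solr for facet counts, here is a minimal Python sketch that builds a select URL with the `facet` and `facet.field` request parameters. The host, the core name `products`, and the field names `brand` and `category` are assumptions made up for this example.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, query, facet_fields):
    """Build a Solr select URL that requests facet counts on the given fields."""
    params = [("q", query), ("facet", "true")]
    # One facet.field parameter is added per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core and field names, for illustration only.
url = build_facet_query(
    "http://localhost:8983/solr/products", "laptop", ["brand", "category"]
)
print(url)
```

The response to such a request would group matching documents by the terms indexed in each facet field, which is what lets a user drill down by brand or category.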
Dynamic fields are a useful feature if users forget to define one or more fields. They give the flexibility to index fields that have not been explicitly defined in the schema.
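The idea can be sketched in Python as a lookup that falls back to wildcard patterns when a field is not explicitly defined. This is a simplified model, not Solr's implementation: the field names, the patterns, and the first-match rule below are all assumptions for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical explicitly defined fields: field name -> field type.
EXPLICIT_FIELDS = {"id": "string", "title": "text_general"}

# Hypothetical dynamic field rules: wildcard pattern -> field type.
DYNAMIC_FIELDS = {"*_i": "int", "*_s": "string", "*_txt": "text_general"}


def resolve_field_type(name):
    """Return the field type for a field name, or None if nothing matches."""
    # Explicit definitions take priority over dynamic patterns.
    if name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[name]
    # Simplification: the first matching pattern wins.
    for pattern, field_type in DYNAMIC_FIELDS.items():
        if fnmatchcase(name, pattern):
            return field_type
    return None


print(resolve_field_type("price_i"))
print(resolve_field_type("title"))
print(resolve_field_type("unknown"))
```

With rules like these, a document carrying a previously unseen field such as `price_i` can still be indexed, because the name matches the `*_i` pattern.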
When working with textual data in Solr, a field analyzer examines the field text and generates a token stream. This analysis of the input text is performed at index time and at query time. Most Solr applications use custom analyzers defined by users. Remember, each analyzer has only one tokenizer.
It is used to split a stream of text into a series of tokens, where each token is a subsequence of characters in the text. The tokens produced are then passed through token filters that can add, remove or update tokens. The field is then indexed using the resulting token stream.
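The flow of one tokenizer followed by a chain of token filters can be sketched in plain Python. This is only a conceptual model, not Solr code: the function names and the small stop-word list are hypothetical.

```python
def whitespace_tokenizer(text):
    """Split the text into tokens on whitespace."""
    return text.split()


def lowercase_filter(tokens):
    """Update each token by lowercasing it."""
    return [token.lower() for token in tokens]


def stop_filter(tokens, stop_words=frozenset({"the", "a", "of"})):
    """Remove tokens that appear in the (hypothetical) stop-word list."""
    return [token for token in tokens if token not in stop_words]


def analyze(text):
    """Run exactly one tokenizer, then each token filter in order."""
    tokens = whitespace_tokenizer(text)
    for token_filter in (lowercase_filter, stop_filter):
        tokens = token_filter(tokens)
    return tokens


print(analyze("The Art of Search"))  # ['art', 'search']
```

The order of the filters matters in this sketch: lowercasing runs first so that "The" is matched by the lowercase stop-word list.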
Phonetic filter creates tokens using one of the phonetic encoding algorithms in the org.apache.commons.codec.language package.
Apache Solr provides fault-tolerant, highly scalable search capabilities that enable users to set up a highly available cluster of Solr servers. These capabilities are known as SolrCloud.
It is used to describe how to populate fields with data copied from another field.
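To make the copying behavior concrete, here is a small Python sketch that models it on a plain dictionary. It is a conceptual model only: the document, the field names, and the choice to treat the destination as a multi-valued list are assumptions for illustration.

```python
def apply_copy_fields(doc, copy_rules):
    """Return a new document with source values copied into destination fields."""
    result = dict(doc)
    for source, dest in copy_rules:
        if source in doc:
            # The destination is modeled as a list so several sources can feed it.
            result[dest] = result.get(dest, []) + [doc[source]]
    return result


# Hypothetical document and rules: copy title and author into a catch-all field.
doc = {"title": "Search Basics", "author": "J. Doe"}
rules = [("title", "text"), ("author", "text")]
print(apply_copy_fields(doc, rules))
```

A catch-all destination like `text` in this sketch is a common reason to copy fields: a single field can then be searched instead of several.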
Highlighting refers to including fragments of the documents that match the user’s query in the query response. These fragments are highlighted and placed in a special section of the response, which clients use to present the snippets to users. Solr provides a number of highlighting utilities with control over different fields. The highlighting utilities can be called by request handlers and reused with the standard query parsers.
There are 3 highlighters in Solr:
It is used to generate statistics over the results of arbitrary numeric functions.
Execute $ bin/solr -help to see how to use the bin/solr script.
$ bin/solr stop -p 8983 is used to stop Solr.
$ bin/solr start -f is used to start Solr in the foreground.
$ bin/solr status is used to check Solr running status.
$ bin/solr start is used to start the server.
When Solr runs in the foreground, it is shut down from the same terminal where it was launched. Press Ctrl+C to shut it down.
Schema declares –
The three steps of Installation are:
Solr supports two important configuration files