Streaming Expressions in Apache Solr

The Structured Query Language (SQL) engine is built on top of Solr's Streaming API, also known as streaming expressions. Streaming expressions provide a simple yet powerful stream processing language for SolrCloud, with support for parallel relational algebra and real-time MapReduce.

Distributed Joins: Streaming expressions add support for distributed joins. The following join types are available:

  • Inner Join
  • Left Outer Join
  • Hash Join
  • Outer Hash Join
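To illustrate how the merge-style joins (inner, left outer) differ from the hash joins, here is a minimal Python sketch. It is not Solr's implementation: the function names and the tuple layout are hypothetical, chosen only for illustration. The merge variant assumes both input streams arrive sorted on the join key, while the hash variant holds one whole stream in memory and therefore needs no sort order.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def inner_merge_join(left, right, key):
    """Join two streams that are BOTH sorted ascending on `key` (assumption)."""
    left_groups = groupby(left, key=itemgetter(key))
    right_groups = groupby(right, key=itemgetter(key))
    lk, lrows = next(left_groups, (None, None))
    rk, rrows = next(right_groups, (None, None))
    while lrows is not None and rrows is not None:
        if lk < rk:
            # Left key has no partner yet: advance the left stream.
            lk, lrows = next(left_groups, (None, None))
        elif lk > rk:
            # Right key has no partner yet: advance the right stream.
            rk, rrows = next(right_groups, (None, None))
        else:
            # Keys match: pair every left tuple with every right tuple.
            right_buffer = list(rrows)
            for l in lrows:
                for r in right_buffer:
                    yield {**r, **l}
            lk, lrows = next(left_groups, (None, None))
            rk, rrows = next(right_groups, (None, None))


def inner_hash_join(left, right, key):
    """Join without any sort order by hashing the whole right stream in memory."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    for l in left:
        for r in table.get(l[key], []):
            yield {**r, **l}
```

Both functions return the same matches for sorted input. The trade-off is memory versus ordering: the merge join streams through both inputs, while the hash join must fit one side in memory.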

Example:

 innerJoin(search(collection1, q=*:*, fl="fieldP, fieldQ, fieldR", ...),
           search(collection2, q=*:*, fl="fieldP, fieldM, fieldN", ...),
           on="fieldP=fieldP")

Rollup streaming expression: It groups tuples that share a common field value and computes aggregations over each group. The underlying stream must be sorted on the grouping field.

Example:

 rollup(search(collection1,
               qt="/export",
               q="*:*",
               fl="id,course,price",
               sort="course asc"),
        over="course",
        count(*),
        max(price))

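The grouping that rollup performs over a sorted stream can be sketched in a few lines of Python. This is only an illustration of the idea, not Solr code: the function name and the hard-coded `price` field are assumptions made for the example, and the input is assumed to be already sorted on the grouping field.

```python
from itertools import groupby
from operator import itemgetter


def rollup(tuples, over):
    """Emit one aggregate tuple per run of equal `over` values.

    Assumes `tuples` is sorted on `over`, so each group is contiguous
    and only one group has to be tracked at a time.
    """
    for value, group in groupby(tuples, key=itemgetter(over)):
        count = 0
        max_price = None
        for row in group:
            count += 1
            price = row["price"]  # hypothetical field, as in the example
            if max_price is None or price > max_price:
                max_price = price
        yield {over: value, "count(*)": count, "max(price)": max_price}
```

Because the stream is sorted, the function never needs to hold more than the running count and maximum for the current group. If the input were unsorted, the same course would show up in several separate groups.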
Facet streaming expression: It pushes the aggregation down into the search engine using Solr's JSON Facet API.

Example:

 facet(intellipaat_courses,
       q="*:*",
       buckets="course",
       bucketSorts="count(*) desc",
       bucketSizeLimit=1000,
       count(*),
       sum(price),
       max(popularity))
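An expression like this is typically sent to a collection's /stream handler as the `expr` request parameter. The Python sketch below only builds the HTTP request and parses a response body; it does not contact a server. The base URL `http://localhost:8983/solr` and both helper names are assumptions for illustration, and the response shape (a `result-set` object whose `docs` list ends with an `EOF` tuple) should be checked against the Solr version in use.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_stream_request(base_url, collection, expr):
    """Build (but do not send) a POST request for the /stream handler."""
    url = f"{base_url}/{collection}/stream"
    body = urlencode({"expr": expr}).encode("utf-8")
    return Request(url, data=body, method="POST")


def tuples_from_response(payload):
    """Return the data tuples from a response body, dropping the EOF marker."""
    docs = json.loads(payload)["result-set"]["docs"]
    return [doc for doc in docs if not doc.get("EOF")]


FACET_EXPR = (
    'facet(intellipaat_courses, q="*:*", buckets="course", '
    'bucketSorts="count(*) desc", bucketSizeLimit=1000, '
    "count(*), sum(price), max(popularity))"
)

# Assumed local Solr address; adjust to the actual deployment.
request = build_stream_request(
    "http://localhost:8983/solr", "intellipaat_courses", FACET_EXPR
)
```

To run the expression, the request object could be passed to `urllib.request.urlopen` and the decoded body handed to `tuples_from_response`.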

Streaming expressions offer many capabilities, including:

  • Continuous push streaming
  • Continuous pull streaming
  • Request/Response streaming
  • MapReduce-style shuffling aggregation
  • Pushdown faceted aggregation
  • Parallel relational algebra (distributed joins, intersections, unions, complements)
  • Publish/subscribe messaging
  • Distributed graph traversal
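The shuffling aggregation item in the list above can be pictured with a small Python sketch. It is a conceptual model only, not how Solr distributes work: the function names, the use of CRC32 for partitioning, and the in-process "workers" are all assumptions for illustration. Tuples are first partitioned by a hash of the grouping key so that every tuple with the same key reaches the same worker, then each worker aggregates its own partition independently.

```python
import zlib
from collections import defaultdict


def shuffle(tuples, key, workers):
    """Partition tuples so equal keys always land in the same partition."""
    partitions = [[] for _ in range(workers)]
    for row in tuples:
        slot = zlib.crc32(str(row[key]).encode("utf-8")) % workers
        partitions[slot].append(row)
    return partitions


def count_by_key(partition, key):
    """The per-worker 'reduce' step: count tuples for each key value."""
    counts = defaultdict(int)
    for row in partition:
        counts[row[key]] += 1
    return dict(counts)


def parallel_count(tuples, key, workers):
    """Shuffle, aggregate each partition, then merge the partial results."""
    merged = {}
    for partition in shuffle(tuples, key, workers):
        # Keys never overlap across partitions, so a plain update is safe.
        merged.update(count_by_key(partition, key))
    return merged
```

Because the shuffle guarantees that no key is split across workers, the partial results can be merged without any further combining step.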

This blog will help you get a better understanding of Solr + Hadoop = Big Data Love
