Streaming Expressions in Apache Solr
Solr’s Structured Query Language (SQL) engine is built on top of Solr’s Streaming API and streaming expressions. Streaming expressions provide a simple yet powerful stream processing language for SolrCloud, with support for parallel relational algebra and real-time map-reduce.
Distributed Joins: Streaming expressions support several kinds of distributed joins.
- Inner Join
- Left Outer Join
- Hash Join
- Outer Hash Join
Example:
innerJoin(search(collection1, q=*:*, fl="fieldP, fieldQ, fieldR", ...),
search(collection2, q=*:*, fl="fieldP, fieldM, fieldN", ...),
on="fieldP=fieldP")
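To make the join semantics concrete, here is a minimal Python sketch of an inner join over two streams that are already sorted on the join key, which is the ordering Solr's innerJoin expects from its inputs. It is a conceptual illustration only, not Solr's implementation; the function name inner_merge_join and the sample tuples are assumptions made for this example.

```python
def inner_merge_join(left, right, key):
    """Yield one merged tuple per matching left/right pair.

    Both inputs must already be sorted ascending on `key`.
    For simplicity this sketch materializes both inputs as lists;
    a real streaming join would read them incrementally.
    """
    left_rows, right_rows = list(left), list(right)
    i = j = 0
    while i < len(left_rows) and j < len(right_rows):
        left_key, right_key = left_rows[i][key], right_rows[j][key]
        if left_key < right_key:
            i += 1  # no match for this left tuple
        elif left_key > right_key:
            j += 1  # no match for this right tuple
        else:
            # Find the run of right tuples that share this key value.
            j_end = j
            while j_end < len(right_rows) and right_rows[j_end][key] == left_key:
                j_end += 1
            # Pair every left tuple with that key against the whole run.
            while i < len(left_rows) and left_rows[i][key] == left_key:
                for right_row in right_rows[j:j_end]:
                    yield {**left_rows[i], **right_row}
                i += 1
            j = j_end


# Hypothetical sample data using the field names from the example above.
collection1 = [
    {"fieldP": 1, "fieldQ": "a"},
    {"fieldP": 2, "fieldQ": "b"},
    {"fieldP": 4, "fieldQ": "d"},
]
collection2 = [
    {"fieldP": 2, "fieldM": "x"},
    {"fieldP": 2, "fieldM": "y"},
    {"fieldP": 3, "fieldM": "z"},
]
joined = list(inner_merge_join(collection1, collection2, "fieldP"))
```

Only fieldP=2 appears on both sides, so the result holds two merged tuples, one per matching right-hand tuple. A left outer join would additionally emit the unmatched left tuples, and the hash-based variants trade the sort requirement for holding one side in memory.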
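Rollup streaming expression: It groups tuples that share a common field value and computes aggregates over each group. The underlying stream must be sorted on the grouping field.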
Example:
rollup(search(collection1, qt="/export",
q="*:*",
fl="id,course,price",
sort="course asc"),
over="course",
count(*),
max(price))
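The grouping that rollup performs can be sketched in a few lines of Python. This is an illustrative approximation under the assumption that the input is already sorted on the grouping field, as in the example above; the function name rollup_sketch and the sample tuples are hypothetical, and the output keys simply mirror the metric names used in the expression.

```python
from itertools import groupby
from operator import itemgetter


def rollup_sketch(tuples, over, max_field):
    """Emit one tuple per distinct value of `over`, with a count and a max.

    `tuples` must be sorted on `over`; groupby only merges adjacent rows.
    """
    for value, group in groupby(tuples, key=itemgetter(over)):
        rows = list(group)
        yield {
            over: value,
            "count(*)": len(rows),
            f"max({max_field})": max(row[max_field] for row in rows),
        }


# Hypothetical tuples, sorted by course as sort="course asc" would deliver them.
sorted_tuples = [
    {"id": "1", "course": "hadoop", "price": 100},
    {"id": "2", "course": "hadoop", "price": 150},
    {"id": "3", "course": "solr", "price": 90},
]
rolled_up = list(rollup_sketch(sorted_tuples, over="course", max_field="price"))
```

Because the grouping relies on adjacency, an unsorted input would produce several partial groups for the same course, which is why the sort on the underlying search matters.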
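Facet streaming expression: It pushes the aggregation down into the search engine using Solr's JSON Facet API, so the buckets are computed inside Solr instead of by shuffling tuples.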
Example:
facet(intellipaat_courses,
q="*:*",
buckets="course",
bucketSorts="count(*) desc",
bucketSizeLimit=1000,
count(*),
sum(price),
max(popularity))
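Expressions like the ones above are sent to a collection's /stream request handler in the expr parameter. The following Python sketch only builds such a request with the standard library and does not contact a server. The base URL http://localhost:8983/solr is an assumption about a local installation, and build_stream_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a Solr node reachable at the default local address.
SOLR_BASE_URL = "http://localhost:8983/solr"

FACET_EXPR = """facet(intellipaat_courses,
      q="*:*",
      buckets="course",
      bucketSorts="count(*) desc",
      bucketSizeLimit=1000,
      count(*),
      sum(price),
      max(popularity))"""


def build_stream_request(base_url, collection, expr):
    """Build a POST request that carries `expr` to the /stream handler."""
    url = f"{base_url.rstrip('/')}/{collection}/stream"
    body = urlencode({"expr": expr}).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers)


request = build_stream_request(SOLR_BASE_URL, "intellipaat_courses", FACET_EXPR)
```

Passing the request to urllib.request.urlopen would submit it to a running Solr instance. Sending the expression in a POST body rather than in the query string avoids URL length limits for long expressions.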
Streaming expressions support a wide range of capabilities, including:
- Continuous push streaming
- Continuous pull streaming
- Request/Response streaming
- MapReduce-style shuffling aggregation
- Pushdown faceted aggregation
- Parallel relational algebra (distributed joins, intersections, unions, complements)
- Publish/subscribe messaging
- Distributed graph traversal