
I am using Spark 1.3.0 and Spark Avro 1.0.0. I am working from the example on the repository page.

The following code works well:

import com.databricks.spark.avro._

val df = sqlContext.read.avro("src/test/resources/episodes.avro")
df.filter("doctor > 5").write.avro("/tmp/output")


But what if I need to check whether the doctor string contains a substring?

1 Answer


You can use contains (this checks for an arbitrary substring):

Note: first import the implicits so the $ column syntax is available: import sqlContext.implicits._

df.filter($"foo".contains("bar"))

like (SQL LIKE, a simple pattern language in which _ matches any single character and % matches any sequence of characters):

df.filter($"foo".like("%bar%"))

or rlike (the same idea, but with Java regular expressions):

df.filter($"foo".rlike("bar"))

depending on your requirements. LIKE and RLIKE also work inside SQL expression strings, e.g. df.filter("foo LIKE '%bar%'").
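To make the relationship between the three methods concrete, here is a minimal, Spark-free sketch of the semantics described above: a SQL LIKE pattern corresponds to a Java regex (the flavor rlike uses), with _ becoming . and % becoming .*. The object and method names below are illustrative only; they are not part of the Spark API.

```scala
// Hypothetical helper illustrating LIKE semantics in plain Scala.
object LikeSemantics {
  // Translate a SQL LIKE pattern into an equivalent Java regex.
  def likeToRegex(pattern: String): String =
    pattern.flatMap {
      case '_' => "."                                     // any single character
      case '%' => ".*"                                    // any sequence of characters
      case c if "\\.^$|?*+()[]{}".contains(c) => "\\" + c // escape regex metacharacters
      case c => c.toString                                // literal character
    }

  // Evaluate a LIKE pattern against a single value -- a plain-Scala
  // stand-in for what $"foo".like(pattern) expresses on a column.
  // LIKE matches the whole value, hence the anchored `matches`.
  def sqlLike(value: String, pattern: String): Boolean =
    value.matches(likeToRegex(pattern))
}
```

So like("%bar%") behaves the same as contains("bar"), whereas rlike("bar") already matches a substring on its own, because RLIKE matching is unanchored.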
