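Spark SQL supports several data sources. A few of the common ones are:
- JSON Datasets - Spark SQL automatically infers the schema of a JSON dataset and loads it as a DataFrame.
- Hive Tables - Spark SQL can read and write data stored in Hive. Older releases expose this through HiveContext (which extends SQLContext); from Spark 2.0 onward, the same is done with a SparkSession created with Hive support enabled.
- Parquet Files - Parquet is a columnar storage format supported by many data processing systems. Spark SQL can read and write Parquet files, and the schema of the data is preserved automatically.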
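As a rough illustration of the three sources listed above, here is a minimal PySpark sketch. It assumes Spark 2.0 or later with pyspark installed; the helper names (`build_session`, `load_json`, `load_parquet`, `load_hive_table`, `demo`) and the file paths are hypothetical and only there for the example.

```python
def build_session(app_name="data-sources-sketch"):
    """Create a SparkSession with Hive support (needs pyspark installed)."""
    # Imported lazily so the small helpers below can be used without pyspark.
    from pyspark.sql import SparkSession
    return (SparkSession.builder
            .appName(app_name)
            .enableHiveSupport()   # needed for reading Hive tables
            .getOrCreate())


def load_json(spark, path):
    # By default Spark expects one JSON object per line and infers the schema.
    return spark.read.json(path)


def load_parquet(spark, path):
    # Parquet files store their own schema, so none has to be supplied.
    return spark.read.parquet(path)


def load_hive_table(spark, table_name):
    # Works when the session was built with enableHiveSupport().
    return spark.table(table_name)


def demo():
    # Hypothetical paths; replace with real locations before running.
    spark = build_session()
    people = load_json(spark, "people.json")
    people.printSchema()  # shows the schema Spark inferred from the JSON
    people.write.mode("overwrite").parquet("people.parquet")
    load_parquet(spark, "people.parquet").show()
    spark.stop()
```

Calling `demo()` on a machine with Spark available reads the JSON file, prints the inferred schema, writes the data back out as Parquet, and reads it again. `load_hive_table` is left out of `demo()` because it depends on a table that exists in your Hive metastore.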