Can I think of an ORC file as similar to a CSV file with column headings and row labels containing data? If so, can I somehow read it into a simple pandas dataframe? I am not that familiar with tools like Hadoop or Spark, but is it necessary to understand them just to see the contents of a local ORC file in Python?

The filename is someFile.snappy.orc

I can see online that a call of the form …('someFile.snappy.orc') works, but even after import pyspark it throws an error.

You do not need Spark or Hadoop for this. The code below reads the file with pyarrow and converts it to a pandas DataFrame:

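import pandas as pd  # pandas must be installed for to_pandas()
import pyarrow.orc as orc

filename = 'someFile.snappy.orc'

# ORC is a binary format, so open the file in binary mode
with open(filename, 'rb') as file:
    data = orc.ORCFile(file)
    # read() returns a pyarrow Table; to_pandas() converts it to a DataFrame
    df = data.read().to_pandas()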



