How to run Crail File System locally using netty

In this blog post I show how to run the Crail file system (CrailFS) from www.crail.io (GitHub) locally on your laptop using its netty binding.


How does Apache Spark read a parquet file

In this post I will try to explain what happens when Apache Spark reads a Parquet file. Apache Parquet is a popular columnar storage format that stores its data as a set of files, typically on HDFS. In a separate post I will explain the internals of Parquet in more detail; here we focus on what happens when you call

 val parquetFileDF = spark.read.parquet("intWithPayload.parquet")

as documented in the Spark SQL programming guide.
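
For readers who want to try the call themselves, here is a minimal sketch of a standalone program around it. Only the `spark.read.parquet("intWithPayload.parquet")` line comes from this post; the object name, the application name, the `local[*]` master, and the two inspection calls at the end are illustrative assumptions, and the sketch presumes Spark SQL is on the classpath and that the file exists in the working directory.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object; the name is chosen for illustration only.
object ReadParquetExample {
  def main(args: Array[String]): Unit = {
    // Assumption: run Spark inside this JVM, using all local cores.
    val spark = SparkSession.builder()
      .appName("ReadParquetExample")
      .master("local[*]")
      .getOrCreate()

    // The call discussed in the post: returns a DataFrame backed by the Parquet data.
    val parquetFileDF = spark.read.parquet("intWithPayload.parquet")

    // Inspect the schema Spark obtained from the Parquet metadata.
    parquetFileDF.printSchema()

    // show() is an action, so this is the point where rows are actually read.
    parquetFileDF.show(5)

    spark.stop()
  }
}
```

In `spark-shell` a `spark` session is already provided, so there only the `spark.read.parquet(...)` line and the two inspection calls are needed.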


The website is finally up!