How to run Crail File System locally using netty

In this blog post I am going to show how to run the Crail file system (CrailFS), available on GitHub, locally on your laptop using its netty binding.


How does Apache Spark read a parquet file

In this post I will try to explain what happens when Apache Spark reads a Parquet file. Apache Parquet is a popular columnar storage format that stores a dataset as a collection of files, typically on HDFS. In a separate post I will explain the internals of Parquet in more detail; here we focus on what happens when you call

 val parquetFileDF = spark.read.parquet("intWithPayload.parquet")

as documented in the Spark SQL programming guide.


The website is finally up!