15 07, 2014

Big Data Open Source Landscape: Processing Technologies

July 15th, 2014

Hadoop is a well-established software framework that analyses structured and unstructured big data and distributes applications across thousands of servers. Hadoop was created in 2005, and since then several projects have appeared in the Hadoop space that try to complement it. Sometimes those technologies overlap with each other, and sometimes they are partially complementary. I will try to describe a brief map [...]

1 07, 2014

Databricks Cloud: Next Step For Spark

July 1st, 2014

This morning, during the Spark Summit, Databricks announced a new step forward that will allow users to leverage Apache Spark to build end-to-end pipelines that underlie advanced analytics running on Amazon Web Services (AWS). The name is Databricks Cloud. Spark is already deployable on AWS, but Databricks Cloud is a managed service based on Spark that will be supported directly by Databricks. [...]