Seminars

11:50 - 12:20  |  Fast Data Theatre

Scala and the JVM as a Big Data Platform – Lessons from Apache Spark

Wednesday 14 November 2018

ABOUT

Because Apache Spark is written in Scala, a programming language for the JVM, it has encouraged many developers to adopt Scala. However, the JVM uses memory inefficiently for Big Data computations, which causes significant garbage collection (GC) challenges. Spark's Project Tungsten addressed these GC challenges and other performance problems with custom data layouts and code generation.

In this talk, we'll discuss the lessons we've learned from Spark, the improvements made by Tungsten, and what we should do to improve both Scala and the JVM for Big Data.
