Apache Spark is written in Scala, which runs on the JVM, and its popularity has encouraged many developers to adopt Scala. However, the JVM uses memory inefficiently for Big Data computations: per-object overhead inflates the in-memory size of data, and large heaps lead to significant garbage collection (GC) challenges. Spark's Project Tungsten addressed these and other performance problems with custom data layouts and code generation.

In this talk, we'll discuss the lessons we've learned from Spark, the improvements Tungsten made, and what we should do to improve both Scala and the JVM for Big Data.