Getting Scala Spark working with IntelliJ IDE

UPDATE : Updated the instructions for build.sbt on 1/29/2015 for Spark 1.2 and Scala 2.11.4

I have recently started using Apache Spark and it is awesome.

This is a post about the trouble I had getting Spark (Scala) working with IntelliJ. All the links I found on Google were out of date, and the process was non-trivial for somebody new to SBT/Scala like me.

So, off we go!

Note : Tested on Mac OS X 10.9.2. Also works on Yosemite.

Install sbt

sbt can be installed with Homebrew:

brew install sbt

I was new to sbt myself and read up a bit about it here : http://www.scala-sbt.org/documentation.html

Choosing a version of Scala

It is not enough to install the latest version of Scala. Make sure that the Scala version matches that of the Spark builds in the Maven repository, since that is where sbt will actually download Spark from.

To do this, go to http://search.maven.org and search for spark-core and spark-streaming. Find the latest entry by date.

Note three things for both artifacts:
1. GroupId
2. ArtifactId
3. Latest Version

In my case, these were org.apache.spark, spark-core_2.11 and 1.2.0, respectively.
The suffix on the ArtifactId tells you which Scala version you should install.

So, spark-core_2.11 means that you need Scala 2.11.X.

The “Latest Version” is Spark’s version number.
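
The naming convention can be made concrete with a small sketch. The helper below is hypothetical (it is not part of sbt or Spark); it simply pulls the Scala binary version out of an ArtifactId by taking whatever follows the last underscore.

```scala
object ArtifactSuffix {
  // Returns the text after the last underscore in the ArtifactId, if there is one.
  // For Spark artifacts this is the Scala binary version, e.g. "2.11".
  def scalaBinaryVersion(artifactId: String): Option[String] = {
    val idx = artifactId.lastIndexOf('_')
    if (idx >= 0 && idx < artifactId.length - 1) Some(artifactId.substring(idx + 1))
    else None
  }

  def main(args: Array[String]): Unit = {
    println(scalaBinaryVersion("spark-core_2.11"))      // prints Some(2.11)
    println(scalaBinaryVersion("spark-streaming_2.11")) // prints Some(2.11)
  }
}
```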

Install Scala

Now, we can be sure that the Scala version being installed is the correct one. Do the following :

brew install scala

brew switch scala 2.11.4

Don't worry too much about the exact version. If you type in just 2.11, brew will automatically list the closest available formula.

Install IntelliJ and Scala plugin

I assume you already have the IDE. Otherwise, download IntelliJ Community Edition. I am using v13.X.
Next, install the Scala plugin from the Plugins menu (Preferences->Plugins).
Then make sure that Scala is correctly detected in the IntelliJ preferences.

Create a Scala Project

Create a new Scala Project in IntelliJ.

Setting up sbt

Make sure that the following two files exist and have the following contents:

~/.sbt/{sbt version}/plugins/build.sbt

resolvers += "Sonatype snapshots" at "http://oss.sonatype.org/content/repositories/snapshots/"

addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.6.0")

NOTE: If you face any errors, confirm these two lines are updated to the latest version numbers. For that, check https://github.com/mpeltonen/sbt-idea

$PROJECT_DIR/build.sbt

scalaVersion := "2.11.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0"

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.2.0"

resolvers ++= Seq(
"Akka Repository" at "http://repo.akka.io/releases/",
"Spray Repository" at "http://repo.spray.cc/")

NOTE: The Scala version must match the version supplied to Brew earlier.

NOTE: The version numbers of spark-core and spark-streaming should exactly match the “Latest Version” field on the Maven Repository.
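
A note on the %% operator in the dependency lines: sbt appends the Scala binary version of the build (here _2.11) to the artifact name, which is how spark-core becomes spark-core_2.11. As a sketch, assuming scalaVersion is a 2.11.x release, the two spellings below resolve the same artifact. Option B is left commented out because a real build.sbt should keep only one of them.

```scala
// Option A: let sbt append the Scala binary version (_2.11) for you.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0"

// Option B: spell the suffix out by hand with a single %.
// Only correct while scalaVersion stays in the 2.11.x series.
// libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.2.0"
```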

Almost done

In $PROJECT_DIR, run:

sbt update

sbt

Inside the sbt shell, run gen-idea
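
Once the project files are generated, it can be worth checking that the Spark dependency actually resolved. The following is a minimal sketch, not part of the original setup: the object name SparkSmokeTest, the file location src/main/scala/SparkSmokeTest.scala and the sample sentences are all assumptions made for illustration. It runs Spark in local mode, so no cluster is needed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Brings in the implicit conversions that add reduceByKey to RDDs of pairs
// (required on Spark 1.2).
import org.apache.spark.SparkContext._

object SparkSmokeTest {
  // Counts how often each whitespace-separated word occurs in the given lines.
  def countWords(sc: SparkContext, lines: Seq[String]): Map[String, Int] =
    sc.parallelize(lines)
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .collect()
      .toMap

  def main(args: Array[String]): Unit = {
    // local[2] runs Spark inside this JVM with two worker threads.
    val conf = new SparkConf().setAppName("SparkSmokeTest").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val counts = countWords(sc, Seq("spark with scala", "scala with sbt"))
      counts.toSeq.sortBy(_._1).foreach { case (word, n) => println(s"$word: $n") }
    } finally {
      sc.stop() // always release the local Spark context
    }
  }
}
```

Running sbt run from $PROJECT_DIR should print one line per distinct word, mixed in with Spark's own log output. If the build fails to compile here, the Scala and Spark versions in build.sbt are the first thing to recheck.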

You are all set for Scala Spark development using IntelliJ IDE. Have fun!

  • Helio Silva

    Have to change to

    scalaVersion := "2.11.0"

    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0"

    libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.2.0"

    //libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.2.0"

    resolvers ++= Seq("Akka Repository" at "http://repo.akka.io/releases/")

    resolvers ++= Seq("Spray Repository" at "http://repo.spray.cc/")

    • Suvir Jain

      Thanks Helio. Updated the post.

      • Ken

        updated? org.spark-project should read org.apache.spark

        • Suvir Jain

          Thanks for pointing that out. Fixed.
