Vagrant allows the creation and configuration of lightweight, reproducible, and portable development environments.

Apache Spark™ is a fast and general engine for large-scale data processing.

This is a tutorial on how to install Spark on Ubuntu using Vagrant. You’ll need Vagrant and VirtualBox installed.

Initialise Vagrant

Create a working directory and initialise a Vagrantfile in it.

vagrant init

Replace the contents of the Vagrantfile with the configuration below. It creates a VM with 6 GB of RAM and 2 CPUs, and opens the VirtualBox window so the desktop can be used.

Vagrant.configure(2) do |config|
  # Ubuntu 14.04 (Trusty) 64-bit base box
  config.vm.box = "ubuntu/trusty64"
  # Run bootstrap.sh inside the VM when provisioning
  config.vm.provision :shell, path: "bootstrap.sh"
  config.vm.provider "virtualbox" do |vb|
    vb.gui = true      # show the VirtualBox window
    vb.memory = 6144   # RAM in MB (6 GB)
    vb.cpus = 2
    # Display settings for the desktop session
    vb.customize ["modifyvm", :id, "--vram", "128"]
    vb.customize ["modifyvm", :id, "--accelerate3d", "on"]
    vb.customize ["modifyvm", :id, "--graphicscontroller", "vboxvga"]
  end
end

Create bootstrap.sh, which installs the Ubuntu desktop, and save it in the working directory next to the Vagrantfile.
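
The script itself is not reproduced here, so the following is only a minimal sketch of what bootstrap.sh might contain. It assumes the stock ubuntu-desktop package is what you want, and it relies on Vagrant's shell provisioner running the script as root by default. Running the snippet from the working directory writes the file.

```shell
# Write a hypothetical bootstrap.sh into the current directory.
cat > bootstrap.sh <<'EOF'
#!/usr/bin/env bash
# Provisioning script (assumed contents): run as root by Vagrant inside the VM.
set -e

# Avoid interactive prompts during package installation.
export DEBIAN_FRONTEND=noninteractive

# Refresh package lists, then install the Ubuntu desktop.
apt-get update
apt-get install -y ubuntu-desktop
EOF
```

Installing the desktop pulls in a lot of packages, so expect the first provisioning run to take a while.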

Build the Vagrant image and run the VM.

vagrant up --provision

You should now have a working Ubuntu VM. You can log in as the vagrant user with the password ‘vagrant’.

Install Java 7
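
One way to do this inside the VM is to install OpenJDK 7 from the Ubuntu archive. This is a sketch under that assumption (the Oracle JDK would need a different route), and the function name install_java7 is just a hypothetical wrapper so the steps can be pasted into a terminal and run on demand.

```shell
# Sketch: install OpenJDK 7 from the Ubuntu package archive (run inside the VM).
install_java7() {
  sudo apt-get update
  sudo apt-get install -y openjdk-7-jdk
  java -version   # confirm which Java is now on the PATH
}
```

After pasting the function, run install_java7 to perform the installation.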

Install Scala
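
A simple option is the scala package from the Ubuntu archive, sketched below. Treat this as an assumption rather than the only route: the packaged version may be older than the Scala version your Spark release is built against, in which case a download from the Scala site is the alternative. The function name install_scala is hypothetical.

```shell
# Sketch: install Scala from the Ubuntu package archive (run inside the VM).
install_scala() {
  sudo apt-get install -y scala
  scala -version   # confirm the installed Scala version
}
```

Run install_scala after the Java step, since Scala needs a JVM.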

Install Spark
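
Spark can be installed by downloading a pre-built package and unpacking it. The sketch below assumes Spark 1.6.3 built for Hadoop 2.6 (a release that still runs on Java 7), fetched from the Apache archive, and unpacked under /opt. The version, the install location, and the install_spark function name are all illustrative choices, so adjust them to suit.

```shell
# Assumed release and install location; change these as needed.
SPARK_VERSION="1.6.3"
HADOOP_PROFILE="hadoop2.6"
SPARK_PKG="spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}"
SPARK_URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${SPARK_PKG}.tgz"
SPARK_HOME="/opt/spark"

# Sketch: download, unpack, and link Spark (run inside the VM).
install_spark() {
  wget -q "$SPARK_URL" -O "/tmp/${SPARK_PKG}.tgz"
  sudo tar -xzf "/tmp/${SPARK_PKG}.tgz" -C /opt
  # Stable path that survives version changes.
  sudo ln -sfn "/opt/${SPARK_PKG}" "$SPARK_HOME"
}
```

Once install_spark has run, starting "$SPARK_HOME/bin/spark-shell" should drop you into an interactive Spark session.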

VagrantSpark Repository

Alternatively, clone and install from my repo:

git clone https://github.com/bayesjumping/VagrantSpark.git
cd VagrantSpark
make up-provision