Data Science Installation
Data Science Toolbox is a virtual environment based on GNU/Linux that contains all the required command-line tools. Installing it on your local machine is the simplest method: the local version runs on top of VirtualBox and Vagrant, so it can be installed on Microsoft Windows, Linux, and Mac OS X.
Step 1: Download and Install VirtualBox
Open https://www.virtualbox.org/wiki/Downloads and download the VirtualBox binaries according to your operating system. Then open the binary and follow the instructions.
Step 2: Download and Install Vagrant
Open http://www.vagrantup.com/downloads.html and download the Vagrant binaries according to your operating system. Then open the binary and follow the instructions.
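Before moving on, it can help to confirm that both tools are reachable from the terminal. The following is a minimal sketch, assuming a POSIX shell (on Windows, something like Git Bash); check_tool is a hypothetical helper name introduced here for illustration, and VBoxManage is the command-line front end that ships with VirtualBox.

```shell
#!/bin/sh
# check_tool is a hypothetical helper (not part of Vagrant or VirtualBox).
# It reports whether a command can be found on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
        return 0
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# "|| true" keeps the script going so that both results are printed.
check_tool VBoxManage || true
check_tool vagrant || true
```

If either tool is reported as not found, re-running its installer or opening a fresh terminal (so that PATH is reloaded) is usually the first thing to try.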
Step 3: Download and Start the Data Science Toolbox
Open a terminal (known as the Command Prompt in Windows), create a directory such as IntellipaatDataScienceToolbox, and navigate to it by typing:
$ mkdir IntellipaatDataScienceToolbox
$ cd IntellipaatDataScienceToolbox
To initialize the Data Science Toolbox, run the following command:
$ vagrant init data-science-toolbox/data-science-at-the-command-line
This creates a file named Vagrantfile, a configuration file that tells Vagrant how to launch the virtual machine.
Example: Minimal configuration for Vagrant
Vagrant.configure(2) do |config|
  config.vm.box = "data-science-toolbox/data-science-at-the-command-line"
end
Run the following command to download and boot the Data Science Toolbox:
$ vagrant up
If you see the message default: Warning: Connection timeout. Retrying... printed repeatedly, the virtual machine is possibly waiting for input. This can happen when the virtual machine was not shut down correctly.
To find out what is wrong, add the following lines to the Vagrantfile before the last end statement:
  config.vm.provider "virtualbox" do |vb|
    vb.gui = true
  end
This opens a window showing the virtual machine's screen. Once the virtual machine has booted and you have identified the problem, you can remove these lines from the Vagrantfile. The username and password are both vagrant.
Example: Configuring Vagrant
Vagrant.require_version ">= 1.5.0"                               # (1)
Vagrant.configure(2) do |config|
  config.vm.box = "data-science-toolbox/data-science-at-the-command-line"
  config.vm.network "forwarded_port", guest: 8000, host: 8000    # (2)
  config.vm.provider "virtualbox" do |vb|
    vb.gui = true                                                # (3)
    vb.memory = 2048                                             # (4)
    vb.cpus = 2                                                  # (5)
  end
end
(1) Require at least version 1.5.0 of Vagrant.
(2) Forward port 8000.
(3) Launch a graphical user interface.
(4) Use 2 GB of memory.
(5) Use 2 CPUs.
Step 4: Log in (on Mac OS X and Linux)
If your operating system is Linux or Mac OS X, you can log in to the Data Science Toolbox by running the following command:
$ vagrant ssh
Step 5: Log in (on Windows)
If you are using Windows, you need to either run Vagrant with a GUI (see the vb.gui setting in Step 3) or use a third-party application to log in to the Data Science Toolbox. PuTTY is one such application.
Download PuTTY from http://www.putty.org/, run it, and enter the following information:
- Host Name (or IP address): 127.0.0.1
- Port: 2222
- Connection type: SSH
You can save these values with the Save button so that you do not need to enter them again. Then click the Open button and enter vagrant as both the username and the password.
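Port 2222 is Vagrant's usual default for the forwarded SSH port, but Vagrant may pick a different host port if 2222 is already in use. The vagrant ssh-config command prints the connection settings actually in effect. The sketch below, assuming a POSIX shell such as Git Bash on Windows, pulls out the host and port to enter in PuTTY; ssh_endpoint is a hypothetical helper name, not a Vagrant command.

```shell
#!/bin/sh
# ssh_endpoint is a hypothetical helper: it reads OpenSSH-style config text
# on stdin and prints "<HostName> <Port>" on one line.
ssh_endpoint() {
    awk '$1 == "HostName" { host = $2 }
         $1 == "Port"     { port = $2 }
         END              { print host, port }'
}

# Only query Vagrant when it is installed and we are in the project directory.
if command -v vagrant >/dev/null 2>&1 && [ -f Vagrantfile ]; then
    vagrant ssh-config | ssh_endpoint
fi
```

Running vagrant ssh-config on its own from the Command Prompt shows the same values without any parsing.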
Step 6: Shut Down or Start Anew
To shut down the Data Science Toolbox, run the following command from the same directory in which you ran vagrant up:
$ vagrant halt
If you want to get rid of the Data Science Toolbox and start over, run the following command:
$ vagrant destroy
And then return to Step 3 to set up the Data Science Toolbox again.
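To confirm that the machine really was halted or destroyed, vagrant status reports its current state. The following is a rough sketch, assuming a POSIX shell and assuming that the --machine-readable output is comma-separated as timestamp,target,type,data; machine_state is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# machine_state is a hypothetical helper: it reads the output of
# `vagrant status --machine-readable` on stdin and prints the value of each
# "state" record (assumed layout: timestamp,target,type,data).
machine_state() {
    awk -F, '$3 == "state" { print $4 }'
}

# Only query Vagrant when it is installed and we are in the project directory.
if command -v vagrant >/dev/null 2>&1 && [ -f Vagrantfile ]; then
    vagrant status --machine-readable | machine_state
fi
```

Plain vagrant status gives the same information in human-readable form, which is usually enough for a quick check.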