© V Kishore Ayyadevara 2018
V Kishore Ayyadevara, Pro Machine Learning Algorithms, https://doi.org/10.1007/978-1-4842-3564-5_14

14. Implementing Algorithms in the Cloud

V Kishore Ayyadevara, Hyderabad, Andhra Pradesh, India

Some tasks require an enormous amount of computation. This typically happens when a dataset is larger than the RAM available on a typical machine, or when the processing required on the data is itself very heavy.
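
As a rough illustration of the first case, the sketch below compares a dataset's size with a machine's RAM. The function name, the overhead factor of 3, and the example sizes are assumptions made for illustration only. The real in-memory footprint depends heavily on the file format and on the library used to load the data.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def fits_in_memory(dataset_bytes, ram_bytes, overhead_factor=3.0):
    """Return True if the dataset is likely to fit in RAM.

    overhead_factor is an assumed multiplier for the extra memory a
    dataset tends to need once it is loaded and processed.
    """
    return dataset_bytes * overhead_factor <= ram_bytes


# Hypothetical example: a 10 GiB file on a machine with 16 GiB of RAM.
print(fits_in_memory(10 * GIB, 16 * GIB))  # → False
# A 2 GiB file on the same machine.
print(fits_in_memory(2 * GIB, 16 * GIB))   # → True
```

When a check like this fails for the machines you have at hand, a larger cloud instance becomes worth considering.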

In such cases, it is a good idea to switch to cloud-based analysis, which lets you scale up to a machine with more RAM quickly. It also avoids purchasing additional RAM to resolve a problem that might occur only occasionally. However, cloud services cost money, and certain configurations are more costly than others. Choose a configuration carefully, and be disciplined about shutting the service down when you no longer need it.
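
To make the cost point concrete, here is a minimal sketch of the arithmetic. The hourly rate is a made-up number, not a real price from any provider, and the function name is hypothetical.

```python
def estimated_cost(hourly_rate, hours_per_day, days):
    """Total cost of running one instance, in the currency of hourly_rate."""
    return hourly_rate * hours_per_day * days


# Hypothetical rate of 0.50 per hour (not a real provider price).
rate = 0.50

# Instance used 4 hours a day and stopped afterward, for 30 days.
cost_when_stopped = estimated_cost(rate, hours_per_day=4, days=30)
# Same instance left running around the clock for 30 days.
cost_left_running = estimated_cost(rate, hours_per_day=24, days=30)

print(cost_when_stopped, cost_left_running)  # → 60.0 360.0
```

Under these assumed numbers, forgetting to stop the instance makes the month six times as expensive.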

The three major cloud providers are as follows:
  • Google Cloud Platform (GCP)

  • Microsoft Azure

  • Amazon Web Services (AWS)

In this chapter, we will work toward setting up a virtual machine on all three cloud platforms. Once the instance is set up, we will learn how to access Python and R in the cloud.

Google Cloud Platform

GCP can be accessed at https://cloud.google.com . Once you set up an account, you use the console to create a project. In the console, click Compute Engine and then click VM instances, as shown in Figure 14-1 (VM stands for virtual machine).
Figure 14-1. Selecting the VM option

Click Create to create a new VM instance. You will see a screen like Figure 14-2.
Figure 14-2. Options to create an instance

Depending on the dataset size, customize the machine type with the required cores, memory, and whether you need a GPU.

Next, select the operating system of your choice (Figure 14-3).
Figure 14-3. Selecting the OS

We will connect to the instance using PuTTY, an SSH client, with a key pair generated by its companion tool, PuTTYgen. You can download PuTTYgen from www.ssh.com/ssh/putty/windows/puttygen . Open the program and click Generate to generate a public/private key pair. The generated key is displayed as shown in Figure 14-4.
Figure 14-4. Generating a public/private key pair in PuTTYgen

Click “Save private key” to save the private key. Copy the public key at the top and paste it into the SSH Keys box on GCP, as shown in Figure 14-5.
Figure 14-5. Pasting in the key

Click Create to create the new instance. GCP also shows the IP address corresponding to the instance. Copy the IP address and paste it into the Host Name box in PuTTY.

Click SSH in the left pane, click Auth, and browse to the location where you saved the private key, as shown in Figure 14-6.
Figure 14-6. The Auth options

For the login name, enter the “Key comment” value that PuTTYgen showed when you generated the public and private keys (Figure 14-4). You should now be logged in to the Google Cloud machine.

Type python, and you should be able to run Python scripts.
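
As a quick first script to confirm that Python works on the instance, the following sketch prints a few facts about the machine. It is only an illustration: the function name is hypothetical, and the values printed depend on the machine type and operating system you selected.

```python
import multiprocessing
import platform
import sys


def describe_environment():
    """Collect basic facts about the machine the script is running on."""
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.system(),
        "cpu_count": multiprocessing.cpu_count(),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{}: {}".format(key, value))
```

Save it under a name of your choice (for example, check_env.py) and run it with python check_env.py. The reported CPU count should match the number of cores you chose when creating the instance.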

Be sure to delete the instance as soon as you are done with your work; otherwise, you may continue to be billed for it.

Microsoft Azure Cloud Platform

Creating a virtual machine instance in Microsoft Azure is very similar to the way it is done in GCP. Visit https://azure.microsoft.com and set up an account.

Log in, and then click “Virtual machines,” as shown in Figure 14-7.
Figure 14-7. Microsoft’s Virtual machines page

Click Add and then do the following:
  1. Select the machine needed. This example uses Ubuntu Server 16.04 LTS.

  2. Click the default Create button.

  3. Enter the machine-level details in the basic configuration settings.

  4. Select the size of virtual machine needed.

  5. Configure the optional features.

  6. Finally, create the instance.

Once the instance is created, the dashboard provides the IP address corresponding to the instance, as shown in Figure 14-8.
Figure 14-8. The IP address you need

Open PuTTY (see the previous section for more on downloading and launching it) and connect to the instance using the IP address, either by entering the password (if you selected the password option while creating the instance) or by using the private key.

Once connected, you can open Python in the same way as in the previous section on GCP.

Amazon Web Services

In this section, we will sign up with Amazon Web Services. Go to https://aws.amazon.com and create an account.

Click “Launch a virtual machine,” as shown in Figure 14-9.
Figure 14-9. Launching a virtual machine in AWS

On the next screen, click “Get started.” Name your instance and select the required attributes, as shown in Figure 14-10.
Figure 14-10. Setting up your instance

Download the .pem file and click “Create this instance.” Then click “Proceed to console.”

In the meantime,
  1. Open PuTTYgen.

  2. Load the .pem file.

  3. Convert it into a .ppk file.

  4. Save the private key, as shown in Figure 14-11.

Figure 14-11. Saving the private key

Go back to the AWS console, where the screen looks like Figure 14-12.
Figure 14-12. The AWS console

Click the Connect button. Note the example given in the pop-up that appears (Figure 14-13).
Figure 14-13. The example

In the highlighted part of Figure 14-13, the string after the @ is the host name. Copy it, open PuTTY, and paste the host name into the Host Name box in the PuTTY Configuration screen, as shown in Figure 14-14.
Figure 14-14. Adding the host name

Back in Figure 14-13, the word just before the @ is the username. In PuTTY, select “Data” in the panel on the left and type the username into the “Auto-login username” box, as shown in Figure 14-15.
Figure 14-15. Adding the username
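
The split described above, where the part before the @ is the username and the part after it is the host name, can be sketched in a few lines of Python. The function name and the example string are hypothetical. The real string is the one shown in the AWS pop-up.

```python
def split_ssh_target(target):
    """Split a 'user@host' string into (username, hostname)."""
    username, separator, hostname = target.rpartition("@")
    if not separator or not username or not hostname:
        raise ValueError("expected a string of the form user@host")
    return username, hostname


# Hypothetical connection string, used only to illustrate the split.
print(split_ssh_target("ubuntu@ec2-203-0-113-25.compute-1.amazonaws.com"))
# → ('ubuntu', 'ec2-203-0-113-25.compute-1.amazonaws.com')
```

The first element goes into the “Auto-login username” box and the second into the Host Name box.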

Click SSH to expand it, click Auth, and browse to the .ppk file created earlier. Click Open, as shown in Figure 14-16.
Figure 14-16. Setting the private key

Now you should be able to run Python on AWS.

Transferring Files to the Cloud Instance

You can transfer files from your local machine to your cloud instance in all three platforms using WinSCP. If you don’t already have it installed, download WinSCP from www.winscp.net and install it. Open WinSCP and you should see a login screen similar to Figure 14-17.
Figure 14-17. The WinSCP login screen

Enter the host name and username similar to how you entered them in PuTTY. In order to enter the .ppk file details, click the Advanced button.

Click Authentication in the SSH section and provide the location of the .ppk file, as shown in Figure 14-18.
Figure 14-18. Setting the private key

Click OK and then click Login.

Now you should be able to transfer files from your local machine to the virtual instance.

Another way to transfer files is to upload them into some other cloud storage (for example, Dropbox), obtain a link for the location of the file, and download it to the virtual instance.
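
The download step of this second approach might look like the following when run on the instance. This is a minimal sketch using only the Python standard library. The function name and the URL are placeholders, and it assumes the link points directly at the file rather than at a preview page.

```python
import shutil
import urllib.request


def download_file(url, destination):
    """Stream the content at url into the local file destination."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        # Copy in chunks so large files are not held in memory at once.
        shutil.copyfileobj(response, out)
    return destination


# Hypothetical usage on the instance (the URL is a placeholder):
# download_file("https://example.com/path/to/data.csv", "data.csv")
```

Streaming the response to disk keeps memory use low, which matters when the file is the large dataset that motivated the move to the cloud.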

Running Instance Jupyter Notebooks from Your Local Machine

You can work with Jupyter Notebooks that run on the instance from a browser on your local machine. To set this up, run the following commands on a Linux instance in any of GCP, AWS, or Azure:

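# Switch to a root shell (the remaining steps run as root)
sudo su
# Download and run the Anaconda installer
wget http://repo.continuum.io/archive/Anaconda3-4.1.1-Linux-x86_64.sh
bash Anaconda3-4.1.1-Linux-x86_64.sh
# Assumption: you let the installer add Anaconda to PATH in ~/.bashrc
source ~/.bashrc
# Write a default config file into ~/.jupyter/ and open it for editing
jupyter notebook --generate-config
vi ~/.jupyter/jupyter_notebook_config.py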

Press the I key to enter insert mode, and add the following code:

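c = get_config()
# Listen on all network interfaces so the notebook is reachable from outside the instance
c.NotebookApp.ip = '*'
# Do not try to launch a browser on the instance
c.NotebookApp.open_browser = False
# Serve the notebook on port 5000
c.NotebookApp.port = 5000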

Press Escape, type :wq, and press Enter.

Type the following:

sudo su
jupyter-notebook --no-browser --port=5000

Once the Jupyter Notebook server starts, go to a browser on the local machine and type the IP address of the virtual instance, followed by the port number, into the address bar (make sure that the firewall rules are configured to open port 5000). For example, if the IP address is 35.188.168.71, then type http://35.188.168.71:5000 into the browser’s address bar.

You should be able to see the Jupyter environment on your local machine that is connected to the virtual instance (Figure 14-19).
Figure 14-19. The Jupyter environment
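
If the page does not load, one thing worth checking from the local machine is whether port 5000 on the instance is reachable at all, since a closed firewall port is a common cause. The sketch below is one way to do that with the Python standard library. Both function names are hypothetical, and the IP address is the example value used above.

```python
import socket


def notebook_url(ip_address, port=5000):
    """Build the address to type into the local browser."""
    return "http://{}:{}".format(ip_address, port)


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print(notebook_url("35.188.168.71"))  # → http://35.188.168.71:5000

# Hypothetical check against your own instance:
# print(port_is_open("35.188.168.71", 5000))
```

A False result suggests that either the notebook server is not running or the firewall rule for port 5000 is missing.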

Installing R on the Instance

R does not come installed by default on the instance. You can install R in Linux as follows:

sudo apt-get update
sudo apt-get install r-base

Now type R in your terminal:

R

You should now be able to run R code in the virtual instance.

Summary

In this chapter, you learned the following:
  • How to set up and open a virtual instance on the three major cloud platforms

  • How to run Python/R on the three platforms

  • How to transfer a file into the cloud environment