Example notebook - Create Cluster

This example uses a Python library for working with an IBM Analytics Engine (IAE) instance.

IBM Analytics Engine Python Library: https://github.com/snowch/ibm-analytics-engine-python

In [ ]:
! pip install --quiet --upgrade git+https://github.com/snowch/ibm-analytics-engine-python@master
In [ ]:
from ibm_analytics_engine import CloudFoundryAPI
from ibm_analytics_engine import IAE, IAEServicePlanGuid, IAEClusterSpecificationExamples

We use an IBM Cloud API key to work with an IAE instance. You can create an API key using the bluemix CLI tools, e.g.

bluemix iam api-key-create My_IAE_Key -d "This is my IAE API key" -f my_api_key.json

Alternatively, follow these instructions to create an API key using the IBM Cloud web console and then save it in a secure location.

In [ ]:
cf = CloudFoundryAPI(api_key_filename='./my_api_key.json')

You aren’t restricted to just using an API key file. If you have the API key value, you can do this:

from getpass import getpass
api_key = getpass("Enter your api key: ")

cf = CloudFoundryAPI(api_key=api_key)
In [ ]:
# Provide your organization name and space name:

SPACE_GUID = cf.space_guid(org_name='my_org_name', space_name='my_space_name')
print(SPACE_GUID)

If you couldn’t find your space GUID, try printing out all your orgs and spaces:

cf.print_orgs_and_spaces()
In [ ]:
# We interact with the IBM Analytics Engine through the IAE class.
# Let's create an instance of it:

iae = IAE(cf_client=cf)
In [ ]:
# List the clusters in the space

iae.clusters(space_guid=SPACE_GUID)
In [ ]:
cluster_guid = iae.create_cluster(service_instance_name = 'MY_SPARK_CLUSTER',
                   service_plan_guid = IAEServicePlanGuid.LITE,
                   cluster_creation_parameters = {
                        "hardware_config": "default",
                        "num_compute_nodes": 1,
                        "software_package": "ae-1.0-spark",
                    },
                   space_guid = SPACE_GUID)

Alternative options for service_plan_guid:

  • IAEServicePlanGuid.STD_HOURLY
  • IAEServicePlanGuid.STD_MONTHLY

There are also some examples of cluster_creation_parameters in the IAEClusterSpecificationExamples class:

IAEClusterSpecificationExamples.SINGLE_NODE_BASIC_SPARK = {
    'num_compute_nodes': 1,
    'hardware_config': 'default',
    'software_package': 'ae-1.0-spark'
    }

and:

IAEClusterSpecificationExamples.SINGLE_NODE_BASIC_HADOOP = {
    'num_compute_nodes': 1,
    'hardware_config': 'default',
    'software_package': 'ae-1.0-hadoop-spark'
    }

These have been provided so you don’t have to remember the parameters for creating a default basic cluster.

You would use them like this:

iae.create_cluster(...,
    cluster_creation_parameters = IAEClusterSpecificationExamples.SINGLE_NODE_BASIC_SPARK,
    ...)
In [ ]:
# Poll the cluster until provisioning has finished

import time
while True:
    status = iae.status(cluster_instance_guid=cluster_guid)
    print(status)
    if status in ('succeeded', 'failed'): break
    time.sleep(60)
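If you want to guard against provisioning that never reaches a terminal state, you can wrap the polling in a helper with a timeout. This is a generic sketch, not part of the library: `wait_for_status` is a hypothetical helper that takes any status-returning callable.

```python
import time

def wait_for_status(get_status, poll_interval=60, timeout=3600):
    """Poll get_status() until it returns 'succeeded' or 'failed',
    or raise TimeoutError after `timeout` seconds."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = get_status()
        print(status)
        if status in ('succeeded', 'failed'):
            return status
        time.sleep(poll_interval)
    raise TimeoutError('Cluster did not reach a terminal state in time')

# Usage with the IAE client from this notebook (assumed names):
# final = wait_for_status(lambda: iae.status(cluster_instance_guid=cluster_guid))
```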

In [ ]:
# Only run this cell after the previous cell has finished with the status 'succeeded',
# otherwise you will receive an error trying to get or create the credentials.

import json

# get the credentials data for the cluster in vcap json format
vcap = iae.get_or_create_credentials(cluster_instance_guid=cluster_guid)

# print the credentials out
vcap_formatted = json.dumps(vcap, indent=4, separators=(',', ': '))
print(vcap_formatted)

# save the credentials to a file
with open('./vcap.json', 'w') as vcap_file:
    vcap_file.write(vcap_formatted)
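In a later session you can reload the saved credentials from vcap.json instead of calling get_or_create_credentials again. A minimal sketch (the nested keys shown in the example usage mirror the structure used in the next cell and may vary by service plan):

```python
import json

def load_vcap(path='./vcap.json'):
    """Read previously saved cluster credentials from a JSON file."""
    with open(path) as f:
        return json.load(f)

# Example usage (assuming the file was written by the cell above):
# vcap = load_vcap()
# print(vcap['cluster']['service_endpoints']['ambari_console'])
```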
In [ ]:
# Grab the Ambari console URL

print(vcap['cluster']['service_endpoints']['ambari_console'])
In [ ]:
# Delete the cluster.  recursive=True will delete service bindings, service keys,
# and routes associated with the service instance.

iae.delete_cluster(cluster_guid, recursive=True)