Examples

This section shows example code snippets for working with this library.

Prerequisites

API Key

The lifecycle of an IBM Analytics Engine cluster (e.g. create, delete, status operations) is controlled through Cloud Foundry. This Python library requires an API key to work with the Cloud Foundry APIs. For more information on IBM Cloud API keys, including how to create and download one, see https://console.bluemix.net/docs/iam/userid_keys.html#userapikey

Installation

Ensure you have installed this library.

Install with:

pip install --upgrade git+https://github.com/snowch/ibm-analytics-engine-python@master

Logging

Log level is controlled with the environment variable LOG_LEVEL.

You may set it programmatically in your code:

import os

os.environ["LOG_LEVEL"] = "DEBUG"

Typical valid values are ERROR, WARNING, INFO, DEBUG. For a full list of values, see: https://docs.python.org/3/library/logging.html#logging-levels
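If you want to check how a level name maps to Python's numeric logging levels, the standard library can translate it for you. This sketch uses only the stdlib and assumes nothing about this library's own logger setup:

```python
import logging
import os

# Set the level name that the library reads from the environment
os.environ["LOG_LEVEL"] = "DEBUG"

# logging.getLevelName maps a level name to its numeric value
level = logging.getLevelName(os.environ["LOG_LEVEL"])
print(level)  # DEBUG corresponds to 10
```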

Finding your space guid

Many operations in this library require you to specify a space guid. You can list the space guids for your account using this example:

from ibm_analytics_engine.cf.client import CloudFoundryAPI

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
cf.print_orgs_and_spaces()

Alternatively, if you know your organisation name and space name, you can use the following:

from ibm_analytics_engine.cf.client import CloudFoundryAPI

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')

try:
    space_guid = cf.space_guid(org_name='your_org_name', space_name='your_space_name')
    print(space_guid)
except ValueError as e:
    # Space not found
    print(e)

Create Cluster

This example shows how to create a basic Spark cluster.

from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE, IAEServicePlanGuid

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')

space_guid = cf.space_guid(org_name='your_org_name', space_name='your_space_name')

iae = IAE(cf_client=cf)

cluster_instance_guid = iae.create_cluster(
    service_instance_name='SPARK_CLUSTER',
    service_plan_guid=IAEServicePlanGuid.LITE,
    space_guid=space_guid,
    cluster_creation_parameters={
        "hardware_config": "default",
        "num_compute_nodes": 1,
        "software_package": "ae-1.0-spark",
    }
)
print('>> IAE cluster instance id: {}'.format(cluster_instance_guid))

# This call blocks for several minutes.  See the Get Cluster Status example
# for alternative options.

status = iae.status(
    cluster_instance_guid=cluster_instance_guid,
    poll_while_in_progress=True)

print('>> Cluster status: {}'.format(status))

The above example creates a LITE cluster. See IAEServicePlanGuid for the available service plan guids.

Delete Cluster

from ibm_analytics_engine.cf.client import CloudFoundryAPI, CloudFoundryException
from ibm_analytics_engine import IAE

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')

iae = IAE(cf_client=cf)
try:
    iae.delete_cluster(
        cluster_instance_guid='12345-12345-12345-12345', 
        recursive=True)

    print('Cluster deleted.')
except CloudFoundryException as e:
    print('Unable to delete cluster: ' + str(e))
    

Get or Create Credentials

import json
from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
iae = IAE(cf_client=cf)

vcap_json = iae.get_or_create_credentials(cluster_instance_guid='12345-12345-12345-12345')

# prettify json
vcap_formatted = json.dumps(vcap_json, indent=4, separators=(',', ': '))

print(vcap_formatted)

To save the returned data to disk, you can do something like:

with open('./vcap.json', 'w') as vcap_file:
    vcap_file.write(vcap_formatted)
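To load the saved credentials back later, a round trip with the standard json module is enough. The dictionary below is a minimal stand-in for the real payload; the 'cluster' keys mirror those used by the notebook script later in this document:

```python
import json

# Minimal stand-in for the credentials returned by get_or_create_credentials
vcap_json = {'cluster': {'user': 'demo-user', 'password': 'demo-pass'}}

# Save to disk (as above) ...
with open('./vcap.json', 'w') as vcap_file:
    json.dump(vcap_json, vcap_file, indent=4, separators=(',', ': '))

# ... and load it back later
with open('./vcap.json') as vcap_file:
    loaded = json.load(vcap_file)

print(loaded['cluster']['user'])
```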

Get Cluster Status

To return the Cloud Foundry status:

import time
from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
iae = IAE(cf_client=cf)

while True:
    status = iae.status(cluster_instance_guid='12345-12345-12345-12345')
    if status in ('succeeded', 'failed'):
        break
    # Poll once a minute
    time.sleep(60)

print(status)
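The loop above polls forever if the cluster never reaches a final state. One way to bound it is a small helper that gives up after a fixed number of attempts. `poll_until` is a hypothetical helper sketched here, not part of this library:

```python
import time

def poll_until(check, accept=('succeeded', 'failed'),
               interval=60, max_attempts=30):
    """Call check() until it returns an accepted status or attempts run out.

    check would be something like:
        lambda: iae.status(cluster_instance_guid='12345-12345-12345-12345')
    """
    for _ in range(max_attempts):
        status = check()
        if status in accept:
            return status
        time.sleep(interval)
    raise TimeoutError('cluster did not reach a final state in time')
```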

Alternatively, you can ask the library itself to poll for the Cloud Foundry status. Note that this approach can block for many minutes while a cluster is being provisioned, and while it is blocked there is no progress output:

from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
iae = IAE(cf_client=cf)

status = iae.status(
    cluster_instance_guid='12345-12345-12345-12345',
    poll_while_in_progress=True)

print(status)

To return the Data Platform API status:

from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')

iae = IAE(cf_client=cf)

vcap = iae.get_or_create_credentials(cluster_instance_guid='12345-12345-12345-12345')

status = iae.dataplatform_status(vcap)

print(status)

List Clusters

from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE

cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')

space_guid = cf.space_guid(org_name='your_org_name', space_name='your_space_name')

iae = IAE(cf_client=cf)

for i in iae.clusters(space_guid=space_guid):
    print(i)

Jupyter Notebook Gateway

This is an example script for running a Jupyter notebook in Docker that connects to the cluster using the JNBG protocol and the credentials in your vcap.json file.

#!/bin/bash

export VCAP_STR="$(cat vcap.json)"

KG_HTTP_USER=$(python -c "import json, os; print(json.loads(os.environ['VCAP_STR'])['cluster']['user'])")
KG_HTTP_PASS=$(python -c "import json, os; print(json.loads(os.environ['VCAP_STR'])['cluster']['password'])")
KG_HTTP_URL=$(python -c "import json, os; print(json.loads(os.environ['VCAP_STR'])['cluster']['service_endpoints']['notebook_gateway'])")
KG_WS_URL=$(python -c "import json, os; print(json.loads(os.environ['VCAP_STR'])['cluster']['service_endpoints']['notebook_gateway_websocket'])")

# Create a directory for the notebooks so they don't disappear when the
# docker container shuts down
if [ ! -d notebooks ]
then
   mkdir notebooks
fi

docker run -it --rm \
	-v "$(pwd)/notebooks:/tmp/notebooks" \
	-e KG_HTTP_USER="$KG_HTTP_USER" \
	-e KG_HTTP_PASS="$KG_HTTP_PASS" \
	-e KG_URL="$KG_HTTP_URL" \
	-e KG_WS_URL="$KG_WS_URL" \
	-p 8888:8888 \
	biginsights/jupyter-nb-nb2kg

# Open a browser window to: http://127.0.0.1:8888