Examples¶
This section shows example code snippets for working with this library.
Prerequisites¶
API Key
The lifecycle of an IBM Analytics Engine cluster is controlled through Cloud Foundry (e.g. create, delete, and status operations). This Python library requires an API key to work with the Cloud Foundry APIs. For more information on IBM Cloud API keys, including how to create and download one, see https://console.bluemix.net/docs/iam/userid_keys.html#userapikey
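Before passing the key file to the library, it can be worth a quick sanity check. This is a hedged sketch, assuming the key file is a JSON document containing an apikey field (as in key files downloaded from the IBM Cloud console); check_api_key_file is a hypothetical helper, not part of the library:

```python
import json

def check_api_key_file(path):
    """Return True if the file parses as JSON and contains an 'apikey' field.

    The 'apikey' field name is an assumption based on key files downloaded
    from the IBM Cloud console; adjust if your file differs.
    """
    with open(path) as f:
        return 'apikey' in json.load(f)
```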
Installation
Ensure you have installed this library.
Install with:
pip install --upgrade git+https://github.com/snowch/ibm-analytics-engine-python@master
Logging¶
Log level is controlled with the environment variable LOG_LEVEL.
You may set it programmatically in your code:
import os
os.environ["LOG_LEVEL"] = "DEBUG"
Typical valid values are ERROR, WARNING, INFO, and DEBUG. For the full list of values, see: https://docs.python.org/3/library/logging.html#logging-levels
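These names are Python's standard logging levels; a quick sketch showing how each level name resolves to its numeric value in the logging module:

```python
import logging

# Standard level names resolve to numeric values via the logging module.
# DEBUG (10) is the most verbose of these; ERROR (40) the least.
for name, value in [("ERROR", 40), ("WARNING", 30), ("INFO", 20), ("DEBUG", 10)]:
    assert logging.getLevelName(name) == value
```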
Finding your space guid¶
Many operations in this library require you to specify a space GUID. You can list the space GUIDs for your account using this example:
from ibm_analytics_engine.cf.client import CloudFoundryAPI
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
cf.print_orgs_and_spaces()
Alternatively, if you know your organisation name and space name, you can look the GUID up directly:
from ibm_analytics_engine.cf.client import CloudFoundryAPI
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
try:
    space_guid = cf.space_guid(org_name='your_org_name', space_name='your_space_name')
    print(space_guid)
except ValueError as e:
    # Space not found
    print(e)
Create Cluster¶
This example shows how to create a basic Spark cluster.
from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE, IAEServicePlanGuid
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
space_guid = cf.space_guid(org_name='your_org_name', space_name='your_space_name')
iae = IAE(cf_client=cf)
cluster_instance_guid = iae.create_cluster(
    service_instance_name='SPARK_CLUSTER',
    service_plan_guid=IAEServicePlanGuid.LITE,
    space_guid=space_guid,
    cluster_creation_parameters={
        "hardware_config": "default",
        "num_compute_nodes": 1,
        "software_package": "ae-1.0-spark",
    }
)
print('>> IAE cluster instance id: {}'.format(cluster_instance_guid))
# This call blocks for several minutes. See the Get Cluster Status example
# for alternative options.
status = iae.status(
    cluster_instance_guid=cluster_instance_guid,
    poll_while_in_progress=True)
print('>> Cluster status: {}'.format(status))
The above example creates a LITE cluster. See IAEServicePlanGuid for the available service plan GUIDs.
Delete Cluster¶
from ibm_analytics_engine.cf.client import CloudFoundryAPI, CloudFoundryException
from ibm_analytics_engine import IAE
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
iae = IAE(cf_client=cf)
try:
    iae.delete_cluster(
        cluster_instance_guid='12345-12345-12345-12345',
        recursive=True)
    print('Cluster deleted.')
except CloudFoundryException as e:
    print('Unable to delete cluster: ' + str(e))
Get or Create Credentials¶
import json
from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
iae = IAE(cf_client=cf)
vcap_json = iae.get_or_create_credentials(cluster_instance_guid='12345-12345-12345-12345')
# pretty-print the returned JSON
vcap_formatted = json.dumps(vcap_json, indent=4, separators=(',', ': '))
print(vcap_formatted)
To save the returned data to disk, you can do something like:
with open('./vcap.json', 'w') as vcap_file:
    vcap_file.write(vcap_formatted)
Get Cluster Status¶
To return the Cloud Foundry status:
import time
from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
iae = IAE(cf_client=cf)
while True:
    status = iae.status(cluster_instance_guid='12345-12345-12345-12345')
    if status in ('succeeded', 'failed'):
        break
    time.sleep(60)
print(status)
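If you want an upper bound on how long a loop like the one above can run, you can cap the number of polling attempts. A minimal sketch; wait_for_status is a hypothetical helper, not part of the library:

```python
import time

def wait_for_status(get_status, poll_interval=60, max_attempts=30):
    """Poll get_status() until it returns 'succeeded' or 'failed'.

    get_status is any zero-argument callable, for example:
        lambda: iae.status(cluster_instance_guid='12345-12345-12345-12345')
    Returns the final status, or None if max_attempts is exhausted.
    """
    for _ in range(max_attempts):
        status = get_status()
        if status in ('succeeded', 'failed'):
            return status
        time.sleep(poll_interval)
    return None
```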
Alternatively, you can ask the library to poll for the Cloud Foundry status. Note that this call can block for many minutes while a cluster is being provisioned, and it produces no progress output while blocked:
from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
iae = IAE(cf_client=cf)
status = iae.status(
    cluster_instance_guid='12345-12345-12345-12345',
    poll_while_in_progress=True)
print(status)
To return the Data Platform API status:
from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
iae = IAE(cf_client=cf)
vcap = iae.get_or_create_credentials(cluster_instance_guid='12345-12345-12345-12345')
status = iae.dataplatform_status(vcap)
print(status)
List Clusters¶
from ibm_analytics_engine.cf.client import CloudFoundryAPI
from ibm_analytics_engine import IAE
cf = CloudFoundryAPI(api_key_filename='your_api_key_filename')
space_guid = cf.space_guid(org_name='your_org_name', space_name='your_space_name')
iae = IAE(cf_client=cf)
for cluster in iae.clusters(space_guid=space_guid):
    print(cluster)
Jupyter Notebook Gateway¶
This example script runs a Jupyter notebook in Docker that connects to the cluster over the Jupyter Notebook Gateway (JNBG) using the credentials in your vcap.json file.
#!/bin/bash
export VCAP_STR="$(cat vcap.json)"
KG_HTTP_USER=$(python -c "import json, os; print(json.loads(os.environ['VCAP_STR'])['cluster']['user'])")
KG_HTTP_PASS=$(python -c "import json, os; print(json.loads(os.environ['VCAP_STR'])['cluster']['password'])")
KG_HTTP_URL=$(python -c "import json, os; print(json.loads(os.environ['VCAP_STR'])['cluster']['service_endpoints']['notebook_gateway'])")
KG_WS_URL=$(python -c "import json, os; print(json.loads(os.environ['VCAP_STR'])['cluster']['service_endpoints']['notebook_gateway_websocket'])")
# Create a directory for the notebooks so they don't disappear when the Docker container shuts down
if [ ! -d notebooks ]
then
    mkdir notebooks
fi
docker run -it --rm \
    -v "$(pwd)/notebooks:/tmp/notebooks" \
    -e KG_HTTP_USER="$KG_HTTP_USER" \
    -e KG_HTTP_PASS="$KG_HTTP_PASS" \
    -e KG_URL="$KG_HTTP_URL" \
    -e KG_WS_URL="$KG_WS_URL" \
    -p 8888:8888 \
    biginsights/jupyter-nb-nb2kg
# Open a browser window to: http://127.0.0.1:8888
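The four shell one-liners in the script above can also be replaced by a small Python helper. This is a sketch assuming the vcap.json layout the script reads (cluster.user, cluster.password, and the two notebook gateway endpoints); jnbg_settings is a hypothetical helper, not part of the library:

```python
import json

def jnbg_settings(vcap_path):
    """Extract the notebook gateway settings used by the script above.

    Assumes the vcap.json layout referenced by the shell one-liners.
    """
    with open(vcap_path) as f:
        cluster = json.load(f)['cluster']
    return {
        'KG_HTTP_USER': cluster['user'],
        'KG_HTTP_PASS': cluster['password'],
        'KG_URL': cluster['service_endpoints']['notebook_gateway'],
        'KG_WS_URL': cluster['service_endpoints']['notebook_gateway_websocket'],
    }
```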