# Integrate SageMaker Model with Watson Open Scale

Contents
- Setup
- Binding machine learning engine
- Subscriptions
- Performance monitor, scoring and payload logging
- Quality monitor and feedback logging
- Fairness, Drift monitoring and explanations

### Required packages installation

In [1]:
!pip install sagemaker --no-cache | tail -n 1
!pip install --upgrade ibm-watson-openscale --no-cache | tail -n 1
!pip install --upgrade boto3 --no-cache | tail -n 1
!pip install -U pandas==1.2.5 | tail -n 1

Successfully installed boto3-1.26.60 botocore-1.29.60 contextlib2-21.6.0 dill-0.3.6 multiprocess-0.70.14 pathos-0.3.0 pox-0.3.2 ppft-1.7.6.6 protobuf3-to-dict-0.1.5 sagemaker-2.131.0 schema-0.7.5 smdebug_rulesconfig-1.0.1
Successfully installed ibm-watson-openscale-3.0.27
Successfully installed pandas-1.2.5


### Model creation using [Amazon SageMaker](https://aws.amazon.com/sagemaker/)

- Run the notebook ( RI-SageMaker-Deploy-Wstudio.ipynb ) to train a SageMaker model and create deployment endpoint for online inference

### Update the below cell with IBM DB2 credentials like DATABASE, HOSTNAME, USERNAME, PASSWORD & PORT. 

In [2]:
SCHEMA_NAME='LAB-1-DATA'
TABLE_NAME='DATA-LAB-1-WOS'
DATABASE=''
HOSTNAME=''
USERNAME=''
PASSWORD=''
PORT=

In [3]:
!wget "https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM.csv"

--2023-01-31 07:42:27--  https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM.csv
Resolving ibm-aws-immersion-day.s3.us-east-2.amazonaws.com (ibm-aws-immersion-day.s3.us-east-2.amazonaws.com)... 52.219.97.74
Connecting to ibm-aws-immersion-day.s3.us-east-2.amazonaws.com (ibm-aws-immersion-day.s3.us-east-2.amazonaws.com)|52.219.97.74|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 36731 (36K) [text/csv]
Saving to: ‘Data-region-RI-SM.csv’


2023-01-31 07:42:27 (45.3 MB/s) - ‘Data-region-RI-SM.csv’ saved [36731/36731]



### Update the URL along with username & password for Cloud Pak for Data

In [4]:
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator,CloudPakForDataAuthenticator
from ibm_watson_openscale import APIClient

service_credentials = {
                "url": "",
                "username": "",
                "password": ""
                }

authenticator = CloudPakForDataAuthenticator(
        url=service_credentials['url'],
        username=service_credentials['username'],
        password=service_credentials['password'],
        disable_ssl_verification=True
    )

In [5]:
from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *

import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

wos_client = APIClient(
    service_url=service_credentials['url'],
    authenticator=authenticator
)
wos_client.version

'3.0.26'

Create schema for data mart.

In [6]:
wos_client.data_marts.show()

0,1,2,3,4,5
,,False,active,2022-12-19 11:00:56.742000+00:00,00000000-0000-0000-0000-000000000000


In [7]:
data_marts = wos_client.data_marts.list().result.data_marts
if len(data_marts) == 0:
    if DB_CREDENTIALS is not None:
        if SCHEMA_NAME is None: 
            print("Please specify the SCHEMA_NAME and rerun the cell")

        print('Setting up external datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS SGM Data Mart",
                description="Data Mart created for WOS SGM",
                database_configuration=DatabaseConfigurationRequest(
                  database_type=DatabaseType.POSTGRESQL, # For DB2 use DatabaseType.DB2
                    credentials=PrimaryStorageCredentialsLong(
                        hostname=DB_CREDENTIALS['hostname'],
                        username=DB_CREDENTIALS['username'],
                        password=DB_CREDENTIALS['password'],
                        db=DB_CREDENTIALS['database'],
                        port=DB_CREDENTIALS['port'],
                        ssl=False,
                        certificate_base64=False
                    ),
                    location=LocationSchemaName(
                        schema_name= SCHEMA_NAME
                    )
                )
             ).result
    else:
        print('Setting up internal datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS SGM Data Mart",
                description="Data Mart created for WOS SGM", 
                internal_database = True).result
        
    data_mart_id = added_data_mart_result.metadata.id
    
else:
    data_mart_id=data_marts[0].metadata.id
    print('Using existing datamart {}'.format(data_mart_id))

Using existing datamart 00000000-0000-0000-0000-000000000000


In [8]:
wos_client.data_marts.get(data_mart_id).result.to_dict()

{'metadata': {'id': '00000000-0000-0000-0000-000000000000',
  'crn': 'crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:data_mart:00000000-0000-0000-0000-000000000000',
  'url': '/v2/data_marts/00000000-0000-0000-0000-000000000000',
  'created_at': '2022-12-19T11:00:56.742000Z',
  'created_by': 'admin'},
 'entity': {'service_instance_crn': 'N/A',
  'internal_database': False,
  'database_configuration': {'database_type': 'db2',
   'instance_id': 'ip-10-0-22-85.us-east-2.compute.internal',
   'name': 'manual-ip-10-0-22-85.us-east-2.compute.internal',
   'credentials': {'secret_id': '782809e9-c182-499f-a922-2d9208968770'},
   'location': {'schema_name': 'LAB-1-DATA'}},
  'status': {'state': 'active'}}}

<a id="binding"></a>
## 2. Bind machine learning engines

### Bind  `SageMaker` machine learning engine

#### Provide SageMaker credentials using following fields:
- `access_key_id`
- `secret_access_key`
- `region`

In [9]:
SAGEMAKER_ENGINE_CREDENTIALS = {
                   'access_key_id': '', 
                   'secret_access_key': '', 
                   'region': 'us-east-2'}

In [10]:
SERVICE_PROVIDER_NAME = "AWS SGM Machine Learning"
SERVICE_PROVIDER_DESCRIPTION = "Added by AWS IBM Integration"

In [11]:
service_providers = wos_client.service_providers.list().result.service_providers
for service_provider in service_providers:
    service_instance_name = service_provider.entity.name
    if service_instance_name == SERVICE_PROVIDER_NAME:
        service_provider_id = service_provider.metadata.id
        wos_client.service_providers.delete(service_provider_id)
        print("Deleted existing service_provider for WML instance: {}".format(service_provider_id))

Deleted existing service_provider for WML instance: a164cb69-145c-4015-9d4e-7bc5d84fd20c


In [12]:
added_service_provider_result=wos_client.service_providers.add(
        name=SERVICE_PROVIDER_NAME,
        description="AWS Service Provider",
        service_type=ServiceTypes.AMAZON_SAGEMAKER,
        credentials=SageMakerCredentials(
            access_key_id=SAGEMAKER_ENGINE_CREDENTIALS['access_key_id'],
            secret_access_key=SAGEMAKER_ENGINE_CREDENTIALS['secret_access_key'],
            region=SAGEMAKER_ENGINE_CREDENTIALS['region']
        ),
        background_mode=False
    ).result



service_provider_id = added_service_provider_result.metadata.id
print("Service Provider id ", service_provider_id)




 Waiting for end of adding service provider b426a9dc-b15b-420c-99bf-1b509b40f5ae 




active

-----------------------------------------------
 Successfully finished adding service provider 
-----------------------------------------------


Service Provider id  b426a9dc-b15b-420c-99bf-1b509b40f5ae


In [13]:
wos_client.service_providers.show()

0,1,2,3,4,5
,active,AWS SGM Machine Learning,amazon_sagemaker,2023-01-31 07:42:29.457000+00:00,b426a9dc-b15b-420c-99bf-1b509b40f5ae


In [14]:
asset_deployment_details = wos_client.service_providers.list_assets(data_mart_id=data_mart_id, service_provider_id=service_provider_id).result
asset_deployment_details

{'resources': [{'metadata': {'guid': 'f87b94db698b412f3953cb42f15dbbaa',
    'url': 'linear-learner-2023-01-16-06-26-51-232',
    'created_at': '2023-01-16T06:26:51.872Z',
    'modified_at': '2023-01-16T06:30:35.707Z'},
   'entity': {'name': 'linear-learner-2023-01-16-06-26-51-232',
    'deployment_rn': 'arn:aws:sagemaker:us-east-2:481118440516:endpoint/linear-learner-2023-01-16-06-26-51-232',
    'type': 'online',
    'scoring_endpoint': {'url': 'linear-learner-2023-01-16-06-26-51-232'},
    'asset': {'asset_id': 'f87b94db698b412f3953cb42f15dbbaa',
     'asset_rn': 'arn:aws:sagemaker:us-east-2:481118440516:model/linear-learner-2023-01-16-06-26-51-232',
     'url': 's3://sagemaker-us-east-2-481118440516/linear-learner-2023-01-16-06-22-37-086/output/model.tar.gz',
     'name': 'linear-learner-2023-01-16-06-26-51-232',
     'asset_type': 'model',
     'created_at': '2023-01-16T06:26:51.552Z'},
    'asset_properties': {'asset_revision': '1673850635707'}}}],
 'count': 1}

In [15]:
deployment_id='' # 'resources': [{'metadata': {'guid' value from previous cell will be deployment_id.
for model_asset_details in asset_deployment_details['resources']:
    if model_asset_details['metadata']['guid']==deployment_id:
        break
model_asset_details

{'metadata': {'guid': 'f87b94db698b412f3953cb42f15dbbaa',
  'url': 'linear-learner-2023-01-16-06-26-51-232',
  'created_at': '2023-01-16T06:26:51.872Z',
  'modified_at': '2023-01-16T06:30:35.707Z'},
 'entity': {'name': 'linear-learner-2023-01-16-06-26-51-232',
  'deployment_rn': 'arn:aws:sagemaker:us-east-2:481118440516:endpoint/linear-learner-2023-01-16-06-26-51-232',
  'type': 'online',
  'scoring_endpoint': {'url': 'linear-learner-2023-01-16-06-26-51-232'},
  'asset': {'asset_id': 'f87b94db698b412f3953cb42f15dbbaa',
   'asset_rn': 'arn:aws:sagemaker:us-east-2:481118440516:model/linear-learner-2023-01-16-06-26-51-232',
   'url': 's3://sagemaker-us-east-2-481118440516/linear-learner-2023-01-16-06-22-37-086/output/model.tar.gz',
   'name': 'linear-learner-2023-01-16-06-26-51-232',
   'asset_type': 'model',
   'created_at': '2023-01-16T06:26:51.552Z'},
  'asset_properties': {'asset_revision': '1673850635707'}}}

<a id="subsciption"></a>
## 3. Subscriptions

### Add subscriptions

List available deployments.

**Note:** Depending on number of assets it may take some time.

In [16]:
aws_asset = Asset(
        asset_id=model_asset_details['entity']['asset']['asset_id'],
        name=model_asset_details['entity']['asset']['name'],
        url=model_asset_details['entity']['asset']['url'],
        asset_type=model_asset_details['entity']['asset']['asset_type'] if 'asset_type' in model_asset_details['entity']['asset'] else 'model',
        problem_type=ProblemType.MULTICLASS_CLASSIFICATION,
        input_data_type=InputDataType.STRUCTURED,
    )

In [17]:
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import ScoringEndpointRequest
deployment_scoring_endpoint = model_asset_details['entity']['scoring_endpoint']
scoring_endpoint = ScoringEndpointRequest(url = model_asset_details['entity']['scoring_endpoint']['url'] )

In [18]:
deployment = AssetDeploymentRequest(
        deployment_id=model_asset_details['metadata']['guid'],
        url=model_asset_details['metadata']['url'],
        name=model_asset_details['entity']['name'],
        deployment_type=model_asset_details['entity']['type'],
        scoring_endpoint =  scoring_endpoint
    )

In [19]:
training_data_reference = TrainingDataReference(type=TrainingDataReferenceType.DB2,
                      location=DB2TrainingDataReferenceLocation(
                      schema_name=SCHEMA_NAME,
                      table_name=TABLE_NAME),
                      connection=DB2TrainingDataReferenceConnection(
                          hostname = HOSTNAME,
                          username=USERNAME,
                          password=PASSWORD,
                          database_name=DATABASE,
                          port=PORT
                      )
                    )

In [20]:
feature_columns = ['REGION','TOTAL_CASES']
categorical_columns = ['REGION']

In [21]:
asset_properties = AssetPropertiesRequest(
        label_column="RISK_INDEX",
        prediction_field='predicted_label',
        probability_fields=['score'],
        training_data_reference=training_data_reference,
        training_data_schema=None,
        input_data_schema=None,
        output_data_schema=None,
        feature_fields=feature_columns,
        categorical_fields=categorical_columns
    )

In [22]:
subscription_details = wos_client.subscriptions.add(
        data_mart_id=data_mart_id,
        service_provider_id=service_provider_id,
        asset=aws_asset,
        deployment=deployment,
        asset_properties=asset_properties,
        background_mode=False
).result
subscription_id = subscription_details.metadata.id
subscription_id




 Waiting for end of adding subscription f26582b5-3112-4f0f-b00b-64cc07d0f6fa 




preparing
active

-------------------------------------------
 Successfully finished adding subscription 
-------------------------------------------




'f26582b5-3112-4f0f-b00b-64cc07d0f6fa'

#### List subscriptions

In [23]:
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8
f87b94db698b412f3953cb42f15dbbaa,linear-learner-2023-01-16-06-26-51-232,00000000-0000-0000-0000-000000000000,f87b94db698b412f3953cb42f15dbbaa,linear-learner-2023-01-16-06-26-51-232,b426a9dc-b15b-420c-99bf-1b509b40f5ae,active,2023-01-31 07:42:35.201000+00:00,f26582b5-3112-4f0f-b00b-64cc07d0f6fa


<a id="scoring"></a>
## 4. Performance metrics, scoring and payload logging

### Score the risk index model and measure response time

In [24]:
import requests
import time
import json
import boto3

In [25]:
subscription_details=wos_client.subscriptions.get(subscription_id).result.to_dict()
subscription_details

{'metadata': {'id': 'f26582b5-3112-4f0f-b00b-64cc07d0f6fa',
  'crn': 'crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:subscription:f26582b5-3112-4f0f-b00b-64cc07d0f6fa',
  'url': '/v2/subscriptions/f26582b5-3112-4f0f-b00b-64cc07d0f6fa',
  'created_at': '2023-01-31T07:42:35.201000Z',
  'created_by': 'admin'},
 'entity': {'data_mart_id': '00000000-0000-0000-0000-000000000000',
  'service_provider_id': 'b426a9dc-b15b-420c-99bf-1b509b40f5ae',
  'asset': {'asset_id': 'f87b94db698b412f3953cb42f15dbbaa',
   'url': 's3://sagemaker-us-east-2-481118440516/linear-learner-2023-01-16-06-22-37-086/output/model.tar.gz',
   'name': 'linear-learner-2023-01-16-06-26-51-232',
   'asset_type': 'model',
   'problem_type': 'multiclass',
   'input_data_type': 'structured'},
  'asset_properties': {'training_data_reference': {'secret_id': '3271ffbe-4294-47fa-b34f-f109bb08d923'},
   'output_data_schema': {'type': 'struct',
    'fields': [{'metadata': {'columnInfo': {'columnL

In [26]:
endpoint_name = subscription_details['entity']['deployment']['name']

payload = "0,100"

In [27]:
runtime = boto3.client('sagemaker-runtime',
                       region_name=SAGEMAKER_ENGINE_CREDENTIALS['region'],
                       aws_access_key_id=SAGEMAKER_ENGINE_CREDENTIALS['access_key_id'],
                       aws_secret_access_key=SAGEMAKER_ENGINE_CREDENTIALS['secret_access_key'])

start_time = time.time()
response = runtime.invoke_endpoint(EndpointName=endpoint_name, ContentType='text/csv', Body=payload)
response_time = int((time.time() - start_time)*1000)
result = json.loads(response['Body'].read().decode())

print(json.dumps(result, indent=2))

{
  "predictions": [
    {
      "score": [
        0.8418549299240112,
        0.023631207644939423,
        0.13451385498046875
      ],
      "predicted_label": 0
    }
  ]
}


### Store the request and response in payload logging table

#### Transform the model's input and output to the format compatible with OpenScale standard.

In [28]:
import time

time.sleep(5)
payload_data_set_id = None
payload_data_set_id = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, 
                                                target_target_id=subscription_id, 
                                                target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets[0].metadata.id
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id: ", payload_data_set_id)

Payload data set id:  bc1d0f0a-877d-422d-97d1-89a455ad0aa7


In [29]:
values = [float(s) for s in payload.split(',')]

request_data = {'fields': feature_columns, 
                'values': values}

response_data = {'fields': list(result['predictions'][0]),
                 'values': [list(x.values()) for x in result['predictions']]}

#### Store the payload using Python SDK

**Hint:** You can embed payload logging code into your custom deployment so it is logged automatically each time you score the model.

In [30]:
import uuid
from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

print("Performing explicit payload logging.....")
wos_client.data_sets.store_records(data_set_id=payload_data_set_id, background_mode=False,request_body=[PayloadRecord(
           scoring_id=str(uuid.uuid4()),
           request=request_data,
           response=response_data,
           response_time=460
)])
time.sleep(5)
pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)
print("Number of records in the payload logging table: {}".format(pl_records_count))

Performing explicit payload logging.....



 Waiting for end of storing records with request id: 1f2cb328-d81e-4141-b02a-0d35c41871e1 




pending
active

---------------------------------------
 Successfully finished storing records 
---------------------------------------


Number of records in the payload logging table: 1


In [31]:
wos_client.data_sets.show_records(data_set_id=payload_data_set_id)

0,1,2,3,4,5,6,7,8
,c3594b77-285a-44f3-8a08-7eca514af7d3-1,0,2023-01-31T07:42:50.797Z,0.8418549299240112,"[0.8418549299240112, 0.023631207644939423, 0.13451385498046875]",100.0,0.0,f87b94db698b412f3953cb42f15dbbaa


<a id="feedback"></a>
## 5. Feedback logging & quality (accuracy) monitoring

### Enable quality monitoring

You need to provide the monitoring `threshold` and `min_records` (minimal number of feedback records).

In [32]:
import time

time.sleep(10)
target = Target(
        target_type=TargetTypes.SUBSCRIPTION,
        target_id=subscription_id
)
parameters = {
    "min_feedback_data_size": 10
}
thresholds = [
                {
                    "metric_id": "area_under_roc",
                    "type": "lower_limit",
                    "value": .80
                }
            ]
quality_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,
    target=target,
    parameters=parameters,
    thresholds=thresholds
).result




 Waiting for end of monitor instance creation 1a4119ad-12f6-4ce9-a2ae-b14048a26656 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




In [33]:
quality_monitor_instance_id = quality_monitor_details.metadata.id
quality_monitor_instance_id

'1a4119ad-12f6-4ce9-a2ae-b14048a26656'

### Feedback records logging

Feedback records are used to evaluate your model. The predicted values are compared to real values (feedback records).

You can check the schema of feedback table using below method.

In [34]:
feedback_dataset_id = None
feedback_dataset = wos_client.data_sets.list(type=DataSetTypes.FEEDBACK, 
                                                target_target_id=subscription_id, 
                                                target_target_type=TargetTypes.SUBSCRIPTION).result
feedback_dataset_id = feedback_dataset.data_sets[0].metadata.id
if feedback_dataset_id is None:
    print("Feedback data set not found. Please check quality monitor status.")
feedback_dataset_id

'4c12c48a-e4ad-47d1-83bc-b6af0c5a7099'

The feedback records can be send to feedback table using below code.

In [35]:
import requests
import pandas as pd
import numpy as np
import time

time.sleep(10) #It gives enough time for dataset creation

data = pd.read_csv('https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM-Feedback.csv',header=0,dtype=np.float)
feedback_columns = data.columns.tolist()
feedback_records = data.values.tolist()

payload_scoring =  [{"fields": feedback_columns, "values": feedback_records}]
wos_client.data_sets.store_records(feedback_dataset_id, request_body=payload_scoring, background_mode=False)

Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  data = pd.read_csv('https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM-Feedback.csv',header=0,dtype=np.float)





 Waiting for end of storing records with request id: 280946b0-94e3-4e9c-b266-12c199504f33 




active

---------------------------------------
 Successfully finished storing records 
---------------------------------------




<ibm_cloud_sdk_core.detailed_response.DetailedResponse at 0x7f71759a9750>

In [36]:
wos_client.data_sets.print_records_schema(data_set_id=feedback_dataset_id)

0,1,2
REGION,double,True
TOTAL_CASES,double,True
RISK_INDEX,double,True
record_id,string,False
record_timestamp,timestamp,False
transaction_id,string,True
_original_prediction,integer,True
_original_probability,"{'containsNull': True, 'elementType': 'double', 'type': 'array'}",True
_debiased_prediction,integer,True
_debiased_probability,"{'containsNull': True, 'elementType': 'double', 'type': 'array'}",True


In [37]:
wos_client.data_sets.get_records_count(data_set_id=feedback_dataset_id)

12

In [38]:
run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_instance_id, background_mode=False).result




 Waiting for end of monitoring run ae938061-3a37-4c27-8b03-adc7c9239110 




running
finished

---------------------------
 Successfully finished run 
---------------------------




In [39]:
time.sleep(5)
wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2023-01-31 07:43:40.257000+00:00,accuracy,90ef22f8-af95-41a7-9552-6b048ddc3b74,0.8333333333333334,,,['model_type:original'],quality,1a4119ad-12f6-4ce9-a2ae-b14048a26656,ae938061-3a37-4c27-8b03-adc7c9239110,subscription,f26582b5-3112-4f0f-b00b-64cc07d0f6fa
2023-01-31 07:43:40.257000+00:00,weighted_true_positive_rate,90ef22f8-af95-41a7-9552-6b048ddc3b74,0.8333333333333333,,,['model_type:original'],quality,1a4119ad-12f6-4ce9-a2ae-b14048a26656,ae938061-3a37-4c27-8b03-adc7c9239110,subscription,f26582b5-3112-4f0f-b00b-64cc07d0f6fa
2023-01-31 07:43:40.257000+00:00,weighted_precision,90ef22f8-af95-41a7-9552-6b048ddc3b74,0.888888888888889,,,['model_type:original'],quality,1a4119ad-12f6-4ce9-a2ae-b14048a26656,ae938061-3a37-4c27-8b03-adc7c9239110,subscription,f26582b5-3112-4f0f-b00b-64cc07d0f6fa
2023-01-31 07:43:40.257000+00:00,log_loss,90ef22f8-af95-41a7-9552-6b048ddc3b74,0.7617815096623222,,,['model_type:original'],quality,1a4119ad-12f6-4ce9-a2ae-b14048a26656,ae938061-3a37-4c27-8b03-adc7c9239110,subscription,f26582b5-3112-4f0f-b00b-64cc07d0f6fa
2023-01-31 07:43:40.257000+00:00,weighted_recall,90ef22f8-af95-41a7-9552-6b048ddc3b74,0.8333333333333333,,,['model_type:original'],quality,1a4119ad-12f6-4ce9-a2ae-b14048a26656,ae938061-3a37-4c27-8b03-adc7c9239110,subscription,f26582b5-3112-4f0f-b00b-64cc07d0f6fa
2023-01-31 07:43:40.257000+00:00,weighted_f_measure,90ef22f8-af95-41a7-9552-6b048ddc3b74,0.837037037037037,,,['model_type:original'],quality,1a4119ad-12f6-4ce9-a2ae-b14048a26656,ae938061-3a37-4c27-8b03-adc7c9239110,subscription,f26582b5-3112-4f0f-b00b-64cc07d0f6fa
2023-01-31 07:43:40.257000+00:00,weighted_false_positive_rate,90ef22f8-af95-41a7-9552-6b048ddc3b74,0.0833333333333333,,,['model_type:original'],quality,1a4119ad-12f6-4ce9-a2ae-b14048a26656,ae938061-3a37-4c27-8b03-adc7c9239110,subscription,f26582b5-3112-4f0f-b00b-64cc07d0f6fa


<a id="datamart"></a>
## 6. Get the logged data

### Payload logging

#### Print schema of payload_logging table

In [40]:
wos_client.data_sets.print_records_schema(data_set_id=payload_data_set_id)

0,1,2
scoring_id,string,False
scoring_timestamp,timestamp,False
deployment_id,string,False
asset_revision,string,True
REGION,double,True
TOTAL_CASES,double,True
score,"{'containsNull': True, 'elementType': 'double', 'type': 'array'}",True
predicted_label,integer,True
prediction_probability,double,True


<a id="fairness_and_explainability"></a>
## 7. Fairness, Drift monitoring and explanations

### Get payload data

In [41]:
scoring_data_filename='Data-region-RI-SM-Scoring.csv'
scoring_data_filename_json='Data-region-RI-SM-Scoring.json'

In [42]:
!rm Data-region-RI-SM-Scoring.csv
!rm Data-region-RI-SM-Scoring.json
!wget "https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM-Scoring.json"
!wget "https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM-Scoring.csv"

rm: cannot remove 'Data-region-RI-SM-Scoring.csv': No such file or directory
rm: cannot remove 'Data-region-RI-SM-Scoring.json': No such file or directory
--2023-01-31 07:43:58--  https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM-Scoring.json
Resolving ibm-aws-immersion-day.s3.us-east-2.amazonaws.com (ibm-aws-immersion-day.s3.us-east-2.amazonaws.com)... 52.219.94.122
Connecting to ibm-aws-immersion-day.s3.us-east-2.amazonaws.com (ibm-aws-immersion-day.s3.us-east-2.amazonaws.com)|52.219.94.122|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 106 [application/json]
Saving to: ‘Data-region-RI-SM-Scoring.json’


2023-01-31 07:43:58 (3.85 MB/s) - ‘Data-region-RI-SM-Scoring.json’ saved [106/106]

--2023-01-31 07:43:59--  https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM-Scoring.csv
Resolving ibm-aws-immersion-day.s3.us-east-2.amazonaws.com (ibm-aws-immersion-day.s3.us-east-2.amazonaws.com)... 52.

In [43]:
data=pd.read_csv('https://ibm-aws-immersion-day.s3.us-east-2.amazonaws.com/publicdata/Data-region-RI-SM-Scoring.csv')

In [44]:
import io
from io import StringIO
csv_file = io.StringIO()
# by default sagemaker expects comma seperated
data.to_csv(csv_file, sep=",", header=False, index=False)
scoring_data_filename = csv_file.getvalue()

In [45]:
sm_runtime = boto3.client('sagemaker-runtime',
                       region_name=SAGEMAKER_ENGINE_CREDENTIALS['region'],
                       aws_access_key_id=SAGEMAKER_ENGINE_CREDENTIALS['access_key_id'],
                       aws_secret_access_key=SAGEMAKER_ENGINE_CREDENTIALS['secret_access_key'])


scoring_response = sm_runtime.invoke_endpoint(EndpointName = endpoint_name,
                                                  ContentType = 'text/csv',
                                                  Body = scoring_data_filename)
    
result = json.loads(scoring_response['Body'].read().decode())
print(json.dumps(result, indent=2))

{
  "predictions": [
    {
      "score": [
        0.02799299918115139,
        0.7118191719055176,
        0.2601878345012665
      ],
      "predicted_label": 1
    },
    {
      "score": [
        0.8067101836204529,
        0.02732161432504654,
        0.16596820950508118
      ],
      "predicted_label": 0
    },
    {
      "score": [
        0.010750409215688705,
        0.19208674132823944,
        0.7971628308296204
      ],
      "predicted_label": 2
    },
    {
      "score": [
        0.03177503123879433,
        0.7168406248092651,
        0.25138434767723083
      ],
      "predicted_label": 1
    },
    {
      "score": [
        0.7829863429069519,
        0.029666010290384293,
        0.18734769523143768
      ],
      "predicted_label": 0
    },
    {
      "score": [
        0.060710735619068146,
        0.26483166217803955,
        0.6744576096534729
      ],
      "predicted_label": 2
    },
    {
      "score": [
        0.041998133063316345,
        0.72588139

In [46]:
f = open(scoring_data_filename_json,"r")
payload_values = json.load(f)
request_data = {'fields': feature_columns, 
                'values': payload_values}

response_data = {'fields': list(result['predictions'][0]),
                 'values': [list(x.values()) for x in result['predictions']]}


In [47]:
import uuid
from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

print("Performing explicit payload logging.....")
wos_client.data_sets.store_records(data_set_id=payload_data_set_id, request_body=[PayloadRecord(
           scoring_id=str(uuid.uuid4()),
           request=request_data,
           response=response_data,
           response_time=460
)])
time.sleep(5)
pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)
print("Number of records in the payload logging table: {}".format(pl_records_count))

Performing explicit payload logging.....
Number of records in the payload logging table: 13


In [48]:
wos_client.data_sets.show_records(payload_data_set_id)

0,1,2,3,4,5,6,7,8
,84164fa7-4f67-4884-be52-0e419970ef89-1,1.0,2023-01-31T07:43:59.979Z,0.7118191719055176,"[0.02799299918115139, 0.7118191719055176, 0.2601878345012665]",10575.0,1.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-10,0.0,2023-01-31T07:43:59.979Z,0.7493403553962708,"[0.7493403553962708, 0.03281030058860779, 0.21784931421279907]",200.0,2.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-11,0.0,2023-01-31T07:43:59.979Z,0.826548159122467,"[0.826548159122467, 0.025272492319345474, 0.1481793373823166]",1400.0,0.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-12,,2023-01-31T07:43:59.979Z,,[],350.0,0.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-2,0.0,2023-01-31T07:43:59.979Z,0.8067101836204529,"[0.8067101836204529, 0.02732161432504654, 0.16596820950508118]",2100.0,2.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-3,2.0,2023-01-31T07:43:59.979Z,0.7971628308296204,"[0.010750409215688705, 0.19208674132823944, 0.7971628308296204]",649.0,0.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-4,1.0,2023-01-31T07:43:59.979Z,0.7168406248092651,"[0.03177503123879433, 0.7168406248092651, 0.25138434767723083]",10023.0,1.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-5,0.0,2023-01-31T07:43:59.979Z,0.7829863429069519,"[0.7829863429069519, 0.029666010290384293, 0.18734769523143768]",1750.0,2.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-6,2.0,2023-01-31T07:43:59.979Z,0.6744576096534729,"[0.060710735619068146, 0.26483166217803955, 0.6744576096534729]",977.0,0.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-7,1.0,2023-01-31T07:43:59.979Z,0.7258813977241516,"[0.041998133063316345, 0.7258813977241516, 0.23212049901485443]",5900.0,1.0,f87b94db698b412f3953cb42f15dbbaa


### Enable and run fairness monitoring

In [49]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id

)
parameters = {
    "features": [
        {"feature": "REGION",
         "majority": [[1,1]],
         "minority": [[0,0]],
         "threshold": 0.95
         }
    ],
    "favourable_class": [0],
    "unfavourable_class": [2],
    "min_records": 10
}

fairness_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,
    target=target,
    parameters=parameters).result
fairness_monitor_instance_id =fairness_monitor_details.metadata.id
fairness_monitor_instance_id




 Waiting for end of monitor instance creation fefa92dc-2985-43e9-b132-23b670c2043b 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




'fefa92dc-2985-43e9-b132-23b670c2043b'

### Run fairness monitor

In [50]:
time.sleep(20)
#Note: When you create fairness monitor, initial run is also created
wos_client.monitor_instances.show_metrics(monitor_instance_id=fairness_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2023-01-31 07:44:15.913442+00:00,fairness_value,f6441a45-0451-4c97-8b96-b61ee598e2ae,100.0,80.0,,"['feature:REGION', 'fairness_metric_type:fairness', 'feature_value:0-0']",fairness,fefa92dc-2985-43e9-b132-23b670c2043b,44e7591d-f300-4ae4-9f4c-6b77980755e3,subscription,f26582b5-3112-4f0f-b00b-64cc07d0f6fa


### Enable and run Drift monitoring

#### We will have to upload the model manually in the OpenScale UI for setting up Drift Metrics

### Enable Explainability and run explanation on sample record. 

In [51]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "enabled": True
}
explainability_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result




 Waiting for end of monitor instance creation f0f59f48-8959-4904-99de-900380e9bbb8 




preparing
active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




Getting a `transaction_id` to run explanation on

In [52]:
explainability_monitor_id = explainability_details.metadata.id
explainability_monitor_id

'f0f59f48-8959-4904-99de-900380e9bbb8'

In [53]:
wos_client.data_sets.show_records(data_set_id=payload_data_set_id,limit=50)

0,1,2,3,4,5,6,7,8
,84164fa7-4f67-4884-be52-0e419970ef89-1,1.0,2023-01-31T07:43:59.979Z,0.7118191719055176,"[0.02799299918115139, 0.7118191719055176, 0.2601878345012665]",10575.0,1.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-10,0.0,2023-01-31T07:43:59.979Z,0.7493403553962708,"[0.7493403553962708, 0.03281030058860779, 0.21784931421279907]",200.0,2.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-11,0.0,2023-01-31T07:43:59.979Z,0.826548159122467,"[0.826548159122467, 0.025272492319345474, 0.1481793373823166]",1400.0,0.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-12,,2023-01-31T07:43:59.979Z,,[],350.0,0.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-2,0.0,2023-01-31T07:43:59.979Z,0.8067101836204529,"[0.8067101836204529, 0.02732161432504654, 0.16596820950508118]",2100.0,2.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-3,2.0,2023-01-31T07:43:59.979Z,0.7971628308296204,"[0.010750409215688705, 0.19208674132823944, 0.7971628308296204]",649.0,0.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-4,1.0,2023-01-31T07:43:59.979Z,0.7168406248092651,"[0.03177503123879433, 0.7168406248092651, 0.25138434767723083]",10023.0,1.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-5,0.0,2023-01-31T07:43:59.979Z,0.7829863429069519,"[0.7829863429069519, 0.029666010290384293, 0.18734769523143768]",1750.0,2.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-6,2.0,2023-01-31T07:43:59.979Z,0.6744576096534729,"[0.060710735619068146, 0.26483166217803955, 0.6744576096534729]",977.0,0.0,f87b94db698b412f3953cb42f15dbbaa
,84164fa7-4f67-4884-be52-0e419970ef89-7,1.0,2023-01-31T07:43:59.979Z,0.7258813977241516,"[0.041998133063316345, 0.7258813977241516, 0.23212049901485443]",5900.0,1.0,f87b94db698b412f3953cb42f15dbbaa


In [54]:
payload_data = wos_client.data_sets.get_list_of_records(limit=1,data_set_id=payload_data_set_id,output_type='pandas').result
scoring_ids=payload_data['scoring_id'].tolist()
print("Running explanations on scoring IDs: {}".format(scoring_ids))
explanation_types = ["lime", "contrastive"]
result = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types).result
print(result)
explanation_task_id=result.to_dict()['metadata']['explanation_task_ids'][0]
explanation_task_id

Running explanations on scoring IDs: ['84164fa7-4f67-4884-be52-0e419970ef89-1']
------------------------------------------------------------------------
| After November 2022, the explanation_tasks method must include the   |
| subscription_id parameter for future IBM Watson OpenScale releases.  |
------------------------------------------------------------------------
{
  "metadata": {
    "explanation_task_ids": [
      "904a9081-c474-42c9-bacc-433d4161cfdf"
    ],
    "created_by": "1000330999",
    "created_at": "2023-01-31T07:44:41.701637Z"
  }
}


'904a9081-c474-42c9-bacc-433d4161cfdf'

In [55]:
wos_client.monitor_instances.get_explanation_tasks(explanation_task_id=explanation_task_id).result.to_dict()

----------------------------------------------------------------------------
| After November 2022, the get_explanation_tasks method must include the   |
| subscription_id parameter for future IBM Watson OpenScale releases.      |
----------------------------------------------------------------------------


{'metadata': {'explanation_task_id': '904a9081-c474-42c9-bacc-433d4161cfdf',
  'created_by': '1000330999',
  'created_at': '2023-01-31T07:44:41.701637Z'},
 'entity': {'status': {'state': 'in_progress'},
  'scoring_id': '84164fa7-4f67-4884-be52-0e419970ef89-1'}}

You can now view the [OpenScale Dashboard](https://cpd-zen-46.apps.cp-deployer46-d01.ibmworkshops.com/aiopenscale/insights?serviceInstanceNamespace=zen-46&serviceInstanceDisplayName=openscale-defaultinstance). Click on the tile for the Risk Index AWS model to see Fairness, Explainability, Accuracy, and Performance monitors. 

### In this Notebook, we have learnt how to configure the metrics for monitoring the SageMaker model using Watson OpenScale. Fairness, Quality & Explainability metrics have been programmatically configuered using this notebook. Drift model has to be configuered manually using the steps listed in the documentation. 

---