Run Decision Tree Notebook

In this lab exercise, you will learn about a popular machine learning algorithm, the decision tree. You will use this classification algorithm to build a model from historical data on regions and their total cases, and then use the trained decision tree to predict the Risk Index of a region.
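As a rough illustration of the idea, the sketch below trains a decision tree classifier on a tiny, made-up table and predicts a risk label for two new regions. It assumes scikit-learn is available; the feature values, the "low"/"high" labels, and all variable names are hypothetical and are not taken from the lab notebook, which may be implemented differently.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one feature (total cases) per region,
# with a made-up risk label for each row.
X_train = [[10], [25], [40], [300], [450], [600]]
y_train = ["low", "low", "low", "high", "high", "high"]

# Fit a decision tree; random_state makes the run repeatable.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Predict the risk label for two unseen regions.
predictions = model.predict([[15], [500]])
print(list(predictions))
```

The tree learns a single threshold on the total case count that separates the two labels, which is the same kind of rule the lab notebook learns from the real dataset.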

Notebook 3: Risk Index Prediction with Decision Tree

  1. In Cloud Pak for Data, click the Assets tab at the top. Under Asset types, expand the Source code tab and select Notebook.

  2. Three notebooks are listed. For this exercise, use the Region-All-Decision-Tree.ipynb notebook.

  3. Click the three-dot menu and select Edit to get started.

  4. The notebook opens in the editor.

  5. Before running the notebook, you need to add the S3 connection to the notebook.

  • Click the third code cell in the notebook.
  • Click the Find and add data button at the top right.
  • Click the Connections tab.
  • You will see your connection variable. Click Insert to code and select pandas DataFrame.
  • Select the RI-data-ML.csv dataset from the connection variable.


  6. Verify that the dataframe name in the generated code snippet is data_df_1.

  7. Click Cell and select Run All to run the notebook.

  8. The notebook takes some time to run; please be patient.

  9. Once the notebook has finished running, you can observe the following:

  • Decision Tree Model Accuracy: The accuracy of the model is 86.63%.
  • Decision Tree Visualization: The trained decision tree is displayed in the notebook.
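To make the two outputs more concrete, here is a minimal sketch of how an accuracy score and a readable view of a tree can be produced. It assumes scikit-learn; the toy data, the `total_cases` feature name, and the labels are hypothetical, and the lab notebook may compute and draw these differently. Accuracy here is simply the fraction of test rows whose predicted label matches the true label.

```python
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data (not the lab dataset).
X_train = [[10], [25], [40], [300], [450], [600]]
y_train = ["low", "low", "low", "high", "high", "high"]

# Held-out rows; the last one is deliberately hard for this tiny tree.
X_test = [[5], [50], [350], [100]]
y_test = ["low", "low", "high", "high"]

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Accuracy = correctly predicted rows / total rows.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy: {accuracy:.2%}")

# Plain-text rendering of the learned decision rules.
tree_text = export_text(model, feature_names=["total_cases"])
print(tree_text)
```

In this toy setup the tree gets three of the four test rows right, so the score is 0.75. The text view lists each split condition and the class predicted at each leaf, which conveys the same information as a graphical tree plot.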

You have successfully completed this lab exercise. You can proceed to the next step.