6. Prediction Analysis

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams["font.family"] = "Times New Roman"

from utils import (
    evaluate_model,
    get_dataset,
    get_fitted_model,
    prediction_distribution,
)
dataset, _, _ = get_dataset(encoding="ohe")
X_train, y_train = dataset.training_data()
X_test, y_test = dataset.test_data()
***** Training *****
input_x shape:  (1059, 74)
target shape:  (1059, 1)
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)
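The `encoding="ohe"` argument most likely stands for one-hot encoding, which would explain why the input matrix has 74 columns; the `get_dataset` helper itself is not shown, so this is an assumption. A minimal sketch of one-hot encoding with pandas, using a hypothetical toy frame whose column names are illustrative only:

```python
import pandas as pd

# Hypothetical toy data; these columns are not taken from the real dataset.
raw = pd.DataFrame({
    "adsorbent": ["BC1", "BC2", "BC1"],  # categorical input
    "pH": [3.0, 7.0, 5.0],               # numeric input
})

# Each category becomes its own indicator column; numeric columns pass through.
encoded = pd.get_dummies(raw, columns=["adsorbent"])
print(encoded.columns.tolist())
```

Every distinct category adds one column, so a handful of categorical inputs can widen the feature matrix considerably.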
model, _ = get_fitted_model()
***** Training *****
input_x shape:  (1059, 74)
target shape:  (1059, 1)
dot plot of model could not be plotted due to ('You must install pydot (`pip install pydot`) and install graphviz (see instructions at https://graphviz.gitlab.io/download/) ', 'for plot_model/model_to_dot to work.')
test_p = model.predict(x=X_test)
assigning name input_1 to IteratorGetNext:0 with shape (None, 74)
#evaluate_model(y_test, test_p)

Prediction Distribution

prediction_distribution('Adsorption Time (min)', test_p, 0.4)
Adsorption Time (min)
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
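The message above is pandas' SettingWithCopyWarning, raised when a value is written to something that may be a view of another DataFrame. Where it originates inside `prediction_distribution` is not shown, so the following is only a generic sketch of the pattern the warning recommends, on a hypothetical toy frame:

```python
import pandas as pd

# Hypothetical toy frame; the column names are illustrative only.
df = pd.DataFrame({"feature": [1.0, 5.0, 9.0], "prediction": [0.2, 0.5, 0.9]})

# Take an explicit copy of the filtered rows so later writes target an
# independent frame instead of a possible view of `df`.
subset = df[df["feature"] > 2.0].copy()

# Assign through .loc with both a row and a column indexer.
subset.loc[:, "bin"] = "high"

print(subset)
```

The original `df` is left unchanged, which is usually what code that filters and then annotates a subset intends.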
FixedFormatter should only be used together with FixedLocator
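Matplotlib emits the FixedFormatter message when tick labels are set without first fixing the tick positions. The plotting code behind `prediction_distribution` is not shown, so this is a generic sketch of the usual remedy: call `set_xticks` before `set_xticklabels`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(range(3), [0.2, 0.5, 0.9])  # illustrative values only

# Fix the tick positions first; this installs a FixedLocator on the axis.
ax.set_xticks([0, 1, 2])
# Then attach exactly one label per fixed position.
ax.set_xticklabels(["low", "mid", "high"])

locator = ax.xaxis.get_major_locator()
print(type(locator).__name__)
```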

<AxesSubplot:title={'center':'Adsorption Time (min)'}>
grid = [25, 550, 600, 700, 800, 900]
prediction_distribution('Pyrolysis Temperature', test_p, 0.4, grid=grid)
Pyrolysis Temperature
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

<AxesSubplot:title={'center':'Pyrolysis Temperature'}>
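The `grid` argument in the call above supplies custom bin edges for the feature. The internals of `prediction_distribution` are not shown here, so the following is only a sketch of the general idea, assuming the helper groups predictions by the interval each sample's feature value falls into. The feature values and predictions are synthetic stand-ins.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins; the real feature values and predictions are not used here.
rng = np.random.default_rng(0)
feature = rng.uniform(25, 900, size=200)     # a temperature-like input
predictions = rng.normal(50, 10, size=200)   # placeholder model outputs

grid = [25, 550, 600, 700, 800, 900]  # bin edges, as in the call above

# Assign every sample to the interval its feature value falls into.
bins = pd.cut(feature, bins=grid, include_lowest=True)

# Summarise the predictions within each interval.
summary = (
    pd.DataFrame({"bin": bins, "prediction": predictions})
    .groupby("bin", observed=False)["prediction"]
    .describe()
)
print(summary[["count", "mean", "min", "max"]])
```

Six edges define five intervals, so the summary has one row per interval. Unevenly spaced edges like these are useful when the feature values cluster in a narrow range.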
grid = [1.01, 10, 50, 100, 200, 300, 400, 900]
prediction_distribution('Initial Concentration', test_p, 0.4, grid=grid)
Initial Concentration
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
FixedFormatter should only be used together with FixedLocator

<AxesSubplot:title={'center':'Initial Concentration'}>
prediction_distribution('Solution pH', test_p, 0.4)
Solution pH
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
FixedFormatter should only be used together with FixedLocator

<AxesSubplot:title={'center':'Solution pH'}>
grid = [0.0, 0.01, 0.04, 0.1, 0.5, 2.47, 10]
prediction_distribution('Adsorbent Loading', test_p, 0.4, grid=grid)
Adsorbent Loading
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
FixedFormatter should only be used together with FixedLocator

<AxesSubplot:title={'center':'Adsorbent Loading'}>
prediction_distribution('Volume (L)', test_p, 0.4)
Volume (L)
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
FixedFormatter should only be used together with FixedLocator

<AxesSubplot:title={'center':'Volume (L)'}>
prediction_distribution('Adsorption Temperature', test_p, 0.4)
Adsorption Temperature
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
FixedFormatter should only be used together with FixedLocator

<AxesSubplot:title={'center':'Adsorption Temperature'}>
grid = [2.75, 26.55, 81, 147.2, 495.5, 1085, 1509.11, 2430]
prediction_distribution('Surface Area', test_p, 0.4, grid=grid)
Surface Area
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
FixedFormatter should only be used together with FixedLocator

<AxesSubplot:title={'center':'Surface Area'}>
grid = [0.0, 0.18, 0.38, 0.39, 0.72, 1.32]
prediction_distribution('Pore Volume', test_p, 0.4, grid=grid)
Pore Volume
***** Test *****
input_x shape:  (455, 74)
target shape:  (455, 1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
FixedFormatter should only be used together with FixedLocator

<AxesSubplot:title={'center':'Pore Volume'}>

Feature Interaction

_ = model.prediction_analysis(
    x = pd.DataFrame(X_test, columns=dataset.input_features),
    features = ['Adsorption Time (min)', 'Pyrolysis Time (min)'],
    feature_names = ['Adsorption Time (min)', 'Pyrolysis Time (min)'],
    grid_types=["percentile", "percentile"],
    num_grid_points=[6, 6],
    border=True,
    annotate_kws={'annotate_fontsize': 15,
                  'annotate_colors': np.array(
                      [['black', 'black', 'black', 'black'],
                       ['black', 'black', 'black', 'black'],
                       ['black', 'white', 'black', 'black'],
                       ['black', 'black', 'black', 'black']])}
)
prediction analysis
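The call above bins two features on percentile grids (`grid_types=["percentile", "percentile"]`, `num_grid_points=[6, 6]`) and annotates each cell of the resulting heat map. How `model.prediction_analysis` computes this is not shown, so the following is only a sketch of the general technique, with synthetic data and a hypothetical helper `percentile_edges`: bin each feature at evenly spaced percentiles, then average the predictions per cell.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for two input features and the model predictions.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "time": rng.uniform(0, 300, size=500),
    "conc": rng.uniform(1, 900, size=500),
})
df["prediction"] = rng.normal(50, 10, size=500)


def percentile_edges(values, num_grid_points):
    """Hypothetical helper: evenly spaced percentiles, sorted and de-duplicated."""
    q = np.linspace(0, 100, num_grid_points)
    return np.unique(np.percentile(values, q))


edges_time = percentile_edges(df["time"], 6)
edges_conc = percentile_edges(df["conc"], 6)

# Assign each sample to one interval per feature.
df["time_bin"] = pd.cut(df["time"], bins=edges_time, include_lowest=True)
df["conc_bin"] = pd.cut(df["conc"], bins=edges_conc, include_lowest=True)

# Mean prediction for every (time interval, concentration interval) cell.
table = df.pivot_table(
    index="time_bin",
    columns="conc_bin",
    values="prediction",
    aggfunc="mean",
    observed=False,
)
print(table.round(1))
```

With continuous data, six percentile points give five intervals per feature. Repeated percentile values collapse into a single edge, so heavily tied features can yield fewer intervals than requested.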
_ = model.prediction_analysis(
    x = pd.DataFrame(X_test, columns=dataset.input_features),
    features = ['Adsorption Time (min)', 'Initial Concentration'],
    feature_names = ['Adsorption Time (min)', 'Initial Concentration'],
    grid_types=["percentile", "percentile"],
    num_grid_points=[6, 6],
    border=True,
    annotate_kws={'annotate_fontsize': 15,
                  'annotate_colors': np.array(
                      [['black', 'black', 'black', 'black', 'black'],
                       ['black', 'black', 'black', 'black', 'black'],
                       ['black', 'black', 'black', 'black', 'black'],
                       ['white', 'black', 'black', 'black', 'black']])}
)
prediction analysis
_ = model.prediction_analysis(
    x = pd.DataFrame(X_test, columns=dataset.input_features),
    features = ['Adsorption Time (min)', 'Solution pH'],
    feature_names = ['Adsorption Time (min)', 'Solution pH'],
    grid_types=["percentile", "percentile"],
    num_grid_points=[6, 6],
    border=True,
    annotate_kws={'annotate_fontsize': 15,
                  'annotate_colors': np.array(
                      [['black', 'white', 'black', 'black'],
                       ['black', 'white', 'black', 'black'],
                       ['black', 'white', 'black', 'black'],
                       ['black', 'black', 'black', 'black']])}
)
prediction analysis
_ = model.prediction_analysis(
    x = pd.DataFrame(X_test, columns=dataset.input_features),
    features = ['Adsorption Time (min)', 'Adsorbent Loading'],
    feature_names = ['Adsorption Time (min)', 'Adsorbent Loading'],
    grid_types=["percentile", "percentile"],
    num_grid_points=[6, 6],
    border=True,
    annotate_kws={'annotate_fontsize': 15,
                  'annotate_colors': np.array(
                      [['black', 'black', 'black', 'black', 'black'],
                       ['black', 'black', 'black', 'black', 'black'],
                       ['black', 'white', 'black', 'black', 'black'],
                       ['black', 'black', 'black', 'black', 'black']])}
)
prediction analysis

Total running time of the script: (0 minutes 13.731 seconds)
