2. ML Experiments
import matplotlib.pyplot as plt
plt.rcParams["font.family"] = "Times New Roman"
from utils import make_data
from ai4water.experiments import MLRegressionExperiments
data, _, _ = make_data(encoding="ohe")
print(data.shape)
(1514, 75)
data.head()
Initialize the experiment
comparisons = MLRegressionExperiments(
    input_features=data.columns.tolist()[0:-1],
    output_features=data.columns.tolist()[-1:],
    split_random=True,
    seed=1575,
    verbosity=0,
    show=False
)
Train all the models
comparisons.fit(
    data=data,
    run_type="dry_run",
    include=['XGBRegressor',
             'AdaBoostRegressor', 'LinearSVR',
             'BaggingRegressor', 'DecisionTreeRegressor',
             'HistGradientBoostingRegressor',
             'ExtraTreesRegressor', 'ExtraTreeRegressor',
             'LinearRegression', 'KNeighborsRegressor']
)
***** Training *****
input_x shape: (847, 74)
target shape: (847, 1)
***** Validation *****
input_x shape: (212, 74)
target shape: (212, 1)
***** Test *****
input_x shape: (455, 74)
target shape: (455, 1)
running XGBRegressor model
findfont: Font family ['Times New Roman'] not found. Falling back to DejaVu Sans.
divide by zero encountered in true_divide
divide by zero encountered in log
invalid value encountered in log
running AdaBoostRegressor model
running LinearSVR model
Liblinear failed to converge, increase the number of iterations.
running BaggingRegressor model
divide by zero encountered in true_divide
invalid value encountered in true_divide
divide by zero encountered in log
running DecisionTreeRegressor model
running HistGradientBoostingRegressor model
running ExtraTreesRegressor model
running ExtraTreeRegressor model
running LinearRegression model
running KNeighborsRegressor model
divide by zero encountered in true_divide
divide by zero encountered in log
divide by zero encountered in true_divide
divide by zero encountered in log
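The split sizes printed above (847, 212 and 455 rows out of 1514) are consistent with a two-stage split: first hold out 30% of the rows for testing, then set aside 20% of the remainder for validation. The sketch below reproduces that arithmetic. The fractions and the rounding are assumptions chosen to match the printed shapes, and `split_sizes` is a hypothetical helper, not part of ai4water.

```python
def split_sizes(n_rows, train_fraction=0.7, val_fraction=0.2):
    """Return (train, validation, test) sizes for a two-stage split.

    The default fractions are assumptions that reproduce the shapes
    printed in this example; the library's own logic is not shown here.
    """
    # Stage 1: hold out the test set.
    n_train_val = int(train_fraction * n_rows)
    n_test = n_rows - n_train_val
    # Stage 2: carve the validation set out of the remaining rows.
    n_train = int((1 - val_fraction) * n_train_val)
    n_val = n_train_val - n_train
    return n_train, n_val, n_test


sizes = split_sizes(1514)
print(sizes)
```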
Compare R2
_ = comparisons.compare_errors(
    'r2',
    data=data)
plt.tight_layout()
plt.show()

Compare MSE
_ = comparisons.compare_errors(
    'mse',
    data=data,
    cutoff_val=1e7,
    cutoff_type="less"
)
plt.tight_layout()
plt.show()
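The `cutoff_val` and `cutoff_type` arguments restrict the comparison to models whose metric passes a threshold, which keeps a few very poor models from dominating the plot. The sketch below illustrates the idea on made-up MSE values. `apply_cutoff` is a hypothetical helper, and whether the library compares strictly or non-strictly is an assumption.

```python
def apply_cutoff(scores, cutoff_val, cutoff_type):
    """Keep only the models whose score passes the cutoff.

    `scores` maps a model name to a metric value. This helper is
    hypothetical and only mimics the idea behind the two arguments.
    """
    if cutoff_type == "less":
        return {name: s for name, s in scores.items() if s < cutoff_val}
    if cutoff_type == "greater":
        return {name: s for name, s in scores.items() if s > cutoff_val}
    raise ValueError(f"unknown cutoff_type: {cutoff_type!r}")


# Made-up MSE values, for illustration only.
mse = {"model_a": 2.5e3, "model_b": 4.0e9, "model_c": 8.1e5}
kept = apply_cutoff(mse, cutoff_val=1e7, cutoff_type="less")
print(kept)
```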

best_models = comparisons.compare_errors(
    'r2_score',
    cutoff_type='greater',
    cutoff_val=0.01,
    data=data
)
plt.tight_layout()
plt.show()
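The two metric names used so far, `'r2'` and `'r2_score'`, need not give the same number. A common convention, assumed here rather than confirmed by this page, is that `r2` is the squared Pearson correlation while `r2_score` is the coefficient of determination as in scikit-learn. The NumPy sketch below shows how far the two can diverge for predictions that are perfectly correlated with the observations but biased.

```python
import numpy as np


def squared_pearson(true, pred):
    """Squared Pearson correlation; always between 0 and 1."""
    r = np.corrcoef(true, pred)[0, 1]
    return r ** 2


def coefficient_of_determination(true, pred):
    """1 - SS_res / SS_tot; can be negative for poor predictions."""
    ss_res = np.sum((true - pred) ** 2)
    ss_tot = np.sum((true - np.mean(true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic data: predictions are a scaled and shifted copy of the truth.
true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
pred = 2.0 * true + 10.0

print(squared_pearson(true, pred))               # close to 1
print(coefficient_of_determination(true, pred))  # strongly negative
```

A positive cutoff on the coefficient of determination therefore removes models that do worse than predicting the mean, which a cutoff on the squared correlation alone would not do.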

comparisons.taylor_plot(data=data)

<Figure size 500x800 with 2 Axes>
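A Taylor diagram places each model according to three statistics: the standard deviation of its predictions, their correlation with the observations, and the centred root-mean-square difference. The three are tied together by the law of cosines, which is what lets a single point encode all of them. The sketch below computes the statistics with NumPy on synthetic data and checks that relation; it does not reproduce the library's plotting code.

```python
import numpy as np


def taylor_statistics(reference, simulated):
    """Return (ref_std, sim_std, correlation, centred RMS difference)."""
    ref_std = np.std(reference)
    sim_std = np.std(simulated)
    corr = np.corrcoef(reference, simulated)[0, 1]
    # Remove each series' mean before taking the RMS difference.
    diff = (simulated - simulated.mean()) - (reference - reference.mean())
    crmsd = np.sqrt(np.mean(diff ** 2))
    return ref_std, sim_std, corr, crmsd


# Synthetic observations and predictions, for illustration only.
rng = np.random.default_rng(0)
reference = rng.normal(size=200)
simulated = 0.8 * reference + rng.normal(scale=0.3, size=200)

ref_std, sim_std, corr, crmsd = taylor_statistics(reference, simulated)

# Law of cosines: E'^2 = s_sim^2 + s_ref^2 - 2 * s_sim * s_ref * R
from_cosines = np.sqrt(sim_std**2 + ref_std**2 - 2 * sim_std * ref_std * corr)
print(np.isclose(crmsd, from_cosines))
```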
comparisons.compare_edf_plots(
    data=data,
    exclude=["SGDRegressor", "KernelRidge", "PoissonRegressor"])
plt.tight_layout()
plt.show()
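An EDF plot shows, for each error magnitude, the fraction of samples whose error is at most that large, so a curve that rises faster indicates a better model. The sketch below builds such a curve from absolute errors with NumPy. That the plot above uses absolute errors is an assumption, and the arrays are made up.

```python
import numpy as np


def empirical_distribution(abs_errors):
    """Return sorted errors and the cumulative fraction of samples."""
    x = np.sort(abs_errors)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Made-up observations and predictions, for illustration only.
true = np.array([3.0, 5.0, 2.0, 8.0])
pred = np.array([2.5, 5.5, 4.0, 7.0])

x, y = empirical_distribution(np.abs(true - pred))
print(x)
print(y)
```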

_ = comparisons.compare_regression_plots(data=data, figsize=(12, 14))
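A regression plot of this kind is typically a scatter of predicted against observed values with a least-squares line through it; a slope near 1 and an intercept near 0 indicate unbiased predictions. Assuming that layout, the line can be sketched with `np.polyfit`. The helper name `fit_line` and the data are hypothetical.

```python
import numpy as np


def fit_line(true, pred):
    """Least-squares slope and intercept of pred regressed on true."""
    slope, intercept = np.polyfit(true, pred, deg=1)
    return slope, intercept


# Made-up data: predictions that under-respond to the observations.
true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
pred = 0.5 * true + 1.0

slope, intercept = fit_line(true, pred)
print(slope, intercept)
```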

_ = comparisons.compare_residual_plots(data=data, figsize=(12, 14))
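A residual plot shows the difference between observed and predicted values for each sample; residuals scattered evenly around zero suggest no systematic bias. The sketch below computes residuals with NumPy on made-up data. The sign convention (observed minus predicted) is an assumption.

```python
import numpy as np


def residuals(true, pred):
    """Observed minus predicted, one value per sample."""
    return true - pred


# Made-up observations and predictions, for illustration only.
true = np.array([3.0, 5.0, 2.0, 8.0])
pred = np.array([2.5, 5.5, 4.0, 7.0])

res = residuals(true, pred)
print(res)
print(res.mean())  # a mean far from zero would indicate bias
```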

Total running time of the script: (1 minutes 56.864 seconds)