multioutput
controls how scores or losses are calculated:
'raw_values': returns a full set of scores, one per output.
'uniform_average' (default): scores of all outputs are averaged with uniform weight.
ndarray of weights: returns a weighted average of the per-output scores.
r2_score and explained_variance_score also accept multioutput='variance_weighted', which weights each output by the variance of the corresponding target variable.
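As a quick check of the weighted-average behaviour (a minimal sketch; the data and the weights [0.3, 0.7] are arbitrary illustration values), passing an ndarray of weights should reproduce a hand-computed weighted average of the 'raw_values' scores:

import numpy as np
from sklearn.metrics import r2_score

y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]

# per-output scores, then the same weighted average computed by hand
raw = r2_score(y_true, y_pred, multioutput='raw_values')
print(np.average(raw, weights=[0.3, 0.7]))               # ≈ 0.92535
print(r2_score(y_true, y_pred, multioutput=[0.3, 0.7]))  # same value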
explained_variance_score computes the explained variance regression score:
$\text{explained\_variance}(y, \hat{y}) = 1 - \frac{Var\{ y - \hat{y}\}}{Var\{y\}}$
The best possible score is 1.0; lower values are worse.
from sklearn.metrics import explained_variance_score as EVS
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(EVS(y_true, y_pred))
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
print(EVS(y_true, y_pred, multioutput='raw_values'))
print(EVS(y_true, y_pred, multioutput=[0.3, 0.7]))
0.9571734475374732
[0.96774194 1. ]
0.9903225806451612
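The formula can be verified directly with NumPy (a minimal sketch re-using the single-output example above):

import numpy as np

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
# 1 - Var(y - y_hat) / Var(y) reproduces explained_variance_score
print(1 - np.var(y_true - y_pred) / np.var(y_true))  # ≈ 0.95717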
max_error computes the maximum residual error, i.e. the worst-case difference between a prediction and the true value; it does not support multioutput.
from sklearn.metrics import max_error
y_true = [3, 2, 7, 1]
y_pred = [9, 2, 7, 1]
print(max_error(y_true, y_pred))
6
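Since max_error is just the largest absolute residual, a NumPy one-liner reproduces it (a minimal sketch):

import numpy as np

y_true = np.array([3, 2, 7, 1])
y_pred = np.array([9, 2, 7, 1])
# the single worst absolute difference between truth and prediction
print(np.max(np.abs(y_true - y_pred)))  # 6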
MAE (mean absolute error) corresponds to the expected value of the absolute error loss, i.e. the l1-norm loss.
$\text{MAE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} \left| y_i - \hat{y}_i \right|.$
from sklearn.metrics import mean_absolute_error as MAE
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(MAE(y_true, y_pred))
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
print(MAE(y_true, y_pred))
print(MAE(y_true, y_pred, multioutput='raw_values'))
print(MAE(y_true, y_pred, multioutput=[0.3, 0.7]))
0.5
0.75
[0.5 1. ]
0.85
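The definition is easy to check by hand (a minimal sketch for the single-output case):

import numpy as np

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
# mean of the absolute residuals reproduces mean_absolute_error
print(np.mean(np.abs(y_true - y_pred)))  # 0.5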
MSE (mean squared error) corresponds to the expected value of the squared (quadratic) error loss.
$\text{MSE}(y, \hat{y}) = \frac{1}{n_\text{samples}} \sum_{i=0}^{n_\text{samples} - 1} (y_i - \hat{y}_i)^2.$
from sklearn.metrics import mean_squared_error as MSE
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(MSE(y_true, y_pred))
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
print(MSE(y_true, y_pred))
0.375
0.7083333333333334
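Likewise for MSE (a minimal sketch for the single-output case):

import numpy as np

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
# mean of the squared residuals reproduces mean_squared_error
print(np.mean((y_true - y_pred) ** 2))  # 0.375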
MSLE (mean squared logarithmic error) is preferred for targets with exponential growth (populations, commodity sales over time, ...). It penalizes under-predicted estimates more than over-predicted ones.
$\text{MSLE}(y, \hat{y}) = \frac{1}{n_\text{samples}} \sum_{i=0}^{n_\text{samples} - 1} (\log_e (1 + y_i) - \log_e (1 + \hat{y}_i) )^2.$
from sklearn.metrics import mean_squared_log_error as MSLE
y_true = [3, 5, 2.5, 7]
y_pred = [2.5, 5, 4, 8]
print(MSLE(y_true, y_pred))
y_true = [[0.5, 1], [1, 2], [7, 6]]
y_pred = [[0.5, 2], [1, 2.5], [8, 8]]
print(MSLE(y_true, y_pred))
0.03973012298459379
0.044199361889160516
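The log1p form of the definition, and the asymmetric penalty mentioned above, can both be illustrated by hand (a minimal sketch; the 50/150 predictions around a true value of 100 are arbitrary):

import numpy as np
from sklearn.metrics import mean_squared_log_error as MSLE

y_true = np.array([3, 5, 2.5, 7])
y_pred = np.array([2.5, 5, 4, 8])
# mean of (log(1 + y) - log(1 + y_hat))^2 reproduces MSLE
print(np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2))  # ≈ 0.03973

# under-prediction by 50 costs more than over-prediction by 50
print(MSLE([100], [50]))   # ≈ 0.467
print(MSLE([100], [150]))  # ≈ 0.162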
MAPE (mean absolute percentage error) is also called mean absolute percentage deviation (MAPD).
It is sensitive to relative errors and is not affected by a global scaling of the target variable:
$\text{MAPE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} \frac{\left| y_i - \hat{y}_i \right|}{\max(\epsilon, \left| y_i \right|)}$
where $\epsilon$ is an arbitrary small but strictly positive number chosen to avoid undefined results when y is zero. MAPE supports multioutput problems.
from sklearn.metrics import mean_absolute_percentage_error as MAPE
y_true = [1, 10, 1e6]
y_pred = [0.9, 15, 1.2e6]
print(MAPE(y_true, y_pred))
0.26666666666666666
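The scale invariance is easy to demonstrate (a minimal sketch; the factor 1000 is arbitrary):

import numpy as np
from sklearn.metrics import mean_absolute_percentage_error as MAPE

y_true = np.array([1, 10, 1e6])
y_pred = np.array([0.9, 15, 1.2e6])
# a global rescaling of targets and predictions leaves MAPE unchanged
print(MAPE(y_true, y_pred))                # ≈ 0.26667
print(MAPE(1000 * y_true, 1000 * y_pred))  # same value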
R² represents the proportion of the variance of y that is explained by the independent variables in the model. It is an indication of goodness of fit and therefore a measure of how well unseen samples are likely to be predicted by the model.
Because the variance is dataset dependent, R² may not be meaningfully comparable across different datasets. The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R² score of 0.0.
$R^2(y, \hat{y}) = 1 - \frac{\sum_{i=0}^{n_\text{samples}-1} (y_i - \hat{y}_i)^2}{\sum_{i=0}^{n_\text{samples}-1} (y_i - \bar{y})^2}$
from sklearn.metrics import r2_score as R2
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(R2(y_true, y_pred))
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
print(R2(y_true, y_pred,
multioutput='variance_weighted'))
print(R2(y_true, y_pred,
multioutput='uniform_average'))
print(R2(y_true, y_pred,
multioutput='raw_values'))
print(R2(y_true, y_pred,
multioutput=[0.3, 0.7]))
0.9486081370449679
0.9382566585956417
0.9368005266622779
[0.96543779 0.90816327]
0.9253456221198156
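The single-output score can be reproduced from the sums of squares in the formula (a minimal sketch):

import numpy as np

y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 2, 8])
ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
print(1 - ss_res / ss_tot)  # ≈ 0.94861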
mean_tweedie_deviance returns a mean Tweedie deviance error based on a power parameter:
power=0: equivalent to MSE (mean squared error)
power=1: equivalent to MPD (mean Poisson deviance)
power=2: equivalent to MGD (mean gamma deviance)
With the gamma distribution (power=2), simultaneously scaling y_true and y_pred has no effect on the deviance; with the Poisson distribution (power=1), the deviance scales linearly; and with the normal distribution (power=0), quadratically.
In general, the higher the power, the less weight is given to extreme deviations between true and predicted targets.
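For reference, for a power p outside the special cases the mean Tweedie deviance is defined (as in the scikit-learn user guide) by
$\text{D}(y, \hat{y}) = \frac{1}{n_\text{samples}} \sum_{i=0}^{n_\text{samples}-1} 2\left(\frac{\max(y_i, 0)^{2-p}}{(1-p)(2-p)} - \frac{y_i\,\hat{y}_i^{1-p}}{1-p} + \frac{\hat{y}_i^{2-p}}{2-p}\right)$,
with the power=0, 1, 2 cases obtained as the corresponding limits.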
from sklearn.metrics import mean_tweedie_deviance as MTD
# MSE (power=0) is very sensitive to the prediction difference of the second point:
print(MTD([ 1.0], [1.5], power=0))
print(MTD([100.0], [150.0], power=0))
0.25
2500.0
# power=1 (Poisson deviance): the error scales linearly:
print(MTD([ 1.0], [1.5], power=1))
print(MTD([100.0], [150.0], power=1))
0.18906978378367123
18.906978378367114
# power=2 (gamma deviance): invariant to a joint scaling of y_true and y_pred:
print(MTD([ 1.0], [1.5], power=2))
print(MTD([100.0], [150.0], power=2))
0.14426354954966225
0.14426354954966225
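The power=1 value can be checked against the closed form of the Poisson deviance (a minimal sketch for a single sample):

import numpy as np

# Poisson deviance for one sample: 2 * (y*log(y/y_hat) + y_hat - y)
y, y_hat = 1.0, 1.5
print(2 * (y * np.log(y / y_hat) + y_hat - y))  # ≈ 0.18907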