Today we will need two different Python environments.

```
python3.10 -m venv env_fairness
source env_fairness/bin/activate
pip install --upgrade pip
pip install numpy==1.25 fairlearn==0.9.0 plotly==5.24.1 nbformat==5.10.4 'aif360[inFairness]==0.6.1' ipykernel==6.29.5 BlackBoxAuditing==0.1.54 cvxpy==1.6.0
cd env_fairness/lib/python3.10/site-packages/aif360/data/raw/meps
Rscript generate_data.R
```

```
python3.10 -m venv env_adv
source env_adv/bin/activate
pip install --upgrade pip
pip install numpy==1.26 fairlearn==0.9.0 plotly==5.24.1 nbformat==5.10.4 'aif360[AdversarialDebiasing]==0.6.1' 'aif360[inFairness]==0.6.1' ipykernel==6.29.5 BlackBoxAuditing==0.1.54 cvxpy==1.6.0
cd env_adv/lib/python3.10/site-packages/aif360/data/raw/meps
Rscript generate_data.R
```

The Adversarial Debiasing part of this TD runs only in the second environment (`env_adv`); the rest of the document runs in the first one (`env_fairness`).



Warning for Colab users: after running the cell above, restart the session ("Runtime" tab) to load the installed environment.

# TD 4: Bias mitigation with an in-processing method, Prejudice Remover

The aim of this notebook is to use the Prejudice Remover in-processing approach and analyse its impact on the model output.
From a machine learning perspective, we will go a bit further into the train/validation/test paradigm.

The model is learned on the training dataset, its parameters are then tuned on the validation dataset, and finally its performance is evaluated on the test dataset.
No choice or decision may depend on the test dataset; otherwise we risk overfitting to it.

Here you will:
- use the Prejudice Remover approach as a black box
- train the Prejudice Remover using the train/validation paradigm to choose the 'best' threshold
- combine Prejudice Remover with Reweighing

As a reminder of the pre-processing approach, we encourage you to:
- analyse the impact of Reweighing on different models (Logistic Regression, Decision Tree, Random Forest, etc.)


## 1. Import and load the dataset

In [1]:
# imports
import numpy as np
import pandas as pd
import plotly.express as px
import warnings

warnings.simplefilter(action="ignore", category=FutureWarning)
warnings.simplefilter(action="ignore", append=True, category=UserWarning)
# Datasets
from aif360.datasets import MEPSDataset19

# Performance metrics and feature scaling
from sklearn.metrics import accuracy_score, balanced_accuracy_score
from sklearn.preprocessing import StandardScaler

# Load the dataset once, then split it 50% / 30% / 20% into train / validation / test
MEPSDataset19_data = MEPSDataset19()
(dataset_orig_panel19_train, dataset_orig_panel19_val, dataset_orig_panel19_test) = (
    MEPSDataset19_data.split([0.5, 0.8], shuffle=True)
)

In [2]:
len(dataset_orig_panel19_train.instance_weights), len(
    dataset_orig_panel19_val.instance_weights
), len(dataset_orig_panel19_test.instance_weights)

(7915, 4749, 3166)
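
The three sizes above sum to 15830 rows and follow directly from the cut points `[0.5, 0.8]`: the first 50% of the shuffled rows go to train, the next 30% to validation and the last 20% to test. The sketch below reproduces that arithmetic with plain NumPy. The helper name `split_sizes` and the fixed seed are assumptions made for illustration; this is not the code used inside AIF360.

```python
import numpy as np


def split_sizes(n_rows, cut_points):
    """Return the size of each chunk when cutting n_rows at the given fractions."""
    bounds = [0] + [int(round(c * n_rows)) for c in cut_points] + [n_rows]
    return [hi - lo for lo, hi in zip(bounds[:-1], bounds[1:])]


# Cut points 0.5 and 0.8 give a 50% / 30% / 20% split
sizes = split_sizes(15830, [0.5, 0.8])

# Shuffle row indices, then cut them at the cumulative boundaries
rng = np.random.default_rng(0)  # fixed seed, only to make the sketch reproducible
shuffled = rng.permutation(15830)
train_idx, val_idx, test_idx = np.split(shuffled, np.cumsum(sizes)[:-1])

print(sizes)
```

Fixing a seed in this way is one option for making a shuffled split repeatable between runs.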

In [3]:
instance_weights = MEPSDataset19_data.instance_weights
instance_weights

array([21854.981705, 18169.604822, 17191.832515, ...,  3896.116219,
        4883.851005,  6630.588948])

In [4]:
f"Dataset size {len(instance_weights)}, total dataset weight {instance_weights.sum()}."

'Dataset size 15830, total dataset weight 141367240.546316.'

In [5]:
from aif360.sklearn.metrics import (
    average_odds_difference,
    base_rate,
    conditional_demographic_disparity,
    df_bias_amplification,
    disparate_impact_ratio,
    equal_opportunity_difference,
    smoothed_edf,
    statistical_parity_difference,
)
from sklearn.metrics import balanced_accuracy_score


# This function takes lists or NumPy arrays
def get_metrics(
    y_true,  # list or np.array of ground-truth values
    y_pred=None,  # list or np.array of predictions
    prot_attr=None,  # list or np.array of protected/sensitive attribute values
    priv_group=1,  # value taken by the privileged group
    pos_label=1,  # value taken by the positive truth/prediction
    sample_weight=None,  # list or np.array of sample weights
):
    group_metrics = {}
    group_metrics["base_rate_truth"] = base_rate(
        y_true=y_true, pos_label=pos_label, sample_weight=sample_weight
    )
    group_metrics["statistical_parity_difference"] = statistical_parity_difference(
        y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
    )
    group_metrics["disparate_impact_ratio"] = disparate_impact_ratio(
        y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
    )
    # The metrics below compare predictions with the ground truth, so they need y_pred
    if y_pred is not None:
        group_metrics["base_rate_preds"] = base_rate(
            y_true=y_pred, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["equal_opportunity_difference"] = equal_opportunity_difference(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["average_odds_difference"] = average_odds_difference(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, priv_group=priv_group, pos_label=pos_label, sample_weight=sample_weight
        )
        # Only computed when the predictions contain more than one distinct value
        if len(set(y_pred)) > 1:
            group_metrics["conditional_demographic_disparity"] = conditional_demographic_disparity(
                y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
            )
        else:
            group_metrics["conditional_demographic_disparity"] = None
        group_metrics["smoothed_edf"] = smoothed_edf(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["df_bias_amplification"] = df_bias_amplification(
            y_true=y_true, y_pred=y_pred, prot_attr=prot_attr, pos_label=pos_label, sample_weight=sample_weight
        )
        group_metrics["balanced_accuracy_score"] = balanced_accuracy_score(
            y_true=y_true, y_pred=y_pred, sample_weight=sample_weight
        )
    return group_metrics


## Learning a Prejudice Remover model on the training dataset and choosing the best parameters with the validation dataset

In [6]:
# Bias mitigation techniques
from aif360.algorithms.preprocessing import Reweighing
from aif360.algorithms.inprocessing import PrejudiceRemover

### Question 1: Fit a Standard Scaler on the training dataset features; its output will be used as input to the model

In [7]:
pr_orig_scaler = StandardScaler()
train_dataset_scaled = dataset_orig_panel19_train.copy()
train_dataset_scaled.features = pr_orig_scaler.fit_transform(train_dataset_scaled.features)

### Question 2: Create a method that trains a Prejudice Remover on the training dataset and returns the trained model
Run the method with the parameter eta arbitrarily set to 25.0.



In [8]:
def train_pr_model(eta=25.0, train_dataset=train_dataset_scaled):
    model = PrejudiceRemover(sensitive_attr='RACE', eta=eta)
    pr_orig_panel19 = model.fit(train_dataset)
    return pr_orig_panel19

pr_orig_panel19 = train_pr_model()

In [9]:
train_scores = pr_orig_panel19.predict(train_dataset_scaled).scores
train_scores.shape, train_scores[:10]

((7915, 1),
 array([[0.24968875],
        [0.13644747],
        [0.17826004],
        [0.23045409],
        [0.43822148],
        [0.0735638 ],
        [0.03395558],
        [0.07592396],
        [0.15190029],
        [0.01911056]]))

The Prejudice Remover outputs a single score for each instance. A threshold, arbitrarily set to 0.5 by default, turns this score into a prediction of 1 or 0.
If the score is above the threshold the prediction is 1; otherwise it is 0.
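
As a minimal sketch of this thresholding rule, the snippet below applies two thresholds to the first five training scores printed above. The helper name `scores_to_predictions` is an assumption made for illustration; it mirrors the `(scores[:, -1] > thr)` pattern used in this notebook.

```python
import numpy as np


def scores_to_predictions(scores, threshold=0.5):
    """Turn scores of shape (n, 1) or (n,) into 0/1 predictions."""
    scores = np.asarray(scores, dtype=float)
    if scores.ndim == 2:
        scores = scores[:, -1]  # keep the single score column
    return (scores > threshold).astype(np.float64)


# First five training scores from the output above
scores = np.array([[0.24968875], [0.13644747], [0.17826004], [0.23045409], [0.43822148]])

default_preds = scores_to_predictions(scores)  # default threshold of 0.5
lower_preds = scores_to_predictions(scores, threshold=0.14)

print(default_preds)
print(lower_preds)
```

With the default threshold none of these five instances is predicted positive, while a threshold of 0.14 flips four of them to 1, which is why the threshold is worth tuning on the validation set.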

### Validating: Choose the best parameters

There are two parameters here:
- eta: the fairness penalty parameter of the PR model
- threshold: the decision threshold of the binary classification

The threshold is used to obtain predictions from the model output.
Eta is used during training.

### Question 3: Create a method that loops over 50 thresholds in ]0, 0.5[ and 6 values of eta in [0.0, 100.0], and outputs the metrics

In [10]:
get_metrics(
    y_true = dataset_orig_panel19_val.labels[:,0],
    prot_attr= dataset_orig_panel19_val.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_val.instance_weights
)

{'base_rate_truth': 0.22072904528726825,
 'statistical_parity_difference': -0.14003550662578024,
 'disparate_impact_ratio': 0.4948564702571672}
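
To make the two group metrics above concrete, here is a minimal NumPy sketch of how a weighted statistical parity difference and disparate impact ratio can be computed, assuming the usual convention of comparing the unprivileged group to the privileged one (difference and ratio of weighted positive rates). The helper names and the toy arrays are assumptions made for illustration; this is not the AIF360 implementation.

```python
import numpy as np


def weighted_positive_rate(y, weights, mask, pos_label=1):
    """Weighted share of positive labels among the rows selected by mask."""
    return np.sum(weights[mask] * (y[mask] == pos_label)) / np.sum(weights[mask])


def parity_metrics(y, prot_attr, weights, priv_group=1, pos_label=1):
    y, prot_attr, weights = (np.asarray(a) for a in (y, prot_attr, weights))
    priv = prot_attr == priv_group
    rate_priv = weighted_positive_rate(y, weights, priv, pos_label)
    rate_unpriv = weighted_positive_rate(y, weights, ~priv, pos_label)
    return {
        "statistical_parity_difference": rate_unpriv - rate_priv,
        "disparate_impact_ratio": rate_unpriv / rate_priv,
    }


# Toy data (made up): three privileged rows, three unprivileged rows
y = np.array([1, 0, 1, 0, 0, 1])
prot = np.array([1, 1, 1, 0, 0, 0])
w = np.array([1.0, 1.0, 2.0, 1.0, 1.0, 2.0])

toy_metrics = parity_metrics(y, prot, w)
print(toy_metrics)
```

A negative difference, or a ratio below 1, means the unprivileged group receives the positive label less often than the privileged group, which is the situation reported for the validation labels above.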

In [11]:
def validation_loop(thereshold_list, eta_list, train_dataset=train_dataset_scaled, val_dataset=dataset_orig_panel19_val):
    dataset = val_dataset.copy()
    dataset.features = pr_orig_scaler.transform(dataset.features)
    y_true = val_dataset.labels[:,0]
    prot_attr = val_dataset.protected_attributes[:,0]
    sample_weight = val_dataset.instance_weights
    metrics_list=[]

    for eta in eta_list:
        eta_model = train_pr_model(eta=eta, train_dataset=train_dataset)
        y_val_pred_prob = eta_model.predict(dataset).scores
        for thr in thereshold_list:
            y_val_pred = (y_val_pred_prob[:, -1] > thr).astype(np.float64)
            metrics = get_metrics(y_true=y_true, y_pred=y_val_pred, prot_attr=prot_attr, sample_weight=sample_weight)
            metrics['threshold'] = thr
            metrics['eta']=eta
            metrics_list.append(metrics)
    return metrics_list


In [12]:
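# Thresholds 0.01 to 0.49 (x=0 is mapped to 0.01, so 0.01 appears twice); eta values 0, 20, ..., 100
metrics_list = validation_loop([float(x/100) if x>0 else 0.01 for x in range(50) ], [ float(x*20) for x in range(6)])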
df_metrics = pd.DataFrame.from_records(metrics_list)
df_metrics

| | base_rate_truth | statistical_parity_difference | disparate_impact_ratio | base_rate_preds | equal_opportunity_difference | average_odds_difference | conditional_demographic_disparity | smoothed_edf | df_bias_amplification | balanced_accuracy_score | threshold | eta |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.220729 | -0.034800 | 0.965093 | 0.982916 | 0.000000 | -0.019825 | -0.096364 | 2.519941 | 1.816454 | 0.510961 | 0.01 | 0.0 |
| 1 | 0.220729 | -0.034800 | 0.965093 | 0.982916 | 0.000000 | -0.019825 | -0.096364 | 2.519941 | 1.816454 | 0.510961 | 0.01 | 0.0 |
| 2 | 0.220729 | -0.124200 | 0.873619 | 0.932640 | -0.010704 | -0.074537 | -0.091926 | 2.103743 | 1.400256 | 0.541498 | 0.02 | 0.0 |
| 3 | 0.220729 | -0.204108 | 0.786869 | 0.875326 | -0.010715 | -0.119249 | -0.086965 | 1.761479 | 1.057992 | 0.571315 | 0.03 | 0.0 |
| 4 | 0.220729 | -0.248157 | 0.726985 | 0.808846 | -0.031087 | -0.148392 | -0.074629 | 1.315223 | 0.611735 | 0.607733 | 0.04 | 0.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 295 | 0.220729 | -0.126389 | 0.436114 | 0.173155 | 0.013595 | -0.070038 | -0.041047 | 0.829851 | 0.126363 | 0.520541 | 0.45 | 100.0 |
| 296 | 0.220729 | -0.125617 | 0.428069 | 0.168963 | 0.014538 | -0.069146 | -0.041597 | 0.848471 | 0.144983 | 0.520766 | 0.46 | 100.0 |
| 297 | 0.220729 | -0.123271 | 0.424774 | 0.164573 | 0.014929 | -0.067455 | -0.041689 | 0.856199 | 0.152711 | 0.521178 | 0.47 | 100.0 |
| 298 | 0.220729 | -0.122407 | 0.418860 | 0.161254 | 0.012728 | -0.067504 | -0.042082 | 0.870218 | 0.166730 | 0.522749 | 0.48 | 100.0 |


### Question 4: Plot the metrics to choose the best set of parameters

In [14]:
fig = px.parallel_coordinates(
    df_metrics, 
    color="balanced_accuracy_score", 
    dimensions=["eta", 'threshold', 'disparate_impact_ratio','df_bias_amplification', 'balanced_accuracy_score'])
fig.show()


Both eta and the threshold affect the accuracy and the disparate impact.
The trade-off is hard to find, as the plot below shows, perhaps more readably.

In [15]:
fig = px.scatter(df_metrics, x='balanced_accuracy_score', y='disparate_impact_ratio', color='threshold', hover_data=["threshold", "eta"], facet_col="eta")
fig.show()


The larger eta is, the lower the balanced accuracy and the higher the disparate impact, which can even go well above 1.
An eta of 50.0 with a threshold of 0.14 therefore seems to be a good compromise.
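
Reading the compromise off a plot can be complemented by a small programmatic filter. The sketch below keeps the rows whose disparate impact ratio lies in an acceptance band and returns the one with the highest balanced accuracy. The band 0.8 to 1.25, the helper name `pick_parameters` and the toy values are assumptions made for illustration; they are not results from this notebook.

```python
import pandas as pd


def pick_parameters(df, di_low=0.8, di_high=1.25):
    """Among rows with an acceptable disparate impact, return the most accurate one."""
    fair = df[df["disparate_impact_ratio"].between(di_low, di_high)]
    if fair.empty:
        return None  # no (eta, threshold) pair satisfies the fairness band
    best = fair.loc[fair["balanced_accuracy_score"].idxmax()]
    return best[["eta", "threshold", "disparate_impact_ratio", "balanced_accuracy_score"]]


# Toy metrics table with the same column names as df_metrics (values are made up)
toy = pd.DataFrame(
    {
        "eta": [0.0, 0.0, 40.0, 40.0],
        "threshold": [0.10, 0.20, 0.10, 0.20],
        "disparate_impact_ratio": [0.55, 0.50, 0.95, 1.40],
        "balanced_accuracy_score": [0.72, 0.70, 0.68, 0.60],
    }
)

choice = pick_parameters(toy)
print(choice)
```

The same function could be called on `df_metrics`. Whatever rule is used, it should only ever see validation metrics, never test metrics.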

In [16]:
eta = 50.0
thr = 0.14
pr_orig_panel19 = train_pr_model(eta=eta)

In [17]:
val_dataset_scaled = dataset_orig_panel19_val.copy()
val_dataset_scaled.features = pr_orig_scaler.transform(val_dataset_scaled.features)
val_scores = pr_orig_panel19.predict(val_dataset_scaled).scores
val_preds = (val_scores[:, -1] > thr).astype(np.float64)



In [18]:
get_metrics(
    y_true = dataset_orig_panel19_val.labels[:,0],
    y_pred= val_preds,
    prot_attr= dataset_orig_panel19_val.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_val.instance_weights
)

{'base_rate_truth': 0.22072904528726825,
 'statistical_parity_difference': -0.015236638043276352,
 'disparate_impact_ratio': 0.9540549388774919,
 'base_rate_preds': 0.3254808860506231,
 'equal_opportunity_difference': 0.10584863869161054,
 'average_odds_difference': 0.06220191885464482,
 'conditional_demographic_disparity': -0.0032270032590193246,
 'smoothed_edf': 0.04703400723670348,
 'df_bias_amplification': -0.6564533858858657,
 'balanced_accuracy_score': 0.6725374394435751}

### Question 5: Evaluate: compute the metrics on the test dataset using the model trained with the selected parameters


In [19]:
test_dataset = dataset_orig_panel19_test.copy()
test_dataset.features = pr_orig_scaler.transform(test_dataset.features)
test_scores = pr_orig_panel19.predict(test_dataset).scores
test_preds = (test_scores[:, -1] > thr).astype(np.float64)

In [20]:
get_metrics(
    y_true = dataset_orig_panel19_test.labels[:,0],
    y_pred= test_preds,
    prot_attr= dataset_orig_panel19_test.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_test.instance_weights
)

{'base_rate_truth': 0.20798000643857775,
 'statistical_parity_difference': 0.0030802864679043696,
 'disparate_impact_ratio': 1.0099440229406393,
 'base_rate_preds': 0.31101048725128383,
 'equal_opportunity_difference': 0.12829083368876892,
 'average_odds_difference': 0.0848705674191425,
 'conditional_demographic_disparity': 0.0006573978858304182,
 'smoothed_edf': 0.00989492246891599,
 'df_bias_amplification': -0.698304266044081,
 'balanced_accuracy_score': 0.6934228599384545}

## Combine pre-processing and in-processing
### Question 6: Redo the Prejudice Remover approach, first applying the Reweighing pre-processing

In [21]:
from aif360.algorithms.preprocessing import Reweighing

RW = Reweighing(
    unprivileged_groups=[{'RACE': 0.0}], privileged_groups=[{'RACE': 1.0}]
)
RW.fit(dataset_orig_panel19_train)
dataset_rw_train = RW.transform(dataset_orig_panel19_train)
dataset_rw_val = RW.transform(dataset_orig_panel19_val)
dataset_rw_test = RW.transform(dataset_orig_panel19_test)
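
To give an idea of what Reweighing does to the instance weights, here is a minimal NumPy sketch of the classic reweighing rule: each (group, label) cell is rescaled so that its total weight matches what it would be if group and label were independent. The helper names and the toy data are assumptions made for illustration; this is not the AIF360 implementation.

```python
import numpy as np


def reweighing_weights(group, label, base_weight=None):
    """Rescale weights so that group and label become independent."""
    group, label = np.asarray(group), np.asarray(label)
    w = np.ones(len(label)) if base_weight is None else np.asarray(base_weight, dtype=float)
    new_w = w.copy()
    total = w.sum()
    for g in np.unique(group):
        for y in np.unique(label):
            cell = (group == g) & (label == y)
            observed = w[cell].sum()
            if observed == 0:
                continue  # empty cell, nothing to rescale
            expected = w[group == g].sum() * w[label == y].sum() / total
            new_w[cell] = w[cell] * expected / observed
    return new_w


def weighted_rate(label, weight, mask):
    """Weighted positive rate among the rows selected by mask."""
    return np.average(label[mask] == 1, weights=weight[mask])


# Toy data (made up): the privileged group (1) has 75% positives, the other group 25%
group = np.array([1, 1, 1, 1, 0, 0, 0, 0])
label = np.array([1, 1, 1, 0, 1, 0, 0, 0])

rw = reweighing_weights(group, label)
rate_priv = weighted_rate(label, rw, group == 1)
rate_unpriv = weighted_rate(label, rw, group == 0)
print(rate_priv, rate_unpriv)
```

After reweighing, both groups have the same weighted positive rate in the toy data. Only the weights change; features and labels are untouched.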


In [22]:
# Standard Scaler
pr_rw_scaler = StandardScaler()
dataset_rw_train_scaled = dataset_rw_train.copy()
dataset_rw_train_scaled.features = pr_rw_scaler.fit_transform(dataset_rw_train_scaled.features)

In [23]:
# Parameter exploration loop
metrics_rw_list = validation_loop(
    thereshold_list=[float(x/100) if x>0 else 0.01 for x in range(50) ], 
    eta_list=[ float(x*20) for x in range(5)], 
    train_dataset=dataset_rw_train_scaled,
    val_dataset=dataset_rw_val)

In [24]:
df_metrics_rw = pd.DataFrame.from_records(metrics_rw_list)

In [25]:
# Visualization
fig = px.parallel_coordinates(
    df_metrics_rw, 
    color="balanced_accuracy_score", 
    dimensions=["eta", 'threshold', 'disparate_impact_ratio','df_bias_amplification', 'balanced_accuracy_score'])
fig.show()

Note that df_bias_amplification is almost always positive here, because the dataset pre-processed with Reweighing is already nearly perfect in terms of measurable bias.

In [26]:
fig = px.scatter(df_metrics_rw, x='balanced_accuracy_score', y='disparate_impact_ratio', color='threshold', hover_data=["threshold", "eta"], facet_col="eta")
fig.show()

These two methods appear complementary and compatible: the trade-off between accuracy and fairness is easier to find. With eta values of 25 and 50 and a threshold around 0.13, we obtain a balanced accuracy of 73% and a disparate impact of 1.

In [27]:
eta = 50.0
thr = 0.13
pr_rw_panel19 = train_pr_model(eta=eta, train_dataset=dataset_rw_train_scaled)

In [28]:
# Evaluation (metric computation) on the validation dataset

val_dataset_rw_scaled = dataset_rw_val.copy()
val_dataset_rw_scaled.features = pr_rw_scaler.transform(val_dataset_rw_scaled.features)
val_scores = pr_rw_panel19.predict(val_dataset_rw_scaled).scores
val_preds = (val_scores[:,-1] > thr).astype(np.float64)
get_metrics(
    y_true = val_dataset_rw_scaled.labels[:,0],
    y_pred= val_preds,
    prot_attr= val_dataset_rw_scaled.protected_attributes[:,0],
    sample_weight= val_dataset_rw_scaled.instance_weights
)


{'base_rate_truth': 0.22117821483019162,
 'statistical_parity_difference': 0.03832036232413294,
 'disparate_impact_ratio': 1.1153052332563305,
 'base_rate_preds': 0.347839000877081,
 'equal_opportunity_difference': 0.0847068571534847,
 'average_odds_difference': 0.05601406034261591,
 'conditional_demographic_disparity': 0.007772020127861992,
 'smoothed_edf': 0.10912811961880509,
 'df_bias_amplification': 0.0891503265936765,
 'balanced_accuracy_score': 0.686282474833064}

The fairness metrics are excellent and the balanced_accuracy_score is close to 70%.
We managed to improve fairness without hurting the raw performance of the model.
The df_bias_amplification is close to zero because the reweighed dataset has very little bias: the fairness metrics leave little room for improvement, so this score cannot be as negative as without Reweighing.
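
The point about df_bias_amplification can be checked with simple arithmetic, assuming the usual definition in which it equals the smoothed EDF of the predictions minus the smoothed EDF of the true labels. Under that assumption, subtracting the two reported values gives the bias already present in the labels. The helper name `implied_dataset_edf` is hypothetical; the numbers are the validation metrics reported above, without and with Reweighing.

```python
def implied_dataset_edf(smoothed_edf_preds, df_bias_amplification):
    """Smoothed EDF of the labels implied by the two reported metrics."""
    return smoothed_edf_preds - df_bias_amplification


# Validation metrics of the Prejudice Remover without Reweighing
edf_without_rw = implied_dataset_edf(0.04703400723670348, -0.6564533858858657)

# Validation metrics of the Prejudice Remover with Reweighing
edf_with_rw = implied_dataset_edf(0.10912811961880509, 0.0891503265936765)

print(round(edf_without_rw, 4), round(edf_with_rw, 4))
```

The implied label bias is roughly 0.70 on the original validation set and roughly 0.02 on the reweighed one, so a model trained after Reweighing has almost no dataset bias left to reduce and its amplification score stays near zero or above.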

## Adversarial Debiasing

Adversarial debiasing [1] is an in-processing technique that learns a classifier to maximize prediction accuracy and simultaneously reduce an adversary's ability to determine the protected attribute from the predictions.

See the [AIF360 tutorial](https://github.com/Trusted-AI/AIF360/blob/main/examples/demo_adversarial_debiasing.ipynb).
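
To give an intuition for what reducing the adversary's ability means in terms of gradients, here is a minimal NumPy sketch of the classifier update described in the adversarial debiasing paper by Zhang, Lemoine and Mitchell (2018): the component of the classifier gradient that lies along the adversary gradient is removed, and an extra term pushes against the adversary. The function name `debiased_gradient`, the weight `alpha` and the toy vectors are assumptions made for illustration; this is not the AIF360 implementation.

```python
import numpy as np


def debiased_gradient(grad_clf, grad_adv, alpha=0.1, eps=1e-8):
    """Classifier gradient with its projection on the adversary gradient removed."""
    grad_clf = np.asarray(grad_clf, dtype=float)
    grad_adv = np.asarray(grad_adv, dtype=float)
    unit_adv = grad_adv / (np.linalg.norm(grad_adv) + eps)
    # Part of the classifier update that would also help the adversary
    projection = np.dot(grad_clf, unit_adv) * unit_adv
    # Remove it, then additionally move against the adversary's gradient
    return grad_clf - projection - alpha * grad_adv


# Toy gradients (made up) of the classifier loss and the adversary loss
update = debiased_gradient([1.0, 1.0], [1.0, 0.0], alpha=0.5)
print(update)
```

In the toy example the first coordinate, the only one the adversary relies on, is no longer descended along but pushed in the opposite direction, while the second coordinate keeps its original classifier gradient.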

Here we show how to train an Adversarial Debiasing model with the argument `debias` set to False.

In [29]:
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()
from aif360.algorithms.inprocessing.adversarial_debiasing import AdversarialDebiasing

sess = tf.Session()

plain_model = AdversarialDebiasing(
    unprivileged_groups=[{'RACE': 0.0}], 
    privileged_groups=[{'RACE': 1.0}],
    scope_name='plain_classifier',
    debias=False, 
    sess=sess)

plain_model.fit(dataset_orig_panel19_train)


epoch 0; iter: 0; batch classifier loss: 5.374690
epoch 1; iter: 0; batch classifier loss: 0.612059
epoch 2; iter: 0; batch classifier loss: 0.550926
epoch 3; iter: 0; batch classifier loss: 0.543599
epoch 4; iter: 0; batch classifier loss: 0.374120
epoch 5; iter: 0; batch classifier loss: 0.472071
epoch 6; iter: 0; batch classifier loss: 0.545163
epoch 7; iter: 0; batch classifier loss: 0.336630
epoch 8; iter: 0; batch classifier loss: 0.421914
epoch 9; iter: 0; batch classifier loss: 0.395803
epoch 10; iter: 0; batch classifier loss: 0.512769
epoch 11; iter: 0; batch classifier loss: 0.330521
epoch 12; iter: 0; batch classifier loss: 0.315490
epoch 13; iter: 0; batch classifier loss: 0.345536
epoch 14; iter: 0; batch classifier loss: 0.239229
epoch 15; iter: 0; batch classifier loss: 0.387996
epoch 16; iter: 0; batch classifier loss: 0.289149
epoch 17; iter: 0; batch classifier loss: 0.356012
epoch 18; iter: 0; batch classifier loss: 0.248340
epoch 19; iter: 0; batch classifier loss:

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x78d83a92b5b0>

In [30]:
# Apply the plain model to train and val data
dataset_nodebiasing_train = plain_model.predict(dataset_orig_panel19_train)
dataset_nodebiasing_val = plain_model.predict(dataset_orig_panel19_val)

In [31]:
get_metrics(
    y_true = dataset_orig_panel19_train.labels[:,0],
    y_pred= dataset_nodebiasing_train.labels[:,0],
    prot_attr= dataset_orig_panel19_train.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_train.instance_weights
)

{'base_rate_truth': 0.21451724306616177,
 'statistical_parity_difference': -0.14563927989835548,
 'disparate_impact_ratio': 0.373902592666064,
 'base_rate_preds': 0.17469166608341952,
 'equal_opportunity_difference': -0.16895288908532996,
 'average_odds_difference': -0.11265476868348775,
 'conditional_demographic_disparity': -0.04950064763812943,
 'smoothed_edf': 0.9837598229874769,
 'df_bias_amplification': 0.2956815628076588,
 'balanced_accuracy_score': 0.7699920794533341}

In [32]:
get_metrics(
    y_true = dataset_orig_panel19_val.labels[:,0],
    y_pred= dataset_nodebiasing_val.labels[:,0],
    prot_attr= dataset_orig_panel19_val.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_val.instance_weights
)

{'base_rate_truth': 0.22072904528726825,
 'statistical_parity_difference': -0.16923836643546045,
 'disparate_impact_ratio': 0.3035493858897022,
 'base_rate_preds': 0.17473064073335087,
 'equal_opportunity_difference': -0.22057827341954667,
 'average_odds_difference': -0.15564589931117803,
 'conditional_demographic_disparity': -0.05457126713047523,
 'smoothed_edf': 1.1922106607200975,
 'df_bias_amplification': 0.4887232675975284,
 'balanced_accuracy_score': 0.7093721829285422}

In [48]:
sess.close()
tf.reset_default_graph()


### Question 7: Do the same (train an Adversarial Debiasing model) with the argument debias set to True

Compare the resulting metrics.

In [49]:
# Learn parameters with debias set to True
sess = tf.Session()
debiased_model = AdversarialDebiasing(
    unprivileged_groups=[{'RACE': 0.0}], 
    privileged_groups=[{'RACE': 1.0}],
    scope_name='debiased_model',
    debias=True, 
    sess=sess)

In [50]:
debiased_model.fit(dataset_orig_panel19_train)

epoch 0; iter: 0; batch classifier loss: 1.668031; batch adversarial loss: 0.675715
epoch 1; iter: 0; batch classifier loss: 0.600276; batch adversarial loss: 0.719155
epoch 2; iter: 0; batch classifier loss: 0.424414; batch adversarial loss: 0.682085
epoch 3; iter: 0; batch classifier loss: 0.384530; batch adversarial loss: 0.741544
epoch 4; iter: 0; batch classifier loss: 0.398656; batch adversarial loss: 0.689060
epoch 5; iter: 0; batch classifier loss: 0.317573; batch adversarial loss: 0.713117
epoch 6; iter: 0; batch classifier loss: 0.355454; batch adversarial loss: 0.701915
epoch 7; iter: 0; batch classifier loss: 0.344063; batch adversarial loss: 0.713198
epoch 8; iter: 0; batch classifier loss: 0.442193; batch adversarial loss: 0.692672
epoch 9; iter: 0; batch classifier loss: 0.388846; batch adversarial loss: 0.704799
epoch 10; iter: 0; batch classifier loss: 0.327973; batch adversarial loss: 0.658760
epoch 11; iter: 0; batch classifier loss: 0.277305; batch adversarial loss:

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x78d8395d1ae0>

In [51]:
# Apply the debiased model to train and val data
dataset_debiased_train = debiased_model.predict(dataset_orig_panel19_train)
dataset_debiased_val = debiased_model.predict(dataset_orig_panel19_val)

In [52]:
sess.close()
tf.reset_default_graph()


In [None]:
get_metrics(
    y_true = dataset_orig_panel19_train.labels[:,0],
    y_pred= dataset_debiased_train.labels[:,0],
    prot_attr= dataset_orig_panel19_train.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_train.instance_weights
)

{'base_rate_truth': 0.21451724306616177,
 'statistical_parity_difference': -0.03286073522352835,
 'disparate_impact_ratio': 0.7829856100778496,
 'base_rate_preds': 0.13835275181308068,
 'equal_opportunity_difference': 0.08605263139629149,
 'average_odds_difference': 0.05171283635752971,
 'conditional_demographic_disparity': -0.013507679412370324,
 'smoothed_edf': 0.2446409015764861,
 'df_bias_amplification': -0.443437358603332,
 'balanced_accuracy_score': 0.7304047954243789}

In [54]:
get_metrics(
    y_true = dataset_orig_panel19_val.labels[:,0],
    y_pred= dataset_debiased_val.labels[:,0],
    prot_attr= dataset_orig_panel19_val.protected_attributes[:,0],
    sample_weight= dataset_orig_panel19_val.instance_weights
)

{'base_rate_truth': 0.22072904528726825,
 'statistical_parity_difference': -0.043359614701920665,
 'disparate_impact_ratio': 0.7255907209916608,
 'base_rate_preds': 0.14051951585037967,
 'equal_opportunity_difference': 0.05363167637931765,
 'average_odds_difference': 0.026357224025350512,
 'conditional_demographic_disparity': -0.016693321341970227,
 'smoothed_edf': 0.3207690559619516,
 'df_bias_amplification': -0.38271833716061754,
 'balanced_accuracy_score': 0.6851884726452877}

### Question 8: Combine the Reweighing with the Adversarial Debiasing

In [55]:
sess = tf.Session()
debiased_rw_model = AdversarialDebiasing(
    unprivileged_groups=[{'RACE': 0.0}], 
    privileged_groups=[{'RACE': 1.0}],
    scope_name='debiased_rw_model',
    debias=True, 
    sess=sess)

In [56]:
debiased_rw_model.fit(dataset_rw_train_scaled)

epoch 0; iter: 0; batch classifier loss: 1.154897; batch adversarial loss: 0.658370
epoch 1; iter: 0; batch classifier loss: 0.348980; batch adversarial loss: 0.688762
epoch 2; iter: 0; batch classifier loss: 0.357638; batch adversarial loss: 0.687641
epoch 3; iter: 0; batch classifier loss: 0.288130; batch adversarial loss: 0.680000
epoch 4; iter: 0; batch classifier loss: 0.327123; batch adversarial loss: 0.679374
epoch 5; iter: 0; batch classifier loss: 0.223153; batch adversarial loss: 0.675010
epoch 6; iter: 0; batch classifier loss: 0.328404; batch adversarial loss: 0.664870
epoch 7; iter: 0; batch classifier loss: 0.242467; batch adversarial loss: 0.671999
epoch 8; iter: 0; batch classifier loss: 0.212294; batch adversarial loss: 0.665952
epoch 9; iter: 0; batch classifier loss: 0.285957; batch adversarial loss: 0.673434
epoch 10; iter: 0; batch classifier loss: 0.211140; batch adversarial loss: 0.635633
epoch 11; iter: 0; batch classifier loss: 0.220927; batch adversarial loss:

<aif360.algorithms.inprocessing.adversarial_debiasing.AdversarialDebiasing at 0x78d8302f4790>

In [58]:
# Apply the debiased model trained on the reweighed data to train and val data
dataset_debiased_rw_train = debiased_rw_model.predict(dataset_rw_train_scaled)
dataset_debiased_rw_val = debiased_rw_model.predict(val_dataset_rw_scaled)

In [61]:
get_metrics(
    y_true = dataset_rw_train_scaled.labels[:,0],
    y_pred= dataset_debiased_rw_train.labels[:,0],
    prot_attr= dataset_rw_train_scaled.protected_attributes[:,0],
    sample_weight= dataset_rw_train_scaled.instance_weights
)

{'base_rate_truth': 0.21451724306616174,
 'statistical_parity_difference': 0.034066849708783875,
 'disparate_impact_ratio': 1.2719115818733928,
 'base_rate_preds': 0.13883535227369675,
 'equal_opportunity_difference': 0.15386967241765637,
 'average_odds_difference': 0.07760904056335116,
 'conditional_demographic_disparity': 0.013962606237140575,
 'smoothed_edf': 0.24052095710201438,
 'df_bias_amplification': 0.24052094124953416,
 'balanced_accuracy_score': 0.819415000779465}

In [62]:
get_metrics(
    y_true = val_dataset_rw_scaled.labels[:,0],
    y_pred= dataset_debiased_rw_val.labels[:,0],
    prot_attr= val_dataset_rw_scaled.protected_attributes[:,0],
    sample_weight= val_dataset_rw_scaled.instance_weights
)

{'base_rate_truth': 0.22117821483019162,
 'statistical_parity_difference': -0.02865933396734835,
 'disparate_impact_ratio': 0.7577313861357575,
 'base_rate_preds': 0.10670301961727642,
 'equal_opportunity_difference': -0.0458395485628581,
 'average_odds_difference': -0.03408427596721164,
 'conditional_demographic_disparity': -0.013833463412643522,
 'smoothed_edf': 0.2774261869825949,
 'df_bias_amplification': 0.2574483939574663,
 'balanced_accuracy_score': 0.6305208309403187}

This in-processing approach does not seem compatible with Reweighing: the df_bias_amplification is high, and the disparate impact ratio is not improved by using Reweighing as a pre-processing step.
Although very effective on the fairness metrics of the dataset, Reweighing is not suited to every kind of machine learning algorithm.



## Analysis of the influence of Reweighing 

### QUESTION 9: To go further, study the impact of Reweighing on different models, in particular decision trees

In [28]:
from sklearn import tree
DTclf = tree.DecisionTreeClassifier()

Training a decision tree without Reweighing

In [None]:
DTclf.fit(
    X=train_dataset_scaled.features, 
    y=train_dataset_scaled.labels[:,0],
    sample_weight=train_dataset_scaled.instance_weights)
metrics_list=[]
y_val_pred_scores = DTclf.predict_proba(val_dataset_scaled.features)

for thr in [float(x/20) if x>0 else 0.01 for x in range(10) ]:
    y_val_pred = (y_val_pred_scores[:, -1] > thr).astype(np.float64)
    metrics = get_metrics(
        y_true=val_dataset_scaled.labels[:,0], 
        y_pred=y_val_pred, 
        prot_attr=val_dataset_scaled.protected_attributes[:,0], 
        sample_weight=val_dataset_scaled.instance_weights)
    metrics['threshold'] = thr
    metrics_list.append(metrics)
df_metrics_DT = pd.DataFrame.from_records(metrics_list)
df_metrics_DT

In [None]:
fig = px.scatter(df_metrics_DT, x='balanced_accuracy_score', y='disparate_impact_ratio', color='threshold', hover_data=["threshold"])
fig.show()

We observe that the threshold has little impact on the two metrics.
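
One plausible explanation for this observation is that a decision tree grown without depth limit ends up with pure leaves, so `predict_proba` returns only 0 or 1 and every threshold strictly between them yields the same predictions. The sketch below illustrates this on synthetic data; the data, the seed and the variable names are assumptions made for illustration and do not come from the MEPS dataset.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (made up): 200 rows, 5 continuous features, noisy binary label
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + rng.normal(scale=1.0, size=200) > 0).astype(int)

# Default settings: the tree keeps splitting until every leaf is pure
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)
proba = full_tree.predict_proba(X)[:, 1]
distinct = np.unique(proba)

# Predictions for thresholds 0.05, 0.10, ..., 0.45
thresholds = [0.05 * k for k in range(1, 10)]
predictions = [(proba > t).astype(int) for t in thresholds]
all_identical = all(np.array_equal(predictions[0], p) for p in predictions)

print(distinct, all_identical)
```

Limiting the depth (for example with `max_depth`) or requiring a minimum number of samples per leaf would give intermediate probabilities, and the threshold would then matter more.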

In [None]:
# with Reweighing
DTclf_rw = tree.DecisionTreeClassifier()
DTclf_rw.fit(
    X=dataset_rw_train_scaled.features, 
    y=dataset_rw_train_scaled.labels[:,0],
    sample_weight=dataset_rw_train_scaled.instance_weights)
metrics_list=[]
y_val_rw_pred_scores = DTclf_rw.predict_proba(val_dataset_rw_scaled.features)
for thr in [float(x/20) if x>0 else 0.01 for x in range(10) ]:
    y_val_pred = (y_val_rw_pred_scores[:, -1] > thr).astype(np.float64)
    metrics = get_metrics(
        y_true=val_dataset_rw_scaled.labels[:,0], 
        y_pred=y_val_pred, 
        prot_attr=val_dataset_rw_scaled.protected_attributes[:,0], 
        sample_weight=val_dataset_rw_scaled.instance_weights)
    metrics['threshold'] = thr
    metrics_list.append(metrics)
df_metrics_rw_DT = pd.DataFrame.from_records(metrics_list)
df_metrics_rw_DT

With Reweighing, the disparate impact has improved and is closer to 1, but the df_bias_amplification has increased because the reweighed dataset is no longer biased. The model still over-learns the bias: Reweighing is not as powerful for decision trees.

In [None]:
fig = px.scatter(df_metrics_rw_DT, x='balanced_accuracy_score', y='disparate_impact_ratio', color='threshold', hover_data=["threshold"])
fig.show()

In [None]:
# Naive Bayes without Reweighing
from sklearn import naive_bayes 
NBclf = naive_bayes.GaussianNB()
NBclf.fit(
    X=train_dataset_scaled.features, 
    y=train_dataset_scaled.labels[:,0],
    sample_weight=train_dataset_scaled.instance_weights)
metrics_list=[]
y_val_pred_scores = NBclf.predict_proba(val_dataset_scaled.features)
for thr in [float(x/20) if x>0 else 0.01 for x in range(20) ]:
    y_val_pred = (y_val_pred_scores[:, 1] > thr).astype(np.float64)
    metrics = get_metrics(
        y_true=val_dataset_scaled.labels[:,0], 
        y_pred=y_val_pred, 
        prot_attr=val_dataset_scaled.protected_attributes[:,0], 
        sample_weight=val_dataset_scaled.instance_weights)
    metrics['threshold'] = thr
    metrics_list.append(metrics)
df_metrics_NB = pd.DataFrame.from_records(metrics_list)
df_metrics_NB

In [None]:
fig = px.scatter(df_metrics_NB, x='balanced_accuracy_score', y='disparate_impact_ratio', color='threshold', hover_data=["threshold"])
fig.show()

In [None]:
# NB with Reweighing
NBclf_rw = naive_bayes.GaussianNB()
NBclf_rw.fit(
    X=dataset_rw_train_scaled.features, 
    y=dataset_rw_train_scaled.labels[:,0],
    sample_weight=dataset_rw_train_scaled.instance_weights)
metrics_list=[]
y_val_rw_pred_scores = NBclf_rw.predict_proba(val_dataset_rw_scaled.features)
for thr in [float(x/20) if x>0 else 0.01 for x in range(10) ]:
    y_val_pred = (y_val_rw_pred_scores[:, -1] > thr).astype(np.float64)
    metrics = get_metrics(
        y_true=val_dataset_rw_scaled.labels[:,0], 
        y_pred=y_val_pred, 
        prot_attr=val_dataset_rw_scaled.protected_attributes[:,0], 
        sample_weight=val_dataset_rw_scaled.instance_weights)
    metrics['threshold'] = thr
    metrics_list.append(metrics)
df_metrics_rw_NB = pd.DataFrame.from_records(metrics_list)
df_metrics_rw_NB

In [None]:
fig = px.scatter(df_metrics_rw_NB, x='balanced_accuracy_score', y='disparate_impact_ratio', color='threshold', hover_data=["threshold"])
fig.show()

In [None]:
# Logistic Regression without Reweighing
from sklearn.linear_model import LogisticRegression

LR_clf = LogisticRegression(solver='liblinear', random_state=42)
LR_clf.fit(
    X=train_dataset_scaled.features, 
    y=train_dataset_scaled.labels[:,0],
    sample_weight=train_dataset_scaled.instance_weights)
metrics_list=[]
y_val_pred_scores = LR_clf.predict_proba(val_dataset_scaled.features)
for thr in [float(x/20) if x>0 else 0.01 for x in range(10) ]:
    y_val_pred = (y_val_pred_scores[:, -1] > thr).astype(np.float64)
    metrics = get_metrics(
        y_true=val_dataset_scaled.labels[:,0], 
        y_pred=y_val_pred, 
        prot_attr=val_dataset_scaled.protected_attributes[:,0], 
        sample_weight=val_dataset_scaled.instance_weights)
    metrics['threshold'] = thr
    metrics_list.append(metrics)
df_metrics_LR = pd.DataFrame.from_records(metrics_list)
df_metrics_LR

In [None]:
# Logistic Regression with Reweighing
LR_clf_rw = LogisticRegression(solver='liblinear', random_state=42)
LR_clf_rw.fit(
    X=dataset_rw_train_scaled.features, 
    y=dataset_rw_train_scaled.labels[:,0],
    sample_weight=dataset_rw_train_scaled.instance_weights)
metrics_list=[]
y_val_rw_pred_scores = LR_clf_rw.predict_proba(val_dataset_rw_scaled.features)
for thr in [float(x/20) if x>0 else 0.01 for x in range(10) ]:
    y_val_pred = (y_val_rw_pred_scores[:, -1] > thr).astype(np.float64)
    metrics = get_metrics(
        y_true=val_dataset_rw_scaled.labels[:,0], 
        y_pred=y_val_pred, 
        prot_attr=val_dataset_rw_scaled.protected_attributes[:,0], 
        sample_weight=val_dataset_rw_scaled.instance_weights)
    metrics['threshold'] = thr
    metrics_list.append(metrics)
df_metrics_rw_LR = pd.DataFrame.from_records(metrics_list)
df_metrics_rw_LR