Unveiling the Power of Bias Adjustment: Enhancing Predictive Precision in Imbalanced Datasets | by Hyung Gyu Rho | Aug, 2023


To demonstrate the effectiveness of our bias adjustment algorithm in addressing class imbalance, we employ a real-world dataset from a Kaggle competition focused on credit card fraud detection. In this scenario, the challenge lies in predicting whether a credit card transaction is fraudulent (labeled as 1) or not (labeled as 0), given the inherent rarity of fraud cases.

We start by loading essential packages and preparing the dataset:

import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_addons as tfa
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE, RandomOverSampler

# Load and preprocess the dataset
df = pd.read_csv("/kaggle/input/playground-series-s3e4/train.csv")
y, x = df.Class, df[df.columns[1:-1]]
x = (x - x.min()) / (x.max() - x.min())
x_train, x_valid, y_train, y_valid = train_test_split(x, y, test_size=0.3, random_state=1)
batch_size = 256
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(buffer_size=1024).batch(batch_size)
valid_dataset = tf.data.Dataset.from_tensor_slices((x_valid, y_valid)).batch(batch_size)
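Before modeling, it is worth quantifying just how rare the positive class is. The snippet below is a minimal sketch using a synthetic label vector (the real labels come from the Kaggle CSV loaded above; the 2% fraud rate is an illustrative assumption, not the competition's actual ratio):

```python
import numpy as np

# Stand-in label vector for illustration only; in the article, `y` is the
# Class column of the Kaggle credit card fraud dataset.
rng = np.random.default_rng(0)
y = rng.choice([0, 1], size=10_000, p=[0.98, 0.02])

n_fraud = int(y.sum())
ratio = n_fraud / len(y)
print(f"fraud cases: {n_fraud} / {len(y)} ({ratio:.2%})")
```

With such a skewed distribution, a model that always predicts "not fraud" already achieves high accuracy, which is why a ranking metric like AUC and a bias correction are needed.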

We then define a simple deep learning model for binary classification and set up the optimizer, loss function, and evaluation metric. Following the competition's evaluation criterion, we use AUC as the metric. The model is intentionally kept simple: the focus of this article is implementing the bias adjustment algorithm, not maximizing predictive performance.

model = tf.keras.Sequential([
    tf.keras.layers.Normalization(),
    tf.keras.layers.Dense(32, activation='swish'),
    tf.keras.layers.Dense(32, activation='swish'),
    tf.keras.layers.Dense(1)  # raw logit output; sigmoid is applied in the training step
])
optimizer = tf.keras.optimizers.Adam()
loss = tf.keras.losses.BinaryCrossentropy()
val_metric = tf.keras.metrics.AUC()

The core of the bias adjustment algorithm lives in the training and validation steps, where class imbalance is addressed directly. Below, we walk through how each step contributes to balancing the model's predictions.

Training Step with Accumulating Delta Values

In the training step, we calculate and accumulate the sum of the model's raw outputs (logits) for each class separately: delta0 holds the running sum for class 0 and delta1 for class 1. These per-class sums are the raw material for the bias adjustment applied at validation time.

# Define the training step
@tf.function
def train_step(x, y):
    delta0 = tf.constant(0.0, dtype=tf.float32)
    delta1 = tf.constant(0.0, dtype=tf.float32)
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        y_pred = tf.keras.activations.sigmoid(logits)
        loss_value = loss(y, y_pred)
    # Accumulate the negative sum of logits for each class.
    # tf.reduce_sum over an empty selection returns 0, so batches that
    # contain only one class need no special-casing.
    delta0 -= tf.reduce_sum(logits[y == 0])
    delta1 -= tf.reduce_sum(logits[y == 1])
    grads = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return loss_value, delta0, delta1
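The delta accumulation inside train_step can be mirrored in plain NumPy, which makes the bookkeeping easy to check by hand. The logits and labels below are made-up toy values, not outputs of the actual model:

```python
import numpy as np

def batch_deltas(logits, y):
    """NumPy mirror of the per-batch delta accumulation in train_step:
    the negative sum of raw logits for each class (an empty class
    contributes 0)."""
    logits = np.asarray(logits, dtype=float)
    y = np.asarray(y)
    delta0 = -logits[y == 0].sum()
    delta1 = -logits[y == 1].sum()
    return delta0, delta1

# One toy batch: three majority-class examples and one minority-class example.
d0, d1 = batch_deltas([-2.0, -1.5, -2.5, 0.5], [0, 0, 0, 1])
print(d0, d1)  # 6.0 -0.5
```

The three class-0 logits sum to -6.0, so delta0 becomes 6.0; the lone class-1 logit of 0.5 yields delta1 = -0.5. Negative logits for the majority class thus accumulate into a positive correction.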

Validation Step: Imbalance Resolution with Delta

The normalized delta value derived from training is used in the validation step. The test_step function adds delta to the model's logits before the sigmoid, shifting predictions to better reflect the true class distribution and yielding a more faithful evaluation.

@tf.function
def test_step(x, y, delta):
    logits = model(x, training=False)
    y_pred = tf.keras.activations.sigmoid(logits + delta)  # adjust predictions with delta
    val_metric.update_state(y, y_pred)
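To see why adding delta changes the evaluation, note that a positive shift in logit space raises every predicted probability monotonically. A minimal numeric sketch (the logit and delta values here are invented for illustration):

```python
import numpy as np

def sigmoid(z):
    # Standard logistic function, matching tf.keras.activations.sigmoid
    return 1.0 / (1.0 + np.exp(-z))

logit = -1.0   # hypothetical raw model output for one transaction
delta = 1.5    # hypothetical adjustment; in training it comes from the delta sums

# The same logit maps to a higher fraud probability after the shift.
print(sigmoid(logit), sigmoid(logit + delta))
```

Because the shift is applied uniformly in logit space, it re-centers the decision boundary without reordering examples that differ only slightly in score.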

Utilizing Delta Values for Imbalance Correction

As training progresses, the delta0 and delta1 sums accumulate across batches, capturing the bias in the model's raw outputs for each class. At the end of each epoch, we divide each accumulated sum by the number of observations in the corresponding class, then average the two normalized values. This per-class normalization is the heart of the bias adjustment: it prevents the majority class from dominating the correction simply by contributing more terms to the sum.
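A quick back-of-the-envelope version of this epoch-end normalization, with invented accumulated sums and class counts (the real values come from the training loop below):

```python
# Hypothetical accumulated sums after one epoch: the majority class has
# contributed many terms, the minority class only a few.
delta0_sum, delta1_sum = 1200.0, -4.0
n_class0, n_class1 = 10_000, 40

# Normalize each sum by its class count, then average the two means.
delta = (delta0_sum / n_class0 + delta1_sum / n_class1) / 2
print(delta)  # ≈ 0.01
```

Dividing by the class counts first means the minority class's mean (-0.1) carries the same weight as the majority class's mean (0.12), even though it was computed from 250 times fewer observations.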

E = 1000  # maximum number of epochs
P = 10    # early-stopping patience
B = len(train_dataset)  # number of training batches
N_class0, N_class1 = sum(y_train == 0), sum(y_train == 1)
early_stopping_patience = 0
best_metric = 0

for epoch in range(E):
    # Reset the accumulated deltas at the start of each epoch
    delta0 = tf.constant(0.0, dtype=tf.float32)
    delta1 = tf.constant(0.0, dtype=tf.float32)
    print("\nStart of epoch %d" % (epoch,))

    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        loss_value, step_delta0, step_delta1 = train_step(x_batch_train, y_batch_train)
        # Update the running deltas
        delta0 += step_delta0
        delta1 += step_delta1

    # Normalize each class's sum by its class count, then average
    delta = (delta0 / N_class0 + delta1 / N_class1) / 2

    # Run a validation loop at the end of each epoch.
    for x_batch_val, y_batch_val in valid_dataset:
        test_step(x_batch_val, y_batch_val, delta)

    val_auc = val_metric.result()
    val_metric.reset_states()
    print("Validation AUC: %.4f" % (float(val_auc),))

    if val_auc > best_metric:
        best_metric = val_auc
        early_stopping_patience = 0
    else:
        early_stopping_patience += 1

    if early_stopping_patience > P:
        print("Reached early-stopping patience. Training finished at validation AUC: %.4f" % (float(best_metric),))
        break

The Outcome

Applied to credit card fraud detection, the algorithm delivers a clear gain. With bias adjustment integrated into the training process, the model reaches a validation AUC of 0.77, compared with 0.71 for the same model trained without it. The improvement demonstrates that correcting the logit bias induced by class imbalance translates directly into more reliable predictions.
