The goal of this project is to analyze credit card transactions and develop machine learning models that accurately detect fraudulent ones.
Credit card fraud detection involves identifying unauthorized or suspicious transactions made with stolen or compromised card information.
Financial institutions rely on advanced fraud detection systems to protect customers and minimize financial loss.
Dataset Source:
Kaggle - Credit Card Fraud Detection
- NumPy: numerical computation
- Pandas: data manipulation and analysis
- Seaborn: data visualization
- Matplotlib: data visualization
- Scikit-learn: machine learning and model evaluation
Data Collection & Cleaning
- Data sourced from Kaggle
- Cleaned by removing irrelevant or redundant columns
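The cleaning step can be sketched with Pandas as follows. This is a minimal illustration, not the project's actual code: the toy DataFrame, the column names (`transaction_id`, `amount_copy`, `is_fraud`), and the helper `clean` are all hypothetical, and dropping duplicate rows is an added assumption beyond what the list above states.

```python
import pandas as pd

# Toy stand-in for the Kaggle CSV; every column name here is hypothetical.
raw = pd.DataFrame({
    "transaction_id": [1, 2, 3, 3, 4],               # identifier, carries no signal
    "amount": [12.5, 250.0, 3.2, 3.2, 99.9],
    "amount_copy": [12.5, 250.0, 3.2, 3.2, 99.9],    # redundant copy of "amount"
    "is_fraud": [0, 1, 0, 0, 0],
})

def clean(df, irrelevant):
    """Drop the named columns, then drop exact duplicate rows (an assumption)."""
    df = df.drop(columns=irrelevant)
    df = df.drop_duplicates()
    return df.reset_index(drop=True)

cleaned = clean(raw, irrelevant=["transaction_id", "amount_copy"])
print(cleaned)
```

With a real file, `raw` would typically come from `pd.read_csv(...)` instead of an in-memory frame.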
Exploratory Data Analysis (EDA)
- Visualized transaction distributions and feature correlations
- Identified trends and patterns between fraudulent and normal transactions
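The kind of summary that usually sits behind these plots can be sketched as below. The data is synthetic and the column names (`amount`, `noise`, `is_fraud`) are assumptions made for illustration; the fraud rows are deliberately shifted so that there is a pattern to find.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000

# Synthetic data: roughly 5% of rows are labelled as fraud (hypothetical ratio).
is_fraud = (rng.random(n) < 0.05).astype(int)
amount = rng.exponential(50.0, n) + 100.0 * is_fraud   # fraud rows shifted upward
noise = rng.normal(size=n)                             # feature unrelated to the label
df = pd.DataFrame({"amount": amount, "noise": noise, "is_fraud": is_fraud})

# How many rows fall in each class.
class_counts = df["is_fraud"].value_counts()

# Distribution of a feature, split by class.
per_class = df.groupby("is_fraud")["amount"].describe()

# Correlation of every feature with the label, strongest first.
label_corr = (
    df.corr()["is_fraud"]
    .drop("is_fraud")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)

print(class_counts, per_class, label_corr, sep="\n\n")
```

For the visual version, `seaborn.heatmap(df.corr())` and `seaborn.histplot(data=df, x="amount", hue="is_fraud")` are common choices.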
Handling Imbalanced Data
- Dataset is highly imbalanced: fraudulent transactions are a small minority compared with non-fraudulent ones
- Applied SMOTE (Synthetic Minority Oversampling Technique)
- Tried under-sampling and over-sampling approaches
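To make the resampling ideas concrete, here is a small NumPy sketch of random under-sampling and of SMOTE-style interpolation. It is a simplified illustration, not the implementation this project used: the helpers `random_undersample` and `smote_like` are hypothetical, the minority class is assumed to be labelled 1, and the data is synthetic. In practice the imbalanced-learn package (`imblearn.over_sampling.SMOTE`, `imblearn.under_sampling.RandomUnderSampler`) is the usual choice.

```python
import numpy as np

def random_undersample(X, y, rng):
    """Keep all minority rows and an equally sized random sample of majority rows."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    keep = rng.choice(majority, size=len(minority), replace=False)
    idx = np.concatenate([minority, keep])
    return X[idx], y[idx]

def smote_like(X_min, n_new, k, rng):
    """Create n_new synthetic minority points by interpolating toward neighbours."""
    # Pairwise distances between minority points only.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                   # a point is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]     # k nearest minority neighbours
    base = rng.integers(0, len(X_min), size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                  # position along the connecting segment
    return X_min[base] + gap * (X_min[nb] - X_min[base])

rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 200 + [1] * 10)

# Under-sampling: shrink the majority class to the minority size.
X_under, y_under = random_undersample(X, y, rng)

# Over-sampling: add synthetic minority points until the classes are balanced.
X_new = smote_like(X[y == 1], n_new=190, k=5, rng=rng)
X_over = np.vstack([X, X_new])
y_over = np.concatenate([y, np.ones(len(X_new), dtype=int)])
```

Resampling is generally applied to the training portion only, so that synthetic or duplicated points do not leak into the test set.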
Data Splitting
- Split into training and test sets (e.g., 80-20 split)
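An 80-20 split can be sketched with scikit-learn's `train_test_split`. The data below is synthetic with a hypothetical 2% fraud rate, and the use of `stratify=y` is an assumption: the source does not say whether the split was stratified, but stratifying keeps the fraud ratio the same in both parts, which matters when positives are rare.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = np.array([1] * 20 + [0] * 980)   # 2% positives, a hypothetical ratio

# 80% train, 20% test; stratify=y preserves the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(len(X_train), len(X_test), y_train.sum(), y_test.sum())
```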
Model Training
- Logistic Regression
- Random Forest Classifier
- Decision Tree Classifier
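Training the three classifiers listed above might look like the following sketch. The dataset comes from `make_classification` as a stand-in for the real transactions, and the hyperparameters (`max_iter=1000`, `n_estimators=100`, fixed `random_state`) are illustrative assumptions rather than the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the transaction data (about 5% positives).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# fit() returns the estimator itself, so the dict holds the trained models.
fitted = {name: model.fit(X_train, y_train) for name, model in models.items()}
predictions = {name: model.predict(X_test) for name, model in fitted.items()}
```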
Model Evaluation
- Metrics: F1-score, Accuracy, Precision, Recall
- Compared model performances to select the best performer
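The hand-made example below illustrates how these metrics behave on imbalanced data. The labels and both sets of predictions are invented for illustration, and choosing the winner by F1-score is an assumption; the source does not state which metric decided the comparison.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# 100 transactions: 5 fraudulent (1), 95 legitimate (0).
y_true = [1] * 5 + [0] * 95

# Baseline that never flags anything.
always_legit = [0] * 100

# Detector that catches 4 of the 5 frauds and raises 6 false alarms.
detector = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89

def report(y_true, y_pred):
    """Collect the four metrics; zero_division=0 avoids warnings when nothing is flagged."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
    }

scores = {
    "always_legit": report(y_true, always_legit),
    "detector": report(y_true, detector),
}
best = max(scores, key=lambda name: scores[name]["f1"])

for name, s in scores.items():
    print(name, s)
print("best by F1:", best)
```

The baseline reaches 0.95 accuracy while catching no fraud at all, whereas the detector scores 0.93 accuracy with 0.8 recall. This is why precision, recall, and F1-score are more informative than accuracy alone for this problem.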
Result Visualization
- Confusion matrix
- ROC Curve
- Precision-Recall curve
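The three plots listed above can be produced along these lines. This is a sketch on synthetic data with a logistic regression model chosen only for illustration; the `Agg` backend is an assumption so the code runs without a display, and the layout choices are not taken from the project.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, confusion_matrix, precision_recall_curve, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the transaction data.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
fraud_scores = model.predict_proba(X_test)[:, 1]   # probability of the positive class

cm = confusion_matrix(y_test, model.predict(X_test))
fpr, tpr, _ = roc_curve(y_test, fraud_scores)
precision, recall, _ = precision_recall_curve(y_test, fraud_scores)
roc_auc = auc(fpr, tpr)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Confusion matrix: rows are actual classes, columns are predicted classes.
axes[0].imshow(cm, cmap="Blues")
for (i, j), count in np.ndenumerate(cm):
    axes[0].text(j, i, str(count), ha="center", va="center")
axes[0].set(title="Confusion matrix", xlabel="Predicted", ylabel="Actual",
            xticks=[0, 1], yticks=[0, 1])

# ROC curve with the chance diagonal for reference.
axes[1].plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}")
axes[1].plot([0, 1], [0, 1], linestyle="--")
axes[1].set(title="ROC curve", xlabel="False positive rate", ylabel="True positive rate")
axes[1].legend()

# Precision-recall curve.
axes[2].plot(recall, precision)
axes[2].set(title="Precision-Recall curve", xlabel="Recall", ylabel="Precision")

fig.tight_layout()
```

Calling `fig.savefig("evaluation.png")` writes the figure to disk; the file name is only an example.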
- Traditional rule-based fraud detection often produces many false positives, blocking legitimate transactions (e.g., when a customer is traveling abroad).
- ML models learn individual behavior patterns, improving fraud detection accuracy and reducing false positives.