CNN for Scene Classification
Comprehensive CNN implementation for scene classification using ResNet-50 transfer learning with comparative analysis of dropout regularization and activation functions.
This project implements a comprehensive CNN-based scene classification system for the Duke AIPI 590 Applied Computer Vision Course. The project focuses on transfer learning using ResNet-50 architecture and conducts systematic experiments to evaluate the impact of different regularization techniques and activation functions on model performance.
Project Overview
The CNN Scene Classification project implements a professional-grade computer vision pipeline with three main experimental components:
Core Implementation
- ResNet-50 Transfer Learning: Fine-tuning pre-trained ResNet-50 for scene classification on SUN397 dataset
- Dropout Analysis: Comparative study of models with and without dropout regularization
- Activation Function Study: Evaluation of ReLU, ELU, SiLU, and GELU activation functions
Key Features
- Modular Architecture: Clean, maintainable code with separation of concerns
- Comprehensive Logging: Detailed experiment tracking and result visualization
- Configuration Management: YAML-based configuration for reproducible experiments
- Statistical Analysis: Rigorous evaluation with multiple performance metrics
Technical Implementation
Model Architecture
- Base Model: ResNet-50 pre-trained on ImageNet
- Transfer Learning: Frozen feature extractor with custom classifier
- Regularization: Configurable dropout layers and activation functions
- Training: SGD optimizer with Nesterov momentum and learning rate scheduling
Experimental Design
- Dataset: SUN397 scene recognition dataset with 20 scene categories
- Data Split: 70% training, 10% validation, 20% testing
- Augmentation: RandomResizedCrop, RandomHorizontalFlip for training
- Evaluation: Accuracy, F1-score, confusion matrix, and classification reports
Key Results
Dropout Comparison
- Regularization Impact: Systematic analysis of dropout effectiveness in preventing overfitting
- Performance Metrics: Comparative evaluation of training and validation performance
- Statistical Significance: Rigorous statistical testing of performance differences
Activation Function Analysis
- Multi-Activation Study: Comprehensive comparison of ReLU, ELU, SiLU, and GELU
- Performance Benchmarking: Detailed evaluation across multiple metrics
- Best Practice Identification: Evidence-based recommendations for activation function selection
Technical Contributions
- Professional Code Structure: Modular design with comprehensive error handling
- Experiment Management: Automated logging, model saving, and result visualization
- Reproducible Research: Fixed random seeds and detailed configuration management
- Performance Analysis: Advanced evaluation metrics including F1-scores and confusion matrices
- Early Stopping: Intelligent training termination to prevent overfitting
Impact
This project demonstrates advanced computer vision implementation skills and provides valuable insights into transfer learning best practices. The systematic experimental approach offers evidence-based guidance for CNN architecture design and hyperparameter selection in scene classification tasks.
The comprehensive evaluation framework and professional code structure make this project a valuable reference for computer vision practitioners and researchers working on similar classification problems.