Comprehensive Integration: From Ideation to Production ML System Deployment
Author
Machine Learning - Data Science for Cybersecurity
Published
December 15, 2025
29 Lab 12: Capstone Project Intensive
29.1 Welcome to the Culmination of Machine Learning!
Note
What you will do: Build and deploy a production machine learning system that integrates ALL the concepts from the entire course.
Difficulty: Advanced
Estimated Time: 8 hours (multi-week project, Weeks 13-14)
Main Goal: Demonstrate mastery of end-to-end ML problem solving to professional standards
29.2 Why Does This Lab Matter?
The capstone lab is the culmination of your learning journey. It is not just an assignment - it is an opportunity to:
Demonstrate Full Competence: Integrate all 5 CPMK (learning outcomes) into one cohesive project
Face Real-World Challenges: Work with real datasets, business constraints, and uncertainty
Build a Portfolio: A high-quality project for your data science career
Master Professional Practice: Follow industry standards for ML development
29.2.1 Real-World Scenario
You are the Senior ML Engineer at a FinTech startup:
Our startup is facing a serious fraud detection problem. Our legacy system is based on manual rules and catches only 45% of fraud with a 15% false positive rate (many customer complaints). We need a robust ML solution to:
Raise the fraud detection rate to 85%+
Reduce the false positive rate to <5%
The API must respond in <100ms (per request)
The model must be interpretable (explainable to the compliance team)
The system must be production-ready with monitoring
Deadline: 2 weeks; budget: you (a small team). GO!
29.3 Learning Outcomes
After completing this capstone, you will be able to:
29.3.1 CPMK-1: Foundational ML Knowledge
Apply fundamental ML concepts to solve real-world problems
Justify model choices with strong technical and business reasoning
29.3.2 CPMK-2: End-to-End ML Pipelines
Build a complete ML pipeline from data collection through deployment
Optimize the pipeline for latency, memory, and throughput constraints
Identify and mitigate data leakage and other common pitfalls
29.3.3 CPMK-3: Critical Analysis & Evaluation
Evaluate models with multiple metrics appropriate to the use case
Analyze failure modes and perform systematic error analysis
Validate results with cross-validation and proper train/val/test splitting
29.3.4 CPMK-4: Advanced Solutions
Implement advanced techniques (ensembles, hyperparameter tuning, transfer learning)
Design system architectures for scalability and maintainability
29.3.5 CPMK-5: Production ML Systems
Deploy models to production with proper containerization and monitoring
Document the system to professional standards (model cards, READMEs, technical reports)
Present findings and insights to stakeholders from diverse backgrounds
29.4 Lab Structure: 5 Integrated Parts (8 Hours Total)
You must choose ONE of the 5 project domains provided. Each domain comes with:
Problem statement template
Dataset sources
Success metrics guidelines
Example deliverables
30.1.1 Option 1: Cybersecurity ML Application
Use Case: Malware Detection Using Binary Features
PROBLEM CONTEXT:
- Every day, more than 350,000 new malware files are created
- Antivirus signatures catch only 60% of malware
- You need proactive, machine-learning-based detection
GOAL:
Build a classifier that distinguishes benign from malware executables
with 90%+ accuracy and a false positive rate <5%
DATA:
- EMBER Dataset: 600K Windows PE files
- 2381 static features (header, section info, imports, etc.)
- Binary labels: benign/malware
- Size: ~10GB (processed: ~2GB)
SUCCESS METRICS:
- Classification accuracy: ≥90%
- False positive rate: <5% (minimize blocking legit software)
- False negative rate: <15% (catch most malware)
- Inference time: <100ms per file
- Model interpretability: Top 10 important features identifiable
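The accuracy, FPR, and FNR targets above can all be read off a confusion matrix. A minimal sketch with scikit-learn (the label arrays are hypothetical stand-ins for real predictions):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical predictions from a malware classifier (1 = malware, 0 = benign)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
y_pred = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
fpr = fp / (fp + tn)  # benign files wrongly flagged as malware
fnr = fn / (fn + tp)  # malware files missed

print(f"Accuracy: {accuracy:.2f}, FPR: {fpr:.2f}, FNR: {fnr:.2f}")

# Check against the lab's targets
meets_targets = bool(accuracy >= 0.90 and fpr < 0.05 and fnr < 0.15)
```

Tracking FPR and FNR separately matters here: a high overall accuracy can hide an FPR that blocks legitimate software.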
30.1.2 Option 2: Business Intelligence - Customer Analytics
Use Case: Customer Churn Prediction
PROBLEM CONTEXT:
- Telco company with 100K+ customers
- Annual churn rate: 26% (industry average)
- Cost to acquire a new customer: $500-1000
- Retention cost: $50-100 per customer
GOAL:
Predict which customers will churn within the next 3 months,
so the sales team can proactively engage them with targeted offers
DATA:
- Customer demographics, service usage, billing info
- ~20 features per customer
- 7000+ historical customers with churn labels
- Class imbalance: 73% retained, 27% churned
SUCCESS METRICS:
- Recall (catch churners): ≥80%
- Precision (avoid false alarms): ≥60%
- ROC-AUC: ≥0.85
- Business impact: Identify the top 20% of customers for the retention program
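The "top 20%" business metric above amounts to ranking customers by predicted churn probability and taking the highest-risk slice. A minimal sketch (the probabilities are hypothetical stand-ins for `model.predict_proba(X)[:, 1]`):

```python
import numpy as np

# Hypothetical churn probabilities, e.g. from model.predict_proba(X)[:, 1]
churn_proba = np.array([0.91, 0.12, 0.45, 0.78, 0.05,
                        0.66, 0.33, 0.88, 0.21, 0.52])

# Take the top 20% of customers by predicted churn risk
n_target = int(len(churn_proba) * 0.20)
top_risk_idx = np.argsort(churn_proba)[::-1][:n_target]

print(f"Customers to target for retention: {top_risk_idx.tolist()}")
```

This turns a probabilistic model directly into an actionable call list for the sales team.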
30.1.4 Option 4: NLP/LLM Application - Text Classification
Use Case: Sentiment Analysis for Customer Reviews
PROBLEM CONTEXT:
- E-commerce platform with 100K+ reviews per day
- Manual review scoring is expensive (80 hours/day of labor)
- Automated sentiment classification is needed for business insights
GOAL:
Classify product reviews as Positive/Negative/Neutral
with 85%+ accuracy to support monitoring and QA
DATA:
- Amazon reviews or a custom e-commerce dataset
- 5000-50000 reviews with sentiment labels
- Text length: 50-500 words per review
- Class distribution: mixed (need to handle imbalance)
SUCCESS METRICS:
- Multi-class accuracy: ≥85%
- Macro F1-score: ≥0.82
- Balanced precision/recall across classes
- Inference time: <50ms per review
- Interpretability: Which words/phrases drive sentiment?
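Macro F1 is a headline metric here because it weights the three sentiment classes equally, regardless of class imbalance. A minimal sketch with scikit-learn on toy labels:

```python
from sklearn.metrics import classification_report, f1_score

# Hypothetical 3-class sentiment labels: 0 = negative, 1 = neutral, 2 = positive
y_true = [2, 2, 1, 0, 0, 2, 1, 1, 0, 2]
y_pred = [2, 2, 1, 0, 1, 2, 1, 0, 0, 2]

# Macro F1 averages the per-class F1 scores with equal weight,
# so a weak minority class drags the overall score down
macro_f1 = f1_score(y_true, y_pred, average='macro')
print(f"Macro F1: {macro_f1:.3f}")

# Per-class precision/recall, useful for the "balanced across classes" target
print(classification_report(y_true, y_pred, target_names=['neg', 'neu', 'pos']))
```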
30.1.5 Option 5: Computer Vision - Malware Image Classification
Use Case: Malware Classification from Binary Images
PROBLEM CONTEXT:
- Malware analysis traditionally requires reverse engineering
- Visual features from binary images can reveal patterns
- Researchers have successfully used CNNs for malware classification
GOAL:
Classify grayscale binary images of executable files
into malware/benign categories with high accuracy
DATA:
- Binary visualizations of PE files (grayscale images)
- 1000-5000 images (32x32 or 64x64 resolution)
- Balanced classes (500-2500 per class)
- Can use transfer learning (ImageNet pretrained models)
SUCCESS METRICS:
- Image classification accuracy: ≥88%
- Balanced precision/recall
- Model interpretability: Visualization of learned features
- Inference time: <50ms per image
- Can work with limited data (data augmentation)
MNIST-style binary images can be created from PE file headers
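One way to build such images, as the note above suggests, is to pack the raw bytes of a file into a fixed-size grayscale array. A minimal sketch (the `bytes_to_image` helper and the `MZ` byte string are illustrative, not part of any standard API):

```python
import numpy as np

def bytes_to_image(data: bytes, size: int = 32) -> np.ndarray:
    """Pack raw bytes into a size x size grayscale image (values 0-255),
    zero-padding or truncating as needed."""
    buf = np.frombuffer(data, dtype=np.uint8)
    n = size * size
    if len(buf) < n:
        buf = np.pad(buf, (0, n - len(buf)))
    return buf[:n].reshape(size, size)

# Illustrative stand-in for the leading bytes of a PE file ('MZ' header)
img = bytes_to_image(b"MZ\x90\x00" * 300, size=32)
print(img.shape, img.dtype)
```

The resulting arrays can be fed directly into a CNN or, resized to 224x224, into an ImageNet-pretrained model for transfer learning.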
30.2 1.2 Problem Definition with SMART Criteria
Pick one of the domains above, then complete the Project Proposal Template:
TASK 1.1: Problem Definition
Use PROJECT_PROPOSAL_TEMPLATE.md to define:
Business Context (2-3 paragraphs)
Who are the stakeholders?
What problem are you solving?
Why does it matter now?
Data (1 paragraph)
Data source
Size and characteristics
Key features
Success Metrics (3-5 metrics)
Primary metric (aligned with the business goal)
Secondary metrics
Success threshold
Constraints
Latency requirement
Model interpretability needs
Data privacy/compliance
Resource constraints
Deliverables
What will you deliver?
When (milestones)?
How will it be used?
Deadline: Complete this before writing any code!
Template Quick Reference:
# Project Proposal: [Project Title]

## Executive Summary
[2-3 sentences: what you're building, why it matters, expected impact]

## Problem Statement

### Context
[Industry context and current situation]

### Problem
[Specific problem to solve]

### Data
- Source: [where data comes from]
- Size: [n samples x m features]
- Target: [what we're predicting]
- Class distribution: [if applicable]

### Success Criteria (SMART)
| Metric | Target | Justification |
|--------|--------|---------------|
| Primary: Accuracy/Recall | ≥85% | Business need: ... |
| Secondary: Precision | ≥75% | Important because: ... |
| Latency | <100ms | Production requirement |

## Approach Overview
[Bullet points: how you'll solve it]

## Timeline & Milestones
- Week 1: [What]
- Week 2: [What]
- etc.

## Risks & Mitigation
| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|-----------|
| ... | ... | ... | ... |
30.3 1.3 Timeline & Milestone Planning
Timeline Structure for the Capstone (2 Weeks = 8 Lab Hours + Independent Work):
WEEK 13 (4 lab hours + homework):
├─ Monday: Lab Bagian 1-2 (Planning + EDA)
│ └─ Output: Project proposal finalized
│ └─ Output: EDA report drafted
├─ Wednesday: Lab Bagian 3 (Modeling)
│ └─ Output: Baseline model trained
│ └─ Output: Experiment #1-2 completed
├─ Friday: Homework
│ └─ Run experiments #3-5
│ └─ Feature engineering
│ └─ Hyperparameter tuning
│
WEEK 14 (4 lab hours + homework):
├─ Monday: Lab Bagian 4-5 (Deployment + Reporting)
│ └─ Output: FastAPI/Flask app working
│ └─ Output: Docker container built
│ └─ Output: Technical report drafted
├─ Wednesday: Lab Q&A + Finalization
│ └─ Fix any issues
│ └─ Prepare presentation
├─ Friday: FINAL PRESENTATION
│ └─ Each student: 15-20 minute presentation
│ └─ Live demo or video walkthrough
│ └─ Q&A with instructors
⚠️ Critical Milestones
Week 13 - MUST Complete by EOD Wednesday:
Week 13 - MUST Complete by EOD Friday:
Week 14 - MUST Complete by EOD Wednesday:
Week 14 - FINAL:
30.4 1.4 Risk Assessment Template
Before you start, identify potential blockers:
TASK 1.2: Risk Assessment
Complete the following table in your proposal:

| Risk | Probability | Impact | Mitigation Strategy |
|------|-------------|--------|---------------------|
| Data unavailable | [H/M/L] | [Critical/High/Med] | [Your mitigation] |
| Dataset too large | [H/M/L] | [Critical/High/Med] | [Your mitigation] |
| Model fails to converge | [H/M/L] | [Critical/High/Med] | [Your mitigation] |
| Class imbalance | [H/M/L] | [Critical/High/Med] | [Your mitigation] |
| Scope creep | [H/M/L] | [Critical/High/Med] | [Your mitigation] |
| Documentation incomplete | [H/M/L] | [Critical/High/Med] | [Your mitigation] |
Guidance for each domain:
Cybersecurity:
Risk: The EMBER dataset is too large (10GB)
Mitigation: Download a 100K-file sample, or use the preprocessed features
Business Intelligence:
Risk: Imbalanced churn data (27% vs 73%)
Mitigation: Plan SMOTE, stratified sampling, class weights
Healthcare:
Risk: Dataset too small for deep learning
Mitigation: Use traditional ML (RF, XGBoost), extensive validation
NLP:
Risk: Text preprocessing complexity
Mitigation: Use pretrained embeddings (Word2Vec, fastText, BERT)
Computer Vision:
Risk: Limited training data
Mitigation: Data augmentation, transfer learning, smaller model
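Several of the mitigations above (stratified sampling, class weights) can be combined in a few lines of scikit-learn. A minimal sketch on synthetic data mimicking the ~27% churn ratio:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data with roughly 27% positives (hypothetical)
X = rng.normal(size=(1000, 5))
y = (rng.random(1000) < 0.27).astype(int)
X[y == 1] += 0.7  # shift the minority class so it is learnable

# stratify=y preserves the 73/27 ratio in both partitions;
# class_weight='balanced' upweights the minority class in the loss
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
clf = LogisticRegression(class_weight='balanced').fit(X_tr, y_tr)

recall = recall_score(y_te, clf.predict(X_te))
print(f"Recall on the minority (churn) class: {recall:.2f}")
```

SMOTE (from the separate `imbalanced-learn` package) is an alternative when reweighting alone is not enough.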
31 PART 2: Data & EDA (2 Hours)
31.1 2.1 Data Collection & Loading
TASK 2.1: Load Your Dataset
Adapt this to your chosen domain:
31.1.1 Option 1: Cybersecurity (EMBER)
import pandas as pd
import numpy as np

# Load EMBER sample (preprocessed features)
# Option A: Download from GitHub
#   git clone https://github.com/elastic/ember
#   cd ember && python extract_features.py -d path/to/binaries
# Option B: Use preprocessed data
X_train = pd.read_csv('ember_train_features.csv')
y_train = pd.read_csv('ember_train_labels.csv')
X_test = pd.read_csv('ember_test_features.csv')
y_test = pd.read_csv('ember_test_labels.csv')

print(f"Training data: {X_train.shape}")
print(f"Features: {X_train.columns.tolist()[:5]}... (total {X_train.shape[1]})")
print(f"Class distribution: {y_train.value_counts()}")
31.1.2 Option 2: Business Intelligence (Telco Churn)
# Download from Kaggle
import pandas as pd

df = pd.read_csv('WA_Fn-UseC_-Telco-Customer-Churn.csv')

print(f"Data shape: {df.shape}")
print(f"Columns: {df.columns.tolist()}")
print("\nFirst few rows:")
print(df.head())

# Separate features and target
X = df.drop('Churn', axis=1)
y = df['Churn']
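One preprocessing quirk commonly reported for this Kaggle dataset is that `TotalCharges` is stored as text, with blank strings for brand-new customers. A minimal sketch of the fix (shown on a tiny hypothetical frame so it runs without the CSV):

```python
import pandas as pd

# Tiny hypothetical frame mimicking the Telco CSV's quirk: TotalCharges
# arrives as text, and new customers have a blank string instead of a number
df = pd.DataFrame({
    'tenure': [1, 34, 0],
    'TotalCharges': ['29.85', '1889.5', ' '],
    'Churn': ['No', 'No', 'Yes'],
})

# Coerce to numeric; blanks become NaN, then fill with 0
df['TotalCharges'] = pd.to_numeric(df['TotalCharges'], errors='coerce').fillna(0)

# Map the target to 0/1 for scikit-learn
y = df['Churn'].map({'No': 0, 'Yes': 1})
```

Catching conversions like this during EDA is exactly the kind of pitfall check CPMK-2 asks for.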
# Build image
docker build -t capstone-model:latest .

# Run container
docker run -p 8000:8000 capstone-model:latest

# Or use docker-compose (create docker-compose.yml):
version: '3.8'
services:
  model-api:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./models:/app/models:ro
    environment:
      - PYTHONUNBUFFERED=1
33.4 4.4 Model Monitoring & Documentation
TASK 4.4: Create Model Card
Create models/MODEL_CARD.md:
# Model Card: [Project Name] v1.0

## Model Details
- **Model Type**: [e.g., Random Forest Classifier]
- **Framework**: scikit-learn
- **Training Date**: [Date]
- **Version**: 1.0
- **Authors**: [Your Name]

## Intended Use
- **Primary Use**: [What is the model used for?]
- **Primary Users**: [Who will use it?]
- **Out-of-Scope Uses**: [What shouldn't it be used for?]

## Performance Metrics
| Metric | Value |
|--------|-------|
| Accuracy | 0.87 |
| Precision | 0.85 |
| Recall | 0.89 |
| F1-Score | 0.87 |
| AUC-ROC | 0.92 |

## Data
- **Training Data**: [n samples, m features]
- **Data Source**: [Where data came from]
- **Preprocessing**: [What preprocessing was done]
- **Class Distribution**: [If applicable]

## Limitations
- [Limitation 1]
- [Limitation 2]
- [Limitation 3]

## Deployment Considerations
- **Inference Latency**: <100ms per request
- **Memory Usage**: ~50MB
- **Docker Image Size**: ~500MB
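The latency figure in the model card should be measured, not estimated. A minimal sketch that serializes a model, reloads it, and times a single-request prediction (the toy data and the `model.joblib` path are illustrative):

```python
import time
import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical stand-in training data; substitute your real features
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = (X[:, 0] > 0).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Serialize, reload (as the API would at startup), and time one prediction
joblib.dump(model, 'model.joblib')
model = joblib.load('model.joblib')

sample = X[:1]
start = time.perf_counter()
pred = model.predict(sample)
latency_ms = (time.perf_counter() - start) * 1000
print(f"Single-request latency: {latency_ms:.2f} ms")
```

In practice, average over many requests and report a high percentile (e.g. p95) rather than a single measurement.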
34 PART 5: Presentation & Reporting (0.5 Hours)
34.1 5.1 Technical Report
TASK 5.1: Write Technical Report
Create TECHNICAL_REPORT.md (15-25 pages):
# Technical Report: [Project Title]

## 1. Executive Summary
[1 page - high-level overview, key findings, recommendations]

## 2. Introduction
- Problem context and motivation
- Why this problem matters
- Research questions
- Contributions of this work

## 3. Literature Review
- Related work and existing solutions
- State-of-the-art approaches
- How your work differs

## 4. Methodology

### 4.1 Problem Formulation
[Mathematical definition of the problem]

### 4.2 Approach
[Describe your ML pipeline]
- Data preprocessing
- Feature engineering
- Model selection
- Evaluation methodology

### 4.3 Evaluation Metrics
[Explain choice of metrics and how they're calculated]

## 5. Data Description
- Dataset characteristics
- Data collection and preprocessing
- Feature engineering decisions
- Data splits (train/val/test)
- Class distribution analysis

## 6. Results

### 6.1 Model Comparison
[Table comparing all models tried]

### 6.2 Best Model Performance
[Detailed results for best model]

### 6.3 Ablation Studies
[Impact of different components]

### 6.4 Visualizations
[Confusion matrix, ROC curve, feature importance]

## 7. Analysis & Discussion
- Why did the model work/fail?
- Key findings and insights
- Error analysis
- Limitations of the approach

## 8. Deployment & Production Considerations
- Model serialization strategy
- API design and latency analysis
- Containerization and scalability
- Monitoring and retraining strategy

## 9. Conclusion
- Summary of findings
- Practical implications
- Future work directions

## 10. References
[Academic and technical references]