This document provides a comprehensive technical reference for the PCB Defect Detection and Classification System. It describes the algorithmic choices, implementation details, processing pipeline stages, model architecture, and integration design decisions made across the full project.
The system was designed with three guiding principles:
- Template independence: Defect localization must work from a single PCB image without requiring a matched reference PCB.
- Classification accuracy: The CNN classifier must reliably distinguish between six visually similar defect types, targeting ≥ 95% accuracy.
- Deployment readiness: The full upload-to-annotated-output inference cycle must complete within 3 seconds on a standard CPU.
The system is structured as five sequential processing phases, each implemented as isolated, testable Python functions. The Streamlit web frontend orchestrates these phases by passing in-memory image data through the pipeline without touching the file system during inference.
┌──────────────────────────────────────────────────────────────────┐
│ INPUT: Raw PCB image bytes (uploaded via browser)                │
└──────────────────────────────────┬───────────────────────────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │ PREPROCESSING               │
                    │ BGR decode · Grayscale      │
                    │ Median blur (5×5 kernel)    │
                    └──────────────┬──────────────┘
                                   │
               ┌───────────────────┴───────────────────────┐
               │                                           │
  ┌────────────▼────────────┐               ┌──────────────▼────────────┐
  │ ADAPTIVE THRESHOLDING   │               │ CANNY EDGE DETECTION      │
  │ Gaussian local window   │               │ Gradient-based edges      │
  │ blockSize=31, C=8       │               │ low=40, high=160          │
  └────────────┬────────────┘               └──────────────┬────────────┘
               │         Bitwise OR (combined map)         │
               └───────────────────┬───────────────────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │ MORPHOLOGICAL CLEANUP       │
                    │ Close (5×5) → Open (3×3)    │
                    │ Binary defect mask          │
                    └──────────────┬──────────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │ CONTOUR DETECTION & ROI     │
                    │ findContours (EXTERNAL)     │
                    │ Area / aspect-ratio filter  │
                    │ Centroid + bounding rect    │
                    └──────────────┬──────────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │ CNN CLASSIFICATION          │
                    │ 128×128 crop per region     │
                    │ EfficientNetB0 → Softmax(6) │
                    └──────────────┬──────────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │ ANNOTATION & EXPORT         │
                    │ Colored bounding boxes      │
                    │ PNG · CSV · TXT outputs     │
                    └─────────────────────────────┘
The training pipeline begins with paired PCB images drawn from the DeepPCB dataset. Each image pair consists of a template (reference, defect-free PCB) and a test (defective PCB) image captured from the same board layout.
Image Alignment (ORB + Homography)
Before subtraction, the test image is geometrically aligned to the template. Misalignment — even sub-pixel — produces large false-positive regions in the difference map.
| Step | OpenCV Function | Parameters |
|---|---|---|
| Keypoint detection | cv2.ORB_create | nfeatures=5000 |
| Descriptor matching | cv2.BFMatcher | NORM_HAMMING, crossCheck=True |
| Geometric estimation | cv2.findHomography | RANSAC, reprojThresh=5.0 |
| Image warping | cv2.warpPerspective | aligned to template canvas size |
If fewer than 4 keypoint matches are found (the minimum needed to estimate a homography) or the homography cannot be computed, the test image is resized to the template dimensions as a fallback.
Defect Mask via Image Subtraction
Once aligned, the absolute pixel-wise difference between the two blurred images is computed:
diff_gray = |medianBlur(template) − medianBlur(aligned_test)|
Otsu's thresholding automatically selects the optimal binary threshold from the bimodal histogram of the difference image:
mask = threshold(diff_gray, 0, 255, THRESH_BINARY + THRESH_OTSU)
Morphological closing (fills holes inside defect blobs) and opening (removes isolated noise pixels) are applied with 5×5 and 3×3 rectangular kernels respectively.
For the live web application, a template image is not required. Defect localization is instead performed using a two-channel approach:
Channel 1 — Adaptive Thresholding
cv2.adaptiveThreshold computes a local threshold for each pixel based on the weighted average of its blockSize × blockSize Gaussian neighborhood:
threshold(pixel) = weighted_mean(neighborhood) − C
This highlights local intensity anomalies (missing holes, spurs, excess copper) regardless of global brightness variations.
| Parameter | Value | Rationale |
|---|---|---|
| blockSize | 31 | Larger than typical defect features; captures local context |
| C | 8 | Offset prevents over-segmentation of smooth trace regions |
| adaptiveMethod | ADAPTIVE_THRESH_GAUSSIAN_C | Smooth weight falloff reduces border artifacts |
Channel 2 — Canny Edge Detection
Canny identifies sharp intensity gradients corresponding to structural breaks (open circuits, shorts):
| Parameter | Value |
|---|---|
| threshold1 (low) | 40 |
| threshold2 (high) | 160 |
Fusion and Cleanup
The two binary maps are merged with a bitwise OR, then cleaned with the same morphological sequence as the dual-image pipeline. The result is a binary defect mask whose white regions correspond to candidate defect locations.
cv2.findContours with RETR_EXTERNAL retrieves only outer contours of white defect blobs. For each contour, the following features are computed:
| Feature | Formula | Purpose |
|---|---|---|
| Area | cv2.contourArea(c) | Filter noise vs. real defects |
| Centroid | cx = m10/m00, cy = m01/m00 | ROI crop center |
| Bounding rectangle | cv2.boundingRect(c) | Annotation coordinates |
| Circularity (M2 only) | 4π·area / perimeter² | Filter missing-hole class |
| Aspect ratio (M2 only) | max(w,h) / min(w,h) | Reject elongated non-defects |
Filtering thresholds (training pipeline):
| Parameter | Value |
|---|---|
| Min area | 80 px² |
| Max area | 1 200 px² |
| Max aspect ratio | 3.5 |
| Min circularity (Missing_hole) | 0.45 |
| ROI padding | 4 px |
Filtering thresholds (inference backend):
| Parameter | Value |
|---|---|
| Min area | 80 px² |
| Max area | 50 000 px² |
The larger max-area window in the inference backend accommodates larger defect regions in unseen images.
Each accepted contour's centroid (cx, cy) is used to crop a 128 × 128 pixel patch from the aligned PCB image, clipped to image bounds:
left = max(0, cx − 64)
top = max(0, cy − 64)
right = min(width, cx + 64)
bottom = min(height, cy + 64)
Because clipping can make the crop smaller than 128 × 128 near image borders, the crop is resized to exactly 128 × 128 before being passed to the classifier.
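The clipping arithmetic and the final resize can be sketched with Pillow as below. The helper names `roi_box` and `crop_roi` are hypothetical and exist only for this example.

```python
from PIL import Image

ROI_SIZE = 128


def roi_box(cx, cy, width, height, size=ROI_SIZE):
    """Crop box centered on (cx, cy), clipped to the image bounds."""
    half = size // 2
    left = max(0, cx - half)
    top = max(0, cy - half)
    right = min(width, cx + half)
    bottom = min(height, cy + half)
    return left, top, right, bottom


def crop_roi(pil_image, cx, cy, size=ROI_SIZE):
    """Crop around the centroid, then resize so the patch is always size x size."""
    box = roi_box(cx, cy, pil_image.width, pil_image.height, size)
    return pil_image.crop(box).resize((size, size))
```

For a centroid at (10, 10) the clipped box is only 74 × 74 pixels, which is why the resize is needed for defects near the board edge.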
All extracted ROI patches are pooled per defect class, shuffled with a fixed seed (42) for reproducibility, and partitioned:
| Split | Ratio | Usage |
|---|---|---|
| Train | 70% | Gradient updates |
| Validation | 15% | Hyperparameter monitoring |
| Test | 15% | Final accuracy reporting |
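One way to realize a seeded, per-class 70/15/15 split is sketched below. The function name `split_per_class` and the dictionary input format are assumptions made for the example, and rounding of split sizes may differ from the project's script.

```python
import random


def split_per_class(samples_by_class, seed=42, train=0.70, val=0.15):
    """Shuffle each class with a fixed seed and split it 70/15/15."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    # Sorted class order keeps the result independent of dict insertion order.
    for label in sorted(samples_by_class):
        items = list(samples_by_class[label])
        rng.shuffle(items)
        n_train = round(len(items) * train)
        n_val = round(len(items) * val)
        splits["train"] += [(x, label) for x in items[:n_train]]
        splits["val"] += [(x, label) for x in items[n_train:n_train + n_val]]
        splits["test"] += [(x, label) for x in items[n_train + n_val:]]
    return splits
```

Splitting inside each class keeps the class proportions the same across the three subsets.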
The classifier uses EfficientNetB0 as a pretrained backbone with a custom classification head; the backbone is frozen during warm-up and partially unfrozen during fine-tuning. EfficientNetB0 was selected for its compound scaling of depth, width, and resolution, which delivers strong accuracy per parameter:
| Layer | Configuration |
|---|---|
| Input | 128 × 128 × 3 (RGB) |
| Backbone | EfficientNetB0 (ImageNet weights, include_top=False) |
| GlobalAveragePooling2D | Spatial compression to 1 280-d feature vector |
| BatchNormalization | Stabilizes activations after backbone |
| Dropout (0.4) | Regularization |
| Dense (256, ReLU) | Task-specific representation |
| BatchNormalization | Stabilizes before final layer |
| Dropout (0.3) | Regularization |
| Dense (6, Softmax) | Class probability distribution |
Total model parameters: ~4.2 M. In the fine-tune phase, the trainable subset is the top 30 EfficientNetB0 layers plus the head.
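The layer stack in the table could be assembled in Keras roughly as follows. This is a sketch under assumptions: the function name `build_classifier` and its `weights` parameter are introduced here for illustration, and the project's actual model-building code is not shown in this document.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_classifier(num_classes=6, input_shape=(128, 128, 3), weights="imagenet"):
    """EfficientNetB0 backbone plus the custom head; returns (model, backbone)."""
    # The Keras EfficientNet models rescale inputs internally, so the model
    # is fed raw pixel values in the 0-255 range.
    backbone = tf.keras.applications.EfficientNetB0(
        include_top=False, weights=weights, input_shape=input_shape)
    backbone.trainable = False  # warm-up phase: backbone frozen

    inputs = tf.keras.Input(shape=input_shape)
    x = backbone(inputs)
    x = layers.GlobalAveragePooling2D()(x)   # -> 1280-d feature vector
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.4)(x)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs), backbone
```

Returning the backbone separately makes it easy to change its trainable layers later without searching the model for it.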
Training proceeds in two phases to prevent catastrophic forgetting of the backbone's ImageNet representations:
Phase 1 — Warm-Up (backbone frozen)
The backbone weights are frozen. Only the custom head is trained, allowing it to converge quickly to the defect feature space before the backbone is touched.
| Setting | Value |
|---|---|
| Max epochs | 10 |
| Learning rate | 1 × 10⁻³ |
| Early stopping patience | 5 epochs (val_accuracy) |
| Backbone | Fully frozen |
Phase 2 — Fine-Tuning (top backbone layers unfrozen)
The top 30 layers of EfficientNetB0 are unfrozen and trained jointly with the head at a reduced learning rate to preserve learned low-level features.
| Setting | Value |
|---|---|
| Max epochs | 20 |
| Learning rate | 1 × 10⁻⁴ (1/10th of warm-up) |
| Early stopping patience | 8 epochs (val_accuracy) |
| Backbone | Top 30 layers trainable |
Callbacks used:
| Callback | Configuration |
|---|---|
| ModelCheckpoint | Saves best_model.keras at peak val_accuracy |
| EarlyStopping | Restores best weights on plateau |
| ReduceLROnPlateau | Halves LR on val_loss plateau (min: 1 × 10⁻⁷) |
| CSVLogger | Epoch-by-epoch metrics to training_log.csv |
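The two-phase schedule and the callbacks above might be wired up as in this sketch. The helper names (`unfreeze_top_layers`, `compile_for_phase`, `make_callbacks`) are hypothetical, and `ReduceLROnPlateau` is left at its default patience because the document does not specify one.

```python
import tensorflow as tf


def unfreeze_top_layers(backbone, n=30):
    """Make only the last n layers of the backbone trainable."""
    backbone.trainable = True
    for layer in backbone.layers[:-n]:
        layer.trainable = False


def compile_for_phase(model, learning_rate):
    """Recompile after changing trainable flags so the change takes effect."""
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss="categorical_crossentropy", metrics=["accuracy"])


def make_callbacks(patience):
    """Callbacks from the table; patience is 5 for warm-up, 8 for fine-tuning."""
    return [
        tf.keras.callbacks.ModelCheckpoint(
            "best_model.keras", monitor="val_accuracy", save_best_only=True),
        tf.keras.callbacks.EarlyStopping(
            monitor="val_accuracy", patience=patience,
            restore_best_weights=True),
        tf.keras.callbacks.ReduceLROnPlateau(
            monitor="val_loss", factor=0.5, min_lr=1e-7),
        tf.keras.callbacks.CSVLogger("training_log.csv"),
    ]
```

A typical sequence would be `compile_for_phase(model, 1e-3)` and a fit of up to 10 epochs, then `unfreeze_top_layers(backbone, 30)`, `compile_for_phase(model, 1e-4)`, and a second fit of up to 20 epochs.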
Augmentation is applied only to the training split to improve generalization across board layouts, lighting conditions, and imaging angles:
| Augmentation | Range |
|---|---|
| Rotation | ±20° |
| Width/height shift | ±15% |
| Shear | 10% |
| Zoom | 15% |
| Horizontal flip | Enabled |
| Vertical flip | Enabled |
| Brightness | [0.8, 1.2] |
| Component | Choice | Reason |
|---|---|---|
| Loss | Categorical Cross-Entropy | Standard for multi-class softmax output |
| Optimizer | Adam | Adaptive LR; robust to noisy gradients |
| Metrics | Accuracy | Primary, human-interpretable metric |
After training, the model is evaluated on the held-out test split. The following metrics are computed:
| Metric | Description |
|---|---|
| Test Accuracy | Overall fraction of correctly classified ROIs |
| Precision (weighted) | TP / (TP + FP), averaged across classes by support |
| Recall (weighted) | TP / (TP + FN), averaged across classes by support |
| F1-Score (weighted) | Harmonic mean of precision and recall |
| Confusion Matrix | 6×6 matrix — true vs. predicted class counts |
| Classification Report | Per-class precision, recall, F1, support |
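To make the support-weighted averages concrete, the sketch below derives them from a confusion matrix in plain Python. It follows the common convention (as used by scikit-learn's weighted average) of computing each metric per class and weighting by class support; the function name `weighted_metrics` is hypothetical.

```python
def weighted_metrics(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    precision = recall = f1 = 0.0
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(cm[k]) - tp                       # truly k, predicted another class
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = sum(cm[k]) / total                # class support share
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Note that support-weighted recall always equals overall accuracy, which is a useful sanity check on a classification report.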
A separate evaluation script (Single_Test.py) processes new, unseen image pairs through the full dual-image pipeline. If ground-truth annotations are provided in test_ground_truth.json, additional metrics are computed:
| Metric | Computation |
|---|---|
| GT Match Rate | Predictions within ±32 px of GT centroid with matching class |
| False Positive Rate | FP / (FP + TN) across the confusion matrix |
| False Negative Rate | FN / (FN + TP) across the confusion matrix |
The match-rate metric simulates real-world quality-assurance conditions where the system must locate the correct defect at the correct position.
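A possible reading of the GT match rate is sketched below. Several details are assumptions, since the document does not define them: the tolerance is applied per axis, each prediction may match at most one ground-truth defect, the rate is divided by the number of ground-truth defects, and the dictionary keys (`class`, `cx`, `cy`) are hypothetical stand-ins for the schema of test_ground_truth.json.

```python
def gt_match_rate(predictions, ground_truth, tol=32):
    """Fraction of ground-truth defects matched by a same-class prediction."""
    unused = list(predictions)
    matched = 0
    for gt in ground_truth:
        for pred in unused:
            same_class = pred["class"] == gt["class"]
            close_x = abs(pred["cx"] - gt["cx"]) <= tol
            close_y = abs(pred["cy"] - gt["cy"]) <= tol
            if same_class and close_x and close_y:
                matched += 1
                unused.remove(pred)  # a prediction matches at most one GT defect
                break
    return matched / len(ground_truth) if ground_truth else 0.0
```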
The backend is fully modularized — each processing step is an independent, testable function:
| Function | Input | Output | Stage |
|---|---|---|---|
| load_model(path) | File path | Cached Keras model | Initialization |
| compute_defect_mask_single(bgr) | BGR image array | (highlight_bgr, mask) | Localization |
| compute_defect_mask(template, aligned) | Two BGR arrays | (diff_gray, mask) | Localization (dual) |
| align_images(template, test) | Two BGR arrays | Aligned BGR array | Preprocessing |
| detect_defect_regions(mask) | Binary mask | [(cx,cy,x,y,w,h), …] | Segmentation |
| classify_roi(model, pil, cx, cy) | PIL image + coords | (class_name, conf, probs) | Classification |
| annotate_image(bgr, predictions) | BGR + prediction list | Annotated BGR | Visualization |
| run_inference_single(bytes, path) | Image bytes + model path | Full results dict | Orchestration |
| run_inference(t_bytes, test_bytes, path) | Two byte streams + path | Full results dict | Orchestration (dual) |
| predictions_to_csv_bytes(preds, ts, ms) | Prediction list | CSV bytes | Export |
| image_to_png_bytes(rgb_array) | NumPy RGB array | PNG bytes | Export |
Model Caching: load_model uses a module-level dictionary (_model_cache) to store the loaded Keras model by file path. On repeat inference calls (e.g., multiple uploads in the same Streamlit session), the model is never reloaded from disk, eliminating a ~0.5–2 s overhead per run.
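The caching pattern described here can be illustrated with a few lines of plain Python. In this sketch the `loader` argument is a hypothetical stand-in for whatever actually reads the model from disk (for example a Keras loading function), so that the example stays self-contained.

```python
_model_cache = {}


def load_model(path, loader):
    """Return the model for `path`, calling `loader` only on the first request."""
    if path not in _model_cache:
        _model_cache[path] = loader(path)  # slow path: read from disk once
    return _model_cache[path]              # fast path: reuse the cached object
```

Because the dictionary lives at module level, the cached model survives for as long as the Python process that imported the module keeps running.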
Both run_inference_single and run_inference return a unified dictionary:
{
    "original_rgb": np.ndarray,   # uploaded image in RGB
    "highlight_rgb": np.ndarray,  # anomaly heatmap (red tint) in RGB
    "mask_rgb": np.ndarray,       # binary mask as 3-channel RGB
    "annotated_rgb": np.ndarray,  # bounding-box annotated image in RGB
    "predictions": [
        {
            "class": str,         # one of the 6 defect class names
            "confidence": float,  # probability [0.0 – 1.0]
            "probs": list,        # full 6-class probability vector
            "cx": int,            # defect centroid x
            "cy": int,            # defect centroid y
            "bbox": [x,y,w,h]     # bounding box in pixels
        },
        ...
    ],
    "elapsed_ms": float,          # end-to-end pipeline time in milliseconds
    "timestamp": str              # ISO 8601 timestamp
}
The Streamlit application (app.py) follows a single-page layout with logical sections:
| Section | Component | Purpose |
|---|---|---|
| Hero banner | Custom HTML/CSS | Project identity and description |
| Image uploader | st.file_uploader | Accept JPG/PNG/BMP/TIF |
| Pre-analysis preview | st.image | Show upload before running |
| Run button | st.button | Trigger inference |
| Metric tiles | Custom HTML grid | Defects found, types, confidence, time |
| Performance gate | st.success / st.warning | ≤ 3 s target status |
| 4-panel output | st.columns + st.image | Visual pipeline output |
| Annotated output (full) | st.image | High-resolution result |
| Prediction table | Custom HTML table | Per-defect details |
| Distribution chart | st.bar_chart | Defect class frequencies |
| Export section | st.download_button ×4 | PNG, CSV, Mask, TXT |
| Sidebar | Config + legend | Model path, pipeline steps, class badges |
Theme: A dark GitHub-style theme (#0d1117 background) with Inter font, blue accent #58a6ff, and class-specific badge colors matching the bounding box palette.
Inference results are stored in st.session_state["result"] immediately after a successful run. This allows the results to persist across Streamlit's reactive re-render cycles (triggered by each download button click) without re-running the inference pipeline.
The export section provides four downloadable files, generated entirely in-memory without writing to disk:
| Export | Generation Method | Content |
|---|---|---|
| Annotated PNG | image_to_png_bytes(annotated_rgb) | Original with bounding boxes + labels |
| CSV Prediction Log | predictions_to_csv_bytes(predictions, ts, ms) | Structured tabular defect data |
| Binary Mask PNG | image_to_png_bytes(mask_rgb) | White=defect, Black=background |
| Evaluation Report TXT | Formatted string → encode("utf-8") | Summary + per-defect table + performance gate |
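As an example of in-memory export, the CSV log could be produced along these lines. The column names and number formatting are assumptions for illustration; only the function name and argument order come from the function table in this document.

```python
import csv
import io


def predictions_to_csv_bytes(predictions, timestamp, elapsed_ms):
    """Serialize the prediction list to UTF-8 CSV bytes without touching disk."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    # Hypothetical column layout: one row per detected defect.
    writer.writerow(["timestamp", "elapsed_ms", "class", "confidence",
                     "cx", "cy", "x", "y", "w", "h"])
    for p in predictions:
        x, y, w, h = p["bbox"]
        writer.writerow([timestamp, f"{elapsed_ms:.1f}", p["class"],
                         f"{p['confidence']:.4f}", p["cx"], p["cy"],
                         x, y, w, h])
    return buf.getvalue().encode("utf-8")
```

The returned bytes can be handed directly to a download widget, which is what keeps the export path free of temporary files.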
| Metric | Observed Range | Target |
|---|---|---|
| End-to-end inference (CPU) | 200 – 400 ms | ≤ 3 000 ms |
| Model load time (first call) | 500 – 1 500 ms | Cached on repeat calls |
| Number of model parameters | ~4.2 M | — |
| Test set classification accuracy | Target ≥ 95% | ≥ 95% |
| Package | Version Constraint | Role |
|---|---|---|
| tensorflow-cpu | ≥ 2.12, < 2.17 | Model training and inference |
| opencv-python | ≥ 4.8.0 | All image processing |
| Pillow | ≥ 10.0.0 | PIL crop operations and PNG export |
| streamlit | ≥ 1.32.0 | Web application framework |
| numpy | ≥ 1.24.0 | Array operations |
| pandas | ≥ 2.0.0 | DataFrame operations |
| matplotlib | ≥ 3.7.0 | Training curve plots |
| seaborn | ≥ 0.12.0 | Confusion matrix heatmaps |
| scikit-learn | ≥ 1.3.0 | Classification metrics |
| imutils | ≥ 0.5.4 | Contour grab utility |
| Limitation | Proposed Solution |
|---|---|
| Single-image mode may over-detect on uniform boards | Confidence thresholding or a binary "defect / no-defect" pre-filter |
| CPU-only inference | Install a GPU-enabled TensorFlow build (the standalone tensorflow-gpu package is deprecated) or export to TFLite for edge acceleration |
| No real-time camera support | Integrate st.camera_input for live manufacturing-line feed |
| No persistent logging | Connect to MongoDB or SQLite for historical defect trend analysis |
| PDF export not implemented | Use reportlab or fpdf2 to generate formal QA PDF reports |
| Alert system absent | Add SMTP/SMS notification on critical defect types (Short, Open Circuit) |