Grading & Sorting
Computer Vision for Produce Grading, NIR Spectroscopy, and Automated Sorting Line Calibration
Why Manual Grading Is a Bottleneck
India grades and sorts approximately 8% of its processed fruits and vegetables by machine, versus 70%+ in developed markets. The gap is not a technology availability problem — commercial sorting machines from companies like Tomra, Key Technology, and CFTRI-licensed Indian manufacturers are accessible. The gap is in the trained vision models and NIR calibration curves adapted to Indian varieties, growing conditions, and the wide quality variation that characterizes Indian produce markets.
A Kesar mango from Girnar and a Kesar from Junagadh are the same variety but can differ significantly in Brix (sugar content), skin color at maturity, and fiber content, all of which matter for export grading. A vision model trained on Israeli or Brazilian mango varieties misclassifies Indian mangoes at a 15-25% error rate. Indian variety-specific training datasets are therefore the foundational investment.
Open data/mango-grading-images.json — it contains metadata and feature vectors from 12,000 Alphonso and Kesar mango images collected at packing houses in Ratnagiri and Junagadh, labeled by APEDA-certified graders: weight class (A/B/C), color grade (full yellow, 50% color, 25% color), external defect flags (stem end rot, skin bruising, latex staining, lenticel browning), and Brix (refractometer measurement on 20% sample).
Computer Vision Architecture for Produce Grading
A production mango grading line vision system has four components working in sequence:
| Component | Technology | Speed Requirement | Output |
|---|---|---|---|
| Acquisition | 4K line scan camera, telecentric lens, structured LED illumination | 5-12 mangoes/sec | Raw image frames |
| Pre-processing | Background subtraction, perspective correction, color calibration to D65 | Real-time GPU | Normalized fruit image |
| Defect detection | CNN (ResNet-50 or EfficientNet-B3) → per-pixel segmentation | <80ms/fruit | Defect mask + type classification |
| Grade assignment | Rule engine consuming vision outputs + load cell weight | <20ms/fruit | Diverter trigger signal (Grade A/B/C/Reject) |
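The grade assignment stage in the last row of the table can be sketched as a small rule function. This is a minimal illustration, not a production rule set: the weight limits, defect-area limits, the choice of which defects force a reject, and the names `FruitReading` and `assign_grade` are all hypothetical. Real thresholds come from the applicable grade specification.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds for illustration only.
REJECT_DEFECTS = {"stem_end_rot", "anthracnose", "insect_damage"}
MIN_WEIGHT_G = {"A": 250.0, "B": 200.0, "C": 150.0}
MAX_MINOR_DEFECT_AREA = {"A": 0.02, "B": 0.05, "C": 0.10}


@dataclass
class FruitReading:
    weight_g: float  # from the load cell
    # defect name -> fraction of visible skin covered, from the defect mask
    defect_area: dict = field(default_factory=dict)


def assign_grade(reading: FruitReading) -> str:
    """Return 'A', 'B', 'C', or 'Reject' for one fruit."""
    # Any serious defect rejects the fruit regardless of weight.
    for name, area in reading.defect_area.items():
        if name in REJECT_DEFECTS and area > 0:
            return "Reject"
    # Sum the area of the remaining (minor) defects.
    minor_area = sum(
        area for name, area in reading.defect_area.items()
        if name not in REJECT_DEFECTS
    )
    # Try grades from best to worst; the first grade whose limits are met wins.
    for grade in ("A", "B", "C"):
        if (reading.weight_g >= MIN_WEIGHT_G[grade]
                and minor_area <= MAX_MINOR_DEFECT_AREA[grade]):
            return grade
    return "Reject"


print(assign_grade(FruitReading(280.0, {"bruise": 0.04})))
```

Keeping the rules in plain lookup tables like this means a packing house can change grade limits per buyer without retraining the vision model.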
The CNN architecture choice depends on deployment hardware. On a Jetson AGX Orin (common in Indian packing houses due to price point), EfficientNet-B3 at INT8 quantization achieves 45ms inference time vs. ResNet-50 at 78ms, with comparable accuracy.
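To relate those inference times to line speed, the short calculation below converts per-fruit latency into a maximum sustainable fruit rate. It assumes fruits are inferred one at a time on a single stream with no batching, which is a simplification not stated above. The function names are illustrative.

```python
def max_fruit_rate(latency_ms: float) -> float:
    """Fruits per second one inference stream can sustain at this per-fruit latency."""
    return 1000.0 / latency_ms


def headroom(latency_ms: float, line_rate_per_s: float) -> float:
    """Fraction of each fruit's time slot left unused.

    A negative value means the model cannot keep up with the line.
    """
    slot_ms = 1000.0 / line_rate_per_s
    return (slot_ms - latency_ms) / slot_ms


# Latencies quoted in the text; 12 fruits/s is the top of the acquisition range.
for name, latency in (("EfficientNet-B3 INT8", 45.0), ("ResNet-50", 78.0)):
    print(f"{name}: {max_fruit_rate(latency):.1f} fruits/s max, "
          f"{headroom(latency, 12.0):.0%} headroom at 12 fruits/s")
```

Under these assumptions EfficientNet-B3 sustains roughly 22 fruits per second, while ResNet-50 sustains roughly 13, leaving little margin at the upper end of the 5-12 mangoes/sec range.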
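    # Transfer learning setup for mango defect detection
    import torch.nn as nn
    import torchvision.models as models
    from torchvision import transforms as T

    # Standard ImageNet channel statistics, matching the pretrained backbone
    IMAGENET_MEAN = [0.485, 0.456, 0.406]
    IMAGENET_STD = [0.229, 0.224, 0.225]

    # ImageNet-pretrained backbone (`weights=` replaces the deprecated `pretrained=True`)
    backbone = models.efficientnet_b3(weights=models.EfficientNet_B3_Weights.DEFAULT)

    # Replace classifier head for multi-label defect detection
    n_defect_classes = 8  # stem_end_rot, bruise, lenticel_browning, latex_stain,
                          # anthracnose, insect_damage, dry_skin, color_uneven
    in_features = backbone.classifier[1].in_features
    backbone.classifier = nn.Sequential(
        nn.Dropout(0.3),
        nn.Linear(in_features, n_defect_classes),
        # Multi-label: each defect independently present or absent.
        # The head outputs probabilities, so train with nn.BCELoss
        # (not BCEWithLogitsLoss).
        nn.Sigmoid(),
    )

    # Training data augmentation for Indian packing house conditions
    train_transforms = T.Compose([
        T.RandomRotation(360),          # Fruit orientation is random on conveyor
        T.ColorJitter(brightness=0.3),  # LED illumination variation between packing houses
        T.RandomHorizontalFlip(),
        T.GaussianBlur(kernel_size=3),  # Camera vibration from conveyor
        T.ToTensor(),                   # Normalize expects a tensor, not a PIL image
        T.Normalize(mean=IMAGENET_MEAN, std=IMAGENET_STD),
    ])

Rice and Wheat Grain Quality: Broken, Discoloured, Foreign Matter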
Grain quality assessment is the highest-volume grading application in India — the Food Corporation of India (FCI) grades 60+ million tonnes of rice and wheat annually. Traditional visual inspection by human graders is slow (2-5 kg/min), subjective, and vulnerable to fatigue. AI vision systems operate at 100-500 kg/hour with <1% misclassification on trained categories.
Key quality attributes for rice:
| Attribute | Visual Signal | Detection Challenge |
|---|---|---|
| Head rice (whole) | L/W ratio > 2.5 for long-grain | Low contrast against background |
| Broken (large) | L/W ratio 1.5-2.5 | Gradation boundary ambiguity |
| Broken (small/brokens) | L/W ratio < 1.5 | Distinguish from husks |
| Chalky grain | White opaque spot in otherwise translucent grain | Requires transmitted light |
| Red-striped grain | Pink/red coloration from pericarp | Color variation across varieties |
| Immature grain | Green coloration, low density | Color camera + density separation |
| Foreign matter | Non-rice: husk, stone, mud ball | Multi-class: train separately per foreign type |
| Paddy (unhusked) | Golden brown hull | Reliable; consistent texture |
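The length-to-width thresholds in the first three rows of the table translate directly into a size classifier. The sketch below assumes each grain's length and width in millimetres have already been measured upstream (for example from a rotated bounding box such as OpenCV's `minAreaRect`). The function and label names are illustrative, and the handling of the exact boundary values 1.5 and 2.5 follows the table's inequalities.

```python
from collections import Counter


def classify_by_aspect_ratio(length_mm: float, width_mm: float) -> str:
    """Classify a long-grain rice kernel by its length/width ratio."""
    if length_mm <= 0 or width_mm <= 0:
        raise ValueError("length and width must be positive")
    ratio = length_mm / width_mm
    if ratio > 2.5:
        return "head_rice"
    if ratio >= 1.5:
        return "broken_large"
    return "broken_small"


def class_fractions(grains):
    """Return the share of each size class for a list of (length, width) pairs."""
    counts = Counter(classify_by_aspect_ratio(l, w) for l, w in grains)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


sample = [(7.0, 1.8), (6.8, 1.7), (4.0, 1.8), (2.0, 1.8)]
print(class_fractions(sample))
```

Size alone cannot separate chalky, red-striped, or immature grains; those attributes need the color and transmitted-light signals listed in the table.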
CFTRI has developed an open-access rice quality assessment system (RQAS) with labeled datasets for Basmati, Sona Masuri, and Ponni varieties — the best starting point for training before collecting proprietary data.
Prompt: "I have a batch of Basmati rice images captured on a white background conveyor under
diffuse LED lighting [data/rice-grain-images.json — 8,000 labeled grains]. The current model
accuracy is 94% on head rice identification but only 78% on chalky grain detection (our target
is >92% for all classes). Analyze the confusion matrix [attached], identify the failure modes
for chalky grain, and recommend: (1) a data augmentation strategy to address the specific
failure cases, (2) whether a second model with transmitted light images would close the gap,
(3) a threshold calibration approach that minimizes false reject rate (economic cost: ₹18/kg
misclassified as lower grade) while keeping chalky grade-up rate <0.5%."

NIR Spectroscopy: Composition Analysis Without Sampling
Near-Infrared (NIR) spectroscopy measures the absorption of NIR light (780-2500nm) by food samples — different chemical bonds (O-H for water, N-H for protein, C-H for fat/starch) absorb at characteristic wavelengths. The result: a full composition analysis in 30 seconds without sample destruction, at ₹1-3/measurement vs. ₹500-2000 for wet chemistry lab analysis.
NIR calibration model development workflow:
1. Collect reference samples (n=200+ for robust calibration)
→ Cover full range of expected composition variation (variety, origin, season)
→ Each sample: NIR spectrum + wet chemistry reference analysis
2. Pre-process spectra
→ Standard Normal Variate (SNV) correction for particle size variation
→ Savitzky-Golay smoothing (2nd derivative) to resolve overlapping peaks
→ Multiplicative Scatter Correction (MSC) for scattering baseline drift
3. Model fitting
→ Partial Least Squares Regression (PLSR): n_components = 6-12 for food matrices
→ Cross-validate: leave-one-sample-out or k-fold (k=10)
→ Report: RMSECV (cross-validation error), R² ≥ 0.95 for quantitative analysis
4. Validation on independent hold-out set (n=50+)
→ RMSEP (prediction error on unseen samples) should be ≤ RMSECV × 1.15
→ Bias check: systematic over- or under-prediction indicates a sample subset not represented in the calibration set

Open data/nir-spectra-calibration.csv. It contains NIR spectra (1100-2500 nm, 2 nm resolution) and reference lab values for 350 wheat flour samples from 8 flour mills: moisture %, protein %, ash %, wet gluten %, and water absorption (Farinograph). The task: build a robust PLSR model that can replace the Farinograph measurement with a 30-second NIR scan.
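Two pieces of the workflow above are simple enough to sketch with the standard library alone: the SNV correction from step 2 and the hold-out acceptance check from step 4. This is an illustrative sketch, not a full calibration pipeline. The function names are hypothetical, SNV is computed here with the sample standard deviation, and the PLSR fit itself would normally come from a library such as scikit-learn's `PLSRegression`.

```python
from math import sqrt
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and std."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation across wavelengths
    if sigma == 0:
        raise ValueError("flat spectrum cannot be SNV-corrected")
    return [(a - mu) / sigma for a in spectrum]


def rmse(predicted, reference):
    """Root mean squared error between predictions and lab reference values."""
    return sqrt(mean((p - r) ** 2 for p, r in zip(predicted, reference)))


def bias(predicted, reference):
    """Mean signed error; far from zero suggests systematic over/under-prediction."""
    return mean(p - r for p, r in zip(predicted, reference))


def holdout_acceptable(rmsep, rmsecv, tolerance=1.15):
    """Step 4 criterion: RMSEP should not exceed RMSECV x 1.15."""
    return rmsep <= rmsecv * tolerance


# Toy protein predictions (%) against wet chemistry references, for illustration only.
pred = [12.1, 11.8, 13.0, 12.6]
ref = [12.0, 12.0, 12.9, 12.7]
print(rmse(pred, ref), bias(pred, ref))
```

Because SNV is applied to each spectrum independently, it can be applied to new scans at prediction time without reference to the calibration set.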
NIR applications by commodity in India:
| Commodity | Parameters Measured | Accuracy (RMSEP) | Economic Value |
|---|---|---|---|
| Wheat flour | Moisture, protein, ash, gluten | Protein: ±0.2%, Moisture: ±0.1% | Grade/price decision on every lot |
| Rice | Moisture, amylose, chalky% | Moisture: ±0.15% | Milling yield optimization |
| Edible oil | FFA, moisture, adulteration | FFA: ±0.05% | Quality gating at receipt |
| Milk powder | Fat, protein, lactose, moisture | Fat: ±0.15% | Formula compliance for export |
| Spices | Moisture, essential oil, adulteration | Moisture: ±0.2% | FSSAI and Spices Board specs |
| Sugar | Moisture, ash, color (ICUMSA) | Color: ±12 IU | Premium grade eligibility |
Automated Sorting Line Calibration and OEE
A sorting line that runs at 90% of rated speed due to miscalibration loses ₹500-2000/hour in processing capacity (product value × throughput gap). Calibration drift occurs from:
AI-assisted calibration uses reference standard objects (certified color tiles, NIST-traceable size standards) run through the line at shift start. The vision system measures deviation from expected values and adjusts:
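As one illustration of such an adjustment, the sketch below derives per-channel gains from a white reference tile and applies them to a pixel. It is a deliberately simple diagonal correction that assumes linear RGB values. A full 3×3 color correction matrix fitted across several tiles is outside its scope, and all names and tile values are hypothetical.

```python
# Rec. 709 weights for relative luminance from linear RGB
LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)


def luminance(rgb):
    """Relative luminance of a linear RGB triple."""
    return sum(w * c for w, c in zip(LUMA_WEIGHTS, rgb))


def luminance_drift(measured_white, reference_white):
    """Fractional change in white-tile luminance versus baseline (-0.18 = 18% darker)."""
    return luminance(measured_white) / luminance(reference_white) - 1.0


def channel_gains(measured_white, reference_white):
    """Per-channel multipliers that map the measured white tile back to its baseline."""
    return tuple(ref / meas for ref, meas in zip(reference_white, measured_white))


def apply_gains(pixel, gains, max_value=255.0):
    """Scale one pixel by the gains, clipping to the sensor's maximum value."""
    return tuple(min(max_value, c * g) for c, g in zip(pixel, gains))


# Hypothetical readings: baseline white tile versus today's measurement.
baseline = (240.0, 240.0, 240.0)
today = (196.8, 196.8, 196.8)
gains = channel_gains(today, baseline)
print(f"drift {luminance_drift(today, baseline):.0%}, gains {gains}")
print(apply_gains((100.0, 120.0, 80.0), gains))
```

A uniform gain like this corrects brightness loss but not chroma shifts on individual colored tiles, which is why multi-tile measurements are needed for a full correction.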
Prompt: "Our mango sorting line vision system has been running for 8 months without recalibration.
Calibration tile measurements today [data/calibration-tile-readings.json] show: D65 white tile
luminance down 18% from baseline, red tile chroma shift ΔE=4.2 (CIEDE2000), green tile ΔE=2.8.
Current operational data shows grade A assignment rate dropped from 42% to 31% over the last
6 weeks (we believe actual incoming quality is stable). Calculate: (1) the correction matrix
needed to restore colorimetric accuracy, (2) whether the grade A rate drop is fully explained
by illumination drift or indicates a model recalibration is needed, (3) a recalibration schedule
recommendation based on this degradation rate."

Tomato and Grape Grading for Export
Tomato (Namdhari/Kolar): EU export grade requires color uniformity (CIELAB a* value), size (diameter 47-102mm), and absence of blossom end rot, catfacing, and cracking. AI grades 20,000+ tomatoes/hour with <2% misclassification vs. 800/hour by trained human grader at >5% error rate.
Grapes (Nashik Seedless for EU export): APEDA export protocol requires berry diameter, cluster weight, total soluble solids (Brix ≥16), and defect assessment. NIR on-line Brix measurement eliminates destructive sampling at 100% throughput.
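The numeric limits quoted above (tomato diameter 47-102 mm, grape Brix ≥16) and the named tomato defects can be expressed as simple gating checks. The sketch below is illustrative only. The function names and defect labels are hypothetical, and color uniformity, berry diameter, and cluster weight are left out because no thresholds for them are given here.

```python
TOMATO_EXPORT_DEFECTS = ("blossom_end_rot", "catfacing", "cracking")


def tomato_export_failures(diameter_mm, defects):
    """Return the list of reasons a tomato fails the checks (empty list = pass)."""
    failures = []
    if not 47.0 <= diameter_mm <= 102.0:
        failures.append("size")
    # Report each disqualifying defect that the vision system flagged.
    failures.extend(d for d in TOMATO_EXPORT_DEFECTS if d in defects)
    return failures


def grape_brix_ok(brix):
    """Total soluble solids check for a cluster measured by on-line NIR."""
    return brix >= 16.0


print(tomato_export_failures(45.0, {"cracking"}))
print(grape_brix_ok(17.2))
```

Returning the failure reasons rather than a bare pass/fail makes it easier to audit reject rates by cause.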
Key Takeaways
This is chapter 5 of AI for Food Processing & Agri.