AI for Pipeline Integrity & Asset Management
ILI Data Analysis, Cathodic Protection Monitoring, and Leak Detection
ILI Data Analysis: Anomaly Classification from MFL/UT Data
In-line inspection (ILI) — running intelligent pigs through pipelines — generates massive datasets. A single MFL (Magnetic Flux Leakage) run on GAIL's HBJ pipeline (Hazira-Bijaipur-Jagdishpur, 2,700 km) produces 50+ million data points: axial and circumferential flux leakage signals, internal caliper readings, GPS coordinates, odometer distance. Converting these signals into actionable defect calls is where AI transforms pipeline integrity management.
Traditional ILI analysis: vendor delivers a defect listing with depth, length, width, and classification (metal loss, dent, lamination, weld anomaly). The vendor's algorithms are proprietary, conservative, and tuned for false-negative minimization. This means high false-positive rates — 30-40% of reported anomalies turn out to be benign on excavation and direct examination.
ML Defect Classification
Open data/pipeline-inspection-data.csv — each row is a reported ILI anomaly with MFL signal characteristics, location, pipe properties, and field verification results (for previously excavated features).
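A quick first pass over such a file might measure how the vendor's calls compare with what excavation found. The sketch below is illustrative only: the column names `vendor_call` and `field_verified_type` and the sample rows are hypothetical, since the text does not give the file's actual schema.

```python
import csv
import io

# Hypothetical sample rows; the real file's column names are not given in the text.
SAMPLE = """anomaly_id,vendor_call,field_verified_type
A1,metal_loss,metal_loss
A2,metal_loss,false_call
A3,dent,dent
A4,metal_loss,metal_loss
A5,lamination,metal_loss
A6,weld_anomaly,false_call
"""

def summarize(rows):
    """Return (false_call_rate, metal_loss_recall) over excavated anomalies."""
    # Only anomalies with a field verification result can be scored.
    verified = [r for r in rows if r["field_verified_type"]]
    false_calls = sum(r["field_verified_type"] == "false_call" for r in verified)
    # Recall: of the anomalies that really were metal loss, how many were called as such?
    true_metal_loss = [r for r in verified if r["field_verified_type"] == "metal_loss"]
    caught = sum(r["vendor_call"] == "metal_loss" for r in true_metal_loss)
    false_call_rate = false_calls / len(verified) if verified else 0.0
    recall = caught / len(true_metal_loss) if true_metal_loss else 0.0
    return false_call_rate, recall

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
false_call_rate, metal_loss_recall = summarize(rows)
print(f"false-call rate: {false_call_rate:.2f}, metal-loss recall: {metal_loss_recall:.2f}")
```

To run this against the real file, replace the `io.StringIO(SAMPLE)` source with an open file handle and adjust the column names to match.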
Classification model:
Features: peak_mfl_amplitude, signal_width_axial, signal_width_circ,
signal_shape_factor, internal_external_flag,
pipe_wall_thickness, pipe_grade, pipe_diameter,
distance_from_weld, clock_position,
previous_run_signal (if available)
Target: defect_type (metal_loss, dent, lamination, mill_defect,
weld_anomaly, false_call)
Model: Gradient Boosted Trees (XGBoost)
Performance: 91% accuracy on GAIL validation set
(vs 72% for vendor's rule-based classification)
Critical metric: 98.5% recall on metal_loss class
(must not miss corrosion; integrity failures are catastrophic)
Growth Rate Estimation
For pipelines with multiple ILI runs (common on GAIL and IOCL trunk lines with 3-5 year run intervals), AI models estimate corrosion growth rates by matching features across runs. Feature matching itself is a classification problem — same feature or different feature? — complicated by pig speed variations, sensor orientation changes, and new defects appearing between runs.
| Growth Rate Method | Data Required | Accuracy (mm/year) | Limitation |
|---|---|---|---|
| **Box matching** (traditional) | 2+ runs | ±0.15 mm/yr | Mismatches common at feature clusters |
| Signal correlation matching | 2+ runs with raw signals | ±0.10 mm/yr | Requires vendor raw data access |
| ML probabilistic matching | 2+ runs + pipe tally | ±0.08 mm/yr | Needs training data from excavations |
| Bayesian growth model | 2+ runs + CP data + soil survey | ±0.05 mm/yr | Computationally intensive |
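The simplest row of the table, box matching, can be sketched in a few lines. This is a minimal illustration rather than any vendor's algorithm: the `Feature` fields, the tolerances `tol_m` and `tol_hr`, the greedy nearest-neighbour strategy, and the clamping of negative growth to zero are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Feature:
    odometer_m: float   # distance along the line
    clock_hr: float     # circumferential position, 0-12 o'clock
    depth_mm: float     # reported metal-loss depth

def clock_gap(a, b):
    """Shortest circumferential gap in clock hours (wraps at 12)."""
    d = abs(a - b) % 12.0
    return min(d, 12.0 - d)

def match_and_grow(run_old, run_new, years, tol_m=0.5, tol_hr=1.0):
    """Greedy box matching; returns (old, new, growth_mm_per_year) triples."""
    matches, used = [], set()
    for new in run_new:
        best, best_dist = None, None
        for i, old in enumerate(run_old):
            if i in used:
                continue
            dist = abs(new.odometer_m - old.odometer_m)
            inside_box = dist <= tol_m and clock_gap(new.clock_hr, old.clock_hr) <= tol_hr
            if inside_box and (best is None or dist < best_dist):
                best, best_dist = i, dist
        if best is not None:
            used.add(best)
            old = run_old[best]
            # Assumption: negative apparent growth is sizing noise, so clamp to zero.
            rate = max(0.0, (new.depth_mm - old.depth_mm) / years)
            matches.append((old, new, rate))
    return matches

run_old = [Feature(1200.3, 6.0, 1.8), Feature(3410.0, 11.5, 2.4)]
run_new = [Feature(1200.5, 6.5, 2.6), Feature(3410.2, 0.5, 2.3), Feature(5000.0, 3.0, 1.0)]
matches = match_and_grow(run_old, run_new, years=5.0)
for old, new, rate in matches:
    print(f"{new.odometer_m:.1f} m: {rate:.2f} mm/yr")
```

The third feature in `run_new` has no counterpart and is left unmatched, which is how a defect that appeared between runs shows up. A greedy matcher like this is also where the "mismatches at feature clusters" limitation comes from: when several features fall inside the same box, the nearest one is not necessarily the same defect.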
For IOCL's cross-country crude oil pipelines (Salaya-Mathura, Mundra-Panipat, Paradip-Haldia), where external corrosion is the dominant threat, growth rate models incorporating cathodic protection effectiveness and soil corrosivity data reduce the number of required excavations by 40-50% compared to vendor-recommended dig lists.
Cathodic Protection Monitoring and Under-Protection Detection
Cathodic protection (CP) is the primary corrosion mitigation for buried pipelines. The principle: maintain pipe-to-soil potential more negative than -850 mV (CSE) to suppress anodic dissolution. Simple in theory, complex in practice — CP effectiveness varies with soil resistivity, coating condition, interference from other structures (railways, power lines, neighboring pipelines), and rectifier output.
Open data/cathodic-protection-data.csv — each row is a CP survey reading: test post location, pipe-to-soil potential (on/off), soil resistivity, coupon current density, coating condition notes, and rectifier station data.
Spatial-Temporal CP Modeling
Traditional CP assessment: annual CIPS (Close Interval Potential Survey) — walking the pipeline with a trailing wire, measuring potential every 1-3 meters. This is a snapshot. Between surveys, CP effectiveness changes with seasons (soil moisture), rectifier aging, coating degradation, and third-party damage.
An AI model can interpolate between annual CIPS surveys using continuous remote monitoring data:
Continuous inputs (from rectifier RTUs, every 15 minutes):
rectifier_output_voltage, rectifier_output_current
test_post_potentials (at key locations, remote-read)
Periodic inputs (from CIPS, annually):
full spatial potential profile (every 1-3 meters)
soil resistivity profile
Static inputs:
coating_type, coating_age, pipe_depth
crossing_locations (roads, railways, rivers)
foreign_structure_locations
Model output: estimated potential at every 100m segment, daily
Flag: segments with estimated potential > -850 mV CSE (under-protected)
For GAIL's trunk pipeline network (16,000+ km), this approach identified 23 under-protected segments between annual surveys that would have been missed until the next CIPS. Three of these showed active corrosion on excavation.
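To make the idea of updating a CIPS snapshot with remote readings concrete, here is a deliberately naive baseline, not the model described above. It assumes each segment drifts by the same amount as its nearest remote-read test post. The data layout and the function names are hypothetical.

```python
CRITERION_MV = -850.0  # CSE protection criterion quoted in the text

def estimate_potentials(cips_mv, posts):
    """Shift each segment's CIPS reading by the drift seen at its nearest test post.

    cips_mv: potentials per segment from the last survey (mV CSE).
    posts:   {post_id: (segment_index, mv_at_survey, mv_now)} (hypothetical layout).
    """
    estimates = []
    for seg, baseline in enumerate(cips_mv):
        _, at_survey, now = min(posts.values(), key=lambda p: abs(p[0] - seg))
        estimates.append(baseline + (now - at_survey))
    return estimates

def under_protected(estimates_mv):
    """Indices of segments whose potential is more positive than the criterion."""
    return [i for i, mv in enumerate(estimates_mv) if mv > CRITERION_MV]

cips = [-950.0, -920.0, -880.0, -870.0, -900.0, -940.0]
posts = {
    "TP-1": (0, -950.0, -945.0),   # drifted 5 mV positive since the survey
    "TP-2": (5, -940.0, -900.0),   # drifted 40 mV positive since the survey
}
estimates = estimate_potentials(cips, posts)
flagged = under_protected(estimates)
print(flagged)
```

A learned model would replace the nearest-post rule with something that accounts for soil resistivity, coating condition, and rectifier output, but the flagging step against the -850 mV criterion stays the same.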
Interference Detection
AC and DC interference from railways, HVDC transmission lines, and neighboring CP systems can cause both corrosion (AC corrosion at >30 A/m²) and CP reading errors (DC stray current makes pipe-to-soil readings unreliable). ML anomaly detection on rectifier output current identifies interference events in real time:
Interference signatures:
DC stray current: sudden potential shifts correlated with railway timetable
AC interference: 50 Hz ripple on potential readings, elevated coupon AC density
Coating shielding: potential readings normal but coupon shows active corrosion
Detection model: Isolation Forest on rectifier current patterns
Alert: potential interference detected → field verification within 48 hours
Leak Detection: SCADA-Based, Acoustic, and Flow Balance Methods
Pipeline leaks range from catastrophic ruptures (immediately obvious) to small seepage (1-5% of flow — detectable only with instrumentation). Indian pipeline regulations (PNGRB Technical Standards) require leak detection systems capable of detecting leaks as small as 1% of flow rate within 2 hours.
Open data/leak-detection-events.json — historical leak and false alarm events with SCADA data snapshots, detection method, detection time, and field verification results.
SCADA-Based Statistical Leak Detection
Traditional CPM (Computational Pipeline Monitoring) uses mass balance: if inlet mass flow minus outlet mass flow exceeds a threshold for a sustained period, alarm. The challenge: measurement noise, temperature transients, and batch changes cause false alarms. Indian pipelines operating with ±0.5% flow meter accuracy generate hundreds of nuisance alarms per month at thresholds sensitive enough to detect 1% leaks.
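The mass-balance rule with a persistence requirement can be sketched as follows. This is a minimal illustration: the line-pack correction term, the function name, and the sample numbers are assumptions, and real CPM systems add meter-uncertainty handling and transient compensation.

```python
def mass_balance_alarms(inlet, outlet, line_pack_change, threshold, persistence):
    """Return sample indices at which the imbalance has exceeded `threshold`
    for at least `persistence` consecutive samples."""
    alarms, run = [], 0
    for i, (q_in, q_out, d_pack) in enumerate(zip(inlet, outlet, line_pack_change)):
        # Mass that entered but neither left nor went into line pack.
        imbalance = q_in - q_out - d_pack
        run = run + 1 if imbalance > threshold else 0
        if run >= persistence:
            alarms.append(i)
    return alarms

inlet = [100.0] * 8
outlet = [100.0, 100.0, 98.5, 100.0, 98.7, 98.6, 98.5, 98.6]
line_pack_change = [0.0] * 8
# Threshold of 1.0 corresponds to 1% of the nominal flow of 100 units.
alarms = mass_balance_alarms(inlet, outlet, line_pack_change, threshold=1.0, persistence=3)
print(alarms)
```

The single excursion at index 2 does not alarm because it does not persist, while the sustained imbalance from index 4 onward does. Raising `persistence` suppresses more noise but delays detection, which is the trade-off behind the nuisance-alarm problem described above.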
ML-based leak detection learns the normal operating envelope and detects deviations:
Features (sampled every 30 seconds):
inlet_flow, outlet_flow, line_pack_rate_of_change
inlet_pressure, outlet_pressure, intermediate_pressures
product_temperature_profile (from DTS or discrete sensors)
pump_status, valve_positions
ambient_temperature, soil_temperature (if available)
Approach 1 — Autoencoder anomaly detection:
Train on 6 months of confirmed leak-free operation
Reconstruction error > threshold → leak alarm
Advantage: no labeled leak data needed (leaks are rare)
Approach 2 — Physics-informed neural network:
Encode hydraulic model (pressure drop = f(flow, viscosity, elevation))
Residuals between model and measured values → leak signature
Advantage: can estimate leak location from pressure profile
Acoustic Leak Detection
Fiber optic DTS/DAS (Distributed Temperature/Acoustic Sensing) is increasingly deployed on new Indian pipelines. IOCL's Paradip-Haldia pipeline has fiber optic monitoring along its 600+ km length. The fiber converts the pipeline into a continuous microphone — acoustic emissions from leaks produce characteristic spectral signatures.
ML classification on DAS spectral data distinguishes:
A CNN trained on labeled DAS events achieves 95% accuracy in classifying event types, with leak detection sensitivity down to 0.5% of flow rate within 5 minutes — far exceeding PNGRB requirements.
Repair Prioritization Using Fitness-for-Service Assessment
Not every defect requires immediate repair. Fitness-for-service (FFS) assessment — ASME B31G, Modified B31G, RSTRENG, BS 7910, API 579 — determines whether a defect can remain in service until the next planned maintenance window.
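As an example of what one of these assessments computes, the sketch below implements the Modified B31G (0.85 dL) failure-pressure estimate for a single metal-loss defect in US customary units. The function name and the example pipe are illustrative. A real assessment would also apply the standard's safety factor and applicability limits before comparing against MAOP.

```python
import math

def modified_b31g_failure_pressure(d_in, t_in, depth_frac, length_in, smys_psi):
    """Estimated failure pressure (psi) of a corroded pipe, Modified B31G.

    d_in: outside diameter, t_in: wall thickness, length_in: axial defect length,
    depth_frac: defect depth as a fraction of wall thickness (method limited to 0.8).
    """
    if not 0.0 <= depth_frac <= 0.8:
        raise ValueError("Modified B31G applies to depths up to 80% of wall thickness")
    z = length_in ** 2 / (d_in * t_in)
    # Folias bulging factor, two-branch form used by Modified B31G.
    if z <= 50.0:
        m = math.sqrt(1.0 + 0.6275 * z - 0.003375 * z ** 2)
    else:
        m = 0.032 * z + 3.3
    flow_stress = smys_psi + 10_000.0  # SMYS + 10 ksi
    failure_stress = flow_stress * (1.0 - 0.85 * depth_frac) / (1.0 - 0.85 * depth_frac / m)
    return 2.0 * failure_stress * t_in / d_in

# Illustrative case: 24 in OD, 0.375 in wall, X52 pipe, 40% deep, 6 in long defect.
p_defect = modified_b31g_failure_pressure(24.0, 0.375, 0.4, 6.0, 52_000.0)
p_intact = modified_b31g_failure_pressure(24.0, 0.375, 0.0, 6.0, 52_000.0)
print(f"defect: {p_defect:.0f} psi, intact: {p_intact:.0f} psi")
```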
AI-Enhanced FFS Screening
Traditional FFS is computationally simple for individual defects but becomes complex at scale: a single ILI run reports 2,000-5,000 anomalies, each requiring assessment against operating conditions, interaction rules (are two defects close enough to interact?), and growth rate projections.
Prioritization model:
Input per defect:
depth_pct, length_mm, width_mm, defect_type
pipe_grade, wall_thickness, diameter
MAOP, operating_pressure, pressure_cycling_frequency
growth_rate_mm_per_year (from growth model)
distance_to_nearest_defect, interaction_flag
Output:
remaining_life_years (regression)
repair_priority: immediate / next_shutdown / monitor (classification)
estimated_failure_pressure_psi (regression)
Validation: FEA (Finite Element Analysis) for top 100 critical defects
Agreement: 94% of ML priority rankings match FEA-based rankings
For GAIL's pipeline network, this approach reduced the annual excavation program from 800 digs to 350 digs while maintaining the same safety level, saving ₹40+ crore annually in excavation and repair costs.
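A rule-based baseline is a useful sanity check on the outputs of a learned prioritization model. The sketch below is such a baseline, not the ML model itself: it assumes constant linear growth to an 80% wall-loss limit, and the priority cut-offs of 2 and 10 years are hypothetical values chosen for illustration.

```python
def remaining_life_years(depth_pct, wall_thickness_mm, growth_mm_per_year, limit_pct=80.0):
    """Years until the defect reaches `limit_pct` of wall thickness at constant growth."""
    margin_mm = (limit_pct - depth_pct) / 100.0 * wall_thickness_mm
    if margin_mm <= 0:
        return 0.0
    if growth_mm_per_year <= 0:
        return float("inf")
    return margin_mm / growth_mm_per_year

def repair_priority(life_years, immediate_below=2.0, shutdown_below=10.0):
    """Map remaining life to the three classes named above (cut-offs are hypothetical)."""
    if life_years < immediate_below:
        return "immediate"
    if life_years < shutdown_below:
        return "next_shutdown"
    return "monitor"

# (defect_id, depth_pct, wall_thickness_mm, growth_rate_mm_per_year), illustrative values
defects = [("D1", 72.0, 9.5, 0.5), ("D2", 45.0, 9.5, 0.4), ("D3", 20.0, 9.5, 0.1)]
priorities = {
    defect_id: repair_priority(remaining_life_years(depth, wt, rate))
    for defect_id, depth, wt, rate in defects
}
print(priorities)
```

Defects where the learned model and this baseline disagree are natural candidates for the FEA validation step.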
Key Takeaways
This is chapter 3 of AI for Oil & Gas / Energy.