
AI for Pipeline Integrity & Asset Management

ILI Data Analysis, Cathodic Protection Monitoring, and Leak Detection

ILI Data Analysis: Anomaly Classification from MFL/UT Data

In-line inspection (ILI), in which intelligent pigs are run through a pipeline, generates massive datasets. A single MFL (Magnetic Flux Leakage) run on GAIL's HBJ pipeline (Hazira-Bijaipur-Jagdishpur, 2,700 km) produces more than 50 million data points: axial and circumferential flux leakage signals, internal caliper readings, GPS coordinates, and odometer distance. Converting these signals into actionable defect calls is where AI transforms pipeline integrity management.

In traditional ILI analysis, the vendor delivers a defect listing with depth, length, width, and classification (metal loss, dent, lamination, weld anomaly). The vendor's algorithms are proprietary, conservative, and tuned to minimize false negatives. The price is a high false-positive rate: 30-40% of reported anomalies turn out to be benign on excavation and direct examination.

ML Defect Classification

Open data/pipeline-inspection-data.csv — each row is a reported ILI anomaly with MFL signal characteristics, location, pipe properties, and field verification results (for previously excavated features).

Classification model:
  Features: peak_mfl_amplitude, signal_width_axial, signal_width_circ,
            signal_shape_factor, internal_external_flag,
            pipe_wall_thickness, pipe_grade, pipe_diameter,
            distance_from_weld, clock_position,
            previous_run_signal (if available)

  Target: defect_type (metal_loss, dent, lamination, mill_defect,
          weld_anomaly, false_call)

  Model: Gradient Boosted Trees (XGBoost)
  Performance: 91% accuracy on GAIL validation set
              (vs 72% for vendor's rule-based classification)

  Critical metric: 98.5% recall on metal_loss class
  (must not miss corrosion — integrity failures are catastrophic)
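The difference between overall accuracy and per-class recall is easy to check by hand. The sketch below uses made-up labels for ten excavated features (not the GAIL validation set) and hypothetical helper names; it only illustrates why a model can score well on accuracy while still missing metal loss.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of features whose predicted class equals the verified class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Return {class: recall}, where recall = correct calls / verified features of that class."""
    actual = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {cls: correct[cls] / n for cls, n in actual.items()}

# Illustrative labels only: field-verified class vs. model prediction.
y_true = ["metal_loss"] * 4 + ["false_call"] * 4 + ["dent"] * 2
y_pred = ["metal_loss", "metal_loss", "metal_loss", "false_call",  # one missed corrosion defect
          "false_call", "false_call", "false_call", "metal_loss",  # one unnecessary dig
          "dent", "dent"]

recall = per_class_recall(y_true, y_pred)
print(f"accuracy          = {accuracy(y_true, y_pred):.2f}")
print(f"metal_loss recall = {recall['metal_loss']:.2f}")
```

In this toy case one of four real metal-loss features is called benign, so the recall on the class that matters is far lower than the headline accuracy suggests. In practice the decision threshold for metal_loss would be tuned against recall first and accuracy second.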

Growth Rate Estimation

For pipelines with multiple ILI runs (common on GAIL and IOCL trunk lines with 3-5 year run intervals), AI models estimate corrosion growth rates by matching features across runs. Feature matching itself is a classification problem — same feature or different feature? — complicated by pig speed variations, sensor orientation changes, and new defects appearing between runs.

Growth Rate Method	Data Required	Accuracy (mm/yr)	Limitation
Box matching (traditional)	2+ runs	±0.15	Mismatches common at feature clusters
Signal correlation matching	2+ runs with raw signals	±0.10	Requires vendor raw data access
ML probabilistic matching	2+ runs + pipe tally	±0.08	Needs training data from excavations
Bayesian growth model	2+ runs + CP data + soil survey	±0.05	Computationally intensive

For IOCL's cross-country crude oil pipelines (Salaya-Mathura, Mundra-Panipat, Paradip-Haldia), where external corrosion is the dominant threat, growth rate models incorporating cathodic protection effectiveness and soil corrosivity data reduce the number of required excavations by 40-50% compared to vendor-recommended dig lists.
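Once a feature has been matched across two runs, the simplest growth estimate is linear: depth change divided by the time between runs. The sketch below shows that arithmetic only. The function names, the clipping of negative growth to zero, and the 80% depth limit are illustrative assumptions, not the matching or Bayesian models listed in the table.

```python
def growth_rate_mm_per_year(depth_pct_run1, depth_pct_run2,
                            wall_thickness_mm, years_between_runs):
    """Linear corrosion growth rate for one matched feature.

    Depths are in percent of wall thickness, as ILI vendors usually report them.
    Negative apparent growth is treated as sizing noise and clipped to zero
    (an assumption; some operators apply a minimum rate instead).
    """
    if years_between_runs <= 0:
        raise ValueError("years_between_runs must be positive")
    growth_mm = (depth_pct_run2 - depth_pct_run1) / 100.0 * wall_thickness_mm
    return max(growth_mm, 0.0) / years_between_runs

def years_to_depth_limit(depth_pct_now, rate_mm_per_year,
                         wall_thickness_mm, limit_pct=80.0):
    """Years until the feature reaches limit_pct of wall at a constant rate."""
    remaining_mm = (limit_pct - depth_pct_now) / 100.0 * wall_thickness_mm
    if remaining_mm <= 0:
        return 0.0
    if rate_mm_per_year == 0:
        return float("inf")
    return remaining_mm / rate_mm_per_year

# Illustrative feature: 20% deep in run 1, 32% deep four years later, 9.5 mm wall.
rate = growth_rate_mm_per_year(20.0, 32.0, 9.5, 4.0)
print(f"growth rate: {rate:.3f} mm/yr")
print(f"years to 80% wall: {years_to_depth_limit(32.0, rate, 9.5):.1f}")
```

Note that the accuracy figures in the table are of the same order as typical growth rates, so a single matched pair carries large relative uncertainty; that is the motivation for the probabilistic methods.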

Cathodic Protection Monitoring and Under-Protection Detection

Cathodic protection (CP) is the primary corrosion mitigation for buried pipelines. The principle: maintain the pipe-to-soil potential more negative than -850 mV relative to a copper/copper sulfate reference electrode (CSE) to suppress anodic dissolution. This is simple in theory but complex in practice, because CP effectiveness varies with soil resistivity, coating condition, interference from other structures (railways, power lines, neighboring pipelines), and rectifier output.

Open data/cathodic-protection-data.csv — each row is a CP survey reading: test post location, pipe-to-soil potential (on/off), soil resistivity, coupon current density, coating condition notes, and rectifier station data.

Spatial-Temporal CP Modeling

Traditional CP assessment: annual CIPS (Close Interval Potential Survey) — walking the pipeline with a trailing wire, measuring potential every 1-3 meters. This is a snapshot. Between surveys, CP effectiveness changes with seasons (soil moisture), rectifier aging, coating degradation, and third-party damage.

An AI model can interpolate between annual CIPS surveys using continuous remote monitoring data:

Continuous inputs (from rectifier RTUs, every 15 minutes):
  rectifier_output_voltage, rectifier_output_current
  test_post_potentials (at key locations, remote-read)

Periodic inputs (from CIPS, annually):
  full spatial potential profile (every 1-3 meters)
  soil resistivity profile

Static inputs:
  coating_type, coating_age, pipe_depth
  crossing_locations (roads, railways, rivers)
  foreign_structure_locations

Model output: estimated potential at every 100m segment, daily
Flag: segments with estimated potential > -850 mV CSE (less negative than the criterion, i.e. under-protected)
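The flagging step can be illustrated without any ML at all. The sketch below replaces the model with plain linear interpolation between remote-read test posts, which is a deliberately crude stand-in, and then applies the -850 mV criterion to every 100 m segment. Test post locations and readings are invented for the example.

```python
from bisect import bisect_right

PROTECTION_CRITERION_MV = -850.0  # pipe-to-soil potential vs. CSE

def interpolate_potential(chainage_m, posts):
    """Linearly interpolate potential (mV) at chainage_m.

    posts is a list of (chainage_m, potential_mV) tuples sorted by chainage.
    Outside the surveyed range the nearest post value is held constant.
    """
    xs = [x for x, _ in posts]
    if chainage_m <= xs[0]:
        return posts[0][1]
    if chainage_m >= xs[-1]:
        return posts[-1][1]
    i = bisect_right(xs, chainage_m)
    (x0, v0), (x1, v1) = posts[i - 1], posts[i]
    return v0 + (v1 - v0) * (chainage_m - x0) / (x1 - x0)

def under_protected_segments(posts, segment_m=100):
    """Return [(chainage_m, potential_mV)] for segments less negative than -850 mV."""
    flagged = []
    x, end = posts[0][0], posts[-1][0]
    while x <= end:
        v = interpolate_potential(x, posts)
        if v > PROTECTION_CRITERION_MV:  # "greater than" means less negative
            flagged.append((x, round(v, 1)))
        x += segment_m
    return flagged

# Hypothetical remote-read test posts: (chainage in m, off-potential in mV vs. CSE).
posts = [(0, -1100.0), (1000, -900.0), (2000, -800.0), (3000, -950.0)]
for chainage, mv in under_protected_segments(posts):
    print(f"{chainage:>5} m  {mv:8.1f} mV  under-protected")
```

A real model would add the rectifier output, soil resistivity, and coating inputs listed above; the comparison against the criterion stays the same.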

For GAIL's trunk pipeline network (16,000+ km), this approach identified 23 under-protected segments between annual surveys that would have been missed until the next CIPS. Three of these showed active corrosion on excavation.

Interference Detection

AC and DC interference from railways, HVDC transmission lines, and neighboring CP systems can cause both corrosion (AC corrosion becomes a concern at AC current densities above 30 A/m²) and CP reading errors (DC stray current makes pipe-to-soil readings unreliable). ML anomaly detection on rectifier output current identifies interference events in real time:

Interference signatures:
  DC stray current: sudden potential shifts correlated with railway timetable
  AC interference: 50 Hz ripple on potential readings, elevated coupon AC density
  Coating shielding: potential readings normal but coupon shows active corrosion

Detection model: Isolation Forest on rectifier current patterns
Alert: potential interference detected → field verification within 48 hours
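The text names Isolation Forest (available in scikit-learn as `sklearn.ensemble.IsolationForest`) as the detector. To keep the example dependency-free, the sketch below swaps in a much simpler technique, a robust z-score based on the median absolute deviation (MAD), to show the same idea of flagging readings that do not fit the normal pattern. The current values are invented, and the 3.5 cut-off is a common rule of thumb rather than a tuned setting.

```python
from statistics import median

def robust_z_scores(values):
    """Modified z-score: 0.6745 * (x - median) / MAD.

    The median and MAD are barely affected by the outliers themselves,
    unlike the mean and standard deviation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0] * len(values)
    return [0.6745 * (v - med) / mad for v in values]

def flag_anomalies(values, threshold=3.5):
    """Return indices of readings whose |modified z-score| exceeds threshold."""
    return [i for i, z in enumerate(robust_z_scores(values)) if abs(z) > threshold]

# Hypothetical rectifier output current (A), one reading every 15 minutes.
current_a = [12.1, 12.0, 12.2, 11.9, 12.1, 15.8, 12.0, 12.1, 8.4, 12.2]
for i in flag_anomalies(current_a):
    print(f"sample {i}: {current_a[i]} A -> possible interference, verify in field")
```

A single-variable score like this cannot separate the three signatures listed above; that would need multivariate features such as correlation with the railway timetable or coupon AC density.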

Leak Detection: SCADA-Based, Acoustic, and Flow Balance Methods

Pipeline leaks range from catastrophic ruptures (immediately obvious) to small seepage (1-5% of flow — detectable only with instrumentation). Indian pipeline regulations (PNGRB Technical Standards) require leak detection systems capable of detecting leaks as small as 1% of flow rate within 2 hours.

Open data/leak-detection-events.json — historical leak and false alarm events with SCADA data snapshots, detection method, detection time, and field verification results.

SCADA-Based Statistical Leak Detection

Traditional CPM (Computational Pipeline Monitoring) uses mass balance: if inlet mass flow minus outlet mass flow exceeds a threshold for a sustained period, the system raises an alarm. The challenge is that measurement noise, temperature transients, and batch changes also produce imbalances, and therefore false alarms. Indian pipelines operating with ±0.5% flow meter accuracy generate hundreds of nuisance alarms per month at thresholds sensitive enough to detect 1% leaks.
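A minimal sketch of the mass-balance rule follows, assuming a line-pack correction term and a simple persistence counter. Function and variable names, flow values, and the three-sample persistence are illustrative, not taken from any particular CPM product.

```python
def mass_balance_alarms(inlet, outlet, linepack_rate, threshold, persistence):
    """Return sample indices at which a leak alarm is active.

    imbalance = inlet - outlet - linepack_rate  (all in the same mass-flow units)
    An alarm is raised once the imbalance has exceeded `threshold`
    for `persistence` consecutive samples.
    """
    alarms = []
    run = 0  # consecutive samples above threshold
    for i, (q_in, q_out, dlp) in enumerate(zip(inlet, outlet, linepack_rate)):
        imbalance = q_in - q_out - dlp
        run = run + 1 if imbalance > threshold else 0
        if run >= persistence:
            alarms.append(i)
    return alarms

# Illustrative data in t/h: nominal flow 1000, so a 1% leak is 10 t/h.
inlet = [1000.0] * 8
outlet = [1000.0, 998.0, 985.0, 1001.0, 988.0, 987.0, 986.0, 985.0]
# Sample 2 is a packing transient: outlet dips, but the line is storing the mass.
linepack_rate = [0.0, 0.0, 14.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(mass_balance_alarms(inlet, outlet, linepack_rate, threshold=10.0, persistence=3))
```

The trade-off is visible in the two parameters: lowering `threshold` toward the meter noise floor or shortening `persistence` speeds up detection but multiplies nuisance alarms, which is the problem the ML approaches try to address.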

ML-based leak detection learns the normal operating envelope and detects deviations:

Features (sampled every 30 seconds):
  inlet_flow, outlet_flow, line_pack_rate_of_change
  inlet_pressure, outlet_pressure, intermediate_pressures
  product_temperature_profile (from DTS or discrete sensors)
  pump_status, valve_positions
  ambient_temperature, soil_temperature (if available)

Approach 1 — Autoencoder anomaly detection:
  Train on 6 months of confirmed leak-free operation
  Reconstruction error > threshold → leak alarm
  Advantage: no labeled leak data needed (leaks are rare)

Approach 2 — Physics-informed neural network:
  Encode hydraulic model (pressure drop = f(flow, viscosity, elevation))
  Residuals between model and measured values → leak signature
  Advantage: can estimate leak location from pressure profile
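The leak-location idea in Approach 2 has a classical, non-neural counterpart: the pressure gradient intersection method. Upstream of a leak the flow (and so the pressure gradient) is higher than downstream, and the two straight-line pressure profiles meet at the leak. The sketch below assumes a horizontal pipeline, steady state, and constant gradients on each side; it is not a physics-informed neural network, and all numbers are invented.

```python
def locate_leak_km(p_in, p_out, grad_up, grad_down, length_km):
    """Estimate leak position (km from inlet) by gradient intersection.

    Upstream profile:   P(x) = p_in  - grad_up   * x
    Downstream profile: P(x) = p_out + grad_down * (length_km - x)
    Gradients are positive pressure drops per km, pressures in the same units.
    """
    if grad_up <= grad_down:
        raise ValueError("no leak signature: upstream gradient must exceed downstream")
    return (p_in - p_out - grad_down * length_km) / (grad_up - grad_down)

# Illustrative case: 100 km line, 60 bar inlet, 33 bar outlet,
# 0.30 bar/km measured near the inlet and 0.25 bar/km near the outlet.
x_leak = locate_leak_km(p_in=60.0, p_out=33.0, grad_up=0.30,
                        grad_down=0.25, length_km=100.0)
print(f"estimated leak location: {x_leak:.1f} km from inlet")
```

Because the denominator is the small difference between two gradients, location error grows quickly for small leaks, which is one reason to fit the full hydraulic model to all intermediate pressures rather than rely on two gradients.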

Acoustic Leak Detection

Fiber optic DTS/DAS (Distributed Temperature/Acoustic Sensing) is increasingly deployed on new Indian pipelines. IOCL's Paradip-Haldia pipeline has fiber optic monitoring along its 600+ km length. With DAS, the fiber effectively turns the pipeline into a continuous microphone: acoustic emissions from leaks produce characteristic spectral signatures.

ML classification on DAS spectral data distinguishes:

Leak signatures: broadband noise centered at leak location, amplitude proportional to leak rate

Third-party interference: construction activity, vehicle crossing, excavation (the most common cause of pipeline damage in India)

Operational noise: pump transients, valve operations, pig passage

A CNN trained on labeled DAS events achieves 95% accuracy in classifying event types, with leak detection sensitivity down to 0.5% of flow rate within 5 minutes — far exceeding PNGRB requirements.

Repair Prioritization Using Fitness-for-Service Assessment

Not every defect requires immediate repair. Fitness-for-service (FFS) assessment, using methods such as ASME B31G, Modified B31G, RSTRENG, BS 7910, and API 579-1/ASME FFS-1, determines whether a defect can remain in service until the next planned maintenance window.

AI-Enhanced FFS Screening

Traditional FFS is computationally simple for individual defects but becomes complex at scale: a single ILI run reports 2,000-5,000 anomalies, each requiring assessment against operating conditions, interaction rules (are two defects close enough to interact?), and growth rate projections.

Prioritization model:
  Input per defect:
    depth_pct, length_mm, width_mm, defect_type
    pipe_grade, wall_thickness, diameter
    MAOP, operating_pressure, pressure_cycling_frequency
    growth_rate_mm_per_year (from growth model)
    distance_to_nearest_defect, interaction_flag

  Output:
    remaining_life_years (regression)
    repair_priority: immediate / next_shutdown / monitor (classification)
    estimated_failure_pressure_psi (regression)

  Validation: FEA (Finite Element Analysis) for top 100 critical defects
  Agreement: 94% of ML priority rankings match FEA-based rankings
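The failure-pressure target that such a model learns can be generated from a closed-form FFS method. The sketch below implements the Modified B31G (0.85 dL) estimate for a single corrosion defect; the function name, the example pipe (30-inch, 9.5 mm wall, roughly X52) and the defect dimensions are illustrative assumptions, and any real assessment should follow the current edition of the standard.

```python
import math

MPA_TO_PSI = 145.038

def modified_b31g_failure_pressure_mpa(depth_mm, length_mm, diameter_mm,
                                       wall_mm, smys_mpa):
    """Estimated failure pressure (MPa) of a corrosion defect, Modified B31G.

    Flow stress = SMYS + 68.95 MPa (10 ksi); the metal-loss area is
    approximated as 0.85 * depth * length.
    """
    d_t = depth_mm / wall_mm
    if not 0.0 <= d_t <= 0.8:
        raise ValueError("method applies to depths up to 80% of wall thickness")
    z = length_mm ** 2 / (diameter_mm * wall_mm)
    # Folias bulging factor, with the long-defect branch for z > 50.
    if z <= 50.0:
        m = math.sqrt(1.0 + 0.6275 * z - 0.003375 * z ** 2)
    else:
        m = 0.032 * z + 3.3
    flow_stress = smys_mpa + 68.95
    failure_stress = flow_stress * (1.0 - 0.85 * d_t) / (1.0 - 0.85 * d_t / m)
    return 2.0 * failure_stress * wall_mm / diameter_mm

# Illustrative defect: 40% deep (3.8 mm), 200 mm long, SMYS approx. 359 MPa.
p_fail = modified_b31g_failure_pressure_mpa(3.8, 200.0, 762.0, 9.5, 359.0)
print(f"estimated failure pressure: {p_fail:.2f} MPa ({p_fail * MPA_TO_PSI:.0f} psi)")
```

Combining this with a projected depth (current depth plus growth rate times years) gives a remaining-life estimate: the time until the failure pressure, divided by the operator's safety factor, drops to MAOP.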

For GAIL's pipeline network, this approach reduced the annual excavation program from 800 digs to 350 digs while maintaining the same safety level — saving ₹40+ crore annually in excavation and repair costs.

Key Takeaways

ML-based ILI analysis (defect classification combined with growth-rate modeling) reduces unnecessary excavations by 40-50%, so the ROI is immediate and measurable in reduced dig costs. The critical constraint: the model must maintain near-100% recall on true corrosion defects.

Continuous CP monitoring between CIPS surveys catches under-protection events — the combination of rectifier RTU data and ML interpolation finds gaps that annual surveys miss.

ML-based leak detection reduces false alarms by 70-80% — the key is learning the normal operating envelope rather than relying on fixed thresholds. Physics-informed models add leak location capability.

AI-enhanced FFS prioritization optimizes the repair budget — not every defect needs immediate attention. ML screening with FEA validation for critical cases gives the optimal repair program at minimum cost.

This is chapter 3 of AI for Oil & Gas / Energy.
