
AI for Reservoir Engineering & Production Optimization

Decline Curve Analysis, Water Breakthrough Prediction, and ESP Optimization with ML

Beyond Arps: AI-Driven Decline Curve Analysis

Every petroleum engineer learns Arps decline models in year one — exponential, hyperbolic, harmonic. They work reasonably well for conventional reservoirs under boundary-dominated flow. They fail predictably in tight reservoirs, fractured carbonates, and wells with changing operating conditions (artificial lift changes, workover interventions, offset well interference).

The fundamental limitation: Arps assumes a single decline regime with constant b-factor. Real production data from ONGC Mumbai High or Cairn Rajasthan shows regime transitions — transient flow shifting to boundary-dominated, water breakthrough causing accelerated decline, ESP frequency changes creating step-changes in rate.
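For reference, the three Arps forms share one rate equation with a fixed exponent b, which is why a single fit cannot follow a regime change. The sketch below is a minimal illustration; the function name and the example numbers are assumptions, not values from any field.

```python
import math


def arps_rate(qi: float, di: float, b: float, t: float) -> float:
    """Arps rate at time t.

    qi: initial rate, di: initial decline (1/time, same time unit as t),
    b: decline exponent (0 = exponential, 1 = harmonic, otherwise hyperbolic).
    """
    if b < 0:
        raise ValueError("b must be non-negative")
    if b == 0:
        return qi * math.exp(-di * t)
    # Hyperbolic form; b == 1 reduces to the harmonic case qi / (1 + di * t).
    return qi / (1.0 + b * di * t) ** (1.0 / b)


# Hypothetical well: 1000 bopd initial rate, 10% per month initial decline.
for label, b in [("exponential", 0.0), ("hyperbolic", 0.5), ("harmonic", 1.0)]:
    print(label, round(arps_rate(1000.0, 0.1, b, 12.0), 1))
```

Because qi, di, and b are constants for the whole forecast, an ESP frequency change or a workover has no way to enter the equation.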

ML-based decline models learn these regime transitions from historical analogs. A Long Short-Term Memory (LSTM) network trained on 500+ wells from the same formation captures:

Regime detection — automatic identification of flow regime transitions without manual well test interpretation

Analog matching — finding the 10-20 most similar wells based on completion, reservoir properties, and early production to forecast the subject well

Operating condition encoding — ESP frequency, choke size, and waterflood pattern as input features that Arps cannot incorporate
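The analog matching step can be sketched as a nearest-neighbour search in a scaled feature space. This is a simplified stand-in, not the LSTM itself: the feature names (perm_md, porosity, early_oil_bopd), the well IDs, and the choice of scaled Euclidean distance are all assumptions made for illustration.

```python
import math


def find_analogs(subject, candidates, k=10):
    """Return the well_ids of the k candidates closest to the subject well.

    subject: dict of feature name -> value.
    candidates: list of {"well_id": str, "features": dict}.
    Each feature difference is divided by that feature's standard deviation
    across the candidate pool, so large-unit features do not dominate.
    """
    keys = sorted(subject)
    scale = {}
    for key in keys:
        vals = [c["features"][key] for c in candidates]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        scale[key] = std or 1.0  # guard against constant features

    def distance(feats):
        return math.sqrt(
            sum(((feats[key] - subject[key]) / scale[key]) ** 2 for key in keys)
        )

    ranked = sorted(candidates, key=lambda c: distance(c["features"]))
    return [c["well_id"] for c in ranked[:k]]


# Hypothetical wells for illustration only.
subject_well = {"perm_md": 100.0, "porosity": 0.20, "early_oil_bopd": 500.0}
pool = [
    {"well_id": "W1", "features": {"perm_md": 110.0, "porosity": 0.21, "early_oil_bopd": 520.0}},
    {"well_id": "W2", "features": {"perm_md": 900.0, "porosity": 0.30, "early_oil_bopd": 2000.0}},
    {"well_id": "W3", "features": {"perm_md": 95.0, "porosity": 0.19, "early_oil_bopd": 480.0}},
    {"well_id": "W4", "features": {"perm_md": 500.0, "porosity": 0.25, "early_oil_bopd": 1200.0}},
]
print(find_analogs(subject_well, pool, k=2))
```

In practice the distance metric and feature set would be tuned per formation; the point here is only that analogs are chosen by similarity before any forecasting happens.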

Method	Inputs	P90-P10 Range (6-month forecast)	Failure Mode
Arps Hyperbolic	Rate vs time	±35%	Overpredicts tight reservoirs, ignores interventions
Modified Arps + Duong	Rate vs time + material balance time	±25%	Still single-regime
LSTM with Analog	Rate, pressure, watercut, ESP freq, completion data	±15%	Needs 50+ analog wells
Physics-Informed Neural Network	Rate + reservoir simulation constraints	±12%	Computationally expensive for real-time

Open data/well-production-data.csv in the code panel. Each row is a monthly production record: well_id, date, oil_rate_bopd, water_rate_bwpd, gas_rate_mscfd, flowing_bhp_psi, esp_frequency_hz, choke_size_64ths.
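A minimal way to read that file with the standard library is sketched below. The column names come from the description above; the inline sample rows, the date format, and the function name load_production are assumptions used so the snippet runs without the file.

```python
import csv
import io
from collections import defaultdict

NUMERIC_COLUMNS = [
    "oil_rate_bopd", "water_rate_bwpd", "gas_rate_mscfd",
    "flowing_bhp_psi", "esp_frequency_hz", "choke_size_64ths",
]


def load_production(handle):
    """Group monthly records by well_id, converting numeric columns to float."""
    wells = defaultdict(list)
    for row in csv.DictReader(handle):
        record = {"date": row["date"]}
        for col in NUMERIC_COLUMNS:
            # Empty cells become None instead of raising on float("").
            record[col] = float(row[col]) if row[col] != "" else None
        wells[row["well_id"]].append(record)
    return dict(wells)


# Hypothetical rows standing in for data/well-production-data.csv.
SAMPLE = """well_id,date,oil_rate_bopd,water_rate_bwpd,gas_rate_mscfd,flowing_bhp_psi,esp_frequency_hz,choke_size_64ths
W-001,2023-01-01,850,120,400,1850,52.0,32
W-001,2023-02-01,810,150,390,1820,52.0,32
W-002,2023-01-01,1200,,610,2100,55.5,40
"""

wells = load_production(io.StringIO(SAMPLE))
print({well_id: len(records) for well_id, records in wells.items()})
```

With the real file, the same function can be called as `with open("data/well-production-data.csv", newline="") as f: wells = load_production(f)`.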

Water Cut Prediction and Breakthrough Detection

Water management is the single largest operating cost in mature Indian oilfields. Many Mumbai High wells produce at water cuts above 85%. The Mangala field in Rajasthan hit water breakthrough faster than expected after polymer flood initiation. Predicting breakthrough 3 months early, rather than reacting after it happens, is the difference between proactive well management and expensive workover campaigns.

Feature Engineering for Water Breakthrough

The key features are not just production rates — they are rate-of-change features and inter-well interference signals:

Primary features:
  watercut_trend_30d, watercut_trend_90d (slope of watercut vs time)
  oil_rate_derivative (dq/dt normalized)
  hall_plot_slope (slope of the cumulative pressure-time integral vs cumulative injection; tracks injector performance)
  voidage_replacement_ratio (injection volume / production volume per pattern)

Inter-well features:
  offset_injector_rate_change_30d
  pattern_watercut_average
  distance_to_nearest_injector
  tracer_breakthrough_flag (if available)
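Two of the primary features above, the water cut trend and the voidage replacement ratio, can be computed as in the sketch below. It assumes samples are spaced more finely than the window (daily or weekly rather than monthly) and that volumes for the ratio are already at reservoir conditions; the function names are illustrative.

```python
def water_cut(oil_rate_bopd, water_rate_bwpd):
    """Water cut as a fraction of total liquid rate."""
    total = oil_rate_bopd + water_rate_bwpd
    return water_rate_bwpd / total if total > 0 else 0.0


def trend_slope(days, values):
    """Ordinary least-squares slope of values against days (units per day)."""
    n = len(days)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("all samples share one timestamp")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    return sxy / sxx


def trailing_trend(days, values, window_days):
    """Slope over the trailing window that ends at the last sample."""
    cutoff = days[-1] - window_days
    pairs = [(d, v) for d, v in zip(days, values) if d >= cutoff]
    return trend_slope([d for d, _ in pairs], [v for _, v in pairs])


def voidage_replacement_ratio(injected_rb, produced_rb):
    """Pattern injection volume divided by pattern production volume."""
    return injected_rb / produced_rb


# Hypothetical series: water cut rising by 0.001 per day.
days = list(range(0, 100, 10))
cuts = [0.20 + 0.001 * d for d in days]
print(trailing_trend(days, cuts, 30), trailing_trend(days, cuts, 90))
print(voidage_replacement_ratio(1200.0, 1000.0))
```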

A gradient boosted classifier (XGBoost) trained on 200+ wells from ONGC's western offshore fields predicts water breakthrough (defined as watercut increase > 5% in 60 days) with 82% precision and 78% recall at a 90-day horizon. The model's SHAP values consistently rank hall_plot_slope and voidage_replacement_ratio as the top features — confirming what reservoir engineers intuit: injector performance drives producer water breakthrough.
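The breakthrough definition used as the training target can be turned into labels as sketched below. One assumption is made explicit: "5%" is read as 5 percentage points of water cut (0.05 as a fraction). Samples whose 60-day future window is not fully observed are left unlabeled rather than marked negative.

```python
def label_breakthrough(days, watercut, horizon_days=60, threshold=0.05):
    """Label each sample 1 if water cut rises by more than `threshold`
    within the next `horizon_days`, 0 if it does not, and None when the
    future window extends past the end of the record.
    """
    labels = []
    last_day = days[-1]
    for i, (day, wc) in enumerate(zip(days, watercut)):
        if last_day - day < horizon_days:
            labels.append(None)  # future not fully observed yet
            continue
        future = [
            w for d, w in zip(days[i + 1:], watercut[i + 1:])
            if d - day <= horizon_days
        ]
        rise = max(future) - wc if future else 0.0
        labels.append(1 if rise > threshold else 0)
    return labels


# Hypothetical monthly water cut history (fractions).
sample_days = [0, 30, 60, 90, 120]
sample_cut = [0.10, 0.11, 0.12, 0.20, 0.21]
print(label_breakthrough(sample_days, sample_cut))
```

Dropping the unlabeled tail before training avoids teaching the classifier that recent samples never break through.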

Cairn Rajasthan: Mangala-Bhagyam-Aishwariya

The MBA fields in Barmer Basin present a unique challenge: the Fatehgarh sandstone reservoir has high permeability channels (>1 Darcy) interspersed with tight zones. Polymer flood was implemented to improve sweep efficiency, but the high-permeability channels acted as thief zones. AI models trained on the first 50 wells that experienced breakthrough predicted channel-dominated breakthrough in the next 30 wells with 75% accuracy — enabling preemptive polymer concentration adjustments.

ESP Optimization from Sensor Data

Electric Submersible Pumps are the workhorse of Indian offshore production. Mumbai High alone has 400+ ESP installations. Each ESP has intake pressure, discharge pressure, motor temperature, vibration, and current sensors reporting every 15 minutes. That is 96 readings per sensor per day, or roughly 35,000 per sensor per year.

Failure Prediction

ESP failures cost ₹1-2 crore per event (rig mobilization + lost production). Mean time between failures (MTBF) for Indian offshore ESPs is 500-800 days. Predicting failures 30-60 days in advance enables planned workovers during scheduled rig campaigns.

Open data/reservoir-properties.json — it contains static reservoir properties (porosity, permeability, net pay, fluid properties) linked to each well in the production dataset.

The failure prediction model uses rolling statistical features:

Features (computed over 7d, 14d, 30d windows):
  motor_temp_mean, motor_temp_std, motor_temp_max
  vibration_rms, vibration_peak, vibration_crest_factor
  current_imbalance (phase A-B-C deviation)
  intake_pressure_trend
  discharge_pressure_trend
  pump_efficiency (calculated: hydraulic power / electrical power)

Target: failure_within_60_days (binary)
Model: Random Forest with time-series cross-validation
Performance: AUC 0.87 on ONGC western offshore validation set
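Several of the listed features reduce to short formulas, sketched below with the standard library only. The 15-minute cadence comes from the text; the imbalance definition (largest deviation from the mean phase current, divided by the mean) and the unit conversions for pump efficiency are assumptions chosen for illustration.

```python
import math
from statistics import fmean, pstdev

SAMPLES_PER_DAY = 96  # 15-minute sensor cadence
BBL_TO_M3 = 0.158987294928
PSI_TO_PA = 6894.757293


def rolling_stats(values, window_days):
    """Mean, standard deviation, and max over the trailing window."""
    tail = values[-window_days * SAMPLES_PER_DAY:]
    return {"mean": fmean(tail), "std": pstdev(tail), "max": max(tail)}


def vibration_features(samples):
    """RMS, peak, and crest factor (peak / RMS) of a vibration window."""
    rms = math.sqrt(fmean([x * x for x in samples]))
    peak = max(abs(x) for x in samples)
    return {
        "vibration_rms": rms,
        "vibration_peak": peak,
        "vibration_crest_factor": peak / rms if rms else 0.0,
    }


def current_imbalance(ia, ib, ic):
    """Largest phase deviation from the mean current, as a fraction of the mean."""
    avg = (ia + ib + ic) / 3.0
    return max(abs(i - avg) for i in (ia, ib, ic)) / avg


def pump_efficiency(rate_bpd, delta_p_psi, electrical_kw):
    """Hydraulic power (flow rate x pressure rise) over electrical power."""
    flow_m3_per_s = rate_bpd * BBL_TO_M3 / 86400.0
    hydraulic_kw = flow_m3_per_s * delta_p_psi * PSI_TO_PA / 1000.0
    return hydraulic_kw / electrical_kw


# Hypothetical readings.
print(vibration_features([3.0, -4.0]))
print(round(current_imbalance(100.0, 100.0, 106.0), 4))
print(round(pump_efficiency(2000.0, 1500.0, 60.0), 3))
```

Computing the same statistics over 7, 14, and 30 day windows gives the model both the current level and how quickly it is drifting.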

Frequency Optimization

ESP frequency directly controls production rate but also affects power consumption, gas handling, and pump life. The optimization problem: maximize oil rate while maintaining motor temperature < 150°C, intake pressure > 200 psi (above bubble point to avoid gas locking), and vibration < threshold.

A Bayesian optimization loop adjusts ESP frequency in 0.5 Hz increments, using a Gaussian process surrogate trained on historical frequency-response data for each well. At ONGC's Heera platform, this approach increased average oil rate by 8% while reducing power consumption by 12% — the classic "you were running your pumps wrong" finding that AI makes obvious.
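The constrained search itself can be illustrated without the Gaussian process. The sketch below swaps in a plain grid search over 0.5 Hz steps and a made-up linear response function in place of the fitted surrogate; the temperature and intake pressure limits come from the text, while the vibration limit, the frequency range, and every coefficient in toy_response are hypothetical.

```python
def toy_response(freq_hz):
    """Hypothetical well response; stands in for a fitted surrogate model."""
    return {
        "oil_rate_bopd": 25.0 * freq_hz,
        "motor_temp_c": 60.0 + 1.5 * freq_hz,
        "intake_pressure_psi": 600.0 - 7.0 * freq_hz,
        "vibration": 0.002 * freq_hz ** 2,
    }


def is_feasible(resp, vibration_limit):
    """Apply the operating constraints to one predicted response."""
    return (
        resp["motor_temp_c"] < 150.0
        and resp["intake_pressure_psi"] > 200.0
        and resp["vibration"] < vibration_limit
    )


def best_frequency(response, lo_hz=40.0, hi_hz=65.0, step_hz=0.5,
                   vibration_limit=8.0):
    """Return (frequency, oil rate) of the best feasible grid point, or None."""
    n_steps = int(round((hi_hz - lo_hz) / step_hz))
    best = None
    for i in range(n_steps + 1):
        freq = lo_hz + i * step_hz
        resp = response(freq)
        if not is_feasible(resp, vibration_limit):
            continue
        if best is None or resp["oil_rate_bopd"] > best[1]:
            best = (freq, resp["oil_rate_bopd"])
    return best


print(best_frequency(toy_response))
```

A Bayesian optimization loop differs in that it evaluates only a few frequencies on the real well and lets the surrogate's uncertainty decide which one to try next, which matters when each trial costs production time.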

Reserves Estimation: Probabilistic Methods Enhanced by ML

Reserves estimation is the most consequential calculation in petroleum engineering — it determines asset valuation, development decisions, and SEC/DGH reporting. Traditional methods: volumetric (OOIP × RF), material balance, and decline curve analysis. Each has uncertainty ranges that are typically handled by Monte Carlo simulation over input distributions.
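The volumetric Monte Carlo workflow can be sketched as follows, using the standard field-unit form of OOIP (7758 bbl per acre-ft times area, net pay, porosity, and oil saturation, divided by the formation volume factor). All input distributions below are hypothetical, and P90 follows the petroleum convention of the value exceeded with 90% probability.

```python
import random


def ooip_stb(area_acres, net_pay_ft, porosity, sw, bo):
    """Original oil in place in stock-tank barrels."""
    return 7758.0 * area_acres * net_pay_ft * porosity * (1.0 - sw) / bo


def simulate_reserves(n_trials=5000, seed=0):
    """Sample the inputs and return sorted reserves (OOIP x RF) in STB."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_trials):
        # random.triangular takes (low, high, mode); values are illustrative.
        area = rng.triangular(800.0, 1200.0, 1000.0)
        pay = rng.triangular(30.0, 70.0, 50.0)
        phi = rng.triangular(0.15, 0.25, 0.20)
        sw = rng.uniform(0.25, 0.35)
        bo = rng.uniform(1.15, 1.25)
        rf = rng.uniform(0.25, 0.40)
        outcomes.append(ooip_stb(area, pay, phi, sw, bo) * rf)
    outcomes.sort()
    return outcomes


def exceedance_value(sorted_vals, prob):
    """Value exceeded with probability `prob` (P90 -> prob=0.90)."""
    idx = int(round((1.0 - prob) * (len(sorted_vals) - 1)))
    return sorted_vals[idx]


reserves = simulate_reserves()
p90, p50, p10 = (exceedance_value(reserves, p) for p in (0.90, 0.50, 0.10))
print(f"P90={p90:,.0f}  P50={p50:,.0f}  P10={p10:,.0f} STB")
```

Narrowing any one input distribution, for example net pay, directly narrows the P90-P10 spread of the result.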

Open data/decline-curve-analysis.json — it contains fitted decline parameters (qi, Di, b) and EUR estimates for each well, along with uncertainty ranges.
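Given fitted qi, Di, and b, EUR follows from the closed-form Arps cumulative production down to an economic limit rate. The sketch below assumes qi in barrels per day and Di in 1/day; the economic limit and the example parameters are hypothetical, and the layout of the JSON file is not assumed.

```python
import math


def arps_eur(qi, di, b, q_limit):
    """Cumulative production from rate qi down to rate q_limit.

    qi, q_limit: rates in bbl/day; di: initial decline in 1/day;
    b: Arps exponent (0 exponential, 1 harmonic, otherwise hyperbolic).
    """
    if q_limit >= qi:
        return 0.0
    if b == 0:
        return (qi - q_limit) / di
    if b == 1:
        return (qi / di) * math.log(qi / q_limit)
    return qi ** b / ((1.0 - b) * di) * (qi ** (1.0 - b) - q_limit ** (1.0 - b))


# Hypothetical well: 1000 bopd, 0.001 per day decline, 100 bopd economic limit.
for label, b in [("exponential", 0.0), ("hyperbolic", 0.5), ("harmonic", 1.0)]:
    print(label, round(arps_eur(1000.0, 0.001, b, 100.0)))
```

The spread between the three results for identical qi and Di shows how strongly the b-factor alone drives EUR.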

ML-Enhanced Volumetric Estimation

The ML contribution is not replacing the volumetric calculation — it is improving the input distributions. Specifically:

Net pay prediction from well logs using ML (replacing manual picks that vary by interpreter)

Porosity-permeability transforms using Random Forest instead of core-calibrated linear regression

Recovery factor prediction from analog fields using gradient boosted regression on reservoir parameters (depth, permeability, oil viscosity, drive mechanism, well spacing)

For Oil India's Assam fields (Nahorkatiya, Moran, Lakwa), where 60+ years of production history exist, the ML-enhanced volumetric method reduced P90-P10 range by 30% compared to traditional Monte Carlo — because the input distributions are informed by data rather than expert guesses.

DCA-Based EUR with Uncertainty

LSTM decline models can produce probabilistic forecasts when dropout is left active at inference time (Monte Carlo dropout): each forward pass generates one realization, and running 1,000 passes gives a distribution of EUR. This approximates Bayesian inference over the network weights, and it does not require assuming a parametric decline model.
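The mechanics of repeated stochastic forward passes can be shown on a toy network. The sketch below is not an LSTM: it is a tiny one-hidden-layer model with hard-coded, hypothetical weights, used only to show dropout being sampled on every pass and the outputs being summarized as P90, P50, and P10.

```python
import random

# Hypothetical weights: 4 hidden units x 2 inputs, then hidden -> EUR (Mbbl).
W_HIDDEN = [[0.8, -0.2], [0.3, 0.5], [-0.4, 0.9], [0.6, 0.1]]
W_OUT = [120.0, 80.0, 60.0, 100.0]


def forward_with_dropout(x, rng, p_drop=0.2):
    """One forward pass with a freshly sampled dropout mask on the hidden layer."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W_HIDDEN]
    keep = 1.0 - p_drop
    # Inverted dropout: kept units are rescaled so the expected output is unchanged.
    masked = [h / keep if rng.random() < keep else 0.0 for h in hidden]
    return sum(w * h for w, h in zip(W_OUT, masked))


def eur_distribution(x, n_passes=1000, seed=0):
    """Sorted EUR realizations from n_passes stochastic forward passes."""
    rng = random.Random(seed)
    return sorted(forward_with_dropout(x, rng) for _ in range(n_passes))


def exceedance_value(sorted_vals, prob):
    """Value exceeded with probability `prob` (P90 -> prob=0.90)."""
    idx = int(round((1.0 - prob) * (len(sorted_vals) - 1)))
    return sorted_vals[idx]


realizations = eur_distribution([1.0, 0.5])
eur_p90, eur_p50, eur_p10 = (
    exceedance_value(realizations, p) for p in (0.90, 0.50, 0.10)
)
print(eur_p90, eur_p50, eur_p10)
```

With a real sequence model the forward pass would return a full rate forecast, and EUR would be the integral of each realization before the percentiles are taken.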

For KG Basin deepwater wells (ONGC's KG-DWN-98/2 block), where only 3-5 years of production history exist for each well, the LSTM approach produced tighter EUR distributions than Arps fitting because it borrowed information from analog deepwater wells globally.

Key Takeaways

Decline curve analysis with ML beats Arps in complex reservoirs — regime transitions, operating condition changes, and inter-well interference are captured by sequence models (LSTM, Transformer) but invisible to parametric methods.

Water breakthrough prediction is an inter-well problem — the best features come from injector performance (Hall plot, VRR) and pattern-level data, not just individual well production.

ESP optimization has immediate ROI — sensor data already exists, the models are straightforward (Random Forest, Bayesian optimization), and the payoff is measurable in weeks (higher rate, lower power, fewer failures).

Reserves estimation benefits from ML at the input level — better net pay picks, better poro-perm transforms, better recovery factor analogs reduce uncertainty in volumetric and DCA-based EUR calculations.

This is chapter 1 of AI for Oil & Gas / Energy.
