
AI for Geotechnical Engineering & Foundation Design

Soil Classification from SPT Data, Foundation Selection, Slope Stability, and Problem Soil Challenges

AI-Enhanced Soil Classification from SPT and Lab Data

Standard Penetration Test data is the backbone of geotechnical investigation — ASTM D1586 governs the procedure, and nearly every foundation design starts with an SPT borehole log. The interpretation, however, involves significant engineering judgment. Two geotechnical engineers looking at the same SPT log often disagree on soil layer boundaries, classification, and design parameters. AI formalizes the interpretation process while learning from thousands of historical boreholes.

Open data/soil-investigation-data.csv in the code panel. Each row is an SPT record: borehole_id, project_id, location_lat, location_lng, depth_m, spt_n_value, spt_n60_corrected, soil_description_field, soil_classification_uscs (ASTM D2487), liquid_limit, plastic_limit, grain_size_gravel_pct, grain_size_sand_pct, grain_size_fines_pct, moisture_content_pct, unit_weight_kn_m3, ucs_kpa (if rock/stiff clay), geological_formation.
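The spt_n60_corrected column presumably holds the field blow count normalized to 60% hammer energy. A minimal sketch of that normalization follows, using the commonly cited form N60 = N x (ER/60) x C_B x C_S x C_R. The function name, parameter names, and example factor values are illustrative assumptions, not taken from the dataset.

```python
def n60_from_field_n(n_field, energy_ratio_pct=60.0, c_borehole=1.0,
                     c_sampler=1.0, c_rod=1.0):
    """Normalize a field SPT blow count to 60% hammer energy.

    N60 = N * (ER / 60) * C_B * C_S * C_R
    Every correction factor defaults to 1.0 (assumption: no correction applies).
    """
    if n_field < 0:
        raise ValueError("blow count cannot be negative")
    return n_field * (energy_ratio_pct / 60.0) * c_borehole * c_sampler * c_rod


# Hypothetical record: hammer assumed to deliver 80% energy, rod factor assumed 0.85
n60 = n60_from_field_n(12, energy_ratio_pct=80.0, c_rod=0.85)
print(round(n60, 1))
```

The correction factors depend on the rig and the borehole log header, so in practice they would be read per borehole rather than hard-coded.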

From SPT N-values to Soil Properties

The classical correlations (Terzaghi-Peck for sands, Stroud for clays) are first-order approximations. They were developed on a limited set of soils and systematically over- or under-predict for problematic regional deposits:

Soil Type	Classical Correlation Issue
Expansive clay (e.g., Texas/Colorado)	SPT N = 2-8 but swelling pressure is the real design driver, not bearing capacity
Residual/saprolite (Piedmont)	High N-values (20-40) but relic structure softens on saturation — N-value misleading
Soft marine/bay clay (SF Bay, Boston)	Very low N (0-2) with high sensitivity — remoulded strength is 20-30% of undisturbed
Glacial outwash / till (Midwest, NE)	SPT refusal in cobble layers interbedded with loose sand — erratic N-profile
Weathered basalt / volcanic	Highly variable weathering — N ranges from 5 (completely weathered) to refusal in 1m vertical distance
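For reference, the first-order correlations that these problem soils deviate from can be sketched in a few lines. The density bands follow the usual Terzaghi-Peck descriptive terms for sands, and the clay estimate uses a Stroud-type relation s_u = f1 x N60. The value f1 = 5 kPa is an assumed mid-range figure, and the treatment of band boundaries varies between references.

```python
def sand_density_term(n_value):
    """Descriptive relative density for sand from SPT N (Terzaghi-Peck bands)."""
    if n_value < 4:
        return "very loose"
    if n_value < 10:
        return "loose"
    if n_value < 30:
        return "medium dense"
    if n_value <= 50:
        return "dense"
    return "very dense"


def clay_undrained_strength_kpa(n60, f1_kpa=5.0):
    """Stroud-type estimate s_u = f1 * N60 (f1 = 5 kPa is an assumed value)."""
    return f1_kpa * n60


print(sand_density_term(18), clay_undrained_strength_kpa(6))
```

Neither function knows anything about swelling, relic structure, or sensitivity, which is exactly why the soils in the table above are mis-characterized by N-value alone.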

A Random Forest classifier trained on 15,000+ SPT records from 800 projects classifies soil type (USCS groups: GW/GP/GM/GC/SW/SP/SM/SC/ML/CL/MH/CH) with 89% accuracy — compared to 72% for rule-based classification using only N-value and depth. The improvement comes from incorporating regional geological context (geological_formation as a feature) and grain size distribution when available.
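When Atterberg limits are available, the fine-grained USCS groups can be assigned directly from the plasticity chart, which is one way the target labels or a sanity check for the classifier could be produced. The sketch below is a simplified reading of the ASTM D2487 plasticity chart for inorganic fine-grained soils only. It is not the Random Forest described above, and it ignores organic soils and all coarse-grained groups.

```python
def a_line_pi(liquid_limit):
    """Plasticity index on the A-line: horizontal at PI = 4, then 0.73 * (LL - 20)."""
    return max(4.0, 0.73 * (liquid_limit - 20.0))


def classify_fine_grained(liquid_limit, plastic_limit):
    """Simplified USCS group symbol for an inorganic fine-grained soil."""
    pi = liquid_limit - plastic_limit
    on_or_above_a_line = pi >= a_line_pi(liquid_limit)
    if liquid_limit >= 50.0:
        return "CH" if on_or_above_a_line else "MH"
    if pi > 7.0 and on_or_above_a_line:
        return "CL"
    if 4.0 <= pi <= 7.0 and on_or_above_a_line:
        return "CL-ML"
    return "ML"


# Hypothetical lab result in the range quoted for soft bay clay
print(classify_fine_grained(liquid_limit=90, plastic_limit=35))
```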

Automated Borehole Log Interpretation

The more valuable application: automated identification of soil layer boundaries and generation of idealized soil profiles. A 1D convolutional neural network (1D-CNN) processes the N-value profile (N vs depth) along with available lab data and identifies layer transitions. On a test set of 200 boreholes where experienced geotechnical engineers manually defined layer boundaries, the CNN agreed with the engineer's interpretation 84% of the time — and on the 16% disagreement cases, the CNN's layering was judged "equally valid" by a third engineer in 60% of cases.
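To make the layering task concrete, here is a deliberately simple baseline, not the 1D-CNN: it scores each position in the N-profile by the jump between the mean N above and below, and reports local peaks of that score as layer boundaries. The window size and the minimum jump of 8 blows are assumptions chosen for illustration.

```python
from statistics import mean


def detect_layer_boundaries(depths_m, n_values, window=2, min_jump=8.0):
    """Return boundary depths where the N-profile shifts sharply.

    The score at index i is |mean(next `window` values) - mean(previous `window`)|.
    A boundary is reported at local peaks of the score that reach `min_jump`.
    """
    scores = {}
    for i in range(window, len(n_values) - window + 1):
        above = mean(n_values[i - window:i])
        below = mean(n_values[i:i + window])
        scores[i] = abs(below - above)

    boundaries = []
    for i, score in scores.items():
        left = scores.get(i - 1, 0.0)
        right = scores.get(i + 1, 0.0)
        if score >= min_jump and score >= left and score > right:
            # Place the boundary midway between the two adjacent test depths
            boundaries.append((depths_m[i - 1] + depths_m[i]) / 2.0)
    return boundaries


# Hypothetical borehole: soft clay over dense sand
depths = [1.5, 3.0, 4.5, 6.0, 7.5, 9.0, 10.5, 12.0]
n_profile = [3, 4, 3, 5, 22, 25, 24, 26]
print(detect_layer_boundaries(depths, n_profile))
```

A heuristic like this uses only N-values; the model described above also draws on lab data, which is where gradual transitions and thin interbeds would be expected to diverge.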

Foundation Type Selection: Correlating Soil Profile with Structural Loads

Foundation selection follows a well-established decision tree: isolated footings for light loads on competent soil, mat/raft for heavy loads or variable soil, deep foundations when bearing stratum is deep. But the "right" choice depends on the interaction between soil profile, structural loads, differential settlement tolerance, and site-specific constraints (water table, adjacent structures, construction access).

Open data/foundation-design-data.json — it contains: project_id, structure_type, column_loads_kn (array), soil_profile (array of layers with classification, depth, properties), water_table_m, selected_foundation_type, design_bearing_capacity_kpa, estimated_settlement_mm, pile_type (if applicable), pile_length_m, pile_diameter_mm.

Decision Support Model

A classification model (XGBoost) trained on 2,000+ foundation design records predicts the optimal foundation type:

Input features:
  max_column_load_kn
  load_variability (max/min column load ratio)
  bearing_stratum_depth_m (depth to first layer with N > 15 for clay, N > 30 for sand)
  soil_variability_index (coefficient of variation of N-values in top 10m)
  water_table_depth_m
  structure_type (residential/commercial/industrial/bridge)
  differential_settlement_limit_mm
  site_access_constraints (boolean flags: drill_rig_access, dewatering_feasible)
  seismic_design_category (ASCE 7 SDC A-F)

Output classes:
  spread_footing, combined_footing, mat_raft, drilled_shaft,
  driven_precast_pile, micropile, caisson
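Several of the input features are derived rather than read directly from the record. A minimal sketch of that derivation is shown here, assuming each SPT reading is available as a (depth_m, n_value, is_clay) tuple ordered by depth; that tuple layout and the function name are hypothetical, not the structure of the JSON file.

```python
from statistics import mean, pstdev


def foundation_features(column_loads_kn, spt_readings, top_depth_m=10.0):
    """Derive model inputs from column loads and an SPT profile.

    spt_readings: list of (depth_m, n_value, is_clay), ordered by depth.
    Assumes non-zero loads and a non-zero mean N in the top `top_depth_m`.
    """
    top_n = [n for depth, n, _ in spt_readings if depth <= top_depth_m]

    # Depth to the first reading with N > 15 (clay) or N > 30 (sand)
    bearing_depth = None
    for depth, n, is_clay in spt_readings:
        if n > (15 if is_clay else 30):
            bearing_depth = depth
            break

    return {
        "max_column_load_kn": max(column_loads_kn),
        "load_variability": max(column_loads_kn) / min(column_loads_kn),
        "bearing_stratum_depth_m": bearing_depth,
        "soil_variability_index": pstdev(top_n) / mean(top_n),
    }


readings = [(1.5, 4, True), (3.0, 6, True), (4.5, 8, True),
            (6.0, 18, True), (9.0, 35, False)]
print(foundation_features([800, 1200, 2400], readings))
```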

The model achieves 86% agreement with experienced foundation designers. The 14% disagreement is concentrated in borderline cases (mat vs piled-raft, isolated vs combined) where either choice is defensible. The model's value is not replacing judgment — it is flagging cases where the chosen foundation type is unusual for the given soil-load combination, prompting a second review.
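The second-review flag can be sketched as a comparison between the designer's choice and the model's class probabilities. The 10% cutoff below is an assumed value for illustration, and the probabilities are hypothetical stand-ins for whatever the trained classifier returns.

```python
def review_flag(class_probabilities, selected_type, min_probability=0.10):
    """Return (unusual, most_likely) for a designer's foundation choice.

    unusual is True when the selected type receives a predicted probability
    below `min_probability` (an assumed cutoff, to be tuned on real data).
    """
    most_likely = max(class_probabilities, key=class_probabilities.get)
    unusual = class_probabilities.get(selected_type, 0.0) < min_probability
    return unusual, most_likely


# Hypothetical model output for one project
probs = {"spread_footing": 0.05, "mat_raft": 0.55, "drilled_shaft": 0.40}
print(review_flag(probs, "spread_footing"))
print(review_flag(probs, "drilled_shaft"))
```

Note that the second call is not flagged even though drilled_shaft is not the top class: a defensible alternative should pass, and only a rare choice should prompt review.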

Soft Bay Clay: A Special Case

For projects on soft marine/bay deposits (San Francisco Bay Mud, Boston Blue Clay), the model has a dedicated sub-model. The soft clay (CH classification, liquid limit 80-120%, sensitivity 4-8) can extend to 15-25m depth. The sub-model learns that:

Spread footings fail differential settlement checks even for 2-story structures

Mat foundations require surcharge pre-loading for 6-12 months (consolidation settlement of 200-400mm)

Drilled shafts or driven piles to the underlying dense sand or bedrock at 20-30m depth are the default choice for structures above 5 stories

Bearing capacity must use consolidated-undrained (CU) parameters, not unconsolidated-undrained (UU) — using UU overestimates allowable bearing by 30-40%
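The 200-400mm settlement range quoted for mat foundations can be related to the standard one-dimensional primary consolidation expression for a normally consolidated clay, S = Cc / (1 + e0) x H x log10((sigma'0 + delta_sigma) / sigma'0). The sketch below treats the clay as a single layer with stresses taken at mid-depth, and every input value is an assumption chosen only to illustrate the arithmetic.

```python
import math


def primary_consolidation_mm(cc, e0, thickness_m, sigma_v0_kpa, delta_sigma_kpa):
    """Primary consolidation settlement (mm) of a normally consolidated clay layer.

    cc: compression index, e0: initial void ratio,
    sigma_v0_kpa: initial effective vertical stress at mid-layer,
    delta_sigma_kpa: stress increase at mid-layer from the foundation.
    """
    stress_ratio = (sigma_v0_kpa + delta_sigma_kpa) / sigma_v0_kpa
    settlement_m = (cc / (1.0 + e0)) * thickness_m * math.log10(stress_ratio)
    return settlement_m * 1000.0


# Assumed values for a soft clay layer, not taken from the dataset
s_mm = primary_consolidation_mm(cc=0.6, e0=1.8, thickness_m=8.0,
                                sigma_v0_kpa=40.0, delta_sigma_kpa=20.0)
print(round(s_mm))
```

A real calculation would split the deposit into sublayers and account for any overconsolidated crust; the single-layer form is only meant to show why settlements of this order arise.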

Slope Stability Analysis with Seismic Loading

Slope failures cause significant damage and casualties — the Pacific coast ranges, the Rockies, the Appalachians, and the Alpine regions of Europe are high-risk zones. ASCE 7 / Eurocode 8 seismic hazard mapping determines the pseudo-static seismic coefficient for slope stability analysis. But pseudo-static analysis with a single seismic coefficient is conservative for some slopes and unconservative for others — depending on slope height, soil type, and ground motion characteristics.

Open data/slope-stability-analysis.json — it contains: slope_id, location, slope_height_m, slope_angle_deg, soil_layers (with c, phi, gamma for each), water_table_condition, seismic_hazard_pga, failure_mode_predicted, fos_static, fos_pseudo_static, fos_newmark_displacement_mm, actual_failure_observed.
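To show how fos_static and fos_pseudo_static relate, here is a sketch using the dry infinite-slope model with a horizontal seismic coefficient kh. This is a much simpler idealization than the circular-surface methods behind the dataset, it ignores the water table, and the function and parameter names are illustrative.

```python
import math


def infinite_slope_fos(c_kpa, phi_deg, gamma_kn_m3, depth_m, slope_deg, kh=0.0):
    """Factor of safety for a dry infinite slope on a plane at `depth_m`.

    kh is the horizontal pseudo-static coefficient; kh = 0 gives the static case.
    """
    beta = math.radians(slope_deg)
    phi = math.radians(phi_deg)
    weight = gamma_kn_m3 * depth_m  # vertical stress per unit plan area, kPa
    # The horizontal inertia force reduces the normal stress and adds to the shear
    normal = weight * (math.cos(beta) ** 2 - kh * math.sin(beta) * math.cos(beta))
    shear = weight * (math.sin(beta) * math.cos(beta) + kh * math.cos(beta) ** 2)
    return (c_kpa + normal * math.tan(phi)) / shear


# Assumed soil and geometry for illustration
static = infinite_slope_fos(5.0, 32.0, 19.0, 3.0, 28.0)
seismic = infinite_slope_fos(5.0, 32.0, 19.0, 3.0, 28.0, kh=0.15)
print(round(static, 2), round(seismic, 2))
```

For a cohesionless soil with kh = 0 the expression reduces to tan(phi) / tan(beta), which is a quick check on the implementation.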

ML-Enhanced Slope Stability Screening

For regional slope hazard assessment (highway alignment selection, subdivision planning), running detailed FOS analysis on every slope is impractical. A screening model was trained on 5,000+ slope stability analyses (Fellenius, Bishop, Morgenstern-Price) plus observed failure/no-failure data. Its features, ranked by SHAP importance:

Feature	SHAP Importance
slope_angle_deg / friction_angle_deg ratio	0.24
saturation_condition (dry/partial/full)	0.19
seismic_hazard_pga	0.14
slope_height_m	0.12
cohesion_kpa	0.11
geological_formation	0.10
rainfall_intensity_mm_hr (design storm)	0.06
vegetation_cover	0.04

The screening model classifies slopes as stable (FOS > 1.5), marginal (1.0-1.5), or unstable (< 1.0) with 91% accuracy. For highway projects in mountainous corridors (Pacific coast routes, Alpine passes), the screening reduces detailed analysis requirements by 60% — focusing geotechnical effort on the 40% of slopes that are marginal or unstable.
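The three class labels are defined by FOS bands, so the labelling step and the share of slopes sent on for detailed analysis can be sketched as below. The function names are illustrative, and the FOS values in the example are invented to reproduce a 40% share.

```python
def stability_class(fos):
    """Map a factor of safety to the screening label used above."""
    if fos > 1.5:
        return "stable"
    if fos >= 1.0:
        return "marginal"
    return "unstable"


def detailed_analysis_share(fos_values):
    """Fraction of slopes that are marginal or unstable and need detailed analysis."""
    flagged = [f for f in fos_values if stability_class(f) != "stable"]
    return len(flagged) / len(fos_values)


# Hypothetical corridor with five slopes
corridor_fos = [1.8, 2.1, 1.6, 1.2, 0.9]
print([stability_class(f) for f in corridor_fos])
print(detailed_analysis_share(corridor_fos))
```

In screening, the model predicts the label from slope features without computing FOS; these bands only define what each label means.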

Rainfall-Triggered Failure Prediction

Many slopes fail during intense rainfall rather than during earthquakes. The mechanism is that infiltration raises the water table and pore water pressure, which reduces effective stress and hence shear strength. A threshold model captures the trigger:

Antecedent rainfall threshold:
  3-day cumulative rainfall > 200mm AND
  Daily intensity > 100mm/day AND
  Antecedent moisture index (30-day weighted rainfall) > 300mm

Combined with slope susceptibility (from screening model):
  High susceptibility + threshold exceeded → 78% probability of failure
  Medium susceptibility + threshold exceeded → 23% probability
  Low susceptibility + threshold exceeded → 4% probability
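The threshold and the susceptibility lookup above translate directly into code. In this sketch the three rainfall limits and the probabilities come from the text, while the function names and the choice to return None below the threshold are assumptions, since no probability is given for that case.

```python
# Failure probability once the rainfall threshold is exceeded, by susceptibility class
FAILURE_PROBABILITY = {"high": 0.78, "medium": 0.23, "low": 0.04}


def rainfall_threshold_exceeded(rain_3day_mm, daily_intensity_mm, antecedent_index_mm):
    """All three conditions must hold for the threshold to count as exceeded."""
    return (rain_3day_mm > 200
            and daily_intensity_mm > 100
            and antecedent_index_mm > 300)


def failure_probability(susceptibility, rain_3day_mm, daily_intensity_mm,
                        antecedent_index_mm):
    """Return the failure probability, or None when the threshold is not exceeded."""
    if not rainfall_threshold_exceeded(rain_3day_mm, daily_intensity_mm,
                                       antecedent_index_mm):
        return None  # no probability is stated below the threshold
    return FAILURE_PROBABILITY[susceptibility]


# Hypothetical forecast for one highway section
print(failure_probability("high", 240, 130, 350))
print(failure_probability("high", 240, 80, 350))
```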

This is operationalized for highway maintenance: when a NOAA/national weather service nowcast exceeds the threshold for a highway section with known high-susceptibility slopes, traffic advisories and preventive closures can be triggered.

Problem Soil Challenges: Regional Considerations

Expansive Clays

Expansive clays are found across the US Gulf Coast, the Front Range, and the arid Southwest, and among the over-consolidated clays of the UK and the Mediterranean. Free swell potential can be 50-150%. The AI model learns that for expansive soil, the design is governed by swell pressure (50-300 kPa) rather than bearing capacity. Drilled piers with void boxes or post-tensioned slabs (per PTI design guidance) are standard solutions, and the model recommends pier depth based on the depth to the non-active (constant-moisture) zone.

Residual and Saprolitic Soils

Piedmont residual soils and tropical lateritic soils show high in-situ N-values but degrade on saturation and disturbance. The model incorporates a "saturation sensitivity factor", the ratio of strength when saturated to strength at natural moisture content, as a critical feature. For these soils, allowable bearing from N-value correlations overpredicts by 25-35% compared to plate load test results.

Alluvial and Liquefiable Deposits

River valley and coastal alluvium typically consists of interbedded sand and clay layers with a shallow water table. The AI model identifies liquefiable layers (clean sand with N60 < 15, water table within 3m, moderate-to-high seismic hazard) using the simplified Seed-Idriss procedure that underlies liquefaction evaluation under ASCE 7 / Eurocode 8. It also flags projects where liquefaction assessment is mandatory but has been skipped in the geotechnical report.
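The screening criteria and the demand side of the simplified procedure can be sketched as follows. The cyclic stress ratio uses the familiar form CSR = 0.65 x (a_max/g) x (sigma_v / sigma'_v) x r_d with the Liao-Whitman approximation for r_d. The cutoffs for "clean sand" (fines of 5% or less) and "moderate-to-high hazard" (PGA of 0.10g or more) are assumptions, since the text gives no numbers for them.

```python
def stress_reduction_factor(depth_m):
    """Liao-Whitman approximation of r_d, intended for depths up to about 23 m."""
    if depth_m <= 9.15:
        return 1.0 - 0.00765 * depth_m
    return 1.174 - 0.0267 * depth_m


def cyclic_stress_ratio(pga_g, sigma_v_kpa, sigma_v_eff_kpa, depth_m):
    """Earthquake-induced cyclic stress ratio from the simplified procedure."""
    return (0.65 * pga_g * (sigma_v_kpa / sigma_v_eff_kpa)
            * stress_reduction_factor(depth_m))


def needs_liquefaction_assessment(n60, fines_pct, water_table_m, pga_g,
                                  max_fines_pct=5.0, min_pga_g=0.10):
    """Screening flag; the fines and PGA cutoffs are assumed, not from the text."""
    return (n60 < 15 and fines_pct <= max_fines_pct
            and water_table_m <= 3.0 and pga_g >= min_pga_g)


# Hypothetical sand layer at 5 m depth
print(needs_liquefaction_assessment(n60=9, fines_pct=3, water_table_m=1.5, pga_g=0.3))
print(round(cyclic_stress_ratio(0.3, 100.0, 60.0, 5.0), 3))
```

A full assessment would compare this CSR with a cyclic resistance ratio derived from corrected blow counts; the sketch covers only the screening flag and the demand term.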

Key Takeaways

SPT interpretation benefits from regional geological context — problem soils deviate systematically from textbook correlations. ML models trained on local data outperform global correlations by incorporating geological formation and index properties.

Foundation selection AI is a second-opinion tool — it does not replace engineering judgment but flags unusual choices that warrant review. The value is quality assurance, not automation.

Slope stability screening at regional scale is the high-leverage application — detailed analysis of every slope is impractical. ML screening focuses geotechnical effort where it matters most.

Soil-specific challenges (expansive clay, residual soils, soft marine clay) require sub-models — a single national model underperforms regional models that encode local failure mechanisms and design practices.

This is chapter 3 of AI for Civil & Infrastructure (Global).
