DATA, ANALYTICS & ARTIFICIAL INTELLIGENCE

SOCAR Tech's Data, Analytics & AI practice spans the full range from data infrastructure to production AI systems. Data Platforms delivers a centralized Lakehouse ingesting seismic, reservoir, financial, and operational data in real-time and batch modes. Data Governance covers ownership, quality, lineage, and retention, targeting nine common enterprise data problems. ML models cover refinery product quality prediction (T95, RON, RVP) and refinery optimization. A Should Cost prediction model serves procurement with a forecast error of ~8.6% SMAPE on held-out test data.

Data Platforms & Engineering

Enterprise Data Platform & Industrial Data Collection Framework

Category: Data Engineering   
Usage: Enterprise-wide; Industrial data integration

Centralized Lakehouse-based data platform supporting analytics, ML models, and AI workloads. Unified data collection from production historian systems in real-time and batch modes. Built on an open, decoupled architecture providing horizontal scalability, a single source of truth, fast and efficient processing, security, and governance.
•    Data domains: financial corporate data, industrial data, well information, reservoir data, emission data, seismic data, oil & gas production
•    Dashboards and analytical workloads
•    Machine learning models
•    AI and LLM workloads

Data Analytics & Business Intelligence

Business Intelligence Dashboards, Automated Reporting & Operations Visualization

Category: Data Analytics     
Usage: Enterprise-wide

Organizational performance monitoring and KPI tracking. Scheduled email reports containing KPI summaries, dashboard screenshots, and Excel exports. Unified UI platform consolidating upstream-to-downstream operations with KPI visibility.
•    Interactive visualizations with drill-down and filtering
•    Near real-time data refresh
•    Self-service analytics
•    Operational visibility: immediate detection of performance trends and anomalies
 

Real-Time Operations Monitoring & Safety Compliance Tools

Category: Data Analytics    
Usage: Downstream operations

Real-time monitoring solutions for flare tracking, safety compliance, and process alerts.

Data Governance

Data Governance Framework

Category: Data Governance   
Usage: Enterprise-wide

Comprehensive framework covering data strategy, accountability, quality, security, and retention policies. Addresses the nine most common data issues organizations face, ranging from unclear data location and meaning to unauthorized access and slow access-request processes.
•    Data Management Strategy — direction setting and investment focus
•    Data Accountability — ownership definition and responsibility
•    Data Prioritization — focus on highest-value data assets
•    Metadata & Data Lineage — describing what data exists and where it comes from
•    Data Quality — monitoring, risk management, and proactive improvement
•    Data Security — classification and protection of sensitive assets
•    Data Retention — aging policies and lifecycle governance
•    Processes, Standards & Policies — governance operating model
 

Machine Learning & Advanced Analytics

Refinery Product Quality Prediction & Process Unit Optimization

Category: Data Science – ML & Optimization   
Usage: Downstream refinery operations

Machine learning models predicting product quality parameters (T95, RON, RVP) in real-time to optimize refinery blending operations. Optimization models for blending different refinery streams to meet product specifications at minimum cost. Predictive and optimization models for refinery process units (hydrocracking, coking, reforming, hydrogen generation).
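
As a minimal illustration of least-cost blending, the sketch below blends two hypothetical streams under the simplifying assumption that the property (an octane-like number here) blends linearly by volume. The stream names, property values, and costs are invented for illustration. A production model would cover many streams, several specifications, and non-linear blending behavior, and would rely on an LP or NLP solver rather than this closed-form case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    name: str
    quality: float  # property value, ASSUMED to blend linearly by volume
    cost: float     # cost per unit volume


def cheapest_blend(a: Stream, b: Stream, min_quality: float) -> dict:
    """Return volume fractions of a and b that meet min_quality at minimum cost.

    With two streams and one linear spec, the LP has a single variable x
    (fraction of a), so the optimum sits at one end of the feasible interval.
    """
    lo, hi = 0.0, 1.0  # feasible interval for x
    dq = a.quality - b.quality
    if dq == 0:
        if b.quality < min_quality:
            raise ValueError("specification cannot be met")
    else:
        # x*qa + (1-x)*qb >= spec  <=>  x*dq >= spec - qb
        bound = (min_quality - b.quality) / dq
        if dq > 0:
            lo = max(lo, bound)
        else:
            hi = min(hi, bound)
    if lo > hi:
        raise ValueError("specification cannot be met")
    # Cost is linear in x: take the cheaper end of the feasible interval.
    x = lo if a.cost >= b.cost else hi
    return {
        a.name: x,
        b.name: 1 - x,
        "cost": x * a.cost + (1 - x) * b.cost,
        "quality": x * a.quality + (1 - x) * b.quality,
    }


# Hypothetical streams: illustrative numbers only.
result = cheapest_blend(
    Stream("reformate", quality=98.0, cost=700.0),
    Stream("naphtha", quality=80.0, cost=500.0),
    min_quality=92.0,
)
```

In this example the cheaper stream cannot meet the specification alone, so the result uses just enough of the more expensive stream to reach it.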

Energy Network & Plant Efficiency Optimization

Category: Data Science – ML & Optimization   
Usage: Downstream petrochemical operations

Linear optimization models for plant energy networks including steam and electricity balancing. AI-based recommendation engine identifying operational drivers of energy consumption variability.

AI Products & Intelligent Systems

SOCAR LLM — On-Premises Chatbot Infrastructure

Category: Data Science – Deep Learning  
Usage: Carbamide, Caspian AI Institute, Procurement

Unique on-premises chatbot infrastructure for SOCAR Group addressing the ‘Lost in Documentation’ challenge. 
•    Fully on-premises — no cloud dependency
 

Gauge Reading Detection — Automated Industrial Gauge Monitoring

Category: Data Science – Deep Learning   
Usage: Downstream – Carbamide; Scalable to any downstream asset

AI-based automated gauge reading solution that eliminates manual on-site data collection. Reduces personnel hazard exposure and feeds real-time readings into central analytics platforms.
•    Gauge detection model identifying instruments in field imagery
•    Needle keypoint-based reading system for accurate value extraction
•    Automated gauge monitoring platform with shift-based assignment and reporting
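
The keypoint-based reading step can be sketched as follows. This is a simplified illustration, not the deployed method: it assumes a circular gauge with a linear scale that increases clockwise, and that a keypoint model has already returned pixel coordinates for the dial center, the needle tip, and the two scale ends. All function and parameter names are hypothetical.

```python
import math


def needle_angle(center, point):
    """Angle of `point` around `center` in degrees, clockwise from straight up.

    Points are (x, y) image pixels with y growing downward. Result is in [0, 360).
    """
    dx = point[0] - center[0]
    dy = center[1] - point[1]  # flip so that "up" is positive
    return math.degrees(math.atan2(dx, dy)) % 360.0


def gauge_reading(center, tip, scale_min_pt, scale_max_pt, min_value, max_value):
    """Linearly interpolate the reading from needle and scale-end keypoints.

    ASSUMES a linear scale increasing clockwise from scale_min_pt to scale_max_pt.
    A needle outside the scale arc is not clamped in this sketch.
    """
    start = needle_angle(center, scale_min_pt)
    sweep = (needle_angle(center, scale_max_pt) - start) % 360.0
    if sweep == 0:
        raise ValueError("scale start and end keypoints coincide in angle")
    travelled = (needle_angle(center, tip) - start) % 360.0
    return min_value + (travelled / sweep) * (max_value - min_value)


# Hypothetical 0-10 gauge with a 270 degree scale and the needle pointing straight up.
reading = gauge_reading(
    center=(100, 100), tip=(100, 40),
    scale_min_pt=(50, 150), scale_max_pt=(150, 150),
    min_value=0.0, max_value=10.0,
)
```

With the needle halfway along the scale arc, the interpolated reading is the midpoint of the value range.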
 

First Break Picking QC Automation

Category: Data Science – Deep Learning
Usage: Upstream / Exploration

Autonomous AI agent working within industry-standard seismic processing software to automate routine quality control (QC) in the First Break Picking (FBP) phase. Targets the 40% of FBP time that is spent on QC, in surveys that can take up to 6 months to process.
•    Interaction agent converts raw geophysical data into status updates
•    Synthetic geophysicist identifies trends and anomalies, and scores FBP images
 

Oil Pump Pattern Analysis — Dynamogram Classification

Category: Data Science – Deep Learning   
Usage: Upstream – Oil Production

Automated system classifying pump operating states from dynamograms using shape recognition and statistical features. Eliminates daily manual inspection of hundreds of well curves by field engineers.
•    Shape classification: gas interference, valve leakage, sucker rod breaks, insufficient liquid supply
•    Multi-format automated reports
•    Automated email alerts for abnormal conditions with confidence scores
•    Monitoring of hundreds of wells simultaneously without additional manpower
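
To make "statistical features" concrete, the sketch below computes a few simple descriptors from one closed dynamogram curve. The feature set is an illustrative assumption, not the set used by the system described above, and the function name is hypothetical. When load is a force and position a length, the enclosed area corresponds to the work done per stroke.

```python
def card_features(points):
    """Basic shape/statistical features of a closed dynamogram.

    points: ordered (position, load) samples tracing one pump stroke.
    The feature choice here is illustrative only.
    """
    points = list(points)
    if len(points) < 3:
        raise ValueError("need at least three points")
    positions = [p[0] for p in points]
    loads = [p[1] for p in points]
    # Shoelace formula for the area enclosed by the curve.
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return {
        "area": abs(twice_area) / 2.0,
        "load_range": max(loads) - min(loads),
        "stroke_length": max(positions) - min(positions),
        "mean_load": sum(loads) / len(loads),
    }


# Idealized rectangular card: invented numbers for illustration.
features = card_features([(0, 10), (2, 10), (2, 30), (0, 30)])
```

Features like these could be fed, alongside shape descriptors, into a classifier that outputs a pump state and a confidence score.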
 

Data Processing — Daily Drilling Reports

Category: Data Science
Usage: Upstream – Drilling

AI agent pipeline automating the extraction of daily drilling report data from unstructured Excel files, handling cases that conventional RPA tools cannot.
•    Handles merged cells, free-form comments, unstructured sections, and inconsistent Excel formatting
•    Pipeline: preprocessing & CSV conversion → LLM prompt construction → LLM extraction → final mappings and data unification
•    Processes 37,000+ daily drilling reports
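
The four pipeline stages can be sketched as below. This is a minimal outline under stated assumptions: the Excel sheet has already been read into a list of rows, the target schema (FIELDS) is hypothetical, and the LLM client is passed in as a plain function because the actual model and API are not specified here. Handling of merged cells and other formatting issues is omitted.

```python
import csv
import io
import json
from typing import Callable

FIELDS = ["well", "date", "depth_m"]  # hypothetical target schema


def rows_to_csv(rows) -> str:
    """Preprocessing: drop fully empty rows and serialize the rest as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        cells = ["" if c is None else str(c).strip() for c in row]
        if any(cells):
            writer.writerow(cells)
    return buf.getvalue()


def build_prompt(csv_text: str) -> str:
    """Prompt construction: request one JSON object with the target fields."""
    return (
        "Extract the following fields from this daily drilling report and "
        f"answer with one JSON object with keys {FIELDS}. Use null if missing.\n\n"
        + csv_text
    )


def unify(raw_answer: str) -> dict:
    """Final mapping: parse the answer and keep only the known fields."""
    data = json.loads(raw_answer)
    return {field: data.get(field) for field in FIELDS}


def extract_report(rows, call_llm: Callable[[str], str]) -> dict:
    """Run the four stages; call_llm is whatever client the deployment uses."""
    return unify(call_llm(build_prompt(rows_to_csv(rows))))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    return '{"well": "W-1", "date": "2024-01-05", "depth_m": 1520, "extra": "x"}'


sample_rows = [["Well", "W-1", None], [None, None, None], ["Depth (m)", 1520, " drilled ahead "]]
record = extract_report(sample_rows, fake_llm)
```

Keeping the model call behind a function argument makes each stage testable without a live LLM.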

 

AI-Driven Business Applications

Should Cost Prediction — AI-Powered Procurement Price Forecasting

Category: Data Science – AI / Procurement   
Usage: SOCAR Group Procurement; Materials management

AI-powered time-series price forecasting model providing an approximate ‘should cost’ for a material given its purchase history and economic conditions — enabling procurement teams to negotiate from an informed, data-driven position. Replaces the traditional Weighted Moving Average (WMA) method with enriched, machine-learning-driven forecasts.
•    Incorporates economic indicators: CPI (Consumer Price Index), PPI (Producer Price Index), and LCI (Labour Cost Index)
•    Uses public commodity data: gold, Brent crude, nickel, steel, and 7 additional series
•    Learns complex, non-linear patterns beyond simple weighted averages
•    Performs well even with few or no recent purchase observations
•    Model types: machine learning models and neural networks
•    Output: price forecast with ±2×MAE (mean absolute error) confidence bands
•    Demonstrated test error: ~8.6% SMAPE (symmetric mean absolute percentage error) on held-out test data
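
For reference, the quantities named above can be computed as in the sketch below. It assumes the common SMAPE definition with the mean of absolute actual and forecast values in the denominator (other variants exist, and the one used for the reported figure is not stated), and it interprets the band as forecast ± 2×MAE. The weighted moving average is included as the baseline the model replaces. All numbers are invented for illustration.

```python
def weighted_moving_average(prices, weights):
    """Baseline forecast: weighted mean of the last len(weights) prices.

    weights are ordered oldest to newest and need not sum to 1.
    """
    if not weights or len(prices) < len(weights):
        raise ValueError("need at least as many prices as weights")
    recent = prices[-len(weights):]
    return sum(p * w for p, w in zip(recent, weights)) / sum(weights)


def smape(actual, forecast):
    """Symmetric MAPE in percent: mean of |F - A| / ((|A| + |F|) / 2)."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        terms.append(0.0 if denom == 0 else abs(f - a) / denom)
    return 100.0 * sum(terms) / len(terms)


def mae(actual, forecast):
    """Mean absolute error in the units of the price."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def forecast_band(point_forecast, mae_value, k=2.0):
    """Lower and upper band: forecast -/+ k * MAE (k = 2 in the text above)."""
    return point_forecast - k * mae_value, point_forecast + k * mae_value


# Invented numbers for illustration.
baseline = weighted_moving_average([10.0, 12.0, 14.0], weights=[1, 2, 3])
actual, predicted = [100.0, 200.0], [110.0, 180.0]
error_pct = smape(actual, predicted)
band = forecast_band(150.0, mae(actual, predicted))
```

Because SMAPE is a relative error, it allows comparison across materials with very different price levels, while the MAE-based band stays in the currency units a buyer negotiates in.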