Evidence-backed analysis across 20 specific tasks. Capability claims sourced from peer-reviewed research, independent benchmarks, and industry data. Adoption rates tracked by industry and company size.
At a glance
Early Signal intelligence
Tasks tracked
Signals in database
Intelligence confidence
Last updated
AI Exposure
Defensibility
Avg Capability
20/20 tasks with evidence
Avg Deployment
177 evidence sources
What's changing for Data Analysts
Data Analyst hiring has softened from its 2021–2022 peak but remains structurally healthy. The role is bifurcating: pure report-pulling positions are contracting as self-serve BI tools mature and LLM-assisted querying lowers the floor for business users. Simultaneously, demand is growing for analysts who combine SQL and Python proficiency with sharp business judgment — roles that effectively function as embedded decision-support partners to product, commercial, or ops teams. Compensation premiums are concentrating in tech, fintech, and healthcare; mid-market firms are increasingly hiring one or two senior analysts rather than tiered teams. Looker and dbt knowledge now appear in a majority of senior job postings in tech verticals, signalling a shift toward analysts owning transformation logic rather than relying on engineering pipelines. Generative AI tools (GitHub Copilot, ChatGPT Code Interpreter) are accelerating output on SQL generation and EDA, raising the expected throughput per analyst rather than eliminating the role. Analysts who cannot translate findings into business recommendations — and defend them in the room — are losing ground to those who can.
Synthesised by claude-sonnet-4-6 · refreshed May 21, 2026
Capability dimensions
How the dimensions of this role are being reshaped by AI · top 8 by weight
Quantitative Reasoning
Insight Generation
Data Quality Judgment
Metric Definition
Data Modelling & Transformation
Data Storytelling
Structured Analysis
Data Acquisition Judgment
Market Context
ChatGPT Advanced Data Analysis (Code Interpreter), Tableau AI, Snowflake Cortex AI, and Databricks Genie now handle natural language querying, automated EDA, dashboard generation, and standard reporting. 'What happened' descriptive analytics is near-fully automatable. Agentic data loops (evaluate → adjust → re-run) make the traditional analyst bottleneck largely avoidable for standard business questions. The 'so what' layer — connecting data to strategic decisions — remains human-critical. BLS projects BI analyst roles declining while ML engineer roles grow 23% through 2032.
Source: Based on BLS Occupational Outlook 2025, Tableau AI adoption survey 2025, McKinsey Analytics Benchmark 2025, and Snowflake Cortex AI feature release data.
Task breakdown
Top tasks per pressure tier
Medium automation pressure · 9
SQL Query Writing
Demonstrates agentic approach to SQL generation that improves upon standard LLM capabilities through iterative refinement and error correction.
Exploratory Data Analysis
DeepSeek V4 can perform data analysis tasks at a level competitive with leading US AI systems.
Data Model Documentation
Integration of real Street View data into world models improves robotic environment understanding and generalization to real-world spaces.
ETL & Data Pipeline Development
Embedded agentic AI managing distributed data pipeline failures autonomously indicates incremental improvement in computer-use capability for complex system management.
Data Cleaning & Preparation
LLMs demonstrate strong capability in generating pandas and data transformation code for cleaning tasks including type conversion, missing value imputation, deduplication, and format standardisation. The Anthropic Econom…
Low automation pressure · 11
Data Quality Monitoring
LLMs and AI systems demonstrate strong capability in rule-based data quality checks including schema validation, null detection, type checking, range validation, and statistical anomaly detection. Traditional ML anomaly …
Data Source Evaluation
Self-service analytics agent enables autonomous query generation and analysis across distributed data sources, improving reasoning capability in data interpretation workflows.
Role Defensibility Profile
Higher = harder to automate
Task-Level Analysis — 20 Tasks
Design and build interactive dashboards and recurring reports in tools like Tableau, Power BI, or Looker that surface key metrics and enable self-service data exploration by stakeholders.
Highest Exposure Areas
Analysis / Reporting
Standard analysis and reporting is already being absorbed by AI at the enterprise level. McKinsey notes that analysis tasks show some of the sharpest automation increases. The defensible remainder is interpretation that requires proprietary context, and that window is closing.
Hands-On Technical Execution
41% of code written in 2025 is AI-generated. The defensible technical work is system architecture, novel problem-solving, and integration of AI tools — not execution of known patterns. Standard technical execution is being absorbed at an accelerating rate.
Writing / Summarising / Documentation
GPT-5 Deep Research and Claude already produce publication-quality reports, emails, and documentation. By 2027, AI writing assistants will handle first-draft creation for virtually all standard business documents with minimal human input.
Strongest Defenses
Analysis / Reporting
Standard analysis and reporting is already being absorbed by AI at the enterprise level. McKinsey notes that analysis tasks show some of the sharpest automation increases. The defensible remainder is interpretation that requires proprietary context, and that window is closing.
Hands-On Technical Execution
41% of code written in 2025 is AI-generated. The defensible technical work is system architecture, novel problem-solving, and integration of AI tools — not execution of known patterns. Standard technical execution is being absorbed at an accelerating rate.
Customer / Stakeholder Communication
AI agents are now handling routine customer communication autonomously. The protection in this task comes from novel relationship context and trust — which erodes when your client interactions become standardised or when AI gains sufficient context to replicate the pattern.
What this means for data analysts
The role-average exposure profile above is built on early signals — directionally useful but not yet corroborated across independent sources. Your specific task mix and tooling matter more than the role average here. Get a personal task-level breakdown rather than relying on the headline number.
How we build role intelligence
Runway maintains an atomic task taxonomy (20 tasks tracked for Data Analyst) anchored to O*NET occupational data. Per-task signals enter through tier-graded connectors (peer-reviewed papers, statutory labour data, vendor benchmarks, preprints) and pass through the Sentinel auditor — every claim is rubric-scored, cross-checked, and confidence-graded before it can affect a role page. The narrative and task breakdown above are computed from that ledger; nothing is synthesised from first principles. See /methodology for the full pipeline.
Confidence level: Early Signal — based on 0 validated signals for this role across the Sentinel-graded sources we track.
Statistical Analysis
LLMs can perform standard statistical analyses including regression, hypothesis testing, ANOVA, and correlation analysis by generating correct code in Python/R. The Stanford HAI AI Index 2024 documents strong LLM perform…
Peer Methodology Review
GitHub's updated impact study shows 46% of all code is now AI-generated among Copilot users, with 82% developer satisfaction. For tasks like Peer Methodology Review, AI coding assistants demonstrate 66% quality on routin…
Tool & Pipeline Maintenance
Demonstrates practical tool integration for domain-specific data access but no deployment scale metrics provided.
Ad-Hoc Analysis Requests
Gemini in Google Sheets achieved state-of-the-art performance for analyzing complex data and could automate many spreadsheet-based workflows.
A/B Test Analysis
GitHub's updated impact study shows 46% of all code is now AI-generated among Copilot users, with 82% developer satisfaction. For tasks like A/B Test Analysis, AI coding assistants demonstrate 69% quality on routine impl…
Insight Narrative Writing
Codex shows capability to structure analytical findings and generate data-driven memos at scale, representing minor incremental improvement in analytical reasoning workflows.
Dashboard & Report Building
AI agents can compromise software supply chains through tools like LiteLLM
Cross-Functional Data Support
Google Finance expansion demonstrates incremental improvement in multimodal financial reasoning across European markets with localization support.
Forecast Modelling
LLMs can generate code for standard time series forecasting methods (ARIMA, Prophet, exponential smoothing) and assist with feature engineering for predictive models. The Stanford HAI AI Index 2024 documents improving AI…
Data Governance & Compliance
AI agent can perform executive-level governance tasks such as policy authoring and management
Metric Definition & Alignment
AI can analyze and align evaluation metrics to better reflect authentic model capabilities rather than benchmark gaming
Presentation of Findings
Enhanced language modeling could improve AI-assisted generation of clear, structured presentations of analytical findings
Stakeholder Requirement Gathering
AI agents can perform routine data gathering tasks autonomously in business contexts
Capability Evidence
Perform simulation and optimization tasks in building automation and energy management
Anthropic's study of real-world Claude usage across millions of professional conversations found that tasks related to Dashboard & Report Building represent a significant category of AI-augmented work...
Eloundou et al. classify report and document creation as high-exposure tasks (E2 category), where LLMs with tool access can reduce time by at least 50%. However, dashboard building involves iterative ...
Deployment by Industry
Write and optimise SQL queries to extract, aggregate, and join data from relational databases and data warehouses for analysis, reporting, and ad-hoc investigations.
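As a rough illustration of the extract, join, and aggregate pattern described above, the sketch below runs a query against an in-memory SQLite database. The `customers` and `orders` tables, their columns, and the sample rows are hypothetical and exist only for this example; a real warehouse would differ in dialect and scale.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            amount REAL
        );
        INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC'), (3, 'EMEA');
        INSERT INTO orders VALUES
            (1, 1, 120.0), (2, 2, 80.0), (3, 3, 50.0), (4, 2, 40.0);
        """
    )
    return conn


def revenue_by_region(conn):
    """Join orders to customers, then aggregate order count and revenue per region."""
    query = """
        SELECT c.region, COUNT(o.id) AS orders, SUM(o.amount) AS revenue
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        GROUP BY c.region
        ORDER BY revenue DESC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for region, orders, revenue in revenue_by_region(build_demo_db()):
        print(region, orders, revenue)
```

With the sample rows above, the query returns EMEA first (2 orders, 170.0) and APAC second (2 orders, 120.0).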
Capability Evidence
Demonstrates agentic approach to SQL generation that improves upon standard LLM capabilities through iterative refinement and error correction.
Neural-symbolic logic query answering could improve reasoning over incomplete knowledge graphs, potentially enhancing complex SQL query construction and optimization
The Anthropic Economic Index identifies SQL and database query generation as among the most frequent coding tasks performed by Claude in professional settings. Programming and code generation — includ...
Deployment by Industry
Clean, transform, and standardise raw data from multiple sources — handling missing values, deduplication, format inconsistencies, and schema alignment to produce analysis-ready datasets.
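The cleaning steps named above (format standardisation, missing-value handling, deduplication) can be sketched with the standard library alone. This is a minimal illustration, not a recommended pipeline: the field names, the two accepted date formats, median imputation, and the `(customer, date)` duplicate key are all assumptions made for the example.

```python
from datetime import datetime
from statistics import median

# Assumed input date formats; real sources will need their own list.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Return an ISO date string, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(rows):
    """Standardise formats, impute missing amounts, and drop duplicates."""
    normalised = []
    for row in rows:
        amount = row.get("amount")
        normalised.append(
            {
                "customer": row["customer"].strip().lower(),
                "date": parse_date(row["date"]),
                "amount": float(amount) if amount not in (None, "") else None,
            }
        )

    # Median imputation is one simple choice; it is computed before deduplication.
    known = [r["amount"] for r in normalised if r["amount"] is not None]
    fill = median(known) if known else None

    seen, cleaned = set(), []
    for r in normalised:
        if r["amount"] is None:
            r["amount"] = fill
        key = (r["customer"], r["date"])  # hypothetical duplicate key
        if key not in seen:  # keep the first occurrence only
            seen.add(key)
            cleaned.append(r)
    return cleaned
```

Whether median imputation or first-occurrence deduplication is appropriate depends on the dataset; the point is only that each step is explicit and testable.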
Capability Evidence
LLMs demonstrate strong capability in generating pandas and data transformation code for cleaning tasks including type conversion, missing value imputation, deduplication, and format standardisation. ...
ChatGPT can create data visualizations from datasets
AltimateAI assists with data cleaning and preparation tasks as part of its comprehensive data engineering harness
Deployment by Industry
Investigate datasets to identify patterns, anomalies, distributions, and correlations — forming initial hypotheses and identifying promising directions for deeper analysis.
Capability Evidence
Perform geospatial analysis beyond vector-only limitations
Enhanced reasoning on incomplete knowledge graphs could improve AI's ability to explore and discover patterns in incomplete datasets
LLMs can generate standard exploratory data analysis workflows including summary statistics, distribution plots, correlation matrices, and outlier detection. However, identifying genuinely novel or bu...
Deployment by Industry
Respond to time-sensitive, one-off analytical requests from stakeholders — quickly pulling data, running calculations, and delivering concise answers to specific business questions.
Capability Evidence
LLMs can handle well-specified ad hoc data questions by generating appropriate SQL queries, running calculations, and producing summary results. The Anthropic Economic Index shows data analysis and qu...
Eloundou et al. classify data analysis tasks as having moderate-to-high LLM exposure (E1/E2), noting that LLMs can reduce time on structured analytical tasks but that ad hoc requests often require und...
Amazon Bedrock multimodal models enable automated video insights extraction for specific business questions that previously required human reviewers
Deployment by Industry
Present analytical results to stakeholders and leadership — creating slide decks, leading walkthroughs, answering questions, and defending methodology and conclusions in real time.
Capability Evidence
The Anthropic Economic Index shows minimal professional AI usage for tasks requiring physical presence, live interaction, and social persuasion. Presentation delivery combines embodied communication, ...
AI agent can autonomously perform multi-step presentation tasks in PowerPoint
Enhanced language modeling could improve AI-assisted generation of clear, structured presentations of analytical findings
Deployment by Industry
Meet with business stakeholders to understand their analytical needs — translating vague business questions into specific, answerable data questions with defined scope and success criteria.
Capability Evidence
Anthropic's study of real-world Claude usage across millions of professional conversations found that tasks related to Stakeholder Requirement Gathering represent a significant category of AI-augmente...
The Anthropic Economic Index shows that interpersonal, relationship-dependent professional tasks represent a minimal share of AI usage. Requirement gathering involves trust-building, reading implicit ...
LLMs can assist with structuring requirement documents and generating question templates for stakeholder interviews, but the core task of eliciting unstated needs, navigating organisational politics, ...
Deployment by Industry
Translate analytical findings into clear, written narratives with business context — explaining what the data shows, why it matters, and what actions it suggests, for non-technical audiences.
Capability Evidence
Codex shows capability to structure analytical findings and generate data-driven memos at scale, representing minor incremental improvement in analytical reasoning workflows.
Anthropic's study of real-world Claude usage across millions of professional conversations found that tasks related to Insight Narrative Writing represent a significant category of AI-augmented work. ...
LLMs demonstrate strong capability in drafting structured analytical narratives from data findings, producing well-organised executive summaries, key takeaway sections, and recommendation frameworks. ...
Deployment by Industry
Apply statistical methods — hypothesis testing, regression analysis, significance testing, confidence intervals — to validate findings and quantify relationships in data.
Capability Evidence
Perform geospatial analysis beyond vector-only limitations
LLMs can perform standard statistical analyses including regression, hypothesis testing, ANOVA, and correlation analysis by generating correct code in Python/R. The Stanford HAI AI Index 2024 document...
Can perform genomics analysis
Deployment by Industry
Design, monitor, and analyse A/B tests and experiments — calculating sample sizes, checking statistical significance, identifying segment-level effects, and recommending ship/no-ship decisions.
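The sample-size and significance calculations mentioned above can be sketched as follows. This assumes a conversion-rate metric, a pooled two-proportion z-test, and the standard normal-approximation sample-size formula; the function names and defaults (`alpha=0.05`, `power=0.8`) are illustrative choices, not a prescribed method.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def sample_size_per_arm(baseline, min_effect, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect an absolute lift of min_effect."""
    p1, p2 = baseline, baseline + min_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    print(sample_size_per_arm(baseline=0.10, min_effect=0.02))
    print(two_proportion_z_test(100, 1000, 150, 1000))
```

The arithmetic is the routine part. Segment-level effects and the ship/no-ship recommendation still depend on context the code does not see.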
Capability Evidence
LLMs can correctly perform standard A/B test significance calculations, compute confidence intervals, and generate analysis code for common experimental designs. The Stanford HAI AI Index 2024 documen...
Eloundou et al. classify quantitative analysis tasks including experimental analysis as having moderate-to-high LLM exposure. Standard statistical test execution is well within LLM capability, but the...
GitHub's updated impact study shows 46% of all code is now AI-generated among Copilot users, with 82% developer satisfaction. For tasks like A/B Test Analysis, AI coding assistants demonstrate 69% qua...
Deployment by Industry
Monitor data pipelines and sources for quality issues — detecting schema changes, missing data, unexpected nulls, anomalous values — and escalating or fixing problems before they affect downstream analysis.
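A rule-based version of the checks described above (schema changes, unexpected nulls, anomalous values) might look like the sketch below. The expected schema, the column names, and the valid amount range are hypothetical; production monitoring would usually load these rules from configuration and route issues to an alerting system.

```python
# Hypothetical expected schema for an incoming batch of order records.
EXPECTED_SCHEMA = {"order_id": int, "amount": float, "country": str}


def check_batch(rows, schema=EXPECTED_SCHEMA, amount_range=(0.0, 10_000.0)):
    """Return a list of (row_index, message) tuples for every issue found."""
    issues = []
    low, high = amount_range
    for i, row in enumerate(rows):
        # Schema drift: columns that disappeared or appeared unexpectedly.
        missing = schema.keys() - row.keys()
        extra = row.keys() - schema.keys()
        if missing or extra:
            issues.append(
                (i, f"schema drift: missing={sorted(missing)} extra={sorted(extra)}")
            )

        # Null and type checks for the columns that are present.
        for col, expected_type in schema.items():
            if col not in row:
                continue
            value = row[col]
            if value is None:
                issues.append((i, f"null in {col}"))
            elif not isinstance(value, expected_type):
                issues.append((i, f"{col} has type {type(value).__name__}"))

        # Range check on a numeric column.
        amount = row.get("amount")
        if isinstance(amount, float) and not low <= amount <= high:
            issues.append((i, f"amount out of range: {amount}"))
    return issues
```

An empty list means the batch passed every rule; anything else can be escalated before downstream analysis consumes the data.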
Capability Evidence
Real-time verification system for RAG systems can automatically verify document-based responses and citations, reducing manual verification work for data quality monitoring
AIDABench provides evaluation standards for document understanding and processing that could improve assessment of data quality in document-based datasets
A systematic literature review of LLMs for code review found that AI detects 30-60% of code defects identified by human reviewers. For tasks like Data Quality Monitoring, AI-assisted review achieves a...
Deployment by Industry
Provide analytical support across multiple teams — helping marketing, product, finance, and operations answer data questions, validate assumptions, and make data-informed decisions.
Capability Evidence
Google Finance expansion demonstrates incremental improvement in multimodal financial reasoning across European markets with localization support.
Multi-agent LLM systems can provide training support for behavioral health professionals
Anthropic's study of real-world Claude usage across millions of professional conversations found that tasks related to Cross-Functional Data Support represent a significant category of AI-augmented wo...
Deployment by Industry
Document data models, table definitions, field mappings, and data lineage — maintaining a shared understanding of what data exists, where it comes from, and how it should be used.
Capability Evidence
Integration of real Street View data into world models improves robotic environment understanding and generalization to real-world spaces.
AI systems can generate code for complex, multi-panel visualizations from real-world data using vision-language models
Fine-tuned large language model can automate systematic review screening by reviewing titles and abstracts for inclusion decisions
Deployment by Industry
Build and maintain forecasting models for business metrics — revenue projections, demand forecasting, churn prediction — using time series analysis and regression techniques.
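As a small illustration of the time series techniques referred to above, here is a sketch of simple exponential smoothing, the most basic member of the exponential smoothing family. It assumes a series with no trend or seasonality and produces a flat forecast; the function name and the default `alpha` are illustrative.

```python
def exponential_smoothing_forecast(series, alpha=0.3, horizon=3):
    """Simple exponential smoothing with a flat forecast over `horizon` periods.

    alpha near 1 weights recent observations heavily; alpha near 0 smooths more.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    level = series[0]  # initialise the level with the first observation
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


if __name__ == "__main__":
    monthly_revenue = [100, 104, 101, 110, 108]  # hypothetical figures
    print(exponential_smoothing_forecast(monthly_revenue, alpha=0.5, horizon=2))
```

Series with trend or seasonality would need a richer model, such as Holt-Winters or ARIMA, rather than this flat-level sketch.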
Capability Evidence
LLMs can generate code for standard time series forecasting methods (ARIMA, Prophet, exponential smoothing) and assist with feature engineering for predictive models. The Stanford HAI AI Index 2024 do...
The Claude system card reports near-expert performance on graduate-level reasoning (GPQA), professional coding (SWE-bench), and document analysis tasks. For Forecast Modelling, Claude demonstrates app...
OpenAI's o1 system card demonstrates significant advancement in complex reasoning tasks, achieving 83rd percentile on Codeforces and 93rd percentile on AMC math competitions. For analytical aspects of...
Deployment by Industry
Maintain and troubleshoot analytical tools, data pipelines, and automated reporting systems — updating configurations, fixing broken jobs, and ensuring reliable data delivery.
Capability Evidence
Demonstrates practical tool integration for domain-specific data access but no deployment scale metrics provided.
Anthropic's study of real-world Claude usage across millions of professional conversations found that tasks related to Tool & Pipeline Maintenance represent a significant category of AI-augmented work...
LLMs demonstrate capability in debugging data pipeline code, generating configuration files, and diagnosing common failure modes in ETL and analytics tooling. The Anthropic Economic Index shows that d...
Deployment by Industry
Define, standardise, and document business metrics — ensuring consistent calculation methods, resolving conflicting definitions across teams, and maintaining a shared metric dictionary.
Capability Evidence
The Anthropic Economic Index shows minimal professional AI usage for tasks requiring organisational consensus-building and cross-functional alignment. Metric definition alignment requires understandin...
LLMs can suggest standard metric definitions and KPI frameworks for common business contexts, but the core task of aligning stakeholders on what metrics mean, resolving conflicting definitions across ...
AI can analyze and align evaluation metrics to better reflect authentic model capabilities rather than benchmark gaming
Deployment by Industry
Build and maintain ETL pipelines that extract data from source systems, transform it into analytical models, and load it into data warehouses for reporting and analysis.
Capability Evidence
Embedded agentic AI managing distributed data pipeline failures autonomously indicates incremental improvement in computer-use capability for complex system management.
AI systems can generate code for complex, multi-panel visualizations from real-world data using vision-language models
kRAIG can automate the generation of ETL workflows through natural language instructions, reducing manual work required from data engineers
Deployment by Industry
Evaluate new data sources for reliability, completeness, and analytical value — assessing vendor data, API feeds, and internal instrumentation to determine whether they meet quality standards.
Capability Evidence
Self-service analytics agent enables autonomous query generation and analysis across distributed data sources, improving reasoning capability in data interpretation workflows.
A systematic literature review of LLMs for code review found that AI detects 30-60% of code defects identified by human reviewers. For tasks like Data Source Evaluation, AI-assisted review achieves ap...
Eloundou et al. classify data assessment tasks as having moderate LLM exposure, noting that structural and statistical evaluation of data sources can be automated, but judgment about vendor reliabilit...
Deployment by Industry
Ensure data handling practices comply with privacy regulations and internal governance policies — managing access controls, anonymisation, retention schedules, and audit trails.
Capability Evidence
Anthropic's study of real-world Claude usage across millions of professional conversations found that tasks related to Data Governance & Compliance represent a significant category of AI-augmented wor...
Eloundou et al. classify regulatory compliance tasks as having moderate LLM exposure, noting that AI can assist with knowledge retrieval and documentation but that the judgment, risk assessment, and o...
AI agent can perform executive-level governance tasks such as policy authoring and management
Deployment by Industry
Review analytical work from teammates — checking methodology, statistical validity, query correctness, and interpretation accuracy before findings are shared with stakeholders.
Capability Evidence
Fine-tuned large language model can automate systematic review screening by reviewing titles and abstracts for inclusion decisions
While LLMs can flag obvious statistical errors and code bugs, the deeper aspects of methodology review — assessing whether the analytical approach fits the business question, evaluating unstated assum...
LLMs can check statistical code for common errors, verify formula correctness, and identify standard methodological issues such as multiple comparison problems, inappropriate test selection, and sampl...
Deployment by Industry