From SAS Enterprise Guide to Google BigQuery: How a National Healthcare Network Modernized 1,900 Clinical Analytics Scripts

MigryX Case Study • April 2026 • Healthcare & Hospital Networks

Executive Summary

A leading US healthcare provider and hospital network operating across 14 states engaged MigryX to retire a deeply entrenched SAS Enterprise Guide environment that had accumulated nearly two decades of clinical, operational, and financial analytics. Spanning 1,900 SAS scripts, over 2.6 million lines of code, and tightly coupled to an on-premises Teradata data warehouse, the existing platform imposed escalating licensing costs, HIPAA-era compliance risks, and performance ceilings that were incompatible with the organization's push toward real-time population health analytics and value-based care reporting.

Over a focused nine-month engagement, MigryX deployed its parser-driven migration platform to systematically translate every SAS PROC, DATA step, and macro into idiomatic BigQuery SQL, Dataform pipelines, and PySpark jobs on Google Cloud Dataproc. The result: a fully cloud-native analytics estate, a 7X improvement in end-to-end query performance, and projected savings of $4.0 million over two years, all without a single disruption to live clinical reporting.

Client Overview

The client is a not-for-profit integrated health system operating 28 hospitals, over 300 outpatient facilities, and employing more than 45,000 clinical and administrative staff. Its analytics function supports a wide range of mission-critical use cases: clinical quality metrics, CMS star ratings, cost-of-care modeling, population health cohort analysis, staffing optimization, and payer contract performance reporting.

For nearly 18 years, SAS had been the organization's analytical backbone. SAS Enterprise Guide served as the primary authoring environment for statisticians, clinical data analysts, and financial modelers. The platform was tightly integrated with a Teradata EDW that ingested data from dozens of source systems including Epic, Cerner, Meditech, and various claims clearinghouses. While the combination had served the organization well during a period of steady-state reporting, the acceleration of value-based care initiatives, the growth of real-time bedside analytics, and the desire to leverage modern ML tooling made the legacy stack increasingly untenable.

Business Challenge

The analytics leadership team had explored a gradual, manual rewrite of SAS code on several prior occasions. Each attempt stalled within weeks as the true scale and interdependency of the code became apparent. The core challenges driving the MigryX engagement were:

The MigryX Approach

MigryX began the engagement with a comprehensive discovery phase, ingesting the full SAS code repository into its static analysis engine. The platform constructed a dependency graph spanning all 1,900 scripts, identifying 847 unique macros, 214 distinct PROC types, 63 SAS/ACCESS Teradata pass-through blocks, and 1,109 DATA step transformations. This inventory, completed in under 48 hours, replaced weeks of manual cataloging and gave both the MigryX team and the client's migration governance committee a precise view of migration complexity before a single line of target code was written.

AST-Driven PROC Translation

The MigryX Abstract Syntax Tree (AST) parser decomposed each SAS source file into a structured parse tree, isolating PROC boundaries, DATA step logic, macro definitions, and call sites. For procedural SQL workloads, PROC SQL blocks were translated directly to BigQuery Standard SQL, with automatic resolution of Teradata pass-through fragments into their BigQuery equivalents. PROC MEANS, PROC FREQ, PROC SORT, PROC TRANSPOSE, and PROC TABULATE were each mapped to native BigQuery SQL aggregation patterns, window functions, and UNNEST expressions, preserving statistical semantics without resorting to emulation layers.

Macro-heavy scripts presented the most complex translation challenge. The MigryX macro resolver expanded and inlined macro calls within the AST prior to translation, enabling the output generator to produce clean, readable BigQuery SQL and Dataform SQLX files rather than macro-laden intermediaries. Each Dataform model was annotated with dependency declarations derived directly from the original SAS library reference graph, ensuring that Dataform's DAG-based orchestration matched the logical execution order of the legacy SAS job chains.

DATA Step to PySpark on Dataproc

SAS DATA steps that performed row-by-row iterative logic, retain-based accumulation, or complex conditional branching were identified by the classifier as unsuitable for direct SQL translation. For these 312 DATA steps, MigryX generated equivalent PySpark DataFrame transformations deployed as managed jobs on Google Cloud Dataproc. The generated PySpark code preserved the original variable naming conventions and transformation logic, and each job was unit-tested against a sample of the original SAS output data to validate fidelity before promotion to the production pipeline.

Orchestration and Pipeline Governance

Legacy SAS batch schedules, managed through a combination of SAS Management Console and Windows Task Scheduler, were replaced with Cloud Composer (managed Apache Airflow) DAGs. MigryX generated DAG skeletons directly from the dependency graph, with each Dataform model execution and each Dataproc job submission represented as a native Airflow operator. This gave the analytics engineering team immediate visibility into pipeline lineage, retry behavior, and SLA tracking through the Airflow UI, replacing opaque SAS batch logs with structured, searchable execution history.

Migration Architecture

Component Legacy (Before) Modern (After)
Analytics authoring SAS Enterprise Guide 7.x Dataform SQLX + dbt-compatible models
Data warehouse Teradata 16.x on-premises EDW Google BigQuery (multi-region US)
Procedural transforms SAS DATA steps, PROC steps PySpark on Google Cloud Dataproc
Orchestration SAS Management Console + Windows Task Scheduler Cloud Composer (Apache Airflow 2.x)
Clinical ML scoring SAS Enterprise Miner batch scoring Vertex AI Model Registry + BigQuery ML
Access control SAS metadata server roles + Teradata grants Google IAM + BigQuery column-level security
Audit & lineage SAS log files on shared NFS BigQuery audit logs + Dataplex lineage
Storage cost model Fixed capacity hardware + SAN refresh cycles BigQuery on-demand + long-term storage pricing

Key Migration Highlights

MigryX Migration Highlights — SAS to BigQuery

Security & Compliance

For a healthcare organization operating under HIPAA, HITECH, and state-level health data privacy regulations, compliance was not a secondary concern but a primary migration driver. The MigryX platform and the Google Cloud target architecture were designed together to address these requirements comprehensively.

All PHI-bearing BigQuery tables were provisioned within a dedicated Google Cloud project governed by VPC Service Controls, ensuring that no data could be accessed outside the defined security perimeter even by privileged Google employees. BigQuery's native column-level security policies replaced the coarse SAS metadata server role model, allowing the client's compliance team to enforce least-privilege access at the individual field level for the first time.

Customer-managed encryption keys (CMEK) via Cloud KMS were applied to all BigQuery datasets containing clinical data, satisfying the organization's internal encryption policy and the HITRUST CSF requirement for customer-controlled key management. BigQuery audit logs were streamed to Cloud Logging and retained for seven years in a locked archive bucket, providing the immutable access history required for HIPAA audit obligations and Joint Commission readiness reviews.

De-identification pipelines for research analytics use cases were rebuilt using Cloud Healthcare API's DLP-based de-identification capabilities, replacing a custom SAS macro de-identification approach that had not undergone a formal security review since 2019. The new architecture was reviewed and approved by the client's Privacy Officer and external HIPAA counsel prior to go-live.

Results & Business Impact

The migration delivered measurable, quantified outcomes across financial, operational, and clinical dimensions within the first six months of production operation.

1,900
SAS scripts fully migrated to BigQuery & Dataproc
2.6M
Lines of SAS code translated by MigryX AST parser
7X
Query performance improvement vs. legacy Teradata + SAS
$4.0M
Projected cost savings over 2 years (licensing + infrastructure)
9 mo.
End-to-end migration duration from discovery to production cutover
94%
Scripts translated with zero manual intervention required

Population health cohort queries that previously required 60-to-90-minute SAS batch jobs now complete in under nine minutes in BigQuery, enabling clinical decision support dashboards to refresh intraday rather than overnight. The analytics engineering team has significantly reduced routine pipeline maintenance time, with the team reporting that the majority of prior Teradata queue management and SAS job debugging work had been eliminated, freeing staff to focus on new model development rather than SAS job debugging and Teradata queue management.

On the financial side, the elimination of SAS and Teradata licenses, combined with the decommission of three aging on-premises data warehouse nodes, is projected to deliver cumulative savings of $4.0 million over 24 months. The transition to BigQuery's on-demand pricing model also introduced a direct feedback loop between query cost and query design, motivating the analytics team to write more efficient SQL and adopt partition pruning and clustering practices that further reduced ongoing compute spend.

"We had been told our SAS estate was too complex and too critical to migrate without a multi-year rewrite. MigryX delivered a production-ready BigQuery environment in nine months without disrupting CMS reporting. Our analysts are now working in Dataform and Python, and the compliance improvements alone justified the investment."

— VP of Analytics Engineering, Leading US Healthcare Provider (anonymized)

Ready to Modernize Your SAS Estate?

See how MigryX can accelerate your migration to BigQuery — without disrupting clinical or operational reporting.

Explore BigQuery Migration →