Amazon SageMaker announces the general availability of Data Lineage for Apache Spark jobs run on Amazon EMR and AWS Glue in SageMaker Unified Studio for IDC-based domains. Data Lineage gives you the information you need to identify the root cause of complex issues and understand the impact of changes.
This feature captures lineage for the schema and transformations of data assets and columns from Spark executions on EMR on EC2, EMR Serverless, EMR on EKS, and AWS Glue. You can explore this lineage visually as a graph in SageMaker Unified Studio or query it using APIs. You can also use lineage to compare transformations across a Spark job's history.
Spark lineage is available in all existing SageMaker Unified Studio Regions. For details on how to get started with lineage, refer to the documentation.
Source: Amazon Web Services



