Amazon SageMaker Unified Studio introduces data preview v2.0 for Visual ETL, a new data preview mode that delivers near-instant results when building and iterating on visual ETL jobs. With data preview v2.0, data engineers and analysts can see the output of each transform in about one second, with no session startup required and at no additional compute cost.
Data preview v2.0 uses an in-browser query engine to load and process data locally, so preview operations no longer depend on server-side Spark sessions. Source data is fetched once and cached in the browser, and subsequent transforms run against the cached copy without re-querying the underlying data source. For Amazon Redshift users, this means you can iterate on transforms without issuing additional queries against your Redshift cluster, which keeps previews fast and leaves cluster resources available for production workloads. Data preview v2.0 supports CSV, Parquet, and JSON files from Amazon S3, as well as data from Amazon Redshift, Amazon S3 Tables, AWS Glue Data Catalog, Amazon DynamoDB, Amazon DocumentDB, and third-party sources including Snowflake, MySQL, PostgreSQL, SQL Server, Oracle, and Google BigQuery. A toggle in the Visual ETL editor lets you switch between data preview v2.0 and the original Spark-based preview at any time.
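The fetch-once, transform-locally behavior described above can be illustrated with a minimal Python sketch. This is not the Visual ETL implementation, which runs inside the browser. The names PreviewCache, fake_fetch, and fetch_count are hypothetical and exist only for illustration, and the sample rows are made up.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Transform = Callable[[List[Row]], List[Row]]


class PreviewCache:
    """Hypothetical fetch-once cache; not the actual Unified Studio engine."""

    def __init__(self, fetch: Callable[[str], List[Row]]) -> None:
        self._fetch = fetch
        self._rows: Dict[str, List[Row]] = {}
        self.fetch_count = 0  # how many times the source was actually queried

    def preview(self, source: str, transforms: List[Transform]) -> List[Row]:
        # Query the underlying source only on the first preview of it.
        if source not in self._rows:
            self._rows[source] = self._fetch(source)
            self.fetch_count += 1
        # Apply every transform to the cached rows, entirely locally.
        rows = self._rows[source]
        for transform in transforms:
            rows = transform(rows)
        return rows


def fake_fetch(source: str) -> List[Row]:
    # Stand-in for a real query against a data source (assumed sample data).
    return [{"id": 1, "amount": 5}, {"id": 2, "amount": 50}]


cache = PreviewCache(fake_fetch)
# Two different previews of the same source reuse the cached rows.
big = cache.preview("orders", [lambda rows: [r for r in rows if r["amount"] > 10]])
ids = cache.preview("orders", [lambda rows: [{"id": r["id"]} for r in rows]])
```

In this sketch, iterating on transforms changes only the local computation. The source is queried a single time regardless of how many previews follow.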
Data preview v2.0 in Visual ETL is available in all AWS Regions where Amazon SageMaker Unified Studio is supported. To learn more, visit the Amazon SageMaker Unified Studio documentation.
Source: Amazon Web Services




