Amazon SageMaker Catalog provides automatic data classification using AI agents

Amazon SageMaker Catalog now provides automated data classification that suggests business glossary terms during data publishing, reducing manual tagging effort and improving metadata consistency across organizations.

This capability analyzes table metadata and schema information using Amazon Bedrock’s language models to recommend relevant terms from organizational business glossaries. Data producers receive AI-generated suggestions for business terms defined within their glossaries, which include both functional terms and sensitive data classifications such as PII and PHI, making it easy to tag their datasets with standardized vocabulary. Producers can accept or modify these suggestions before publishing, ensuring consistent terminology across data assets and improving data discoverability for business users.
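The review step described above (suggestions arrive, the producer accepts or modifies them, and only glossary terms end up on the asset) can be sketched in a few lines of Python. This is a minimal illustration, not the SageMaker Catalog API: the `GLOSSARY` set, the `Asset` class, and the `review` function are all hypothetical names invented for this example, and the suggested terms are hard-coded stand-ins for AI-generated output.

```python
from dataclasses import dataclass, field

# Hypothetical glossary; in practice the terms come from the
# organization's own business glossary.
GLOSSARY = {"Customer", "Revenue", "PII", "PHI"}


@dataclass
class Asset:
    name: str
    suggested_terms: list[str]  # stand-in for AI-generated suggestions
    approved_terms: list[str] = field(default_factory=list)


def review(asset, accept, add=()):
    """Keep accepted suggestions, append producer-chosen terms,
    and refuse any term that is not defined in the glossary."""
    chosen = [t for t in asset.suggested_terms if t in accept] + list(add)
    unknown = [t for t in chosen if t not in GLOSSARY]
    if unknown:
        raise ValueError(f"Not in glossary: {unknown}")
    # Deduplicate while preserving order.
    asset.approved_terms = list(dict.fromkeys(chosen))
    return asset


asset = Asset("customer_orders", suggested_terms=["Customer", "PII", "Revenue"])
review(asset, accept={"Customer", "PII"}, add=["PHI"])
print(asset.approved_terms)  # ['Customer', 'PII', 'PHI']
```

Restricting approved terms to the glossary is what keeps the vocabulary standardized: a producer can drop or add terms, but cannot introduce an undefined one.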

Automated data classification is available in the US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Tokyo, Seoul, Singapore, Sydney, Mumbai), and Europe (Frankfurt, Ireland, London, Paris) AWS Regions where Amazon SageMaker operates.

To get started, open SageMaker Unified Studio and configure your business glossary so that recommendations for glossary terms can be generated. You can also use the AWS CLI or SDKs to manage glossary term suggestions programmatically. For more information, see the SageMaker Catalog user guide.

Source: Amazon Web Services