Building AI Large-Model Training Infrastructure and AI Application Services Securely and Compliantly on AWS

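Project overview:

Xiao Li Ge will continue to introduce, each day, one cutting-edge AI solution built on the Amazon Web Services (AWS) cloud platform, to help readers quickly pick up AWS AI best practices and apply them in their daily work.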

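This installment shows how to use AWS Service Catalog to create and manage application products that include large AI models, how to use permission management to restrict the cloud resources each employee can access based on identity and job role, and how to create an Amazon SageMaker managed machine learning environment in which to train and deploy large models, loading model files and model container images privately and securely through VPC endpoints. The architecture is designed around cloud-native, serverless services to provide a scalable and secure AI solution. The solution architecture diagram is shown below: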

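Background knowledge for this solution

What is Amazon SageMaker?

Amazon SageMaker is a fully managed machine learning service from AWS that helps developers and data scientists build, train, and deploy machine learning models. It provides tooling for the whole workflow, from data preparation through model training to model deployment, so that machine learning projects can be run efficiently in the cloud.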

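What is AWS Service Catalog?

AWS Service Catalog is a service that helps organizations create, manage, and distribute catalogs of approved cloud services. With Service Catalog, an organization can centrally manage approved resources and configurations so that development teams follow the organization's best practices and compliance requirements when using cloud services. Users pick the services they need from a predefined product catalog, which simplifies resource deployment and reduces the risk of misconfiguration.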

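Security and compliance benefits of building AI services with SageMaker

Meeting enterprise compliance requirements

When building AI services with SageMaker, you can use Service Catalog to predefine and manage configuration templates that meet company compliance standards, so that every AI model and resource deployment follows the organization's security policies and industry regulations such as GDPR or HIPAA.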

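Data security

SageMaker offers end-to-end data encryption options, covering data at rest and in transit, to protect sensitive data throughout the AI model lifecycle. You can also use VPC endpoints to access data in S3 privately and securely and to pull the AI model container images stored in ECR.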
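
As a rough illustration of the VPC endpoint idea above, the following sketch builds the arguments for boto3's `ec2.create_vpc_endpoint` call for an S3 gateway endpoint whose policy only allows reads from a single bucket. The helper name `build_s3_endpoint_request`, the VPC ID, route table ID, Region, and bucket name are all placeholders invented for illustration, not values from this project. The API call itself is left in a comment so the snippet runs without AWS credentials.

```python
import json


def build_s3_endpoint_request(vpc_id, route_table_ids, region, bucket_name):
    """Return keyword arguments for ec2.create_vpc_endpoint (S3 gateway endpoint)."""
    # Endpoint policy: only allow listing and reading objects in one bucket.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
        "PolicyDocument": json.dumps(policy),
    }


if __name__ == "__main__":
    request = build_s3_endpoint_request(
        vpc_id="vpc-0123456789abcdef0",             # placeholder
        route_table_ids=["rtb-0123456789abcdef0"],  # placeholder
        region="us-east-1",                         # placeholder
        bucket_name="example-model-bucket",         # placeholder
    )
    print(json.dumps(request, indent=2))
    # With credentials configured, the request could be sent like this:
    #   import boto3
    #   boto3.client("ec2", region_name="us-east-1").create_vpc_endpoint(**request)
```

ECR image layers are themselves served from S3, so an endpoint policy this narrow would likely block image pulls; a real deployment would also need to allow the ECR layer bucket that AWS documents for the Region, plus interface endpoints for the ECR API and Docker registry.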

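Access control and monitoring

Through integration with AWS Identity and Access Management (IAM), you can control at a fine-grained level who can access and operate SageMaker resources. Combined with monitoring tools such as CloudTrail and CloudWatch, organizations can track and audit API activity to maintain transparency and security.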
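
To make the fine-grained control concrete, here is a minimal sketch of an identity-based policy that only takes effect for principals carrying a matching `ProjectID` tag, using the same `aws:PrincipalTag/ProjectID` condition key that appears in the CloudFormation template later in this article. The helper name `build_project_scoped_policy` and the chosen action list are assumptions made for illustration.

```python
import json


def build_project_scoped_policy(project_id, actions, resource_arn="*"):
    """Build an IAM policy allowing `actions` only to principals tagged with ProjectID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": resource_arn,
                # The statement applies only when the caller's ProjectID tag matches.
                "Condition": {
                    "StringEquals": {"aws:PrincipalTag/ProjectID": project_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = build_project_scoped_policy(
        project_id="QuickStart007",  # default ProjectID used by the template
        actions=[
            "sagemaker:DescribeNotebookInstance",
            "sagemaker:CreatePresignedNotebookInstanceUrl",
        ],
    )
    print(json.dumps(policy, indent=2))
```

The resulting document could be attached to a role or user through IAM; whether `"*"` is an acceptable resource scope depends on the organization's own policy standards.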

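What this solution covers

1. Access model files in S3 privately through a VPC endpoint.

2. Create an AWS Service Catalog portfolio to centrally create and manage the cloud service products offered to users.

3. As a Service Catalog end user, launch a SageMaker compute instance for machine learning training.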
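
The Service Catalog part of this list can be sketched with four boto3 `servicecatalog` calls: create a portfolio, create a CloudFormation-based product, link the two, and grant an IAM principal access. The function below is a sketch under stated assumptions: the display names, provider name, template URL, and principal ARN are hypothetical, and `DryRunClient` is a stand-in written for this example so that the flow can be exercised without contacting AWS.

```python
import uuid


def publish_notebook_product(sc, template_url, principal_arn):
    """Create a portfolio and product, link them, and grant a principal access.

    `sc` is expected to behave like boto3.client("servicecatalog").
    """
    portfolio = sc.create_portfolio(
        DisplayName="ML-Notebooks",    # hypothetical name
        ProviderName="Platform Team",  # hypothetical provider
        IdempotencyToken=str(uuid.uuid4()),
    )
    portfolio_id = portfolio["PortfolioDetail"]["Id"]

    product = sc.create_product(
        Name="SageMaker-Notebook",     # hypothetical name
        Owner="Platform Team",
        ProductType="CLOUD_FORMATION_TEMPLATE",
        ProvisioningArtifactParameters={
            "Name": "v1",
            "Type": "CLOUD_FORMATION_TEMPLATE",
            "Info": {"LoadTemplateFromURL": template_url},
        },
        IdempotencyToken=str(uuid.uuid4()),
    )
    product_id = product["ProductViewDetail"]["ProductViewSummary"]["ProductId"]

    sc.associate_product_with_portfolio(ProductId=product_id, PortfolioId=portfolio_id)
    sc.associate_principal_with_portfolio(
        PortfolioId=portfolio_id, PrincipalARN=principal_arn, PrincipalType="IAM"
    )
    return portfolio_id, product_id


class DryRunClient:
    """Stand-in client that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def method(**kwargs):
            self.calls.append((name, kwargs))
            if name == "create_portfolio":
                return {"PortfolioDetail": {"Id": "port-dryrun"}}
            if name == "create_product":
                return {"ProductViewDetail": {"ProductViewSummary": {"ProductId": "prod-dryrun"}}}
            return {}
        return method


if __name__ == "__main__":
    client = DryRunClient()  # swap in boto3.client("servicecatalog") for a real run
    ids = publish_notebook_product(
        client,
        template_url="https://example-bucket.s3.amazonaws.com/notebook.yaml",  # placeholder
        principal_arn="arn:aws:iam::123456789012:role/DataScientist",          # placeholder
    )
    print(ids)
    for name, _ in client.calls:
        print(name)
```

Passing the client in as an argument keeps the function easy to dry-run; end users granted access this way would then launch the product from the Service Catalog console.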

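Step-by-step project setup:

1. Sign in to the AWS console, open the serverless compute service Lambda, and create a Lambda function named "SageMakerBuild". Paste in the following code, which creates the SageMaker Jupyter Notebook instance used to train large AI models.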

import json
import time

import boto3
# NOTE: `requests` is not bundled with the Lambda Python runtime; package it
# with the function or attach it through a Lambda layer.
import requests


def CFTFailedResponse(event, status, message):
    """Report a failure to CloudFormation through the pre-signed ResponseURL."""
    print("Inside CFTFailedResponse")
    responseBody = {
        'Status': status,
        'Reason': message,
        'PhysicalResourceId': event['ServiceToken'],
        'StackId': event['StackId'],
        'RequestId': event['RequestId'],
        'LogicalResourceId': event['LogicalResourceId']
    }

    # The pre-signed S3 URL expects an empty content-type header.
    headers = {
        'content-type': '',
        'content-length': str(len(json.dumps(responseBody)))
    }
    print('Response = ' + json.dumps(responseBody))
    try:
        req = requests.put(event['ResponseURL'], data=json.dumps(responseBody), headers=headers)
        print("CloudFormation response status: " + str(req))
    except Exception as e:
        print("Failed to send cf response {}".format(e))
        
def CFTSuccessResponse(event, status, data=None):
    """Report success (with optional output data) to CloudFormation."""
    responseBody = {
        'Status': status,
        'Reason': 'See the details in CloudWatch Log Stream',
        'PhysicalResourceId': event['ServiceToken'],
        'StackId': event['StackId'],
        'RequestId': event['RequestId'],
        'LogicalResourceId': event['LogicalResourceId'],
        'Data': data
    }
    # The pre-signed S3 URL expects an empty content-type header.
    headers = {
        'content-type': '',
        'content-length': str(len(json.dumps(responseBody)))
    }
    print('Response = ' + json.dumps(responseBody))
    try:
        req = requests.put(event['ResponseURL'], data=json.dumps(responseBody), headers=headers)
    except Exception as e:
        print("Failed to send cf response {}".format(e))


def lambda_handler(event, context):
    ReqStatus = "SUCCESS"
    print("Event:")
    print(event)
    client = boto3.client('sagemaker')
    ec2client = boto3.client('ec2')
    data = {}

    if event['RequestType'] == 'Create':
        try:
            ## Value Initialization from CFT ##
            project_name = event['ResourceProperties']['ProjectName']
            kmsKeyId = event['ResourceProperties']['KmsKeyId']
            Tags = event['ResourceProperties']['Tags']
            env_name = event['ResourceProperties']['ENVName']
            subnet_name = event['ResourceProperties']['Subnet']
            security_group_name = event['ResourceProperties']['SecurityGroupName']

            input_dict = {}
            input_dict['NotebookInstanceName'] = event['ResourceProperties']['NotebookInstanceName']
            input_dict['InstanceType'] = event['ResourceProperties']['NotebookInstanceType']
            input_dict['Tags'] = event['ResourceProperties']['Tags']
            input_dict['DirectInternetAccess'] = event['ResourceProperties']['DirectInternetAccess']
            input_dict['RootAccess'] = event['ResourceProperties']['RootAccess']
            input_dict['VolumeSizeInGB'] = int(event['ResourceProperties']['VolumeSizeInGB'])
            input_dict['RoleArn'] = event['ResourceProperties']['RoleArn']
            input_dict['LifecycleConfigName'] = event['ResourceProperties']['LifecycleConfigName']

        except Exception as e:
            print(e)
            ReqStatus = "FAILED"
            message = "Parameter Error: "+str(e)
            CFTFailedResponse(event, "FAILED", message)
        if ReqStatus == "FAILED":
            return None
        print("Validating Environment name: "+env_name)
        print("Subnet Id Fetching.....")
        try:
            ## Sagemaker Subnet ##
            subnetName = env_name+"-ResourceSubnet"
            print(subnetName)
            response = ec2client.describe_subnets(
                Filters=[
                    {
                        'Name': 'tag:Name',
                        'Values': [
                            subnet_name
                        ]
                    },
                ]
            )
            #print(response)
            subnetId = response['Subnets'][0]['SubnetId']
            input_dict['SubnetId'] = subnetId
            print("Desc subnet done!!")
        except Exception as e:
            print(e)
            ReqStatus = "FAILED"
            message = " Project Name is invalid - Subnet Error: "+str(e)
            CFTFailedResponse(event, "FAILED", message)
        if ReqStatus == "FAILED":
            return None
        ## Sagemaker Security group ##
        print("Security GroupId Fetching.....")
        try:
            sgName = env_name+"-ResourceSG"
            response = ec2client.describe_security_groups(
                Filters=[
                    {
                        'Name': 'tag:Name',
                        'Values': [
                            security_group_name
                        ]
                    },
                ]
            )
            sgId = response['SecurityGroups'][0]['GroupId']
            input_dict['SecurityGroupIds'] = [sgId]
            print("Desc sg done!!")
        except Exception as e:
            print(e)
            ReqStatus = "FAILED"
            message = "Security Group ID Error: "+str(e)
            CFTFailedResponse(event, "FAILED", message)
        if ReqStatus == "FAILED":
            return None
        try:
            if kmsKeyId:
                input_dict['KmsKeyId'] = kmsKeyId
            else:
                print("in else")
                
            print(input_dict)
            instance = client.create_notebook_instance(**input_dict)
            print('SageMaker API response')
            print(str(instance))
            responseData = {'NotebookInstanceArn': instance['NotebookInstanceArn']}
            
            NotebookStatus = 'Pending'
            response = client.describe_notebook_instance(
                NotebookInstanceName=event['ResourceProperties']['NotebookInstanceName']
            )
            NotebookStatus = response['NotebookInstanceStatus']
            print("NotebookStatus:"+NotebookStatus)
            
            ## Notebook Failure ##
            if NotebookStatus == 'Failed':
                # FailureReason is not present in every response, so read it defensively.
                message = NotebookStatus+": "+response.get('FailureReason', 'unknown reason')+" :Notebook is not coming InService"
                CFTFailedResponse(event, "FAILED", message)
            else:
                while NotebookStatus == 'Pending':
                    time.sleep(200)
                    response = client.describe_notebook_instance(
                        NotebookInstanceName=event['ResourceProperties']['NotebookInstanceName']
                    )
                    NotebookStatus = response['NotebookInstanceStatus']
                    print("NotebookStatus in loop:"+NotebookStatus)
                
                ## Notebook Success ##
                if NotebookStatus == 'InService':
                    data['Message'] = "SageMaker Notebook name - "+event['ResourceProperties']['NotebookInstanceName']+" created successfully"
                    print("message InService :",data['Message'])
                    CFTSuccessResponse(event, "SUCCESS", data)
                else:
                    message = NotebookStatus+": "+response.get('FailureReason', 'unknown reason')+" :Notebook is not coming InService"
                    print("message :",message)
                    CFTFailedResponse(event, "FAILED", message)
        except Exception as e:
            print(e)
            ReqStatus = "FAILED"
            CFTFailedResponse(event, "FAILED", str(e))
    if event['RequestType'] == 'Delete':
        NotebookStatus = None
        lifecycle_config = event['ResourceProperties']['LifecycleConfigName']
        NotebookName = event['ResourceProperties']['NotebookInstanceName']

        try:
            response = client.describe_notebook_instance(
                NotebookInstanceName=NotebookName
            )
            NotebookStatus = response['NotebookInstanceStatus']
            print("Notebook Status - "+NotebookStatus)
        except Exception as e:
            print(e)
            NotebookStatus = "Invalid"
            #CFTFailedResponse(event, "FAILED", str(e))
        while NotebookStatus == 'Pending':
            time.sleep(30)
            response = client.describe_notebook_instance(
                NotebookInstanceName=NotebookName
            )
            NotebookStatus = response['NotebookInstanceStatus']
            print("NotebookStatus:"+NotebookStatus)
        if NotebookStatus != 'Failed' and NotebookStatus != 'Invalid' :
            print("Delete request for Notebook name: "+NotebookName)
            print("Stopping the Notebook.....")
            if NotebookStatus != 'Stopped':
                try:
                    response = client.stop_notebook_instance(
                        NotebookInstanceName=NotebookName
                    )
                    NotebookStatus = 'Stopping'
                    print("Notebook Status - "+NotebookStatus)
                    while NotebookStatus == 'Stopping':
                        time.sleep(30)
                        response = client.describe_notebook_instance(
                            NotebookInstanceName=NotebookName
                        )
                        NotebookStatus = response['NotebookInstanceStatus']
                    print("NotebookStatus:"+NotebookStatus)
                except Exception as e:
                    print(e)
                    NotebookStatus = "Invalid"
                    CFTFailedResponse(event, "FAILED", str(e))
                    # Stop here so a SUCCESS response is not sent after this FAILED one.
                    return None
                
            else:
                NotebookStatus = 'Stopped'
                print("NotebookStatus:"+NotebookStatus)
        
        if NotebookStatus != 'Invalid':
            print("Deleting The Notebook......")
            time.sleep(5)
            try:
                response = client.delete_notebook_instance(
                    NotebookInstanceName=NotebookName
                )
                print("Notebook Deleted")
                data["Message"] = "Notebook Deleted"
                CFTSuccessResponse(event, "SUCCESS", data)
            except Exception as e:
                print(e)
                CFTFailedResponse(event, "FAILED", str(e))
            
        else:
            print("Notebook Invalid status")
            data["Message"] = "Notebook is not available"
            CFTSuccessResponse(event, "SUCCESS", data)
    
    if event['RequestType'] == 'Update':
        print("Update operation for Sagemaker Notebook is not recommended")
        data["Message"] = "Update operation for Sagemaker Notebook is not recommended"
        CFTSuccessResponse(event, "SUCCESS", data)
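
The handler above polls `describe_notebook_instance` in `while` loops with no upper bound, so a notebook that stays in `Pending` is only cut off by the Lambda timeout, in which case no response reaches CloudFormation. One possible refinement is a polling helper with an explicit deadline. The sketch below is not part of the original function: `wait_for_status` and its parameters are hypothetical, and the demo feeds it a simulated status sequence instead of calling SageMaker.

```python
import time


def wait_for_status(describe, pending_states, timeout_s, interval_s=30,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll describe() until the status leaves pending_states or time runs out.

    Returns the last status seen, or "TimedOut" if another wait would
    exceed the deadline.
    """
    deadline = clock() + timeout_s
    status = describe()
    while status in pending_states:
        if clock() + interval_s > deadline:
            return "TimedOut"
        sleep(interval_s)
        status = describe()
    return status


if __name__ == "__main__":
    # Simulated status sequence standing in for describe_notebook_instance.
    statuses = iter(["Pending", "Pending", "InService"])
    result = wait_for_status(
        describe=lambda: next(statuses),
        pending_states={"Pending"},
        timeout_s=600,
        interval_s=30,
        sleep=lambda seconds: None,  # skip real sleeping in the demo
    )
    print(result)
```

Inside the handler, `timeout_s` could be derived from `context.get_remaining_time_in_millis()` minus a safety margin, so that a `"TimedOut"` result still leaves time to send a FAILED response to CloudFormation.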
        
    
        
		    

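2. Next, create a YAML template, paste in the following code, and upload it to an S3 bucket. CloudFormation uses this template to create the SageMaker Jupyter Notebook as infrastructure as code (IaC).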

AWSTemplateFormatVersion: 2010-09-09
Description: Template to create a SageMaker notebook
Metadata:
  'AWS::CloudFormation::Interface':
    ParameterGroups:
      - Label:
          default: Environment detail
        Parameters:
          - ENVName
      - Label:
          default: SageMaker Notebook configuration
        Parameters:
          - NotebookInstanceName
          - NotebookInstanceType
          - DirectInternetAccess
          - RootAccess
          - VolumeSizeInGB
      - Label:
          default: Load S3 Bucket to SageMaker
        Parameters:
          - S3CodePusher
          - CodeBucketName
      - Label:
          default: Project detail
        Parameters:
          - ProjectName
          - ProjectID
    ParameterLabels:
      DirectInternetAccess:
        default: Direct Internet Access
      NotebookInstanceName:
        default: Notebook Instance Name
      NotebookInstanceType:
        default: Notebook Instance Type
      ENVName:
        default: Environment Name
      ProjectName:
        default: Project Suffix
      RootAccess:
        default: Root access
      VolumeSizeInGB:
        default: Volume size for the SageMaker Notebook
      ProjectID:
        default: SageMaker ProjectID
      CodeBucketName:
        default: Code Bucket Name        
      S3CodePusher:
        default: Copy code from S3 to SageMaker
Parameters:
  SubnetName:
    Default: ProSM-ResourceSubnet
    Description: Name tag of the subnet in which the notebook instance is placed
    Type: String
  SecurityGroupName:
    Default: ProSM-ResourceSG
    Description: Security Group Name
    Type: String
  SageMakerBuildFunctionARN:
    Description: Service Token Value passed from Lambda Stack
    Type: String
  NotebookInstanceName:
    AllowedPattern: '[A-Za-z0-9-]{1,63}'
    ConstraintDescription: >-
      Maximum of 63 alphanumeric characters. Can include hyphens (-), but not
      spaces. Must be unique within your account in an AWS Region.
    Description: SageMaker Notebook instance name
    MaxLength: '63'
    MinLength: '1'
    Type: String
  NotebookInstanceType:
    ConstraintDescription: Must select a valid notebook instance type.
    Default: ml.t3.medium
    Description: Select Instance type for the SageMaker Notebook
    Type: String
  ENVName:
    Description: SageMaker infrastructure naming convention
    Type: String
  ProjectName:
    Description: >-
      The suffix appended to all resources in the stack.  This will allow
      multiple copies of the same stack to be created in the same account.
    Type: String
  RootAccess:
    Description: Root access for the SageMaker Notebook user
    AllowedValues:
      - Enabled
      - Disabled
    Default: Enabled
    Type: String
  VolumeSizeInGB:
    Description: >-
      The size, in GB, of the ML storage volume to attach to the notebook
      instance. The default value is 20 GB.
    Type: Number
    Default: '20'
  DirectInternetAccess:
    Description: >-
      If you set this to Disabled this notebook instance will be able to access
      resources only in your VPC. As per the Project requirement, we have
      Disabled it.
    Type: String
    Default: Disabled
    AllowedValues:
      - Disabled
    ConstraintDescription: The only allowed value is Disabled.
  ProjectID:
    Type: String
    Description: Enter a valid ProjectID.
    Default: QuickStart007
  S3CodePusher:
    Description: Do you want to load the code from S3 to SageMaker Notebook
    Default: 'NO'
    AllowedValues:
      - 'YES'
      - 'NO'
    Type: String
  CodeBucketName:
    Description: S3 Bucket name from which you want to copy the code to SageMaker.
    Default: lab-materials-bucket-1234
    Type: String    
Conditions:
  BucketCondition: !Equals 
    - 'YES'
    - !Ref S3CodePusher
Resources:
  SagemakerKMSKey:
    Type: 'AWS::KMS::Key'
    Properties:
      EnableKeyRotation: true
      Tags:
        - Key: ProjectID
          Value: !Ref ProjectID
        - Key: ProjectName
          Value: !Ref ProjectName
      KeyPolicy:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            AWS: !Sub 'arn:aws:iam::${AWS::AccountId}:root'
          Action: 
            - 'kms:Encrypt'
            - 'kms:PutKeyPolicy' 
            - 'kms:CreateKey' 
            - 'kms:GetKeyRotationStatus' 
            - 'kms:DeleteImportedKeyMaterial' 
            - 'kms:GetKeyPolicy' 
            - 'kms:UpdateCustomKeyStore' 
            - 'kms:GenerateRandom' 
            - 'kms:UpdateAlias'
            - 'kms:ImportKeyMaterial'
            - 'kms:ListRetirableGrants' 
            - 'kms:CreateGrant' 
            - 'kms:DeleteAlias'
            - 'kms:RetireGrant'
            - 'kms:ScheduleKeyDeletion' 
            - 'kms:DisableKeyRotation' 
            - 'kms:TagResource' 
            - 'kms:CreateAlias' 
            - 'kms:EnableKeyRotation' 
            - 'kms:DisableKey'
            - 'kms:ListResourceTags'
            - 'kms:Verify' 
            - 'kms:DeleteCustomKeyStore'
            - 'kms:Sign' 
            - 'kms:ListKeys'
            - 'kms:ListGrants'
            - 'kms:ListAliases' 
            - 'kms:ReEncryptTo' 
            - 'kms:UntagResource' 
            - 'kms:GetParametersForImport'
            - 'kms:ListKeyPolicies'
            - 'kms:GenerateDataKeyPair'
            - 'kms:GenerateDataKeyPairWithoutPlaintext' 
            - 'kms:GetPublicKey' 
            - 'kms:Decrypt' 
            - 'kms:ReEncryptFrom'
            - 'kms:DisconnectCustomKeyStore' 
            - 'kms:DescribeKey'
            - 'kms:GenerateDataKeyWithoutPlaintext'
            - 'kms:DescribeCustomKeyStores' 
            - 'kms:CreateCustomKeyStore'
            - 'kms:EnableKey'
            - 'kms:RevokeGrant'
            - 'kms:UpdateKeyDescription' 
            - 'kms:ConnectCustomKeyStore' 
            - 'kms:CancelKeyDeletion' 
            - 'kms:GenerateDataKey'
          Resource:
            - !Join 
              - ''
              - - 'arn:aws:kms:'
                - !Ref 'AWS::Region'
                - ':'
                - !Ref 'AWS::AccountId'
                - ':key/*'
        - Sid: Allow access for Key Administrators
          Effect: Allow
          Principal:
            AWS: 
              - !GetAtt SageMakerExecutionRole.Arn
          Action:
            - 'kms:CreateAlias'
            - 'kms:CreateKey'
            - 'kms:CreateGrant' 
            - 'kms:CreateCustomKeyStore'
            - 'kms:DescribeKey'
            - 'kms:DescribeCustomKeyStores'
            - 'kms:EnableKey'
            - 'kms:EnableKeyRotation'
            - 'kms:ListKeys'
            - 'kms:ListAliases'
            - 'kms:ListKeyPolicies'
            - 'kms:ListGrants'
            - 'kms:ListRetirableGrants'
            - 'kms:ListResourceTags'
            - 'kms:PutKeyPolicy'
            - 'kms:UpdateAlias'
            - 'kms:UpdateKeyDescription'
            - 'kms:UpdateCustomKeyStore'
            - 'kms:RevokeGrant'
            - 'kms:DisableKey'
            - 'kms:DisableKeyRotation'
            - 'kms:GetPublicKey'
            - 'kms:GetKeyRotationStatus'
            - 'kms:GetKeyPolicy'
            - 'kms:GetParametersForImport'
            - 'kms:DeleteCustomKeyStore'
            - 'kms:DeleteImportedKeyMaterial'
            - 'kms:DeleteAlias'
            - 'kms:TagResource'
            - 'kms:UntagResource'
            - 'kms:ScheduleKeyDeletion'
            - 'kms:CancelKeyDeletion'
          Resource:
            - !Join 
              - ''
              - - 'arn:aws:kms:'
                - !Ref 'AWS::Region'
                - ':'
                - !Ref 'AWS::AccountId'
                - ':key/*'
        - Sid: Allow use of the key
          Effect: Allow
          Principal:
            AWS: 
              - !GetAtt SageMakerExecutionRole.Arn

          Action:
            - kms:Encrypt
            - kms:Decrypt
            - kms:ReEncryptTo
            - kms:ReEncryptFrom
            - kms:GenerateDataKeyPair
            - kms:GenerateDataKeyPairWithoutPlaintext
            - kms:GenerateDataKeyWithoutPlaintext
            - kms:GenerateDataKey
            - kms:DescribeKey
          Resource:
            - !Join 
              - ''
              - - 'arn:aws:kms:'
                - !Ref 'AWS::Region'
                - ':'
                - !Ref 'AWS::AccountId'
                - ':key/*'
        - Sid: Allow attachment of persistent resources
          Effect: Allow
          Principal:
            AWS: 
              - !GetAtt SageMakerExecutionRole.Arn

          Action:
            - kms:CreateGrant
            - kms:ListGrants
            - kms:RevokeGrant
          Resource:
            - !Join 
              - ''
              - - 'arn:aws:kms:'
                - !Ref 'AWS::Region'
                - ':'
                - !Ref 'AWS::AccountId'
                - ':key/*'
          Condition:
            Bool:
              kms:GrantIsForAWSResource: 'true'
  KeyAlias:
    Type: AWS::KMS::Alias
    Properties:
      AliasName: 'alias/SageMaker-CMK-DS'
      TargetKeyId:
        Ref: SagemakerKMSKey
  SageMakerExecutionRole:
    Type: 'AWS::IAM::Role'
    Properties:
      Tags:
        - Key: ProjectID
          Value: !Ref ProjectID
        - Key: ProjectName
          Value: !Ref ProjectName
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - sagemaker.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: !Join 
            - ''
            - - !Ref ProjectName
              - SageMakerExecutionPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - 'iam:ListRoles'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:iam::'
                      - !Ref 'AWS::AccountId'
                      - ':role/*'
              - Sid: CloudArnResource
                Effect: Allow
                Action:
                  - 'application-autoscaling:DeleteScalingPolicy'
                  - 'application-autoscaling:DeleteScheduledAction'
                  - 'application-autoscaling:DeregisterScalableTarget'
                  - 'application-autoscaling:DescribeScalableTargets'
                  - 'application-autoscaling:DescribeScalingActivities'
                  - 'application-autoscaling:DescribeScalingPolicies'
                  - 'application-autoscaling:DescribeScheduledActions'
                  - 'application-autoscaling:PutScalingPolicy'
                  - 'application-autoscaling:PutScheduledAction'
                  - 'application-autoscaling:RegisterScalableTarget'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:autoscaling:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':*'
              - Sid: ElasticArnResource
                Effect: Allow
                Action:
                  - 'elastic-inference:Connect'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:elastic-inference:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':elastic-inference-accelerator/*'  
              - Sid: SNSArnResource
                Effect: Allow
                Action:
                  - 'sns:ListTopics'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:sns:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':*'
              - Sid: logsArnResource
                Effect: Allow
                Action:
                  - 'cloudwatch:DeleteAlarms'
                  - 'cloudwatch:DescribeAlarms'
                  - 'cloudwatch:GetMetricData'
                  - 'cloudwatch:GetMetricStatistics'
                  - 'cloudwatch:ListMetrics'
                  - 'cloudwatch:PutMetricAlarm'
                  - 'cloudwatch:PutMetricData'
                  - 'logs:CreateLogGroup'
                  - 'logs:CreateLogStream'
                  - 'logs:DescribeLogStreams'
                  - 'logs:GetLogEvents'
                  - 'logs:PutLogEvents'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:logs:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':log-group:/aws/lambda/*'
              - Sid: KmsArnResource
                Effect: Allow
                Action:
                  - 'kms:DescribeKey'
                  - 'kms:ListAliases'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:kms:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':key/*'
              - Sid: ECRArnResource
                Effect: Allow
                Action:
                  - 'ecr:BatchCheckLayerAvailability'
                  - 'ecr:BatchGetImage'
                  - 'ecr:CreateRepository'
                  - 'ecr:GetAuthorizationToken'
                  - 'ecr:GetDownloadUrlForLayer'
                  - 'ecr:DescribeRepositories'
                  - 'ecr:DescribeImageScanFindings'
                  - 'ecr:DescribeRegistry'
                  - 'ecr:DescribeImages'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:ecr:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':repository/*'
              - Sid: EC2ArnResource
                Effect: Allow
                Action:        
                  - 'ec2:CreateNetworkInterface'
                  - 'ec2:CreateNetworkInterfacePermission'
                  - 'ec2:DeleteNetworkInterface'
                  - 'ec2:DeleteNetworkInterfacePermission'
                  - 'ec2:DescribeDhcpOptions'
                  - 'ec2:DescribeNetworkInterfaces'
                  - 'ec2:DescribeRouteTables'
                  - 'ec2:DescribeSecurityGroups'
                  - 'ec2:DescribeSubnets'
                  - 'ec2:DescribeVpcEndpoints'
                  - 'ec2:DescribeVpcs'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:ec2:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':instance/*'
              - Sid: S3ArnResource
                Effect: Allow
                Action: 
                  - 's3:CreateBucket'
                  - 's3:GetBucketLocation'
                  - 's3:ListBucket'       
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:s3::'
                      - ':*sagemaker*'                  
              - Sid: LambdaInvokePermission
                Effect: Allow
                Action:
                  - 'lambda:ListFunctions'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:lambda:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':function'
                      - ':*'
              - Effect: Allow
                Action: 'sagemaker:InvokeEndpoint'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:sagemaker:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':notebook-instance-lifecycle-config/*'
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
              - Effect: Allow
                Action:
                  - 'sagemaker:CreateTrainingJob'
                  - 'sagemaker:CreateEndpoint'
                  - 'sagemaker:CreateModel'
                  - 'sagemaker:CreateEndpointConfig'
                  - 'sagemaker:CreateHyperParameterTuningJob'
                  - 'sagemaker:CreateTransformJob'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:sagemaker:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':notebook-instance-lifecycle-config/*'
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
                  'ForAllValues:StringEquals':
                    'aws:TagKeys':
                      - Username
              - Effect: Allow
                Action:
                  - 'sagemaker:DescribeTrainingJob'
                  - 'sagemaker:DescribeEndpoint'
                  - 'sagemaker:DescribeEndpointConfig'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:sagemaker:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':notebook-instance-lifecycle-config/*'
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
              - Effect: Allow
                Action:
                  - 'sagemaker:DeleteTags'
                  - 'sagemaker:ListTags'
                  - 'sagemaker:DescribeNotebookInstance'
                  - 'sagemaker:ListNotebookInstanceLifecycleConfigs'
                  - 'sagemaker:DescribeModel'
                  - 'sagemaker:ListTrainingJobs'
                  - 'sagemaker:DescribeHyperParameterTuningJob'
                  - 'sagemaker:UpdateEndpointWeightsAndCapacities'
                  - 'sagemaker:ListHyperParameterTuningJobs'
                  - 'sagemaker:ListEndpointConfigs'
                  - 'sagemaker:DescribeNotebookInstanceLifecycleConfig'
                  - 'sagemaker:ListTrainingJobsForHyperParameterTuningJob'
                  - 'sagemaker:StopHyperParameterTuningJob'
                  - 'sagemaker:DescribeEndpointConfig'
                  - 'sagemaker:ListModels'
                  - 'sagemaker:AddTags'
                  - 'sagemaker:ListNotebookInstances'
                  - 'sagemaker:StopTrainingJob'
                  - 'sagemaker:ListEndpoints'
                  - 'sagemaker:DeleteEndpoint'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:sagemaker:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':notebook-instance-lifecycle-config/*'
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
              - Effect: Allow
                Action:
                  - 'ecr:SetRepositoryPolicy'
                  - 'ecr:CompleteLayerUpload'
                  - 'ecr:BatchDeleteImage'
                  - 'ecr:UploadLayerPart'
                  - 'ecr:DeleteRepositoryPolicy'
                  - 'ecr:InitiateLayerUpload'
                  - 'ecr:DeleteRepository'
                  - 'ecr:PutImage'
                Resource: 
                  - !Join 
                    - ''
                    - - 'arn:aws:ecr:'
                      - !Ref 'AWS::Region'
                      - ':'
                      - !Ref 'AWS::AccountId'
                      - ':repository/*sagemaker*'
              - Effect: Allow
                Action:
                  - 's3:GetObject'
                  - 's3:ListBucket'
                  - 's3:PutObject'
                  - 's3:DeleteObject'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:s3:::'
                      - !Ref SagemakerS3Bucket
                  - !Join 
                    - ''
                    - - 'arn:aws:s3:::'
                      - !Ref SagemakerS3Bucket
                      - /*
                Condition:
                  StringEquals:
                    'aws:PrincipalTag/ProjectID': !Ref ProjectID
              - Effect: Allow
                Action: 'iam:PassRole'
                Resource:
                  - !Join 
                    - ''
                    - - 'arn:aws:iam::'
                      - !Ref 'AWS::AccountId'
                      - ':role/*'
                Condition:
                  StringEquals:
                    'iam:PassedToService': sagemaker.amazonaws.com
  CodeBucketPolicy:
    Type: 'AWS::IAM::Policy'
    Condition: BucketCondition
    Properties:
      PolicyName: !Join 
        - ''
        - - !Ref ProjectName
          - CodeBucketPolicy
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action:
              - 's3:GetObject'
            Resource:
              - !Join 
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref CodeBucketName
              - !Join 
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref CodeBucketName
                  - '/*'
      Roles:
        - !Ref SageMakerExecutionRole
  SagemakerS3Bucket:
    Type: 'AWS::S3::Bucket'
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      Tags:
        - Key: ProjectID
          Value: !Ref ProjectID
        - Key: ProjectName
          Value: !Ref ProjectName
  S3Policy:
    Type: 'AWS::S3::BucketPolicy'
    Properties:
      Bucket: !Ref SagemakerS3Bucket
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Sid: AllowAccessFromVPCEndpoint
            Effect: Allow
            Principal: "*"
            Action:
              - 's3:Get*'
              - 's3:Put*'
              - 's3:List*'
              - 's3:DeleteObject'
            Resource:
              - !Join 
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref SagemakerS3Bucket
              - !Join 
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref SagemakerS3Bucket
                  - '/*'
            Condition:
              StringEquals:
                "aws:sourceVpce": "<PASTE S3 VPC ENDPOINT ID>"
  EFSLifecycleConfig:
    Type: 'AWS::SageMaker::NotebookInstanceLifecycleConfig'
    Properties:
      NotebookInstanceLifecycleConfigName: 'Provisioned-LC'
      OnCreate:
        - Content: !Base64 
            'Fn::Join':
              - ''
              - - |
                  #!/bin/bash 
                - |
                  aws configure set sts_regional_endpoints regional 
                - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config
      OnStart:
        - Content: !Base64 
            'Fn::Join':
              - ''
              - - |
                  #!/bin/bash  
                - |
                  aws configure set sts_regional_endpoints regional 
                - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config  
  EFSLifecycleConfigForS3:
    Type: 'AWS::SageMaker::NotebookInstanceLifecycleConfig'
    Properties:
      NotebookInstanceLifecycleConfigName: 'Provisioned-LC-S3'
      OnCreate:
        - Content: !Base64 
            'Fn::Join':
              - ''
              - - |
                  #!/bin/bash 
                - |
                  # Copy Content
                - !Sub >
                  aws s3 cp s3://${CodeBucketName} /home/ec2-user/SageMaker/ --recursive 
                - |
                  # Set sts endpoint
                - >
                  aws configure set sts_regional_endpoints regional 
                - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config
      OnStart:
        - Content: !Base64 
            'Fn::Join':
              - ''
              - - |
                  #!/bin/bash  
                - |
                  aws configure set sts_regional_endpoints regional 
                - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config  
  SageMakerCustomResource:
    Type: 'Custom::SageMakerCustomResource'
    DependsOn: S3Policy
    Properties:
      ServiceToken: !Ref SageMakerBuildFunctionARN
      NotebookInstanceName: !Ref NotebookInstanceName
      NotebookInstanceType: !Ref NotebookInstanceType
      KmsKeyId: !Ref SagemakerKMSKey
      ENVName: !Join 
        - ''
        - - !Ref ENVName
          - !Sub Subnet1Id
      Subnet: !Ref SubnetName
      SecurityGroupName: !Ref SecurityGroupName
      ProjectName: !Ref ProjectName
      RootAccess: !Ref RootAccess
      VolumeSizeInGB: !Ref VolumeSizeInGB
      LifecycleConfigName: !If [BucketCondition, !GetAtt EFSLifecycleConfigForS3.NotebookInstanceLifecycleConfigName, !GetAtt EFSLifecycleConfig.NotebookInstanceLifecycleConfigName]  
      DirectInternetAccess: !Ref DirectInternetAccess
      RoleArn: !GetAtt 
        - SageMakerExecutionRole
        - Arn
      Tags:
        - Key: ProjectID
          Value: !Ref ProjectID
        - Key: ProjectName
          Value: !Ref ProjectName
Outputs:
  Message:
    Description: Execution Status
    Value: !GetAtt 
      - SageMakerCustomResource
      - Message
  SagemakerKMSKey:
    Description: KMS Key for encrypting Sagemaker resource
    Value: !Ref KeyAlias
  ExecutionRoleArn:
    Description: ARN of the Sagemaker Execution Role
    Value: !GetAtt 
      - SageMakerExecutionRole
      - Arn
  S3BucketName:
    Description: S3 bucket for SageMaker Notebook operation
    Value: !Ref SagemakerS3Bucket
  NotebookInstanceName:
    Description: Name of the Sagemaker Notebook instance created
    Value: !Ref NotebookInstanceName
  ProjectName:
    Description: Project name used for SageMaker deployment
    Value: !Ref ProjectName
  ProjectID:
    Description: Project ID used for SageMaker deployment
    Value: !Ref ProjectID
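
The S3Policy resource in the template above only works once the placeholder in its aws:sourceVpce condition is replaced with a real endpoint ID. As a rough sketch, the same policy document can be built and checked in Python before pasting the ID into the template. The function names and the "vpce-" prefix check are illustrative assumptions, not part of the original project.

```python
import json


def build_vpce_bucket_policy(bucket_name: str, vpc_endpoint_id: str) -> dict:
    """Build a policy document mirroring the S3Policy resource in the template."""
    # Catch an unreplaced placeholder such as "<PASTE S3 VPC ENDPOINT ID>".
    if not vpc_endpoint_id.startswith("vpce-"):
        raise ValueError(f"not a VPC endpoint ID: {vpc_endpoint_id!r}")
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAccessFromVPCEndpoint",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:Get*", "s3:Put*", "s3:List*", "s3:DeleteObject"],
                # The bucket itself (for List*) and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"StringEquals": {"aws:sourceVpce": vpc_endpoint_id}},
            }
        ],
    }


def render_policy(policy: dict) -> str:
    """Serialize the policy as indented JSON for review."""
    return json.dumps(policy, indent=2)
```

Calling render_policy(build_vpce_bucket_policy("example-bucket", "vpce-0123456789abcdef0")) returns JSON that can be compared against the statement in the template; both argument values here are made up.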

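3. Next, open the VPC console, go to Endpoints, and click Create endpoint to create a VPC endpoint that lets SageMaker reach the model files in the S3 bucket privately and securely.

4. Name the endpoint "s3-endpoint", choose AWS services as the service category, and select S3 as the service.

5. Select the VPC that will host the endpoint, configure the route tables, and click Create.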
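
The console steps for the endpoint can also be scripted. The sketch below assumes a Gateway-type endpoint (implied by the route table configuration) and a boto3 EC2 client passed in by the caller. The function names, the region, and the VPC and route table IDs in the usage note are hypothetical.

```python
def s3_gateway_endpoint_request(region, vpc_id, route_table_ids, name="s3-endpoint"):
    """Build the arguments for EC2 create_vpc_endpoint for an S3 gateway endpoint."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # Service name pattern used by S3 in standard commercial regions.
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
        "TagSpecifications": [
            {
                "ResourceType": "vpc-endpoint",
                "Tags": [{"Key": "Name", "Value": name}],
            }
        ],
    }


def create_s3_gateway_endpoint(ec2_client, region, vpc_id, route_table_ids):
    """Create the endpoint and return its ID (the value the bucket policy needs)."""
    request = s3_gateway_endpoint_request(region, vpc_id, route_table_ids)
    response = ec2_client.create_vpc_endpoint(**request)
    return response["VpcEndpoint"]["VpcEndpointId"]
```

With boto3 installed and credentials configured, a call could look like create_s3_gateway_endpoint(boto3.client("ec2"), "us-east-1", "vpc-0abc", ["rtb-0abc"]). The returned ID is what replaces the endpoint placeholder in the bucket policy of the template.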

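6. Next, open the AWS Service Catalog console, go to Portfolios, and click Create to create a new portfolio, which manages a service made up of different cloud resources as a single unit.

7. Name the portfolio "SageMakerPortfolio" and set the owner to CQ.

8. Next, add cloud resources to the portfolio by clicking "Create product".

9. Choose to define the product's resources with a CloudFormation IaC template, name the product "SageMakerProduct", and set the owner to CQ.

10. Attach the CloudFormation template to the product: enter the URL of the template uploaded to S3 in step 2, set the version to 1, and click Create to create the product.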
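
For readers who prefer scripting to the console, the portfolio and product steps map onto three Service Catalog API calls. This is a minimal sketch that assumes a boto3 servicecatalog client is passed in. The function name is illustrative, and the template URL must be supplied by the caller.

```python
import uuid


def create_portfolio_with_product(
    sc_client,
    template_url,
    portfolio_name="SageMakerPortfolio",
    product_name="SageMakerProduct",
    owner="CQ",
    version="1",
):
    """Create the portfolio, create the product from a template URL, link them."""
    portfolio = sc_client.create_portfolio(
        DisplayName=portfolio_name,
        ProviderName=owner,
        IdempotencyToken=str(uuid.uuid4()),
    )
    portfolio_id = portfolio["PortfolioDetail"]["Id"]

    product = sc_client.create_product(
        Name=product_name,
        Owner=owner,
        ProductType="CLOUD_FORMATION_TEMPLATE",
        ProvisioningArtifactParameters={
            "Name": version,
            "Info": {"LoadTemplateFromURL": template_url},
            "Type": "CLOUD_FORMATION_TEMPLATE",
        },
        IdempotencyToken=str(uuid.uuid4()),
    )
    product_id = product["ProductViewDetail"]["ProductViewSummary"]["ProductId"]
    artifact_id = product["ProvisioningArtifactDetail"]["Id"]

    # A product only becomes launchable for end users once it sits in a portfolio.
    sc_client.associate_product_with_portfolio(
        ProductId=product_id, PortfolioId=portfolio_id
    )
    return portfolio_id, product_id, artifact_id
```

The three returned IDs are the ones later calls need: the portfolio ID for constraints and access, and the product and artifact (version) IDs for launching.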

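11. Next, open the Constraints page and click Create to add a constraint, which uses permissions to limit what can be done to cloud resources through the Service Catalog product.

12. Select the product we just created, "SageMakerProduct", as the target and choose Launch as the constraint type.

13. Attach the IAM role to the constraint (the role carries the permission rules that govern the product), then click Create.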
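
A launch constraint like the one configured in the console can be expressed as a single create_constraint call. The sketch below assumes the constraint type is LAUNCH and that a boto3 servicecatalog client is passed in. The function names and the description text are illustrative, and the IDs and role ARN come from the caller.

```python
import json
import uuid


def launch_constraint_request(portfolio_id, product_id, role_arn):
    """Build the arguments for Service Catalog create_constraint (LAUNCH type)."""
    return {
        "PortfolioId": portfolio_id,
        "ProductId": product_id,
        "Type": "LAUNCH",
        # For a launch constraint, Parameters is a JSON string naming the role
        # that Service Catalog assumes when it provisions the product.
        "Parameters": json.dumps({"RoleArn": role_arn}),
        "Description": "Launch SageMakerProduct with a dedicated IAM role",
        "IdempotencyToken": str(uuid.uuid4()),
    }


def create_launch_constraint(sc_client, portfolio_id, product_id, role_arn):
    """Create the constraint and return its ID."""
    request = launch_constraint_request(portfolio_id, product_id, role_arn)
    response = sc_client.create_constraint(**request)
    return response["ConstraintDetail"]["ConstraintId"]
```

Because the launch role does the provisioning, end users can launch the product without holding the underlying SageMaker, S3, or IAM permissions themselves.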

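14. Next, click Access and grant access to control which users can use the product.

15. Add the role "SCEndUserRole", which is used to access the product and create cloud resources on the users' behalf.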
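
Granting a role access to the portfolio corresponds to one API call. This sketch assumes a boto3 servicecatalog client is passed in. The function names are illustrative, and the account ID in the usage note is made up.

```python
def iam_role_arn(account_id, role_name="SCEndUserRole", partition="aws"):
    """Build the ARN of an IAM role from its account ID and name."""
    return f"arn:{partition}:iam::{account_id}:role/{role_name}"


def grant_portfolio_access(sc_client, portfolio_id, principal_arn):
    """Allow the given IAM principal to see and launch products in the portfolio."""
    sc_client.associate_principal_with_portfolio(
        PortfolioId=portfolio_id,
        PrincipalARN=principal_arn,
        PrincipalType="IAM",
    )
```

For example, grant_portfolio_access(client, portfolio_id, iam_role_arn("123456789012")) would give SCEndUserRole in that account access to the portfolio.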

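16. Now use the Service Catalog product to create the set of cloud resources. Select the product we just created and click Launch.

17. Name the provisioned product "DataScientistProduct" and select version 1, which we created earlier.

18. Set the parameters for the SageMaker resources the product will create, including the environment name and the instance name.

19. Enter the ARN of the Lambda function created at the start, then click Launch to begin provisioning.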
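
Launching the product can likewise be sketched as a provision_product call. The parameter keys in the usage note (ENVName, NotebookInstanceName, SageMakerBuildFunctionARN) are taken from the template. The function names and every value are hypothetical, and a boto3 servicecatalog client is assumed to be passed in.

```python
import uuid


def provisioning_parameters(values):
    """Convert a plain dict into the Key/Value list that provision_product expects."""
    return [{"Key": key, "Value": str(value)} for key, value in values.items()]


def launch_product(sc_client, product_id, artifact_id, parameters,
                   name="DataScientistProduct"):
    """Provision the product and return the record ID used to track progress."""
    response = sc_client.provision_product(
        ProductId=product_id,
        ProvisioningArtifactId=artifact_id,  # the version created with the product
        ProvisionedProductName=name,
        ProvisioningParameters=provisioning_parameters(parameters),
        ProvisionToken=str(uuid.uuid4()),
    )
    return response["RecordDetail"]["RecordId"]
```

A call might pass parameters such as {"ENVName": "dev", "NotebookInstanceName": "demo-notebook", "SageMakerBuildFunctionARN": "<Lambda function ARN>"}. Any template parameter without a default also has to be included.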

20. Finally, return to the SageMaker console, where the new Jupyter Notebook instance created through the Service Catalog product appears. With this instance we can develop our AI application.
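
Instead of refreshing the console, the instance status can be polled through the SageMaker API. The helper below is a small sketch that assumes a boto3 sagemaker client is passed in. The function name, polling interval, and attempt limit are arbitrary illustrative choices.

```python
import time


def wait_for_notebook(sm_client, name, poll_seconds=30, max_attempts=40,
                      sleep=time.sleep):
    """Poll describe_notebook_instance until the instance is InService."""
    for _ in range(max_attempts):
        description = sm_client.describe_notebook_instance(
            NotebookInstanceName=name
        )
        status = description["NotebookInstanceStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"notebook instance {name!r} failed to start")
        # Still Pending (or another transitional state): wait and retry.
        sleep(poll_seconds)
    raise TimeoutError(f"notebook instance {name!r} not InService in time")
```

The sleep argument exists only so the wait can be replaced in tests. boto3 also provides a built-in notebook_instance_in_service waiter that serves the same purpose.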

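Those are all the steps for securely and compliantly training large AI models and developing AI applications on AWS. Join me next time for more cutting-edge generative AI development solutions.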
