ABOUT ME

-

Today
-
Yesterday
-
Total
-
  • MWAA 설치
    AWS/Airflow(AWS) 2023. 3. 28. 22:02

     

    회사에서 AWS내 Airflow인 MWAA를 구축하며 정리한 내용입니다.

     

    1. AWS CLI 자격 얻기


    1-1) AWS CLI 설치 - 링크

    • Command로 작업하도록 도와주는 역할
    $ curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
    $ sudo softwareupdate --install-rosetta
    $ sudo installer -pkg AWSCLIV2.pkg -target /
    
    # 제대로 동작하는지 체크
    $ which aws
    $ aws --version
    > aws-cli/2.8.5 Python/3.9.11 Darwin/21.6.0 exe/x86_64 prompt/off

    1-2) Certification 설정 - 링크

    • aws 사용자 계정에서 본인의 계정의 Certification ( access_key_id , access_key ) 다운로드
    • CLI 아래와 같이 설정
    $ aws configure 
    
    AWS Access Key ID [None]: id 
    AWS Secret Access Key [None]: key Default 
    region name [None]: ap-northeast-2 Default 
    output format [None]: json

     

    2. MWAA 설치


    2-1) download or save mwaa-environment-public-network.yml - 링크

    • Create Enviornment MWAA
      • ${AWS::Region}
        • aws/config region
      • ${AWS::AccountId}
        • Aws 계정 ID
      • ${AWS::StackName}
        • command 시 stack-name 값
        • 알파벳으로 시작하여야 한다.
    • 필요에 따라 VPC,CIDR,Subnet,Gateway 값을 변경하여 사용한다.
    AWSTemplateFormatVersion: "2010-09-09"
    
    Parameters:
    
      EnvironmentName:
        # 모든 이름 앞에 붙여서 사용할 값
        Description: An environment name that is prefixed to resource names
        Type: String
        Default: MWAAEnvironment
        # Network Setting
      VpcCIDR:
        Description: The IP range (CIDR notation) for this VPC
        Type: String
        Default: 10.192.0.0/16
    
      PublicSubnet1CIDR:
        Description: The IP range (CIDR notation) for the public subnet in the first Availability Zone
        Type: String
        Default: 10.192.10.0/24
    
      PublicSubnet2CIDR:
        Description: The IP range (CIDR notation) for the public subnet in the second Availability Zone
        Type: String
        Default: 10.192.11.0/24
    
      PrivateSubnet1CIDR:
        Description: The IP range (CIDR notation) for the private subnet in the first Availability Zone
        Type: String
        Default: 10.192.20.0/24
      PrivateSubnet2CIDR:
        Description: The IP range (CIDR notation) for the private subnet in the second Availability Zone
        Type: String
        Default: 10.192.21.0/24
      # 최대 worker 수
      MaxWorkerNodes:
        Description: The maximum number of workers that can run in the environment
        Type: Number
        Default: 2
      # Log 구성
      DagProcessingLogs:
        Description: Log level for DagProcessing
        Type: String
        Default: INFO
      SchedulerLogsLevel:
        Description: Log level for SchedulerLogs
        Type: String
        Default: INFO
      TaskLogsLevel:
        Description: Log level for TaskLogs
        Type: String
        Default: INFO
      WorkerLogsLevel:
        Description: Log level for WorkerLogs
        Type: String
        Default: INFO
      WebserverLogsLevel:
        Description: Log level for WebserverLogs
        Type: String
        Default: INFO
    
    Resources:
      #####################################################################################################################
      # CREATE VPC
      #####################################################################################################################
    
      VPC:
        Type: AWS::EC2::VPC
        Properties:
          CidrBlock: !Ref VpcCIDR
          EnableDnsSupport: true
          EnableDnsHostnames: true
          Tags:
            - Key: Name
              # vpc에서 보이는 이름 설정 값
              Value: MWAAEnvironment
    
      InternetGateway:
        Type: AWS::EC2::InternetGateway
        Properties:
          Tags:
            - Key: Name
              # 인터넷 게이트웨이에 보이는 이름 설정 값
              Value: MWAAEnvironment
    
      InternetGatewayAttachment:
        Type: AWS::EC2::VPCGatewayAttachment
        Properties:
          InternetGatewayId: !Ref InternetGateway
          VpcId: !Ref VPC
    
      PublicSubnet1:
        Type: AWS::EC2::Subnet
        Properties:
          VpcId: !Ref VPC
          AvailabilityZone: !Select [ 0, !GetAZs '' ]
          CidrBlock: !Ref PublicSubnet1CIDR
          MapPublicIpOnLaunch: true
          Tags:
            - Key: Name
              # Subnet 이름
              Value: !Sub ${EnvironmentName} Public Subnet (AZ1)
    
      PublicSubnet2:
        Type: AWS::EC2::Subnet
        Properties:
          VpcId: !Ref VPC
          AvailabilityZone: !Select [ 1, !GetAZs  '' ]
          CidrBlock: !Ref PublicSubnet2CIDR
          MapPublicIpOnLaunch: true
          Tags:
            - Key: Name
              # Subnet 이름
              Value: !Sub ${EnvironmentName} Public Subnet (AZ2)
    
      PrivateSubnet1:
        Type: AWS::EC2::Subnet
        Properties:
          VpcId: !Ref VPC
          AvailabilityZone: !Select [ 0, !GetAZs  '' ]
          CidrBlock: !Ref PrivateSubnet1CIDR
          MapPublicIpOnLaunch: false
          Tags:
            - Key: Name
              # Subnet 이름
              Value: !Sub ${EnvironmentName} Private Subnet (AZ1)
    
      PrivateSubnet2:
        Type: AWS::EC2::Subnet
        Properties:
          VpcId: !Ref VPC
          AvailabilityZone: !Select [ 1, !GetAZs  '' ]
          CidrBlock: !Ref PrivateSubnet2CIDR
          MapPublicIpOnLaunch: false
          Tags:
            - Key: Name
              # Subnet 이름
              Value: !Sub ${EnvironmentName} Private Subnet (AZ2)
    
      NatGateway1EIP:
        Type: AWS::EC2::EIP
        DependsOn: InternetGatewayAttachment
        Properties:
          Domain: vpc
    
      NatGateway2EIP:
        Type: AWS::EC2::EIP
        DependsOn: InternetGatewayAttachment
        Properties:
          Domain: vpc
    
      NatGateway1:
        Type: AWS::EC2::NatGateway
        Properties:
          AllocationId: !GetAtt NatGateway1EIP.AllocationId
          SubnetId: !Ref PublicSubnet1
    
      NatGateway2:
        Type: AWS::EC2::NatGateway
        Properties:
          AllocationId: !GetAtt NatGateway2EIP.AllocationId
          SubnetId: !Ref PublicSubnet2
    
      PublicRouteTable:
        Type: AWS::EC2::RouteTable
        Properties:
          VpcId: !Ref VPC
          Tags:
            - Key: Name
              # Route Table Name
              Value: !Sub ${EnvironmentName} Public Routes
    
      DefaultPublicRoute:
        Type: AWS::EC2::Route
        DependsOn: InternetGatewayAttachment
        Properties:
          RouteTableId: !Ref PublicRouteTable
          DestinationCidrBlock: 0.0.0.0/0
          GatewayId: !Ref InternetGateway
    
      PublicSubnet1RouteTableAssociation:
        Type: AWS::EC2::SubnetRouteTableAssociation
        Properties:
          RouteTableId: !Ref PublicRouteTable
          SubnetId: !Ref PublicSubnet1
    
      PublicSubnet2RouteTableAssociation:
        Type: AWS::EC2::SubnetRouteTableAssociation
        Properties:
          RouteTableId: !Ref PublicRouteTable
          SubnetId: !Ref PublicSubnet2
    
    
      PrivateRouteTable1:
        Type: AWS::EC2::RouteTable
        Properties:
          VpcId: !Ref VPC
          Tags:
            - Key: Name
              # Route Table Name
              Value: !Sub ${EnvironmentName} Private Routes (AZ1)
    
      DefaultPrivateRoute1:
        Type: AWS::EC2::Route
        Properties:
          RouteTableId: !Ref PrivateRouteTable1
          DestinationCidrBlock: 0.0.0.0/0
          NatGatewayId: !Ref NatGateway1
    
      PrivateSubnet1RouteTableAssociation:
        Type: AWS::EC2::SubnetRouteTableAssociation
        Properties:
          RouteTableId: !Ref PrivateRouteTable1
          SubnetId: !Ref PrivateSubnet1
    
      PrivateRouteTable2:
        Type: AWS::EC2::RouteTable
        Properties:
          VpcId: !Ref VPC
          Tags:
            - Key: Name
              # Route Table Name
              Value: !Sub ${EnvironmentName} Private Routes (AZ2)
    
      DefaultPrivateRoute2:
        Type: AWS::EC2::Route
        Properties:
          RouteTableId: !Ref PrivateRouteTable2
          DestinationCidrBlock: 0.0.0.0/0
          NatGatewayId: !Ref NatGateway2
    
      PrivateSubnet2RouteTableAssociation:
        Type: AWS::EC2::SubnetRouteTableAssociation
        Properties:
          RouteTableId: !Ref PrivateRouteTable2
          SubnetId: !Ref PrivateSubnet2
    
      SecurityGroup:
        Type: AWS::EC2::SecurityGroup
        Properties:
          GroupName: "mwaa-security-group"
          GroupDescription: "Security group with a self-referencing inbound rule."
          VpcId: !Ref VPC
    
      SecurityGroupIngress:
        Type: AWS::EC2::SecurityGroupIngress
        Properties:
          GroupId: !Ref SecurityGroup
          IpProtocol: "-1"
          SourceSecurityGroupId: !Ref SecurityGroup
    
      EnvironmentBucket:
        Type: AWS::S3::Bucket
        Properties:
          VersioningConfiguration:
            Status: Enabled
          PublicAccessBlockConfiguration: 
            BlockPublicAcls: true
            BlockPublicPolicy: true
            IgnorePublicAcls: true
            RestrictPublicBuckets: true
    
      #####################################################################################################################
      # CREATE MWAA
      #####################################################################################################################
    
      MwaaEnvironment:
        Type: AWS::MWAA::Environment
        DependsOn: MwaaExecutionPolicy
        Properties:
          # ${AWS::Region} = aws/config region
          # ${AWS::AccountId} = ??
          # {AWS::StackName} = command 시 stack-name 값
          # 알파벳으로 시작하여야 한다.
          Name: !Sub "${AWS::StackName}-MwaaEnvironment"
          SourceBucketArn: !GetAtt EnvironmentBucket.Arn
          ExecutionRoleArn: !GetAtt MwaaExecutionRole.Arn
          DagS3Path: dags
          NetworkConfiguration:
            SecurityGroupIds:
              - !GetAtt SecurityGroup.GroupId
            SubnetIds:
              - !Ref PrivateSubnet1
              - !Ref PrivateSubnet2
          WebserverAccessMode: PUBLIC_ONLY
          MaxWorkers: !Ref MaxWorkerNodes
          LoggingConfiguration:
            DagProcessingLogs:
              LogLevel: !Ref DagProcessingLogs
              Enabled: true
            SchedulerLogs:
              LogLevel: !Ref SchedulerLogsLevel
              Enabled: true
            TaskLogs:
              LogLevel: !Ref TaskLogsLevel
              Enabled: true
            WorkerLogs:
              LogLevel: !Ref WorkerLogsLevel
              Enabled: true
            WebserverLogs:
              LogLevel: !Ref WebserverLogsLevel
              Enabled: true
      SecurityGroup:
        Type: AWS::EC2::SecurityGroup
        Properties:
          VpcId: !Ref VPC
          GroupDescription: !Sub "Security Group for Amazon MWAA Environment ${AWS::StackName}-MwaaEnvironment"
          GroupName: !Sub "airflow-security-group-${AWS::StackName}-MwaaEnvironment"
      
      SecurityGroupIngress:
        Type: AWS::EC2::SecurityGroupIngress
        Properties:
          GroupId: !Ref SecurityGroup
          IpProtocol: "-1"
          SourceSecurityGroupId: !Ref SecurityGroup
    
      SecurityGroupEgress:
        Type: AWS::EC2::SecurityGroupEgress
        Properties:
          GroupId: !Ref SecurityGroup
          IpProtocol: "-1"
          CidrIp: "0.0.0.0/0"
    
      MwaaExecutionRole:
        Type: AWS::IAM::Role
        Properties:
          AssumeRolePolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Principal:
                  Service:
                    - airflow-env.amazonaws.com
                    - airflow.amazonaws.com
                Action:
                 - "sts:AssumeRole"
          Path: "/service-role/"
    
      MwaaExecutionPolicy:
        DependsOn: EnvironmentBucket
        Type: AWS::IAM::ManagedPolicy
        Properties:
          Roles:
            - !Ref MwaaExecutionRole
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action: airflow:PublishMetrics
                Resource:
                  - !Sub "arn:aws:airflow:${AWS::Region}:${AWS::AccountId}:environment/${EnvironmentName}"
              - Effect: Deny
                Action: s3:ListAllMyBuckets
                Resource:
                  - !Sub "${EnvironmentBucket.Arn}"
                  - !Sub "${EnvironmentBucket.Arn}/*"
    
              - Effect: Allow
                Action:
                  - "s3:GetObject*"
                  - "s3:GetBucket*"
                  - "s3:List*"
                Resource:
                  - !Sub "${EnvironmentBucket.Arn}"
                  - !Sub "${EnvironmentBucket.Arn}/*"
              - Effect: Allow
                Action:
                  - logs:DescribeLogGroups
                Resource: "*"
    
              - Effect: Allow
                Action:
                  - logs:CreateLogStream
                  - logs:CreateLogGroup
                  - logs:PutLogEvents
                  - logs:GetLogEvents
                  - logs:GetLogRecord
                  - logs:GetLogGroupFields
                  - logs:GetQueryResults
                  - logs:DescribeLogGroups
                Resource:
                  - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:airflow-${AWS::StackName}*"
              - Effect: Allow
                Action: cloudwatch:PutMetricData
                Resource: "*"
              - Effect: Allow
                Action:
                  - sqs:ChangeMessageVisibility
                  - sqs:DeleteMessage
                  - sqs:GetQueueAttributes
                  - sqs:GetQueueUrl
                  - sqs:ReceiveMessage
                  - sqs:SendMessage
                Resource:
                  - !Sub "arn:aws:sqs:${AWS::Region}:*:airflow-celery-*"
              - Effect: Allow
                Action:
                  - kms:Decrypt
                  - kms:DescribeKey
                  - "kms:GenerateDataKey*"
                  - kms:Encrypt
                NotResource: !Sub "arn:aws:kms:*:${AWS::AccountId}:key/*"
                Condition:
                  StringLike:
                    "kms:ViaService":
                      - !Sub "sqs.${AWS::Region}.amazonaws.com"
    Outputs:
      VPC:
        Description: A reference to the created VPC
        Value: !Ref VPC
    
      PublicSubnets:
        Description: A list of the public subnets
        Value: !Join [ ",", [ !Ref PublicSubnet1, !Ref PublicSubnet2 ]]
    
      PrivateSubnets:
        Description: A list of the private subnets
        Value: !Join [ ",", [ !Ref PrivateSubnet1, !Ref PrivateSubnet2 ]]
    
      PublicSubnet1:
        Description: A reference to the public subnet in the 1st Availability Zone
        Value: !Ref PublicSubnet1
    
      PublicSubnet2:
        Description: A reference to the public subnet in the 2nd Availability Zone
        Value: !Ref PublicSubnet2
    
      PrivateSubnet1:
        Description: A reference to the private subnet in the 1st Availability Zone
        Value: !Ref PrivateSubnet1
    
      PrivateSubnet2:
        Description: A reference to the private subnet in the 2nd Availability Zone
        Value: !Ref PrivateSubnet2
    
      SecurityGroupIngress:
        Description: Security group with self-referencing inbound rule
        Value: !Ref SecurityGroupIngress
    
      MwaaApacheAirflowUI:
        Description: MWAA Environment
        Value: !Sub  "https://${MwaaEnvironment.WebserverUrl}"

    2-2) Create Stack with AWS CLI

    • file:// 은 필수 값
    • download file split data = - 이지만 코드 명령어는 _
    $ cd file # move yml file
    $ aws cloudformation create-stack --stack-name mwaa-environment --template-body file://mwaa_public_network.yml --capabilities CAPABILITY_IAM

    2-3) 결과

    • 아래와 같이 설치된 것을 볼 수 있습니다.

    https://ifh.cc/g/RF070R.png

    댓글

Designed by Tistory.