Installing MWAA
Dortmoot
2023. 3. 28. 22:02
These are my notes from setting up MWAA, the managed Airflow service on AWS, at work.
1. Setting up AWS CLI credentials
1-1) Install the AWS CLI - link
- Lets you work with AWS from the command line
$ curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
$ sudo softwareupdate --install-rosetta
$ sudo installer -pkg AWSCLIV2.pkg -target /
# Check that it works
$ which aws
$ aws --version
> aws-cli/2.8.5 Python/3.9.11 Darwin/21.6.0 exe/x86_64 prompt/off
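If the version check needs to run inside a setup script, the major version can be pulled out of the `aws --version` line. The following is only a sketch: `parse_aws_major` is a hypothetical helper name, and it assumes the output begins with `aws-cli/<version>` as in the sample above.

```shell
# parse_aws_major is a hypothetical helper, not part of the AWS CLI.
# It extracts the major version from one line of `aws --version` output.
parse_aws_major() {
  line="$1"
  # Drop the leading "aws-cli/" prefix.
  rest="${line#aws-cli/}"
  # Drop everything from the first dot onward, leaving the major version.
  printf '%s\n' "${rest%%.*}"
}

# Usage (requires the AWS CLI on PATH):
#   major="$(parse_aws_major "$(aws --version 2>&1)")"
#   [ "$major" -ge 2 ] || echo "AWS CLI v2 is required" >&2
```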
1-2) Configure credentials - link
- Download your credentials (access key ID, secret access key) from your AWS user account
- Configure the CLI as follows
$ aws configure
AWS Access Key ID [None]: id
AWS Secret Access Key [None]: key
Default region name [None]: ap-northeast-2
Default output format [None]: json
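To confirm that the saved credentials work, the CLI can be asked which account and region it resolves to. This is a minimal sketch: the function names `current_account_id` and `current_region` are made up for illustration, while the `aws sts get-caller-identity` and `aws configure get` subcommands are standard. These are the values that CloudFormation substitutes for ${AWS::AccountId} and ${AWS::Region} when the stack is created with these settings.

```shell
# current_account_id and current_region are hypothetical helper names.

# Print the 12-digit account ID that the configured credentials belong to.
current_account_id() {
  aws sts get-caller-identity --query Account --output text
}

# Print the default region saved by `aws configure`.
current_region() {
  aws configure get region
}

# Usage:
#   echo "account=$(current_account_id) region=$(current_region)"
```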
2. Installing MWAA
2-1) Download or save mwaa-environment-public-network.yml - link
- This template creates the MWAA environment.
- ${AWS::Region}: the region the stack is created in (by default, the region in ~/.aws/config)
- ${AWS::AccountId}: the AWS account ID
- ${AWS::StackName}: the --stack-name value passed on the command line; it must start with a letter
- Change the VPC, CIDR, subnet, and gateway values as needed.
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  EnvironmentName:
    # Prefix applied to every resource name
    Description: An environment name that is prefixed to resource names
    Type: String
    Default: MWAAEnvironment
  # Network settings
  VpcCIDR:
    Description: The IP range (CIDR notation) for this VPC
    Type: String
    Default: 10.192.0.0/16
  PublicSubnet1CIDR:
    Description: The IP range (CIDR notation) for the public subnet in the first Availability Zone
    Type: String
    Default: 10.192.10.0/24
  PublicSubnet2CIDR:
    Description: The IP range (CIDR notation) for the public subnet in the second Availability Zone
    Type: String
    Default: 10.192.11.0/24
  PrivateSubnet1CIDR:
    Description: The IP range (CIDR notation) for the private subnet in the first Availability Zone
    Type: String
    Default: 10.192.20.0/24
  PrivateSubnet2CIDR:
    Description: The IP range (CIDR notation) for the private subnet in the second Availability Zone
    Type: String
    Default: 10.192.21.0/24
  # Maximum number of workers
  MaxWorkerNodes:
    Description: The maximum number of workers that can run in the environment
    Type: Number
    Default: 2
  # Logging configuration
  DagProcessingLogs:
    Description: Log level for DagProcessing
    Type: String
    Default: INFO
  SchedulerLogsLevel:
    Description: Log level for SchedulerLogs
    Type: String
    Default: INFO
  TaskLogsLevel:
    Description: Log level for TaskLogs
    Type: String
    Default: INFO
  WorkerLogsLevel:
    Description: Log level for WorkerLogs
    Type: String
    Default: INFO
  WebserverLogsLevel:
    Description: Log level for WebserverLogs
    Type: String
    Default: INFO
Resources:
  #####################################################################################################################
  # CREATE VPC
  #####################################################################################################################
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcCIDR
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          # Name shown for the VPC
          Value: MWAAEnvironment
  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          # Name shown for the internet gateway
          Value: MWAAEnvironment
  InternetGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC
  PublicSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [ 0, !GetAZs '' ]
      CidrBlock: !Ref PublicSubnet1CIDR
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          # Subnet name
          Value: !Sub ${EnvironmentName} Public Subnet (AZ1)
  PublicSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [ 1, !GetAZs '' ]
      CidrBlock: !Ref PublicSubnet2CIDR
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          # Subnet name
          Value: !Sub ${EnvironmentName} Public Subnet (AZ2)
  PrivateSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [ 0, !GetAZs '' ]
      CidrBlock: !Ref PrivateSubnet1CIDR
      MapPublicIpOnLaunch: false
      Tags:
        - Key: Name
          # Subnet name
          Value: !Sub ${EnvironmentName} Private Subnet (AZ1)
  PrivateSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [ 1, !GetAZs '' ]
      CidrBlock: !Ref PrivateSubnet2CIDR
      MapPublicIpOnLaunch: false
      Tags:
        - Key: Name
          # Subnet name
          Value: !Sub ${EnvironmentName} Private Subnet (AZ2)
  NatGateway1EIP:
    Type: AWS::EC2::EIP
    DependsOn: InternetGatewayAttachment
    Properties:
      Domain: vpc
  NatGateway2EIP:
    Type: AWS::EC2::EIP
    DependsOn: InternetGatewayAttachment
    Properties:
      Domain: vpc
  NatGateway1:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatGateway1EIP.AllocationId
      SubnetId: !Ref PublicSubnet1
  NatGateway2:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatGateway2EIP.AllocationId
      SubnetId: !Ref PublicSubnet2
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          # Route table name
          Value: !Sub ${EnvironmentName} Public Routes
  DefaultPublicRoute:
    Type: AWS::EC2::Route
    DependsOn: InternetGatewayAttachment
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway
  PublicSubnet1RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PublicRouteTable
      SubnetId: !Ref PublicSubnet1
  PublicSubnet2RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PublicRouteTable
      SubnetId: !Ref PublicSubnet2
  PrivateRouteTable1:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          # Route table name
          Value: !Sub ${EnvironmentName} Private Routes (AZ1)
  DefaultPrivateRoute1:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable1
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGateway1
  PrivateSubnet1RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTable1
      SubnetId: !Ref PrivateSubnet1
  PrivateRouteTable2:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          # Route table name
          Value: !Sub ${EnvironmentName} Private Routes (AZ2)
  DefaultPrivateRoute2:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable2
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGateway2
  PrivateSubnet2RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTable2
      SubnetId: !Ref PrivateSubnet2
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: "mwaa-security-group"
      GroupDescription: "Security group with a self-referencing inbound rule."
      VpcId: !Ref VPC
  SecurityGroupIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: "-1"
      SourceSecurityGroupId: !Ref SecurityGroup
  EnvironmentBucket:
    Type: AWS::S3::Bucket
    Properties:
      VersioningConfiguration:
        Status: Enabled
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  #####################################################################################################################
  # CREATE MWAA
  #####################################################################################################################
  MwaaEnvironment:
    Type: AWS::MWAA::Environment
    DependsOn: MwaaExecutionPolicy
    Properties:
      # ${AWS::Region}    = region the stack is created in (from ~/.aws/config by default)
      # ${AWS::AccountId} = AWS account ID
      # ${AWS::StackName} = the --stack-name value passed on the command line;
      #                     it must start with a letter
      Name: !Sub "${AWS::StackName}-MwaaEnvironment"
      SourceBucketArn: !GetAtt EnvironmentBucket.Arn
      ExecutionRoleArn: !GetAtt MwaaExecutionRole.Arn
      DagS3Path: dags
      NetworkConfiguration:
        SecurityGroupIds:
          - !GetAtt SecurityGroup.GroupId
        SubnetIds:
          - !Ref PrivateSubnet1
          - !Ref PrivateSubnet2
      WebserverAccessMode: PUBLIC_ONLY
      MaxWorkers: !Ref MaxWorkerNodes
      LoggingConfiguration:
        DagProcessingLogs:
          LogLevel: !Ref DagProcessingLogs
          Enabled: true
        SchedulerLogs:
          LogLevel: !Ref SchedulerLogsLevel
          Enabled: true
        TaskLogs:
          LogLevel: !Ref TaskLogsLevel
          Enabled: true
        WorkerLogs:
          LogLevel: !Ref WorkerLogsLevel
          Enabled: true
        WebserverLogs:
          LogLevel: !Ref WebserverLogsLevel
          Enabled: true
  # NOTE: SecurityGroup and SecurityGroupIngress are also defined above.
  # Keys in a YAML mapping must be unique, so keep only one definition of each.
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      VpcId: !Ref VPC
      GroupDescription: !Sub "Security Group for Amazon MWAA Environment ${AWS::StackName}-MwaaEnvironment"
      GroupName: !Sub "airflow-security-group-${AWS::StackName}-MwaaEnvironment"
  SecurityGroupIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: "-1"
      SourceSecurityGroupId: !Ref SecurityGroup
  SecurityGroupEgress:
    Type: AWS::EC2::SecurityGroupEgress
    Properties:
      GroupId: !Ref SecurityGroup
      IpProtocol: "-1"
      CidrIp: "0.0.0.0/0"
  MwaaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - airflow-env.amazonaws.com
                - airflow.amazonaws.com
            Action:
              - "sts:AssumeRole"
      Path: "/service-role/"
  MwaaExecutionPolicy:
    DependsOn: EnvironmentBucket
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Roles:
        - !Ref MwaaExecutionRole
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: airflow:PublishMetrics
            Resource:
              - !Sub "arn:aws:airflow:${AWS::Region}:${AWS::AccountId}:environment/${EnvironmentName}"
          - Effect: Deny
            Action: s3:ListAllMyBuckets
            Resource:
              - !Sub "${EnvironmentBucket.Arn}"
              - !Sub "${EnvironmentBucket.Arn}/*"
          - Effect: Allow
            Action:
              - "s3:GetObject*"
              - "s3:GetBucket*"
              - "s3:List*"
            Resource:
              - !Sub "${EnvironmentBucket.Arn}"
              - !Sub "${EnvironmentBucket.Arn}/*"
          - Effect: Allow
            Action:
              - logs:DescribeLogGroups
            Resource: "*"
          - Effect: Allow
            Action:
              - logs:CreateLogStream
              - logs:CreateLogGroup
              - logs:PutLogEvents
              - logs:GetLogEvents
              - logs:GetLogRecord
              - logs:GetLogGroupFields
              - logs:GetQueryResults
              - logs:DescribeLogGroups
            Resource:
              - !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:airflow-${AWS::StackName}*"
          - Effect: Allow
            Action: cloudwatch:PutMetricData
            Resource: "*"
          - Effect: Allow
            Action:
              - sqs:ChangeMessageVisibility
              - sqs:DeleteMessage
              - sqs:GetQueueAttributes
              - sqs:GetQueueUrl
              - sqs:ReceiveMessage
              - sqs:SendMessage
            Resource:
              - !Sub "arn:aws:sqs:${AWS::Region}:*:airflow-celery-*"
          - Effect: Allow
            Action:
              - kms:Decrypt
              - kms:DescribeKey
              - "kms:GenerateDataKey*"
              - kms:Encrypt
            NotResource: !Sub "arn:aws:kms:*:${AWS::AccountId}:key/*"
            Condition:
              StringLike:
                "kms:ViaService":
                  - !Sub "sqs.${AWS::Region}.amazonaws.com"
Outputs:
  VPC:
    Description: A reference to the created VPC
    Value: !Ref VPC
  PublicSubnets:
    Description: A list of the public subnets
    Value: !Join [ ",", [ !Ref PublicSubnet1, !Ref PublicSubnet2 ]]
  PrivateSubnets:
    Description: A list of the private subnets
    Value: !Join [ ",", [ !Ref PrivateSubnet1, !Ref PrivateSubnet2 ]]
  PublicSubnet1:
    Description: A reference to the public subnet in the 1st Availability Zone
    Value: !Ref PublicSubnet1
  PublicSubnet2:
    Description: A reference to the public subnet in the 2nd Availability Zone
    Value: !Ref PublicSubnet2
  PrivateSubnet1:
    Description: A reference to the private subnet in the 1st Availability Zone
    Value: !Ref PrivateSubnet1
  PrivateSubnet2:
    Description: A reference to the private subnet in the 2nd Availability Zone
    Value: !Ref PrivateSubnet2
  SecurityGroupIngress:
    Description: Security group with self-referencing inbound rule
    Value: !Ref SecurityGroupIngress
  MwaaApacheAirflowUI:
    Description: MWAA Environment
    Value: !Sub "https://${MwaaEnvironment.WebserverUrl}"
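Because the stack name ends up inside ${AWS::StackName}, it can be worth checking the name and the template before creating anything. The sketch below is illustrative only: `is_valid_stack_name` is a hypothetical helper, and it encodes the CloudFormation naming rule (starts with a letter, only letters, digits, and hyphens, at most 128 characters). The template file name in the usage comment is the one used in the create-stack command in this post.

```shell
# is_valid_stack_name is a hypothetical helper, not an AWS CLI command.
# It succeeds when the name starts with a letter, contains only letters,
# digits, and hyphens, and is at most 128 characters long.
is_valid_stack_name() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z][A-Za-z0-9-]{0,127}$'
}

# Usage (validate-template needs working AWS credentials):
#   is_valid_stack_name mwaa-environment || echo "invalid stack name" >&2
#   aws cloudformation validate-template \
#     --template-body file://mwaa_public_network.yml
```

Note that `validate-template` checks template syntax only; it does not guarantee that the stack will be created successfully.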
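2-2) Create the stack with the AWS CLI
- The file:// prefix on the template path is required.
- The downloaded file name separates words with hyphens (-), but the file name in the command below uses underscores (_); use the name you saved the file under.
$ cd file # move to the directory that contains the yml file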
$ aws cloudformation create-stack --stack-name mwaa-environment --template-body file://mwaa_public_network.yml --capabilities CAPABILITY_IAM
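The create-stack command returns right away, while the MWAA environment itself often takes twenty minutes or more to come up. A minimal sketch for waiting on the stack and reading a value from the template's Outputs section follows. `wait_for_stack` and `stack_output` are hypothetical helper names; the underlying `aws cloudformation wait` and `describe-stacks` subcommands are standard.

```shell
# wait_for_stack and stack_output are hypothetical helper names.

# Block until the stack reaches CREATE_COMPLETE; exits non-zero on failure
# or when the waiter gives up.
wait_for_stack() {
  aws cloudformation wait stack-create-complete --stack-name "$1"
}

# Print one value from the Outputs section: $1 = stack name, $2 = output key.
stack_output() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}

# Usage:
#   wait_for_stack mwaa-environment
#   stack_output mwaa-environment MwaaApacheAirflowUI
```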
2-3) Result
- The environment is installed, as shown below.
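The same result can also be checked from the command line. This is a sketch under the assumption that the environment name follows the template's Name property, "<stack-name>-MwaaEnvironment"; `mwaa_env_name` and `mwaa_env_status` are hypothetical helper names.

```shell
# mwaa_env_name and mwaa_env_status are hypothetical helper names.

# Build the environment name the template assigns: "<stack-name>-MwaaEnvironment".
mwaa_env_name() {
  printf '%s-MwaaEnvironment\n' "$1"
}

# Print the environment status (for example CREATING or AVAILABLE).
mwaa_env_status() {
  aws mwaa get-environment --name "$(mwaa_env_name "$1")" \
    --query "Environment.Status" --output text
}

# Usage:
#   mwaa_env_status mwaa-environment
```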