AWS Batch lets you run batch computing workloads on the AWS Cloud. Batch computing is a common way for developers, scientists, and engineers to access large amounts of compute resources. Like traditional batch computing software, AWS Batch removes the undifferentiated heavy lifting of configuring and managing the required infrastructure. The service provisions resources efficiently in response to submitted jobs, which helps to eliminate capacity constraints, reduce compute costs, and deliver results quickly.
As a fully managed service, AWS Batch lets you run batch computing workloads of any scale. It automatically provisions compute resources and optimizes the workload distribution based on the quantity and scale of the workloads. With AWS Batch, there is no batch computing software to install or manage, so you can spend your time analyzing results and solving problems.
Components of AWS Batch
AWS Batch runs batch jobs across multiple Availability Zones within a single Region. You can create AWS Batch compute environments within a new or existing VPC. After a compute environment is up and associated with a job queue, you can write job definitions that specify which Docker container images run your jobs. Container images are stored in and pulled from container registries, which may exist inside or outside of your AWS infrastructure.
A job is a unit of work (for example, a shell script, a Linux executable, or a Docker container image) that you submit to AWS Batch. It runs as a containerized application on AWS Fargate or Amazon EC2 resources in your compute environment, using parameters that you specify in a job definition. Jobs can reference other jobs by name or by ID, and they can depend on the successful completion of other jobs.
A job definition specifies how a job is to be run. You can think of it as a blueprint for the resources that the job needs. You can give your job an IAM role so that it can access other AWS services, and you can specify its memory and CPU requirements. The job definition can also control container properties, environment variables, and mount points for persistent storage. Many of the settings in a job definition can be overridden by supplying new values when you submit an individual job.
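As a rough illustration of how submit-time overrides relate to a job definition, the sketch below models a definition as a plain Python dictionary and applies overrides to it locally. The field names follow the containerProperties and containerOverrides shapes of the AWS Batch API; the definition name, image, role ARN, and the apply_overrides helper are hypothetical, and the merge logic is only a local model of the behaviour, not the service's implementation.

```python
import copy

# Hypothetical job definition: the name, image, role ARN, and values are placeholders.
JOB_DEFINITION = {
    "jobDefinitionName": "example-job-definition",
    "type": "container",
    "containerProperties": {
        "image": "busybox",
        "command": ["echo", "hello world"],
        "jobRoleArn": "arn:aws:iam::123456789012:role/example-job-role",
        "resourceRequirements": [
            {"type": "VCPU", "value": "1"},
            {"type": "MEMORY", "value": "2048"},  # MiB
        ],
        "environment": [{"name": "STAGE", "value": "test"}],
    },
}


def apply_overrides(job_definition, overrides):
    """Local model of submit-time overrides (assumed semantics, not the service code)."""
    effective = copy.deepcopy(job_definition["containerProperties"])
    if "command" in overrides:
        effective["command"] = list(overrides["command"])
    # Resource requirements are replaced per type (VCPU, MEMORY, ...).
    resources = {r["type"]: r["value"] for r in effective.get("resourceRequirements", [])}
    for r in overrides.get("resourceRequirements", []):
        resources[r["type"]] = r["value"]
    effective["resourceRequirements"] = [
        {"type": t, "value": v} for t, v in resources.items()
    ]
    # Environment variables are merged by name.
    env = {e["name"]: e["value"] for e in effective.get("environment", [])}
    for e in overrides.get("environment", []):
        env[e["name"]] = e["value"]
    effective["environment"] = [{"name": n, "value": v} for n, v in env.items()]
    return effective


overrides = {
    "resourceRequirements": [{"type": "MEMORY", "value": "4096"}],
    "environment": [{"name": "STAGE", "value": "prod"}],
}
effective = apply_overrides(JOB_DEFINITION, overrides)
print(effective["resourceRequirements"])
print(effective["environment"])
```

The original definition is left untouched, which mirrors the idea that one job definition can serve many jobs with different per-job values.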
When you submit an AWS Batch job, you submit it to a particular job queue, where it remains until it is scheduled onto a compute environment. You associate one or more compute environments with a job queue. You can also assign priority values to these compute environments and even across job queues themselves. For example, you might have a high-priority queue for time-critical jobs and a low-priority queue for jobs that can run whenever compute resources are cheaper.
A compute environment is a set of managed or unmanaged compute resources used to run jobs. With managed compute environments, you can specify the desired compute type (Fargate or EC2) at several levels of detail. You can set up compute environments that use a particular type of EC2 instance, such as c5.2xlarge or m5.10xlarge, or you can simply specify that you want to use the newest instance types. You can also specify the minimum, desired, and maximum number of vCPUs for the environment, the amount you’re willing to pay for a Spot Instance as a percentage of the On-Demand Instance price, and a target set of VPC subnets.
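To make the Spot percentage concrete, here is a small worked calculation. The max_spot_price helper, the 0.40 USD hourly price, and the 60% cap are all hypothetical values chosen for illustration; actual On-Demand prices vary by instance type and Region.

```python
def max_spot_price(on_demand_price, max_percent):
    """Highest Spot price accepted, given an On-Demand price and a percentage cap."""
    if not 0 < max_percent <= 100:
        raise ValueError("max_percent must be greater than 0 and at most 100")
    return on_demand_price * max_percent / 100


# Hypothetical On-Demand price of 0.40 USD per hour with a 60% cap.
ceiling = max_spot_price(0.40, 60)
print(f"Spot ceiling: {ceiling:.2f} USD per hour")
```

With these numbers the ceiling works out to 0.24 USD per hour, so Spot capacity priced above that would not be used.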
Let’s Start with AWS Batch
To get started quickly with AWS Batch, use the AWS Batch first-run wizard. After you have completed the prerequisites, the wizard lets you create a compute environment, a job definition, and a job queue in a few steps.
To test your setup, you can also submit a sample “Hello World” job in the first-run wizard. If you already have a Docker image that you want to run in AWS Batch, you can create a job definition with that image.
Step 1: Prerequisites
Before you begin the AWS Batch first-run wizard, complete the following steps:
- Complete the procedures outlined in Setting Up with AWS Batch.
- Check that your AWS account has the necessary permissions.
Step 2: Create A Compute Environment
A compute environment is a reference to your Amazon EC2 instances. Its settings and constraints tell AWS Batch how to configure and automatically launch those instances.
To set up the compute environment, do the following:
- Open the AWS Batch console and start the first-run wizard.
- In the section Compute environment configuration:
- Enter a unique name for the compute environment.
- Choose a service role that has permission to call other AWS services on your behalf. If you don’t already have such a service role, one is created for you.
- In the Instance configuration section:
- For Provisioning model, select Fargate, Fargate Spot, On-demand, or Spot.
- If you chose Spot, for Maximum % on-demand price, enter the maximum percentage of the On-Demand price that you are willing to pay for Spot resources.
- For Minimum vCPUs, enter the minimum number of vCPUs that the compute environment maintains.
- For Maximum vCPUs, enter the maximum number of vCPUs that the compute environment can scale out to.
- For Desired vCPUs, enter the number of vCPUs that the compute environment launches with.
- For Allowed instance types, select the instance types that the compute environment can launch.
- For Allocation strategy, choose BEST_FIT_PROGRESSIVE for On-Demand or SPOT_CAPACITY_OPTIMIZED for Spot.
- In the Networking section, for VPC ID, choose an Amazon VPC.
- Under Subnets, the subnets for your AWS account are listed by default. To use a custom set of subnets, choose Clear subnets and then select the subnets you want.
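The same settings can be expressed as an API request instead of console fields. The sketch below only builds the request dictionary; the parameter names follow the AWS Batch CreateComputeEnvironment API, while the helper name, the resource names and IDs, and the vCPU sanity check are assumptions made for this example. Security groups and an instance role are not mentioned in the wizard steps above but are included because EC2 compute resources need them in the API. Fargate is left out to keep the sketch short.

```python
# Allocation strategy per provisioning model, as recommended in the steps above.
ALLOCATION_STRATEGY = {
    "EC2": "BEST_FIT_PROGRESSIVE",       # On-Demand
    "SPOT": "SPOT_CAPACITY_OPTIMIZED",
}


def build_compute_environment_request(
    name, service_role, subnets, security_group_ids, instance_role,
    provisioning_model="EC2", min_vcpus=0, desired_vcpus=0, max_vcpus=16,
    instance_types=("optimal",), bid_percentage=None,
):
    """Build keyword arguments for a managed compute environment (hypothetical helper)."""
    if provisioning_model not in ALLOCATION_STRATEGY:
        raise ValueError("this sketch only covers EC2 (On-Demand) and SPOT")
    # Sanity check assumed for this sketch: desired lies between min and max.
    if not 0 <= min_vcpus <= desired_vcpus <= max_vcpus:
        raise ValueError("expected 0 <= min <= desired <= max vCPUs")
    compute_resources = {
        "type": provisioning_model,
        "allocationStrategy": ALLOCATION_STRATEGY[provisioning_model],
        "minvCpus": min_vcpus,
        "desiredvCpus": desired_vcpus,
        "maxvCpus": max_vcpus,
        "instanceTypes": list(instance_types),
        "subnets": list(subnets),
        "securityGroupIds": list(security_group_ids),
        "instanceRole": instance_role,
    }
    if provisioning_model == "SPOT" and bid_percentage is not None:
        compute_resources["bidPercentage"] = bid_percentage
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "serviceRole": service_role,
        "computeResources": compute_resources,
    }


# All names and IDs below are placeholders.
ce_request = build_compute_environment_request(
    name="example-spot-env",
    service_role="arn:aws:iam::123456789012:role/example-batch-service-role",
    subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
    security_group_ids=["sg-cccc3333"],
    instance_role="example-ecs-instance-role",
    provisioning_model="SPOT",
    max_vcpus=64,
    bid_percentage=60,
)
print(ce_request["computeResources"]["allocationStrategy"])
```

With boto3 installed and credentials configured, a request like this could be passed as boto3.client("batch").create_compute_environment(**ce_request).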
Step 3: Create A Job Queue
A job queue stores your submitted jobs until the AWS Batch Scheduler runs them on a resource in your compute environment. To create a job queue, follow these steps:
- In the Job queue configuration section, enter a unique name for the job queue.
- For Priority, enter an integer between 0 and 100 for the job queue.
- The AWS Batch Scheduler gives higher priority to job queues with higher integer values.
- If you want to add a scheduling policy to the job queue:
- Turn on Scheduling policy ARN.
- Select the desired Amazon Resource Name (ARN).
- (Optional) Expand Additional configuration.
- For State, select a status for the job queue.
- Choose Next.
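The queue settings above can be sketched as request dictionaries in the shape of the AWS Batch CreateJobQueue API. The helper name, queue names, and compute environment names are hypothetical, and the 0 to 100 range check simply mirrors the wizard text above. The final lines sort two queues by priority to show which one the scheduler would favour.

```python
def build_job_queue_request(name, priority, compute_environments,
                            state="ENABLED", scheduling_policy_arn=None):
    """Build keyword arguments for a job queue (hypothetical helper)."""
    if not 0 <= priority <= 100:
        raise ValueError("priority must be an integer between 0 and 100")
    if state not in ("ENABLED", "DISABLED"):
        raise ValueError("state must be ENABLED or DISABLED")
    request = {
        "jobQueueName": name,
        "state": state,
        "priority": priority,
        # Compute environments are tried in ascending 'order'.
        "computeEnvironmentOrder": [
            {"order": position, "computeEnvironment": environment}
            for position, environment in enumerate(compute_environments, start=1)
        ],
    }
    if scheduling_policy_arn is not None:
        request["schedulingPolicyArn"] = scheduling_policy_arn
    return request


# Placeholder queue and compute environment names.
low = build_job_queue_request("low-cost", 1, ["example-spot-env"])
high = build_job_queue_request(
    "time-critical", 50, ["example-on-demand-env", "example-spot-env"]
)

# Higher integer values mean higher priority, so sort in descending order.
by_priority = sorted([low, high], key=lambda q: q["priority"], reverse=True)
print([q["jobQueueName"] for q in by_priority])
```

Either dictionary could be passed to boto3.client("batch").create_job_queue(**request) once the referenced compute environments exist.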
Step 4: Create The Job Definition
In the General settings section:
- Enter a unique job definition name in the Name field.
- For Execution timeout, enter the length of time, in seconds, after which an unfinished job is terminated.
- Configure the Additional tags.
- A tag is a label that is attached to a resource. Select Add tag to add a tag. Enter a key-value pair, then choose Add tag once more.
- To propagate tags to the Amazon Elastic Container Service task, turn on Propagate tags.
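For reference, the general settings above map onto a job definition request roughly as follows. This is a minimal sketch: the parameter names follow the AWS Batch RegisterJobDefinition API, while the helper name, the definition name, the busybox image, the command, and the default resource values are placeholders chosen for the example.

```python
def build_job_definition_request(name, image, command, timeout_seconds=None,
                                 tags=None, propagate_tags=False,
                                 vcpus=1, memory_mib=2048):
    """Build keyword arguments for a container job definition (hypothetical helper)."""
    if timeout_seconds is not None and timeout_seconds < 60:
        raise ValueError("the execution timeout must be at least 60 seconds")
    request = {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": image,
            "command": list(command),
            # The API expects resource values as strings.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
        "propagateTags": propagate_tags,
    }
    if timeout_seconds is not None:
        request["timeout"] = {"attemptDurationSeconds": timeout_seconds}
    if tags:
        request["tags"] = dict(tags)
    return request


definition_request = build_job_definition_request(
    name="example-hello-world",
    image="busybox",
    command=["echo", "hello world"],
    timeout_seconds=120,
    tags={"project": "demo"},
    propagate_tags=True,
)
print(definition_request["timeout"])
```

A dictionary like this could be passed to boto3.client("batch").register_job_definition(**definition_request).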
Step 5: Create A Job
Jobs are the unit of work that AWS Batch runs. They run as containerized applications on Amazon Elastic Container Service (Amazon ECS) container instances in an Amazon ECS cluster.
In the General settings section:
- Enter a unique name for Name.
- For Execution timeout, enter the length of time, in seconds, after which an unfinished job is terminated. The minimum timeout is 60 seconds.
- To distribute the work across different hosts, turn on Array jobs. Then, for Array size, enter the number of child jobs in the array.
- If the job depends on other jobs, turn on Job dependencies. Then enter the Job ID of each job it depends on and choose Add.
- Configure the Additional tags. To add a tag, select Add tag from the Tags menu. Enter a tag key and an optional value, then choose Add tag again.
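The job settings above can also be written as a SubmitJob request. In this sketch the parameter names follow the AWS Batch SubmitJob API, while the helper name, job name, queue and definition names, and the job ID are placeholders; the array size check assumes the documented range of 2 to 10,000 child jobs.

```python
def build_submit_job_request(name, job_queue, job_definition,
                             timeout_seconds=None, array_size=None,
                             depends_on_job_ids=(), tags=None):
    """Build keyword arguments for submitting a job (hypothetical helper)."""
    if timeout_seconds is not None and timeout_seconds < 60:
        raise ValueError("the execution timeout must be at least 60 seconds")
    if array_size is not None and not 2 <= array_size <= 10000:
        raise ValueError("array size must be between 2 and 10,000")
    request = {
        "jobName": name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
    }
    if timeout_seconds is not None:
        request["timeout"] = {"attemptDurationSeconds": timeout_seconds}
    if array_size is not None:
        # An array job fans out into 'size' child jobs.
        request["arrayProperties"] = {"size": array_size}
    if depends_on_job_ids:
        # The job starts only after the listed jobs have completed.
        request["dependsOn"] = [{"jobId": job_id} for job_id in depends_on_job_ids]
    if tags:
        request["tags"] = dict(tags)
    return request


job_request = build_submit_job_request(
    name="example-array-job",
    job_queue="time-critical",
    job_definition="example-hello-world",
    timeout_seconds=300,
    array_size=10,
    depends_on_job_ids=["00000000-0000-0000-0000-000000000000"],  # placeholder ID
)
print(job_request["arrayProperties"], job_request["dependsOn"])
```

A dictionary like this could be passed to boto3.client("batch").submit_job(**job_request); the response includes the new job's ID, which later jobs can list as a dependency.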
Step 6: Review and Create
On the Review and create page, review the configuration from the previous steps. Choose Edit if you need to make changes. When you’re satisfied with the settings, choose Create.