Auto scaling Java REST APIs using Amazon ECS with Fargate

Learn how to use the ECS service with the Fargate serverless compute engine to run Dockerized Java REST APIs and scale them automatically.

In this blog post, the following AWS services are used:

  • Elastic Container Service (ECS)

  • Elastic Container Registry (ECR)

  • Elastic Load Balancing (ELB)

  • CloudWatch

For auto scaling Java REST APIs using the EC2 Auto Scaling service, see the previous blog post. Some of the auto scaling and application load balancing concepts covered there also apply to the ECS with Fargate service, and thus will not be repeated here.

ECS with Fargate Service

When using ECS, you need to choose between two types of compute engines:

  • EC2

  • Fargate

Fargate is a serverless compute engine, and thus there are no virtual server machines (EC2 instances) to manage when running your web applications and (micro) services.

In ECS you'll need to create a cluster. A cluster contains one or more services. Each service represents a task definition, and allows additional features such as auto scaling to be applied to that task definition. A service can also make use of an application load balancer, which routes traffic to the multiple running tasks that are part of the service. A task definition is the template of a task. It contains one or more container definitions for the Docker images to run in Docker containers when a task is started. Related container definitions can be grouped together in one task definition, for which a service with auto scaling enabled is created to allow multiple tasks to run.

Within the ECS cluster, it is also possible to run tasks without creating a service first. However, auto scaling, among other features, cannot be applied to such a task.

See below for a class model that represents the relationships among the entities described above.
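The same relationships can also be sketched in plain Java. Note that these are illustrative types of my own, not AWS SDK classes; the names mirror the ECS terminology used above.

```java
import java.util.List;

// Illustrative class model of the ECS entities described above.
// These are plain Java records for explanation, not AWS SDK types.

// A container definition points at a Docker image to run in a container.
record ContainerDefinition(String name, String imageUri, int containerPort) {}

// A task definition is the template of a task: it groups one or more
// related container definitions.
record TaskDefinition(String family, List<ContainerDefinition> containerDefinitions) {}

// A service wraps one task definition and adds features such as auto
// scaling and load balancing; it keeps a number of tasks running.
record EcsService(String name, TaskDefinition taskDefinition, int desiredCount) {}

// A cluster contains one or more services. Standalone tasks can also be
// run in a cluster directly, but then auto scaling does not apply to them.
record Cluster(String name, List<EcsService> services) {}

public class EcsModel {
    public static void main(String[] args) {
        var container = new ContainerDefinition("apollo-missions-api", "<image URI from ECR>", 8080);
        var taskDef = new TaskDefinition("apollo-missions-api", List.of(container));
        var service = new EcsService("apollo-missions-api", taskDef, 1);
        var cluster = new Cluster("sandbox", List.of(service));
        System.out.println(cluster.services().get(0).taskDefinition().family());
    }
}
```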

ECS with Fargate sandbox environment

Setting up the sandbox environment

Components and Services

The sandbox environment consists of the following components and services:

  • Apollo Missions API

    • Apache JMeter is used for load testing purposes.

  • Elastic Container Service (ECS)

    • Cluster

    • Task Definition

    • Service

  • Elastic Container Registry (ECR)

    • Repository

    • Docker Image

  • Elastic Load Balancing (ELB)

    • Target Group

    • Load Balancer

  • CloudWatch

Apollo Missions API

Apollo Missions API is a simple REST API written in Java using the Quarkus framework. This Java application runs in a Docker container as part of a task launched by ECS. Later on, I will describe how to create a Docker image and upload it to ECR so it can be used by ECS.

The application consists of the following endpoints:

  • /missions/manned

  • /missions/manned/{missionId}

  • /longComputation

  • /health

The first two endpoints provide some basic data regarding the manned Apollo missions. The /longComputation endpoint is used by Apache JMeter for load testing purposes. Creating load on the ECS service will trigger an alarm in CloudWatch and cause a scaling policy to take a scale out action. The /health endpoint is used by the ELB load balancer to monitor the health of the Java application (whether it is running and successfully accepting and processing HTTP requests).
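To make the role of the two operational endpoints concrete, here is a minimal stand-in. The real application is built with Quarkus; this sketch instead uses the JDK's built-in HttpServer, and the computation loop and response bodies are made up for illustration only.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Minimal stand-in for the /health and /longComputation endpoints.
public class ApolloEndpointsSketch {

    // The health endpoint just returns a small 200 response; the ELB
    // target group marks the task unhealthy when this stops responding.
    static String health() {
        return "UP";
    }

    // Simulates a CPU-heavy request handler: this is what lets JMeter
    // drive up the CPUUtilization metric and trigger a scale out.
    static long longComputation(int rounds) {
        long acc = 0;
        for (int i = 0; i < rounds; i++) {
            acc += (long) Math.sqrt(i);
        }
        return acc;
    }

    public static void main(String[] args) throws Exception {
        // Port 0 picks a free port; the real app listens on 8080.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/health", exchange -> {
            byte[] body = health().getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.createContext("/longComputation", exchange -> {
            byte[] body = Long.toString(longComputation(2_000_000)).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();

        // Self-request to demonstrate the health check, then shut down.
        int port = server.getAddress().getPort();
        try (InputStream in = URI.create("http://localhost:" + port + "/health").toURL().openStream()) {
            System.out.println("/health -> " + new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
        server.stop(0);
    }
}
```

Running the sketch prints `/health -> UP`, which is the kind of response the target group's health check expects.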

Elastic Container Service (ECS)

A cluster is created which will run the sandbox components. Furthermore, a task definition is created from which tasks can be launched to run the REST API microservice in a Docker container. Finally, a service is created in ECS to run the tasks and enable auto scaling of the REST API microservice.

Elastic Container Registry (ECR)

A repository is created to which a Docker image of the REST API microservice is uploaded. This Docker image will be referenced in the task definition in ECS.

Elastic Load Balancing (ELB)

A target group is created with health checks. The /health endpoint of the Java application is used by the health checks. The target group allows the created application load balancer to route the HTTP traffic to the ECS tasks launched by the ECS Auto Scaling service.


CloudWatch

You can use this service to see which alarms went off, and to view ECS and ELB statistics such as CPU utilization and request count.


The setup in AWS is done using the AWS Management Console. The values mentioned here regarding Availability Zones (AZs) are for the Europe (Frankfurt) eu-central-1 region. You can of course use a different region but then you will need to provide the values for the default public subnets and AZs of your chosen region.

Docker Image in ECR

Create a Docker image of the REST API microservice. Within the Apollo Missions API code repository, look for the section 'Run Quarkus in JVM mode in a docker container'.

Now, create a repository in ECR (e.g. isaacdeveloperblog) and upload version 1.0.0 of the REST API to the ECR repository. Once uploaded, click on the image and copy the Image URI. You will need this URI for your task definition in ECS.

ECS Cluster

Navigate to ECS and click on Clusters. In the Clusters section, click on the Create Cluster button. Choose Networking only, which is powered by AWS Fargate.

Provide the following values.

Cluster name

Create VPC
Tick the box ‘Create a new VPC for this cluster’.

Leave the default values for CIDR block, Subnet 1 and Subnet 2 as is.

Now click on the Create button and a new ECS cluster will be created for the sandbox environment. It may take a couple of minutes before all resources within the cluster are created.

Task Definition

Go to the Task Definitions section, and click on the Create new Task Definition button. For the launch type select Fargate.

Provide the following values, and create the task definition.

Task Definition Name

Task Role
None (default)

Network Mode
awsvpc (default)

Task execution role
ecsTaskExecutionRole (default)

Task memory (GB)

Task CPU (vCPU)
1 vCPU

Container Definitions - Container Name

Container Definitions - Image
Paste here your uploaded Image URI.


Container Definitions - Port mapping

Container port: 8080

Protocol: tcp

Elastic Load Balancing

Navigate to the EC2 service and click on the Load Balancers section. Click on the Create Load Balancer button, and select the Application Load Balancer.

Now create an ELB application load balancer with the values below, which either need to be specified or differ from the defaults.

Step 1: Configure Load Balancer


Listeners – Load Balancer Protocol
HTTP: 80

Select the VPC which belongs to the ECS cluster.

Availability Zones
Select all available AZs:



Step 2: Configure Security Settings

Step 3: Configure Security Groups

Create a new security group

  • Security group name: apollo-missions-api-lb-sg

  • Type: HTTP

  • Protocol: TCP

  • Port Range: 80

  • Source: Anywhere

Step 4: Configure Routing

Create a new target group
New target group

  • Name: apollo-missions-api-tg

  • Target type: IP

  • Protocol: HTTP

  • Port: 8080

Health checks

  • Protocol: HTTP

  • Path: /health


ECS Service

Go to the Services tab within the newly created ECS cluster, and click on the Create button.

Provide the following values, and create the service.

Step 1: Configure service

Launch type

Task Definition
Family: apollo-missions-api
Revision: 1 (latest)

Platform version
LATEST (default)


Service name

Service type
REPLICA (default)

Number of tasks

Minimum healthy percent
100 (default)

Maximum percent
200 (default)

Deployment type
Rolling update (default)

Step 2: Configure network

Cluster VPC
Select the VPC which belongs to the ECS cluster.

Select all available subnets of the VPC.

Configure Security Groups
Create a new security group

  • Security group name: apollo-missions-api-ecs-sg

  • Type: Custom TCP

  • Protocol: TCP

  • Port Range: 8080

  • Source: Anywhere

Auto-assign public IP
ENABLED (default)

Health check grace period

Load balancing
Application Load Balancer
Load balancer name: apollo-missions-api-lb

Container to load balance
Container name : port: apollo-missions-api:8080:8080
Click on the Add to load balancer button.

  • Production listener port: 80:HTTP

  • Target group name: apollo-missions-api-tg

Step 3: Set Auto Scaling

Service Auto Scaling
Choose ‘Configure Service Auto Scaling to adjust your service’s desired count’.

Minimum number of tasks

Desired number of tasks

Maximum number of tasks

IAM role for Service Auto Scaling
ecsAutoscaleRole (default)

Scaling policy type
Target tracking

Policy name

ECS service metric

Target value

Scale-out cooldown period
60 seconds

Scale-in cooldown period
60 seconds
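As a rough mental model of the target tracking policy configured above: capacity is adjusted approximately proportionally so that the metric moves back toward the target value, clamped to the configured minimum and maximum task counts. The sketch below is my simplification of the ECS Auto Scaling behaviour, for illustration only.

```java
// Approximate behaviour of a target tracking scaling policy on average
// CPU utilization. This is an illustrative simplification, not the exact
// algorithm used by the ECS Auto Scaling service.
public class TargetTrackingSketch {

    static int desiredTasks(int currentTasks, double currentCpu, double targetCpu,
                            int minTasks, int maxTasks) {
        // Proportional adjustment: at 100% CPU with a 50% target the task
        // count roughly doubles; when CPU halves, the count can shrink.
        int proposed = (int) Math.ceil(currentTasks * currentCpu / targetCpu);
        // The result is always clamped to the configured min/max bounds.
        return Math.max(minTasks, Math.min(maxTasks, proposed));
    }

    public static void main(String[] args) {
        // 2 tasks running at 90% average CPU against a 50% target:
        System.out.println(desiredTasks(2, 90.0, 50.0, 2, 10)); // → 4 (scale out)
        // Load drops to 10% average CPU:
        System.out.println(desiredTasks(4, 10.0, 50.0, 2, 10)); // → 2 (scale in to min)
    }
}
```

The scale-out and scale-in cooldown periods above exist precisely to keep these proportional adjustments from firing in rapid succession.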

It may take some time before the service is created. When the service is created, a task will be started by the service.

Running the sandbox environment

If you’ve successfully set up the sandbox environment, you should be able to see the first ECS task launched by the ECS Auto Scaling service to meet the desired capacity of 1.

Wait until the last status becomes RUNNING.

Upon launching the ECS task, the ECS Auto Scaling service has registered the task with the Target Group so that the application load balancer can route the traffic to the task.

You can access the REST API of that task directly, or via the application load balancer.

  • ECS Task: http://<TASK network interface public IPv4 DNS>:8080/missions/manned

  • ELB: http://<ELB load balancer’s DNS name>/missions/manned

Now click on the apollo-missions-api service in ECS and open the Events tab. In the Events tab you can see that the service has started one task and reached a steady state.

Let's update the ECS service, and manually set the minimum and desired number of tasks to 2. Once done, verify in the Events tab that a new task has been created.

To trigger a scale out action by the ECS Auto Scaling service, you will need to put some extra load on the ECS service. You can use the provided JMeter project as part of the Apollo Missions API source code. The project file is located in the resources/jmeter folder. This project has been created with Apache JMeter version 5.2.1.

Once the project is opened in JMeter, provide the ELB load balancer’s DNS name for the field Server Name or IP of the /longComputation endpoint.

Hit the play button to send HTTP requests to the ELB load balancer. Ten concurrent users will continuously send HTTP requests for a period of 20 minutes.

Click on the Summary Report and watch the statistics.

In CloudWatch you can see that the two alarms created by the ECS Auto Scaling service are in the OK state.

After some time, the alarm for the condition ‘CPUUtilization > 50 for 3 datapoints within 3 minutes’ will go off, and this event will trigger a scale out action in ECS.
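The alarm condition above can be sketched as follows: assuming one datapoint per minute, the alarm fires once the three most recent datapoints all breach the threshold. This is my simplified model of CloudWatch's evaluation, for illustration only.

```java
import java.util.List;

// Sketch of evaluating the CloudWatch alarm condition
// 'CPUUtilization > 50 for 3 datapoints within 3 minutes',
// assuming one datapoint per minute. Simplified for illustration.
public class CpuAlarmSketch {

    static boolean alarmFires(List<Double> recentDatapoints, double threshold, int datapointsToAlarm) {
        int n = recentDatapoints.size();
        if (n < datapointsToAlarm) return false;
        // The alarm fires only when the most recent 'datapointsToAlarm'
        // datapoints are all above the threshold.
        return recentDatapoints.subList(n - datapointsToAlarm, n)
                .stream()
                .allMatch(dp -> dp > threshold);
    }

    public static void main(String[] args) {
        // Load ramping up under the JMeter test: the last 3 minutes are all > 50.
        System.out.println(alarmFires(List.of(12.0, 48.0, 71.0, 83.0, 90.0), 50.0, 3)); // → true
        // A single spike is not enough to trigger a scale out.
        System.out.println(alarmFires(List.of(12.0, 80.0, 30.0), 50.0, 3)); // → false
    }
}
```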

In the ECS service, in the Events tab, we can see new log entries mentioning the scale out action.

If we now look in the Tasks tab, we can see that two additional tasks have been launched.

The number of ECS tasks increases gradually over time to meet the desired capacity dynamically adjusted by the ECS Auto Scaling service. Once we stop the JMeter tests, the CPU utilization will drop drastically, and multiple scale in actions will take place to decrease the number of running ECS tasks until only the minimum number of tasks is left running.

This blog was written by the specialists at ISAAC.

ISAAC has since become part of iO. Want to know more? Feel free to get in touch!
