Saturday 7 March 2020

API Scaling, Rate Limiting and API Testing Terminology

Client id and Client secret


When you register your app, you will receive a client ID and optionally a client secret.

The client ID is considered public information.
The client secret must be kept confidential.

Client ID: Used to identify the application.
Say you are building an app that needs to access the Google Maps APIs. You register the app with Google, and Google gives you a client ID, which identifies the client, in this case your app. The client ID is publicly available.

Client Secret: This is the true secret key. It is stored securely on the server side and is never exposed to the public.
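
To make the split concrete, here is a minimal sketch of an OAuth 2.0 client credentials request, where the app exchanges its client ID and secret for an access token. The token endpoint and credentials below are hypothetical placeholders, and a real client would URL-encode the parameters:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ClientCredentialsDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical token endpoint and credentials, for illustration only
        String tokenUrl = "https://auth.example.com/oauth2/token";
        String clientId = "my-app-client-id";           // public: identifies the app
        String clientSecret = "my-confidential-secret"; // confidential: server-side only

        String form = "grant_type=client_credentials"
                + "&client_id=" + clientId
                + "&client_secret=" + clientSecret;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(tokenUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // On success the authorization server returns a JSON body containing the access token
        System.out.println(response.body());
    }
}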


Apdex (Application Performance Index)

How Apdex scores are calculated

The equation Apdex uses to find a responsiveness score is the number of satisfied samples, plus half the tolerating samples, divided by the total number of samples.

Apdex Score:

Equation:
Apdex = (Satisfied + Tolerating/2) / Total samples

For example, a system with 100 total samples has 70 satisfied samples, meaning the application responded within the target threshold T for those requests. There are 20 samples with slower responses, but still within the tolerable range, so they count as tolerating.

The calculation is (70 + 20/2) / 100 = (70 + 10) / 100 = 0.8.

The 0.8 score falls between 0 and 1. An excellent score falls between 0.94 and 1.00, a good score between 0.85 and 0.93, a fair score between 0.70 and 0.84, and a poor one between 0.49 and 0.69. Anything lower is unacceptable.
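
The formula is easy to sanity-check in code. A minimal Java sketch using the numbers from the example above:

public class Apdex {

    // Apdex = (satisfied + tolerating / 2) / total samples
    static double score(int satisfied, int tolerating, int totalSamples) {
        return (satisfied + tolerating / 2.0) / totalSamples;
    }

    public static void main(String[] args) {
        // 70 satisfied, 20 tolerating, 100 total samples, as in the example
        System.out.println(score(70, 20, 100)); // prints 0.8
    }
}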

In the JMeter dashboard, successes are shown in the OK column and failures in the KO column.

KO means "not OK", i.e. failed.

410 Gone
The target resource is no longer available at the origin server, and this condition is likely to be permanent.

If the origin server does not know, or has no way to determine, whether the condition is permanent, the status code 404 Not Found should be used instead.
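
Since this blog also covers RestAssured, here is a minimal sketch of asserting this behavior in an API test. The base URI and endpoint are hypothetical:

import static io.restassured.RestAssured.given;

import org.testng.annotations.Test;

public class GoneStatusTest {

    @Test
    public void deletedResourceReturnsGone() {
        given()
            .baseUri("https://api.example.com")     // hypothetical API
        .when()
            .get("/v1/articles/legacy-article")     // a permanently removed resource
        .then()
            .statusCode(410);                       // expect 404 instead when permanence is unknown
    }
}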



Transactions per second (TPS): the unit in which throughput is usually measured.

Throughput — how many transactions can be completed in a given amount of time, usually measured in transactions per second, or TPS

Latency — how long each individual transaction takes, usually measured as the average or a percentile of the latency for a sample number of transactions
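
To make the two metrics concrete, here is a minimal sketch that times a batch of transactions and reports throughput (TPS) plus a latency percentile. doTransaction() is a stand-in for a real API call:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ThroughputLatencyDemo {
    public static void main(String[] args) {
        int transactions = 1000;
        List<Long> latenciesMs = new ArrayList<>();

        long start = System.nanoTime();
        for (int i = 0; i < transactions; i++) {
            long t0 = System.nanoTime();
            doTransaction();                                  // stand-in for a real API call
            latenciesMs.add((System.nanoTime() - t0) / 1_000_000);
        }
        double elapsedSec = (System.nanoTime() - start) / 1_000_000_000.0;

        Collections.sort(latenciesMs);
        double tps = transactions / elapsedSec;               // throughput
        long p95 = latenciesMs.get((int) (transactions * 0.95) - 1); // latency percentile

        System.out.printf("Throughput: %.1f TPS, p95 latency: %d ms%n", tps, p95);
    }

    static void doTransaction() {
        try { Thread.sleep(1); } catch (InterruptedException ignored) { }
    }
}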

What Limits Scaling

Disk, Network, CPU, Memory, Database, App Server, API Proxy, Load Balancer, Cache Servers


What are some limits

Seek time, rotational speed, transfer speed, clock speed, number of cores, amount of RAM, database design and tuning, app server coding and configuration, load balancer policies, cache configuration.

How to Test based on TPS

At 10 transactions per second

That is 864,000 requests per day, or about 25 million per month.
Most infrastructure can still handle this.

Areas of focus


What about the application server? 
Is the app well-designed enough? 
Does it make an excessive number of database calls?

At 100 TPS

At 100 transactions per second, i.e. 8.6 million requests per day and about 259 million per month:

RDBMS systems may struggle.

Less-efficient app servers may struggle.
“Free” tiers on hosting platforms aren’t an option.

Strategies for 100 TPS

Database optimization and tuning are critical here:
Allocate fast storage, and lots of it.
Allocate lots of memory.
Tune the database to use it!
Find bad queries and fix them or optimize them.

App server tuning is critical here:
Are there enough threads in the thread pool?
Are there enough processes?
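
On the app server side, a bounded, explicitly sized thread pool is one concrete knob. A minimal Java sketch; the pool and queue sizes here are illustrative assumptions that should really come from measured load:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AppServerPool {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                50,                                   // core threads kept alive
                200,                                  // max threads under load
                60, TimeUnit.SECONDS,                 // keep-alive for idle extra threads
                new ArrayBlockingQueue<>(1_000),      // bounded queue: don't pile up work forever
                new ThreadPoolExecutor.AbortPolicy()  // reject when saturated, so callers can return 429/503
        );

        pool.submit(() -> System.out.println("handling a request"));
        pool.shutdown();
    }
}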

At 1000 transactions per second
86 million per day / 2.5 billion per month
Now everything may start to break…

Focus Area
What is the mix between reads and writes?

Strategies for 1000 TPS

Cache the reads as much as you can. If you can cache them closer to the client, even better.

Understand your app server performance:
Faster app servers (like Java) should still be able to handle this; RoR, Python, PHP, etc. will require much bigger clusters.

Stateless app servers are your friend!

More Strategies for 1000 TPS

Scale the database layer:
Sharded RDBMSes,
or a scalable NoSQL database, work here.

At 10,000 TPS

No single database can handle this.
If API calls are large, what will the bandwidth be?

Database writes are problematic:
No single database server can write 10,000 times per second.
Scalable, eventually-consistent databases (like Cassandra) can scale this big.

App Server:
You will need a cluster of app servers.



Note: Every API call has overhead.

TCP connection, SSL handshake, load balancer CPU, API proxy CPU, app server CPU and thread pool, database connections, disk I/O.

How to improve Latency
Latency kills user experience!

How can the API server reduce it?

Remove steps in the processing flow through caching, and cache closer to the API clients.

Always keep the cache as close to the clients as possible.
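
A minimal sketch of the idea, assuming a simple in-process cache with a time-to-live; a real deployment would often push the cache even closer to clients with a CDN or a dedicated cache tier:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TtlCache<K, V> {

    // A cached value plus the time at which it stops being valid
    private record Entry<V>(V value, long expiresAtMs) { }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMs;

    public TtlCache(long ttlMs) { this.ttlMs = ttlMs; }

    public void put(K key, V value) {
        store.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMs));
    }

    // Returns null on a miss or after expiry, forcing a re-fetch upstream
    public V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null || System.currentTimeMillis() > e.expiresAtMs()) {
            store.remove(key);
            return null;
        }
        return e.value();
    }
}

A request handler would consult the cache first and fall through to the database or upstream service only on a miss, removing those steps from the hot path.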



How to Add Security in Architecture
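
The original post illustrated this section with a diagram. The idea, consistent with the layers listed earlier (load balancer, API proxy, app server), is to authenticate at the edge so unauthenticated traffic never reaches the app servers or database. A minimal, hypothetical sketch using the JDK's built-in HTTP server, with a simple bearer-token presence check standing in for real token validation:

import com.sun.net.httpserver.HttpServer;

import java.net.InetSocketAddress;

public class SecuredApiServer {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

        server.createContext("/api", exchange -> {
            String auth = exchange.getRequestHeaders().getFirst("Authorization");
            if (auth == null || !auth.startsWith("Bearer ")) {
                // Reject unauthenticated traffic before any business logic runs
                exchange.sendResponseHeaders(401, -1);
                exchange.close();
                return;
            }
            // In a real system the token would be validated (signature, expiry, scopes)
            byte[] body = "{\"status\":\"ok\"}".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });

        server.start();
    }
}
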
How to Limit Access to (Rate-Limit) Your APIs

In the world of APIs, nobody gives unrestricted direct access to their resources, because you never know how heavily your services are going to be used.

If you start thinking about limiting access to your APIs, a lot of things come to mind.


So what are API throttling, API quota, API rate limiting and API burst?

Three Methods Of Implementing API Rate-Limiting

Request Queues

Example: Amazon Simple Queue Service (SQS)

Amazon's Simple Queue Service (SQS) is a ready-made, fully managed request queue that is well suited to request and messaging queues. The service is maintained by Amazon, so you won't have to constantly debug your own hardware or software to make it work.
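
A minimal sketch of enqueuing an incoming request with the AWS SDK for Java v2. The queue URL is a hypothetical placeholder, and credentials and region are assumed to come from the standard AWS configuration chain:

import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

public class RequestQueueDemo {
    public static void main(String[] args) {
        // Hypothetical queue URL for illustration
        String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/api-requests";

        try (SqsClient sqs = SqsClient.create()) {
            // Enqueue the request so workers can drain it at a sustainable rate
            sqs.sendMessage(SendMessageRequest.builder()
                    .queueUrl(queueUrl)
                    .messageBody("{\"endpoint\":\"/orders\",\"payload\":\"...\"}")
                    .build());
        }
    }
}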

Throttling


Throttling is another common way to practically implement rate-limiting.
When the throttle is triggered, a user may either be disconnected or simply have their bandwidth reduced.

Rate-limiting Algorithms
Algorithms are another way to create scalable rate-limited APIs. As with request queue libraries and throttling services, there are many rate-limiting algorithms already available.

Leaky Bucket
The leaky bucket algorithm is a simple, easy-to-implement rate-limiting solution. It translates requests into a First In First Out (FIFO) format, processing the items on the queue at a regular rate.


The leaky bucket smooths out bursts of traffic and is easy to implement on a single server or load balancer. It's also small and memory-efficient, due to the limited queue size.
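
A minimal single-server sketch: requests enter a bounded FIFO queue (the bucket) and a scheduler drains them at a fixed rate, however bursty the arrivals. The capacity and rate are illustrative:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LeakyBucket {

    // Bounded FIFO queue: requests beyond capacity "overflow" and are rejected
    private final BlockingQueue<Runnable> bucket = new ArrayBlockingQueue<>(100);

    public LeakyBucket(int leaksPerSecond) {
        // Drain (leak) at a constant rate regardless of arrival bursts
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            Runnable request = bucket.poll();
            if (request != null) request.run();
        }, 0, 1000 / leaksPerSecond, TimeUnit.MILLISECONDS);
    }

    // Returns false when the bucket is full, i.e. the caller should answer 429
    public boolean tryAccept(Runnable request) {
        return bucket.offer(request);
    }
}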


Fixed Window

The window is defined for a set number of seconds, e.g. 3600 for one hour. Each incoming request increments a counter for the current window; if the counter exceeds the limit before the window resets, the additional requests are discarded.
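
A minimal sketch of a fixed-window counter. In production the counter usually lives in a shared store (such as Redis) so every server sees the same window:

public class FixedWindowLimiter {

    private final int limit;         // max requests per window
    private final long windowMillis; // window length, e.g. 3_600_000 for one hour
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean allow() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // new window: reset the counter
            count = 0;
        }
        // Requests beyond the limit in the current window are discarded
        return ++count <= limit;
    }
}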


Terminology

Spike Arrest Policy: This will smooth out the flow of incoming requests, so it prevents too many requests per second from reaching your servers. You can specify an identifier to group requests.

Quota policy: This will limit the total number of requests within a given period per some identifier.


API Burst:

What is the burst?

The burst limit is quite simply the maximum number of concurrent requests that the API gateway will serve at any given point. So it is your maximum concurrency for the API.

How do the Rate and Burst Throttle work together?
The Burst setting and Rate setting work together to control how many requests can be processed by your API.

Let's assume you set the throttle to Rate = 100 (requests per second) and Burst = 50 (requests). With those settings, if 100 concurrent requests are sent at the exact same millisecond, only 50 would be processed due to the burst setting; the remaining 50 requests would get a 429 Too Many Requests response.
Assuming the first 50 requests completed in 100ms each, your client could then retry the remaining 50 requests.
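
This rate-plus-burst behavior is essentially a token bucket: the bucket holds up to Burst tokens, refills at Rate tokens per second, and each request spends one token. A minimal illustrative sketch (not the gateway's actual implementation):

public class TokenBucket {

    private final double ratePerSecond; // steady refill rate (the "Rate")
    private final double burst;         // bucket capacity (the "Burst")
    private double tokens;
    private long lastRefill = System.nanoTime();

    public TokenBucket(double ratePerSecond, double burst) {
        this.ratePerSecond = ratePerSecond;
        this.burst = burst;
        this.tokens = burst; // start full: up to `burst` requests may pass at once
    }

    public synchronized boolean allow() {
        long now = System.nanoTime();
        // Refill proportionally to elapsed time, capped at the burst capacity
        tokens = Math.min(burst, tokens + (now - lastRefill) / 1e9 * ratePerSecond);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1; // spend one token for this request
            return true;
        }
        return false;    // out of tokens: respond with 429 Too Many Requests
    }
}

With new TokenBucket(100, 50), 100 simultaneous requests would see only 50 pass immediately, matching the example above.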
