The Elasti project is designed to enable serverless capability for Kubernetes services by dynamically scaling services based on incoming requests. It comprises two main components: operator and resolver. The elasti-operator manages the scaling of target services, while the resolver intercepts and queues requests when the target service is scaled down to zero replicas.
- Operator: A Kubernetes controller built using kubebuilder. It monitors ElastiService resources and scales target services as needed.
- Resolver: A service that intercepts incoming requests for scaled-down services, queues them, and notifies the elasti-operator to scale up the target service.
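To make the flow below concrete, a minimal `ElastiService` resource might look like the sketch that follows. The `apiVersion` group, field names, and values here are illustrative assumptions rather than the project's exact schema; consult the CRD definition shipped with the operator for the authoritative fields.

```yaml
# Illustrative sketch of an ElastiService resource; the apiVersion group
# and field names are assumptions, not the project's exact schema.
apiVersion: elasti.truefoundry.com/v1alpha1
kind: ElastiService
metadata:
  name: target-elastiservice
  namespace: target-namespace
spec:
  # The public service whose traffic should be intercepted at zero replicas.
  service: target-service
  # The workload the operator scales up when traffic arrives.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: deployments
    name: target-deployment
```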
- [In Serve Mode] Traffic hits the gateway, is routed to the target service, then to the target pod, which resolves the request.
- [CRD Created] The Operator fetches details from the CRD.
  - Adds a finalizer to the CRD, ensuring it is only deleted by the Operator, for proper cleanup.
  - Fetches the `ScaleTargetRef` and initiates a watch on it.
  - Adds the CRD details to a `crdDirectory`, caching the details of all CRDs.
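The `crdDirectory` above is essentially a concurrency-safe cache keyed by service name. A minimal sketch in Go of what such a cache could look like; the type and field names here are illustrative, not the operator's actual structures:

```go
package main

import (
	"fmt"
	"sync"
)

// CRDDetails holds what the operator needs to act on a service later:
// the ElastiService name and its scale target. Fields are illustrative.
type CRDDetails struct {
	CRDName        string
	ScaleTargetRef string // e.g. "deployments/target-deployment"
}

// Directory is a concurrency-safe map from service name to CRD details,
// mirroring the role of crdDirectory in the operator.
type Directory struct {
	mu      sync.RWMutex
	entries map[string]CRDDetails
}

func NewDirectory() *Directory {
	return &Directory{entries: map[string]CRDDetails{}}
}

// Add caches (or overwrites) the details for a service.
func (d *Directory) Add(service string, details CRDDetails) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.entries[service] = details
}

// Get returns the cached details and whether the service is known.
func (d *Directory) Get(service string) (CRDDetails, bool) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	det, ok := d.entries[service]
	return det, ok
}

func main() {
	dir := NewDirectory()
	dir.Add("target-service", CRDDetails{
		CRDName:        "target-elastiservice",
		ScaleTargetRef: "deployments/target-deployment",
	})
	if det, ok := dir.Get("target-service"); ok {
		fmt.Println(det.ScaleTargetRef)
	}
}
```

The read/write lock lets many resolver-triggered lookups proceed in parallel while CRD create/update events take the exclusive lock briefly.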
- [ScaleTargetRef Watch] When a watch is added to the `ScaleTargetRef`:
  - Identifies the kind of the target and checks the available ready pods.
  - If `replicas == 0` -> switches to Proxy Mode.
  - If `replicas > 0` -> switches to Serve Mode.
  - Currently, only `deployments` and `rollouts` are supported.
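The replica check above boils down to a small decision function. A sketch under assumed names (`modeForReplicas` is not the operator's actual identifier):

```go
package main

import "fmt"

// Mode is the operating mode chosen for a target service.
type Mode string

const (
	ProxyMode Mode = "proxy" // no ready pods: queue traffic via the resolver
	ServeMode Mode = "serve" // pods available: route traffic directly
)

// modeForReplicas mirrors the decision described above: zero ready
// replicas means requests must be intercepted and queued; otherwise
// the target can serve traffic directly.
func modeForReplicas(replicas int) Mode {
	if replicas == 0 {
		return ProxyMode
	}
	return ServeMode
}

func main() {
	fmt.Println(modeForReplicas(0), modeForReplicas(3))
}
```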
- When pods scale to 0, whether via HPA, KEDA, or any other autoscaler:
- [Switch to Proxy Mode]
  - Creates a Private Service for the target service. This allows the resolver to reach the target pods even when the public service has been modified, as described in the following steps.
  - Creates a watch on the public service to monitor changes in ports or selectors.
  - Creates a new `EndpointSlice` for the public service to redirect any traffic to the resolver.
  - Creates a watch on the resolver to monitor the addition of new pods.
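The `EndpointSlice` created in this step associates the public service's name with the resolver's pod IPs, so kube-proxy routes traffic there instead of to the (absent) target pods. A sketch of what such a slice could look like; the name, namespace, port, and addresses are illustrative assumptions:

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: target-service-to-resolver
  namespace: target-namespace
  labels:
    # This label ties the slice to the public service.
    kubernetes.io/service-name: target-service
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 8012        # resolver port (illustrative)
endpoints:
  - addresses:
      - 10.0.0.25     # IP of a resolver pod (illustrative)
```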
- [In Proxy Mode]
  - Traffic reaching the target service, which now has no pods, is sent to the resolver, which can handle requests on all endpoints.
- [In Resolver]
  - Once traffic hits the resolver, it reaches the `handleAnyRequest` handler.
  - The host is extracted from the request. If it is a known host, its details are retrieved from the `hostManager` cache. If not, the service name is extracted from the host and saved in `hostManager`.
  - The service name is used to identify the private service.
  - Using `operatorRPC`, the controller is informed about the incoming request.
  - The request is sent to the `throttler`, which queues it and checks whether the pods for the private service are up.
    - If yes, a proxy request is made, and the response is sent back.
    - If no, the request is re-enqueued, and the check is retried after a configurable interval (set in the Helm values file).
  - If the request is successful, traffic for this host is temporarily disabled (configurable). This prevents new incoming requests to the resolver, as the target is now verified to be up.
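The queue-and-retry loop of the `throttler` can be sketched as below. `readyFn` and `proxyFn` are injected here to keep the sketch self-contained; in the real resolver, readiness comes from checking the private service's endpoints, and proxying forwards the original HTTP request. All names are illustrative:

```go
package main

import (
	"fmt"
	"time"
)

// Request stands in for a queued proxy request; Host is the service host.
type Request struct{ Host string }

// Throttler re-enqueues requests until the target's pods are ready,
// then proxies them and returns the response to the caller.
type Throttler struct {
	queue      chan Request
	retryAfter time.Duration       // maps to queueRetryDuration in Helm values
	ready      func(host string) bool
	proxy      func(Request)
}

// Run drains the queue until stop is closed.
func (t *Throttler) Run(stop <-chan struct{}) {
	for {
		select {
		case req := <-t.queue:
			if t.ready(req.Host) {
				// Pods are up: forward the request and send the response back.
				t.proxy(req)
			} else {
				// Pods not ready yet: re-enqueue after the configured interval.
				time.AfterFunc(t.retryAfter, func() { t.queue <- req })
			}
		case <-stop:
			return
		}
	}
}

func main() {
	proxied := make(chan string, 1)
	th := &Throttler{
		queue:      make(chan Request, 10),
		retryAfter: 10 * time.Millisecond,
		ready:      func(string) bool { return true },
		proxy:      func(r Request) { proxied <- r.Host },
	}
	stop := make(chan struct{})
	go th.Run(stop)
	th.queue <- Request{Host: "target-service"}
	fmt.Println("proxied:", <-proxied)
	close(stop)
}
```

A bounded channel gives the same back-pressure behavior as the `queueSize` setting: once the queue is full, further requests block or fail rather than exhausting memory.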
- [In Controller/Operator]
  - `ElastiServer` processes requests from the resolver, which identify the service experiencing traffic.
  - Matches the service with its `crdDirectory` entry to retrieve the `ScaleTargetRef`, which is then used to scale the target.
- When pods scale to 1, whether via Elasti, HPA, KEDA, or any other autoscaler:
- [Switch to Serve Mode]
  - The Operator stops the informer/watch on the resolver.
  - The Operator deletes the `EndpointSlice` pointing to the resolver.
  - The system switches to Serve Mode.
Values you can pass to the `elastiResolver` env:

```yaml
# HeaderForHost is the header to look for to get the host.
# X-Envoy-Decorator-Operation is the key for Istio.
headerForHost: X-Envoy-Decorator-Operation
# InitialCapacity is the initial capacity of the semaphore.
initialCapacity: "500"
maxIdleProxyConns: "100"
maxIdleProxyConnsPerHost: "500"
# MaxQueueConcurrency is the maximum number of concurrent requests.
maxQueueConcurrency: "100"
# OperatorRetryDuration is the duration for which we don't inform the operator
# about the traffic on the same host.
operatorRetryDuration: "10"
# QueueRetryDuration is the duration after which we retry the requests in the queue.
queueRetryDuration: "3"
# QueueSize is the size of the queue.
queueSize: "50000"
# ReqTimeout is the timeout for each request.
reqTimeout: "120"
# TrafficReEnableDuration is the duration for which traffic is disabled for a host.
# This is also the duration for which we don't recheck readiness of the service.
trafficReEnableDuration: "5"
```