Resilience Frameworks

These frameworks provide the resilience pattern for various service calls.

A resilience pattern is a design pattern used in software systems (especially distributed systems and microservices) to help applications stay stable, responsive, and fault-tolerant when things go wrong — like network failures, timeouts, overload, or downstream service crashes.

Think of it as defensive strategies your application uses to “bend but not break” under stress.

Why?

Networks are unreliable (latency, packet loss).
Services fail or slow down unexpectedly.
Traffic spikes can overwhelm dependencies.
Distributed systems make failures inevitable, not exceptional.

Without resilience patterns, one failing service can cause a cascading failure across the whole system (classic "domino effect").

Core Patterns

Here are the core patterns (many implemented in libraries like Resilience4j, Netflix Hystrix, or Spring Cloud):

1. Circuit Breaker

Detects failures and “opens the circuit” to stop making calls to a failing service.
Protects the system from repeated failed attempts.
Example: Like a fuse box at home — when overload happens, it cuts the current.

Parameters (configurable):

failureRateThreshold → % of failures to trigger breaker (e.g., 50%).
slowCallRateThreshold → % of slow calls considered failures.
slowCallDurationThreshold → time limit (e.g., 2s) after which a call is “slow.”
slidingWindowType → COUNT-based (# calls) or TIME-based (last N seconds).
slidingWindowSize → size of window for failure calculation.
minimumNumberOfCalls → minimum calls before breaker can trip.
permittedNumberOfCallsInHalfOpenState → trial calls when half-open.
waitDurationInOpenState → how long before trying again.
automaticTransitionFromOpenToHalfOpenEnabled → true/false.

Metrics: Failure Rate, Slow Call Rate, State, Transition Count.

2. Retry

Automatically retries a failed request with a delay/backoff.
Useful for temporary glitches (e.g., a momentary network blip).
Needs careful tuning (too many retries can worsen the problem).

Parameters:

maxAttempts → max retries (including first try).
waitDuration → delay between retries.
retryExceptions → exception types eligible for retry.
ignoreExceptions → exceptions not retried.
intervalFunction → constant / exponential backoff / random.
enableExponentialBackoff → true/false.
enableRandomizedWait → true/false.

Metrics: Retry Attempts, Retry Success/Failure Count, Backoff Duration.

3. Timeout

Limits the max time a request can take.
Prevents “hanging” calls that block resources.

Parameters:

timeoutDuration → max allowed time (e.g., 2s).
cancelRunningFuture → whether to cancel running task.

Metrics: Timeout Count, Avg Latency, P95/P99 Latency, Aborted Requests.

4. Bulkhead

Isolates resources into pools (like compartments in a ship).
If one pool is overloaded, others remain unaffected.
Example: Keep API calls to external payment service in their own thread pool.

Parameters:

maxConcurrentCalls → max calls allowed at once.
maxWaitDuration → how long to wait for a permit (if pool full).
maxThreadPoolSize (for thread pool bulkhead).
coreThreadPoolSize.
queueCapacity → max requests waiting.
keepAliveDuration → how long idle threads live.

Metrics: Active Calls, Pool Utilization %, Rejected Calls, Queue Wait Time.

5. Rate Limiter

Restricts the number of requests per unit time.
Prevents overload of downstream services.
Example: Only allow 100 API calls/sec to external service.

Parameters:

limitForPeriod → max calls allowed per refresh period.
limitRefreshPeriod → duration of one period (e.g., 1s).
timeoutDuration → max wait for permit.

Metrics: Allowed Calls, Denied Calls, Wait Time, Utilization %.

6. Fallback

Provides an alternative response when a call fails.
Example: If product pricing service is down, return “price unavailable” instead of crashing the app.

Parameters:(Fallback is usually code-defined rather than config-based, so parameters = what you define in logic)

fallbackMethod → function to invoke when primary fails.
fallbackResponse → static or computed response.
onExceptionTypes → exception triggers for fallback.

Metrics: Fallback Count, Fallback Success Rate, Degraded Mode Ratio.

7. Cache

Store frequently used results locally to reduce repeated calls.
Both improves performance and reduces dependency failures.

Parameters:

ttl (time-to-live) → how long before entry expires.
maxSize → max entries allowed.
evictionPolicy → LRU, LFU, FIFO.
refreshAhead → whether to refresh before expiry.
keyStrategy → how to generate cache keys.

Metrics: Hit Rate, Miss Rate, Evictions, Cache Size, Entry Staleness.

Where Resilience Patterns Are Used

Microservices (protect against downstream failures)
API gateways (limit or shape traffic)
GraphQL resolvers / REST endpoints (wrap external service calls)
Database access (timeouts, retries, caching)

Financial Portfolio Development

Theories

Banking

Investing

Index

Retail

Decision Making

Growth and Transformation

Software development

DevOps

SAFe

Estimation - TCS course 5474

Business Analysis

Research Methodology

Ai

Narrow

High Powered Computing

Networks

Software_engineering

Software_security

Computer Science

Algorithms

Automata Theory

Theory of Computation

Problems - Computation

Electronics & Automation

Micro_controllers

Atmega

Data Science

Stream_processing

Database Management Systems

Relational Database Management System

Networks

Physical Layer(OSI Layer 1)

Radio

Index

Software Engineering

Artificial Intelligence

Generative AI

Narrow

Sensory

Architect

Software architecture patterns

Togaf

Amazon Web Services

Software Frameworks

Frameworks used for REST

Middleware softwares

Packaging

High Powered Computing

NVIDIA

Observability

Software Operations

Containers

OS

Index

Practical Setups

Introduction to Programming Languages

C(++)

Java

JavaScript

Python

R

Security in Softwares

Layer 6

Statistical Modelling

Linear_regressions

Monte_carlo

Resilience Frameworks ​

Core Patterns ​

1. Circuit Breaker ​

2. Retry ​

3. Timeout ​

4. Bulkhead ​

5. Rate Limiter ​

6. Fallback ​

7. Cache ​

Where Resilience Patterns Are Used ​

Resilience Frameworks

Core Patterns

1. Circuit Breaker

2. Retry

3. Timeout

4. Bulkhead

5. Rate Limiter

6. Fallback

7. Cache

Where Resilience Patterns Are Used