One of the easiest cloud design patterns to try out is the Retry Pattern. In this post I want to show how to use the Retry Pattern with Polly in C# as an example. So what does the Retry Pattern achieve?
Problem Statement – What is the issue the pattern solves?
When building applications you almost always have some outside service, including other microservices, that you have to call. Sometimes there is a momentary loss of network connectivity, a temporary unavailability, or a timeout because the service is busy. A database or RESTful service may fail while it is busy, yet succeed if you try again. These types of faults are usually self-correcting, and most of the time a short delay before calling again is enough to get a successful response.
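To make "transient" a little more concrete, here is a minimal sketch of how an application might decide whether an HTTP response is worth retrying. The class name `TransientFaults`, the method `IsTransient`, and the particular list of status codes are assumptions made for illustration; they do not come from Polly or from any specific service, and a real application would tune the list to the services it calls.

```csharp
using System;
using System.Net;

public static class TransientFaults
{
    // Hypothetical helper: which status codes count as transient is an
    // assumption for this sketch, not a rule defined by Polly.
    public static bool IsTransient(HttpStatusCode status)
    {
        switch ((int)status)
        {
            case 408: // Request Timeout
            case 429: // Too Many Requests
            case 502: // Bad Gateway
            case 503: // Service Unavailable
            case 504: // Gateway Timeout
                return true;
            default:
                // Everything else, e.g. 400 Bad Request caused by bad input,
                // will not be fixed by calling again.
                return false;
        }
    }
}
```

A call such as `TransientFaults.IsTransient(response.StatusCode)` could then be used as the condition that decides whether to retry.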
Retry Pattern
- Enable an application to handle transient failures
- When the application tries to connect to a service or network resource
- By transparently retrying a failed operation
- Improves the stability of your application
- Use retry only for transient failures that are likely to resolve themselves quickly
- Match the retry policies with the application
- Otherwise use the Circuit Breaker pattern
- Don’t cause a chain reaction to all components
- Don’t retry internal exceptions caused by business logic
- Log all retry attempts to the service
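The points above can be sketched as a small hand-rolled helper, before bringing in any library. This is only an illustration under stated assumptions: `SimpleRetry`, `ExecuteAsync`, and the parameters `shouldRetry` and `delays` are hypothetical names, not part of Polly or .NET. The helper calls the operation once, then retries once per configured delay while the result still looks like a failure, and logs every retry.

```csharp
using System;
using System.Threading.Tasks;

public static class SimpleRetry
{
    // Hypothetical helper, not part of Polly: runs `operation` once and then
    // retries once per entry in `delays` while `shouldRetry` returns true.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation,
        Func<T, bool> shouldRetry,
        TimeSpan[] delays)
    {
        T result = await operation();

        for (int i = 0; i < delays.Length && shouldRetry(result); i++)
        {
            // Log every retry attempt so repeated failures stay visible.
            Console.WriteLine($"Retry {i + 1} of {delays.Length}. Waiting {delays[i]}.");
            await Task.Delay(delays[i]);
            result = await operation();
        }

        // The last result is returned even if it is still a failure;
        // the caller decides what to do next.
        return result;
    }
}
```

With three delays the operation runs at most four times: the initial call plus three retries. Keeping the list of delays short is one way to avoid piling extra load on a service that is already struggling.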
Typical Application
Below is a typical application diagram, where you have a service or web app calling another service.
But when the connection to that service fails, we usually get an error in our application.
When to use Retry Pattern
When not to use Retry Pattern
Sample Code
Below is a sample .NET Core console application that shows how to use Polly. The code calls https://httpbin.org/status/200,408 with a POST, which returns a status of 200 or 408 at random. First, let's create the project and add the Polly package to it.
$ mkdir RetryPattern
$ cd RetryPattern
# create console app
$ dotnet new console
The template "Console Application" was created successfully.
Processing post-creation actions...
Running 'dotnet restore' on C:\Users\Taswar\Documents\GitHub\RetryPattern\RetryPattern.csproj...
Restoring packages for C:\Users\Taswar\Documents\GitHub\RetryPattern\RetryPattern.csproj...
Generating MSBuild file C:\Users\Taswar\Documents\GitHub\RetryPattern\obj\RetryPattern.csproj.nuget.g.props.
Generating MSBuild file C:\Users\Taswar\Documents\GitHub\RetryPattern\obj\RetryPattern.csproj.nuget.g.targets.
Restore completed in 1.18 sec for C:\Users\Taswar\Documents\GitHub\RetryPattern\RetryPattern.csproj.
Restore succeeded.
# add the Polly package
$ dotnet add package Polly
Writing C:\Users\Taswar\AppData\Local\Temp\tmp4FCB.tmp
info : Adding PackageReference for package 'Polly' into project 'C:\Users\Taswar\Documents\GitHub\RetryPattern\RetryPattern.csproj'.
log : Restoring packages for C:\Users\Taswar\Documents\GitHub\RetryPattern\RetryPattern.csproj...
info : GET https://www.nuget.org/api/v2/FindPackagesById()?id='Polly'&semVerLevel=2.0.0
info : OK https://www.nuget.org/api/v2/FindPackagesById()?id='Polly'&semVerLevel=2.0.0 2341ms
info : GET https://www.nuget.org/api/v2/package/Polly/6.1.2
info : OK https://www.nuget.org/api/v2/package/Polly/6.1.2 2254ms
log : Installing Polly 6.1.2.
info : Package 'Polly' is compatible with all the specified frameworks in project 'C:\Users\Taswar\Documents\GitHub\RetryPattern\RetryPattern.csproj'.
info : PackageReference for package 'Polly' version '6.1.2' added to file 'C:\Users\Taswar\Documents\GitHub\RetryPattern\RetryPattern.csproj'.
info : Committing restore...
info : Writing lock file to disk. Path: C:\Users\Taswar\Documents\GitHub\RetryPattern\obj\project.assets.json
log : Restore completed in 7.77 sec for C:\Users\Taswar\Documents\GitHub\RetryPattern\RetryPattern.csproj.
# launch VS Code
$ code .
Without Polly
First we will write a sample application that calls the web service without Polly, so that we can see the 408 errors.
using System;
using System.Net.Http;
using System.Threading.Tasks;

namespace RetryPattern
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // create the http client
            var httpClient = new HttpClient();

            // call the httpbin service; it answers with 200 or 408 at random
            var response = await httpClient.PostAsync("https://httpbin.org/status/200,408", null);

            if (response.IsSuccessStatusCode)
                Console.WriteLine("Response was successful with 200");
            else
                Console.WriteLine($"Response failed. Status code {response.StatusCode}");
        }
    }
}
After a couple of runs you will see it respond with 408 RequestTimeout.
> dotnet run
Response was successful with 200
> dotnet run
Response was successful with 200
> dotnet run
Response failed. Status code RequestTimeout
Using Polly
Now we will introduce Polly into our code with increasing delays between retries: 1 second, then 3 seconds, and lastly 9 seconds.
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

namespace RetryPattern
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // create the http client
            var httpClient = new HttpClient();

            // call the httpbin service with Polly
            var response = await Policy
                // treat any non-success status code as a failure worth retrying
                .HandleResult<HttpResponseMessage>(message => !message.IsSuccessStatusCode)
                // retry up to three times, waiting 1s, 3s and 9s in turn
                .WaitAndRetryAsync(new[]
                {
                    TimeSpan.FromSeconds(1),
                    TimeSpan.FromSeconds(3),
                    TimeSpan.FromSeconds(9)
                }, (result, timeSpan, retryCount, context) =>
                {
                    // log each retry attempt
                    Console.WriteLine($"Request failed with {result.Result.StatusCode}. Retry count = {retryCount}. Waiting {timeSpan} before next retry. ");
                })
                .ExecuteAsync(() => httpClient.PostAsync("https://httpbin.org/status/200,408", null));

            if (response.IsSuccessStatusCode)
                Console.WriteLine("Response was successful with 200");
            else
                Console.WriteLine($"Response failed. Status code {response.StatusCode}");
        }
    }
}
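The delays of 1, 3 and 9 seconds used here follow a simple rule: each wait is three times the previous one. As a minimal sketch, that rule can be written as a function instead of a hard-coded list. The names `Backoff`, `DelayFor` and `Delays` are hypothetical and exist only for this illustration. Polly also offers `WaitAndRetryAsync` overloads that take a retry count and a function from retry attempt to `TimeSpan`, so a function of this shape could be passed there, but treat that pairing as a suggestion to check against the Polly version you use.

```csharp
using System;
using System.Linq;

public static class Backoff
{
    // Hypothetical helper: delay before the given retry attempt (1-based),
    // growing by a factor of 3 each time: 1s, 3s, 9s, ...
    public static TimeSpan DelayFor(int retryAttempt)
    {
        return TimeSpan.FromSeconds(Math.Pow(3, retryAttempt - 1));
    }

    // Builds the full list of delays for the given number of retries.
    public static TimeSpan[] Delays(int retryCount)
    {
        return Enumerable.Range(1, retryCount)
            .Select(attempt => DelayFor(attempt))
            .ToArray();
    }
}
```

`Backoff.Delays(3)` yields the same three waits as the hard-coded array, and changing the retry count no longer means editing a list by hand.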
Output
Below you will see three runs of the application with sample output.
#first run
> dotnet run
Request failed with RequestTimeout. Retry count = 1. Waiting 00:00:01 before next retry.
Request failed with RequestTimeout. Retry count = 2. Waiting 00:00:03 before next retry.
Response was successful with 200
#second run
> dotnet run
Request failed with RequestTimeout. Retry count = 1. Waiting 00:00:01 before next retry.
Request failed with RequestTimeout. Retry count = 2. Waiting 00:00:03 before next retry.
Request failed with RequestTimeout. Retry count = 3. Waiting 00:00:09 before next retry.
Response failed. Status code RequestTimeout
#third run
> dotnet run
Request failed with RequestTimeout. Retry count = 1. Waiting 00:00:01 before next retry.
Response was successful with 200
Summary
As you can see, the Retry Pattern is quite useful for transient, self-correcting failures, and it is simple to implement in C# with the help of Polly. If you are looking for Java solutions you can look at Hystrix, or even roll your own.