An API without rate limiting is an API that can be overwhelmed by anyone — intentionally or accidentally. A client with a bug in the code that makes 10,000 requests per minute, a scraping script, a brute-force attack on the authentication endpoint — all produce the same effect: exhausted resources, increased latency, degraded service for legitimate users.
Before .NET 7, rate limiting required external libraries (AspNetCoreRateLimit, etc.) or custom solutions. Starting with .NET 7 and refined in later versions, the framework includes a complete, flexible rate limiting system well integrated with the ASP.NET Core middleware pipeline. All examples in this article are valid on .NET 10.
1. The four rate limiting algorithms
Before any code, it’s worth understanding which algorithm fits each scenario. The wrong choice results in either insufficient protection or frustration for legitimate users.
Fixed Window
Allows a fixed number of requests within a fixed time window (e.g., 100 requests per minute). When the window expires, the counter resets completely.
Advantage: simple, predictable, easy to communicate to users.
Disadvantage: vulnerable to burst attack — a client can make 100 requests in the last 5 seconds of the window and 100 in the first 5 seconds of the next window: 200 requests in 10 seconds, even though the limit is 100/minute.
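The boundary burst described above can be reproduced with a few lines of arithmetic. The sketch below is a simplified model of a fixed-window counter, not the framework's implementation; the `TryAcquire` helper and the simulated timestamps are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fixed-window counter keyed by window index.
// It models the algorithm only; it is not the framework's limiter.
const int PermitLimit = 100;
TimeSpan window = TimeSpan.FromMinutes(1);
var usedPerWindow = new Dictionary<long, int>();

bool TryAcquire(TimeSpan elapsed)
{
    // Every request falls into exactly one fixed window.
    long windowIndex = elapsed.Ticks / window.Ticks;
    int used = usedPerWindow.GetValueOrDefault(windowIndex);
    if (used >= PermitLimit) return false;
    usedPerWindow[windowIndex] = used + 1;
    return true;
}

int accepted = 0;
for (int i = 0; i < 100; i++)
{
    // Last 5 seconds of window 0: 55.00s .. 59.95s
    if (TryAcquire(TimeSpan.FromSeconds(55) + TimeSpan.FromMilliseconds(50 * i))) accepted++;
    // First 5 seconds of window 1: 60.00s .. 64.95s
    if (TryAcquire(TimeSpan.FromSeconds(60) + TimeSpan.FromMilliseconds(50 * i))) accepted++;
}

Console.WriteLine($"Accepted {accepted} requests within 10 seconds");
```

All 200 requests are accepted because each batch of 100 lands in a different window, even though the nominal limit is 100 per minute.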
Sliding Window
Similar to Fixed Window, but the window always covers the most recent interval (for example, the last 60 seconds) instead of resetting at fixed boundaries. This removes the double burst at the window boundary.
Advantage: more even distribution of requests.
Disadvantage: more memory-intensive (the limiter must track recent requests, either as timestamps or as per-segment counters).
Token Bucket
A "bucket" with a maximum number of tokens. Each request consumes a token. Tokens regenerate at a constant rate. Allows short bursts (if the bucket is full) but limits the average rate over the long term.
Advantage: most natural for real human use — a user can make a few quick requests but cannot sustain a high rate indefinitely.
Disadvantage: more complex to communicate to users (when exactly do tokens reload?).
Concurrency Limiter
Limits the number of requests processed simultaneously, not the rate over time. It does not count requests per second but how many are active at the same moment.
Advantage: direct protection against resource overload (DB connections, memory, CPU).
Disadvantage: does not prevent long-term abuse if requests are short.
2. Basic setup
The algorithms live in the System.Threading.RateLimiting namespace, and ASP.NET Core integrates them through the rate limiting middleware (Microsoft.AspNetCore.RateLimiting).
2.1 Fixed Window — starter example
// Program.cs
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;
builder.Services.AddRateLimiter(options =>
{
options.AddFixedWindowLimiter("fixed", limiterOptions =>
{
limiterOptions.PermitLimit = 100; // max requests per window, shared by ALL callers of this policy (not per client)
limiterOptions.Window = TimeSpan.FromMinutes(1);
limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
limiterOptions.QueueLimit = 10; // queued requests
});
// Custom 429 response
options.OnRejected = async (context, cancellationToken) =>
{
context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
if (context.Lease.TryGetMetadata(
MetadataName.RetryAfter, out var retryAfter))
{
context.HttpContext.Response.Headers.RetryAfter =
((int)retryAfter.TotalSeconds).ToString();
}
await context.HttpContext.Response.WriteAsJsonAsync(new
{
error = "Too many requests.",
message = "You have exceeded the request limit. Please try again later."
}, cancellationToken);
};
});
// IMPORTANT: with endpoint-specific policies ([EnableRateLimiting]), call UseRateLimiter after UseRouting (if you call it explicitly) and before MapControllers
app.UseRateLimiter();
app.MapControllers();
2.2 Applying on an endpoint
[ApiController]
[Route("api/[controller]")]
public class ProductsController : ControllerBase
{
[HttpGet]
[EnableRateLimiting("fixed")] // apply "fixed" policy
public IActionResult GetAll() => Ok();
[HttpGet("public")]
[DisableRateLimiting] // explicitly exclude from any rate limiting
public IActionResult GetPublic() => Ok();
}
2.3 Global application on all endpoints
builder.Services.AddRateLimiter(options =>
{
// Global policy — applies to all endpoints
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(
httpContext => RateLimitPartition.GetFixedWindowLimiter(
partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
factory: _ => new FixedWindowRateLimiterOptions
{
PermitLimit = 200,
Window = TimeSpan.FromMinutes(1)
}));
});
3. Sliding Window
builder.Services.AddRateLimiter(options =>
{
options.AddSlidingWindowLimiter("sliding", limiterOptions =>
{
limiterOptions.PermitLimit = 100;
limiterOptions.Window = TimeSpan.FromMinutes(1);
limiterOptions.SegmentsPerWindow = 6; // window divided into 6 segments of 10s
limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
limiterOptions.QueueLimit = 5;
});
});
SegmentsPerWindow controls the granularity of the sliding window. With 6 segments per minute, the window updates every 10 seconds — finer than Fixed Window, less costly than timestamp per request.
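To see how segments change the outcome at a window boundary, here is a simplified model of a segmented sliding window. It is a sketch under assumptions: the `TryAcquire` helper and the simulated timestamps are hypothetical, and the real SlidingWindowRateLimiter manages its segments internally on a timer rather than with a dictionary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical segmented sliding window: one counter per 10-second segment,
// and a request is allowed only if the last 6 segments together are under the limit.
const int PermitLimit = 100;
const int SegmentsPerWindow = 6;
TimeSpan segmentLength = TimeSpan.FromMinutes(1) / SegmentsPerWindow; // 10 seconds
var usedPerSegment = new Dictionary<long, int>();

bool TryAcquire(TimeSpan elapsed)
{
    long current = elapsed.Ticks / segmentLength.Ticks;
    int usedInWindow = 0;
    // Sum the current segment and the five segments before it.
    for (long s = current - SegmentsPerWindow + 1; s <= current; s++)
        usedInWindow += usedPerSegment.GetValueOrDefault(s);
    if (usedInWindow >= PermitLimit) return false;
    usedPerSegment[current] = usedPerSegment.GetValueOrDefault(current) + 1;
    return true;
}

int acceptedBefore = 0, acceptedAfter = 0;
// 100 requests between 55.00s and 59.95s
for (int i = 0; i < 100; i++)
    if (TryAcquire(TimeSpan.FromSeconds(55) + TimeSpan.FromMilliseconds(50 * i))) acceptedBefore++;
// 100 more between 60.00s and 64.95s
for (int i = 0; i < 100; i++)
    if (TryAcquire(TimeSpan.FromSeconds(60) + TimeSpan.FromMilliseconds(50 * i))) acceptedAfter++;
// By 110s the busy segment (50s..60s) has slid out of the window.
bool acceptedLater = TryAcquire(TimeSpan.FromSeconds(110));

Console.WriteLine($"Before boundary: {acceptedBefore}, after boundary: {acceptedAfter}, at 110s: {acceptedLater}");
```

In this model the second batch is rejected entirely, because the 100 requests from the 50s..60s segment still count against the window that ends at 65s.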
4. Token Bucket
builder.Services.AddRateLimiter(options =>
{
options.AddTokenBucketLimiter("token-bucket", limiterOptions =>
{
limiterOptions.TokenLimit = 50; // max bucket capacity
limiterOptions.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
limiterOptions.TokensPerPeriod = 10; // 10 tokens every 10s = 1/s
limiterOptions.AutoReplenishment = true; // automatic background reload
limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
limiterOptions.QueueLimit = 5;
});
});
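The above configuration allows bursts of up to 50 requests if the bucket is full, but the sustained average rate is 1 request/second. Ideal for endpoints that need to be responsive to normal human interactions but block automated scripts. Note that a named limiter registered this way is a single bucket shared by every caller; to get this behavior per client, partition it per user or IP as shown in section 6.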
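The burst capacity can be checked directly against System.Threading.RateLimiting outside of ASP.NET Core. This is a minimal sketch: it assumes a console project that references the System.Threading.RateLimiting package, and it turns AutoReplenishment off only so that no background timer adds tokens while the loop runs.

```csharp
using System;
using System.Threading.RateLimiting;

// Same limits as the configuration above. AutoReplenishment is disabled
// here purely to keep the demo deterministic.
using var limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 50,
    ReplenishmentPeriod = TimeSpan.FromSeconds(10),
    TokensPerPeriod = 10,
    AutoReplenishment = false,
    QueueLimit = 0
});

int accepted = 0;
for (int i = 0; i < 60; i++)
{
    // AttemptAcquire never waits: it either takes a token or returns a failed lease.
    using RateLimitLease lease = limiter.AttemptAcquire();
    if (lease.IsAcquired) accepted++;
}
Console.WriteLine($"Burst of 60 requests: {accepted} accepted");

// Sustained rate implied by the options: TokensPerPeriod / ReplenishmentPeriod.
double tokensPerSecond = 10 / TimeSpan.FromSeconds(10).TotalSeconds;
Console.WriteLine($"Sustained rate: {tokensPerSecond} request(s) per second");
```

The bucket starts full, so the first 50 requests succeed and the remaining 10 are rejected until tokens are replenished.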
5. Concurrency Limiter
builder.Services.AddRateLimiter(options =>
{
options.AddConcurrencyLimiter("concurrency", limiterOptions =>
{
limiterOptions.PermitLimit = 20; // max 20 simultaneous requests
limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
limiterOptions.QueueLimit = 10;
});
});
Suitable for endpoints performing costly operations: report generation, image processing, calls to slow external services. You limit how many are processed simultaneously, not how many come per second.
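The difference from time-based limiters shows up when a lease is released. The following sketch uses ConcurrencyLimiter directly, assuming a console project that references the System.Threading.RateLimiting package; the limit of 2 is chosen only to keep the example short.

```csharp
using System;
using System.Threading.RateLimiting;

using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2, // at most 2 operations in flight
    QueueLimit = 0   // reject instead of waiting
});

RateLimitLease first = limiter.AttemptAcquire();
RateLimitLease second = limiter.AttemptAcquire();

// Both permits are held, so a third caller is rejected.
RateLimitLease third = limiter.AttemptAcquire();
bool thirdAcquired = third.IsAcquired;

// Disposing a lease models "the request finished" and returns its permit.
first.Dispose();
RateLimitLease fourth = limiter.AttemptAcquire();
bool fourthAcquired = fourth.IsAcquired;

Console.WriteLine($"third: {thirdAcquired}, fourth: {fourthAcquired}");

second.Dispose();
third.Dispose();
fourth.Dispose();
```

No time has to pass for the fourth acquisition to succeed; capacity comes back as soon as a running operation completes.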
6. Rate limiting per user — PartitionedRateLimiter
The most common production scenario: different limits for authenticated vs. anonymous users, or limits per subscription plan.
using System.Security.Claims; // ClaimTypes and FindFirstValue
builder.Services.AddRateLimiter(options =>
{
options.AddPolicy("per-user", httpContext =>
{
var user = httpContext.User;
// Authenticated user — more generous limit
if (user.Identity?.IsAuthenticated == true)
{
var userId = user.FindFirstValue(ClaimTypes.NameIdentifier)
?? "authenticated-unknown";
// Limit per subscription plan
var plan = user.FindFirstValue("subscription_plan") ?? "basic";
var limit = plan switch
{
"pro" => 1000,
"business" => 5000,
_ => 100 // basic
};
return RateLimitPartition.GetSlidingWindowLimiter(
partitionKey: $"user:{userId}",
factory: _ => new SlidingWindowRateLimiterOptions
{
PermitLimit = limit,
Window = TimeSpan.FromMinutes(1),
SegmentsPerWindow = 6
});
}
// Anonymous user — limited per IP
var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
return RateLimitPartition.GetFixedWindowLimiter(
partitionKey: $"anon:{ip}",
factory: _ => new FixedWindowRateLimiterOptions
{
PermitLimit = 20,
Window = TimeSpan.FromMinutes(1)
});
});
options.OnRejected = async (context, ct) =>
{
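context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
// Only report a retry delay when the limiter provides one; otherwise retryAfter
// would be default(TimeSpan) and the body would claim 0 seconds.
int? retryAfterSeconds = null;
if (context.Lease.TryGetMetadata(
MetadataName.RetryAfter, out var retryAfter))
{
retryAfterSeconds = (int)retryAfter.TotalSeconds;
context.HttpContext.Response.Headers.RetryAfter =
retryAfterSeconds.Value.ToString();
}
await context.HttpContext.Response.WriteAsJsonAsync(new
{
error = "Rate limit exceeded.",
retryAfterSeconds
}, ct);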
};
});
Applying on a controller
[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("per-user")]
public class ApiController : ControllerBase
{
// All endpoints in the controller follow the "per-user" policy
}
7. Rate limiting on critical endpoints — authentication and registration
Authentication endpoints are the primary target of brute-force attacks. They must be treated separately, with much stricter limits:
builder.Services.AddRateLimiter(options =>
{
    // AddFixedWindowLimiter creates ONE limiter shared by all clients, so a
    // per-IP limit needs AddPolicy with the client IP as partition key.
    void AddPerIpPolicy(string name, int permitLimit, TimeSpan window) =>
        options.AddPolicy(name, httpContext =>
            RateLimitPartition.GetFixedWindowLimiter(
                partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
                factory: _ => new FixedWindowRateLimiterOptions
                {
                    PermitLimit = permitLimit,
                    Window = window,
                    QueueLimit = 0 // no queue: reject immediately
                }));

    // Login endpoint: strict, 5 attempts per 15 minutes per IP
    AddPerIpPolicy("auth-strict", 5, TimeSpan.FromMinutes(15));
    // Registration endpoint: moderate, 3 attempts per hour per IP
    AddPerIpPolicy("register-moderate", 3, TimeSpan.FromHours(1));
    // Password reset: very strict, 3 attempts per 24 hours per IP
    AddPerIpPolicy("password-reset", 3, TimeSpan.FromHours(24));
});
[HttpPost("login")]
[EnableRateLimiting("auth-strict")]
[AllowAnonymous]
public async Task<IActionResult> Login([FromBody] LoginDto dto) { ... }
[HttpPost("register")]
[EnableRateLimiting("register-moderate")]
[AllowAnonymous]
public async Task<IActionResult> Register([FromBody] RegisterDto dto) { ... }
[HttpPost("forgot-password")]
[EnableRateLimiting("password-reset")]
[AllowAnonymous]
public async Task<IActionResult> ForgotPassword([FromBody] ForgotPasswordDto dto) { ... }
Beware of IP spoofing: If your application is behind a proxy or load balancer, RemoteIpAddress will always be the proxy’s IP. You must read the real IP from the X-Forwarded-For or X-Real-IP header, configured via ForwardedHeaders middleware.
// Program.cs — read real IP behind proxy
builder.Services.Configure<ForwardedHeadersOptions>(options =>
{
options.ForwardedHeaders =
ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto;
// Restrict to trusted proxy IPs
options.KnownProxies.Add(IPAddress.Parse("10.0.0.1"));
});
app.UseForwardedHeaders();
app.UseRateLimiter(); // after ForwardedHeaders
8. 429 response and standard headers
A well-formed 429 response allows clients to behave intelligently — to wait exactly as long as needed before retrying:
options.OnRejected = async (context, cancellationToken) =>
{
var response = context.HttpContext.Response;
response.StatusCode = StatusCodes.Status429TooManyRequests;
response.ContentType = "application/json";
// Retry-After: how many seconds the client should wait.
// Only report a value when the limiter provides one; otherwise retryAfter
// would be default(TimeSpan) and the body would claim 0 seconds.
int? retryAfterSeconds = null;
if (context.Lease.TryGetMetadata(
MetadataName.RetryAfter, out var retryAfter))
{
retryAfterSeconds = (int)retryAfter.TotalSeconds;
response.Headers.RetryAfter = retryAfterSeconds.Value.ToString();
}
// Log for monitoring
var logger = context.HttpContext.RequestServices
.GetRequiredService<ILogger<Program>>();
var ip = context.HttpContext.Connection.RemoteIpAddress;
var path = context.HttpContext.Request.Path;
var user = context.HttpContext.User.Identity?.Name ?? "anonymous";
logger.LogWarning(
"Rate limit exceeded. IP: {IP}, Path: {Path}, User: {User}",
ip, path, user);
await response.WriteAsJsonAsync(new
{
type = "https://tools.ietf.org/html/rfc6585#section-4",
title = "Too Many Requests",
status = 429,
detail = "You have exceeded the allowed request limit.",
retryAfterSeconds
}, cancellationToken);
};
9. Distributed rate limiting — multiple instances
The built-in rate limiting is in-process: it stores counters in memory. If you run multiple instances of the application (scale-out, Kubernetes), each instance keeps its own counters, so a client whose requests are spread across N instances can make up to N times the configured limit.
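The multiplication effect is easy to reproduce with three independent in-memory limiters standing in for three application instances. This is a sketch under assumptions: the round-robin loop is a stand-in for a load balancer, the project is assumed to reference the System.Threading.RateLimiting package, and AutoReplenishment is disabled so no window reset happens during the loop.

```csharp
using System;
using System.Linq;
using System.Threading.RateLimiting;

// Three hypothetical app instances, each with its own in-memory counter.
FixedWindowRateLimiter[] instances = Enumerable.Range(0, 3)
    .Select(_ => new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
    {
        PermitLimit = 100,
        Window = TimeSpan.FromMinutes(1),
        AutoReplenishment = false, // keep the demo deterministic
        QueueLimit = 0
    }))
    .ToArray();

int accepted = 0;
for (int i = 0; i < 400; i++)
{
    // Round-robin distribution, as a simple load balancer might do.
    FixedWindowRateLimiter instance = instances[i % instances.Length];
    using RateLimitLease lease = instance.AttemptAcquire();
    if (lease.IsAcquired) accepted++;
}

Console.WriteLine($"Configured limit: 100, accepted across 3 instances: {accepted}");

foreach (FixedWindowRateLimiter instance in instances) instance.Dispose();
```

Each instance admits its own 100 requests, so the client gets 300 through against a nominal limit of 100.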
Solutions for distributed rate limiting:
9.1 Redis with RedisRateLimiting
dotnet add package RedisRateLimiting.AspNetCore
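// Optional: this only registers IDistributedCache; the rate limiter below uses its own ConnectionMultiplexer
builder.Services.AddStackExchangeRedisCache(options =>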
{
options.Configuration = builder.Configuration
.GetConnectionString("Redis");
});
builder.Services.AddRateLimiter(options =>
{
var redisConnection = ConnectionMultiplexer.Connect(
builder.Configuration.GetConnectionString("Redis")!);
options.AddRedisSlidingWindowLimiter("distributed-sliding",
limiterOptions =>
{
limiterOptions.ConnectionMultiplexerFactory = () => redisConnection;
limiterOptions.PermitLimit = 100;
limiterOptions.Window = TimeSpan.FromMinutes(1);
});
});
9.2 Azure API Management
If you use Azure API Management as a gateway, rate limiting can be configured at the gateway level — before the request reaches the application, regardless of how many backend instances exist:
<!-- APIM policy: 100 requests per minute per subscription key -->
<rate-limit-by-key
calls="100"
renewal-period="60"
counter-key="@(context.Subscription.Id)" />
<!-- OR per IP -->
<rate-limit-by-key
calls="20"
renewal-period="60"
counter-key="@(context.Request.IpAddress)" />
APIM and application rate limiting can coexist — APIM provides macro-level protection (per client/plan), the application provides granular protection (per specific endpoint).
10. Combined rate limiting — multiple policies on the same endpoint
You can combine multiple limiters for layered protection. For example: limit per IP and global limit simultaneously:
builder.Services.AddRateLimiter(options =>
{
// Global limit — protects server resources
options.GlobalLimiter = PartitionedRateLimiter.CreateChained(
// Layer 1: limit per IP
PartitionedRateLimiter.Create<HttpContext, string>(
ctx => RateLimitPartition.GetFixedWindowLimiter(
partitionKey: ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown",
factory: _ => new FixedWindowRateLimiterOptions
{
PermitLimit = 200,
Window = TimeSpan.FromMinutes(1)
})),
// Layer 2: total global limit (all IPs)
PartitionedRateLimiter.Create<HttpContext, string>(
_ => RateLimitPartition.GetConcurrencyLimiter(
partitionKey: "global",
factory: _ => new ConcurrencyLimiterOptions
{
PermitLimit = 500,
QueueLimit = 0
}))
);
});
11. Rate limiting testing
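using System.Net;
using Microsoft.AspNetCore.Mvc.Testing;
using Microsoft.AspNetCore.RateLimiting;
using NUnit.Framework;
[TestFixture]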
public class RateLimitingTests
{
private WebApplicationFactory<Program> _factory = default!;
[SetUp]
public void Setup()
{
_factory = new WebApplicationFactory<Program>()
.WithWebHostBuilder(builder =>
{
builder.ConfigureServices(services =>
{
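// Small limits for tests. This assumes Program.cs does not already register a policy
// named "fixed": adding the same policy name twice throws an ArgumentException.
// If it does, read the limits from configuration in Program.cs and override the configuration here instead.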
services.AddRateLimiter(options =>
{
options.AddFixedWindowLimiter("fixed",
o =>
{
o.PermitLimit = 3;
o.Window = TimeSpan.FromSeconds(10);
o.QueueLimit = 0;
});
// The default rejection status is 503; return 429 without a custom handler
options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});
});
});
}
[Test]
public async Task RateLimit_ExceedingLimit_Returns429()
{
var client = _factory.CreateClient();
// First 3 requests should pass
for (int i = 0; i < 3; i++)
{
var response = await client.GetAsync("/api/products");
Assert.That(response.StatusCode,
Is.Not.EqualTo(HttpStatusCode.TooManyRequests));
}
// 4th should be rejected
var rejected = await client.GetAsync("/api/products");
Assert.That(rejected.StatusCode,
Is.EqualTo(HttpStatusCode.TooManyRequests));
}
[TearDown]
public void TearDown() => _factory.Dispose();
}
12. Common problems and their solutions
| Problem | Likely cause | Solution |
|---|---|---|
| Rate limiting does not work | app.UseRateLimiter() is missing or in the wrong order | Call it after app.UseRouting() (if you call UseRouting explicitly) and before app.MapControllers() |
| All users get 429, not just abusers | Proxy IP is used as partition key | Configure ForwardedHeaders middleware before rate limiter |
| Effective limit is higher than configured when scaled out | Counters are in-process, not distributed | Migrate to Redis with RedisRateLimiting or use APIM |
| Client does not know when to retry | Retry-After header is missing from 429 response | Read MetadataName.RetryAfter from the lease in OnRejected and set the Retry-After header |
| Internal endpoints blocked by rate limiting | Health checks, metrics endpoints are limited | Add [DisableRateLimiting] on health check endpoints or exclude by path |
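For the last row, minimal-API style endpoints such as health checks can be excluded with the DisableRateLimiting() endpoint convention instead of the attribute. The sketch below is one possible setup, not a prescribed one: the "/health" path and the root endpoint are placeholders, and the global limiter repeats the per-IP configuration from section 2.3.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();
builder.Services.AddRateLimiter(options =>
{
    // Global per-IP limit applied to every endpoint that does not opt out.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(
        httpContext => RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 200,
                Window = TimeSpan.FromMinutes(1)
            }));
});

var app = builder.Build();
app.UseRateLimiter();

// Hypothetical health endpoint: probes from the orchestrator bypass all limiters.
app.MapHealthChecks("/health").DisableRateLimiting();

// Placeholder application endpoint, still subject to the global limiter.
app.MapGet("/", () => "Hello");

app.Run();
```

Excluding the probe endpoint keeps an orchestrator from marking the instance unhealthy just because regular traffic has exhausted the shared limit.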
13. Production checklist
- ✅ app.UseRateLimiter() registered in the correct order in the pipeline
- ✅ ForwardedHeaders middleware configured if the app is behind a proxy
- ✅ Authentication endpoints (/login, /register, /forgot-password) with separate strict limits
- ✅ 429 response with Retry-After header and structured JSON body
- ✅ Logging in OnRejected with IP, path, and user for monitoring
- ✅ Different limits for authenticated vs. anonymous users
- ✅ Limits per subscription plan if the app has multiple tiers
- ✅ Redis or APIM for deployment with multiple instances
- ✅ [DisableRateLimiting] on health check and metrics endpoints
- ✅ Integration tests verifying behavior when limits are exceeded
- ✅ Alert in Application Insights / Azure Monitor when 429 rate exceeds a threshold
Conclusion
The built-in rate limiting in .NET 10 removes the need for external libraries for the most common scenarios. The four algorithms cover different cases: Fixed Window for simplicity, Sliding Window for even distribution, Token Bucket for natural human behavior, Concurrency Limiter for resource-level protection.
The most important architectural decision: in-process or distributed. If you run a single instance, built-in is sufficient. If you scale horizontally, you need Redis or a gateway (APIM, Nginx) that centralizes counters.
The security series concludes with the last article: Audit Logging and Security Events — how to record what happens in the system so you can detect attacks, investigate incidents, and demonstrate compliance.