Rate Limiting in .NET — protect your APIs from abuse

An API without rate limiting is an API that can be overwhelmed by anyone — intentionally or accidentally. A client with a bug in the code that makes 10,000 requests per minute, a scraping script, a brute-force attack on the authentication endpoint — all produce the same effect: exhausted resources, increased latency, degraded service for legitimate users.

Before .NET 7, rate limiting required external libraries (AspNetCoreRateLimit, etc.) or custom solutions. Since .NET 7, the framework has included a complete, flexible rate limiting system that is well integrated with the ASP.NET Core middleware pipeline. All examples in this article are valid on .NET 10.

1. The four rate limiting algorithms

Before any code, it’s worth understanding which algorithm fits each scenario. The wrong choice results in either insufficient protection or frustration for legitimate users.

Fixed Window

Allows a fixed number of requests within a fixed time window (e.g., 100 requests per minute). When the window expires, the counter resets completely.

Advantage: simple, predictable, easy to communicate to users.
Disadvantage: vulnerable to bursts at the window boundary. A client can make 100 requests in the last 5 seconds of one window and 100 in the first 5 seconds of the next: 200 requests in 10 seconds, even though the limit is 100/minute.
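
The boundary burst can be reproduced with a few lines of arithmetic. The sketch below is a hand-rolled fixed-window counter written only for illustration (it is not the framework's implementation); the names TryAcquire, permitLimit and window are hypothetical, and time is passed in explicitly so the result is deterministic.

```csharp
using System;

// Hand-rolled fixed-window counter, for illustration only.
const int permitLimit = 100;
var window = TimeSpan.FromMinutes(1);

long currentWindow = -1; // index of the window the counter belongs to
int used = 0;            // requests accepted in that window

bool TryAcquire(TimeSpan now)
{
    long index = now.Ticks / window.Ticks;   // which window "now" falls into
    if (index != currentWindow)
    {
        currentWindow = index;               // new window: the counter resets completely
        used = 0;
    }
    if (used >= permitLimit) return false;
    used++;
    return true;
}

int accepted = 0;

// 100 requests spread over 0:55-0:59 (end of the first window)...
for (int i = 0; i < 100; i++)
    if (TryAcquire(TimeSpan.FromSeconds(55 + i / 20))) accepted++;

// ...and 100 more over 1:00-1:04 (start of the second window).
for (int i = 0; i < 100; i++)
    if (TryAcquire(TimeSpan.FromSeconds(60 + i / 20))) accepted++;

Console.WriteLine($"Accepted in a 10-second span: {accepted}"); // prints 200
```

Both bursts are accepted in full because each one falls into a different window, which is exactly the weakness the sliding window addresses.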

Sliding Window

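Similar to Fixed Window, but the window advances continuously instead of resetting all at once, so requests made near the end of one minute still count against the start of the next. This closes the boundary burst that Fixed Window allows. The .NET implementation approximates a true sliding window by dividing the window into segments (see section 3).

Advantage: more even distribution of requests.
Disadvantage: more state to track (a timestamp per request in a pure sliding log; one counter per segment in the .NET implementation).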

Token Bucket

A "bucket" with a maximum number of tokens. Each request consumes a token. Tokens regenerate at a constant rate. Allows short bursts (if the bucket is full) but limits the average rate over the long term.

Advantage: most natural for real human use — a user can make a few quick requests but cannot sustain a high rate indefinitely.
Disadvantage: more complex to communicate to users (when exactly do tokens reload?).
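
To see the burst-then-throttle behavior in isolation, the limiter type from System.Threading.RateLimiting can be used directly, outside ASP.NET Core. This is a minimal sketch with deliberately small, made-up numbers (5 tokens, 1 token per second); AutoReplenishment is turned off so that no background timer adds tokens while the example runs.

```csharp
using System;
using System.Threading.RateLimiting;

// Illustrative values only; AutoReplenishment = false keeps the run deterministic.
using var limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit          = 5,
    TokensPerPeriod     = 1,
    ReplenishmentPeriod = TimeSpan.FromSeconds(1),
    AutoReplenishment   = false,
    QueueLimit          = 0
});

// The bucket starts full, so a burst of 5 is accepted immediately.
int burst = 0;
for (int i = 0; i < 5; i++)
{
    using RateLimitLease lease = limiter.AttemptAcquire(1);
    if (lease.IsAcquired) burst++;
}

// The 6th request finds the bucket empty and is rejected.
using RateLimitLease rejected = limiter.AttemptAcquire(1);
bool hasRetryAfter = rejected.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter);

Console.WriteLine($"Burst accepted: {burst}, next acquired: {rejected.IsAcquired}");
if (hasRetryAfter)
    Console.WriteLine($"Limiter suggests retrying after {retryAfter.TotalSeconds:F0}s");
```

The RetryAfter metadata on the rejected lease is what lets you answer the "when do tokens reload?" question for clients.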

Concurrency Limiter

Limits the number of requests processed simultaneously, not the rate over time. It does not count requests per second but how many are active at the same moment.

Advantage: direct protection against resource overload (DB connections, memory, CPU).
Disadvantage: does not prevent long-term abuse if requests are short.
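
The difference from the time-based algorithms is easiest to see with the limiter used on its own: a permit is held for as long as the lease is alive and comes back when the lease is disposed. A minimal sketch, assuming a limit of 2 chosen only for the demonstration:

```csharp
using System;
using System.Threading.RateLimiting;

using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2,   // at most 2 operations "in flight"
    QueueLimit  = 0    // reject instead of queueing
});

RateLimitLease first  = limiter.AttemptAcquire();
RateLimitLease second = limiter.AttemptAcquire();
RateLimitLease third  = limiter.AttemptAcquire();   // rejected: both permits are held
bool thirdAcquired = third.IsAcquired;

first.Dispose();                                     // one operation finishes, its permit is returned
RateLimitLease fourth = limiter.AttemptAcquire();    // succeeds immediately, no waiting for a window
bool fourthAcquired = fourth.IsAcquired;

Console.WriteLine($"third: {thirdAcquired}, fourth: {fourthAcquired}"); // third: False, fourth: True

second.Dispose();
third.Dispose();
fourth.Dispose();
```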

2. Basic setup

Rate limiting is available in System.Threading.RateLimiting (built-in) and integrated into ASP.NET Core via middleware.

2.1 Fixed Window — starter example

// Program.cs
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit         = 100;           // max requests per window, shared by all callers of this policy
        limiterOptions.Window              = TimeSpan.FromMinutes(1);
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit          = 10;            // queued requests
    });

    // Custom 429 response
    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;

        if (context.Lease.TryGetMetadata(
                MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString();
        }

        await context.HttpContext.Response.WriteAsJsonAsync(new
        {
            error   = "Too many requests.",
            message = "You have exceeded the request limit. Please try again later."
        }, cancellationToken);
    };
});

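// IMPORTANT: with endpoint policies ([EnableRateLimiting]), UseRateLimiter must run after routing.
// WebApplication adds routing implicitly; if you call app.UseRouting() yourself, place UseRateLimiter after it.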
app.UseRateLimiter();
app.MapControllers();

2.2 Applying on an endpoint

[ApiController]
[Route("api/[controller]")]
public class ProductsController : ControllerBase
{
    [HttpGet]
    [EnableRateLimiting("fixed")]  // apply "fixed" policy
    public IActionResult GetAll() => Ok();

    [HttpGet("public")]
    [DisableRateLimiting]  // explicitly exclude from any rate limiting
    public IActionResult GetPublic() => Ok();
}
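
If the application uses minimal APIs instead of controllers, the same policies can be attached with endpoint extension methods rather than attributes. A minimal sketch, assuming a project that uses the Web SDK (so the usual ASP.NET Core namespaces are imported implicitly); the routes are illustrative.

```csharp
using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // The default rejection status is 503; 429 is usually what clients expect.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddFixedWindowLimiter("fixed", o =>
    {
        o.PermitLimit = 100;
        o.Window      = TimeSpan.FromMinutes(1);
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Equivalent of [EnableRateLimiting("fixed")]
app.MapGet("/api/products", () => Results.Ok())
   .RequireRateLimiting("fixed");

// Equivalent of [DisableRateLimiting]
app.MapGet("/api/products/public", () => Results.Ok())
   .DisableRateLimiting();

app.Run();
```

RequireRateLimiting can also be applied to a route group, so a whole set of endpoints shares one policy.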

2.3 Global application on all endpoints

builder.Services.AddRateLimiter(options =>
{
    // Global policy — applies to all endpoints
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(
        httpContext => RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 200,
                Window      = TimeSpan.FromMinutes(1)
            }));
});

3. Sliding Window

builder.Services.AddRateLimiter(options =>
{
    options.AddSlidingWindowLimiter("sliding", limiterOptions =>
    {
        limiterOptions.PermitLimit          = 100;
        limiterOptions.Window               = TimeSpan.FromMinutes(1);
        limiterOptions.SegmentsPerWindow    = 6;   // window divided into 6 segments of 10s
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit           = 5;
    });
});

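SegmentsPerWindow controls the granularity of the sliding window. With 6 segments per minute, the window advances every 10 seconds, and permits used in a segment become available again once that segment slides out of the window. This is finer-grained than Fixed Window and cheaper than storing a timestamp per request.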
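
The segment mechanism can be sketched with a small array of counters. The code below is a simplified, hand-rolled model written for illustration (it is not the framework's source, and the names are hypothetical); segment advancement is triggered manually instead of by a timer so the run is deterministic.

```csharp
using System;

// Simplified model of a segmented sliding window, for illustration only.
const int permitLimit = 100;
const int segmentsPerWindow = 6;              // 6 x 10s = 1 minute

int[] usedPerSegment = new int[segmentsPerWindow];
int currentSegment = 0;
int available = permitLimit;

bool TryAcquire()
{
    if (available == 0) return false;
    available--;
    usedPerSegment[currentSegment]++;         // remember in which segment the permit was used
    return true;
}

// Would be called once per segment (every 10 seconds with the values above).
void AdvanceSegment()
{
    currentSegment = (currentSegment + 1) % segmentsPerWindow;
    available += usedPerSegment[currentSegment]; // permits used one full window ago are returned
    usedPerSegment[currentSegment] = 0;
}

int Burst(int requests)
{
    int accepted = 0;
    for (int i = 0; i < requests; i++)
        if (TryAcquire()) accepted++;
    return accepted;
}

for (int i = 0; i < 5; i++) AdvanceSegment(); // now in the 0:50-1:00 segment
int firstBurst = Burst(100);                  // 100 accepted

AdvanceSegment();                             // 1:00, where a fixed window would reset
int secondBurst = Burst(100);                 // 0 accepted: the first burst is still in the window

for (int i = 0; i < 5; i++) AdvanceSegment(); // 1:50, the segment holding the first burst slides out
int thirdBurst = Burst(100);                  // 100 accepted again

Console.WriteLine($"First: {firstBurst}, second: {secondBurst}, third: {thirdBurst}");
```

Compared with the fixed window, the second burst right after the minute boundary is rejected in full, at the cost of one extra counter per segment.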

4. Token Bucket

builder.Services.AddRateLimiter(options =>
{
    options.AddTokenBucketLimiter("token-bucket", limiterOptions =>
    {
        limiterOptions.TokenLimit            = 50;   // max bucket capacity
        limiterOptions.ReplenishmentPeriod   = TimeSpan.FromSeconds(10);
        limiterOptions.TokensPerPeriod       = 10;   // 10 tokens every 10s = 1/s
        limiterOptions.AutoReplenishment     = true; // automatic background reload
        limiterOptions.QueueProcessingOrder  = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit            = 5;
    });
});

The above configuration allows bursts of up to 50 requests if the bucket is full, but the sustained average rate is 1 request/second, and an empty bucket takes 50 seconds to refill completely. Ideal for endpoints that must stay responsive to normal human interaction while throttling automated scripts.

5. Concurrency Limiter

builder.Services.AddRateLimiter(options =>
{
    options.AddConcurrencyLimiter("concurrency", limiterOptions =>
    {
        limiterOptions.PermitLimit          = 20;  // max 20 simultaneous requests
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit           = 10;
    });
});

Suitable for endpoints performing costly operations: report generation, image processing, calls to slow external services. You limit how many are processed simultaneously, not how many come per second.

6. Rate limiting per user — PartitionedRateLimiter

The most common production scenario: different limits for authenticated vs. anonymous users, or limits per subscription plan.

using System.Security.Claims; // ClaimTypes, FindFirstValue

builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("per-user", httpContext =>
    {
        var user = httpContext.User;

        // Authenticated user — more generous limit
        if (user.Identity?.IsAuthenticated == true)
        {
            var userId = user.FindFirstValue(ClaimTypes.NameIdentifier)
                ?? "authenticated-unknown";

            // Limit per subscription plan
            var plan  = user.FindFirstValue("subscription_plan") ?? "basic";
            var limit = plan switch
            {
                "pro"      => 1000,
                "business" => 5000,
                _          => 100    // basic
            };

            return RateLimitPartition.GetSlidingWindowLimiter(
                partitionKey: $"user:{userId}",
                factory: _ => new SlidingWindowRateLimiterOptions
                {
                    PermitLimit       = limit,
                    Window            = TimeSpan.FromMinutes(1),
                    SegmentsPerWindow = 6
                });
        }

        // Anonymous user — limited per IP
        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: $"anon:{ip}",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 20,
                Window      = TimeSpan.FromMinutes(1)
            });
    });

    options.OnRejected = async (context, ct) =>
    {
        context.HttpContext.Response.StatusCode = 429;

        if (context.Lease.TryGetMetadata(
                MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString();
        }

        await context.HttpContext.Response.WriteAsJsonAsync(new
        {
            error = "Rate limit exceeded.",
            retryAfterSeconds = (int)retryAfter.TotalSeconds // 0 if the limiter supplied no Retry-After metadata
        }, ct);
    };
});

Applying on a controller

[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("per-user")]
public class ApiController : ControllerBase
{
    // All endpoints in the controller follow the "per-user" policy
}

7. Rate limiting on critical endpoints — authentication and registration

Authentication endpoints are the primary target of brute-force attacks. They must be treated separately, with much stricter limits:

builder.Services.AddRateLimiter(options =>
{
    // AddFixedWindowLimiter("name", ...) creates ONE limiter shared by all callers,
    // so a per-IP limit needs AddPolicy with the client IP as partition key.
    static RateLimitPartition<string> PerIp(HttpContext ctx, int permits, TimeSpan window) =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = permits,
                Window      = window,
                QueueLimit  = 0        // no queue
            });

    // Login endpoint: strict, 5 attempts per 15 minutes per IP
    options.AddPolicy("auth-strict",
        ctx => PerIp(ctx, 5, TimeSpan.FromMinutes(15)));

    // Registration endpoint: moderate, 3 per hour per IP
    options.AddPolicy("register-moderate",
        ctx => PerIp(ctx, 3, TimeSpan.FromHours(1)));

    // Password reset: very strict, 3 per 24 hours per IP
    options.AddPolicy("password-reset",
        ctx => PerIp(ctx, 3, TimeSpan.FromHours(24)));
});

[HttpPost("login")]
[EnableRateLimiting("auth-strict")]
[AllowAnonymous]
public async Task<IActionResult> Login([FromBody] LoginDto dto) { ... }

[HttpPost("register")]
[EnableRateLimiting("register-moderate")]
[AllowAnonymous]
public async Task<IActionResult> Register([FromBody] RegisterDto dto) { ... }

[HttpPost("forgot-password")]
[EnableRateLimiting("password-reset")]
[AllowAnonymous]
public async Task<IActionResult> ForgotPassword([FromBody] ForgotPasswordDto dto) { ... }

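Beware of proxies: if your application runs behind a reverse proxy or load balancer, RemoteIpAddress is the proxy's IP, so every client lands in the same partition. Configure the ForwardedHeaders middleware to restore the client IP from the X-Forwarded-For header, and restrict it to trusted proxies, because the header is client-controlled and can otherwise be spoofed. X-Real-IP is not read by default; set ForwardedForHeaderName if your proxy sends only that header.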

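// Program.cs: read the real client IP behind a proxy
using Microsoft.AspNetCore.HttpOverrides; // ForwardedHeaders, ForwardedHeadersOptions
using System.Net;                         // IPAddress
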
builder.Services.Configure<ForwardedHeadersOptions>(options =>
{
    options.ForwardedHeaders =
        ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto;
    // Restrict to trusted proxy IPs
    options.KnownProxies.Add(IPAddress.Parse("10.0.0.1"));
});

app.UseForwardedHeaders();
app.UseRateLimiter(); // after ForwardedHeaders

8. 429 response and standard headers

A well-formed 429 response allows clients to behave intelligently — to wait exactly as long as needed before retrying:

options.OnRejected = async (context, cancellationToken) =>
{
    var response = context.HttpContext.Response;
    response.StatusCode  = StatusCodes.Status429TooManyRequests;
    response.ContentType = "application/json";

    // Retry-After: how many seconds the client should wait
    if (context.Lease.TryGetMetadata(
            MetadataName.RetryAfter, out var retryAfter))
    {
        response.Headers.RetryAfter =
            ((int)retryAfter.TotalSeconds).ToString();
    }

    // Log for monitoring
    var logger = context.HttpContext.RequestServices
        .GetRequiredService<ILogger<Program>>();

    var ip   = context.HttpContext.Connection.RemoteIpAddress;
    var path = context.HttpContext.Request.Path;
    var user = context.HttpContext.User.Identity?.Name ?? "anonymous";

    logger.LogWarning(
        "Rate limit exceeded. IP: {IP}, Path: {Path}, User: {User}",
        ip, path, user);

    await response.WriteAsJsonAsync(new
    {
        type    = "https://tools.ietf.org/html/rfc6585#section-4",
        title   = "Too Many Requests",
        status  = 429,
        detail  = "You have exceeded the allowed request limit.",
        retryAfterSeconds = (int)retryAfter.TotalSeconds // 0 if the limiter supplied no Retry-After metadata
    }, cancellationToken);
};
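
On the client side, a well-formed 429 is only useful if the caller honors it. The sketch below shows one way a .NET client might read the Retry-After header and wait before retrying; GetRetryDelay and GetWithRetryAsync are hypothetical helper names, and the 1-second fallback and the attempt cap are arbitrary assumptions.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

// Retry-After may carry a number of seconds (Delta) or an absolute date (Date).
static TimeSpan GetRetryDelay(HttpResponseMessage response, TimeSpan fallback)
{
    RetryConditionHeaderValue? retryAfter = response.Headers.RetryAfter;
    if (retryAfter?.Delta is TimeSpan delta) return delta;
    if (retryAfter?.Date is DateTimeOffset date)
    {
        TimeSpan wait = date - DateTimeOffset.UtcNow;
        return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
    }
    return fallback; // header missing: use the caller's default
}

// Retries only on 429, waiting as long as the server asked.
static async Task<HttpResponseMessage> GetWithRetryAsync(
    HttpClient client, string url, int maxAttempts, CancellationToken ct = default)
{
    for (int attempt = 1; ; attempt++)
    {
        HttpResponseMessage response = await client.GetAsync(url, ct);
        if (response.StatusCode != HttpStatusCode.TooManyRequests || attempt >= maxAttempts)
            return response;

        TimeSpan delay = GetRetryDelay(response, TimeSpan.FromSeconds(1));
        response.Dispose();
        await Task.Delay(delay, ct);
    }
}

// Local demonstration without a network call.
var sample = new HttpResponseMessage(HttpStatusCode.TooManyRequests);
sample.Headers.RetryAfter = new RetryConditionHeaderValue(TimeSpan.FromSeconds(30));
TimeSpan parsed = GetRetryDelay(sample, TimeSpan.FromSeconds(1));
Console.WriteLine(parsed); // 00:00:30
```

In production code a retry library with jitter and an overall timeout is usually preferable; the point here is only that the header value drives the wait.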

9. Distributed rate limiting — multiple instances

The built-in rate limiting is in-process: it stores counters in memory. If you run multiple instances of the application (scale-out, Kubernetes), each instance keeps its own counters, so a client whose requests are spread across N instances can make up to N times the configured limit.

Solutions for distributed rate limiting:

9.1 Redis with RedisRateLimiting

dotnet add package RedisRateLimiting.AspNetCore

// Optional: distributed cache registration; the Redis rate limiter below does not depend on it
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration
        .GetConnectionString("Redis");
});

builder.Services.AddRateLimiter(options =>
{
    var redisConnection = ConnectionMultiplexer.Connect(
        builder.Configuration.GetConnectionString("Redis")!);

    options.AddRedisSlidingWindowLimiter("distributed-sliding",
        limiterOptions =>
        {
            limiterOptions.ConnectionMultiplexerFactory = () => redisConnection;
            limiterOptions.PermitLimit       = 100;
            limiterOptions.Window            = TimeSpan.FromMinutes(1);
        });
});

9.2 Azure API Management

If you use Azure API Management as a gateway, rate limiting can be configured at the gateway level — before the request reaches the application, regardless of how many backend instances exist:

<!-- APIM policy: 100 requests per minute per subscription key -->
<rate-limit-by-key
    calls="100"
    renewal-period="60"
    counter-key="@(context.Subscription.Id)" />

<!-- OR per IP -->
<rate-limit-by-key
    calls="20"
    renewal-period="60"
    counter-key="@(context.Request.IpAddress)" />

APIM and application rate limiting can coexist — APIM provides macro-level protection (per client/plan), the application provides granular protection (per specific endpoint).

10. Combined rate limiting — multiple policies on the same endpoint

You can chain multiple limiters in GlobalLimiter for layered protection; a request must acquire a permit from every limiter in the chain. For example, a per-IP limit combined with a server-wide concurrency cap:

builder.Services.AddRateLimiter(options =>
{
    // Global limit — protects server resources
    options.GlobalLimiter = PartitionedRateLimiter.CreateChained(
        // Layer 1: limit per IP
        PartitionedRateLimiter.Create<HttpContext, string>(
            ctx => RateLimitPartition.GetFixedWindowLimiter(
                partitionKey: ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown",
                factory: _ => new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 200,
                    Window      = TimeSpan.FromMinutes(1)
                })),

        // Layer 2: total global limit (all IPs)
        PartitionedRateLimiter.Create<HttpContext, string>(
            _ => RateLimitPartition.GetConcurrencyLimiter(
                partitionKey: "global",
                factory: _ => new ConcurrencyLimiterOptions
                {
                    PermitLimit = 500,
                    QueueLimit  = 0
                }))
    );
});

11. Rate limiting testing

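using System.Net;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc.Testing;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Options;
using NUnit.Framework;

[TestFixture]
public class RateLimitingTests
{
    private WebApplicationFactory<Program> _factory = default!;

    [SetUp]
    public void Setup()
    {
        _factory = new WebApplicationFactory<Program>()
            .WithWebHostBuilder(builder =>
            {
                // ConfigureTestServices runs after the app's own registrations
                builder.ConfigureTestServices(services =>
                {
                    // The app already registers a "fixed" policy, and adding the same
                    // policy name twice throws. Drop the app's rate limiter configuration
                    // first; any other policy the tested endpoints use must be re-added too.
                    services.RemoveAll<IConfigureOptions<RateLimiterOptions>>();

                    // Re-register with small limits for tests
                    services.AddRateLimiter(options =>
                    {
                        options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
                        options.AddFixedWindowLimiter("fixed",
                            o =>
                            {
                                o.PermitLimit = 3;
                                o.Window      = TimeSpan.FromSeconds(10);
                                o.QueueLimit  = 0;
                            });
                    });
                });
            });
    }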

    [Test]
    public async Task RateLimit_ExceedingLimit_Returns429()
    {
        var client = _factory.CreateClient();

        // First 3 requests should pass
        for (int i = 0; i < 3; i++)
        {
            var response = await client.GetAsync("/api/products");
            Assert.That(response.StatusCode,
                Is.Not.EqualTo(HttpStatusCode.TooManyRequests));
        }

        // 4th should be rejected
        var rejected = await client.GetAsync("/api/products");
        Assert.That(rejected.StatusCode,
            Is.EqualTo(HttpStatusCode.TooManyRequests));
    }

    [TearDown]
    public void TearDown() => _factory.Dispose();
}

12. Common problems and their solutions

Problem	Likely cause	Solution
Rate limiting does not work	`app.UseRateLimiter()` is missing or in the wrong order	Call `app.UseRateLimiter()`; when endpoint policies are used, place it after `app.UseRouting()`
All users get 429, not just abusers	Proxy IP is used as partition key	Configure `ForwardedHeaders` middleware before rate limiter
Clients exceed the limit when the app is scaled out	Counters are in-process, not distributed	Migrate to Redis with `RedisRateLimiting` or use APIM
Client does not know when to retry	`Retry-After` header is missing from 429 response	Set `Retry-After` in `OnRejected` from the lease's `MetadataName.RetryAfter` metadata
Internal endpoints blocked by rate limiting	Health checks, metrics endpoints are limited	Add `[DisableRateLimiting]` on health check endpoints or exclude by path

13. Production checklist

✅ app.UseRateLimiter() registered in the correct order in the pipeline (after UseRouting when endpoint policies are used)
✅ ForwardedHeaders middleware configured if the app is behind a proxy
✅ Authentication endpoints (/login, /register, /forgot-password) with separate strict limits
✅ 429 response with Retry-After header and structured JSON body
✅ Logging in OnRejected with IP, path, and user for monitoring
✅ Different limits for authenticated vs. anonymous users
✅ Limits per subscription plan if the app has multiple tiers
✅ Redis or APIM for deployment with multiple instances
✅ [DisableRateLimiting] on health check and metrics endpoints
✅ Integration tests verifying behavior when limits are exceeded
✅ Alert in Application Insights / Azure Monitor when 429 rate exceeds a threshold

Conclusion

The built-in rate limiting in .NET 10 removes the need for external libraries for the most common scenarios. The four algorithms cover different cases: Fixed Window for simplicity, Sliding Window for even distribution, Token Bucket for natural human behavior, Concurrency Limiter for resource-level protection.

The most important architectural decision: in-process or distributed. If you run a single instance, built-in is sufficient. If you scale horizontally, you need Redis or a gateway (APIM, Nginx) that centralizes counters.

The security series concludes with the last article: Audit Logging and Security Events — how to record what happens in the system so you can detect attacks, investigate incidents, and demonstrate compliance.