
Improve and add ratelimit tests#899

Open
Tharsanan1 wants to merge 4 commits into wso2:main from Tharsanan1:token-rl-new

Conversation

Tharsanan1 (Contributor) commented Feb 3, 2026

Purpose

$subject

Summary by CodeRabbit

  • New Features

    • Advanced multi-quota rate-limiting (GCRA & fixed-window) with memory and Redis backends.
    • Token-based and LLM cost-based rate limiting with dynamic cost extraction and per-provider delegates.
    • Standardized RateLimit headers (legacy and IETF) and customizable 429 responses.
  • Tests

    • Extensive unit, integration, and BDD suites covering token- and cost-based scenarios, concurrency, and edge cases.
  • Documentation

    • Added example provider/template manifests and policy definition/config samples.
  • Chores

    • Test infra: docker-compose/service added for integration tests.

coderabbitai bot (Contributor) commented Feb 3, 2026

Walkthrough

Adds a full advanced rate-limiting subsystem plus two delegating policies (token-based and LLM cost-based), supporting GCRA and fixed-window algorithms, memory/Redis backends, CEL-based key/cost extraction, extensive unit/integration/BDD tests, examples, and related build/config updates.
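
The GCRA half of this is easiest to see as a small bookkeeping exercise over a "theoretical arrival time" (TAT). The sketch below is not the PR's implementation, only a minimal in-memory illustration of the decision the walkthrough describes; names such as gcraState and allowN are made up, and real GCRA variants differ slightly in how they define the burst tolerance.

```go
package main

import (
	"fmt"
	"time"
)

type gcraState struct {
	tat time.Time // theoretical arrival time for one key
}

// allowN checks n units against a rate of `limit` per `period` with a burst of `burst`.
func (s *gcraState) allowN(now time.Time, n, limit, burst int64, period time.Duration) (bool, time.Duration) {
	emissionInterval := period / time.Duration(limit)
	// Tolerance lets `burst` requests land back to back; some variants use burst
	// (not burst-1) here and admit one extra request up front.
	tolerance := emissionInterval * time.Duration(burst-1)

	tat := s.tat
	if tat.Before(now) {
		tat = now // idle key: start fresh
	}

	allowAt := tat.Add(-tolerance)
	if now.Before(allowAt) {
		return false, allowAt.Sub(now) // denied; the caller can surface this as Retry-After
	}

	s.tat = tat.Add(emissionInterval * time.Duration(n)) // consume n units
	return true, 0
}

func main() {
	s := &gcraState{}
	now := time.Now()
	for i := 1; i <= 7; i++ {
		ok, retry := s.allowN(now, 1, 5, 5, time.Second) // 5 req/s with a burst of 5
		fmt.Printf("request %d: allowed=%v retryAfter=%v\n", i, ok, retry)
	}
}
```

With these numbers, the first five back-to-back requests are admitted and the sixth and seventh are denied with a 200ms retry delay, which is the behaviour the memory and Redis limiters in this PR are expected to share.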

Changes

  • Infrastructure & Test Config (gateway/build.yaml, gateway/it/docker-compose.test.yaml, gateway/it/test-config.yaml, gateway/it/go.mod): Added the token-based-ratelimit policy entry, a new echo-backend-multi-arch test service, an LLM cost-based ratelimit config block, and dependency updates in the integration test module.
  • Advanced Rate Limiting Core (gateway/policies/advanced-ratelimit/..., gateway/policies/advanced-ratelimit/go.mod, gateway/policies/advanced-ratelimit/policy-definition.yaml): New advanced-ratelimit policy implementation (policy runtime, config schema, module) with multi-quota support, key extraction, cost extraction, and backend selection.
  • Limiter Abstractions & Utilities (gateway/policies/advanced-ratelimit/limiter/...): New limiter interface, factory/registry, Result type with header helpers, and Clock abstraction (System/Fixed) for testing.
  • Fixed-Window Algorithm (gateway/policies/advanced-ratelimit/algorithms/fixedwindow/...): Fixed-window algorithm implementation: Policy, memory and Redis limiters, MultiLimiter composition, factory registration, and unit tests.
  • GCRA Algorithm (gateway/policies/advanced-ratelimit/algorithms/gcra/...): GCRA implementation with Policy, memory and Redis limiters (Lua script), MultiLimiter, factory registration, and tests.
  • CEL & Cost Extraction (gateway/policies/advanced-ratelimit/cel_evaluator.go, .../cost_extractor.go): CEL evaluator with cached programs plus a CostExtractor supporting multiple source types (header/metadata/body/CEL) across request/response phases; a sketch follows this list.
  • Advanced RateLimit Integration Tests (gateway/policies/advanced-ratelimit/ratelimit_integration_test.go): Integration tests covering shared vs route-scoped quota lifecycle and limiter reuse/cleanup.
  • Token-Based Rate Limiting (gateway/policies/token-based-ratelimit/..., gateway/policies/token-based-ratelimit/go.mod, gateway/policies/token-based-ratelimit/policy-definition.yaml): New token-based policy that resolves provider templates, creates per-provider delegates into advanced-ratelimit, and includes concurrency protections, conversion helpers, and extensive tests.
  • LLM Cost-Based Rate Limiting (gateway/policies/llm-cost-based-ratelimit/..., gateway/policies/llm-cost-based-ratelimit/go.mod, gateway/policies/llm-cost-based-ratelimit/policy-definition.yaml): New policy converting provider templates/budgets into advanced-ratelimit quotas, with per-provider delegate caching, cost scaling/transform, and comprehensive unit tests.
  • Integration Tests & Examples (gateway/it/features/llm-cost-based-ratelimit.feature, gateway/it/features/token-based-ratelimit.feature, gateway/it/features/ratelimit.feature, gateway/it/suite_test.go, gateway/it/steps/http_steps.go, gateway/examples/test-template.yaml, gateway/examples/test-provider.yaml, gateway/it/test_output.txt): Large BDD feature files for token- and LLM cost-based testing, a newly activated token-based feature in the suite, verbose HTTP step logging, example provider/template manifests, and captured test output.
  • Tests & Coverage (numerous _test.go files under policies/advanced-ratelimit, token-based-ratelimit, llm-cost-based-ratelimit): Extensive unit and integration tests for algorithms, memory/Redis implementations, cost extraction, concurrency, and delegate lifecycle.
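
To make the CEL-based cost extraction entry above concrete, here is a small, hedged example using github.com/google/cel-go. The variable name (usage), the expression, and the per-token rates are assumptions made for illustration; the actual CostExtractor also supports header/metadata/body sources and caches compiled programs, which this sketch only hints at.

```go
package main

import (
	"fmt"
	"log"

	"github.com/google/cel-go/cel"
)

func main() {
	// One environment per policy; compiled programs would typically be cached by expression.
	env, err := cel.NewEnv(
		cel.Variable("usage", cel.MapType(cel.StringType, cel.DynType)),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical cost expression: dollars derived from token counts in the response body.
	expr := `double(usage.prompt_tokens) * 0.00000001 + double(usage.completion_tokens) * 0.00000002`
	ast, iss := env.Compile(expr)
	if iss != nil && iss.Err() != nil {
		log.Fatal(iss.Err())
	}
	prg, err := env.Program(ast)
	if err != nil {
		log.Fatal(err)
	}

	// "usage" stands in for the parsed response payload.
	out, _, err := prg.Eval(map[string]interface{}{
		"usage": map[string]interface{}{
			"prompt_tokens":     50000000,
			"completion_tokens": 10000000,
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("extracted cost: $%.2f\n", out.Value()) // $0.70 with these counts
}
```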

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Gateway as Gateway
    participant Limiter as Limiter
    participant Backend as Backend (Memory/Redis)
    participant CostExt as Cost Extractor
    participant Upstream as Upstream

    Client->>Gateway: Request
    Gateway->>Gateway: Extract key (header/CEL/metadata)
    Gateway->>Limiter: Allow/AllowN(key, n)
    Limiter->>Backend: Check/INCR per-window key
    Backend-->>Limiter: Allowed/Denied + Remaining/Reset
    Limiter-->>Gateway: Result
    alt Denied
        Gateway-->>Client: 429 + RateLimit headers
    else Allowed
        Gateway->>Upstream: Forward request
        Upstream-->>Gateway: Response
        Gateway->>CostExt: Extract response cost (body/header/CEL)
        CostExt-->>Gateway: costValue
        Gateway->>Limiter: ConsumeN(key, costValue)
        Limiter->>Backend: Force-consume tokens
        Backend-->>Limiter: Updated Remaining
        Limiter-->>Gateway: Updated Result
        Gateway-->>Client: 200 + RateLimit headers
    end
sequenceDiagram
    participant Client
    participant PolicyA as Provider Policy A
    participant Resolver as Template Resolver
    participant Delegate as Advanced Ratelimit Delegate

    Client->>PolicyA: Request (provider=P)
    PolicyA->>Resolver: Resolve provider template (cached)
    Resolver-->>PolicyA: Template
    PolicyA->>Delegate: Create/Reuse delegate (per-template hash)
    Delegate-->>PolicyA: Delegate instance
    PolicyA->>Delegate: OnRequest -> Rate check
    Delegate-->>PolicyA: Result
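
Both branches of the first diagram end by attaching rate-limit headers. The sketch below shows one way a Result-like value could be turned into the legacy X-RateLimit-* headers plus an IETF draft-style RateLimit field; the rlResult type and the exact header formats are illustrative and may not match the PR's Result helpers. Passing the check-time now explicitly is one way to keep the reset seconds consistent, a point a later review comment on SetIETFHeaders also raises.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// rlResult is a stand-in for the limiter's Result type described in the walkthrough.
type rlResult struct {
	Allowed    bool
	Limit      int64
	Remaining  int64
	Reset      time.Time // when the current window or quota fully resets
	RetryAfter time.Duration
}

// setRateLimitHeaders writes both legacy and IETF-style headers, using the same
// `now` that was used for the rate check so the computed seconds do not drift.
func setRateLimitHeaders(h http.Header, r rlResult, now time.Time) {
	resetSecs := int64(r.Reset.Sub(now).Round(time.Second).Seconds())
	if resetSecs < 0 {
		resetSecs = 0
	}

	// Legacy headers.
	h.Set("X-RateLimit-Limit", strconv.FormatInt(r.Limit, 10))
	h.Set("X-RateLimit-Remaining", strconv.FormatInt(r.Remaining, 10))
	h.Set("X-RateLimit-Reset", strconv.FormatInt(resetSecs, 10))

	// IETF draft-style combined field.
	h.Set("RateLimit", fmt.Sprintf("limit=%d, remaining=%d, reset=%d", r.Limit, r.Remaining, resetSecs))

	if !r.Allowed {
		h.Set("Retry-After", strconv.FormatInt(int64(r.RetryAfter.Round(time.Second).Seconds()), 10))
	}
}

func main() {
	now := time.Now()
	h := http.Header{}
	setRateLimitHeaders(h, rlResult{Allowed: false, Limit: 100, Remaining: 0, Reset: now.Add(30 * time.Second), RetryAfter: 30 * time.Second}, now)
	fmt.Println(h)
}
```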

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • Remove Hardcoded Policies #788 — Overlaps policy subsystem changes (both touch rate-limit/policy implementations), likely related to replacing older policy implementations with new advanced/token/LLM cost-based modules.

Suggested reviewers

  • pubudu538
  • malinthaprasan
  • Krishanx92

Poem

🐰 I hopped through code with eager paws,
Counting tokens, costs, and clause by clause,
Windows fixed and GCRA bright,
Redis hums into the night,
Headers sing—rate limits light!

🚥 Pre-merge checks: 1 passed, 2 not passed (1 warning, 1 inconclusive)

❌ Failed checks

  • Description check (⚠️ Warning): The PR description is severely incomplete. It contains only the template heading '## Purpose' followed by the placeholder '> $subject', with no content explaining the purpose, goals, approach, test coverage, or any other required section of the template. Resolution: complete the PR description with all required sections from the template: Purpose (with issue links), Goals, Approach, User stories, Documentation, Automation tests (unit and integration with coverage details), Security checks, Samples, Related PRs, and Test environment details.
  • Title check (❓ Inconclusive): The title 'Improve and add ratelimit tests' is vague and generic; it uses non-specific terms like 'Improve' and 'add' without clarifying the primary change or scope of the ratelimit enhancements. Resolution: provide a more specific title that clearly describes the main ratelimit feature or test addition, e.g., 'Add token-based and LLM cost-based ratelimit tests' or 'Implement advanced ratelimit policies with multi-algorithm support'.

✅ Passed checks

  • Docstring Coverage (✅ Passed): Docstring coverage is 85.71%, which meets the required threshold of 80.00%.



coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@gateway/it/features/llm-cost-based-ratelimit.feature`:
- Around line 381-393: The test expectation is inconsistent: scenario 3 uses
default costs from test-config.yaml (prompt_token_cost and
completion_token_cost) but the comments and expected status for the second
request assume different per-token costs; update the scenario in
gateway/it/features/llm-cost-based-ratelimit.feature by either (A) adding a
params block in the scenario's policy config that sets prompt_token_cost and
completion_token_cost to the values used in the comments, or (B) change the
second POST step ("When I send a POST request to
\"http://localhost:8080/token-type-cost/chat/completions\" with body: ...") to
expect 200 and add a new third request that exceeds budget and expects 429 so
the scenario mirrors patterns used in scenarios 1 & 2; ensure you reference the
scenario's params and budgetLimits keys and update the commented cost
calculations accordingly.

In `@gateway/it/features/ratelimit.feature`:
- Line 48: The tests in gateway/it/features/ratelimit.feature reference
advanced-ratelimit@v0.1.3 but policy-manifest.yaml only exposes
advanced-ratelimit@v0.1.1, causing a version mismatch; fix by either updating
policy-manifest.yaml to include advanced-ratelimit@v0.1.3 (add or change the
manifest entry for advanced-ratelimit to v0.1.3) or change all occurrences of
"v0.1.3" in ratelimit.feature back to "v0.1.1" so the test scenarios and the
manifest use the same version.

In `@gateway/it/test-config.yaml`:
- Around line 354-371: Remove the orphaned llm_cost_ratelimit block from
test-config.yaml because the corresponding policy is disabled in
policy-manifest.yaml and there is no policy-definition.yaml using
wso2/defaultValue; alternatively, if you intend to enable it, instead reactivate
the policy entry in gateway/policies/policy-manifest.yaml, add the matching
policy-definition.yaml that exposes the needed wso2/defaultValue keys, re-enable
the feature test in gateway/it/suite_test.go, and then restore the
llm_cost_ratelimit configuration in test-config.yaml.

In `@gateway/policies/policy-manifest.yaml`:
- Around line 3-4: The gomodule path for the policy named
"token-based-ratelimit" contains an incorrect "./" prefix; update the gomodule
field for token-based-ratelimit to match the other policies by removing the "./"
so it reads
"github.com/wso2/gateway-controllers/policies/token-based-ratelimit@v0.1.0"
(edit the gomodule value for the token-based-ratelimit entry in
policy-manifest.yaml).
🧹 Nitpick comments (2)
gateway/it/features/llm-cost-based-ratelimit.feature (1)

248-249: Clarify cost rate differences across scenarios.

The cost rate in this scenario's comments ($0.0000001 per prompt token) differs from scenario 1 ($0.00000001 per prompt token) by a factor of 10. If both scenarios share the same systemParameters from test-config.yaml, the comments should be consistent. Consider adding a note explaining if different configurations apply, or correct the calculations to use uniform rates.

gateway/it/features/token-based-ratelimit.feature (1)

1092-1094: Consider increasing wait time for reliability.

The 2-second delay for cleanup may be insufficient in slower CI environments. Consider increasing to 3-5 seconds or implementing a readiness check instead of a fixed delay.

     # Small delay to allow cleanup
-    When I wait for "2" seconds
+    When I wait for "3" seconds

Comment on lines +381 to +393
# Request 2: 10M prompt + 20M completion = (10M×$0.00000001) + (20M×$0.00000005) = $0.10 + $1.00 = $1.10
# Total consumed: $1.00 + $1.10 = $2.10 (exceeds $2.00 budget)
When I send a POST request to "http://localhost:8080/token-type-cost/chat/completions" with body:
"""
{
"usage": {
"prompt_tokens": 10000000,
"completion_tokens": 20000000,
"total_tokens": 30000000
}
}
"""
Then the response status code should be 429

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed:

wc -l gateway/it/features/llm-cost-based-ratelimit.feature
cat -n gateway/it/features/llm-cost-based-ratelimit.feature | head -150
cat -n gateway/it/features/llm-cost-based-ratelimit.feature | sed -n '106,120p'
cat -n gateway/it/features/llm-cost-based-ratelimit.feature | sed -n '250,320p'
cat -n gateway/it/features/llm-cost-based-ratelimit.feature | sed -n '320,410p'
find gateway -name "*policy-definition*" -o -name "*cost*ratelimit*" | grep -E "\.(yaml|yml|json)$"
find gateway -path "*/it/*" -name "*.yaml" -o -name "*.yml" | head -20
# Look for the llm-cost-based-ratelimit policy definition
rg "llm-cost-based-ratelimit" gateway --type yaml -B 5 -A 20
find gateway -type d -name "*llm*"
rg "llm-cost-based-ratelimit" gateway -B 2 -A 10
# Check test-config.yaml for system parameters or policy configurations
cat gateway/it/test-config.yaml
cat -n gateway/it/features/llm-cost-based-ratelimit.feature | sed -n '328,365p'
# Check for any specific cost parameter configuration in scenario 3
cat -n gateway/it/features/llm-cost-based-ratelimit.feature | sed -n '298,400p' | grep -A 50 "Cost calculation with different token types"
# Check policy-definition.yaml or policy-specific defaults that might override the test-config values
find gateway -name "policy-definition.yaml" | xargs grep -l "llm-cost" 2>/dev/null
# Search for any reference to the $0.00000005 cost mentioned in scenario 3 comments
rg "0.00000005" gateway --type yaml --type feature
# Examine the complete scenario 3 for any configuration overrides
cat -n gateway/it/features/llm-cost-based-ratelimit.feature | sed -n '298,402p'


Correct test logic inconsistency in scenario 3: request 2 should be 200, not 429.

Looking at the scenario 3 policy configuration (lines 352-361), no cost overrides are specified in the params section—only budgetLimits is defined. This means the policy uses the default token costs from test-config.yaml:

  • prompt_token_cost: 0.01 per 1M tokens
  • completion_token_cost: 0.02 per 1M tokens

However, scenario 3's comments cite completion cost of $0.00000005 per token ($0.05 per 1M), which doesn't match the defaults.

Using the default costs for scenario 3:

  • Request 1: 50M×$0.00000001 + 10M×$0.00000002 = $0.70, remaining = $1.30 > 0
  • Request 2: 10M×$0.00000001 + 20M×$0.00000002 = $0.50, total = $1.20, remaining = $0.80 > 0

Per lines 106-108, request 2 should be allowed since remaining > 0. The test expects 429, which violates the documented behavior.

Fix: Either:

  1. Add cost parameters to scenario 3's policy config to match the commented values, or
  2. Update request 2 to expect 200 and add a request 3 that exceeds the budget and receives 429 (matching the pattern in scenarios 1 & 2)
Suggested correction
-    Then the response status code should be 429
+    Then the response status code should be 200
+
+    # Request 3: Should now be blocked because budget is exhausted
+    When I send a POST request to "http://localhost:8080/token-type-cost/chat/completions" with body:
+      """
+      {
+        "usage": {
+          "prompt_tokens": 1000000,
+          "completion_tokens": 500000,
+          "total_tokens": 1500000
+        }
+      }
+      """
+    Then the response status code should be 429

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@gateway/it/features/llm-cost-based-ratelimit.feature`:
- Around line 173-289: The test assumes $0.0000001 per prompt token and
$0.0000002 per completion token but the suite uses different defaults; update
the fixture by adding explicit per-token cost fields to match the scenario
math—add the prompt token cost and completion token cost (e.g., promptTokenCost:
0.0000001 and completionTokenCost: 0.0000002) to the multi-window-cost-template
(or multi-window-cost-provider) so the llm-cost-based-ratelimit policy
calculations against the budgetLimits ($5 for "1h" and $10 for "24h") produce
the expected $2/$4 consumption for the first two requests.

In `@gateway/it/features/token-based-ratelimit.feature`:
- Around line 1006-1107: Rename the scenario title to reflect the actual steps
(e.g., change "Template change triggers cache invalidation" to "Provider
recreate grants fresh quota") or instead modify the steps to perform an actual
template update on the LlmProviderTemplate named "change-test-template" before
recreating the provider; locate the "Scenario:" line and the resources
"change-test-template" and "change-test-provider" in the feature and update the
title or add a patch/update step that changes the template spec (e.g., modify
totalTokens or requestModel identifiers) so the scenario name matches the
actions.
- Around line 40-48: The test uses response-phase extraction (identifiers
$.json.usage.*) for promptTokens, completionTokens and totalTokens but the
scenario expects pre-request blocking (immediate 429 on the third request);
change those three mappings (promptTokens, completionTokens, totalTokens) to use
request-payload extraction so cost is known before the request (e.g. use the
request body path such as $.request.body.usage.prompt_tokens or the suite's
request-phase identifier pattern) and keep the immediate 429 expectation for
that scenario; alternatively, if you prefer response-phase behavior, update the
scenario assertions to allow the third request and assert a 429 on the following
request instead.

Comment on lines +173 to +289
Scenario: Cost-based rate limit with multiple budget time windows
Given I authenticate using basic auth as "admin"

# Create LLM provider template
When I create this LLM provider template:
"""
apiVersion: gateway.api-platform.wso2.com/v1alpha1
kind: LlmProviderTemplate
metadata:
name: multi-window-cost-template
spec:
displayName: Multi Window Cost Template
promptTokens:
location: payload
identifier: $.json.usage.prompt_tokens
completionTokens:
location: payload
identifier: $.json.usage.completion_tokens
totalTokens:
location: payload
identifier: $.json.usage.total_tokens
requestModel:
location: payload
identifier: $.json.model
responseModel:
location: payload
identifier: $.json.model
"""
Then the response status code should be 201

# Create LLM provider with multiple budget limits
Given I authenticate using basic auth as "admin"
When I create this LLM provider:
"""
apiVersion: gateway.api-platform.wso2.com/v1alpha1
kind: LlmProvider
metadata:
name: multi-window-cost-provider
spec:
displayName: Multi Window Cost Provider
version: v1.0
context: /multi-window-cost
template: multi-window-cost-template
upstream:
url: http://echo-backend-multi-arch:8080/anything
auth:
type: api-key
header: Authorization
value: test-api-key
accessControl:
mode: deny_all
exceptions:
- path: /chat/completions
methods: [POST, GET]
policies:
- name: llm-cost-based-ratelimit
version: v0.1.0
paths:
- path: /chat/completions
methods: [POST]
params:
# Multiple budget limits: $5/hour AND $10/day
budgetLimits:
- amount: 5.0 # $5 per hour
duration: "1h"
- amount: 10.0 # $10 per day
duration: "24h"
"""
Then the response status code should be 201
And I wait for the endpoint "http://localhost:8080/multi-window-cost/chat/completions" to be ready

# Must use application/json content-type for the echo backend to parse the body
# NOTE: For post-response cost extraction, requests are blocked only when budget is exhausted.
Given I set header "Content-Type" to "application/json"

# Request 1: 10M prompt + 5M completion = (10M×$0.0000001) + (5M×$0.0000002) = $1.00 + $1.00 = $2.00
# Hourly: Consumed $2.00, Remaining $3.00 | Daily: Consumed $2.00, Remaining $8.00
When I send a POST request to "http://localhost:8080/multi-window-cost/chat/completions" with body:
"""
{
"usage": {
"prompt_tokens": 10000000,
"completion_tokens": 5000000,
"total_tokens": 15000000
}
}
"""
Then the response status code should be 200

# Request 2: 20M prompt + 10M completion = (20M×$0.0000001) + (10M×$0.0000002) = $2.00 + $2.00 = $4.00
# Hourly: Consumed $6.00 (overage), Remaining $0.00 | Daily: Consumed $6.00, Remaining $4.00
# NOTE: With post-response cost extraction, this request is allowed because remaining > 0.
When I send a POST request to "http://localhost:8080/multi-window-cost/chat/completions" with body:
"""
{
"usage": {
"prompt_tokens": 20000000,
"completion_tokens": 10000000,
"total_tokens": 30000000
}
}
"""
Then the response status code should be 200

# Request 3: Any request should be blocked because hourly budget is now exhausted
When I send a POST request to "http://localhost:8080/multi-window-cost/chat/completions" with body:
"""
{
"usage": {
"prompt_tokens": 1000000,
"completion_tokens": 500000,
"total_tokens": 1500000
}
}
"""
Then the response status code should be 429
And the response body should contain "Rate limit exceeded"

⚠️ Potential issue | 🟠 Major

Budget exhaustion math doesn’t match the default costs.
With the defaults in test-config (0.01/0.02 per 1M), the first two requests only consume about $0.60, so the third request shouldn’t 429. Either lower the budgets/increase token counts, or set explicit prompt/completion costs to match the $2/$4 calculations.

Possible fix: set per‑token costs to match the scenario math
                 params:
                   # Multiple budget limits: $5/hour AND $10/day
                   budgetLimits:
                     - amount: 5.0       # $5 per hour
                       duration: "1h"
                     - amount: 10.0      # $10 per day
                       duration: "24h"
+                  promptTokenCost: 0.10
+                  completionTokenCost: 0.20
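
For concreteness, a throwaway calculation of this scenario's first two requests under both cost assumptions; the rates come from this review comment and the test-config defaults it cites, not from inspecting the policy code.

```go
package main

import "fmt"

// cost returns the dollar cost of one request given token counts and per-token rates.
func cost(promptTokens, completionTokens, promptRate, completionRate float64) float64 {
	return promptTokens*promptRate + completionTokens*completionRate
}

func main() {
	// Requests from the scenario: 10M prompt + 5M completion, then 20M prompt + 10M completion.
	// Default rates cited from test-config.yaml: $0.01 / $0.02 per 1M tokens.
	defPrompt, defCompletion := 0.01/1e6, 0.02/1e6
	// Rates assumed by the scenario's comments: $0.0000001 / $0.0000002 per token.
	scnPrompt, scnCompletion := 0.0000001, 0.0000002

	withDefaults := cost(10e6, 5e6, defPrompt, defCompletion) + cost(20e6, 10e6, defPrompt, defCompletion)
	withScenario := cost(10e6, 5e6, scnPrompt, scnCompletion) + cost(20e6, 10e6, scnPrompt, scnCompletion)

	fmt.Printf("default rates:  $%.2f after two requests (well under the $5 hourly budget)\n", withDefaults) // $0.60
	fmt.Printf("scenario rates: $%.2f after two requests (exhausts the $5 hourly budget)\n", withScenario)   // $6.00
}
```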

Comment on lines +1006 to +1107
Scenario: Template change triggers cache invalidation
Given I authenticate using basic auth as "admin"

# Create initial template
When I create this LLM provider template:
"""
apiVersion: gateway.api-platform.wso2.com/v1alpha1
kind: LlmProviderTemplate
metadata:
name: change-test-template
spec:
displayName: Change Test Template
totalTokens:
location: payload
identifier: $.json.usage.total_tokens
requestModel:
location: payload
identifier: $.json.model
responseModel:
location: payload
identifier: $.json.model
"""
Then the response status code should be 201

# Create provider with limit of 5
Given I authenticate using basic auth as "admin"
When I create this LLM provider:
"""
apiVersion: gateway.api-platform.wso2.com/v1alpha1
kind: LlmProvider
metadata:
name: change-test-provider
spec:
displayName: Change Test Provider
version: v1.0
context: /change-test
template: change-test-template
upstream:
url: http://echo-backend-multi-arch:8080/anything
auth:
type: api-key
header: Authorization
value: test-api-key
accessControl:
mode: deny_all
exceptions:
- path: /chat/completions
methods: [POST, GET]
policies:
- name: token-based-ratelimit
version: v0.1.0
paths:
- path: /chat/completions
methods: [POST]
params:
totalTokenLimits:
- count: 5
duration: "1h"
algorithm: fixed-window
backend: memory
"""
Then the response status code should be 201
And I wait for the endpoint "http://localhost:8080/change-test/chat/completions" to be ready

Given I set header "Content-Type" to "application/json"

# Use up the quota (5/5)
When I send a POST request to "http://localhost:8080/change-test/chat/completions" with body:
"""
{"usage": {"total_tokens": 5}}
"""
Then the response status code should be 200
And the response header "X-Ratelimit-Remaining" should be "0"

# Next request should be blocked
When I send a POST request to "http://localhost:8080/change-test/chat/completions" with body:
"""
{"usage": {"total_tokens": 1}}
"""
Then the response status code should be 429

# Delete and recreate provider with higher limit
Given I authenticate using basic auth as "admin"
When I delete the LLM provider "change-test-provider"
Then the response status code should be 200

# Small delay to allow cleanup
When I wait for "2" seconds

# Recreate provider with limit of 10 (should get fresh quota)
When I create this LLM provider:
"""
apiVersion: gateway.api-platform.wso2.com/v1alpha1
kind: LlmProvider
metadata:
name: change-test-provider
spec:
displayName: Change Test Provider
version: v1.0
context: /change-test
template: change-test-template
upstream:

⚠️ Potential issue | 🟡 Minor

Scenario title doesn’t match the steps.

“Template change triggers cache invalidation” (Line 1006) doesn’t actually change the template; it deletes and recreates the provider with a new limit. Either rename the scenario or add a real template update to match the intent.

✏️ Minimal fix: rename scenario
-  Scenario: Template change triggers cache invalidation
+  Scenario: Provider recreation resets quota cache

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 18

🤖 Fix all issues with AI agents
In `@gateway/examples/test-provider.yaml`:
- Around line 22-23: The example's tokenBasedRateLimit policy version is
incorrect; update the version field for the policy named "tokenBasedRateLimit"
in gateway/examples/test-provider.yaml to match the policy-definition.yaml
version (change version: v1.0.0 to version: v0.1.0) so the example and the
actual policy definition
(gateway/policies/token-based-ratelimit/policy-definition.yaml) are consistent.

In `@gateway/it/features/token-based-ratelimit.feature`:
- Around line 97-98: Remove the systemParameters "algorithm" and "backend" from
the API params block (they are systemParameters defined in
policy-definition.yaml) and rely on the gateway config values in
gateway/it/test-config.yaml (policy_configurations.ratelimit_v010) referenced
via wso2/defaultValue; specifically, delete the "algorithm" and "backend"
entries from the token-based-ratelimit feature's params so the implementation
uses the centralized systemParameter values instead of duplicating them in the
API definition.

In `@gateway/it/test_output.txt`:
- Around line 1-92: The committed raw test run output in test_output.txt
contains local paths and environment details and must be removed from version
control; delete test_output.txt from the repo, add its filename/pattern to
.gitignore (or adjust existing ignore rules) so similar runtime logs are not
committed, and update CI to capture and upload this output as a job artifact
instead (or point contributors to how to collect local logs without committing).
Also ensure any cleanup or README notes mention that test artifacts belong to CI
and not source control.

In `@gateway/policies/advanced-ratelimit/algorithms/fixedwindow/factory.go`:
- Around line 51-58: The factory currently adds per-policy indices to Redis keys
which duplicates indexing also done in MultiLimiter.AllowN; to fix, stop adding
the policy index in the factory: in the loop that constructs limiters use the
common KeyPrefix (e.g., policyPrefix := config.KeyPrefix) when calling
NewRedisLimiter(policyPrefix) so each Redis limiter gets the uniform prefix and
MultiLimiter.AllowN remains responsible for appending ":p{i}" to keys; update
the loop that builds limiters (the call to NewRedisLimiter and the variable
policyPrefix) and leave NewMultiLimiter and MultiLimiter.AllowN unchanged.

In `@gateway/policies/advanced-ratelimit/algorithms/fixedwindow/redis.go`:
- Around line 99-120: The current sequence in the fixed-window rate limiter
risks leaving redisKey without a TTL when r.client.IncrBy succeeds but the
follow-up r.client.Expire (called only when newCount == n) fails, causing stale
counters; replace the two-step INCRBY then EXPIRE with an atomic server-side
operation (use a Redis Lua script that does INCRBY and sets EXPIRE only when
count == n, or use EXPIREAT with a fixed expiration timestamp so retries are
idempotent) and update the logic in the function handling IncrBy/Expire
(references: r.client.IncrBy, r.client.Expire, redisKey, newCount, windowEnd) so
the increment+ttl is atomic, or provide a recovery path that detects keys
without TTL and sets the expiry. (A sketch of the Lua approach appears after
this list.)

In `@gateway/policies/advanced-ratelimit/algorithms/gcra/gcra.lua`:
- Around line 10-59: The script does not guard against negative count values
which can reduce new_tat and inflate capacity; clamp count to a non-negative
integer immediately after it's parsed (the local count = tonumber(ARGV[6]) or 1
line) by replacing negative counts with 0 (e.g., count = math.max(0, count)) so
all subsequent logic (the allowance check if now >= allow_at and count <=
remaining, the new_tat = tat + (emission_interval * count) calculation, and the
peek/SET behavior that checks count > 0) operate on a non-negative count and
cannot be abused to decrease TAT.

In `@gateway/policies/advanced-ratelimit/algorithms/gcra/memory.go`:
- Around line 338-361: cleanupLoop currently reads from a real time ticker
(m.cleanup.C) while removeExpired uses m.clock.Now(), causing tests with a
mocked clock (WithClock) to be flaky; change MemoryLimiter to accept an
injectable ticker (e.g., add a cleanupTicker field or accept it in the
constructor/WithClock) and update cleanupLoop to read from that ticker (instead
of m.cleanup), ensure the constructor(s) that create MemoryLimiter (and
WithClock) set the ticker from the provided clock or allow tests to pass a
fake/test ticker, and/or add a test-only method (e.g., RunCleanupOnce or
triggerCleanup) that calls removeExpired directly so tests can deterministically
trigger cleanup; update tests to use the injectable ticker or the test trigger.

In `@gateway/policies/advanced-ratelimit/algorithms/gcra/redis.go`:
- Around line 162-171: The current RedisLimiter.ConsumeN simply calls AllowN
which only consumes tokens on success, so it doesn't implement "force-consume"
semantics needed for post-response cost accounting; implement a dedicated Lua
script and a new method (e.g., forceConsumeLua or forceConsumeN) that always
decrements the token bucket regardless of allowance and call that from
RedisLimiter.ConsumeN (or replace the call) so ConsumeN will run the new Lua
script atomically for overage handling; update RedisLimiter to load/execute the
new script, handle returned state (remaining tokens/timestamps), and keep the
same *limiter.Result/error contract as AllowN.
- Around line 133-141: The current unchecked type assertions on result -> values
and each element (values[0]..values[4]) can panic; update the parsing in the
function that handles the Lua reply to defensively validate types: first assert
result is []interface{} using the comma-ok pattern, check len(values) >= 5, then
for each element use safe type assertions (e.g., check each values[i] is int64
or convert from int/float64) and return a clear error if any check fails instead
of allowing a panic; ensure the variables allowed, remaining, resetNanos,
retryAfterNanos, and fullQuotaAtNanos are only assigned after validation so
callers can handle the error.

In `@gateway/policies/advanced-ratelimit/limiter/result.go`:
- Around line 83-95: The header timestamps are computed with time.Until in
SetIETFHeaders which can drift if headers are emitted later; modify the API so
the reference time is stable: either add precomputed fields on Result (e.g.,
ResetSeconds and FullQuotaSeconds) populated when the Result is created from the
rate limiter, and have SetIETFHeaders use those fields instead of recomputing
from Reset/FullQuotaAt, or change SetIETFHeaders to accept a now time.Time
parameter and compute seconds using that same now passed in from the rate check.
Update places that construct Result (e.g., New/constructor or the rate-check
function) to populate the new fields or pass the same now value to
SetIETFHeaders so header values remain consistent.

In `@gateway/policies/advanced-ratelimit/policy-definition.yaml`:
- Around line 320-451: Update all "wso2/defaultValue" systemParameter paths to
use this policy's ID advanced_ratelimit_v0_1_3 instead of ratelimit_v010: locate
every "wso2/defaultValue" entry in this file (e.g., the one for algorithm,
backend, redis.host, redis.port, redis.password, redis.username, redis.db,
redis.keyPrefix, redis.failureMode, redis.connectionTimeout, redis.readTimeout,
redis.writeTimeout, memory.maxEntries, memory.cleanupInterval,
headers.includeXRateLimit, headers.includeIETF, headers.includeRetryAfter) and
replace the prefix ${config.policy_configurations.ratelimit_v010...} with
${config.policy_configurations.advanced_ratelimit_v0_1_3...} so all keys follow
the <policyname>_v<version> underscore format required by the system.

In `@gateway/policies/advanced-ratelimit/ratelimit_integration_test.go`:
- Around line 404-417: The getSharedQuotaRefCount function currently ignores its
parameters (apiName, quotaName, limit, duration) and sums all refCount entries
from globalLimiterCache.byQuotaKey; either implement filtering or remove the
unused params—preferably remove them to match current behavior: change
getSharedQuotaRefCount signature to take no arguments, update any callers to
call getSharedQuotaRefCount() accordingly, and keep locking via
globalLimiterCache.mu while summing entry.refCount for entries in
globalLimiterCache.byQuotaKey; ensure references to byQuotaKey and refCount
remain correct.

In `@gateway/policies/advanced-ratelimit/ratelimit.go`:
- Around line 130-134: The apiId, apiName and apiVersion variables are left
empty before cache-key generation which collapses API-scoped cache keys and
removes API context from logs; update the initialization to populate apiId,
apiName and apiVersion from the existing metadata object (e.g. metadata.ID,
metadata.Name, metadata.Version or the appropriate fields on metadata) before
any cache key creation and logging (referencing the apiId/apiName/apiVersion
variables and the metadata variable in ratelimit.go) so cache keys and log
entries include the correct API context.

In `@gateway/policies/llm-cost-based-ratelimit/go.mod`:
- Line 1: The go.mod module path is pointing to a personal GitHub account string
"github.com/Tharsanan1/gateway-controllers/policies/llm-cost-based-ratelimit";
change it to the organization path
"github.com/wso2/gateway-controllers/policies/llm-cost-based-ratelimit" and then
update any imports in this package that reference the old module path (search
for
"github.com/Tharsanan1/gateway-controllers/policies/llm-cost-based-ratelimit")
so all module and import references (module declaration in go.mod and any source
files) consistently use the org path.
- Around line 5-10: The replace directive path doesn't match the module path in
the require block so Go will ignore it; update the replace to match the required
module path or change the require to the intended module. Specifically, make the
replace target use
github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit =>
../advanced-ratelimit (or alternatively change the require to
github.com/wso2/gateway-controllers/policies/advanced-ratelimit if that was
intended) so the replace takes effect for the module referenced in go.mod.

In `@gateway/policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit.go`:
- Around line 134-189: OnResponse currently uses DefaultCostScaleFactor when
MetadataKeyCostScaleFactor is missing (e.g., if OnRequest was skipped); update
LLMCostRateLimitPolicy.OnResponse to call extractCostScaleFactor(params) as a
fallback before defaulting so custom scales in params are respected.
Specifically, inside OnResponse where you set costScaleFactor (currently
checking ctx.SharedContext.Metadata[MetadataKeyCostScaleFactor]), if that lookup
fails call extractCostScaleFactor(params) and use its result when >0, otherwise
fall back to DefaultCostScaleFactor; keep using addDollarHeaders(delegateAction,
costScaleFactor) unchanged. Ensure you reference MetadataKeyCostScaleFactor,
DefaultCostScaleFactor, extractCostScaleFactor, OnResponse, and addDollarHeaders
in the change.
- Around line 31-33: Change the import for the advanced-ratelimit package from
the personal fork to the canonical module path (use
github.com/wso2/gateway-controllers/policies/advanced-ratelimit in the import
that assigns the ratelimit alias) and update go.mod so the module
replace/require entries reference the same canonical path (ensure the replace
directive points github.com/wso2/gateway-controllers/policies/advanced-ratelimit
=> ../advanced-ratelimit or remove the forked path); adjust any other imports
referencing github.com/Tharsanan1/... to the github.com/wso2/... path so builds
use the canonical gateway-controllers module.

In `@gateway/policies/token-based-ratelimit/token_based_ratelimit.go`:
- Around line 403-512: The cost extraction mapping in
transformToRatelimitParams/addQuota is incorrect: change the source type mapping
so that template locations used in the response phase are mapped to
response-phase source types (e.g., map "header" -> "response_header" and
"metadata" -> "response_metadata" instead of "request_header" and "metadata"),
set sourceConfig["type"] accordingly and still set sourceConfig["key"] to the
identifier path; ensure "payload" and default continue to use "response_body"
with jsonPath; update the costExtraction block creation in addQuota to use these
corrected source types so header/metadata extraction reads from the response
phase.
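
For the fixedwindow/redis.go item above (the INCRBY/EXPIRE atomicity issue), here is a minimal sketch of the Lua-script approach using github.com/redis/go-redis/v9. The key name, limit handling, and return shape are illustrative and do not claim to match the PR's RedisLimiter.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// incrWithTTL increments the window counter and attaches the TTL inside one Lua
// script, so a failed EXPIRE can never leave a counter behind without an expiry.
var incrWithTTL = redis.NewScript(`
local count = redis.call("INCRBY", KEYS[1], ARGV[1])
if count == tonumber(ARGV[1]) then
  -- first increment in this window: set the window TTL
  redis.call("PEXPIRE", KEYS[1], ARGV[2])
end
return count
`)

// allowN reports whether the counter for this window is still within the limit.
func allowN(ctx context.Context, rdb *redis.Client, key string, n, limit int64, window time.Duration) (bool, error) {
	count, err := incrWithTTL.Run(ctx, rdb, []string{key}, n, window.Milliseconds()).Int64()
	if err != nil {
		return false, err
	}
	return count <= limit, nil
}

func main() {
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	ok, err := allowN(context.Background(), rdb, "rl:fixedwindow:demo", 1, 100, time.Minute)
	fmt.Println(ok, err)
}
```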
🧹 Nitpick comments (17)
gateway/policies/advanced-ratelimit/limiter/interface.go (1)

25-54: Limiter interface looks good; please rebuild gateway images before IT runs.

Please confirm you rebuilt gateway images with cd gateway && make build-local since gateway component code changed.
As per coding guidelines: "When modifying code in gateway components (gateway-controller, policy-engine, router, etc., excluding gateway/it/), rebuild Docker images using cd gateway && make build-local".

gateway/it/go.mod (1)

187-198: Please confirm these dependency bumps are intentional and the module graph is clean.

Kindly verify go mod tidy results are clean and go test ./... passes for gateway/it after these updates.

gateway/policies/advanced-ratelimit/algorithms/fixedwindow/memory_test.go (2)

161-173: Calling t.Errorf from a goroutine can cause issues.

While t.Errorf is technically safe to call from goroutines in recent Go versions, calling t.Fatal* or t.Skip* is not. For consistency and to avoid potential issues if someone refactors this to use t.Fatal, consider collecting errors via a channel or using t.Error only after wg.Wait().

♻️ Optional: Safer error collection pattern
 func TestMemoryLimiter_Concurrent(t *testing.T) {
 	// 100 req/sec
 	policy := NewPolicy(100, time.Second)
 	rl := NewMemoryLimiter(policy, 0)
 	defer rl.Close()

 	ctx := context.Background()
 	var allowed, denied atomic.Int64
 	var wg sync.WaitGroup
+	errCh := make(chan error, 200)

 	// 200 concurrent requests (100 should succeed, 100 should fail)
 	for i := 0; i < 200; i++ {
 		wg.Add(1)
 		go func() {
 			defer wg.Done()
 			result, err := rl.Allow(ctx, "concurrent-test")
 			if err != nil {
-				t.Errorf("unexpected error: %v", err)
+				errCh <- err
 				return
 			}
 			if result.Allowed {
 				allowed.Add(1)
 			} else {
 				denied.Add(1)
 			}
 		}()
 	}

 	wg.Wait()
+	close(errCh)
+	for err := range errCh {
+		t.Errorf("unexpected error: %v", err)
+	}

186-210: Test relies on real time.Sleep which may cause flakiness.

The cleanup test uses time.Sleep(150 * time.Millisecond) to wait for the cleanup goroutine. This can be flaky in CI environments under load. Consider either:

  1. Increasing the sleep margin significantly (e.g., 500ms)
  2. Exposing a method to trigger cleanup synchronously for testing
  3. Using a polling loop with timeout instead of fixed sleep
gateway/policies/advanced-ratelimit/algorithms/fixedwindow/memory.go (1)

122-129: Consider making the expiration buffer configurable.

The 1-minute buffer after window end (windowEnd.Add(time.Minute)) is hardcoded. For very short windows (sub-second) or very long windows (hours/days), this fixed buffer may not be optimal. Consider making it a fraction of the window duration or a configurable parameter.

gateway/policies/advanced-ratelimit/limiter/factory.go (2)

41-47: Consider adding duplicate registration check.

RegisterAlgorithm silently overwrites any existing algorithm with the same name. This could mask configuration errors or unintended overwrites.

🛡️ Optional: Add duplicate detection
 // RegisterAlgorithm registers a new rate limiting algorithm
 func RegisterAlgorithm(name string, factory AlgorithmFactory) {
+	if _, exists := algorithms[name]; exists {
+		panic(fmt.Sprintf("algorithm %q already registered", name))
+	}
 	algorithms[name] = factory
 }

58-65: GetSupportedAlgorithms returns non-deterministic order.

Map iteration order in Go is intentionally randomized. If consistent ordering matters for logging, documentation, or testing, consider sorting the result.

♻️ Optional: Return sorted algorithm names
+import "sort"
+
 // GetSupportedAlgorithms returns a list of registered algorithms
 func GetSupportedAlgorithms() []string {
 	algos := make([]string, 0, len(algorithms))
 	for name := range algorithms {
 		algos = append(algos, name)
 	}
+	sort.Strings(algos)
 	return algos
 }
gateway/policies/advanced-ratelimit/algorithms/gcra/memory_test.go (3)

138-150: Using t.Errorf inside goroutine may not behave as expected.

Calling t.Errorf from a goroutine (line 142) is allowed but the test will continue executing even if an error is logged. The test correctly uses atomic counters for the actual pass/fail determination, but consider using t.Error with channel-based error collection for clearer failure reporting.


163-188: Test relies on time.Sleep for cleanup verification.

This test uses time.Sleep(150 * time.Millisecond) to wait for cleanup cycles, which can be flaky in slow CI environments. The test acknowledges this limitation in the comment at line 182-183.

Consider adding a test helper or exposing a TriggerCleanup() method for deterministic testing, as suggested in the memory.go review.


30-211: Good test coverage overall, but missing tests for ConsumeN and GetAvailable.

The tests cover Allow, AllowN, burst refill, concurrency, cleanup, and multi-key scenarios. Consider adding tests for:

  • ConsumeN: Force consumption beyond limits and verify the result reflects overage
  • GetAvailable: Check available tokens without consuming

Would you like me to generate test cases for ConsumeN and GetAvailable?

gateway/policies/advanced-ratelimit/algorithms/fixedwindow/multi.go (1)

27-82: Extract MultiLimiter to shared package to eliminate duplication.

The MultiLimiter implementation is nearly identical between gateway/policies/advanced-ratelimit/algorithms/fixedwindow/multi.go and gateway/policies/advanced-ratelimit/algorithms/gcra/multi.go. All methods (Allow, AllowN, ConsumeN, GetAvailable, Close) contain identical logic; the only differences are the package declarations and algorithm-specific comments ("window tracking" vs "TAT tracking").

Since MultiLimiter is algorithm-agnostic and operates solely with the limiter.Limiter interface, move it to gateway/policies/advanced-ratelimit/limiter/multi.go to serve both algorithms. This reduces maintenance burden and eliminates the risk of divergence between implementations.
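
A rough outline of what such a shared MultiLimiter could look like, written against a minimal Limiter interface. The interface and Result shapes here are inferred from the method names mentioned in this review (Allow/AllowN/ConsumeN/Close) and are assumptions, not the actual limiter package; a real implementation would also need to decide how to handle quotas already consumed when a later one denies.

```go
package limiter

import (
	"context"
	"fmt"
)

// Result mirrors the outcome of a single check; fields here are illustrative.
type Result struct {
	Allowed   bool
	Remaining int64
}

// Limiter is the per-quota contract both algorithms already satisfy.
type Limiter interface {
	AllowN(ctx context.Context, key string, n int64) (*Result, error)
	ConsumeN(ctx context.Context, key string, n int64) (*Result, error)
	Close() error
}

// MultiLimiter checks every child quota and reports the most restrictive result,
// so it never needs to know whether the children are fixed-window or GCRA.
type MultiLimiter struct {
	limiters []Limiter
}

func NewMultiLimiter(limiters ...Limiter) *MultiLimiter {
	return &MultiLimiter{limiters: limiters}
}

func (m *MultiLimiter) AllowN(ctx context.Context, key string, n int64) (*Result, error) {
	var worst *Result
	for i, l := range m.limiters {
		// The ":p{i}" suffix keeps each quota's counters separate in the backend.
		res, err := l.AllowN(ctx, fmt.Sprintf("%s:p%d", key, i), n)
		if err != nil {
			return nil, err
		}
		if worst == nil ||
			(worst.Allowed && !res.Allowed) ||
			(worst.Allowed == res.Allowed && res.Remaining < worst.Remaining) {
			worst = res
		}
	}
	return worst, nil
}

// ConsumeN would follow the same pattern; Close fans out to the children.
func (m *MultiLimiter) Close() error {
	var firstErr error
	for _, l := range m.limiters {
		if err := l.Close(); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}
```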

gateway/it/steps/http_steps.go (1)

392-413: Consider extracting duplicate logging logic.

The header and body logging code is duplicated between sendRequestWithTempHeader (lines 326-347) and sendRequest (lines 392-413). Consider extracting this into a helper method to reduce duplication.

♻️ Optional refactor to reduce duplication
+// logResponseDetails logs response headers and body for debugging
+func (h *HTTPSteps) logResponseDetails(resp *http.Response, body []byte) {
+	for name, values := range resp.Header {
+		for _, value := range values {
+			log.Printf("DEBUG: Response header %s: %s", name, value)
+		}
+	}
+	bodyStr := string(body)
+	if len(bodyStr) > 500 {
+		bodyStr = bodyStr[:500] + "..."
+	}
+	log.Printf("DEBUG: Response body: %s", bodyStr)
+}

Then call h.logResponseDetails(resp, h.lastBody) after reading the body in both methods.

gateway/policies/llm-cost-based-ratelimit/policy-definition.yaml (1)

56-56: Inconsistent wso2/defaultValue path format for LLM cost parameters.

The paths for LLM cost-specific parameters use llm_cost_ratelimit without the version suffix, while the coding guidelines specify the format ${config.policy_configurations.<policyname>_v<version>.<param_name>}.

For policy llm-cost-based-ratelimit version v0.1.0, the expected format would be llm_cost_based_ratelimit_v010.

♻️ Suggested path format correction
-      "wso2/defaultValue": "${config.policy_configurations.llm_cost_ratelimit.prompt_token_cost}"
+      "wso2/defaultValue": "${config.policy_configurations.llm_cost_based_ratelimit_v010.prompt_token_cost}"

Apply similar changes to:

  • Line 67: completion_token_cost
  • Line 78: total_token_cost
  • Line 89: cost_per_n_tokens
  • Line 110: cost_scale_factor

As per coding guidelines: "Use the policy-definition.yaml wso2/defaultValue path format ${config.policy_configurations.<policy_id>.<param_name>} where policy_id follows the pattern <policyname>_v<version> with underscores".

gateway/policies/token-based-ratelimit/token_based_ratelimit_test.go (2)

42-50: Potential false positives in GetResourcesByType prefix matching.

The current implementation matches any key that starts with resourceType, which could incorrectly match keys like LlmProviderTemplate:foo when searching for LlmProvider (since LlmProvider is a prefix of LlmProviderTemplate).

♻️ Safer prefix matching
 func (m *mockResourceStore) GetResourcesByType(resourceType string) ([]policy.LazyResource, error) {
 	var result []policy.LazyResource
+	prefix := resourceType + ":"
 	for key, res := range m.resources {
-		if len(key) > len(resourceType) && key[:len(resourceType)] == resourceType {
+		if len(key) >= len(prefix) && key[:len(prefix)] == prefix {
 			result = append(result, res)
 		}
 	}
 	return result, nil
 }

758-767: Non-deterministic assertion may miss actual bugs.

The test intentionally doesn't fail when the third request doesn't return ImmediateResponse, citing timing variability. While this prevents flaky tests, it also means the rate limiting behavior isn't being strictly verified.

Consider either:

  1. Adding a small sleep or retry loop to ensure quota is consumed before the third request
  2. Using a synchronization mechanism to ensure deterministic order

This is acceptable for now as the comment documents the intentional leniency.
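
If stricter verification is wanted later, the retry-loop option could be expressed with testify's require.Eventually; the checkQuotaConsumed probe below is hypothetical and stands in for whatever state the real test can observe.

```go
package ratelimit_test

import (
	"testing"
	"time"

	"github.com/stretchr/testify/require"
)

// checkQuotaConsumed is a hypothetical probe; the real test would inspect the
// limiter state or a cheap request's X-Ratelimit-Remaining header instead.
func checkQuotaConsumed() bool { return true }

func TestThirdRequestBlocked(t *testing.T) {
	// Poll until the first two requests' consumption is observable rather than
	// relying on timing, then assert strictly on the third request.
	require.Eventually(t, checkQuotaConsumed, 2*time.Second, 50*time.Millisecond,
		"quota was never consumed by the first two requests")

	// ... send the third request and require the ImmediateResponse / 429 here ...
}
```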

gateway/policies/advanced-ratelimit/cel_evaluator.go (1)

320-336: Consider adding int32, uint, and uint32 to toFloat64.

The type switch handles the common numeric types but omits int32, uint, and uint32. While these are less likely to come out of CEL evaluation, adding them would make the function more robust.

♻️ Extended type coverage
 func toFloat64(val interface{}) (float64, error) {
 	switch v := val.(type) {
 	case int64:
 		return float64(v), nil
 	case int:
 		return float64(v), nil
+	case int32:
+		return float64(v), nil
 	case float64:
 		return v, nil
 	case float32:
 		return float64(v), nil
 	case uint64:
 		return float64(v), nil
+	case uint:
+		return float64(v), nil
+	case uint32:
+		return float64(v), nil
 	default:
 		return 0, fmt.Errorf("CEL expression must return numeric value, got %T", val)
 	}
 }
gateway/policies/advanced-ratelimit/ratelimit.go (1)

114-123: Remember to rebuild gateway images after these gateway changes.

Run cd gateway && make build-local to ensure local images are updated.

As per coding guidelines: gateway/**/*.{go,yaml,yml,Dockerfile}: When modifying code in gateway components (gateway-controller, policy-engine, router, etc., excluding gateway/it/), rebuild Docker images using cd gateway && make build-local.

Comment on lines +22 to +23
    - name: tokenBasedRateLimit
      version: v1.0.0

⚠️ Potential issue | 🟡 Minor

Policy version mismatch with the actual policy definition.

The example specifies version: v1.0.0 for tokenBasedRateLimit, but the actual policy definition at gateway/policies/token-based-ratelimit/policy-definition.yaml defines version v0.1.0.

🔧 Suggested fix
     - name: tokenBasedRateLimit
-      version: v1.0.0
+      version: v0.1.0
🤖 Prompt for AI Agents
In `@gateway/examples/test-provider.yaml` around lines 22 - 23, The example's
tokenBasedRateLimit policy version is incorrect; update the version field for
the policy named "tokenBasedRateLimit" in gateway/examples/test-provider.yaml to
match the policy-definition.yaml version (change version: v1.0.0 to version:
v0.1.0) so the example and the actual policy definition
(gateway/policies/token-based-ratelimit/policy-definition.yaml) are consistent.

Comment on lines +97 to +98
algorithm: fixed-window
backend: memory

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find the policy-definition.yaml for token-based-ratelimit
fd -t f "policy-definition.yaml" gateway/policies | grep -i "token.*rate" | head -5

Repository: wso2/api-platform

Length of output: 122


🏁 Script executed:

# Search for token-based-ratelimit policy definition
find gateway/policies -type d -name "*token*rate*" | head -5

Repository: wso2/api-platform

Length of output: 99


🏁 Script executed:

# Look for the policy file more broadly
fd -t f "policy-definition.yaml" gateway/policies

Repository: wso2/api-platform

Length of output: 246


🏁 Script executed:

# Check the structure of gateway/policies
ls -la gateway/policies/ 2>/dev/null | head -20

Repository: wso2/api-platform

Length of output: 384


🏁 Script executed:

# Get full context around lines 97-98 in the feature file
cat -n gateway/it/features/token-based-ratelimit.feature | sed -n '85,110p'

Repository: wso2/api-platform

Length of output: 1381


🏁 Script executed:

# Search for algorithm and backend in policy definitions
rg -n "algorithm:|backend:" gateway/policies --type yaml | head -20

Repository: wso2/api-platform

Length of output: 538


🏁 Script executed:

# Check the policy-definition.yaml to see if algorithm/backend are systemParameters
cat -n gateway/policies/token-based-ratelimit/policy-definition.yaml | sed -n '50,85p'

Repository: wso2/api-platform

Length of output: 1491


🏁 Script executed:

# Get a broader view of the policy-definition.yaml structure
cat -n gateway/policies/token-based-ratelimit/policy-definition.yaml

Repository: wso2/api-platform

Length of output: 6008


🏁 Script executed:

# Check test-config.yaml to see if these values are configured there
cat -n gateway/it/test-config.yaml

Repository: wso2/api-platform

Length of output: 20250


🏁 Script executed:

# Check if other feature files use algorithm/backend in params
rg -B5 "algorithm:|backend:" gateway/it/features/*.feature | head -80

Repository: wso2/api-platform

Length of output: 5783


Remove systemParameters from API definition params.

The algorithm and backend fields are systemParameters (defined in policy-definition.yaml lines 62 and 72), not regular parameters. Per coding guidelines, systemParameters must be "configured exclusively in the gateway config file and retrieved via wso2/defaultValue references." Remove these fields from the params block (lines 97-98); they are already configured in gateway/it/test-config.yaml under policy_configurations.ratelimit_v010 with values fixed-window and memory.

🤖 Prompt for AI Agents
In `@gateway/it/features/token-based-ratelimit.feature` around lines 97 - 98,
Remove the systemParameters "algorithm" and "backend" from the API params block
(they are systemParameters defined in policy-definition.yaml) and rely on the
gateway config values in gateway/it/test-config.yaml
(policy_configurations.ratelimit_v010) referenced via wso2/defaultValue;
specifically, delete the "algorithm" and "backend" entries from the
token-based-ratelimit feature's params so the implementation uses the
centralized systemParameter values instead of duplicating them in the API
definition.

Comment on lines +1 to +92
go test -v -timeout 30m ./...
=== RUN TestFeatures
2026/02/03 11:02:24 Setting up test reporter...
2026/02/03 11:02:24 Reports directory ready at reports
2026/02/03 11:02:24 === Integration Test Suite Starting ===
2026/02/03 11:02:24 Setting up coverage collection directories...
2026/02/03 11:02:24 Cleaned existing coverage data
2026/02/03 11:02:24 Coverage directories created at coverage
2026/02/03 11:02:24 Colima detected, setting up environment variables...
2026/02/03 11:02:24 Set DOCKER_HOST=unix:///Users/tharsanan/.colima/default/docker.sock
2026/02/03 11:02:24 Set TESTCONTAINERS_RYUK_DISABLED=true
2026/02/03 11:02:25 Using compose file: /Users/tharsanan/Documents/github/forked/platform/api-platform/gateway/it/docker-compose.test.yaml
2026/02/03 11:02:25 github.com/testcontainers/testcontainers-go - Connected to docker:
Server Version: 28.4.0
API Version: 1.51
Operating System: Ubuntu 24.04.2 LTS
Total Memory: 1959 MB
Testcontainers for Go Version: v0.40.0
Resolved Docker Host: unix:///Users/tharsanan/.colima/default/docker.sock
Resolved Docker Socket Path: /Users/tharsanan/.colima/default/docker.sock
Test SessionID: 80aa5854b08ae3b54ccc1130a92b745dc62a28d8c3663d5b066c59b53d80514b
Test ProcessID: 4649d0b2-4bf1-4544-aaea-0bf09a14670a
2026/02/03 11:02:25 Starting Docker Compose services...
Container it-mock-jwks Stopping
Container mcp-server-backend Stopping
Container it-mock-analytics-collector Stopping
Container it-echo-backend-multi-arch Stopping
Container it-router Stopping
Container it-redis Stopping
Container it-mock-embedding-provider Stopping
Container it-mock-openapi-https Stopping
Container it-mock-aws-bedrock-guardrail Stopping
Container it-policy-engine Stopping
Container it-sample-backend Stopping
Container it-mock-azure-content-safety Stopping
Container it-echo-backend Stopping
Container it-router Stopped
Container it-mock-openapi-https Stopped
Container it-mock-openapi Stopping
Container it-gateway-controller Stopping
Container it-echo-backend-multi-arch Stopped
Container it-mock-jwks Stopped
Container it-sample-backend Stopped
Container it-gateway-controller Stopped
Container it-policy-engine Stopped
Container it-mock-embedding-provider Stopped
Container it-mock-aws-bedrock-guardrail Stopped
Container it-mock-openapi Stopped
Container it-redis Stopped
Container it-mock-azure-content-safety Stopped
Container it-mock-analytics-collector Stopped
Container it-echo-backend Stopped
Container mcp-server-backend Stopped
Network gateway-it_it-gateway-network Removed
Network gateway-it_it-gateway-network Creating
Network gateway-it_it-gateway-network Created
Container it-mock-aws-bedrock-guardrail Starting
Container mcp-server-backend Starting
Container it-sample-backend Starting
Container it-echo-backend-multi-arch Starting
Container it-redis Starting
Container it-mock-analytics-collector Starting
Container it-mock-azure-content-safety Starting
Container it-mock-jwks Starting
Container it-mock-embedding-provider Starting
Container it-gateway-controller Starting
Container it-echo-backend Starting
Container it-mock-openapi Starting
Container it-policy-engine Starting
Container it-redis Started
Container mcp-server-backend Started
Container it-mock-jwks Started
Container it-mock-azure-content-safety Started
Container it-mock-aws-bedrock-guardrail Started
Container it-policy-engine Started
Container it-echo-backend Started
Container it-sample-backend Started
Container it-echo-backend-multi-arch Started
Container it-mock-analytics-collector Started
Container it-mock-embedding-provider Started
Container it-mock-openapi Started
Container it-mock-openapi Waiting
Container it-gateway-controller Started
Container it-mock-openapi Waiting
Container it-gateway-controller Waiting
Container it-gateway-controller Healthy
Container it-mock-openapi Error
Container it-mock-openapi Error
2026/02/03 11:03:28 Failed to start services: failed to start docker compose: compose up: dependency failed to start: container it-mock-openapi is unhealthy
FAIL github.com/wso2/api-platform/gateway/it 64.712s
? github.com/wso2/api-platform/gateway/it/steps [no test files]
FAIL

⚠️ Potential issue | 🟠 Major

Remove local test run logs from the repo.

This is raw execution output containing absolute local paths and environment details (e.g., Line 10 and Line 12). Such output should be captured as a CI artifact rather than versioned, to avoid leaking developer information and cluttering the repo.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~12-~12: The official name of this software platform is spelled with a capital “H”.
Context: ...ompose file: /Users/tharsanan/Documents/github/forked/platform/api-platform/gateway/it...

(GITHUB)

🤖 Prompt for AI Agents
In `@gateway/it/test_output.txt` around lines 1 - 92, The committed raw test run
output in test_output.txt contains local paths and environment details and must
be removed from version control; delete test_output.txt from the repo, add its
filename/pattern to .gitignore (or adjust existing ignore rules) so similar
runtime logs are not committed, and update CI to capture and upload this output
as a job artifact instead (or point contributors to how to collect local logs
without committing). Also ensure any cleanup or README notes mention that test
artifacts belong to CI and not source control.

Comment on lines +51 to +58
	// Multi-limiter for Redis
	limiters := make([]limiter.Limiter, len(policies))
	for i, policy := range policies {
		// Use different key prefix for each policy
		policyPrefix := fmt.Sprintf("%sp%d:", config.KeyPrefix, i)
		limiters[i] = NewRedisLimiter(config.RedisClient, policy, policyPrefix)
	}
	return NewMultiLimiter(limiters...), nil

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read factory.go to see the exact key construction
echo "=== factory.go (lines 45-65) ==="
sed -n '45,65p' gateway/policies/advanced-ratelimit/algorithms/fixedwindow/factory.go

echo ""
echo "=== multi.go around key construction ==="
# First find the file
find gateway -name "multi.go" -path "*/fixedwindow/*"

Repository: wso2/api-platform

Length of output: 826


🏁 Script executed:

#!/bin/bash
# Get multi.go content if it exists
file_path=$(find gateway -name "multi.go" -path "*/fixedwindow/*" 2>/dev/null)
if [ -n "$file_path" ]; then
  echo "=== Found $file_path ==="
  wc -l "$file_path"
  echo ""
  echo "=== Searching for key construction patterns in $file_path ==="
  rg -n "fmt.Sprintf|policyKey|AllowN" "$file_path" -A 2 -B 2
else
  echo "=== multi.go not found, searching entire fixedwindow directory ==="
  fd . gateway/policies/advanced-ratelimit/algorithms/fixedwindow/
fi

Repository: wso2/api-platform

Length of output: 1755


🏁 Script executed:

#!/bin/bash
# Check the Redis limiter implementation
echo "=== Searching for Redis limiter key construction ==="
rg -n "fmt.Sprintf.*p%d|:p[0-9]" gateway/policies/advanced-ratelimit/algorithms/fixedwindow/ -A 2 -B 2

Repository: wso2/api-platform

Length of output: 2366


🏁 Script executed:

#!/bin/bash
# Find RedisLimiter implementation
echo "=== Looking for RedisLimiter struct and AllowN method ==="
rg -n "type RedisLimiter|func.*RedisLimiter.*AllowN" gateway/policies/advanced-ratelimit/algorithms/fixedwindow/ -A 10

Repository: wso2/api-platform

Length of output: 2492


🏁 Script executed:

#!/bin/bash
# Search for NewRedisLimiter to understand initialization
echo "=== NewRedisLimiter function ==="
rg -n "func NewRedisLimiter" gateway/policies/advanced-ratelimit/algorithms/fixedwindow/redis.go -A 15

echo ""
echo "=== How RedisLimiter stores and uses KeyPrefix ==="
rg -n "type RedisLimiter|KeyPrefix" gateway/policies/advanced-ratelimit/algorithms/fixedwindow/redis.go -B 2 -A 5

Repository: wso2/api-platform

Length of output: 838


Remove redundant policy index from factory or MultiLimiter.

The factory adds per-policy prefixes (line 55: fmt.Sprintf("%sp%d:", config.KeyPrefix, i)) while MultiLimiter.AllowN also appends policy indices to keys (line 57: fmt.Sprintf("%s:p%d", key, i)). This results in redundant key construction like "ratelimit:p0:user:p0:" when using Redis with multiple policies.

Either remove the policy index appending from the factory (pass config.KeyPrefix uniformly and rely on MultiLimiter's indexing), or remove it from MultiLimiter (rely on the factory's per-limiter prefixes). The current approach duplicates key isolation logic.
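A sketch of the first option, using the names from the excerpt above (the real factory may differ slightly): the factory hands every limiter the uniform prefix and leaves per-policy isolation to MultiLimiter.AllowN, which already appends ":p{i}".

	// Multi-limiter for Redis: uniform prefix for every limiter;
	// MultiLimiter.AllowN appends the policy index to the key.
	limiters := make([]limiter.Limiter, 0, len(policies))
	for _, policy := range policies {
		limiters = append(limiters, NewRedisLimiter(config.RedisClient, policy, config.KeyPrefix))
	}
	return NewMultiLimiter(limiters...), nil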

🤖 Prompt for AI Agents
In `@gateway/policies/advanced-ratelimit/algorithms/fixedwindow/factory.go` around
lines 51 - 58, The factory currently adds per-policy indices to Redis keys which
duplicates indexing also done in MultiLimiter.AllowN; to fix, stop adding the
policy index in the factory: in the loop that constructs limiters use the common
KeyPrefix (e.g., policyPrefix := config.KeyPrefix) when calling
NewRedisLimiter(policyPrefix) so each Redis limiter gets the uniform prefix and
MultiLimiter.AllowN remains responsible for appending ":p{i}" to keys; update
the loop that builds limiters (the call to NewRedisLimiter and the variable
policyPrefix) and leave NewMultiLimiter and MultiLimiter.AllowN unchanged.

Comment on lines +99 to +120
	} else {
		// Atomic increment - this is the core of fixed window
		newCount, err = r.client.IncrBy(ctx, redisKey, n).Result()
		if err != nil {
			return nil, fmt.Errorf("redis INCRBY failed: %w", err)
		}

		// Set TTL only on first request in window (when newCount == n)
		// This avoids calling EXPIRE on every request
		if newCount == n {
			// Add jitter (0-5s) to spread expiration load across Redis
			// Prevents "thundering herd" of expirations at window boundaries
			jitter := time.Duration(rand.Int63n(int64(5 * time.Second)))
			ttl := time.Until(windowEnd) + jitter

			// Set expiration and handle potential error to avoid keys without TTL
			if err := r.client.Expire(ctx, redisKey, ttl).Err(); err != nil {
				// Log with context; surface error so callers can decide (fail-open/closed)
				slog.Error("redis EXPIRE failed for rate limit key", "redisKey", redisKey, "ttl", ttl, "error", err)
				return nil, fmt.Errorf("redis EXPIRE failed for key %s ttl %s: %w", redisKey, ttl.String(), err)
			}
		}

⚠️ Potential issue | 🟠 Major

Race condition: INCRBY succeeds but EXPIRE fails, leaving key without TTL.

If INCRBY succeeds but EXPIRE fails (network issue, Redis error), the key will persist indefinitely without a TTL. While the error is returned, subsequent requests will see the orphaned count. Consider using a Lua script for atomicity or implementing a recovery mechanism.

💡 Options to address

  1. Wrap INCRBY and EXPIRE in a single Lua script so the increment and TTL are set atomically (a go-redis sketch of this option follows below):

local key = KEYS[1]
local n = tonumber(ARGV[1])
local ttl = tonumber(ARGV[2])
local count = redis.call('INCRBY', key, n)
if count == n then
  redis.call('EXPIRE', key, ttl)
end
return count

  2. Use EXPIREAT with a fixed timestamp to make retries idempotent.

  3. Accept the risk if fail-open is acceptable, but document this behavior clearly.
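A minimal go-redis sketch of option 1, assuming the limiter already holds a *redis.Client from github.com/redis/go-redis/v9; the jitter handling and exact field names from the excerpt above are omitted and would need to be folded in:

package fixedwindow

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// incrWithTTL increments the window counter and sets its TTL in one
// server-side script, so a failed EXPIRE can no longer leave a key without
// an expiry.
var incrWithTTL = redis.NewScript(`
local count = redis.call('INCRBY', KEYS[1], ARGV[1])
if count == tonumber(ARGV[1]) then
  redis.call('EXPIRE', KEYS[1], ARGV[2])
end
return count
`)

// incrementWindow is a stand-in name for the INCRBY/EXPIRE section of AllowN.
func incrementWindow(ctx context.Context, client *redis.Client, redisKey string, n int64, windowEnd time.Time) (int64, error) {
	// Round the remaining window up to whole seconds; jitter could be added here.
	ttl := int64(time.Until(windowEnd).Seconds()) + 1
	return incrWithTTL.Run(ctx, client, []string{redisKey}, n, ttl).Int64()
}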

🤖 Prompt for AI Agents
In `@gateway/policies/advanced-ratelimit/algorithms/fixedwindow/redis.go` around
lines 99 - 120, The current sequence in the fixed-window rate limiter risks
leaving redisKey without a TTL when r.client.IncrBy succeeds but r.client.Expire
fails (newCount == n), causing stale counters; replace the two-step INCRBY then
EXPIRE with an atomic server-side operation (use a Redis Lua script that does
INCRBY and sets EXPIRE only when count == n, or use EXPIREAT with a fixed
expiration timestamp so retries are idempotent) and update the logic in the
function handling IncrBy/Expire (references: r.client.IncrBy, r.client.Expire,
redisKey, newCount, windowEnd) so the increment+ttl is atomic or provide a
recovery path that detects keys without TTL and sets expiry.

@@ -0,0 +1,26 @@
module github.com/Tharsanan1/gateway-controllers/policies/llm-cost-based-ratelimit

⚠️ Potential issue | 🟠 Major

Module path uses personal GitHub account instead of organization.

The module path github.com/Tharsanan1/gateway-controllers/policies/llm-cost-based-ratelimit uses a personal GitHub account instead of the organization path github.com/wso2/gateway-controllers/policies/llm-cost-based-ratelimit. This is inconsistent with other policies in the codebase (e.g., token-based-ratelimit uses github.com/wso2/...).

🔧 Proposed fix
-module github.com/Tharsanan1/gateway-controllers/policies/llm-cost-based-ratelimit
+module github.com/wso2/gateway-controllers/policies/llm-cost-based-ratelimit
🤖 Prompt for AI Agents
In `@gateway/policies/llm-cost-based-ratelimit/go.mod` at line 1, The go.mod
module path is pointing to a personal GitHub account string
"github.com/Tharsanan1/gateway-controllers/policies/llm-cost-based-ratelimit";
change it to the organization path
"github.com/wso2/gateway-controllers/policies/llm-cost-based-ratelimit" and then
update any imports in this package that reference the old module path (search
for
"github.com/Tharsanan1/gateway-controllers/policies/llm-cost-based-ratelimit")
so all module and import references (module declaration in go.mod and any source
files) consistently use the org path.

Comment on lines +5 to +10
require (
	github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit v0.1.23
	github.com/wso2/api-platform/sdk v0.3.8
)

replace github.com/wso2/gateway-controllers/policies/advanced-ratelimit => ../advanced-ratelimit

⚠️ Potential issue | 🟠 Major

Dependency path mismatch with replace directive.

The require statement references github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit but the replace directive maps github.com/wso2/gateway-controllers/policies/advanced-ratelimit. This mismatch means the replace directive won't take effect, and the build will attempt to fetch from the personal account.

🔧 Proposed fix
 require (
-	github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit v0.1.23
+	github.com/wso2/gateway-controllers/policies/advanced-ratelimit v0.1.23
 	github.com/wso2/api-platform/sdk v0.3.8
 )
🤖 Prompt for AI Agents
In `@gateway/policies/llm-cost-based-ratelimit/go.mod` around lines 5 - 10, The
replace directive path doesn't match the module path in the require block so Go
will ignore it; update the replace to match the required module path or change
the require to the intended module. Specifically, make the replace target use
github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit =>
../advanced-ratelimit (or alternatively change the require to
github.com/wso2/gateway-controllers/policies/advanced-ratelimit if that was
intended) so the replace takes effect for the module referenced in go.mod.

Comment on lines +31 to +33
policy "github.com/wso2/api-platform/sdk/gateway/policy/v1alpha"
ratelimit "github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit"
)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd -type f -name "go.mod" gateway/policies/llm-cost-based-ratelimit

Repository: wso2/api-platform

Length of output: 231


🏁 Script executed:

cat -n gateway/policies/llm-cost-based-ratelimit/go.mod

Repository: wso2/api-platform

Length of output: 1314


🏁 Script executed:

rg -n "gateway-controllers" gateway/policies/llm-cost-based-ratelimit/

Repository: wso2/api-platform

Length of output: 990


Fix import to use canonical gateway-controllers module path.

The code imports from a personal fork (github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit), but the go.mod includes a replace directive for the canonical path (github.com/wso2/gateway-controllers/policies/advanced-ratelimit => ../advanced-ratelimit). Update the import to match the replace directive.

🔧 Required fix
-	ratelimit "github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit"
+	ratelimit "github.com/wso2/gateway-controllers/policies/advanced-ratelimit"

Also update go.mod line 6:

-	github.com/Tharsanan1/gateway-controllers/policies/advanced-ratelimit v0.1.23
+	github.com/wso2/gateway-controllers/policies/advanced-ratelimit v0.1.23
🤖 Prompt for AI Agents
In `@gateway/policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit.go` around
lines 31 - 33, Change the import for the advanced-ratelimit package from the
personal fork to the canonical module path (use
github.com/wso2/gateway-controllers/policies/advanced-ratelimit in the import
that assigns the ratelimit alias) and update go.mod so the module
replace/require entries reference the same canonical path (ensure the replace
directive points github.com/wso2/gateway-controllers/policies/advanced-ratelimit
=> ../advanced-ratelimit or remove the forked path); adjust any other imports
referencing github.com/Tharsanan1/... to the github.com/wso2/... path so builds
use the canonical gateway-controllers module.

Comment on lines +134 to +189
// OnResponse processes the response phase by delegating to the same provider-specific instance.
// Uses the delegate pinned during OnRequest to ensure consistency even if template changes.
// After delegation, it adds custom headers that show cost values in dollars.
func (p *LLMCostRateLimitPolicy) OnResponse(
	ctx *policy.ResponseContext,
	params map[string]interface{},
) policy.ResponseAction {
	slog.Debug("OnResponse: processing LLM cost-based rate limit",
		"route", p.metadata.RouteName)

	var delegateAction policy.ResponseAction

	// First, try to use the delegate pinned during OnRequest (ensures consistency)
	if delegate, ok := ctx.SharedContext.Metadata[MetadataKeyDelegate].(policy.Policy); ok {
		slog.Debug("OnResponse: using pinned delegate from request phase",
			"route", p.metadata.RouteName)
		delegateAction = delegate.OnResponse(ctx, params)
	} else {
		// Fallback: look up by provider name (for cases where OnRequest didn't run)
		providerName, ok := ctx.SharedContext.Metadata[MetadataKeyProviderName].(string)
		if !ok || providerName == "" {
			slog.Debug("OnResponse: provider name not found in metadata; skipping",
				"route", p.metadata.RouteName)
			return nil
		}

		slog.Debug("OnResponse: looking up delegate by provider (fallback)",
			"route", p.metadata.RouteName,
			"provider", providerName)

		if entry, ok := p.delegates.Load(providerName); ok {
			if de, ok := entry.(*delegateEntry); ok && de.delegate != nil {
				slog.Debug("OnResponse: delegating to advanced-ratelimit",
					"route", p.metadata.RouteName,
					"provider", providerName)
				delegateAction = de.delegate.OnResponse(ctx, params)
			}
		}

		if delegateAction == nil {
			slog.Debug("OnResponse: no delegate found for provider",
				"route", p.metadata.RouteName,
				"provider", providerName)
			return nil
		}
	}

	// Retrieve the cost scale factor (stored during OnRequest, or use default)
	costScaleFactor := DefaultCostScaleFactor
	if scaleFactor, ok := ctx.SharedContext.Metadata[MetadataKeyCostScaleFactor].(int); ok && scaleFactor > 0 {
		costScaleFactor = scaleFactor
	}

	// Add custom dollar-denominated headers by transforming the delegate's response
	return p.addDollarHeaders(delegateAction, costScaleFactor)
}

⚠️ Potential issue | 🟡 Minor

OnResponse ignores configured costScaleFactor when request phase is skipped.

If OnRequest didn't run, you fall back to the default even when params specify a custom scale. Consider re-reading extractCostScaleFactor(params) when metadata is missing.

🔧 Proposed fix
-	costScaleFactor := DefaultCostScaleFactor
-	if scaleFactor, ok := ctx.SharedContext.Metadata[MetadataKeyCostScaleFactor].(int); ok && scaleFactor > 0 {
-		costScaleFactor = scaleFactor
-	}
+	costScaleFactor := DefaultCostScaleFactor
+	if scaleFactor, ok := ctx.SharedContext.Metadata[MetadataKeyCostScaleFactor].(int); ok && scaleFactor > 0 {
+		costScaleFactor = scaleFactor
+	} else {
+		costScaleFactor = extractCostScaleFactor(params)
+	}
🤖 Prompt for AI Agents
In `@gateway/policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit.go` around
lines 134 - 189, OnResponse currently uses DefaultCostScaleFactor when
MetadataKeyCostScaleFactor is missing (e.g., if OnRequest was skipped); update
LLMCostRateLimitPolicy.OnResponse to call extractCostScaleFactor(params) as a
fallback before defaulting so custom scales in params are respected.
Specifically, inside OnResponse where you set costScaleFactor (currently
checking ctx.SharedContext.Metadata[MetadataKeyCostScaleFactor]), if that lookup
fails call extractCostScaleFactor(params) and use its result when >0, otherwise
fall back to DefaultCostScaleFactor; keep using addDollarHeaders(delegateAction,
costScaleFactor) unchanged. Ensure you reference MetadataKeyCostScaleFactor,
DefaultCostScaleFactor, extractCostScaleFactor, OnResponse, and addDollarHeaders
in the change.

Comment on lines +403 to +512
// transformToRatelimitParams converts LLM-specific parameters to the advanced-ratelimit structure.
func transformToRatelimitParams(params map[string]interface{}, template map[string]interface{}) map[string]interface{} {
	slog.Debug("transformToRatelimitParams: starting parameter transformation",
		"params", params,
		"template", template)

	var quotas []interface{}

	addQuota := func(name string, limitsKey string, templateKey string) {
		limits := params[limitsKey]
		if limits == nil {
			slog.Debug("addQuota: no limits found, skipping",
				"name", name,
				"limitsKey", limitsKey)
			return
		}

		quota := map[string]interface{}{
			"name":   name,
			"limits": convertLimits(limits),
			"keyExtraction": []interface{}{
				map[string]interface{}{"type": "routename"},
			},
		}

		if template != nil {
			// The template structure has spec directly: template["spec"]
			if spec, ok := template["spec"].(map[string]interface{}); ok {
				if usage, ok := spec[templateKey].(map[string]interface{}); ok {
					if path, ok := usage["identifier"].(string); ok && path != "" {
						// Map template location to cost extraction type
						location, _ := usage["location"].(string)
						sourceType := "response_body" // default
						sourceConfig := map[string]interface{}{
							"type": sourceType,
						}

						switch location {
						case "header":
							sourceType = "request_header"
							sourceConfig["type"] = sourceType
							sourceConfig["key"] = path
						case "metadata":
							sourceType = "metadata"
							sourceConfig["type"] = sourceType
							sourceConfig["key"] = path
						case "payload":
							// payload location uses response_body type with jsonPath
							sourceConfig["jsonPath"] = path
						default:
							// For any other location, assume payload/response_body
							sourceConfig["jsonPath"] = path
						}

						slog.Debug("addQuota: configured cost extraction",
							"name", name,
							"location", location,
							"sourceType", sourceType,
							"path", path)

						quota["costExtraction"] = map[string]interface{}{
							"enabled": true,
							"sources": []interface{}{sourceConfig},
						}
					}
				}
			}
		}
		quotas = append(quotas, quota)
	}

	addQuota("prompt_tokens", "promptTokenLimits", "promptTokens")
	addQuota("completion_tokens", "completionTokenLimits", "completionTokens")
	addQuota("total_tokens", "totalTokenLimits", "totalTokens")

	rlParams := map[string]interface{}{
		"quotas": quotas,
	}

	for _, key := range []string{"algorithm", "backend", "redis", "memory"} {
		if val, ok := params[key]; ok {
			rlParams[key] = val
		}
	}

	slog.Debug("transformToRatelimitParams: completed transformation",
		"quotasCount", len(quotas),
		"hasAlgorithm", rlParams["algorithm"] != nil,
		"hasBackend", rlParams["backend"] != nil,
		"quotas", quotas)

	return rlParams
}

func convertLimits(rawLimits interface{}) []interface{} {
	items, ok := rawLimits.([]interface{})
	if !ok {
		return nil
	}
	var converted []interface{}
	for _, item := range items {
		if m, ok := item.(map[string]interface{}); ok {
			converted = append(converted, map[string]interface{}{
				"limit":    m["count"],
				"duration": m["duration"],
			})
		}
	}
	return converted
}

⚠️ Potential issue | 🟠 Major

Cost extraction source types are mis-mapped for header/metadata.

metadata is not a recognized cost source type in the extractor, and header is mapped to request_header even though token usage is derived from provider templates in the response phase. This causes cost extraction to be skipped or read from the wrong phase.

🔧 Proposed fix
-						switch location {
-						case "header":
-							sourceType = "request_header"
+						switch location {
+						case "header":
+							sourceType = "response_header"
 							sourceConfig["type"] = sourceType
 							sourceConfig["key"] = path
 						case "metadata":
-							sourceType = "metadata"
+							sourceType = "response_metadata"
 							sourceConfig["type"] = sourceType
 							sourceConfig["key"] = path
 						case "payload":
🤖 Prompt for AI Agents
In `@gateway/policies/token-based-ratelimit/token_based_ratelimit.go` around lines
403 - 512, The cost extraction mapping in transformToRatelimitParams/addQuota is
incorrect: change the source type mapping so that template locations used in the
response phase are mapped to response-phase source types (e.g., map "header" ->
"response_header" and "metadata" -> "response_metadata" instead of
"request_header" and "metadata"), set sourceConfig["type"] accordingly and still
set sourceConfig["key"] to the identifier path; ensure "payload" and default
continue to use "response_body" with jsonPath; update the costExtraction block
creation in addQuota to use these corrected source types so header/metadata
extraction reads from the response phase.
