Skip to content

Feat: add traction tenant mode for secure webhook registration#926

Merged
esune merged 15 commits intoopenwallet-foundation:mainfrom
MonolithicMonk:feat/traction-tenant-mode
Dec 22, 2025
Merged

Feat: add traction tenant mode for secure webhook registration#926
esune merged 15 commits intoopenwallet-foundation:mainfrom
MonolithicMonk:feat/traction-tenant-mode

Conversation

@MonolithicMonk
Copy link
Contributor

Reference: Closes #923

Summary:
This PR introduces full support for integrating with Traction (or other secured multi-tenant ACA-Py deployments) where the privileged Admin API endpoints are restricted.

  • New Tenancy Mode: Introduces AGENT_TENANT_MODE="traction" (mapped to ACAPY_TENANCY).
  • Authentication: Adds TractionTenantAcapy configuration class to handle authentication via TRACTION_TENANT_ID and TRACTION_TENANT_API_KEY (preferred), with a fallback to Wallet ID/Key authentication.
  • Secure Webhook Registration: Refactors register_tenant_webhook to support a "direct" mode. When in Traction mode, the controller bypasses the restricted Admin API (PUT /multitenancy/wallet/{id}) and uses the authenticated Tenant API (PUT /tenant/wallet) directly.
  • Configuration:
    • Updates docker-compose and manage scripts to pass through TRACTION_ variables.
  • Documentation: Updates ConfigurationGuide.md and MigrationGuide.md with new variables and startup requirements.
  • Testing: Adds comprehensive unit tests for the new configuration logic, client initialization, and webhook registration fallback strategies.

Impact:
Enables VC-AuthN to be deployed in hardened environments where the ACA-Py Admin API is blocked for security, preventing the "webhook registration hanging" issue described in #923.

Signed-off-by: Yuki I <omoge.real@gmail.com>
Signed-off-by: Yuki I <omoge.real@gmail.com>
Signed-off-by: Yuki I <omoge.real@gmail.com>
@coveralls
Copy link

coveralls commented Nov 29, 2025

Pull Request Test Coverage Report for Build 20345995063

Details

  • 127 of 133 (95.49%) changed or added relevant lines in 5 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.4%) to 88.146%

Changes Missing Coverage Covered Lines Changed/Added Lines %
oidc-controller/api/core/webhook_utils.py 59 65 90.77%
Files with Coverage Reduction New Missed Lines %
oidc-controller/api/core/acapy/config.py 1 98.75%
Totals Coverage Status
Change from base Build 19948422664: 0.4%
Covered Lines: 2030
Relevant Lines: 2303

💛 - Coveralls

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces comprehensive support for integrating with Traction and other secured multi-tenant ACA-Py deployments where privileged Admin API endpoints are restricted. The implementation adds a new "traction" tenancy mode alongside the existing "single" and "multi" modes, with unified configuration variables and fallback authentication strategies.

Key changes include:

  • Introduction of AGENT_TENANT_MODE="traction" with dedicated TractionTenantAcapy configuration class
  • Unified ACAPY_TENANT_WALLET_* configuration variables with backward compatibility for legacy MT_ACAPY_WALLET_* variables
  • Refactored webhook registration supporting both Admin API and direct Tenant API modes with automatic fallback

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
oidc-controller/api/tests/test_webhook_registration.py Adds comprehensive test coverage for new webhook registration fallback logic, traction mode, and multi-tenant token injection
oidc-controller/api/main.py Integrates traction mode logic into startup sequence with conditional token fetcher instantiation and webhook registration strategies
oidc-controller/api/core/webhook_utils.py Implements fallback mechanism from Admin API to Tenant API with retry logic and comprehensive error handling
oidc-controller/api/core/config.py Adds unified tenant configuration variables with automatic fallback to legacy variables and validation for multi-tenant/traction modes
oidc-controller/api/core/acapy/tests/test_config.py Expands test suite to verify TractionTenantAcapy functionality, unified variable handling, and error conditions
oidc-controller/api/core/acapy/tests/test_client.py Adds test coverage for traction tenancy mode client initialization
oidc-controller/api/core/acapy/config.py Implements TractionTenantAcapy class with tenant-specific authentication and token fetching logic
oidc-controller/api/core/acapy/client.py Extends AcapyClient initialization to support traction tenancy mode
docs/MigrationGuide.md Documents configuration unification, backward compatibility, and new traction mode requirements
docs/ConfigurationGuide.md Provides comprehensive documentation for traction mode setup, variable mapping, and deployment examples
docker/manage Exports unified tenant configuration variables alongside legacy variables for backward compatibility
docker/docker-compose.yaml Passes unified and legacy tenant configuration variables to controller service
docker/.env.example Provides complete example configuration with traction mode setup and variable documentation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@loneil loneil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great to get going @MonolithicMonk . I tried this PR out locally with my vcauth using the hosted Traction Sandbox and it was working for me. Great to be able to watch connection and verification go through on the Tenant UI as testing.

I just have a concern about token expiry to look at I think.

@@ -15,12 +15,15 @@ def get_headers() -> dict[str, str]: ...


class MultiTenantAcapy:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurs to me that, while not an issue with Traction, could this multi tenant token fetch be a problem on standard ACA-Py if a VCAuth instance is running with multiple processes (VCAuth can start up locally like that to try it out). A reason we use the multitenant provider plugin is that we weren't able to get more than 1 token at once. So if I start up 3 pods in a scalable deployment, would 2 of the pods on a regular ACA-Py possibly have trouble as the issuance of the other tokens invalidate the existing ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACA-Py wallet tokens are typically stateless JWTs, so issuing a new one in one pod shouldn't invalidate valid tokens held by other pods. However, to be safe and efficient, the new caching logic I implemented (1-hour TTL) minimizes the frequency of these fetch calls across all instances and significantly reduces load on the Agent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurs to me that, while not an issue with Traction, could this multi tenant token fetch be a problem on standard ACA-Py if a VCAuth instance is running with multiple processes (VCAuth can start up locally like that to try it out). A reason we use the multitenant provider plugin is that we weren't able to get more than 1 token at once. So if I start up 3 pods in a scalable deployment, would 2 of the pods on a regular ACA-Py possibly have trouble as the issuance of the other tokens invalidate the existing ones?

Explain this more. Very interesting...

(p/s, ignore my prev reply for now)

Copy link
Contributor Author

@MonolithicMonk MonolithicMonk Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you asking if the issuance of a multitenant token invalidates all preceding token issued to the tenant in a non traction multitenant deployment of aca-py?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah exactly. That's my understanding for ACA-Py Multitenancy and one of the the reasons we made Traction in the first place (and extracted the solution to the Multitenant Provider plugin).

It still states that in the docs here:
https://github.com/openwallet-foundation/acapy/blob/main/docs/features/Multitenancy.md#method-2-get-tenant-token

but it's been a while since I've used that feature other than with Traction

Copy link
Contributor

@loneil loneil Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the caching solution, if it's in-memory rather than a persistence layer (the mongo Database the controller uses, or some other shared) then multiple pods or serverless processes or would still be fetching individual tokens and invalidating each other. 3 pods all starting up at the same time, or auto-scaling, etc would have 3 different tokens in this model, and 2 of those pods would have tokens that are invalidated.

Not too much a concern for me as we use a single tenant aca-py (or would use Traction), but something that unless I'm mistaken would be a blocker for using VCAuth in a MT ACA-Py deployment that isn't a single process. Though of course using the Multitenancy Provider plugin could alleviate that.

I've posted for clarification just to make sure this does still apply: https://discord.com/channels/1022962884864643214/1286299858994462842/1448556475692748801

@MonolithicMonk
Copy link
Contributor Author

Thanks you two for your feedbacks. I am a little scatter brained this past day so I haven't fully digested the feedback properly. I goal is to have resolved end of day and provide any additional feedback

@MonolithicMonk
Copy link
Contributor Author

I'm not too sure about the class level ttl cache. Please provide feedback if needed.

loneil
loneil previously approved these changes Dec 11, 2025
Copy link
Contributor

@loneil loneil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think keeping the token in-memory and refreshing the way it does can be a way to go here.
On a different project I work on I'm a consumer of Traction issuing credentials to businesses and it's using GCP Cloud Run processes, so a similar need. We get a token when needed on cold start, store in the processes memory like here, and each process has a token.
I think as long as there's no odd edge cases with the time this can work as it is.

Could be a more scalable solution to persist the token in the DB, and then still use this caching to avoid churn of the token, but that could be a future todo if considerations for very high scaling are needed I think.

The notes about non-traction multitenant mode with the invalidating tokens are still something to consider though, but keeping in mind this PR is really for the Traction addition, so the MT-mode problem is already existing in there (unless I've got that very wrong...). It might just have to be a consideration that for now that mode can't parallelize? Could add some documentation at least if more investigation is needed.

Will see if @esune thinks we should proceed with this

# Class-level cache
_token: str | None = None
_token_expiry: float = 0.0
TOKEN_TTL: int = 3600
Copy link
Contributor

@loneil loneil Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it defaults to 1 day, I could see Traction setups using tokens that live for less than 1 hour for security. Might be good to at least add a TODO to make this configurable in vcauth (or implement if simple)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this, better to keep it conigurable

@esune
Copy link
Member

esune commented Dec 15, 2025

I think keeping the token in-memory and refreshing the way it does can be a way to go here. On a different project I work on I'm a consumer of Traction issuing credentials to businesses and it's using GCP Cloud Run processes, so a similar need. We get a token when needed on cold start, store in the processes memory like here, and each process has a token. I think as long as there's no odd edge cases with the time this can work as it is.

Could be a more scalable solution to persist the token in the DB, and then still use this caching to avoid churn of the token, but that could be a future todo if considerations for very high scaling are needed I think.

The notes about non-traction multitenant mode with the invalidating tokens are still something to consider though, but keeping in mind this PR is really for the Traction addition, so the MT-mode problem is already existing in there (unless I've got that very wrong...). It might just have to be a consideration that for now that mode can't parallelize? Could add some documentation at least if more investigation is needed.

Will see if @esune thinks we should proceed with this

I would leave it as-is (in-memory) for now and revisit on the future as we are introducing quite a bit of extra complexity between this and the redis state sharing changes.

esune
esune previously approved these changes Dec 15, 2025
Copy link
Member

@esune esune left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving change, if @MonolithicMonk can do the quick update for the configurable TTL in the next day or so I can re-approve and merge then.

@MonolithicMonk
Copy link
Contributor Author

Hey all, I've been swamped with family members for the past week and haven't had time to respond. I will respond by today. Thanks for feedback and review.

Signed-off-by: Yuki I <omoge.real@gmail.com>
Signed-off-by: Yuki I <omoge.real@gmail.com>
@MonolithicMonk MonolithicMonk dismissed stale reviews from esune and loneil via 5b32eef December 18, 2025 17:47
@esune esune merged commit fab1838 into openwallet-foundation:main Dec 22, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feat: support Traction tenant API fallback for webhook registration

5 participants