UN-3477 Client Policy for LLMs #420
Changes from all commits: 0b331d8, 941d91c, 2327b1c, 4373108, 8bd59d4, 50cd2c0, 420d081, 81957f9, 20146f1, 7560186
```diff
@@ -9,44 +9,38 @@
 # neuro-san SDK Software in commercial settings.
 #
 # END COPYRIGHT

 from typing import Any
 from typing import Dict

 import os

-from neuro_san.internals.run_context.langchain.llms.langchain_llm_client import LangChainLlmClient


-class LangChainLlmClientFactory:
+class EnvironmentConfiguration:
     """
-    Interface for Factory classes creating LLM client connections for LangChain.
+    Easy policy add-on for the get_value_or_env() method for various classes that
+    are affected by configuration via dictionary/hocon and/or environment variables.
     """
```
```diff
-    def create_llm_client(self, config: Dict[str, Any]) -> LangChainLlmClient:
-        """
-        Create a LangChainLlmClient instance from the fully-specified llm config.
-        :param config: The fully specified llm config
-        :return: A LangChainLlmClient instance containing run-time resources necessary
-                 for model usage by the service.
-
-        Can raise a ValueError if the config's class or model_name value is
-        unknown to this method.
-        """
-        raise NotImplementedError

-    def get_value_or_env(self, config: Dict[str, Any], key: str, env_key: str) -> Any:
+    @staticmethod
+    def get_value_or_env(config: Dict[str, Any], key: str, env_key: str,
+                         none_obj: Any = None) -> Any:
```
> **Author (Collaborator):** Even though this is a static method, classes that multiply inherit can still use `self.` to access this method. This is more concise for one-line value reading than `EnvironmentConfiguration.get_value_or_env(blah blah)`. At some point we could add a DictionaryExtractor in here to allow for deep keys.
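The `self.`-access point in the comment above can be sketched in miniature. Note that `SomePolicy` and its `model_name()` helper are illustrative stand-ins, not classes from this PR, and the static method body is trimmed:

```python
from typing import Any, Dict


class EnvironmentConfiguration:
    """Minimal stand-in for the interface in this diff (body trimmed)."""

    @staticmethod
    def get_value_or_env(config: Dict[str, Any], key: str, env_key: str) -> Any:
        # Trimmed: the real method also falls back to os.environ[env_key]
        _ = env_key
        return (config or {}).get(key)


class SomePolicy(EnvironmentConfiguration):
    """Hypothetical subclass illustrating self.-access to a @staticmethod."""

    def model_name(self, config: Dict[str, Any]) -> Any:
        # Concise one-line read via self., as the review comment describes
        return self.get_value_or_env(config, "model_name", "MODEL_NAME")
```

Python resolves `self.get_value_or_env(...)` through the MRO even for static methods, so mixin-style inheritance keeps call sites short.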
```diff
         """
         :param config: The config dictionary to search
         :param key: The key for the config to look for
         :param env_key: The os.environ key whose value should be gotten if either
                         the key does not exist or the value for the key is None
+        :param none_obj: An optional object instance to test.
+                         If present, this method will return None, implying
+                         that some other external object/mechanism is supplying the values.
         """
+        if none_obj is not None:
+            return None
```
> **Author (Collaborator):** Do the none_obj policy now.
```diff
         value = None
         if config is not None:
             value = config.get(key)

-        if value is None:
+        if value is None and env_key is not None:
```
> **Author (Collaborator):** Only refer to the environ if the env_key is something.
```diff
             value = os.getenv(env_key)

         return value
```
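The lookup order implemented above (config value first, then environment fallback) can be exercised standalone. The function is re-declared here so the example runs on its own; `DEMO_MODEL_NAME` is an arbitrary env var name for the demo:

```python
import os
from typing import Any, Dict


def get_value_or_env(config: Dict[str, Any], key: str, env_key: str) -> Any:
    """Standalone mirror of the lookup order in the diff above."""
    value = None
    if config is not None:
        value = config.get(key)
    # Only consult os.environ when a real env_key was supplied
    if value is None and env_key is not None:
        value = os.getenv(env_key)
    return value


os.environ["DEMO_MODEL_NAME"] = "env-model"
# A config hit wins over the environment; a miss falls through to os.environ
config_hit = get_value_or_env({"model_name": "config-model"}, "model_name", "DEMO_MODEL_NAME")
env_fallback = get_value_or_env({}, "model_name", "DEMO_MODEL_NAME")
```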
@@ -0,0 +1,48 @@ (new file)
```python
# Copyright (C) 2023-2025 Cognizant Digital Business, Evolutionary AI.
# All Rights Reserved.
# Issued under the Academic Public License.
#
# You can be released from the terms, and requirements of the Academic Public
# License by purchasing a commercial license.
# Purchase of a commercial license is mandatory for any use of the
# neuro-san SDK Software in commercial settings.
#
# END COPYRIGHT

from typing import Any

from contextlib import suppress

from neuro_san.internals.run_context.langchain.llms.client_policy import ClientPolicy


class AnthropicClientPolicy(ClientPolicy):
    """
    Implementation of the ClientPolicy for Anthropic chat models.

    Anthropic chat models do not allow for passing in an externally managed
    async web client.
    """
```
> **Author (Collaborator):** Anthropic implementation of the ClientPolicy interface.
```python
    async def delete_resources(self):
        """
        Release the run-time resources used by the model
        """
        if self.llm is None:
            return

        # Do the necessary reach-ins to successfully shut down the web client

        # This is really an anthropic.AsyncClient, but we don't really want to do the Resolver here.
        # Note we don't want to do this in the constructor, as AnthropicChat lazily
        # creates these as needed via a cached_property that needs to be done in its own time
        # via Anthropic infrastructure. By the time we get here, it's already been created.
        anthropic_async_client: Any = self.llm._async_client  # pylint:disable=protected-access

        if anthropic_async_client is not None:
            with suppress(Exception):
                await anthropic_async_client.aclose()

        # Let's not do this again, shall we?
        self.llm = None
```

> **Author (Collaborator):** Do the reach-ins in order to shut down the client as correctly as we can inside the delete_resources() method of the ClientPolicy interface. This is the general pattern to follow when the LLM class itself does not allow you to pass in a client class at all; it's all we can do.
@@ -0,0 +1,98 @@ (new file)
```python
# Copyright (C) 2023-2025 Cognizant Digital Business, Evolutionary AI.
# All Rights Reserved.
# Issued under the Academic Public License.
#
# You can be released from the terms, and requirements of the Academic Public
# License by purchasing a commercial license.
# Purchase of a commercial license is mandatory for any use of the
# neuro-san SDK Software in commercial settings.
#
# END COPYRIGHT
from typing import Any
from typing import Dict

from leaf_common.config.resolver import Resolver

from neuro_san.internals.run_context.langchain.llms.openai_client_policy import OpenAIClientPolicy


class AzureClientPolicy(OpenAIClientPolicy):
    """
    ClientPolicy implementation for OpenAI via Azure.

    OpenAI's BaseLanguageModel implementations do allow us to pass in a web client
    as an argument, so this implementation takes advantage of the create_client()
    method to do that. Worth noting that where many other implementations might care about
    the llm reference, because of our create_client() implementation, we do not.
    """

    def create_client(self, config: Dict[str, Any]) -> Any:
        """
        Creates the web client to be used by a BaseLanguageModel to be
        constructed in the future. Neuro SAN infrastructure prefers that this
        be an asynchronous client; however, we realize some BaseLanguageModels
        do not support that (even though they should!).

        Implementations should retain any references to state that needs to be cleaned up
        in the delete_resources() method.

        :param config: The fully specified llm config
        :return: The web client that accesses the LLM.
                 By default this is None, as many BaseLanguageModels
                 do not allow a web client to be passed in as an arg.
        """
```
> **Author (Collaborator):** Azure implementation follows the make-a-client-first pattern.
```python
        # OpenAI is the one chat class for which we do not require any extra installs.
        # This is what we want to work out of the box.
        # Nevertheless, have it go through the same lazy-loading resolver rigamarole as the others.

        # Set up a resolver to use to resolve lazy imports of classes from
        # langchain_* packages to prevent installing the world.
        resolver = Resolver()

        # pylint: disable=invalid-name
        AsyncAzureOpenAI = resolver.resolve_class_in_module("AsyncAzureOpenAI",
                                                            module_name="openai",
                                                            install_if_missing="langchain-openai")

        self.create_http_client(config)

        # Prepare some more complex args
        openai_api_key: str = self.get_value_or_env(config, "openai_api_key", "AZURE_OPENAI_API_KEY")
        if openai_api_key is None:
            openai_api_key = self.get_value_or_env(config, "openai_api_key", "OPENAI_API_KEY")

        # From langchain_openai.chat_models.azure.py
        default_headers: Dict[str, str] = {}
        default_headers = config.get("default_headers", default_headers)
        default_headers.update({
            "User-Agent": "langchain-partner-python-azure-openai",
        })

        self.async_openai_client = AsyncAzureOpenAI(
            azure_endpoint=self.get_value_or_env(config, "azure_endpoint",
                                                 "AZURE_OPENAI_ENDPOINT"),
            deployment_name=self.get_value_or_env(config, "deployment_name",
                                                  "AZURE_OPENAI_DEPLOYMENT_NAME"),
            api_version=self.get_value_or_env(config, "openai_api_version",
                                              "OPENAI_API_VERSION"),
            api_key=openai_api_key,
            # AD here means "ActiveDirectory"
            azure_ad_token=self.get_value_or_env(config, "azure_ad_token",
                                                 "AZURE_OPENAI_AD_TOKEN"),
            # azure_ad_token_provider is a complex object, and we can't set that through config

            organization=self.get_value_or_env(config, "openai_organization", "OPENAI_ORG_ID"),
            # project - not set in langchain_openai
            # webhook_secret - not set in langchain_openai
            base_url=self.get_value_or_env(config, "openai_api_base", "OPENAI_API_BASE"),
            timeout=config.get("request_timeout"),
            max_retries=config.get("max_retries"),
            default_headers=default_headers,
            # default_query - don't understand enough to set, but set in langchain_openai
            http_client=self.http_client
        )

        # We retain the async_openai_client reference, but we hand back this reach-in
        # to pass to the BaseLanguageModel constructor.
        return self.async_openai_client.chat.completions
```

> **Author (Collaborator):** Because ClientPolicy derives from EnvironmentConfiguration, we can easily use the self.get_value_or_env() method when configuring the clients too, now.
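The two-step API-key lookup in create_client() gives config precedence, then the Azure-specific env var, then the generic OpenAI one. A standalone sketch of just that precedence logic (the helper is re-declared here so the example runs on its own; the env var names are the ones used in the diff):

```python
import os
from typing import Any, Dict


def get_value_or_env(config: Dict[str, Any], key: str, env_key: str) -> Any:
    """Standalone mirror of EnvironmentConfiguration.get_value_or_env()."""
    value = (config or {}).get(key)
    if value is None and env_key is not None:
        value = os.getenv(env_key)
    return value


def resolve_openai_api_key(config: Dict[str, Any]) -> Any:
    # Mirrors AzureClientPolicy: config key, then Azure env var, then generic env var
    api_key = get_value_or_env(config, "openai_api_key", "AZURE_OPENAI_API_KEY")
    if api_key is None:
        api_key = get_value_or_env(config, "openai_api_key", "OPENAI_API_KEY")
    return api_key
```

This ordering means a deployment can set only `OPENAI_API_KEY` and still work, while `AZURE_OPENAI_API_KEY` or an explicit config value overrides it.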
@@ -0,0 +1,42 @@ (new file)
```python
# Copyright (C) 2023-2025 Cognizant Digital Business, Evolutionary AI.
# All Rights Reserved.
# Issued under the Academic Public License.
#
# You can be released from the terms, and requirements of the Academic Public
# License by purchasing a commercial license.
# Purchase of a commercial license is mandatory for any use of the
# neuro-san SDK Software in commercial settings.
#
# END COPYRIGHT

from neuro_san.internals.run_context.langchain.llms.client_policy import ClientPolicy


class BedrockClientPolicy(ClientPolicy):
    """
    ClientPolicy implementation for Bedrock.

    Bedrock does not allow for passing in async web clients.
    As a matter of fact, all of its clients are synchronous,
    which is not the best for an async service.
    """
```
> **Author (Collaborator):** Bedrock is another one of those reach-in-during-delete_resources cases, as it does not allow a client to be passed in. During this exercise I also learned that Bedrock doesn't even allow an async client, so not-so-great performance is nearly guaranteed within the async server model.

> **Author (Collaborator):** Might be worth poking the langchain-aws maintainers to allow for an asynchronous client.
```python
    async def delete_resources(self):
        """
        Release the run-time resources used by the model
        """
        if self.llm is None:
            return

        # Do the necessary reach-ins to successfully shut down the web client
        if self.llm.client is not None:
            # This is a boto3 client
            self.llm.client.close()

        if self.llm.bedrock_client is not None:
            # This is a boto3 client
            self.llm.bedrock_client.close()

        # Let's not do this again, shall we?
        self.llm = None
```
@@ -0,0 +1,73 @@ (new file)
```python
# Copyright (C) 2023-2025 Cognizant Digital Business, Evolutionary AI.
# All Rights Reserved.
# Issued under the Academic Public License.
#
# You can be released from the terms, and requirements of the Academic Public
# License by purchasing a commercial license.
# Purchase of a commercial license is mandatory for any use of the
# neuro-san SDK Software in commercial settings.
#
# END COPYRIGHT
from typing import Any
from typing import Dict

from langchain.llms.base import BaseLanguageModel

from neuro_san.internals.interfaces.environment_configuration import EnvironmentConfiguration


class ClientPolicy(EnvironmentConfiguration):
    """
    Policy interface to manage the lifecycles of web clients that talk to LLM services.
    This inherits from EnvironmentConfiguration in order to support easy access to the
    get_value_or_env() method.

    There are really two styles of implementation encompassed by this one interface:

    1) When BaseLanguageModels can have web clients passed into their constructor,
       implementations should use the create_client() method to retain any references necessary
       to help them clean up nicely in the delete_resources() method.

    2) When BaseLanguageModels cannot have web clients passed into their constructor,
       implementations should pass the already-created llm into their implementation's
       constructor. Later, delete_resources() implementations will need to do a reach-in
       to the llm instance to clean up any references related to the web client.
    """
```
> **Author (Collaborator):** This is the basic interface for LLM ClientPolicy. This is the 2nd iteration on this kind of interface, moving from the data-only class with external policy of the LangChainLlmClient we had previously to a class that is more policy-based (verb-centered as opposed to noun-centered). Having slept on this, I think there is perhaps a 3rd iteration to be had which adds a notion of 2 more methods. The last of these, create_chat_model(), would allow for LLM creation and deletion policy to exist within the same class, and with standardized external structure could allow for standardized calling and registration as long as the class was listed in the user's llm_info.hocon file. The idea would be that we could also register our own classes this way in default_llm_info.hocon as an example. This would mean that LlmFactory would no longer be the interface to override (though we could keep it for backwards compatibility); this new LlmPolicy interface would be. Thoughts about this direction, anyone?
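The proposed 3rd-iteration interface from the comment above might look roughly like this. This is entirely speculative: the class name `LlmPolicy` comes from the comment, but the exact method set beyond `create_chat_model()` and `delete_resources()` is a guess:

```python
from typing import Any, Dict


class LlmPolicy:
    """
    Hypothetical next-iteration interface sketched from the review comment:
    LLM creation and deletion policy live together in one class that could be
    registered via an llm_info.hocon entry.
    """

    def create_client(self, config: Dict[str, Any]) -> Any:
        # Optional: build a web client to hand to the chat model constructor
        _ = config
        return None

    def create_chat_model(self, config: Dict[str, Any]) -> Any:
        # Would construct the BaseLanguageModel itself, replacing the role
        # that LlmFactory plays today
        raise NotImplementedError

    async def delete_resources(self):
        # Tear down whatever create_client()/create_chat_model() retained
        raise NotImplementedError
```

With creation and teardown on one interface, registration and calling conventions could be standardized for both built-in and user-supplied LLM classes, as the comment suggests.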
```python
    def __init__(self, llm: BaseLanguageModel = None):
        """
        Constructor.

        :param llm: BaseLanguageModel
        """
        self.llm: BaseLanguageModel = llm

    # pylint: disable=useless-return
    def create_client(self, config: Dict[str, Any]) -> Any:
        """
        Creates the web client to be used by a BaseLanguageModel to be
        constructed in the future. Neuro SAN infrastructure prefers that this
        be an asynchronous client; however, we realize some BaseLanguageModels
        do not support that (even though they should!).

        Implementations should retain any references to state that needs to be cleaned up
        in the delete_resources() method.

        :param config: The fully specified llm config
        :return: The web client that accesses the LLM.
                 By default this is None, as many BaseLanguageModels
                 do not allow a web client to be passed in as an arg.
        """
        _ = config
        return None
```

> **Author (Collaborator):** In the meantime, this interface allows for a create_client() method, which saves a place for OpenAI/Azure models to create a client to access LLM services separately from the BaseLanguageModel itself, à la @andreidenissov-cog's original methodology.
```python
    async def delete_resources(self):
        """
        Release the run-time resources used by the instance.

        Unfortunately for many BaseLanguageModels, this tends to involve
        a reach-in to its private internals in order to shut down
        any web client references in there.
        """
        raise NotImplementedError
```

> **Author (Collaborator):** Specify exactly how the LLM instance's resources should be handled when we are shutting it down.
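The two implementation styles named in the interface docstring can be contrasted in miniature. Everything here is an illustrative stand-in (plain dicts and `object()` instead of real web clients and BaseLanguageModels):

```python
import asyncio
from typing import Any, Dict


class ClientPolicy:
    """Trimmed mirror of the interface in this diff."""

    def __init__(self, llm=None):
        self.llm = llm

    def create_client(self, config: Dict[str, Any]) -> Any:
        # Default: many models do not accept a client at all
        _ = config
        return None

    async def delete_resources(self):
        raise NotImplementedError


class PassInClientPolicy(ClientPolicy):
    """Style 1: build the client, retain it, hand it to the model constructor."""

    def __init__(self):
        super().__init__()
        self.client = None

    def create_client(self, config: Dict[str, Any]) -> Any:
        _ = config
        self.client = {"closed": False}  # stand-in for a real web client
        return self.client

    async def delete_resources(self):
        if self.client is not None:
            self.client["closed"] = True
            self.client = None


class ReachInPolicy(ClientPolicy):
    """Style 2: take the already-built llm; reach in at teardown."""

    async def delete_resources(self):
        # A real implementation would reach into self.llm's internals here
        self.llm = None


pass_in = PassInClientPolicy()
client = pass_in.create_client({})
asyncio.run(pass_in.delete_resources())
```

Style 1 owns the client for its whole lifetime; style 2 only ever sees the finished llm, which is why its cleanup has to reach into private state.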
> **Reviewer:** Not the greatest diff here. This is actually a new interface to tack on to things that need to get config values from a dictionary or an env var. This originally comes from LangChainLlmFactory, but I added the notion of a "none_obj", which returns None for the value when the object passed in for that is something. You'll see this used for the OpenAI cases when passing in a client.
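The none_obj behavior described above short-circuits the whole lookup. A self-contained sketch (the function is re-declared here mirroring the diff; the env var name is arbitrary):

```python
import os
from typing import Any, Dict


def get_value_or_env(config: Dict[str, Any], key: str, env_key: str,
                     none_obj: Any = None) -> Any:
    """Standalone mirror of get_value_or_env() including the none_obj policy."""
    if none_obj is not None:
        # Some external object (e.g. an already-built web client) supplies
        # the values, so config/env lookup is deliberately skipped.
        return None
    value = None
    if config is not None:
        value = config.get(key)
    if value is None and env_key is not None:
        value = os.getenv(env_key)
    return value


externally_managed_client = object()
# With a client already in hand, every lookup yields None
skipped = get_value_or_env({"api_key": "sk-123"}, "api_key", "DEMO_API_KEY",
                           none_obj=externally_managed_client)
```

This matches the OpenAI use case the reviewer mentions: once a client object is passed in, per-key configuration values are intentionally suppressed so the client remains the single source of truth.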