UN-3477 Client Policy for LLMs #420
d1donlydfink merged 10 commits into ASD-UN-3477-model-resources01
Conversation
Anthropic chat models do not allow for passing in an externally managed
async web client.
"""
Anthropic implementation of the ClientPolicy interface.
Still need to be sure I test the non-OpenAI ones.
# Note we don't want to do this in the constructor, as AnthropicChat lazily
# creates these as needed via a cached_property that needs to be done in its own time
# via Anthropic infrastructure. By the time we get here, it's already been created.
anthropic_async_client: Any = self.llm._async_client  # pylint:disable=protected-access
Do the reach-ins in order to shut down the client as correctly as we can inside the delete_resources() method of the ClientPolicy interface.
This is the general pattern to follow when the LLM class itself does not allow you to pass in a client at all; it's all we can do.
:return: The web client that accesses the LLM.
         By default this is None, as many BaseLanguageModels
         do not allow a web client to be passed in as an arg.
"""
Azure implementation follows the make-a-client-first pattern.
Worth noting that this implementation relies on the OpenAI implementation to do the proper delete_resources() policy.
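That delegation might look something like the sketch below. The class and method names follow the PR's vocabulary, but the bodies are invented here (a dict stands in for a real async web client) purely to keep the sketch self-contained:

```python
from typing import Any, Dict


class OpenAIClientPolicy:
    """Style-1 policy: make the client first, clean it up later."""

    def __init__(self) -> None:
        self.client: Any = None

    def create_client(self, config: Dict[str, Any]) -> Any:
        # A real implementation would build an httpx-style async client;
        # a dict stands in so the sketch stays runnable.
        self.client = {"base_url": config.get("base_url"), "open": True}
        return self.client

    def delete_resources(self) -> None:
        # The proper shutdown policy lives here, in the OpenAI class.
        if self.client is not None:
            self.client["open"] = False
            self.client = None


class AzureClientPolicy(OpenAIClientPolicy):
    """Azure reuses the OpenAI delete_resources() policy unchanged."""

    def create_client(self, config: Dict[str, Any]) -> Any:
        # Only the client construction differs; assume an azure_endpoint key.
        azure_config = dict(config, base_url=config.get("azure_endpoint"))
        return super().create_client(azure_config)
```

Because `delete_resources()` is inherited, any fix to the OpenAI shutdown policy benefits Azure for free.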
Bedrock does not allow for passing in async web clients.
As a matter of fact, all of its clients are synchronous,
which is not the best for an async service.
"""
Bedrock is another one of those reach-in-during-delete_resources() cases, as it does not allow a client to be passed in. During this exercise I also learned that Bedrock doesn't even allow an async client, so not-so-great performance is nearly guaranteed within the async server model.
Might be worth poking langchain-aws maintainers to allow for an asynchronous client.
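Until langchain-aws grows an async client, one common stopgap (not part of this PR, just the standard asyncio idiom) is to offload the blocking call to a worker thread so the event loop stays responsive. Here `blocking_invoke` is a made-up stand-in for a synchronous Bedrock-style call:

```python
import asyncio
import time


def blocking_invoke(prompt: str) -> str:
    """Stand-in for a synchronous Bedrock-style client call (hypothetical)."""
    time.sleep(0.01)  # simulate blocking network I/O
    return f"echo: {prompt}"


async def ainvoke(prompt: str) -> str:
    # Offload the blocking call to a worker thread so the async
    # server's event loop is not blocked while it runs.
    return await asyncio.to_thread(blocking_invoke, prompt)


async def gather_calls() -> list:
    # The two calls overlap on worker threads instead of serializing.
    return await asyncio.gather(ainvoke("a"), ainvoke("b"))
```

This only hides the latency behind threads; a true async client from the maintainers would still be the better fix.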
implementations should pass the already created llm into their implementation's
constructor. Later delete_resources() implementations will need to do a reach-in
to the llm instance to clean up any references related to the web client.
"""
This is the basic interface for LLM ClientPolicy.
This is the 2nd iteration on this kind of interface, moving from the data-only class with external policy of the LangChainLlmClient we had previously to a class that is more policy-based (verb-centered as opposed to noun-centered).
Having slept on this, I think there is perhaps a 3rd iteration to be had which adds 2 more
methods looking something like this:

    def get_llm_class_name(self) -> str:
        """
        :return: The string class name for the LLM represented, so as to register this policy class
                 with the neuro-san LlmFactory system
        """
        # Would return something like "openai" or "anthropic" or whatever the llm
        # class name for the llm_info.hocon files would be
        raise NotImplementedError

    def create_chat_model(self, config: Dict[str, Any], client: Any = None) -> BaseLanguageModel:
        ...
This last create_chat_model() would allow for LLM creation and deletion policy to exist within the same class, and with a standardized external structure it could allow for standardized calling and registration, as long as the class was listed in the user's llm_info.hocon file. The idea is that we could also register our own classes this way in default_llm_info.hocon as an example. This would mean that LlmFactory would no longer be the interface to override (though we could keep it for backwards compatibility); this new LlmPolicy interface would be.
Thoughts about this direction anyone?
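To make the proposal above concrete, here is a rough runnable sketch of what such an LlmPolicy plus registry could look like. Everything below is speculative: the method names come from the comment above, but `FakeAnthropicPolicy`, the dict return value, and the `REGISTRY` shape are invented for illustration:

```python
from typing import Any, Dict


class LlmPolicy:
    """Sketch of the proposed unified creation/deletion interface."""

    def get_llm_class_name(self) -> str:
        raise NotImplementedError

    def create_chat_model(self, config: Dict[str, Any], client: Any = None) -> Any:
        raise NotImplementedError

    def delete_resources(self) -> None:
        raise NotImplementedError


class FakeAnthropicPolicy(LlmPolicy):
    """Toy registrant; a real one would return a ChatAnthropic instance."""

    def get_llm_class_name(self) -> str:
        return "anthropic"

    def create_chat_model(self, config: Dict[str, Any], client: Any = None) -> Any:
        return {"class": self.get_llm_class_name(),
                "model": config.get("model_name")}


# Registry keyed by the llm_info.hocon class name, roughly as an
# LlmFactory replacement might hold it.
REGISTRY: Dict[str, LlmPolicy] = {}
_policy = FakeAnthropicPolicy()
REGISTRY[_policy.get_llm_class_name()] = _policy
```

With that shape, lookup by the hocon class name drives both creation and later cleanup from one object.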
stream_usage=True,
thinking=config.get("thinking"),
mcp_servers=config.get("mcp_servers"),
context_management=config.get("context_management"),
Added some params I discovered while kicking around with Anthropic.
install_if_missing="langchain-anthropic")
# ChatAnthropic currently only supports _async_client() as a cached_property,
# not as a constructor arg.
Comment of note.
# Create the client_policy after the fact, with reach-in
client_policy = AnthropicClientPolicy(llm)
elif chat_class == "ollama":
Not doing ollama or gemini just yet.
  # Return the LlmResources with the client_policy that was created.
  # That might be None, and that's OK.
- return LangChainLlmResources(llm, llm_client=llm_client)
+ return LangChainLlmResources(llm, client_policy=client_policy)
Stash the ClientPolicy created. If this is still None, then that's still fine.
- def create_llm_resources_with_client(self, config: Dict[str, Any],
-                                      llm_client: LangChainLlmClient = None) -> LangChainLlmResources:
+ def create_llm_resources(self, config: Dict[str, Any]) -> LangChainLlmResources:
Changed the TestLlmFactory to be exemplary.
@d1donlydfink Warning: I reran Neuro-san's smoke test on this branch. It reported a failure on the Azure test case. I will try to find help by running the same test manually on my local machine.
andreidenissov-cog left a comment:
Looks good to me.
Thanks @vince-leaf. Great catch! I believe I have a fix in my next branch.
Merged commit 0db0b54 into ASD-UN-3477-model-resources01
Previously we had a LangChainLlmClient object that held references for use in closing.
Upon further looking into how to do this kind of thing for providers other than OpenAI,
we find that most other ChatModels do not actually let you pass in your own async client.
In at least one case (Bedrock), the client itself can't even be an async client.
So with this PR, we are moving to a ClientPolicy model which serves two different styles in the same interface.
The first style is the OpenAI style. The create_client() method returns the correct object
to pass into the ChatModel's constructor as a client. Inside create_client(), the ClientPolicy
has access to the full llm config so that client objects can also be configured, and it has the freedom
to store objects that are relevant to shutting down web clients for llm access correctly.
The second style is one where the ChatModel does not allow an external client instance to come in via its constructor or to be set anywhere on the object. In this case we pass in the LLM ChatModel itself as the basis reference.
Both styles need to implement a delete_resources() method, which correctly cleans up.
Most often for style 2, this involves a reach-in to private member variables to do the right thing
for cleanup (at least until we find a better way of doing it).
Even with all this change, I still think there might be yet another iteration on this notion of a ClientPolicy interface.
See ClientPolicy to comment on near-future direction.