Un-3217 Implement pooling for AsyncExecutors. #328
andreidenissov-cog wants to merge 8 commits into main
Conversation
```python
# No limit on number of concurrently running executors
MAX_CONCURRENT_EXECUTORS = 0
```
Why does this need to be defined twice?
Maybe this should move to AsyncioExecutorPool and be the default for the constructor?
Very interesting approach!
In a way there is a leak: if there is a burst of requests, there will be one executor per request,
and that number will never go down. But I can see how this addresses the "Event loop is closed" problem.
Maybe there is a way to slowly shut them down? (not for this PR)
- Single definition for MAX_CONCURRENT_EXECUTORS, please.
- Move AsyncioExecutorPool to internals.utils to avoid the tangles you've just created.
Also: is per-request logging metadata still working as expected?
I can see a potential problem where we set up logging for the event-loop thread only once, and the logging for the next reuse gets mangled.
```python
config: Dict[str, Any] = agent_network.get_config()
self.llm_factory: ContextTypeLlmFactory = MasterLlmFactory.create_llm_factory(config)
self.toolbox_factory: ContextTypeToolboxFactory = MasterToolboxFactory.create_toolbox_factory(config)
self.async_executors_pool: AsyncioExecutorPool = AsyncioExecutorPool(MAX_CONCURRENT_EXECUTORS)
```
Should this pool be owned by the server and not the service?
```python
from neuro_san.internals.graph.persistence.registry_manifest_restorer import RegistryManifestRestorer
from neuro_san.internals.interfaces.agent_network_provider import AgentNetworkProvider
from neuro_san.internals.network_providers.agent_network_storage import AgentNetworkStorage
from neuro_san.service.generic.asyncio_executor_pool import AsyncioExecutorPool
```
Cannot import stuff from neuro_san.service in the client area. That is a huge reach-around tangle,
and it will be a problem should we decide to break the service code out into a separate repo.
I think what you want to do is put this in internals.utils for now.
It could easily make its way down to leaf-common.asyncio.
```python
from leaf_common.asyncio.asyncio_executor import AsyncioExecutor
from leaf_server_common.logging.logging_setup import setup_extra_logging_fields

from neuro_san.service.generic.asyncio_executor_pool import AsyncioExecutorPool
```
Cannot include neuro_san.service here either.
Move it to internals.utils.
```python
                False, if requested executor instances are created new
                and shutdown on return (backward compatible mode)
        """
        self.max_concurrent = max_concurrent
```
Where is self.max_concurrent ever used?
Nowhere right now. Was thinking somewhat ahead. Maybe not worth it.
```python
# No limit on number of concurrently running executors
MAX_CONCURRENT_EXECUTORS = 0
```
For now, we don't limit the number of AsyncExecutors which can be created in the pool.
```python
self.llm_factory: ContextTypeLlmFactory = MasterLlmFactory.create_llm_factory(config)
self.toolbox_factory: ContextTypeToolboxFactory = MasterToolboxFactory.create_toolbox_factory(config)
self.async_executors_pool: AsyncioExecutorPool = AsyncioExecutorPool(MAX_CONCURRENT_EXECUTORS)
# Load once
```
Create an instance of AsyncioExecutorPool for this service.
```python
    self.llm_factory,
    self.toolbox_factory,
    metadata)
invocation_context.start()
```
SessionInvocationContext now takes AsyncioExecutorPool as a parameter.
```diff
  # Get an async executor to run all tasks for this session instance:
- self.asyncio_executor: AsyncioExecutor = AsyncioExecutor()
+ self.asyncio_executor: AsyncioExecutor = self.async_executors_pool.get_executor()
  self.origination: Origination = Origination()
```
Instead of creating an AsyncioExecutor instance directly here, request it from the pool.
```diff
  if self.asyncio_executor is not None:
-     self.asyncio_executor.shutdown()
+     self.async_executors_pool.return_executor(self.asyncio_executor)
      self.asyncio_executor = None
```
And when we are done, we return our AsyncioExecutor back to the pool.
```python
# No limit on number of concurrently running executors
MAX_CONCURRENT_EXECUTORS = 0
```
For now, we don't limit the number of AsyncExecutors which can be created in the pool.
```python
    self.llm_factory,
    self.toolbox_factory,
    metadata)
invocation_context.start()
```
We do the same thing as for sync AgentService above.
```python
self.logger = logging.getLogger(self.__class__.__name__)
self.logger.info("AsyncioExecutorPool created: %s reuse: %s max concurrent: %d",
                 id(self), str(self.reuse_mode), self.max_concurrent)
```
Pretty simple-minded pool-of-AsyncioExecutors implementation.
Instances move through the self.pool sequence in queue mode: first in, first out.
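A minimal sketch of such a FIFO pool, assuming the `get_executor()`/`return_executor()` API and the `reuse_mode`/`max_concurrent` constructor parameters seen in the diffs above. The stub `AsyncioExecutor` and the internal details (a `queue.Queue`, unbounded growth) are assumptions for illustration, not the PR's actual implementation:

```python
import queue


class AsyncioExecutor:
    """Stub standing in for leaf_common.asyncio.asyncio_executor.AsyncioExecutor."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def shutdown(self):
        self.running = False


class AsyncioExecutorPool:
    """FIFO pool of AsyncioExecutor instances.

    reuse_mode=False reproduces the backward-compatible behavior:
    every get_executor() creates a fresh executor and
    return_executor() shuts it down.
    """

    def __init__(self, max_concurrent: int = 0, reuse_mode: bool = True):
        self.max_concurrent = max_concurrent  # 0 means "no limit"
        self.reuse_mode = reuse_mode
        self.pool: "queue.Queue[AsyncioExecutor]" = queue.Queue()

    def get_executor(self) -> AsyncioExecutor:
        if not self.reuse_mode:
            executor = AsyncioExecutor()
            executor.start()
            return executor
        try:
            # First in, first out: reuse the oldest idle executor.
            return self.pool.get_nowait()
        except queue.Empty:
            # Pool is empty: grow it by one live executor.
            executor = AsyncioExecutor()
            executor.start()
            return executor

    def return_executor(self, executor: AsyncioExecutor):
        if not self.reuse_mode:
            executor.shutdown()
            return
        # Keep the executor (and its event loop) alive for the next request.
        self.pool.put(executor)
```

The key property is that in reuse mode an executor's event-loop thread is never shut down between requests, so late callbacks (such as httpx closing a pooled transport connection) still find a running loop.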
This PR implements very simple pooling for the AsyncExecutors we use to run per-request service sessions.
The main goal is to compensate for the (unknown to us) httpx connection-pooling policy, which results in the transport connection being closed at some unpredictable moment, after our request has already been processed and the underlying AsyncExecutor has been shut down.
With pooling, we basically keep all our AsyncExecutors alive and running, with their event loops active and available for an incoming "close this connection" task. There is also some performance advantage in not creating an AsyncExecutor from scratch for each request, with thread creation/starting and all.
We can still specify a "backward compatibility" mode for the pool, with pass-thru creation and shutdown of each AsyncExecutor.
Tested: the usual stress tests for both OpenAI and AzureOpenAI models, with no exceptions observed.