Skip to content

Comments

UN-3310 Faster server startup.#290

Merged
andreidenissov-cog merged 6 commits intomainfrom
ASD-UN-3310-faster-startup01
Jul 9, 2025
Merged

UN-3310 Faster server startup.#290
andreidenissov-cog merged 6 commits intomainfrom
ASD-UN-3310-faster-startup01

Conversation

@andreidenissov-cog
Copy link
Contributor

@andreidenissov-cog andreidenissov-cog commented Jul 9, 2025

Main idea of this PR is switch to "lazy" service instances instantiation.
Right now, we always create service instances - 2 of them in fact for each agent defined in server manifest;
one for gRPC, one for http, even if say we don't use gRPC for this particular server invocation,
and only use a single agent for chat.
Each service creation involves reading at least 2 files, which adds up.
Changing to service instantiation on demand makes server start up much faster.
There is an initial lag of course for loading all Python classes, but nothing to be done for that I guess.

@andreidenissov-cog andreidenissov-cog self-assigned this Jul 9, 2025
agent_network_provider,
agent_server_logging)
self.allowed_agents[agent_name] = agent_service_provider

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of AsyncAgentService instances, "allowed_agents" map now contains AsyncAgentServiceProvider instances for lazy instantiation of AsyncAgentService.

self.server_logging: AgentServerLogging = server_logging
self.agent_network_provider: AgentNetworkProvider = agent_network_provider
self.agent_name: str = agent_name
self.lock: Lock = Lock()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Save constructor parameters for later - when we will need to instantiate AgentService object on the first use.

self.security_cfg,
self.agent_name,
self.agent_network_provider,
self.server_logging)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Standard instance creation pattern for multi-threaded environment.

from typing import Dict
from threading import Lock
import copy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class more or less duplicates AgentServiceProvider logic, but for AsyncAgentService instances.
Maybe we could cook up something more generic and elegant Pythonic style for both classes,
but not in this PR.

# Service is not yet instantiated - it has no requests
return 0
service: AgentService = self.service_provider.get_service()
return service.get_request_count()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's perf optimization: if service instance has not been instantiated, then its request count is obviously 0.
Otherwise with current gRPC server logic, we'll always instantiate all agent services,
even when not using gRPC service at all.

agent_name,
agent_network_provider,
server_logging)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create "lazy" service provider - instantiate actual AgentService only when needed.

service: AgentService = self.service_provider.get_service()
response_dict: Dict[str, Any] =\
service.function(request_dict, request_metadata, context)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below - get AgentService from AgentServiceProvider for actual request processing.

self.do_finish()
return None
return service_provider.get_service()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Factor out common logic for getting AsyncAgentService for request processing.
Will be used in all (3) neuro-san API requests.


self.logger.info(self.get_metadata(), f"[REQUEST RECEIVED] {self.request.method} {self.request.uri}")
self.logger.debug(self.get_metadata(), f"[REQUEST RECEIVED] {self.request.method} {self.request.uri}")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less chatty.


service: AsyncAgentService = self.agent_policy.allow(agent_name)
service: AsyncAgentService = await self.get_service(agent_name, metadata)
if service is None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and below - use common function from base_request_handler module.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice that this consolidates all that other status/error/do_finish business.

self.set_status(404)
self.logger.error({}, "error: Invalid request path %s", self.request.path)
self.do_finish()
return
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use common function from base_request_handler module.

self.logger.error({}, "error: Invalid request path %s", self.request.path)
self.do_finish()
return

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use common function from base_request_handler module.

app.listen(self.http_port)
self.logger.info({}, "HTTP server is running on port %d", self.http_port)
self.logger.info({}, "HTTP server is shutting down after %d requests", self.requests_limit)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Start actually listening after all preparations are done.
Log server start at this point.

Copy link
Collaborator

@d1donlydfink d1donlydfink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider the nits I mention below for a future PR.

It's conceivable we could get some complaints about initial start up time of individual service. (Thinking: 1C) Cross that bridge when we get there.

self.server_logging)
return self.service_instance

def service_created(self) -> bool:
Copy link
Collaborator

@d1donlydfink d1donlydfink Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Since this is a boolean, perhaps a better name would be is_service_created()
Here and below in the async version.

self.security_cfg,
self.agent_name,
self.agent_network_provider,
self.server_logging)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For next PR consideration: Since all of this is the same except for the type of service_instance, maybe there is a path to abstraction here.
However well functioning all the async stuff is, someday Python will clean up this copy-paste cancer on the language.


service: AsyncAgentService = self.agent_policy.allow(agent_name)
service: AsyncAgentService = await self.get_service(agent_name, metadata)
if service is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice that this consolidates all that other status/error/do_finish business.

@andreidenissov-cog andreidenissov-cog merged commit 172ca02 into main Jul 9, 2025
4 checks passed
@andreidenissov-cog andreidenissov-cog deleted the ASD-UN-3310-faster-startup01 branch July 9, 2025 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants