I raised a question on Discord about the 153 sub-agent architectural choice for frosty. This is my follow-up, suggesting alternatives that can reduce latency (and likely cost). Many specialist sub-agents perform a single task that, with a little more forethought, could be handled better (in speed, cost, and reliability) as a straight-up tool/method call.
I'd propose a pilot test that replaces /src/frosty_ai/objagents/sub_agents/securityengineer/passwordpolicy with minor changes to the Security Engineer subagent. Have it call Python that parameterizes and executes the CREATE PASSWORD POLICY IF NOT EXISTS SQL statement in Snowflake. Most of the password policy specialist sub-agent's responsibilities can be performed programmatically in Python; no LLMAgent is needed.
The present behavior described by the Password Policy Specialist's prompt.py lines 31-63 (except the explanation requirement on line 48) is better implemented as a Python config class (suggested name PasswordPolicyConfig). The business logic can be changed as easily in a config.json file as in a Markdown file. All of these ranges, validity checks, and defaults are far more reliably represented directly in code. They will execute faster, they will not repeatedly consume tokens and fill context, and they will be deterministic. In some percentage of requests, an LLM can misremember or misapply any of these defaults and range boundaries, which can lead to silent errors. If you don't want to write this config class by hand, you can pass most of this block to any coding LLM with supporting instructions, and it should generate a draft of the Python class and config.json structure without too much work.
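As a rough illustration of what such a config class could look like, here is a minimal sketch. The field names mirror Snowflake password policy options, but the defaults and ranges are placeholders I made up for the example; they are not copied from prompt.py and should be replaced with the values in lines 31-63. The `from_json` helper and the `_BOUNDS` table are likewise hypothetical.

```python
import json
from dataclasses import dataclass, fields

# Placeholder (min, max) bounds per field. These are assumptions for the
# sketch; substitute the ranges currently written in prompt.py.
_BOUNDS = {
    "password_min_length": (8, 256),
    "password_max_age_days": (0, 999),
    "password_max_retries": (1, 10),
    "password_lockout_time_mins": (1, 999),
}


@dataclass(frozen=True)
class PasswordPolicyConfig:
    # Placeholder defaults, not taken from the existing prompt.
    password_min_length: int = 14
    password_max_age_days: int = 90
    password_max_retries: int = 5
    password_lockout_time_mins: int = 15

    def __post_init__(self):
        # Deterministic validation: every field must be an int within bounds.
        for f in fields(self):
            value = getattr(self, f.name)
            low, high = _BOUNDS[f.name]
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{f.name} must be an integer, got {value!r}")
            if not low <= value <= high:
                raise ValueError(f"{f.name}={value} is outside [{low}, {high}]")

    @classmethod
    def from_json(cls, text: str) -> "PasswordPolicyConfig":
        # Overrides come from config.json; missing keys fall back to defaults.
        data = json.loads(text)
        unknown = set(data) - {f.name for f in fields(cls)}
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        return cls(**data)
```

An out-of-range value fails loudly at construction time instead of being silently misapplied, which is the main point of moving these rules out of the prompt.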
Justifying that the password policy meets the user's intent (line 48) is more complex and is best handled by the Security Engineer subagent for now (there are ways to build in greater mechanistic explainability, but those would require a total overhaul that is out of scope here).
LLMAgents are responsible for determining user intent and handling the conversational interface. In identifying that the user needs new or renamed password policies, the Security Engineer ought to recognize whether the user wants a "strong password," etc. (In fact, to the extent "strong password" may be entered by the user as literal text, it can be detected via regex before the message is sent to the LLM and stored in a bool; the LLM should be reserved for inferring intentions that are harder to detect in natural language.) Once the Security Engineer has a complete-enough picture (it can be instructed to ask follow-up questions to fill in missing info until that is the case), it should output structured JSON describing all of the arguments needed for the CREATE PASSWORD POLICY IF NOT EXISTS statement. Receipt of this structured response should trigger a call to a Python method, passing this dictionary. Whether that method is exposed as a tool or called from code in the Security Engineer's agent.py is a matter of choice; if calling a Tool requires wrapping it in a Subagent, I'd just call the Python method directly.
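A minimal sketch of that hand-off might look like the following. Everything here is illustrative: the function names, the JSON shape (`name` plus `parameters`), and the allow-list of options are my assumptions, not frosty's actual interfaces. The `cursor` argument is any object with an `execute(sql)` method, such as a database cursor. The sketch validates and formats the values itself rather than assuming bind variables are accepted in this DDL statement.

```python
import json
import re

# Hypothetical pre-check run before the message reaches the LLM.
_STRONG_RE = re.compile(r"\bstrong(?:er)?\s+password", re.IGNORECASE)
# Conservative unquoted-identifier pattern (an assumption for this sketch).
_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")
# Options the sketch will pass through; extend to match the real policy spec.
_ALLOWED = {
    "PASSWORD_MIN_LENGTH",
    "PASSWORD_MAX_AGE_DAYS",
    "PASSWORD_MAX_RETRIES",
    "PASSWORD_LOCKOUT_TIME_MINS",
}


def mentions_strong_password(message: str) -> bool:
    """Return True when the literal phrase appears in the user's message."""
    return _STRONG_RE.search(message) is not None


def build_password_policy_sql(agent_json: str) -> str:
    """Turn the agent's structured JSON into one validated SQL statement."""
    payload = json.loads(agent_json)
    name = payload["name"]
    if not isinstance(name, str) or not _IDENT_RE.fullmatch(name):
        raise ValueError(f"invalid policy name: {name!r}")
    clauses = []
    for key, value in sorted(payload.get("parameters", {}).items()):
        option = key.upper()
        if option not in _ALLOWED:
            raise ValueError(f"unsupported option: {key!r}")
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{key} must be an integer, got {value!r}")
        clauses.append(f"{option} = {value}")
    return " ".join([f"CREATE PASSWORD POLICY IF NOT EXISTS {name}", *clauses])


def create_password_policy(cursor, agent_json: str) -> str:
    """Build the statement, execute it on the given cursor, and return it."""
    sql = build_password_policy_sql(agent_json)
    cursor.execute(sql)
    return sql
```

Because the name and every option are checked before the string is assembled, a malformed or unexpected response from the agent raises an error instead of reaching Snowflake.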
My suggestion is to implement this approach: a Security Engineer that recognizes the user's password policy intent and forms a structured output response, which is passed to standard Python (a "tool" call one way or the other) to apply default config guidelines, issue the SQL statement, and handle errors. Run it alongside the present-day Security Engineer, which uses a password-policy specialist sub-agent that infers all of its behavior from skills/prompts. Measure time, latency, and token usage over multiple runs of each agent flow. Then you can judge whether it makes sense to apply similar patterns to replace frosty's other specialist sub-agents. At the end of the day, you may find that an architecture based on 153 sub-agents to do everything that can be done in Snowflake is sub-optimal.
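For the side-by-side comparison, a small harness along these lines would probably be enough. It is only a sketch: `benchmark` is a hypothetical helper, and it assumes each agent flow can be wrapped in a callable that takes a prompt and returns the number of tokens it consumed, which may not match how frosty currently reports usage.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(flow: Callable[[str], int], prompts: List[str], runs: int = 5) -> Dict[str, float]:
    """Run `flow` over every prompt `runs` times and summarize the results.

    `flow` is assumed to return the token count it used for that prompt.
    """
    if not prompts or runs < 1:
        raise ValueError("need at least one prompt and one run")
    latencies: List[float] = []
    tokens: List[int] = []
    for _ in range(runs):
        for prompt in prompts:
            start = time.perf_counter()
            used = flow(prompt)
            latencies.append(time.perf_counter() - start)
            tokens.append(used)
    return {
        "calls": float(len(latencies)),
        "median_latency_s": statistics.median(latencies),
        "mean_latency_s": statistics.fmean(latencies),
        "mean_tokens": statistics.fmean(tokens),
    }
```

Running the same prompt list through both flows, for example `benchmark(specialist_flow, prompts)` and `benchmark(direct_python_flow, prompts)` with those two wrappers being whatever you write around each agent, gives directly comparable numbers.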
I'll also reference a rather timely YouTube video by Nate B Jones about this same subject that came out yesterday, titled "Your Claude Limit Burns In 90 Minutes Because Of One ChatGPT Habit" (26m 34s), which gives more motivation for why designing AI solutions with greater token economy in mind has become increasingly important.