AI continues to be a fast-moving space and we’re always looking for the next big thing. There’s a lot of buzz now around something called agentic workflows – ugly name but a good idea. LLMs had a good run as the state-of-the-AI-art; however, evidence is building that the foundation model behind LLMs has limitations on its own, both theoretically and in practical applications. Simply building bigger and bigger models (over a trillion parameters last time I looked) may not deliver any breakthroughs beyond excess cost and power consumption. We need new ideas, and agentic workflows might be an answer.
Image courtesy Mike McKenzie
Limits on transformers/LLMs
First I should acknowledge a Quanta article that started me down this path. A recent paper looked at theoretical limits on transformers based on complexity analysis. The default use model starts with a prompt to the LLM which should then return the result you want. Viewing the transformer as a compute machine, the authors prove that the range of problems that can be addressed is quite limited for these or any comparable model architectures.
A later paper generalizes their work to consider chain of thought architectures, in which reasoning proceeds in a chain of steps. The prompt suggests breaking the task down into a series of simpler intermediate goals which are demonstrated in “show your work” results. The authors prove complexity limits increase slightly with a slowly growing number of steps (with respect to the prompt size), more quickly with linear growth in steps, and faster still with polynomial growth. In the last of these cases they prove the class of problems that can be solved is exactly those solvable in polynomial time.
Complexity-based proofs might seem too abstract to be important. After all, the travelling salesman problem is known to be NP-hard, yet chip design routinely depends on heuristic solutions to such problems and works very well. However, limitations in practical applications of LLMs to math reasoning (see my earlier blog) hint that these theoretical analyses may not be too far off-base. Accuracy certainly grows with more intermediate steps in real chain of thought analyses. Time complexity in running multiple steps also grows, and per the theory will grow at corresponding rates. This suggests that while higher accuracy may be possible, the price is likely to be longer run times.
Agentic flows
The name derives from use of “agents” in a flow. There’s a nice description of the concepts in a YouTube video by Andrew Ng who contrasts the one-shot LLM approach (you provide a prompt, it provides an answer in one shot) with the Agentic approach which looks more like the way a human would approach a task. Develop a plan of attack, do some research, write a first pass, consider what areas might need to be improved (perhaps even have another expert review the draft), iterate until satisfied.
Agentic flows in my understanding provide a framework to generalize chain of thought reasoning. At a first level, following the Andrew Ng video, in a prompt you might ask a coder agent LLM to write a piece of code (step 1), and in the same prompt ask it to review the code it generated for possible errors (step 2). If it finds errors, it can refine the code and you could imagine this process continuing through multiple stages of self-refinement. A next step would be to use a second agent to test the code against some series of tests it might generate based on a specification. Together these steps are called “Reflection” for obvious reasons.
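To make the Reflection loop concrete, here is a minimal Python sketch. It is my own illustration, not code from the video: the function names (`reflection_loop`, `toy_generate`, `toy_critique`, `toy_refine`) are hypothetical, and the three "agents" are hard-coded toy stand-ins rather than calls to a real LLM. In a real flow each callable would wrap a prompt to a model.

```python
from typing import Callable, Optional, Tuple


def reflection_loop(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], Optional[str]],
    refine: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Generate a draft, then alternate critique and refine.

    Returns the final draft and the number of refinement rounds used.
    Stops early when the critic returns None (no complaints).
    """
    draft = generate(task)
    for round_no in range(max_rounds):
        feedback = critique(task, draft)
        if feedback is None:
            return draft, round_no
        draft = refine(task, draft, feedback)
    return draft, max_rounds


# Toy stand-ins for the agents (assumptions for illustration only).
def toy_generate(task: str) -> str:
    # A "coder agent" whose first draft has a deliberate bug.
    return "def add(a, b):\n    return a - b\n"


def toy_critique(task: str, draft: str) -> Optional[str]:
    # A "tester agent": run the draft against one check from the spec.
    namespace: dict = {}
    exec(draft, namespace)
    if namespace["add"](2, 3) != 5:
        return "add(2, 3) should be 5"
    return None


def toy_refine(task: str, draft: str, feedback: str) -> str:
    # A "coder agent" second pass that applies the obvious fix.
    return draft.replace("a - b", "a + b")


final_code, rounds = reflection_loop(
    "write add(a, b)", toy_generate, toy_critique, toy_refine
)
print(rounds)
print(final_code)
```

The `max_rounds` cap is the design point worth noticing: every extra round of self-refinement is another full pass through a model, which is where the longer run times come from.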
There are additional components in the flow that Andrew suggests: Tool Use, Planning and Multi-Agent Collaboration. However, the Reflection part is most interesting to me.
What does an Agentic flow buy you?
Agentic flows do not fix the time complexity problem; instead, they suggest an architecture concept for extending accuracy for complex problems through a system of collaborating agents. You could imagine this being very flexible and there are some compelling demonstrations. At the same time, Andrew notes we will have to think of agentic workflows taking minutes or even hours to return a useful result.
A suggestion
I see long run times as an interesting human engineering challenge. We’re OK waiting seconds to get an OK result (like a web search). Waiting possibly hours for anything less than a very good result would be a tough sell.
I get that VCs and the ventures they fund are aiming for moonshots – artificial general intelligence (AGI) as the only thing that might attract enough attention in a white-hot AI market. I wish them well, especially in the intermediate discoveries they make along the way. The big goal I suspect is still a long way off.
However the agentic concept might deliver practical and near-term value if we are prepared to allow expert human agents in the flow. Let the LLM do the hard work to get to a nearby goal, and perhaps suggest a few alternatives for paths it might follow next. This should take minutes at most. An expert human agent then directs the LLM to follow one of those paths. Repeat as necessary.
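The alternation I have in mind can be sketched in a few lines of Python. Everything here is a hypothetical illustration: `guided_search`, `toy_propose`, `scripted_expert` and the little design hierarchy are made up for the example, the "LLM" is a dictionary lookup, and the "expert" is a scripted list of picks so the sketch runs without anyone at the keyboard.

```python
from typing import Callable, Dict, List


def guided_search(
    start: str,
    propose: Callable[[str], List[str]],
    choose: Callable[[str, List[str]], int],
    max_steps: int = 10,
) -> List[str]:
    """Alternate between an LLM proposing next paths and an expert picking one.

    Returns the sequence of goals visited, ending when the proposer has
    nothing further to suggest or the step budget runs out.
    """
    path = [start]
    for _ in range(max_steps):
        options = propose(path[-1])      # LLM: short hop plus alternatives
        if not options:
            break
        pick = choose(path[-1], options)  # expert: decide which way to go
        path.append(options[pick])
    return path


# Toy stand-in for the LLM: a made-up design hierarchy to narrow down.
HIERARCHY: Dict[str, List[str]] = {
    "soc": ["cpu", "noc", "ddr_ctrl"],
    "noc": ["router0", "router1"],
    "router1": ["arbiter", "fifo"],
}


def toy_propose(node: str) -> List[str]:
    return HIERARCHY.get(node, [])


def scripted_expert(picks: List[str]) -> Callable[[str, List[str]], int]:
    # Toy stand-in for the human: replays a fixed list of decisions.
    remaining = iter(picks)

    def choose(node: str, options: List[str]) -> int:
        return options.index(next(remaining))

    return choose


path = guided_search(
    "soc", toy_propose, scripted_expert(["noc", "router1", "fifo"])
)
print(" -> ".join(path))
```

Each pass through the loop is one short LLM hop followed by one human decision, so the overall pace is set by how quickly the expert can choose rather than by one long unattended run.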
I’m thinking particularly of verification debug. In the Innovation in Verification series we’ve covered a few research papers on fault localization. All are useful, but all still struggle to accurately locate a root cause. An agentic workflow alternating between an LLM and an expert human agent might help push accurate localization further, and it could progress as quickly as the expert could decide between alternatives.
Any thoughts?