They use customer service as their domain example. This is an area I've spent the last decade applying AI/NLP to. 99% of tasks in this domain are fully deterministic - get information about a product, place an order, cancel an order, get the status of an order, process a return, troubleshoot a product, etc. These are all well-defined processes that should use orchestrated workflows that inject AI at the appropriate points (a classifier to determine the customer's intent, NER for slot-filling, and lately RAG to return information from a knowledge base). Once we know what the customer wants to do and have the required information, we execute the workflow. There's no reason to use an "autonomous agent" to engage in reasoning and planning - unless we just want to drive up token costs for some reason.
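To make that concrete, here's a minimal sketch of the pattern I'm describing - AI decides which workflow runs and fills its inputs, while the workflow itself stays deterministic. (All names and the toy matching logic are hypothetical placeholders, not any particular vendor's API.)

```python
from typing import Callable

def classify_intent(utterance: str) -> str:
    # Stand-in for a fine-tuned intent classifier.
    return "order_status" if "where is my order" in utterance.lower() else "unknown"

def extract_slots(utterance: str) -> dict[str, str]:
    # Stand-in for an NER-based slot filler (order IDs, SKUs, dates, ...).
    ids = [t.strip("#?.,!") for t in utterance.split() if t.startswith("#")]
    return {"order_id": ids[0]} if ids else {}

def get_order_status(slots: dict[str, str]) -> str:
    # Deterministic workflow: an ordinary backend lookup, no LLM involved.
    return f"Order {slots['order_id']} is out for delivery."

# The AI components only route and fill slots; the workflows are plain code.
WORKFLOWS: dict[str, Callable[[dict[str, str]], str]] = {
    "order_status": get_order_status,
}

def handle(utterance: str) -> str:
    intent = classify_intent(utterance)
    slots = extract_slots(utterance)
    handler = WORKFLOWS.get(intent)
    return handler(slots) if handler else "Sorry, could you rephrase that?"

print(handle("Where is my order #1234?"))  # -> Order 1234 is out for delivery.
```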
At Intercom we also have a lot of experience here.
I disagree, basically. In our experience, actual real-world processes are not compactly defined and don't have sharp edges.
When you actually try to pull them out of a customer, they turn out to have messy, probabilistic edges. By leveraging an LLM you can sometimes make progress a lot faster and end up with a much more compact and manageable representation of the process.
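To make the contrast concrete, here's a rough sketch of what we mean: the process lives in a natural-language policy plus a small set of tools, and the model handles the messy edges in between. (Everything below - the policy text, the tool set, and the scripted `call_model` stub - is a hypothetical illustration, not our actual implementation or any real provider's API.)

```python
import json

POLICY = """You are a support agent. Refunds under $50 on orders less than
30 days old may be issued directly; anything else gets escalated to a human."""

TOOLS = {
    "lookup_order": lambda args: {"order_id": args["order_id"], "total": 42.0, "age_days": 12},
    "issue_refund": lambda args: {"refunded": True, "order_id": args["order_id"]},
    "escalate": lambda args: {"ticket": "created"},
}

def call_model(messages: list[dict]) -> dict:
    # Stand-in for a real chat-completion call. This scripted version only
    # demonstrates the loop; a real model would pick tools from the policy.
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"tool": "lookup_order", "args": {"order_id": "1234"}}
    last = json.loads(tool_results[-1]["content"])
    if last.get("refunded"):
        return {"reply": f"Done - order {last['order_id']} has been refunded."}
    if last.get("total", 1e9) < 50 and last.get("age_days", 1e9) < 30:
        return {"tool": "issue_refund", "args": {"order_id": last["order_id"]}}
    return {"tool": "escalate", "args": {}}

def run(user_message: str, max_steps: int = 8) -> str:
    messages = [{"role": "system", "content": POLICY},
                {"role": "user", "content": user_message}]
    for _ in range(max_steps):
        decision = call_model(messages)
        if "reply" in decision:
            return decision["reply"]
        result = TOOLS[decision["tool"]](decision["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Let me hand you over to a human agent."

print(run("I'd like a refund on order 1234"))  # -> Done - order 1234 has been refunded.
```

The point is that the process definition collapses into a paragraph of policy plus tools, instead of a flowchart that has to enumerate every edge case.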
We have a strong opinion that this is the future of the space and that purely deterministic workflows will get left behind! I guess we'll see.
This is not core to the article, but -
> However, most examples of high agency agents operate in ideal environments which provide complete knowledge to the agent, and are ‘patient’ to erroneous or flaky interactions. That is, the agent has access to the complete snapshot of its environment at all times, and the environment is forgiving of its mistakes.
This is a long way around basic (very basic) existing concepts in classic AI: "fully" vs. "partially" observable environments and nondeterministic actions are all, if I'm not sorely mistaken, discussed in Russell and Norvig - the standard textbook for AI 101.
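For anyone who hasn't met the formalism, partial observability is standardly captured by the POMDP tuple (notation as in the classic literature):

$$\langle S, A, T, R, \Omega, O \rangle$$

where $S$ is the set of states, $A$ the actions, $T(s' \mid s, a)$ the transition model (nondeterministic actions), $R(s, a)$ the reward, $\Omega$ the set of observations, and $O(o \mid s', a)$ the observation model; full observability is just the special case where the observation identifies the state exactly.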
Now, maybe the authors _do_ know these concepts and chose to discuss them at a pre-undergrad level on purpose. But whether they don't know the basics or are pretending not to because their audience doesn't - it signals that folks working on/with the new AI wave (authors or audience) have not read the basic literature of classic AI.
The bastardization of the word 'agent' is related to this, but I'll stop here at the risk of going too far into 'old man yells at clouds' territory - as if I'd never seen technical terms co-opted for hype before.
Yes, we're familiar with the terminology and framing of GOFAI. FWIW, I read (most of) the 3rd edition of Russell and Norvig in my undergrad days.
However, the point we're trying to make here is at a higher level of abstraction.
Basically most demos of agents you see these days don't prioritize reliability. Even a Copilot use case is quite a bit less demanding than a really frustrated user trying to get a refund or locate a missing order.
I'm not sure putting that in the language of POMDPs would improve things for the reader, rather than just making us look more well-read.
But your feedback is noted!
Would that be old man yelling at atmospheric clusters of water droplets or old man yelling at remote datacenters behind an API?