While there are big ambitions for virtual engineers and other self-guiding agentic applications, today's estimates suggest that 83-90% of AI inferences serve internet searches. On a related note, chatbots are now said to account for nearly 60% of internet traffic. Search and support are the biggest market drivers for automation and have unquestionably improved through AI. Search gets closer to what you want in one pass, and chatbots depend on retrieving domain-specific information in response to a question. In either case, RAG – retrieval augmented generation – plays an important role in finding the most relevant sources for a search or chat response.

Or so you hope. My experience is that the RAG results I get from a basic search/question are most useful for simple (one-part) questions in areas where I have no expertise. The more expertise I have, or the more complex my question, the less useful I find the response. I can do somewhat better by adding context (You are an expert in … My question is …). Asking for citations also helps. But even these tricks don’t always work. The problem is that RAG as originally conceived (2020) has limitations. I thought it would be interesting to look at advances in this field, viewing them for variety from both a business perspective and a healthcare perspective. In AI, it seems our needs and priorities are not so different.
A business perspective from Elastic and Cohere
Target applications here cover a wide range in business: finance, public sector, energy, media, etc. I found this webinar, which presents a combination of these two technologies with particular emphasis on RAG: the basics, challenges, and advances.
First, a quick note on RAG. LLMs are trained on publicly accessible corpora. RAG training derives information from separate and typically internal proprietary sources: PDFs, spreadsheets, images, etc. This information is chunked in some manner (e.g. paragraphs in PDF text) and each chunk is encoded as a vector, so that similarity (scalar products of vectors) places related objects close together and unrelated objects far apart. Chunks in training data must be expert (human) labeled.
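To make the chunk-and-embed step concrete, here is a minimal sketch in Python. The toy_embed function is only a hashed bag-of-words stand-in for whatever embedding model a real deployment would call (a Cohere or open-source embedding model, for instance), so the example runs on its own; the document text is invented for illustration.

```python
# Minimal sketch of chunking and embedding. toy_embed is a hashed bag-of-words
# stand-in for a real embedding model; swap in your model's embedding call.
import hashlib
import numpy as np

def toy_embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy embedding: hash each token into a fixed-size vector, then normalize."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[int(hashlib.md5(token.encode()).hexdigest(), 16) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def chunk_paragraphs(text: str) -> list[str]:
    """One simple chunking strategy: split on blank lines (paragraphs)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

document = "Our return policy allows 30 days.\n\nShipping takes 5-7 business days."
index = [(chunk, toy_embed(chunk)) for chunk in chunk_paragraphs(document)]

# At query time the question is embedded the same way and compared to each chunk;
# the dot product of normalized vectors is the cosine similarity.
query = toy_embed("how long does shipping take?")
ranked = sorted(index, key=lambda item: float(query @ item[1]), reverse=True)
print(ranked[0][0])   # the shipping paragraph scores highest
```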
Retrieval then uses a mix of keyword matching and similarity-based search to develop a top-ranked set of responses to your question. RAG is more accurate in retrieval than a general-purpose LLM because it can exploit semantic understanding based on similarity matching between a query and labeled training data.
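As a sketch of how the keyword and similarity rankings might be merged, reciprocal rank fusion is one common approach (not necessarily the exact method the webinar demonstrates); the chunk IDs and rankings below are made up for illustration.

```python
# Hedged sketch: fuse a keyword (BM25-style) ranking and a vector-similarity
# ranking with reciprocal rank fusion. Chunk IDs and rankings are illustrative.
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Combine several ranked lists of chunk IDs into a single fused ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

keyword_hits = ["chunk_12", "chunk_03", "chunk_27"]    # from keyword matching
vector_hits = ["chunk_03", "chunk_27", "chunk_45"]     # from embedding similarity
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# chunk_03 and chunk_27, found by both methods, rise to the top of the fused list.
```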
So far this is naive RAG, with known limitations. It struggles when the answer requires a wider understanding of a source or of multiple sources, or when the question has multiple clauses and demands sequential reasoning.
You know what’s coming next: agentic RAG, also called advanced RAG. To address these limitations a system must develop a plan of attack, perform multiple hops of reasoning, and self-reflect/verify after each step, potentially triggering rework. This is what the agentic approach adds. As soon as a question/request becomes even moderately complex, resolution must turn agentic, even in RAG. Tools used to support such agentic flows in business applications might include Microsoft Office, CRM systems, or SQL databases.
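A minimal sketch of such an agentic loop follows, showing the plan / retrieve / verify / retry structure. The llm() and retrieve() functions are placeholders to be wired to your model and search stack, not real APIs; only the control flow matters here.

```python
# Hedged sketch of an agentic RAG loop: plan sub-questions, retrieve evidence
# for each, self-check the draft answer, and retry with a rewritten query.
# llm() and retrieve() are placeholders, not calls to any real library.
def llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM")

def retrieve(query: str) -> list[str]:
    raise NotImplementedError("wire this to your hybrid retriever")

def agentic_answer(question: str, max_retries: int = 2) -> str:
    plan = llm(f"Break this into ordered sub-questions, one per line: {question}")
    findings = []
    for sub_q in plan.splitlines():
        for _ in range(max_retries + 1):
            evidence = retrieve(sub_q)
            draft = llm(f"Answer '{sub_q}' using only this evidence: {evidence}")
            verdict = llm(f"Is this answer supported by the evidence? yes/no: {draft}")
            if verdict.strip().lower().startswith("yes"):
                findings.append(draft)
                break
            # Self-reflection failed: rework the retrieval query and try again.
            sub_q = llm(f"Rewrite the query to find better evidence for: {sub_q}")
    return llm(f"Combine these findings into an answer to '{question}': {findings}")
```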
For completeness, a further advance you may encounter is modular RAG. These systems take a building-block approach, letting you structure pipelines that blend retrieval and refinement stages.
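A sketch of the modular idea, assuming nothing beyond plain Python: each stage (retrieve, rerank, compress, generate) is an interchangeable callable over a shared context, so pipelines can be rearranged without rewriting the rest of the system. The stage names in the comment are illustrative only.

```python
# Hedged sketch of modular RAG: a pipeline assembled from interchangeable
# stages, each a callable that reads and enriches a shared context dict.
from typing import Callable

Stage = Callable[[dict], dict]

def run_pipeline(stages: list[Stage], question: str) -> dict:
    ctx = {"question": question}
    for stage in stages:
        ctx = stage(ctx)
    return ctx

# Stages can be swapped in or out without touching the rest of the pipeline, e.g.:
# pipeline = [hybrid_retrieve, rerank, compress_context, generate_answer]
# result = run_pipeline(pipeline, "Why did Q3 margins drop in the EMEA region?")
```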
A healthcare perspective from Kent State and Rutgers
Here I draw on a long but very interesting paper. The authors suggest the following as key applications in healthcare: diagnostic assistance by retrieving information on similar cases; summarizing health records and discharge notes; answering complex medical questions; educating patients and tailoring responses to user profiles; matching candidates to clinical trials; and retrieving and summarizing biomedical literature, especially recent literature, in response to a clinical or research query.
The authors note a range of challenges in retrieving information. Obviously such a system must handle a wide range of data types (modalities), from doctor notes to X-rays, EKG traces, lab results, etc. It must also contend with a wide range of potentially incompatible health record sources, some with technically precise notes (myocardial infarction), some less precise (heart attack). Users face challenges in understanding the credibility of sources (media health articles versus Reddit versus respected journals in a field) and how these contribute to ranking conclusions. Familiar challenges even in our field.
There is a longer list, from which I’ll call out one widely relevant item: the need to continuously update as new research, drugs and treatments emerge, and to deprecate outdated sources. In a medical context the authors suggest that manual updates would be too slow and error-prone, and that any useful RAG system for their purposes must build continuous update into the system.
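A minimal sketch of what building continuous update into the index might look like: newly published documents are merged in with a publication date, retracted sources are dropped, and anything past a validity window is pruned before retrieval. The five-year window and the field names are assumptions for illustration, not anything the paper specifies.

```python
# Hedged sketch of continuous index maintenance: merge newly published
# documents, drop retracted ones, and prune anything past a validity window.
# The five-year window and field names are illustrative assumptions.
from datetime import datetime, timedelta

VALIDITY_WINDOW = timedelta(days=5 * 365)

def refresh_index(index: list[dict], new_docs: list[dict], retracted_ids: set[str]) -> list[dict]:
    """Each doc is a dict with at least 'doc_id' and 'published' (a datetime)."""
    now = datetime.now()
    merged = index + new_docs
    return [
        doc for doc in merged
        if doc["doc_id"] not in retracted_ids
        and now - doc["published"] <= VALIDITY_WINDOW
    ]
```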
They look at tradeoffs between the three RAG architectures mentioned earlier (naive, advanced, and modular). They find naive RAG easy to set up and use, though for their purposes too noisy and risky for high-stakes scenarios. Advanced RAG is more promising for diagnostic support and EHR summarization, striking a balance between factual grounding and speed, but it requires significant compute resources (presumably an on-prem datacenter). This method looks most ready today for clinical use, at least in hospitals and large clinics. They see modular RAG as interesting for ongoing research, though training and resource costs make it impractical for near-term deployment.
Relevance to design automation
Accuracy is critical for technical support in our domain, whether internal or external. Our users are very knowledgeable and intolerant of beginner-level suggestions. The experiences above suggest that advanced/agentic RAG may be the most appropriate method to deploy for support here.
That guidance should also aim to avoid the mistakes made in some ambitious all-AI rollouts (Klarna customer support, for example). Safeguards should certainly include an emphasis on “don’t know” responses for suggestions with low support, explainability for the top candidate responses offered, and methods to escalate to a human expert when the bot is uncertain. I am starting to see some of this in general customer support.
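A minimal sketch of those safeguards: answer only when retrieval support is strong, cite the supporting sources, and escalate otherwise. The threshold value and field names are assumptions for illustration, not anything a particular vendor uses.

```python
# Hedged sketch of guardrails: abstain on weak support, cite sources,
# escalate to a human expert when uncertain. Threshold is illustrative.
CONFIDENCE_THRESHOLD = 0.75   # assumption: tuned on a held-out evaluation set

def respond(question: str, retrieved: list[dict]) -> dict:
    """retrieved: [{'chunk': str, 'score': float, 'source': str}, ...] sorted by score."""
    if not retrieved or retrieved[0]["score"] < CONFIDENCE_THRESHOLD:
        return {
            "answer": "I don't know enough to answer this reliably.",
            "action": "escalate_to_human_expert",
        }
    top = retrieved[0]
    return {
        "answer": f"Based on {top['source']}: {top['chunk']}",
        "citations": [doc["source"] for doc in retrieved[:3]],   # explainability
        "action": "answered",
    }
```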
Meantime, agentic RAG can make a big difference in productivity and user satisfaction for in-house and external users. Most of us would prefer to explore on our own supported by effective agentic RAG, only turning to a human expert when we’re not making progress. That’s technology worth supporting.