Let’s Rag and Role! 

Robot rock star

When ChatGPT first came out, the most common complaint was: “It is trained on data from 2021!” In response to this concern, OpenAI introduced plugins, and the browser plugin was aimed directly at this criticism. It integrated live web searches into ChatGPT by running queries through Bing and inserting the results into the LLM prompt, a work-around for the model’s lack of access to real-time information.
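
To make that pattern concrete, here is a minimal sketch of the “search, then stuff the prompt” idea. It is not the plugin’s actual implementation: the web_search() helper is a hypothetical stand-in for a Bing API call, and the model name is just an example.

```python
# Sketch of the "search, then prompt" pattern: fetch fresh results, put them
# in the prompt, and let the model answer with newer-than-training data.
from openai import OpenAI

client = OpenAI()

def web_search(query: str, max_results: int = 3) -> list[str]:
    """Hypothetical stand-in for a real search call (e.g. a Bing API request)."""
    # In a real integration this would hit a search provider and return snippets.
    return [f"(no live search wired up; pretend these are results for: {query})"]

def answer_with_live_search(question: str) -> str:
    snippets = web_search(question)
    # Put the fresh search results directly into the prompt so the model can
    # answer with information newer than its training cut-off.
    context = "\n\n".join(snippets)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[
            {"role": "system", "content": "Answer using the search results provided."},
            {"role": "user", "content": f"Search results:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```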

Very quickly, we realized we could take knowledge, data, and content from our own repositories and mimic what this plugin was doing. But there was a snag: the context size was capped at about 50 pages of text. Since context builds with each interaction, in practice we had much less than 50 pages to work with. Not only was the context size a hurdle, but the costs of managing a huge context could skyrocket. I mean, no one’s going to upload the entire Library of Congress into a prompt, not when it could run you over $40 million!

Enter stage right: RAG (you can dive deeper into RAG on our blog). RAG tucks knowledge away in a vector database and hunts through it to find content that may help answer your question. The snippets it pulls are far smaller than the total knowledge stored, making it a cost-effective option. It then sends these snippets to a large language model (LLM) to generate an answer. RAG does introduce an extra layer (search) that may not perform as well as a pure LLM, but it remains a promising method, and even though it’s in its infancy, you are sure to see it offered by service desk solutions.
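
If you want to see the moving parts, the sketch below shows the whole loop in miniature: embed documents into an in-memory “vector database,” retrieve the closest snippets for a question, and hand only those snippets to the LLM. The sample documents, model names, and helper functions are illustrative assumptions; a production system would use a real vector store.

```python
# Minimal RAG sketch: an in-memory "vector database" built from embeddings,
# nearest-neighbour search, and an answer generated from retrieved snippets only.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Toy knowledge base; in practice this would be your policies, KB articles, etc.
documents = [
    "Policy: employees accrue 1.5 vacation days per month.",
    "Guide: managers must approve expense reports within 5 business days.",
]
doc_vectors = embed(documents)  # the knowledge "tucked away" as vectors

def retrieve(question: str, k: int = 2) -> list[str]:
    q = embed([question])[0]
    # Cosine similarity between the question and every stored document.
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def rag_answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content
```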

What about RAG in the Enterprise? 

Integrating RAG into the enterprise realm is a natural progression of the browser plugin. By tapping into an organization’s wealth of resources, like policy documents or knowledge base articles, RAG can tailor its responses to reflect your company’s or school’s guidelines. However, the leap into the enterprise space isn’t without its challenges. High standards for data security and a nuanced understanding of different roles are paramount; after all, enterprise solutions aren’t quite like their consumer counterparts.

Consider the scenario where different unions offer distinct benefits and perks. It’s crucial that responses from a bot are precisely aligned with the specific union’s contract. Additionally, there are layers of information accessibility, such as certain details that should only be visible to managers. These intricacies certainly add a layer of complexity to implementing enterprise RAG effectively. 
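
One way to tame that complexity is to enforce roles at retrieval time, before anything reaches the LLM. The sketch below is purely illustrative (the field names and access rules are our assumptions, not a description of any particular product): every stored chunk carries access metadata, and retrieval filters on the requesting user’s profile.

```python
# Sketch of role-aware retrieval: each stored chunk carries metadata about who
# may see it, and retrieval filters on the user's profile before generation.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Chunk:
    text: str
    union: Optional[str] = None   # which union contract it belongs to, if any
    managers_only: bool = False   # visible to managers only

@dataclass
class User:
    union: str
    is_manager: bool

def allowed(chunk: Chunk, user: User) -> bool:
    if chunk.managers_only and not user.is_manager:
        return False
    if chunk.union is not None and chunk.union != user.union:
        return False
    return True

def filter_for_user(chunks: list[Chunk], user: User) -> list[Chunk]:
    # Apply the access rules *before* generation, so the model never sees
    # content the user is not entitled to.
    return [c for c in chunks if allowed(c, user)]
```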

Enterprise RAG Use Case 

For this blog post, we conducted a small experiment using OpenAI’s Assistants API, which features built-in RAG capabilities. We uploaded two documents into its document store: one was a manager’s guide to interviewing candidates, and the other a guide for employees or students preparing for interviews. These documents present a unique challenge because they cover the same topic but cater to different audiences.
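
For readers who want to try something similar, a setup along these lines can be built with the OpenAI Python SDK. This is our own sketch rather than the exact code used for the experiment: the file names are placeholders, and the parameters shown (the retrieval tool, file_ids, and so on) come from the original Assistants API beta and have since changed in newer versions.

```python
# Illustrative setup with the OpenAI Python SDK (Assistants API beta).
# File names are placeholders; parameter names differ in newer API versions.
from openai import OpenAI

client = OpenAI()

# Upload the two guides into the assistant's document store.
manager_guide = client.files.create(
    file=open("manager_interview_guide.pdf", "rb"), purpose="assistants"
)
candidate_guide = client.files.create(
    file=open("candidate_interview_guide.pdf", "rb"), purpose="assistants"
)

# Create an assistant with the built-in retrieval (RAG) tool attached.
assistant = client.beta.assistants.create(
    name="Interview guide bot",
    instructions="Answer questions using the attached interview guides.",
    model="gpt-4-turbo-preview",  # example model name
    tools=[{"type": "retrieval"}],
    file_ids=[manager_guide.id, candidate_guide.id],
)

# Ask the first test question on a fresh thread.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="How do you prepare for an interview?"
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
```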

We began with the question, “How do you prepare for an interview?” 

The RAG model had content addressing both the interviewer’s and the interviewee’s perspectives. We anticipated a response that would reflect both viewpoints, but unfortunately, that wasn’t the case: 

The response exclusively addressed interviewees. Given that language models operate on probabilities, and that someone asking this question is more likely to be an interviewee than an interviewer, the lean makes some sense. However, it was surprising and a bit concerning that the model completely overlooked one of the documents and perspectives.

Next, we asked, “What tips do you have on interviews?” Expecting an interviewee-focused reply, the result surprised us again: 

Our answer here is clearly for interviewers! The bot knows about both documents and has indexed their content, but why and how it chooses one path over another is a mystery (as with most things LLM). 

Thinking it might help to specify our role, we told the bot we were managers to see if it would alter its responses: 

These instructions had no effect on the output for the previous questions. Ok, as a final test, we put information about the user’s role directly in the question itself.
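
For completeness, here is roughly how those two ways of supplying the user’s role could look with the Assistants API. Again, this is a sketch rather than the exact code we ran; the thread and assistant IDs are placeholders, and additional_instructions is a run-level parameter from the Assistants API beta.

```python
# Two ways to hand the user's role to the assistant (illustrative sketch).
from openai import OpenAI

client = OpenAI()

def ask_with_role_instruction(thread_id: str, assistant_id: str, question: str, role_hint: str):
    # Option 1: pass the role as extra instructions on the run, separate from the question.
    client.beta.threads.messages.create(thread_id=thread_id, role="user", content=question)
    return client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
        additional_instructions=f"The user asking questions is a {role_hint}.",
    )

def ask_with_role_in_question(thread_id: str, assistant_id: str, question: str, role_hint: str):
    # Option 2: fold the role into the question text itself, as in our final test.
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=f"I am a {role_hint}. {question}",
    )
    return client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant_id)
```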

The results were the oddest yet, blending together content for both interviewers and interviewees.

This exercise shows that any deployment of an LLM with RAG in an enterprise setting needs to be carefully thought out, planned and tested! 

The Ida Approach 

Ida is designed with an enterprise-first mindset. From the outset, we’ve prioritized security and system integration, making these features the core of Ida’s DNA. Ida excels at using RAG to quickly enhance capabilities and deepen understanding. Equally, it recognizes when personalization and role sensitivity are crucial, adeptly managing both at once through its Multi-level AI system.

Our strategy allows client administrators to tailor Ida for personalized and predictable results in situations that demand specific attention, while still offering more flexible, ChatGPT-style interactions when possible. With Ida, you’re not confined to a single approach; instead, you can customize your chatbot experience to fit your needs. 

This flexibility is achievable because Ida operates on dedicated customer tenancies and utilizes bespoke AI models, ensuring your data and insights remain secure and protected. 

Wrapping Up 

If you are working on RAG or similar challenges and want to have a chat about your use case, contact us below.  

Contact Us