Retrieval-Augmented Generation (RAG) is hard to avoid in the era of generative AI. The approach first emerged in 2020, when researchers developed a way to combine the strengths of information retrieval and generative AI: the AI first searches for relevant information, for example from a database, and then uses it to generate more accurate and informative responses.
“Organizations have vast amounts of data. Finding the exact information or document can be challenging. RAG can provide a solution to this everyday challenge,” says AI expert Iidaliisa Pardalin.
For many organizations, RAG offers the opportunity to improve information accessibility and ensure that responses are based on accurate, up-to-date information. AI can quickly assist with technical questions in customer service situations or internal information searches, always based on current data.
How does RAG work?
RAG requires a vector database, where the organization's text data is converted into numerical vectors (embeddings). The distances between these vectors indicate how closely the text contents are related to each other.
When a user inputs a question into the system, it retrieves relevant text contents (context) from the vector database and sends them along with the question (prompt) to the language model (LLM). Based on these, the language model responds to the user’s question.
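The retrieval step described above can be sketched in a few lines. This is a minimal illustration only: the toy bag-of-words "embeddings", the sample documents, and the `retrieve` function are all made up for the example, and a real system would use an embedding model and a proper vector database instead.

```python
import math

# Toy "embeddings": in a real system these come from an embedding model;
# bag-of-words counts are used here purely to illustrate the retrieval step.
def embed(text, vocab):
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    # Cosine similarity: how closely two vectors point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Invented sample documents standing in for an organization's text data.
documents = [
    "invoices are archived for seven years",
    "support tickets are answered within one business day",
]
vocab = sorted({w for d in documents for w in d.lower().split()})
index = [(d, embed(d, vocab)) for d in documents]  # the "vector database"

def retrieve(question, k=1):
    # Rank stored chunks by similarity to the question and return the top k.
    qv = embed(question, vocab)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# The retrieved context is then sent to the LLM together with the question.
context = retrieve("how long are invoices archived")[0]
prompt = f"Context:\n{context}\n\nQuestion: how long are invoices archived"
```

The key design point is that the language model never searches the data itself; the similarity search selects the context, and the model only answers based on what it is handed.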
The RAG interface is almost always chat-based, like ChatGPT. With RAG, companies can leverage their own data in AI responses.
Where is RAG best suited?
RAG works most effectively with high-quality, well-structured text-based data. This can include technical documentation, user manuals, project reports, or system code – all sources from which the language model can extract detailed and useful information for its responses. RAG helps organizations make comprehensive use of their data resources and save time.
“The best solution is achieved by addressing one use case at a time. The use case affects what data should be used in the RAG system and how it should be limited. Often, the best results are not achieved by bringing all possible data into the RAG system,” Iidaliisa emphasizes.
It’s also important to note RAG’s limitations. While RAG is effective for text-based information retrieval, language models do not handle numerical data or mathematical calculations with the same accuracy.
“A commonly used example is asking generative AI to count how many ‘r’ letters are in the word ‘strawberry.’ In this task, a preschooler still outperforms many language models.”
This is because language models are developed for linguistic analysis and do not reliably manage numerical logic.
RAG in practice: What should companies consider?
When building a RAG system, you can ensure that users receive answers based only on information they are already authorized to access. This access control is a crucial privacy issue in RAG. Naturally, GDPR must also be considered to prevent data from leaking outside the organization.
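One common way to enforce this is to attach an access label to each stored chunk and filter the candidate set by the user's groups before the similarity search runs. The sketch below is a hypothetical illustration; the document contents, the `acl` field, and the group names are all invented for the example.

```python
# Hypothetical sketch: each chunk carries an access-control list (acl),
# and retrieval only ever sees chunks the current user may read.
documents = [
    {"text": "Q3 board minutes", "acl": {"management"}},
    {"text": "Public product FAQ", "acl": {"everyone"}},
]

def visible_chunks(user_groups):
    # Filter BEFORE similarity search, so restricted text can never
    # end up in the prompt sent to the language model.
    return [d for d in documents if d["acl"] & user_groups]
```

Filtering before retrieval, rather than filtering the final answer, matters: once restricted text reaches the prompt, the model may paraphrase it to a user who should never have seen it.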
The apparent omniscience of language models is another issue that needs to be addressed.
“The reliability of the solution is increased by forcing the model to respond only based on the company’s data. If the company’s data is technical documentation, the solution should not agree to provide information about dolphins, even though the language models used as the basis of the solution know a lot about them,” Iidaliisa notes.
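In practice, this restriction is usually enforced in the prompt itself. The template below is one hedged illustration of such an instruction; the exact wording and the refusal phrase are assumptions, not a fixed recipe.

```python
# Sketch of a context-only prompt template. The instruction wording is
# illustrative; real deployments tune and test this phrasing carefully.
def build_prompt(context, question):
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, reply: "
        "'I don't know based on the provided documents.'\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

With a template like this, a question about dolphins posed to a system indexed on technical documentation retrieves no relevant context, and the instruction steers the model toward declining rather than improvising.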
For many international companies, multilingualism is a welcome feature. RAG solutions are largely language-independent, but not flawlessly so. If, for example, the company’s data is in Finnish and questions are asked in Swedish, this can cause hallucinations – responses that do not correspond to reality.
“If multilingualism is an important requirement, hallucinations can be reduced by a process where the question is translated into the data’s language before retrieving the answer.”
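That translate-before-retrieve step can be sketched as below. The `translate` function here is a stub standing in for a real translation model or API, and the Swedish/Finnish example phrase is invented for illustration.

```python
# Hypothetical sketch: normalize the question into the language of the
# indexed data before running retrieval, then answer in the user's language.
def translate(text, target_lang):
    # Stub: a real system would call a translation model or service here.
    glossary = {"hur länge arkiveras fakturor": "kuinka kauan laskut arkistoidaan"}
    return glossary.get(text.lower(), text)

def ask(question, data_lang="fi"):
    normalized = translate(question, data_lang)
    # ...run the similarity search with `normalized`, so the question
    # and the stored chunks are compared in the same language.
    return normalized
```

Because embedding similarity is strongest within one language, matching the question to the data's language before retrieval reduces the risk of pulling in weakly related context that invites hallucination.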
How to get started?
The implementation of a RAG solution should be approached carefully. The first step is to identify and prioritize use cases where better information retrieval can significantly enhance or improve operations. The next step is to identify the relevant data for these use cases and assess the current quality of the data. After this, you can move on to considering the technical implementation.
We have created a user-friendly solution for the City of Turku, making the information in the city’s case management system more easily accessible. We have also developed the ATR Codrian solution, which significantly lowers the threshold for developers to take over an unfamiliar legacy system.
We are happy to provide consultation on RAG-related topics – book a time in our sales calendar.
Written by:
Katja Toivanen, Deputy CEO
With tips and assistance from an expert and ChatGPT
6.11.2024