
Naive NoSQL Conversational History Retrieval for Dummies


What is persistent memory in Generative AI?

Generative AI, particularly conversational AI, has made significant strides in the last few years, thanks to the development of Large Language Models (LLMs) like GPT-3 and its successors. These models can generate human-like text and carry on conversations that are often indistinguishable from those with a human. However, one of the key challenges in creating truly human-like conversations is the ability to remember and recall information from previous interactions. This is where persistent memory comes into play. In this article, we’ll explore the concept of persistent memory in Generative AI and its significance in creating more human-like conversations.

One of the key aspects of implementing persistent memory in Generative AI is enabling the AI to remember and recall information from previous interactions. To do this effectively, the AI needs to be able to store and retrieve information from a dataset of previous interactions. How that dataset is stored and retrieved depends on the architecture of the implementation, but it will most likely involve a database or some other storage system. Two common database systems used in AI are NoSQL and vector databases.

A Typical User Interaction


A typical user interaction with a conversational AI might look like this: the user sends a message (e.g. "Hello, how are you?") and the AI replies ("I'm good, thanks. How can I help you today?").

We can iteratively build the dataset in the NoSQL database by storing each user message and AI response as a document.
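This storage step can be sketched as follows. Note this is a minimal in-memory stand-in for a document store, used purely for illustration; a real implementation would call a NoSQL client instead (for example, MongoDB's `insert_one`), and the `append_message` helper is a name chosen here, not part of any library.

```python
from datetime import datetime, timezone

# In-memory stand-in for a NoSQL collection, keyed by conversation_id.
# A real implementation would call something like collection.insert_one(doc).
store = {}

def append_message(conversation_id, sender, message):
    """Store one turn of the conversation as a document."""
    doc = {
        "conversation_id": conversation_id,
        "sender": sender,
        "message": message,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    store.setdefault(conversation_id, []).append(doc)
    return doc

# Each turn of the interaction becomes one document.
append_message("12345", "user", "Hello, how are you?")
append_message("12345", "ai", "I'm good, thanks. How can I help you today?")
```

Each call appends one document, so the conversation history grows turn by turn exactly as the interaction unfolds.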

In this article, we’ll investigate an implementation of the Naive NoSQL conversational history retrieval strategy, which involves using a NoSQL database to store and retrieve conversational history, which the AI can then use to generate responses to new user inputs. We’ll explore how this strategy works, its benefits and limitations, and how it can be optimized for better performance.

Dataset & Prompt


An SQL representation of the sample dataset could look something like this:

| conversation_id | sender | message                                      | created_at           |
|-----------------|--------|----------------------------------------------|----------------------|
| 12345           | user   | Hello, how are you?                          | 2024-02-19T12:00:00Z |
| 12345           | ai     | I'm good, thanks. How can I help you today?  | 2024-02-19T12:01:00Z |

And in a NoSQL database, it could look something like this:

{
  "conversation_id": "12345",
  "messages": [
    {
      "conversation": {
        "id": "12345",
        "conversation_id": "12345",
        "sender": "user",
        "message": "Hello, how are you?",
        "created_at": "2024-02-19T12:00:00Z"
      }
    },
    {
      "conversation": {
        "id": "12346",
        "conversation_id": "12345",
        "sender": "ai",
        "message": "I'm good, thanks. How can I help you today?",
        "created_at": "2024-02-19T12:01:00Z"
      }
    }
  ]
}

In this schema, each conversation turn consists of a user message and an AI response. The created_at field is a timestamp indicating when the message was sent; it will be useful later when we want to retrieve messages in chronological order, and it helps the AI pin information to a point in time. To ensure the AI makes good use of the dataset when responding to a user message, we need to build a prompt that is included in every message sent to the AI.
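Chronological retrieval can be sketched like this. It is a hedged illustration over plain Python lists: ISO-8601 UTC timestamps sort correctly as strings, so a lexicographic sort on created_at is enough here, whereas a real document store would use a sorted, limited query instead.

```python
def get_history(messages, limit=20):
    """Return the most recent `limit` messages in chronological order.

    ISO-8601 timestamps in UTC compare correctly as plain strings,
    so a lexicographic sort on created_at gives chronological order.
    """
    ordered = sorted(messages, key=lambda m: m["created_at"])
    return ordered[-limit:]

# Messages may arrive out of order from the store; get_history fixes that.
history = get_history([
    {"sender": "ai", "message": "Hi Bob. Nice to meet you.",
     "created_at": "2024-02-19T12:01:00Z"},
    {"sender": "user", "message": "Hello, my name is Bob.",
     "created_at": "2024-02-19T12:00:00Z"},
])
```

The `limit` parameter caps how much history we inject into the prompt, which matters once conversations get long.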

Here’s an example of what it could look like:

Instruction 1: Below is your conversation History. Draw inspiration from it to respond to the user's message.

HISTORY:

1. Date: 2024-02-19T12:00:00Z. Sender: User. Message: "Hello, my name is Bob."
2. Date: 2024-02-19T12:01:00Z. Sender: AI. Message: "Hi Bob. Nice to meet you."

Instruction 2: Below is the user's latest message. Use the Conversation History above to respond to it.

MESSAGE:

Date: 2024-02-19T12:02:00Z. Sender: User. Message: "Can you recall my name?"

By constructing the prompt above, we ensure that the AI has access to both the conversation history and the user’s latest message, which helps it generate a more relevant response. The instructions should be clear and visually separated from the conversation history and the latest message, so that the AI can easily distinguish between them. Also, LLMs love bullet points and numbered lists, so use them as much as possible.
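Assembling this prompt from stored documents is simple string formatting. The sketch below builds exactly the layout shown above; `build_prompt` and `SENDER_LABELS` are names invented here for illustration.

```python
SENDER_LABELS = {"user": "User", "ai": "AI"}

def format_line(m):
    """Render one stored message in the Date/Sender/Message layout."""
    sender = SENDER_LABELS.get(m["sender"], m["sender"])
    return (f'Date: {m["created_at"]}. Sender: {sender}. '
            f'Message: "{m["message"]}"')

def build_prompt(history, latest):
    """Combine instructions, numbered history, and the latest message."""
    lines = [
        "Instruction 1: Below is your conversation History. "
        "Draw inspiration from it to respond to the user's message.",
        "",
        "HISTORY:",
        "",
    ]
    lines += [f"{i}. {format_line(m)}" for i, m in enumerate(history, 1)]
    lines += [
        "",
        "Instruction 2: Below is the user's latest message. "
        "Use the Conversation History above to respond to it.",
        "",
        "MESSAGE:",
        "",
        format_line(latest),
    ]
    return "\n".join(lines)

history = [
    {"sender": "user", "message": "Hello, my name is Bob.",
     "created_at": "2024-02-19T12:00:00Z"},
    {"sender": "ai", "message": "Hi Bob. Nice to meet you.",
     "created_at": "2024-02-19T12:01:00Z"},
]
latest = {"sender": "user", "message": "Can you recall my name?",
          "created_at": "2024-02-19T12:02:00Z"}
prompt = build_prompt(history, latest)
```

The resulting string is what gets sent to the LLM alongside (or as part of) the user's message.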

Some Considerations


It’s only natural that there may be several considerations to be addressed. Here are a few of them:

- What if there is no conversation history?
- What if the conversation history is too long?
- Is this solution scalable?
- What are the costs like?
- How effective is this solution?
- Can we do better?


This strategy is a good starting point, but we can optimize the solution to make it more efficient, more effective, and potentially cheaper. Let’s explore a couple of ways to do this:

Summarize the conversation history

Step 1: summary

Instruction 1: Below is your conversation History. Summarize it to be shorter. Include all relevant information and do not exceed 500 characters.

HISTORY:

1. Date: 2024-02-19T12:00:00Z. Sender: User. Message: "Hello, my name is Bob."
2. Date: 2024-02-19T12:01:00Z. Sender: AI. Message: "Hi Bob. Nice to meet you."

Step 2: Use the summary to get a response

Instruction 1: Below is your Summarized History. Draw inspiration from it to respond to the user's message.

SUMMARY:
On 2024-02-19T12:00, the user introduced himself as Bob. On 2024-02-19T12:01, the AI greeted him warmly.

Instruction 2: Below is the user's latest message. Use the Summary above to respond to it.

MESSAGE:
Date: 2024-02-19T12:02:00Z. Sender: User. Message: "Can you recall my name?"
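The two-step flow can be sketched as a pair of prompt builders. The actual LLM call between the steps is deliberately left out; both function names and the 500-character constant are illustrative choices matching the instructions above, not a fixed API.

```python
MAX_SUMMARY_CHARS = 500  # matches the limit stated in the instruction

def build_summary_prompt(history):
    """Step 1: ask the model to compress the history into a short summary."""
    lines = [
        "Instruction 1: Below is your conversation History. Summarize it "
        "to be shorter. Include all relevant information and do not "
        f"exceed {MAX_SUMMARY_CHARS} characters.",
        "",
        "HISTORY:",
        "",
    ]
    for i, m in enumerate(history, 1):
        lines.append(f'{i}. Date: {m["created_at"]}. Sender: {m["sender"]}. '
                     f'Message: "{m["message"]}"')
    return "\n".join(lines)

def build_response_prompt(summary, latest):
    """Step 2: answer the latest message using the summary, not the
    full history, keeping the request short."""
    return "\n".join([
        "Instruction 1: Below is your Summarized History. Draw inspiration "
        "from it to respond to the user's message.",
        "",
        f"SUMMARY:\n{summary}",
        "",
        "Instruction 2: Below is the user's latest message. Use the Summary "
        "above to respond to it.",
        "",
        "MESSAGE:",
        f'Date: {latest["created_at"]}. Sender: User. '
        f'Message: "{latest["message"]}"',
    ])

step1 = build_summary_prompt([
    {"sender": "User", "message": "Hello, my name is Bob.",
     "created_at": "2024-02-19T12:00:00Z"},
])
# In a real pipeline, the LLM's answer to step1 becomes `summary` here.
summary = "On 2024-02-19T12:00, the user introduced himself as Bob."
step2 = build_response_prompt(
    summary,
    {"message": "Can you recall my name?",
     "created_at": "2024-02-19T12:02:00Z"},
)
```

Two round trips per turn is the price of this approach; in practice the summary can also be cached and updated incrementally instead of rebuilt every time.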

Positives:

- Shorter prompts mean fewer tokens per request, which lowers cost and frees up context-window space as the conversation grows.

Negatives:

- Summarization is lossy: details the user later asks about may be dropped, and producing the summary requires an extra LLM call per request.

Adding metadata to the conversation

In SQL, the dataset would look something like this:

| conversation_id | sender | message                                      | created_at           | topic           | sentiment       | type            |
|-----------------|--------|----------------------------------------------|----------------------|-----------------|-----------------|-----------------|
| 12345           | user   | Hello, how are you?                          | 2024-02-19T12:00:00Z | introduction    | positive        | casual          |
| 12345           | ai     | I'm good, thanks. How can I help you today?  | 2024-02-19T12:01:00Z | introduction    | pleasant        | casual          |

And in JSON format:

{
  "conversation_id": "12345",
  "messages": [
    {
      "conversation": {
        "id": "12345",
        "conversation_id": "12345",
        "sender": "user",
        "message": "Hello, how are you?",
        "created_at": "2024-02-19T12:00:00Z",
        "metadata": {
          "topic": "introduction",
          "sentiment": "positive",
          "type": "casual"
        }
      }
    },
    {
      "conversation": {
        "id": "12346",
        "conversation_id": "12345",
        "sender": "ai",
        "message": "I'm good, thanks. How can I help you today?",
        "created_at": "2024-02-19T12:01:00Z",
        "metadata": {
          "topic": "introduction",
          "sentiment": "pleasant",
          "type": "casual"
        }
      }
    }
  ]
}

Our search would change a little. For example, we could extract keywords from the user’s latest message and query the NoSQL database for the last N messages with a specific topic, sentiment, or type — the ones most relevant to that message.

With this optimization, the AI can use the metadata to generate more relevant responses, since it will have more information about the context of the conversation. For example, if the conversation is about a casual topic, the AI can generate a more casual response. If the conversation is about a formal topic, the AI can generate a more formal response.
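Metadata-filtered retrieval can be sketched like this. Again, this is an in-memory illustration with invented names (`filter_by_metadata`); a document store would express the same filters as a query with a sort and a limit.

```python
def filter_by_metadata(messages, topic=None, sentiment=None,
                       msg_type=None, limit=5):
    """Return the last `limit` messages whose metadata matches every
    supplied filter; filters left as None are ignored."""
    def matches(m):
        md = m.get("metadata", {})
        return ((topic is None or md.get("topic") == topic)
                and (sentiment is None or md.get("sentiment") == sentiment)
                and (msg_type is None or md.get("type") == msg_type))

    hits = sorted((m for m in messages if matches(m)),
                  key=lambda m: m["created_at"])
    return hits[-limit:]

messages = [
    {"sender": "user", "message": "Hello, how are you?",
     "created_at": "2024-02-19T12:00:00Z",
     "metadata": {"topic": "introduction", "sentiment": "positive",
                  "type": "casual"}},
    {"sender": "ai", "message": "I'm good, thanks. How can I help you today?",
     "created_at": "2024-02-19T12:01:00Z",
     "metadata": {"topic": "introduction", "sentiment": "pleasant",
                  "type": "casual"}},
]
casual = filter_by_metadata(messages, msg_type="casual")
```

Only the filtered messages then go into the prompt, so the injected history stays both short and on-topic.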

Positives:

- Metadata lets us filter retrieval by topic, sentiment, or type, so the history injected into the prompt is more relevant to the user’s latest message.

Negatives:

- Every message must be classified as it is stored, which adds processing cost, and keyword-based matching is still far less precise than semantic search.

Closing Thoughts


Persistent memory in Generative AI is a crucial component that allows the AI to remember and recall information from previous interactions. In this article, we explored the Naive NoSQL conversational history retrieval strategy, in which a NoSQL database stores the conversational history that the AI then uses to generate responses to new user inputs. We also explored its benefits and limitations, and a few ways it can be optimized to increase its memory capacity, relevance, and efficiency while keeping costs down.

There are better ways to search conversation history, such as using a vector database, which relies on similarity search to find related messages and can be both faster and more accurate than keyword matching. We can also combine NoSQL and vector databases to store and retrieve the conversation history, getting the strengths of both. Stay tuned for more on that in future articles.

Interested in collaborating? Got any cool ideas? Feel free to reach out to me on GitHub, LinkedIn, or via email.

Until next time, take care and keep learning!
