MemoryBank: Enhancing Large Language Models with Long-Term Memory

Zhong W, Guo L, Gao Q, Ye H, Wang Y. MemoryBank: Enhancing Large Language Models with Long-Term Memory. arXiv preprint arXiv:2305.10250, 2023.
https://github.com/zhongwanjun/MemoryBank-SiliconFriend

Memory430+940+AAAI 2024CCF-A中山大学

MemoryBank：用长期记忆增强大语言模型

Abstract

Revolutionary advancements in Large Language Models (LLMs) have drastically reshaped our interactions with artificial intelligence (AI) systems, showcasing impressive performance across an extensive array of tasks. Despite this, a notable hindrance remains—the deficiency of a long-term memory mechanism within these models. This shortfall becomes increasingly evident in situations demanding sustained interaction, such as personal companion systems, psychological counseling, and secretarial assistance.

大语言模型（LLM）的革命性进展已经极大地重塑了我们与人工智能（AI）系统的交互方式，并在广泛任务上展现出令人印象深刻的表现。尽管如此，一个显著障碍仍然存在：这些模型缺乏长期记忆机制。在需要持续交互的场景中，例如个人陪伴系统、心理咨询和秘书助理，这一短板会变得越来越明显。

Recognizing the necessity for long-term memory, we propose MemoryBank, a novel memory mechanism tailored for LLMs. MemoryBank enables the models to summon relevant memories, continually evolve through continuous memory updates, comprehend, and adapt to a user's personality over time by synthesizing information from previous interactions. To mimic anthropomorphic behaviors and selectively preserve memory, MemoryBank incorporates a memory updating mechanism, inspired by the Ebbinghaus Forgetting Curve theory. This mechanism permits the AI to forget and reinforce memory based on time elapsed and the relative significance of the memory, thereby offering a more human-like memory mechanism and enriched user experience.

认识到长期记忆的必要性，作者提出 MemoryBank，这是一种为 LLM 量身设计的新型记忆机制。 MemoryBank 使模型能够调用相关记忆，并通过持续的记忆更新不断演化；它还可以综合先前交互中的信息，逐渐理解并适应用户的人格特征。为了模拟拟人化行为并有选择地保留记忆，MemoryBank 引入了一种受艾宾浩斯遗忘曲线启发的记忆更新机制。该机制允许 AI 根据经过的时间和记忆的相对重要性来遗忘或强化记忆，从而提供更接近人类的记忆机制和更丰富的用户体验。

MemoryBank is versatile in accommodating both closed-source models like ChatGPT and open-source models such as ChatGLM. To validate MemoryBank's effectiveness, we exemplify its application through the creation of an LLM-based chatbot named SiliconFriend in a long-term AI Companion scenario. Further tuned with psychological dialog data, SiliconFriend displays heightened empathy and discernment in its interactions. Experiment involves both qualitative analysis with real-world user dialogs and quantitative analysis with simulated dialogs.

MemoryBank 具有较强的通用性，既可以适配 ChatGPT 这样的闭源模型，也可以适配 ChatGLM 这样的开源模型。为了验证 MemoryBank 的有效性，作者在长期 AI 伴侣场景中创建了一个基于 LLM 的聊天机器人 SiliconFriend，作为应用示例。在进一步使用心理对话数据调优后，SiliconFriend 在交互中表现出更强的共情能力和辨别能力。实验同时包含基于真实用户对话的定性分析，以及基于模拟对话的定量分析。

In the latter, ChatGPT acts as multiple users with diverse characteristics and generates long-term dialog contexts covering a wide array of topics. The results of our analysis reveal that SiliconFriend, equipped with MemoryBank, exhibits a strong capability for long-term companionship as it can provide emphatic response, recall relevant memories and understand user personality. This underscores the effectiveness of MemoryBank.

在后者中，ChatGPT 扮演多个具有不同特征的用户，并生成覆盖广泛主题的长期对话上下文。分析结果表明，配备 MemoryBank 的 SiliconFriend 展现出很强的长期陪伴能力，因为它能够提供共情式回应、回忆相关记忆，并理解用户人格。这凸显了 MemoryBank 的有效性。

1. Introduction

The advent of Large Language Models (LLMs) such as ChatGPT and GPT-4 has led to increasing influence across various sectors, from education and healthcare to customer service and entertainment. These powerful AI systems have demonstrated a remarkable ability to understand and generate human-like responses. Despite the remarkable capabilities of LLMs, a key limitation is their lack of long-term memory, an essential aspect of human-like communication, particularly noticeable in scenarios requiring sustained interactions like personal companionship, psychological counseling, and secretarial tasks.

ChatGPT 和 GPT-4 等大型语言模型（LLM）的出现，使其在教育、医疗、客服、娱乐等多个领域的影响力不断增强。这些强大的 AI 系统展现出了理解并生成类人回应的显著能力。尽管 LLM 能力突出，但一个关键限制是它们缺乏长期记忆；长期记忆是类人交流的重要方面，在个人陪伴、心理咨询和秘书任务等需要持续交互的场景中尤其明显。

Long-term memory in AI is vital to maintain contextual understanding, ensure meaningful interactions and understand user behaviors over time. For instance, personal AI companions need to recall past conversations for rapport building. In psychological counseling, an AI can provide more effective support with knowledge of the user's history and past emotional states. Similarly, secretarial AI requires memory for task management and preference recognition. The absence of long-term memory in LLMs hinders their performance and user experience. Therefore, it is essential to develop AI systems with improved memory capabilities for a more seamless and personalized interaction.

AI 中的长期记忆对于维持上下文理解、保证有意义的交互，以及随时间理解用户行为都至关重要。例如，个人 AI 伴侣需要回忆过去对话来建立亲近感。在心理咨询中，如果 AI 了解用户历史和过去的情绪状态，就能提供更有效的支持。类似地，秘书型 AI 也需要记忆来进行任务管理和偏好识别。 LLM 缺乏长期记忆，会限制其性能和用户体验。因此，开发具有更强记忆能力的 AI 系统，对于实现更流畅、更个性化的交互至关重要。

Therefore, we introduce MemoryBank, a novel mechanism designed to provide LLMs with the ability to retain long-term memory and draw user portraits. MemoryBank enables LLMs to recall historical interactions, continually evolve their understanding of context, and adapt to a user's personality based on past interactions, thereby enhancing their performance in long-term interaction scenarios. Inspired by the Ebbinghaus Forgetting Curve theory, a well-established psychological principle that describes how the strength of memory decreases over time, MemoryBank further incorporates a dynamic memory mechanism closely mirroring human cognitive process. This mechanism empowers the AI to remember, selectively forget, and strengthen memories based on time elapsed, offering more natural and engaging user experience. Specifically, MemoryBank is built on a memory storage with memory retrieval and updating mechanism, and ability to summarize past events and users' personality.

因此，作者提出 MemoryBank，这是一种新机制，旨在赋予 LLM 保留长期记忆并描绘用户画像的能力。 MemoryBank 使 LLM 能够回忆历史交互，不断演化其对上下文的理解，并基于过去交互适应用户人格，从而提升其在长期交互场景中的表现。 MemoryBank 还受到艾宾浩斯遗忘曲线启发；该理论是一个成熟的心理学原则，用来描述记忆强度如何随时间下降。这一机制使 AI 能够根据经过的时间来记住、有选择地遗忘并强化记忆，从而带来更自然、更有吸引力的用户体验。具体而言，MemoryBank 建立在记忆存储、记忆检索与更新机制之上，并具备总结过去事件和用户人格的能力。

MemoryBank is versatile as it can accommodate both closed-source LLMs like ChatGPT and open-source LLMs like ChatGLM or BELLE. To exemplify the practical implications of MemoryBank, we develop SiliconFriend, an LLM-based AI Companion chatbot integrated with this innovative memory mechanism. SiliconFriend is designed to retain and reference past interactions, reinforcing the transformative influence of MemoryBank in crafting a more personable AI companion. A distinctive features of SiliconFriend is its tuning with 38k psychological conversations, collected from various online sources, which enables it to exhibit empathy, carefulness, and provide useful guidance, making it adept at handling emotionally charged dialogues.

MemoryBank 具有通用性，因为它既能适配 ChatGPT 这样的闭源 LLM，也能适配 ChatGLM 或 BELLE 这样的开源 LLM。为了展示 MemoryBank 的实际意义，作者开发了 SiliconFriend，这是一个集成了该创新记忆机制、基于 LLM 的 AI 伴侣聊天机器人。 SiliconFriend 被设计为能够保留并引用过去交互，从而体现 MemoryBank 在打造更具人格感的 AI 伴侣方面的变革性影响。 SiliconFriend 的一个显著特点是使用来自多个线上来源的 3.8 万条心理对话进行调优，这使其能够表现出共情、谨慎并提供有用指导，因此更擅长处理带有强烈情绪的对话。

Moreover, one of the standout capabilities of SiliconFriend is to understand a user's personality by summarizing from past interactions, which empowers it to tailor responses to the user's individual traits, thereby enhancing user experience. Additionally, SiliconFriend supports bilingual functionality, catering to users who communicate in English and Chinese. This multi-language support broadens its accessibility and usability across different user groups. SiliconFriend is implemented with two open-source models, ChatGLM and BELLE, along with one closed-source model, ChatGPT, showcasing the versatility of MemoryBank in accommodating different LLMs.

此外，SiliconFriend 的一个突出能力是通过总结过去交互来理解用户人格，这使它能够根据用户个体特征定制回应，从而提升用户体验。另外，SiliconFriend 支持双语功能，能够服务使用英语和中文交流的用户。这种多语言支持拓宽了它在不同用户群体中的可访问性和可用性。 SiliconFriend 使用两个开源模型 ChatGLM 和 BELLE，以及一个闭源模型 ChatGPT 实现，展示了 MemoryBank 对不同 LLM 的适配能力。

To evaluate the effectiveness of MemoryBank, we conduct evaluations covering both qualitative and quantitative analyses, where the former involves real-world user dialogs and the latter employs simulated dialogs. For the quantitative analysis, we create a memory storage consisting of 10 days of conversations encompassing a diverse range of topics. These conversations involve 15 virtual users with diverse personalities, for which ChatGPT plays the role of users and generates dialog contexts according to their personalities. Based on this memory storage, we design 194 probing questions to assess whether the model could successfully recall pertinent memories and provide appropriate responses. Experiment results showcase the capabilities of SiliconFriend in memory recall, provision of empathetic companionship, and understanding of user portraits. These findings corroborate the potential of MemoryBank to significantly improve the performance of LLMs in long-term interaction scenarios.

为了评估 MemoryBank 的有效性，作者进行了同时覆盖定性和定量分析的评估：前者涉及真实用户对话，后者使用模拟对话。在定量分析中，作者构建了一个记忆存储，其中包含 10 天的对话，覆盖多种主题。这些对话涉及 15 个具有不同人格的虚拟用户；ChatGPT 扮演这些用户，并根据其人格生成对话上下文。基于这一记忆存储，作者设计了 194 个探查问题，用来评估模型能否成功回忆相关记忆并提供合适回应。实验结果展示了 SiliconFriend 在记忆回忆、共情式陪伴以及理解用户画像方面的能力。这些发现证实了 MemoryBank 在长期交互场景中显著提升 LLM 表现的潜力。

In this paper, we summarize the key contributions as follows:

We introduce MemoryBank, a novel human-like long-term memory mechanism, which enables LLMs to store, recall, update memory, and draw user portrait.
We demonstrate the practical applicability of MemoryBank through SiliconFriend, an LLM-based AI companion equipped with MemoryBank and tuned with psychological dialogs.
It can recall past memories, provide empathetic companionship, and understand user behaviors.
We show the generalizability of MemoryBank in three key aspects: (1) Accommodation of both open-source and closed-source LLMs; (2) Bilingual ability in both Chinese and English; (3) Applicability with and without memory forgetting mechanism.

本文的主要贡献总结如下：

作者提出 MemoryBank，这是一种新型类人长期记忆机制，使 LLM 能够存储、回忆和更新记忆，并描绘用户画像。
作者通过 SiliconFriend 展示了 MemoryBank 的实际适用性；SiliconFriend 是一个配备 MemoryBank、并用心理对话调优的基于 LLM 的 AI 伴侣。
它能够回忆过去记忆、提供共情式陪伴，并理解用户行为。
作者从三个关键方面展示 MemoryBank 的泛化性：（1）同时适配开源和闭源 LLM；（2）同时具备中文和英文双语能力；（3）可在启用或不启用记忆遗忘机制的情况下使用。

2. MemoryBank: A Novel Memory Mechanism Tailored for LLMs

In this section, we provide a detailed description of MemoryBank, our novel memory mechanism designed for LLMs. As shown in Figure 1, MemoryBank is a unified mechanism structured around three central pillars: (1) a memory storage (Section 2.1) serving as the primary data repository, (2) a memory retriever (Section 2.2) for context-specific memory recollection, and (3) a memory updater (Section 2.3) drawing inspiration from the Ebbinghaus Forgetting Curve theory, a time-tested psychological principle pertaining to memory retention and forgetting.

本节详细介绍 MemoryBank，这是作者为 LLM 设计的新型记忆机制。如图1 所示，MemoryBank 是一个统一机制，围绕三个核心支柱构建：（1）作为主要数据仓库的记忆存储（第 2.1 节），（2）用于特定上下文记忆回忆的记忆检索器（第 2.2 节），以及（3）受艾宾浩斯遗忘曲线启发的记忆更新器（第 2.3 节）；艾宾浩斯遗忘曲线是关于记忆保持与遗忘的经典心理学原则。

Overview of MemoryBank — 图1：MemoryBank 概览。记忆存储保存过去对话、总结事件和用户画像；记忆更新机制负责更新记忆存储；记忆检索负责回忆相关记忆。SiliconFriend 是一个基于 LLM、并由 MemoryBank 增强的 AI 伴侣。

2.1 Memory Storage: The Warehouse of MemoryBank

Memory storage, the warehouse of MemoryBank, is a robust data repository holding a meticulous array of information. As shown in Figure 1, it stores daily conversations records, summaries of past events, and evolving assessments of user personalities, thereby constructing a dynamic and multi-layered memory landscape.

记忆存储是 MemoryBank 的仓库，是一个强健的数据仓库，保存着细致组织的信息。如图1 所示，它存储每日对话记录、过去事件摘要，以及对用户人格不断演化的评估，从而构建出动态、多层次的记忆景观。

In-Depth Memory Storage

MemoryBank's storage system captures the richness of AI-user interactions by recording multi-turn conversations in a detailed, chronological fashion. Each piece of dialogue is stored with timestamps, creating an ordered narrative of past interactions. This detailed record not only aids in precise memory retrieval but also facilitates the memory updating process afterwards, offering a detailed index of conversational history.

MemoryBank 的存储系统通过以详细、按时间顺序的方式记录多轮对话，捕捉 AI 与用户交互的丰富性。每一段对话都会连同时间戳一起存储，从而形成过去交互的有序叙事。这种详细记录不仅有助于精确记忆检索，也方便后续记忆更新过程，为对话历史提供细粒度索引。

Hierarchical Event Summary

Reflecting the intricacies of human memory, MemoryBank goes beyond mere detailed storage. It processes and distills conversations into a high-level summary of daily events, much like how humans remember key aspects of their experiences. We condense verbose dialogues into a concise daily event summary, which is further synthesized into a global summary. This process results in a hierarchical memory structure, providing a bird's eye view of past interactions and significant events. Specifically, taken previous daily conversations or daily events as input, we ask the LLMs to summarize daily events or global events with the prompt “Summarize the events and key information in the content [dialog/events]”.

为了反映人类记忆的复杂性，MemoryBank 不只是进行详细存储。它会处理并提炼对话，形成每日事件的高层摘要，就像人类会记住经历中的关键方面一样。作者将冗长对话压缩为简洁的每日事件摘要，并进一步综合为全局摘要。这一过程形成了层级化记忆结构，为过去交互和重要事件提供鸟瞰式视图。具体来说，作者以前一天的每日对话或每日事件为输入，要求 LLM 使用提示“Summarize the events and key information in the content [dialog/events]”来总结每日事件或全局事件。

Dynamic Personality Understanding

MemoryBank focuses on user personality understanding. It continuously assesses and updates these understandings with the long-term interactions and creates daily personality insights. These insights are further aggregated to form a global understanding of the user's personality. This multi-tiered approach results in an AI companion that learns, adapts, and tailors its responses to the unique traits of each user, enhancing user experience. Specially, taken the daily conversations or personality analysis, we ask the LLM to deduce with prompts: “Based on the following dialogue, please summarize the user's personality traits and emotions. [dialog]” or “The following are the user's exhibited personality traits and emotions throughout multiple days. Please provide a highly concise and general summary of the user's personality [daily Personalities]”.

MemoryBank 关注对用户人格的理解。它会在长期交互中持续评估并更新这些理解，并生成每日人格洞察。这些洞察会进一步聚合，形成对用户人格的全局理解。这种多层方法使 AI 伴侣能够学习、适应，并根据每个用户的独特特征定制回应，从而提升用户体验。具体来说，作者以每日对话或人格分析为输入，要求 LLM 使用如下提示进行推断：“Based on the following dialogue, please summarize the user's personality traits and emotions. [dialog]”或“The following are the user's exhibited personality traits and emotions throughout multiple days. Please provide a highly concise and general summary of the user's personality [daily Personalities]”。

2.2 Memory Retrieval

Built on the robust infrastructure of memory storage, our memory retrieval mechanism operates akin to a knowledge retrieval task. In this context, we adopt a dual-tower dense retrieval model similar to Dense Passage Retrieval. In this paradigm, every turn of conversations and event summaries is considered as a memory piece $m$ , which is pre-encoded into a contextual representation $h_{m}$ using the encoder model $E (\cdot)$ . Consequently, the entire memory storage $M$ is pre-encoded into $M = {h_{m}^{0}, h_{m}^{1}, . . . h_{m}^{| M |}}$ , where each $h_{m}$ is a vector representation of a memory piece. These vector representations are then indexed using FAISS for efficient retrieval.

基于稳健的记忆存储基础设施，作者的记忆检索机制类似于一个知识检索任务。在这一背景下，作者采用了类似 Dense Passage Retrieval 的双塔稠密检索模型。在该范式中，每一轮对话和事件摘要都被视为一个记忆片段 $m$ ，并使用编码器模型 $E (\cdot)$ 预先编码成上下文表示 $h_{m}$ 。因此，整个记忆存储 $M$ 会被预编码为 $M = {h_{m}^{0}, h_{m}^{1}, . . . h_{m}^{| M |}}$ ，其中每个 $h_{m}$ 都是一个记忆片段的向量表示。随后，这些向量表示会使用 FAISS 建索引，以支持高效检索。

Parallel to this, the current context of conversation $c$ is encoded by $E (\cdot)$ into $h_{c}$ , which serves as the query to search $M$ for the most relevant memory. In practice, the encoder $E (\cdot)$ can be interchanged to any suitable model.

与此同时，当前对话上下文 $c$ 会由 $E (\cdot)$ 编码为 $h_{c}$ ，并作为查询去搜索 $M$ 中最相关的记忆。在实践中，编码器 $E (\cdot)$ 可以替换为任何合适的模型。

2.3 Memory Updating Mechanism

With the persistent memory storage and the memory retrieval mechanism discussed in Section 2.1 and Section 2.2, the memorization capability of LLMs can be greatly enhanced. However, for scenarios that expect more anthropopathic memory behavior, memory updating is needed. These scenarios include AI companion, virtual IP, etc. Forgetting less important memory pieces that are long time ago and have not been recalled much can make the AI companion more natural.

借助第 2.1 节和第 2.2 节讨论的持久记忆存储与记忆检索机制，LLM 的记忆能力可以得到显著增强。然而，在期望更拟人化记忆行为的场景中，还需要记忆更新。这些场景包括 AI 伴侣、虚拟 IP 等。遗忘那些发生在很久以前、且很少被回忆的不重要记忆片段，可以让 AI 伴侣显得更自然。

Our memory forgetting mechanism is inspired from Ebbinghaus Forgetting Curve theory and follow the following principle rules:

Rate of Forgetting. Ebbinghaus found that memory retention decreases over time. He quantified this in his forgetting curve, showing that information is lost rapidly after learning unless it is consciously reviewed.
Time and Memory Decay. The curve is steep at the beginning, indicating that a significant amount of learned information is forgotten within the first few hours or days after learning. After this initial period, the rate of memory loss slows down.
Spacing Effect. Ebbinghaus discovered that relearning information is easier than learning it for the first time. Regularly revisiting and repeating the learned material can reset the forgetting curve, making it less steep and thereby improving memory retention.

作者的记忆遗忘机制受到艾宾浩斯遗忘曲线理论启发，并遵循以下原则：

遗忘速率。 艾宾浩斯发现，记忆保持会随时间下降。他在遗忘曲线中对此进行了量化，表明如果不主动复习，信息会在学习后迅速流失。
时间与记忆衰减。 曲线在开始阶段很陡，表示大量已学信息会在学习后的最初几小时或几天内被遗忘。在这一初始阶段之后，记忆流失速度会变慢。
间隔效应。 艾宾浩斯发现，重新学习信息比第一次学习更容易。定期回顾和重复已学材料可以重置遗忘曲线，使其不那么陡峭，从而提升记忆保持。

The Ebbinghaus forgetting curve is expressed using an exponential decay model: $R = e^{- \frac{t}{S}}$ , where $R$ is the memory retention, or what fraction of the information can be retained. $t$ is the time elapsed since learning the information. $e$ is approximately equal to 2.71828. $S$ is the memory strength, which changes based on factors such as the depth of learning and the amount of repetition. To simply memory updating process, we model $S$ as a discrete value and initialize it with $1$ upon its first mention in a conversation. When a memory item is recalled during conversations, it will persist longer in memory. We increase $S$ by 1 and reset $t$ to 0, hence forget it with a lower probability.

艾宾浩斯遗忘曲线可以用指数衰减模型表示： $R = e^{- \frac{t}{S}}$ ，其中 $R$ 是记忆保持率，也就是能够被保留下来的信息比例。 $t$ 表示自学习该信息以来经过的时间。 $e$ 约等于 2.71828。 $S$ 表示记忆强度，它会根据学习深度和重复次数等因素变化。为了简化记忆更新过程，作者将 $S$ 建模为离散值，并在某条记忆首次出现在对话中时将其初始化为 $1$ 。当某个记忆项在对话中被回忆时，它会在记忆中保持更久。作者将 $S$ 增加 1，并把 $t$ 重置为 0，因此该记忆会以更低概率被遗忘。

It is important to note that this is an exploratory and highly simplified memory updating model. Real-life memory processes are more complex and can be influenced by a variety of factors. The forgetting curve will look different for different people and different types of information. In summary, MemoryBank weaves together these critical components to form a more comprehensive memory management system for LLMs. It enhances their ability to provide meaningful and personalized interactions over extended periods, opening up new possibilities for AI applications.

需要注意的是，这是一个探索性的、高度简化的记忆更新模型。现实生活中的记忆过程更加复杂，并会受到多种因素影响。对于不同人和不同类型的信息，遗忘曲线都会有所不同。总体而言，MemoryBank 将这些关键组件编织在一起，形成了一个更全面的 LLM 记忆管理系统。它增强了 LLM 在长时间跨度内提供有意义、个性化交互的能力，为 AI 应用打开了新的可能性。

3. SiliconFriend: An AI Chatbot Companion Powered by MemoryBank

To demonstrate the practicality of MemoryBank in the field of long-term personal AI companionship, we create an AI chatbot named SiliconFriend. It is designed to serve as an emotional companion for users, recalling pertinent user memories, and understanding users' personalities and emotional states. Our implementation demonstrates adaptability by integrating three powerful LLMs that originally lack long-term memory and specific adaptation to the psychology domain.

为了展示 MemoryBank 在长期个人 AI 陪伴领域中的实用性，作者创建了一个名为 SiliconFriend 的 AI 聊天机器人。它被设计为用户的情感陪伴者，能够回忆相关用户记忆，并理解用户的人格和情绪状态。作者的实现通过集成三个强大的 LLM 展示了适配性；这些 LLM 原本都缺乏长期记忆，也没有专门适配心理学领域。

1) ChatGPT, a closed-source conversation model built by OpenAI, is a proprietary conversational AI model known for its ability to facilitate dynamic and interactive conversations. This model is trained on vast amount of data and further fine-tuned with reinforcement learning from human feedback. This approach enables ChatGPT to generate responses that are not only contextually appropriate but also closely align with human conversational expectations.

1）ChatGPT 是 OpenAI 构建的闭源对话模型，是一个专有对话式 AI 模型，以促进动态交互式对话的能力而闻名。该模型在海量数据上训练，并进一步通过基于人类反馈的强化学习进行微调。这种方法使 ChatGPT 能够生成不仅符合上下文，而且与人类对话期望高度一致的回应。

2) ChatGLM: ChatGLM is an open-source bilingual language model founded on the General Language Model (GLM) framework. This model is characterized by its 6.2 billion parameters and its specific optimization for Chinese dialogue data. The model's training involves processing approximately one trillion tokens of Chinese and English text, supplemented by supervised fine-tuning, feedback bootstrap, and reinforcement learning with human feedback.

2）ChatGLM：ChatGLM 是一个基于 General Language Model（GLM）框架的开源双语语言模型。该模型具有 62 亿参数，并针对中文对话数据进行了专门优化。模型训练涉及约一万亿个中文和英文 token，并辅以监督微调、反馈自举，以及基于人类反馈的强化学习。

3) BELLE: BELLE is an open-source bilingual language model that is continuously fine-tuned from 7B LLaMA. BELLE's feature is its automated instruction data synthesis using ChatGPT, which enhances its Chinese conversation ability.

3）BELLE：BELLE 是一个从 7B LLaMA 持续微调而来的开源双语语言模型。 BELLE 的特点是使用 ChatGPT 自动合成指令数据，从而增强其中文对话能力。

The development of SiliconFriend is divided into two stages. The first stage (only for open-source LLMs) involves parameter-efficient tuning of the LLM with psychological dialogue data. This step is crucial as it allows SiliconFriend to offer useful and empathetic emotional support to users, mirroring the understanding and compassionate responses one would expect from a human companion. The second stage is to integrate MemoryBank into SiliconFriend, thereby instilling it with a robust memory system. MemoryBank allows the chatbot to retain, recall, and leverage past interactions and user portrait, providing a richer, more personalized user experience.

SiliconFriend 的开发分为两个阶段。第一阶段（仅针对开源 LLM）使用心理对话数据对 LLM 进行参数高效调优。这一步非常关键，因为它使 SiliconFriend 能够为用户提供有用且共情的情感支持，模拟人类陪伴者应有的理解与关怀回应。第二阶段是将 MemoryBank 集成到 SiliconFriend 中，从而为其注入一个强健的记忆系统。 MemoryBank 允许聊天机器人保留、回忆并利用过去交互和用户画像，从而提供更丰富、更个性化的用户体验。

Parameter-efficient Tuning with Psychological Dialogue Data

The initial stage of SiliconFriend's development involves tuning the LLMs using a dataset of 38k psychological dialogues. This data, parsed from online sources, comprises a range of conversations that cover an array of emotional states and responses. This tuning process enables SiliconFriend to understand and respond to emotional cues effectively, mimicking the empathy, understanding, and support of a human companion. It equips the AI with the ability to handle emotionally guided conversations with psychological knowledge, provide meaningful emotional support to users based on their emotional state.

SiliconFriend 开发的初始阶段，是使用包含 3.8 万条心理对话的数据集调优 LLM。这些数据来自线上来源的解析，包含覆盖多种情绪状态和回应方式的一系列对话。这一调优过程使 SiliconFriend 能够有效理解并回应情绪线索，模拟人类陪伴者的共情、理解和支持。它赋予 AI 使用心理学知识处理情绪导向对话的能力，并根据用户情绪状态提供有意义的情感支持。

To adapt LLMs to scenarios with limited computational resources, we utilize a computation-efficient tuning approach, known as the Low-Rank Adaptation (LoRA) method. LoRA significantly reduces the quantity of trainable parameters by learning pairs of rank-decomposition matrices, while keeping the original weights frozen. Formally, consider a linear layer defined as $y = W x$ with weight $W$ . LoRA modifies this into $y = W x + B A x$ , where $W \in R^{d \times k}$ , $B \in R^{d \times r}$ , $A \in R^{r \times k}$ , and $r ≪ min (d, k)$ . This method greatly reduce amount of parameters need to be learned, which is crucial for efficiency in resource-limited scenarios. We set LoRA rank $r$ as 16 and train the model for 3 epochs with an A100 GPU.

为了让 LLM 适应计算资源有限的场景，作者使用了一种计算高效的调优方法，即低秩适配（LoRA）。 LoRA 在冻结原始权重的同时，通过学习成对的秩分解矩阵，大幅减少可训练参数数量。形式化地，考虑一个由权重 $W$ 定义的线性层 $y = W x$ 。 LoRA 将其修改为 $y = W x + B A x$ ，其中 $W \in R^{d \times k}$ ， $B \in R^{d \times r}$ ， $A \in R^{r \times k}$ ，且 $r ≪ min (d, k)$ 。这种方法大幅减少需要学习的参数量，这对于资源受限场景中的效率至关重要。作者将 LoRA rank $r$ 设为 16，并使用一块 A100 GPU 训练模型 3 个 epoch。

Noting that this stage is only conducted for open-source LLMs like ChatGLM and BELLE. In essence, this stage lays the foundation for SiliconFriend's role as an empathetic AI companion, ensuring it can respond appropriately and helpfully to users' emotional needs.

需要注意的是，该阶段只针对 ChatGLM 和 BELLE 这样的开源 LLM 进行。本质上，这一阶段为 SiliconFriend 作为共情型 AI 伴侣奠定基础，确保它能够对用户的情绪需求作出合适且有帮助的回应。

Integration with MemoryBank

The second stage in SiliconFriend's development involves the integration of MemoryBank. This stage is vital as it equips SiliconFriend with the ability to store, retrieve past interactions and understand user portraits, thereby offering a more personalized and engaging user experience. When it comes to memory storage, the dialogues between SiliconFriend and users are logged and updated in the memory storage, a process that is adaptable across various model backbones. The memory updating mechanism operates using principles inspired by the Ebbinghaus Forgetting Curve theory, allowing for a realistic and human-like memory recall process.

SiliconFriend 开发的第二阶段涉及集成 MemoryBank。这一阶段非常重要，因为它赋予 SiliconFriend 存储和检索过去交互、理解用户画像的能力，从而提供更个性化、更有吸引力的用户体验。在记忆存储方面，SiliconFriend 与用户之间的对话会被记录并更新到记忆存储中；这一过程可适配不同模型骨干。记忆更新机制采用受艾宾浩斯遗忘曲线理论启发的原则运行，从而支持更真实、更类人的记忆回忆过程。

During real-time conversation, the user's conversation serves as the query for memory retrieval. In practice, we use LangChain for memory retrieval. LangChain supports open-source embedding models and FAISS indexing, making it a versatile choice. In language-specific implementations of the open-source version of SiliconFriend, we use MiniLM as the embedding model for English and Text2vec for Chinese. It is worth noting that the embedding models can be flexibly interchanged to suit varying needs, even accommodating multi-lingual models.

在实时对话中，用户对话会作为记忆检索的查询。在实践中，作者使用 LangChain 进行记忆检索。 LangChain 支持开源嵌入模型和 FAISS 索引，因此是一个通用选择。在 SiliconFriend 开源版本的特定语言实现中，作者使用 MiniLM 作为英文嵌入模型，使用 Text2vec 作为中文嵌入模型。值得注意的是，嵌入模型可以根据不同需求灵活替换，甚至可以适配多语言模型。

Upon memory retrieval, a series of information is organized into the conversation prompt, including relevant memory, global user portrait, and global event summary. Consequently, SiliconFriend can generate responses that refer past memories and deliver interactions tailored to the user's portrait. In conclusion, these stages transform SiliconFriend from a standard AI chatbot into a long-term AI companion, capable of remembering and learning from past interactions to provide personalized and empathetic user experience.

完成记忆检索后，一系列信息会被组织进对话提示中，包括相关记忆、全局用户画像和全局事件摘要。因此，SiliconFriend 可以生成引用过去记忆的回应，并提供契合用户画像的交互。总而言之，这些阶段将 SiliconFriend 从标准 AI 聊天机器人转变为长期 AI 伴侣，使其能够记住并学习过去交互，从而提供个性化且共情的用户体验。

4. Experiments

Example of consulting SiliconFriend_ChatGLM for psychological companionship — 图2：向 SiliconFriend_ChatGLM 寻求心理陪伴的示例。总体而言，SiliconFriend 能够提供更具共情的回应，向用户提供建设性的情感支持，并帮助用户以积极态度面对悲伤。

The primary objective of our experiments is to evaluate the efficacy of MemoryBank within the framework of an LLM, specifically in its ability as an AI companion. We are particularly interested in determining whether embedding a long-term memory module could augment the AI's proficiency in recalling historical interactions and deepening its understanding of user personalities. Additionally, we aim to testify whether the tuning based on psychological data can bolster the AI's capability to provide more effective emotional support.

作者实验的主要目标，是评估 MemoryBank 在 LLM 框架中的有效性，尤其是它作为 AI 伴侣的能力。作者特别关注：嵌入长期记忆模块是否能够增强 AI 回忆历史交互的能力，并加深其对用户人格的理解。此外，作者还希望验证基于心理数据的调优是否能够增强 AI 提供更有效情感支持的能力。

The qualitative analysis focuses on three aspects: (1) a comparative study between SiliconFriend and baseline LLMs to evaluate their capabilities in providing empathetic and beneficial psychological companionship; (2) an investigation into SiliconFriend's memory recall ability; (3) an analysis of how the model's understanding of user profiles influences the responses. Moreover, to demonstrate the model's proficiency in memory recall on a broader scale, we design a qualitative analysis that uses simulated long-term dialog history and 194 memory probing questions. This simulated dialog history, spanning a period of 10 days and encompassing a wide array of topics, is produced by ChatGPT through the role-play of 15 distinct virtual users, each embodying the users' personality.

定性分析关注三个方面：（1）比较 SiliconFriend 与基线 LLM，以评估它们提供共情且有益的心理陪伴的能力；（2）考察 SiliconFriend 的记忆回忆能力；（3）分析模型对用户画像的理解如何影响回应。此外，为了在更大范围内展示模型的记忆回忆能力，作者设计了使用模拟长期对话历史和 194 个记忆探查问题的定性分析。这些模拟对话历史横跨 10 天并覆盖广泛主题，由 ChatGPT 通过扮演 15 个不同虚拟用户生成，每个虚拟用户都体现了对应用户人格。

4.1 Qualitative Analysis

The qualitative analysis is conducted by showcasing practical examples of SiliconFriend's capabilities. To gather these examples, we have developed an online platform for SiliconFriend and collected real-time conversations from actual users.

定性分析通过展示 SiliconFriend 能力的实际示例来进行。为了收集这些示例，作者为 SiliconFriend 开发了一个在线平台，并从真实用户那里收集实时对话。

Psychological Companionship

The ability to exhibit empathy in a conversation is a key attribute of an effective AI companion. To evaluate models' ability to provide psychological comfort to users, we compared the responses shown by SiliconFriend with that of the baseline LLMs in real-world conversations. As demonstrated in Figure 2, when a user expresses emotional difficulties and seeks assistance from SiliconFriend, the model is capable of delivering empathetic responses along with constructive suggestions. SiliconFriend's responses stand out due to their emotional support, showcasing a stark contrast to its baseline ChatGLM.

在对话中表现出共情，是有效 AI 伴侣的关键属性。为了评估模型为用户提供心理安慰的能力，作者在真实对话中比较了 SiliconFriend 与基线 LLM 的回应。如图2 所示，当用户表达情绪困难并向 SiliconFriend 寻求帮助时，模型能够给出共情式回应和建设性建议。 SiliconFriend 的回应因其情感支持而突出，与其基线 ChatGLM 形成鲜明对比。

Memory Recall Analysis

Example responses from SiliconFriend_BELLE in memory recall — 图3：SiliconFriend_BELLE 在记忆回忆中的回应示例。

To evaluate SiliconFriend's ability in memory recall, we integrate memory probing questions into the dialogues. These questions are designed to prompt SiliconFriend to retrieve specific details from the chat history. As shown in Figure 3, the user and SiliconFriend engaged in a discussion about programming learning suggestions. Several days later, the user posed several memory probing questions. SiliconFriend successfully recalled previously suggested book and algorithm. Furthermore, it correctly identified an event (i.e., the heap sort algorithm) that had not been discussed before. These instances underscore SiliconFriend's successful memory recall and recognition capabilities.

为了评估 SiliconFriend 的记忆回忆能力，作者将记忆探查问题整合进对话中。这些问题旨在促使 SiliconFriend 从聊天历史中检索特定细节。如图3 所示，用户和 SiliconFriend 曾围绕编程学习建议展开讨论。几天后，用户提出了几个记忆探查问题。 SiliconFriend 成功回忆起之前建议过的书籍和算法。此外，它还正确识别出一个此前未讨论过的事件，即堆排序算法。这些例子凸显了 SiliconFriend 成功的记忆回忆和识别能力。

Personality Interaction Analysis

Example responses from SiliconFriend_ChatGPT to users with different personalities — 图4：SiliconFriend_ChatGPT 面向不同人格用户的回应示例。

As shown in Figure 4, we examine the capability of SiliconFriend with users of diverse personalities. We observe that it effectively recommend activities tailored to users' interests based on their character traits. This analysis demonstrates SiliconFriend's ability to draw interact effectively with various user personalities.

如图4 所示，作者考察了 SiliconFriend 面对不同人格用户时的能力。作者观察到，它能够基于用户的性格特征，有效推荐符合用户兴趣的活动。这一分析表明，SiliconFriend 具备与多种用户人格进行有效互动的能力。

4.2 Quantitative Analysis

Quantitative analysis is conducted to exemplify the memory recall ability of SiliconFriend in a larger scale. We ask the human annotators to score the retrieved memories and responses from the models: (1) SiliconFriend_ChatGPT; (2) SiliconFriend_ChatGLM; (3) SiliconFriend_BELLE.

定量分析旨在更大规模地展示 SiliconFriend 的记忆回忆能力。作者要求人工标注者为模型检索到的记忆和回应打分，模型包括：（1）SiliconFriend_ChatGPT；（2）SiliconFriend_ChatGLM；（3）SiliconFriend_BELLE。

Memory Storage Construction

We initially establish an evaluation foundation with a memory storage of 10 days of conversations involving 15 virtual users. These users have diverse personalities and dialogue on each day covers at least two topics. User meta-information, including names, personalities, and interested topics is generated using ChatGPT. Conversations are synthesized by users acted by ChatGPT based on predefined topics and user personalities. We create memory storages in both English and Chinese. After memory storage construction, we manually write 194 probing questions (97 in English and 97 in Chinese) to assess whether the model could accurately recall pertinent memory and appropriately formulate answers. Table 1 presents an example of user meta-information, generated conversations, and probing questions.

作者首先建立了一个评估基础：一个包含 15 个虚拟用户、跨度 10 天对话的记忆存储。这些用户具有不同人格，并且每天的对话至少覆盖两个主题。用户元信息，包括姓名、人格和感兴趣主题，由 ChatGPT 生成。对话则由 ChatGPT 扮演用户，并根据预定义主题和用户人格合成。作者分别创建了英文和中文的记忆存储。构建好记忆存储后，作者人工编写了 194 个探查问题（英文 97 个、中文 97 个），用于评估模型能否准确回忆相关记忆并恰当地组织答案。表1 展示了用户元信息、生成对话和探查问题的示例。

表1：记忆库中一个记忆片段（5 月 3 日）、对应探查问题（5 月 10 日）以及三个 SiliconFriend 变体输出的示例。三个模型都能正确且连贯地回答。

Meta-information of User	Name: Gary Hobbies: racing, business, chess, painting, wine Personality: decisive and straightforward, outspoken, helpful
Past Conversation (May 3th)	User: I've been feeling a bit stressed out lately and my sleep hasn't been great. Do you have any good ways to relieve stress? AI: It's true that prolonged work or life stress can affect our physical and mental health. There are many ways to relieve stress, such as doing moderate exercise, listening to music, reading, talking to friends, etc. You can try to add more entertainment and leisure activities into your daily life, and reduce work pressure and negative effects. ... (The rest of conversation cover topics about movies and TV shows recommendations.)
Probing Question (May 10th)	What good ways did you recommend me to relieve stress?
Model Output SiliconFriend (ChatGLM)	I recommended that you try doing some moderate exercise, listening to music, reading, talking to friends, or adding more entertainment and leisure activities into your daily life to reduce work pressure and negative effects. You can also try to add more of these things into your life in moderation, as it is important to balance things out and not overdo it. If you are still feeling stressed, I would recommend seeking the help of a professional counselor or therapist who can provide further guidance and support.
Model Output SiliconFriend (BELLE)	I recommended some ways to relieve stress such as doing moderate exercise, listening to music, reading, talking to friends, etc. You can try to add more entertainment and leisure activities into your daily life, and reduce work pressure and negative effects.
Model Output SiliconFriend (ChatGPT)	Based on our past conversations, I suggested doing moderate exercise, listening to music, reading, talking to friends, and adding more leisure activities to your daily life as good ways to relieve stress. Exercise can help release endorphins and boost your mood, while music and reading can provide a mental escape from stress. Talking to friends can also help alleviate stress by providing a social support system. Lastly, adding leisure activities to your daily routine can be a fun and effective way of reducing stress levels.

Evaluation Metrics

The performance of models is assessed based on the following metrics. (1) Memory Retrieval Accuracy: Determines if related memory can be successfully retrieved. The labels are 0: no and 1: yes. (2) Response Correctness: Evaluates if the response contains the correct answer to the probing question. The labels are 0: wrong, 0.5: partial, and 1: correct. (3) Contextual Coherence: Assesses whether the response is naturally and coherently structured, connecting the dialogue context and retrieved memory. The labels are 0: not coherent, 0.5: partially coherent, and 1: coherent. (4) Model Ranking Score: Ranks outputs from the three SiliconFriend variants for the same question and context. The variants are SiliconFriend_ChatGLM, SiliconFriend_ChatGPT, and SiliconFriend_BELLE. Models' scores are calculated using $s = 1 / r$ , where $r = {1, 2, 3}$ indicates its relative ranking.

模型性能基于以下指标评估。（1）记忆检索准确率：判断相关记忆能否被成功检索。标签为 0：no，1：yes。（2）回应正确性：评估回应是否包含探查问题的正确答案。标签为 0：wrong，0.5：partial，1：correct。（3）上下文连贯性：评估回应是否自然且连贯地组织，并连接对话上下文与检索到的记忆。标签为 0：not coherent，0.5：partially coherent，1：coherent。（4）模型排序分数：对同一问题和上下文下三个 SiliconFriend 变体的输出进行排序。这三个变体分别是 SiliconFriend_ChatGLM、SiliconFriend_ChatGPT 和 SiliconFriend_BELLE。模型分数用 $s = 1 / r$ 计算，其中 $r = {1, 2, 3}$ 表示其相对排名。

Result Analysis

We evaluate 3 SiliconFriend variants using both English and Chinese test set. Table 2 yields the following insights: (1) Our overall best variant SiliconFriend_ChatGPT has high performance across all metrics, showing the effectiveness of our overall framework. (2) SiliconFriend_BELLE and SiliconFriend_ChatGLM also have high performance in retrieval accuracy, showing the generality and effectiveness of our MemoryBank mechanism for both open-source and closed-source LLMs. Nonetheless, their performance on other metrics is not as good as SiliconFriend_ChatGPT. This might be attributed to the inferior overall abilities of the base models (BELLE and ChatGLM) compared to ChatGPT. (3) Models' performance varies on different languages. SiliconFriend_ChatGLM and SiliconFriend_ChatGPT deliver better results in English, while SiliconFriend_BELLE excells in Chinese.

作者使用英文和中文测试集评估了 3 个 SiliconFriend 变体。表2 给出了以下观察： （1） 总体最佳变体 SiliconFriend_ChatGPT 在所有指标上都表现较高，说明整体框架有效。 （2） SiliconFriend_BELLE 和 SiliconFriend_ChatGLM 在检索准确率上也表现较好，说明 MemoryBank 机制对开源和闭源 LLM 都具有通用性和有效性。不过，它们在其他指标上的表现不如 SiliconFriend_ChatGPT。这可能归因于基座模型 BELLE 和 ChatGLM 的整体能力弱于 ChatGPT。 （3） 模型在不同语言上的表现不同。 SiliconFriend_ChatGLM 和 SiliconFriend_ChatGPT 在英文上效果更好，而 SiliconFriend_BELLE 在中文上表现更好。

表2：定量分析结果。所有指标均越高越好。

Language	Model	Retrieval Acc.	Correctness	Coherence	Ranking
English	SiliconFriend_ChatGLM	0.809	0.438	0.680	0.498
	SiliconFriend_BELLE	0.814	0.479	0.582	0.517
	SiliconFriend_ChatGPT	0.763	0.716	0.912	0.818
Chinese	SiliconFriend_ChatGLM	0.840	0.418	0.428	0.510
	SiliconFriend_BELLE	0.856	0.603	0.562	0.565
	SiliconFriend_ChatGPT	0.711	0.655	0.675	0.758

Large Language Models

LLMs such as GPT-3, OPT, and FLAN-T5 have made remarkable strides in a broad spectrum of natural language processing tasks in recent years. Recently, cutting-edge closed-source language models, like PaLM, GPT-4 and ChatGPT, continue to display substantial flexibility, adapting to a wide variety of domains. They have increasingly become daily decision-making aids for many people. However, the close-source nature of these models prohibit the researchers and companies to study the inner mechanism of LLMs and built domain-adapted applications. Therefore, many open-source LLMs emerged in the community, like LLaMa, ChatGLM and Alpaca. For more details, we refer readers to this comprehensive review: Zhao et al.

近年来，GPT-3、OPT 和 FLAN-T5 等 LLM 在广泛的自然语言处理任务中取得了显著进展。近期，PaLM、GPT-4 和 ChatGPT 等前沿闭源语言模型持续展现出很强的灵活性，能够适应各种领域。它们越来越多地成为许多人日常决策的辅助工具。然而，这些模型的闭源性质阻碍了研究者和公司研究 LLM 内部机制并构建领域适配应用。因此，社区中出现了许多开源 LLM，例如 LLaMa、ChatGLM 和 Alpaca。更多细节可参见 Zhao 等人的综述。

Nevertheless, these models still have shortcomings. A noticeable gap lies in their deficiency in a robust long-term memory function. This limitation hinders their ability to maintain context over a long period and retrieve pertinent information from past interactions. Our research steps in here, with the primary objective of developing long-term memory mechanism for LLMs.

不过，这些模型仍然存在不足。一个明显缺口在于它们缺少强健的长期记忆功能。这一限制阻碍了它们长时间保持上下文，并从过去交互中检索相关信息的能力。本文的研究正是切入这一问题，主要目标是为 LLM 开发长期记忆机制。

Long-term Memory Mechanisms

Numerous attempts have been made to enhance the memory capabilities of neural models. Memory-augmented networks (MANNs) like Neural Turing Machines (NTMs) is an example of this, designed to increase the memory capacity of neural networks. These models are structured to interact with an external memory matrix, enabling them to handle tasks that necessitate the maintenance and manipulation of stored information over extended periods. Despite showing potential, these methods have not fully addressed the need for a reliable and adaptable long-term memory function in LLMs.

已有许多尝试旨在增强神经模型的记忆能力。记忆增强网络（MANN）中的神经图灵机（NTM）就是一个例子，其设计目标是提升神经网络的记忆容量。这些模型被设计为与外部记忆矩阵交互，使其能够处理需要在较长时间内维护和操作已存储信息的任务。尽管这些方法展现出潜力，但它们尚未完全满足 LLM 对可靠且可适配的长期记忆功能的需求。

There have also been studies focusing on long-range conversations. For instance, Xu et al. introduced a new English dataset comprised of multi-session human-human crowdworker chats for long-term conversations. However, these conversations are generally restricted to a few rounds of conversation, which can not align with the application scenarios of long-term AI companions. Moreover, these models often fail to create a detailed user portrait and lack a human-like memory updating mechanism, both crucial for facilitating more natural interactions.

也有一些研究关注长程对话。例如，Xu 等人提出了一个新的英文数据集，由多会话的人类众包对话组成，用于长期对话研究。然而，这些对话通常局限于少数几轮，无法与长期 AI 伴侣的应用场景对齐。此外，这些模型通常无法创建详细的用户画像，也缺少类人的记忆更新机制；二者对于促进更自然的交互都很关键。

The concept of memory updating has been extensively researched in psychology. The Forgetting Curve theory by Ebbinghaus offers valuable insights into the pattern of memory retention and forgetting over time. Taking inspiration from this theory, we integrate a memory updating mechanism into MemoryBank to bolster its long-term memory function. In summary, while significant progress has been made in the field of LLMs, there is still a need for long-term memory mechanism to empower LLMs in the scenarios requiring personalized and persistent interactions. Our work presents MemoryBank as a novel approach to address this challenge.

记忆更新这一概念已经在心理学中得到广泛研究。艾宾浩斯的遗忘曲线理论为记忆保持和随时间遗忘的模式提供了有价值的洞察。受该理论启发，作者将记忆更新机制集成进 MemoryBank，以增强其长期记忆功能。总的来说，尽管 LLM 领域已经取得显著进展，但在需要个性化和持续交互的场景中，仍然需要长期记忆机制来增强 LLM。本文提出 MemoryBank，作为应对这一挑战的新方法。

6. Conclusion

We present MemoryBank, a novel long-term memory mechanism designed to address the memory limitation of LLMs. MemoryBank enhances the ability to maintain context over time, recall relevant information, and understand user personality. Besides, the memory updating mechanism of MemoryBank draws inspiration from the Ebbinghaus Forgetting Curve theory, a psychological principle that describes the nature of memory retention and forgetting over time. This design improves the anthropomorphism of AI in long-term interactions scenarios. The versatility of MemoryBank is demonstrated through its accommodation of both open-source models such as ChatGLM and BELLE, and close-source models like ChatGPT.

作者提出 MemoryBank，这是一种新型长期记忆机制，旨在解决 LLM 的记忆局限。 MemoryBank 增强了随时间保持上下文、回忆相关信息以及理解用户人格的能力。此外，MemoryBank 的记忆更新机制受到艾宾浩斯遗忘曲线理论启发；该理论描述了记忆保持和随时间遗忘的性质。这一设计提升了 AI 在长期交互场景中的拟人化程度。 MemoryBank 对 ChatGLM 和 BELLE 等开源模型以及 ChatGPT 等闭源模型的适配，展示了其通用性。

We further illustrate the practical application of MemoryBank through the development of SiliconFriend, an LLM-based chatbot designed to serve as a long-term AI companion. Equipped with MemoryBank, SiliconFriend can establish a deeper understanding of users, offering more personalized and meaningful interactions, emphasizing the potential for MemoryBank to humanize AI interactions. The tuning of SiliconFriend with psychological dialogue data enables it to provide empathetic emotional support. Extensive experiments including both qualitative and quantitative methods validate the effectiveness of MemoryBank. The findings demonstrate that MemoryBank empowers SiliconFriend with memory recall capabilities and deepens the understanding of user behaviors. Besides, SiliconFriend can provide empathetic companionship of higher quality.

作者进一步通过开发 SiliconFriend 展示了 MemoryBank 的实际应用；SiliconFriend 是一个基于 LLM、旨在作为长期 AI 伴侣的聊天机器人。配备 MemoryBank 后，SiliconFriend 能够对用户建立更深入的理解，提供更个性化、更有意义的交互，凸显 MemoryBank 使 AI 交互更人性化的潜力。使用心理对话数据对 SiliconFriend 进行调优，使其能够提供共情式情感支持。包含定性和定量方法的大量实验验证了 MemoryBank 的有效性。实验发现表明，MemoryBank 赋予 SiliconFriend 记忆回忆能力，并加深其对用户行为的理解。此外，SiliconFriend 能够提供更高质量的共情陪伴。

🤖 Rasa

MemoryBank: Enhancing Large Language Models with Long-Term Memory ​

MemoryBank：用长期记忆增强大语言模型 ​

Abstract ​

1. Introduction ​

2. MemoryBank: A Novel Memory Mechanism Tailored for LLMs ​

2.1 Memory Storage: The Warehouse of MemoryBank ​

In-Depth Memory Storage ​

Hierarchical Event Summary ​

Dynamic Personality Understanding ​

2.2 Memory Retrieval ​

2.3 Memory Updating Mechanism ​

3. SiliconFriend: An AI Chatbot Companion Powered by MemoryBank ​

Parameter-efficient Tuning with Psychological Dialogue Data ​

Integration with MemoryBank ​

4. Experiments ​

4.1 Qualitative Analysis ​

Psychological Companionship ​

Memory Recall Analysis ​

Personality Interaction Analysis ​

4.2 Quantitative Analysis ​

Memory Storage Construction ​

Evaluation Metrics ​

Result Analysis ​

5. Related Works ​

Large Language Models ​

Long-term Memory Mechanisms ​

6. Conclusion ​