ELF: Embedded Language Flows
Hu K, Qiu L, Lu Y, et al. ELF: Embedded Language Flows[J]. arXiv preprint arXiv:2605.10938, 2026.
Abstract
Diffusion and flow-based models have become the de facto approaches for generating continuous data, e.g., in domains such as images and videos. Their success has attracted growing interest in applying them to language modeling. Unlike their image-domain counterparts, today's leading diffusion language models (DLMs) primarily operate over discrete tokens. In this paper, we show that continuous DLMs can be made effective with minimal adaptation to the discrete domain. We propose Embedded Language Flows (ELF), a class of diffusion models in continuous embedding space based on continuous-time Flow Matching. Unlike existing DLMs, ELF predominantly stays within the continuous embedding space until the final time step, where it maps to discrete tokens using a shared-weight network. This formulation makes it straightforward to adapt established techniques from image-domain diffusion models, e.g., classifier-free guidance (CFG). Experiments show that ELF substantially outperforms leading discrete and continuous DLMs, achieving better generation quality with fewer sampling steps. These results suggest that ELF offers a promising path toward effective continuous DLMs.
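As a rough illustration of the pipeline the abstract describes (integrate a flow in continuous embedding space, apply classifier-free guidance along the way, and map to discrete tokens only at the final step), the following is a minimal sketch and not the authors' implementation. Everything named in it is an assumption made for illustration: `velocity` is a hand-written stand-in for a trained network, the linear noise-to-data path and the Euler integrator are common Flow Matching choices that the abstract does not confirm, the unconditional branch simply targets the mean embedding, and the final decoding reuses the embedding matrix as the output projection, which is only one possible reading of "shared-weight".

```python
import numpy as np


def velocity(x, t, target):
    """Toy stand-in for a learned velocity network (hypothetical).

    Under the linear path x_t = (1 - t) * x_0 + t * x_1, the velocity that
    carries x_t to a predicted clean embedding x_1 is (x_1 - x_t) / (1 - t).
    """
    return (target - x) / (1.0 - t)


def sample(cond_ids, emb, steps=8, cfg_scale=2.0, seed=0):
    """Euler-integrate a guided flow in embedding space, then discretize once.

    cond_ids:  token ids used as the (toy) conditioning signal, one per position
    emb:       (vocab_size, dim) embedding matrix, also reused for decoding
    cfg_scale: guidance weight w in v = v_uncond + w * (v_cond - v_uncond)
    """
    rng = np.random.default_rng(seed)
    # Start from Gaussian noise, one vector per sequence position.
    x = rng.standard_normal((len(cond_ids), emb.shape[1]))

    cond_target = emb[cond_ids]                                  # conditional branch
    uncond_target = np.broadcast_to(emb.mean(axis=0), x.shape)   # unconditional branch

    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        v_cond = velocity(x, t, cond_target)
        v_uncond = velocity(x, t, uncond_target)
        # Classifier-free guidance: extrapolate away from the unconditional velocity.
        v = v_uncond + cfg_scale * (v_cond - v_uncond)
        x = x + dt * v  # Euler step, entirely in continuous embedding space

    # Only now leave the continuous space: score every vocabulary entry with
    # the same embedding matrix (assumed weight tying) and take the argmax.
    logits = x @ emb.T
    return logits.argmax(axis=-1)


if __name__ == "__main__":
    emb = np.eye(5)  # toy vocabulary of 5 tokens with one-hot embeddings
    print(sample([3, 0, 4], emb).tolist())
```

With the toy one-hot embeddings, the sampler recovers the conditioning ids. The point of the sketch is the control flow: every intermediate state is a continuous vector, guidance is a linear combination of two velocity predictions exactly as in image-domain samplers, and discretization happens once, after the last integration step.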