IBM's New Embedding Models Digest 32K Words in 200 Languages

May 16, 2026

Published: May 16, 2026 at 12:13 AM

100-word summary

IBM just released two open-source embedding models that can process documents up to 32,000 tokens long across 200-plus languages, up from the standard 512-token limit. The Granite R2 models were rebuilt from scratch to handle multilingual contracts and technical manuals without chopping them into fragments. Both ship under the Apache 2.0 license, so there are no legal hurdles to commercial use. The smaller 97-million-parameter version processes roughly 2,500 documents per second on an H100 GPU. They plug directly into existing tools like LangChain and LlamaIndex. Open multilingual models at this context length used to require massive parameter counts or proprietary licenses.

What happened

IBM just released two open-source embedding models that can process documents up to 32,000 tokens long across 200-plus languages, up from the standard 512-token limit. The Granite R2 models were rebuilt from scratch to handle multilingual contracts and technical manuals without chopping them into fragments. Both ship under the Apache 2.0 license, so there are no legal hurdles to commercial use. The smaller 97-million-parameter version processes roughly 2,500 documents per second on an H100 GPU. They plug directly into existing tools like LangChain and LlamaIndex.
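To make the workflow concrete, here is a minimal sketch of the retrieval pattern these models enable: embed each long document as a single vector (no chunking) and rank documents by cosine similarity to a query vector. The toy vectors below stand in for real model output, and the model-loading snippet in the comment is an assumption based on the common sentence-transformers pattern, not a detail confirmed by this article.

```python
import numpy as np

# Toy stand-ins for document embeddings. With a real long-context model the
# vectors would come from something like (hypothetical, per the usual
# sentence-transformers pattern; the exact model id is not given here):
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("ibm-granite/...")   # hypothetical id
#   doc_vecs = model.encode(docs)  # each doc up to 32K tokens, no chunking
doc_vecs = np.array([
    [0.9, 0.1, 0.0],   # e.g. an English contract
    [0.1, 0.9, 0.0],   # e.g. a German technical manual
    [0.0, 0.1, 0.9],   # e.g. an unrelated memo
])
query_vec = np.array([0.8, 0.2, 0.1])

def cosine_top_k(query, docs, k=1):
    """Return indices of the k documents most similar to the query vector."""
    docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    q_n = query / np.linalg.norm(query)
    scores = docs_n @ q_n              # cosine similarity per document
    return np.argsort(scores)[::-1][:k]

print(cosine_top_k(query_vec, doc_vecs))  # → [0], the contract matches best
```

Because the whole document fits in one embedding, there is no chunk-merging step: the index holds one vector per document, which is the main operational simplification a 32K context window buys.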

Why it matters

Open multilingual models at this context length used to require massive parameter counts or proprietary licenses.

Sources