Release date: 2025-03-17
Version: v1.14.0-rc.0

Meilisearch v1.14 introduces more granular filterable attribute settings, letting users choose specific filter features for each filterable attribute and thereby optimize indexing time. The release also improves semantic search performance and embedding indexing speed, and adds a new route for fetching multiple documents by ID. New features include composite embedders, which allow different embedders to be used at search and indexing time for better performance. Other improvements include fetching documents in bulk, merging update and replacement operations, and displaying internal indexing steps with timings. Fixes cover fetching the pooling method from the model configuration and making sure useless prefixes are deleted. Dependency updates bump the CI Ubuntu version from 20.04 to 22.04 and upgrade the heed library to v0.22. Tests and CI were also improved, including faster tests and new Ollama integration tests. Thanks to the external contributors for their support.

What's changed (Chinese)

See the original content below.

What's changed (original)

[!WARNING] Since this is a release candidate (RC), we do NOT recommend using it in a production environment. Is something not working as expected? We welcome bug reports and feedback about new features.

Meilisearch v1.14 exposes a more granular way to express your filterable attributes! 🎉 This release also improves semantic search performance and embedding indexing speed, and adds a new route to get multiple documents by ID.

New features and updates 🔥

Granular Filterable Attribute Settings

This feature lets you choose the filter features of each filterable attribute with high granularity. Activating or deactivating a feature has an impact on indexing time. To use it, send the new filterableAttributes format to the settings route (PATCH /indexes/INDEX_UID/settings):

{
	"filterableAttributes": [
	  {
	    "attributePatterns": ["genre", "artist"], 
	    "features": { "facetSearch": true, "filter": { "equality": true, "comparison": false } }
	  },
	  {
	    "attributePatterns": ["rank"],
	    "features": { "facetSearch": false, "filter": { "equality": true, "comparison": true } }
	  },
	  {
	    "attributePatterns": ["albumId"],
	    "features": { "facetSearch": false, "filter": { "equality": true, "comparison": false } }
	  }
	]
}

🗒️ In this example, we activate or deactivate each feature depending on how each filterable attribute is used. The genre and artist fields are expected to be strings, so facet search and the equality operators (=/!=) are activated, while the comparison operators (>=, <=, …) are deactivated because they are not useful there and disabling them saves time during indexing. The rank field is expected to be a number, so facet search is deactivated in favor of the comparison operators. Finally, albumId is expected to be unique, so only the equality operators (=/!=) are activated.
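
For instance, once these settings are applied, a search request can filter with the operators that remain enabled. A minimal sketch (MEILISEARCH_URL, INDEX_UID, and the field values below are placeholders):

# Placeholder values: adjust the index name and filter values to your data
curl \
  -X POST 'MEILISEARCH_URL/indexes/INDEX_UID/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "daft punk",
    "filter": "genre = electro AND rank >= 10 AND albumId = 42"
  }'

The comparison rank >= 10 is accepted because comparison is enabled for rank, while a comparison on genre would not be allowed, since that operator is disabled for it.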

For more details about this feature, please refer to its public usage page.

Done by @ManyTheFish in #5254.

Composite Embedders

This feature allows using different embedders at search and indexing time, which is useful for optimizing the embedder for each situation:

  • Using a remote embedder for indexing in bulk, as remote embedders provide the highest bandwidth (embeddings/s)
  • Using a local embedder for answering search queries, as local embedders provide the lowest latency (time to first embedding)

To use the feature, follow these steps:

  1. Enable the Composite embedders feature from the Cloud dashboard or with the following:
curl -X PATCH -H 'Content-Type: application/json' MEILISEARCH_URL/experimental-features -d '{ "compositeEmbedders": true }'
  2. Send a settings task (PATCH /indexes/INDEX_UID/settings) containing an embedder with source: "composite" and with the parameters searchEmbedder and indexingEmbedder describing the embedder to use at search time and at indexing time, respectively. For example, using a Hugging Face inference endpoint:
{
    "embedders": {
        "text": {
            "source": "composite",
            "searchEmbedder": {
                "source": "huggingFace", // locally computed embeddings using a model from the Hugging Face Hub
                "model": "baai/bge-base-en-v1.5",
                "revision": "a5beb1e3e68b9ab74eb54cfd186867f64f240e1a"
            },
            "indexingEmbedder": {
                "source": "rest", // remotely computed embeddings using Hugging Face inference endpoints
                "url": "https://URL.endpoints.huggingface.cloud",
                "apiKey": "hf_XXXXXXX",
                "documentTemplate": "Your {{doc.template}}",
                "request": {
                    "inputs": [
                        "{{text}}",
                        "{{..}}"
                    ]
                },
                "response": [
                    "{{embedding}}",
                    "{{..}}"
                ]
            }
        }
    }
}
  3. Send documents to the index. They will be indexed remotely using the Hugging Face inference endpoint.
  4. Perform semantic search queries; they will be embedded using the local model fetched from the Hugging Face Hub (see the example below).
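
As an example, a semantic search request against this index could look like the following sketch, assuming the composite embedder configured above keeps the name text (the query text and semanticRatio value are placeholders):

# "text" refers to the composite embedder configured in step 2; the query is a placeholder
curl \
  -X POST 'MEILISEARCH_URL/indexes/INDEX_UID/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "calm acoustic songs",
    "hybrid": { "embedder": "text", "semanticRatio": 1.0 }
  }'

With semanticRatio set to 1.0, results come from semantic search only; lower values mix in keyword-based results.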

For more details about this feature, please refer to its public usage page.

Done by @dureuill in #5371.

Other improvements

  • Get multiple documents by IDs by @dureuill in #5384 (see the sketch after this list)
  • Support merging update and replacement operations by @Kerollmops in #5293
  • Display the internal indexing steps with timings on the /batches route by @Kerollmops in #5356
  • Exhaustive facet search by @ManyTheFish in #5369
  • Reduce RAM consumption of arroy by @irevoire in https://github.com/meilisearch/arroy/pull/105
  • Cache Embeddings in Search with an experimental feature by @dureuill in #5418
  • Extend the batch progress view to the indexing of vectors by @irevoire in #5420
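
For the new "get multiple documents by IDs" capability, here is a minimal sketch, assuming it is exposed as an ids parameter on the existing POST /indexes/INDEX_UID/documents/fetch route (the parameter name, IDs, and fields shown are illustrative):

# Assumption: the new capability accepts an "ids" array on the document-fetch route;
# the IDs and fields below are illustrative
curl \
  -X POST 'MEILISEARCH_URL/indexes/INDEX_UID/documents/fetch' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "ids": [1, 42, 1337],
    "fields": ["id", "title"]
  }'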

Fixes 🐞

  • Support fetching the pooling method from the model configuration by @dureuill in #5355
  • Make sure to delete useless prefixes by @Kerollmops in #5413

Misc

  • Dependencies updates
    • Bump Ubuntu in the CI from 20.04 to 22.04 by @Kerollmops in #5338
    • Bump heed to v0.22 by @irevoire and @Kerollmops in #5406
  • CIs and tests
    • Improve test performance of get_index.rs by @DerTimonius in #5210
    • Ollama Integration Tests by @Kerollmops in #5308
    • Ensure the settings routes are configured when a new field is added to the Settings struct by @MichaScant in #5149
    • Skip a snapshot test on Windows by @Kerollmops in #5383
  • Misc
    • Rename callTrace into progressTrace by @Kerollmops in #5364
    • Make composite embedders an experimental feature by @dureuill in #5401

❤️ Thanks again to our external contributors:

Download links