Release date: 2024-10-23
Version: v21.0.0-rc2

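Vitess v21.0.0 has been released. The main changes are as follows. The VTTablet queryserver-enable-settings-pool flag is deprecated and scheduled for removal, several deprecated VTOrc metrics have been deleted, and some vttablet metrics have been renamed. A new traffic mirroring feature lets VTGate copy a specified percentage of traffic to a target keyspace to reduce cutover risk. VTGate gains a new shutdown behavior: with the --mysql-server-drain-onterm flag it rejects new connections during shutdown while letting existing connections finish. The tablet throttler now supports multiple metrics (such as replication lag, running threads, and load) and can assign metric combinations per app. PlannedReparentShard can promote a primary across cells via the --allow-cross-cell-promotion flag. Experimental support for recursive CTEs has been added. A VTGate tablet balancer has been introduced to balance query load across tablets. Query timeouts can be overridden through a comment directive, a session variable, or a command-line flag, in decreasing order of precedence. A new experimental logical backup engine based on MySQL Shell has been added. Dynamic VReplication configuration allows parameters to be changed without a restart. Reference table materialization syncs tables from an unsharded keyspace to all shards of a sharded keyspace. VEXPLAIN gains TRACE and KEYS modes, which provide an execution trace and key-column analysis respectively. Errant GTID detection has been strengthened to stop errant replicas from joining replication. MoveTables can automatically replace MySQL auto_increment with Vitess sequences, with a new related flag. MySQL 8.4 is experimentally supported. VTOrc adds the CurrentErrantGTIDCount metric for tracking the number of errant GTIDs. The new vtctldclient ChangeTabletTags command changes tablet tags dynamically. Reparent operations can specify an expected primary so that they execute conditionally. The release contains 354 merged pull requests.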


Release notes (original)

Release of Vitess v21.0.0

Summary

Major Changes

Deprecations and Deletions

Deprecated VTTablet Flags

  • The queryserver-enable-settings-pool flag, added in v15, has been on by default since v17. It is now deprecated and will be removed in a future release.

Deletion of deprecated metrics

The following VTOrc metrics were deprecated in v20. They have now been deleted.

Metric Name
analysis.change.write
audit.write
discoveries.attempt
discoveries.fail
discoveries.instance_poll_seconds_exceeded
discoveries.queue_length
discoveries.recent_count
instance.read
instance.read_topology
emergency_reparent_counts
planned_reparent_counts
reparent_shard_operation_timings

Deprecated Metrics

The following metrics are now deprecated and will be deleted in a future release. Please use their replacements instead.

Component  Metric Name          Replaced By
vttablet   QueryCacheLength     QueryEnginePlanCacheLength
vttablet   QueryCacheSize       QueryEnginePlanCacheSize
vttablet   QueryCacheCapacity   QueryEnginePlanCacheCapacity
vttablet   QueryCacheEvictions  QueryEnginePlanCacheEvictions
vttablet   QueryCacheHits       QueryEnginePlanCacheHits
vttablet   QueryCacheMisses     QueryEnginePlanCacheMisses
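
When migrating dashboards or alert rules, the renames in the table above can be applied mechanically. The following is a minimal Python sketch of that idea. Only the metric names come from the table; the helper name rename_metric is hypothetical.

```python
# Deprecated vttablet metric names mapped to their replacements,
# copied from the table in these release notes.
QUERY_CACHE_RENAMES = {
    "QueryCacheLength": "QueryEnginePlanCacheLength",
    "QueryCacheSize": "QueryEnginePlanCacheSize",
    "QueryCacheCapacity": "QueryEnginePlanCacheCapacity",
    "QueryCacheEvictions": "QueryEnginePlanCacheEvictions",
    "QueryCacheHits": "QueryEnginePlanCacheHits",
    "QueryCacheMisses": "QueryEnginePlanCacheMisses",
}


def rename_metric(name: str) -> str:
    """Return the replacement for a deprecated metric name.

    Names that are not in the rename table are returned unchanged.
    (Hypothetical helper, not part of Vitess.)
    """
    return QUERY_CACHE_RENAMES.get(name, name)
```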

Traffic Mirroring

Traffic mirroring is intended to help reduce some of the uncertainty inherent to MoveTables SwitchTraffic. When traffic mirroring is enabled, VTGate will mirror a percentage of traffic from one keyspace to another.

Mirror rules may be enabled through vtctldclient with MoveTables MirrorTraffic. For example:

$ vtctldclient --server :15999 MoveTables --target-keyspace customer --workflow commerce2customer MirrorTraffic --percent 5.0

Mirror rules can be inspected with GetMirrorRules.
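
To make the percentage semantics concrete, here is a toy Python model of percentage-based mirroring. It is an illustration only and does not claim to reflect how VTGate implements mirroring: the function name should_mirror and the use of a uniform random draw per query are assumptions.

```python
import random


def should_mirror(percent: float, rng: random.Random) -> bool:
    """Decide whether one query is also sent to the mirror target.

    percent is in the range [0, 100], matching the style of --percent 5.0.
    (Hypothetical helper; the sampling strategy is an assumption.)
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    # rng.random() is uniform on [0.0, 1.0), so 0 never mirrors
    # and 100 always mirrors.
    return rng.random() * 100.0 < percent


rng = random.Random(42)  # fixed seed so the run is repeatable
mirrored = sum(should_mirror(5.0, rng) for _ in range(10_000))
print(f"mirrored {mirrored} of 10000 queries")
```

With --percent 5.0, roughly one query in twenty would be mirrored under this model, which is why a small percentage is a low-risk way to exercise the target keyspace before switching traffic.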

New VTGate Shutdown Behavior

VTGate has a new option that rejects new connections while it is shutting down but lets existing connections finish their work, until they disconnect on their own or the --onterm_timeout is reached, without receiving a Server shutdown in progress error.

This new behavior can be enabled by specifying the new --mysql-server-drain-onterm flag to VTGate.

You can find more information about this option in the RFC.

Tablet Throttler: Multi-Metric support

Up until v20, the tablet throttler would only monitor and use a single metric. That would be replication lag, by default, or could be the result of a custom query. In this release, we introduce a major redesign so that the throttler monitors and uses multiple metrics at the same time, including the above two.

The default behavior now is to monitor all metrics, but only use lag (if the custom query is undefined) or the custom metric (if the custom query is defined). This is backwards-compatible with v20. A v20 PRIMARY is compatible with a v21 REPLICA, and a v21 PRIMARY is compatible with a v20 REPLICA.

However, it is now possible to assign any combination of one or more metrics to a given app. The throttler then accepts or rejects the app’s requests based on the health of all assigned metrics. We have provided a pre-defined list of metrics:

  • lag: replication lag based on heartbeat injection.
  • threads_running: concurrent active threads on the MySQL server.
  • loadavg: per core load average measured on the tablet instance/pod.
  • custom: the result of a custom query executed on the MySQL server.

Each metric has a default threshold which can be overridden by the UpdateThrottlerConfig command.

The throttler also supports the catch-all "all" app name, and it is thus possible to assign metrics to all apps. Explicit app to metric assignments will override the catch-all configuration.

Metrics are assigned a default scope, which could be self (isolated to the tablet) or shard (max, aka worst value among shard tablets). It is further possible to require a different scope for each metric.
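
The rules above (per-app assignments, the "all" catch-all, and self versus shard scope) can be sketched as a small Python model. This is not the throttler's code: the threshold values, the app names, the function names, and the rule that a value above its threshold is unhealthy are all assumptions made for illustration.

```python
# Illustrative thresholds only; these notes do not state the real defaults.
THRESHOLDS = {"lag": 5.0, "threads_running": 100.0, "loadavg": 1.0, "custom": 10.0}


def metrics_for_app(app, assignments, custom_query_defined=False):
    """Pick metric names for an app.

    Order: explicit assignment, then the "all" catch-all, then the
    backwards-compatible default (lag, or custom if a custom query is defined).
    """
    if app in assignments:
        return assignments[app]
    if "all" in assignments:
        return assignments["all"]
    return ["custom"] if custom_query_defined else ["lag"]


def scoped_value(metric, scope, self_values, shard_values):
    """self: this tablet's value; shard: the max (worst) value in the shard."""
    if scope == "self":
        return self_values[metric]
    return max(tablet[metric] for tablet in shard_values)


def check(app, assignments, scopes, self_values, shard_values,
          custom_query_defined=False):
    """Accept the request only if every assigned metric is within its threshold."""
    for metric in metrics_for_app(app, assignments, custom_query_defined):
        value = scoped_value(metric, scopes[metric], self_values, shard_values)
        if value > THRESHOLDS[metric]:  # assumption: above threshold is unhealthy
            return False
    return True


# Hypothetical shard with two tablets; the first one is "this" tablet.
self_values = {"lag": 0.8, "threads_running": 120.0, "loadavg": 0.4}
shard_values = [self_values, {"lag": 6.2, "threads_running": 30.0, "loadavg": 0.2}]
scopes = {"lag": "shard", "threads_running": "self", "loadavg": "self"}
assignments = {"all": ["lag"], "batch-job": ["loadavg"]}

batch_ok = check("batch-job", assignments, scopes, self_values, shard_values)
report_ok = check("reporting", assignments, scopes, self_values, shard_values)
```

In this example the hypothetical batch-job app has an explicit assignment (loadavg, self scope) and is accepted, while reporting falls back to the catch-all lag metric at shard scope and is rejected because one tablet in the shard is lagging.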

Allow Cross Cell Promotion in PRS

Until now, users who wanted to promote a replica in a different cell from the current primary using PlannedReparentShard had to specify the new primary with the --new-primary flag.

We have now added a new flag --allow-cross-cell-promotion that lets PlannedReparentShard choose a primary in a different cell even if no new primary is provided explicitly.
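
As a rough mental model of the flag's effect, the candidate set for promotion can be sketched in Python as follows. This is an assumption-laden illustration, not the PlannedReparentShard implementation: the Tablet type, the alias format, and the function promotion_candidates are hypothetical, and the real selection logic considers more than the cell.

```python
from typing import List, NamedTuple


class Tablet(NamedTuple):
    alias: str  # hypothetical alias format
    cell: str


def promotion_candidates(replicas: List[Tablet], primary_cell: str,
                         allow_cross_cell_promotion: bool = False) -> List[Tablet]:
    """Replicas eligible for promotion when no --new-primary is given.

    Without the flag, only replicas in the current primary's cell qualify;
    with it, replicas in any cell qualify.
    """
    if allow_cross_cell_promotion:
        return list(replicas)
    return [t for t in replicas if t.cell == primary_cell]


replicas = [Tablet("zone1-101", "zone1"), Tablet("zone2-201", "zone2")]
same_cell_only = promotion_candidates(replicas, "zone1")
any_cell = promotion_candidates(replicas, "zone1", allow_cross_cell_promotion=True)
```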

Experimental support for recursive CTEs

We have added experimental support for recursive CTEs in Vitess. We are marking it as experimental because it is not yet fully tested and may have some limitations. We are looking for feedback from the community to improve this feature.

VTGate Tablet Balancer

When a VTGate routes a query and has multiple available tablets for a given shard / tablet type (e.g. REPLICA), the current default behavior routes the query with local cell affinity and round robin policy. The VTGate Tablet Balancer provides an alternate mechanism that routes queries to maintain an even distribution of query load to each tablet, while preferentially routing to tablets in the same cell as the VTGate.

The tablet balancer is enabled by a new flag --enable-balancer and configured by --balancer-vtgate-cells and --balancer-keyspaces.

See the RFC for more details on the design and configuration of this feature.

Query Timeout Override

VTGate sends an authoritative query timeout to VTTablet when the QUERY_TIMEOUT_MS comment directive, query_timeout session system variable, or query-timeout flag is set. The order of precedence is: comment directive > session variable > VTGate flag. VTTablet overrides its default query timeout with the value received from VTGate. All timeouts are specified in milliseconds.

When a query is executed inside a transaction, there is an additional nuance. The actual timeout used will be the smaller of the transaction timeout and the query timeout.

A query can also be set to have no timeout by using the QUERY_TIMEOUT_MS comment directive with a value of 0.

Example usage: select /*vt+ QUERY_TIMEOUT_MS=30 */ col from tbl
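
The precedence and transaction rules above can be summarized in a short Python sketch. It is a model of the described behavior, not VTGate code. The function name is hypothetical, None stands for "not set", and two points are assumptions: a resolved value of 0 is treated as "no timeout" regardless of where it came from, and a query with no timeout inside a transaction is still bounded by the transaction timeout.

```python
from typing import Optional


def effective_timeout_ms(comment_ms: Optional[int] = None,
                         session_ms: Optional[int] = None,
                         flag_ms: Optional[int] = None,
                         tx_timeout_ms: Optional[int] = None) -> Optional[int]:
    """Resolve the timeout, in milliseconds, that applies to one query.

    Returns None when no limit applies under this model.
    """
    # The first value that is set wins:
    # comment directive > session variable > VTGate flag.
    query_ms = next(
        (v for v in (comment_ms, session_ms, flag_ms) if v is not None), None
    )
    # QUERY_TIMEOUT_MS=0 means "no timeout"; model that as None.
    if query_ms == 0:
        query_ms = None
    if tx_timeout_ms is None:
        return query_ms
    if query_ms is None:
        return tx_timeout_ms  # assumption: the transaction timeout still applies
    # Inside a transaction, the smaller of the two timeouts is used.
    return min(query_ms, tx_timeout_ms)
```

For the example query above, effective_timeout_ms(comment_ms=30, session_ms=500, flag_ms=1000) resolves to 30 because the comment directive takes precedence.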

New Backup Engine (EXPERIMENTAL)

We are introducing a new backup engine for logical backups in order to support use cases that require something other than physical backups. This feature is experimental and is based on MySQL Shell.

The new engine is enabled with --backup_engine_implementation=mysqlshell. Other options are also required; please read the documentation to learn which ones and how to configure them.

Dynamic VReplication Configuration

Previously, many of the configuration options for VReplication Workflows had to be provided using VTTablet flags. This meant that any change to VReplication configuration required restarting VTTablets. We now allow these to be overridden while creating a workflow or dynamically after the workflow is already in progress.

Reference Table Materialization

There is a new option in Materialize workflows to keep a synced copy of reference or lookup tables (countries, states, zip codes, etc) from an unsharded keyspace, which holds the source of truth for the reference table, to all shards in a sharded keyspace.

New VEXPLAIN Modes: TRACE and KEYS

VEXPLAIN TRACE

The new TRACE mode for VEXPLAIN provides a detailed execution trace of queries, showing how they’re processed through various operators and interactions with tablets. This mode is particularly useful for:

  • Identifying performance bottlenecks
  • Understanding query execution patterns
  • Optimizing complex queries
  • Debugging unexpected query behavior

TRACE mode runs the query and logs all interactions, returning a JSON representation of the query execution plan with additional statistics like number of calls, average rows processed, and number of shards queried.

VEXPLAIN KEYS

The KEYS mode for VEXPLAIN offers a concise summary of query structure, highlighting columns used in joins, filters, and grouping operations. This information is crucial for:

  • Identifying potential sharding key candidates
  • Optimizing query performance
  • Analyzing query patterns to inform database design decisions

KEYS mode analyzes the query structure without executing it, providing JSON output that includes grouping columns, join columns, filter columns (potential candidates for indexes, primary keys, or sharding keys), and the statement type.

These new VEXPLAIN modes enhance Vitess’s query analysis capabilities, allowing for more informed decisions about sharding strategies and query optimization.

Errant GTID Detection on VTTablets

VTTablets now run errant GTID detection before they join the replication stream. If a replica has an errant GTID, it will not start replicating from the primary. This protects against situations that are very difficult to recover from.

For users running with the vitess-operator on Kubernetes, this change means that replica tablets with errant GTIDs will have broken replication and will report as unready. Users will need to manually replace and clean up these errant replica tablets.
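
For readers unfamiliar with the term, an errant GTID is a transaction that exists on a replica but not on the primary. The Python sketch below illustrates that general notion as a set difference, similar in spirit to MySQL's GTID_SUBTRACT function. It is not the detection logic VTTablet uses, which these notes do not describe; the data layout and function names are assumptions.

```python
from typing import Dict, Set

# A GTID set modeled as: server UUID -> set of transaction numbers.
GTIDSet = Dict[str, Set[int]]


def errant_gtids(replica: GTIDSet, primary: GTIDSet) -> GTIDSet:
    """Return transactions present on the replica but absent from the primary."""
    errant: GTIDSet = {}
    for server_uuid, txns in replica.items():
        extra = txns - primary.get(server_uuid, set())
        if extra:
            errant[server_uuid] = extra
    return errant


def may_join_replication(replica: GTIDSet, primary: GTIDSet) -> bool:
    """A replica with any errant GTID must not start replicating."""
    return not errant_gtids(replica, primary)


# Hypothetical server UUIDs, shortened for readability.
primary = {"uuid-a": {1, 2, 3, 4}}
healthy_replica = {"uuid-a": {1, 2, 3}}          # merely behind the primary
errant_replica = {"uuid-a": {1, 2, 3}, "uuid-b": {1}}  # has a local write
```

A replica that is merely behind the primary has no errant GTIDs and may join; a replica carrying a transaction the primary never saw is blocked.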

Automatically Replace MySQL auto_increment Clauses with Vitess Sequences

In https://github.com/vitessio/vitess/pull/16860 we added support for replacing MySQL auto_increment clauses with Vitess Sequences, performing all of the setup and initialization work automatically during the MoveTables workflow. As part of that work we have deprecated the --remove-sharded-auto-increment boolean flag and you should begin using the new --sharded-auto-increment-handling flag instead. Please see the new MoveTables Auto Increment Handling documentation for additional details.

Experimental MySQL 8.4 support

We have added experimental support for MySQL 8.4. It passes the Vitess test suite but is otherwise not yet tested. We are looking for feedback from the community so that support can move out of the experimental phase in a future release.

Current Errant GTIDs Count Metric

A new metric called CurrentErrantGTIDCount has been added to the VTOrc component. This metric shows the current count of the errant GTIDs in the tablets.

vtctldclient ChangeTabletTags command

The vtctldclient command ChangeTabletTags was added to allow the tags of a tablet to be changed dynamically.

Support specifying expected primary in reparents

The EmergencyReparentShard and PlannedReparentShard commands and RPCs now support specifying a primary we expect to still be the current primary in order for a reparent operation to be processed. This allows reparents to be conditional on a specific state being true.
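
Conceptually, this works like a compare-and-swap: the reparent proceeds only if the current primary matches the one the caller expected. A minimal Python sketch of that guard follows. The function, the exception, and the tablet aliases are hypothetical and do not correspond to the actual command flags or RPC fields.

```python
from typing import Optional


class PrimaryMismatchError(Exception):
    """Raised when the current primary is not the one the caller expected."""


def guarded_reparent(current_primary: str, new_primary: str,
                     expected_primary: Optional[str] = None) -> str:
    """Return the primary after the operation, or raise if the guard fails.

    When expected_primary is None, no condition is applied.
    """
    if expected_primary is not None and current_primary != expected_primary:
        raise PrimaryMismatchError(
            f"expected primary {expected_primary}, found {current_primary}"
        )
    return new_primary
```

This kind of guard is useful for automation that decided to reparent based on an earlier view of the shard: if another reparent happened in the meantime, the stale request is rejected instead of being applied to a different topology.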


The entire changelog for this release can be found here.

The release includes 354 merged Pull Requests.

Thanks to all our contributors: @GrahamCampbell, @GuptaManan100, @Utkar5hM, @anshikavashistha, @app/dependabot, @app/vitess-bot, @arthurschreiber, @beingnoble03, @brendar, @cameronmccord2, @chrism1001, @cuishuang, @dbussink, @deepthi, @demmer, @frouioui, @harshit-gangal, @harshitasao, @icyflame, @kirtanchandak, @mattlord, @mattrobenolt, @maxenglander, @mcrauwel, @notfelineit, @perminov, @rafer, @rohit-nayak-ps, @runewake2, @rvrangel, @shanth96, @shlomi-noach, @systay, @timvaillancourt, @vitess-bot

Download links