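Release date: 2024-10-29
Version: v21.0.0

Vitess v21.0.0 has been released with the following major updates and changes. Known issue: with the builtin backup engine, a backup may be reported as successful even though some files failed to be backed up. Major changes include deprecating the VTTablet queryserver-enable-settings-pool flag (to be removed in a future release), deleting several old VTOrc metrics, and replacing some vttablet metrics. A new traffic mirroring feature can be configured by command to copy a given percentage of traffic to a target keyspace. Atomic distributed transactions are supported as an experimental feature, with two transaction modes to choose from. VTGate gains a graceful shutdown option that lets existing connections finish their work. The tablet throttler now supports multiple metrics (such as replication lag and running threads), which can be combined per app. PlannedReparentShard gains an option for cross-cell promotion, and recursive CTEs are supported experimentally. VTGate introduces a tablet balancer to distribute query load more evenly and supports overriding query timeouts. A new experimental logical backup engine based on MySQL Shell is available, and VReplication configuration can now be changed dynamically. Materialize workflows can keep reference tables from an unsharded keyspace in sync across a sharded keyspace. VEXPLAIN adds TRACE and KEYS modes for execution tracing and query structure analysis. MoveTables can automatically replace MySQL auto_increment clauses with Vitess sequences, and the old flag is deprecated. MySQL 8.4 is supported experimentally, and VTOrc adds an errant GTID count metric. vtctldclient adds a command for changing tablet tags, and reparent operations can be made conditional on an expected primary. The release includes 364 merged pull requests; the full changelog is available on GitHub.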

Release notes (original)

Release of Vitess v21.0.0

Summary

Known Issue

Backup reports itself as successful despite failures

In this release, we identified an issue where a backup may be reported as successful even if a file fails to be backed up. This only happens with the Builtin Backup Engine, and only when all files have already been initiated in the backup process. For more details, please refer to the related GitHub Issue https://github.com/vitessio/vitess/issues/17063.

Major Changes

Deprecations and Deletions

Deprecated VTTablet Flags

  • The queryserver-enable-settings-pool flag, added in v15, has been on by default since v17. It is now deprecated and will be removed in a future release.

Deletion of deprecated metrics

The following VTOrc metrics were deprecated in v20. They have now been deleted.

Metric Name
analysis.change.write
audit.write
discoveries.attempt
discoveries.fail
discoveries.instance_poll_seconds_exceeded
discoveries.queue_length
discoveries.recent_count
instance.read
instance.read_topology
emergency_reparent_counts
planned_reparent_counts
reparent_shard_operation_timings

Deprecated Metrics

The following metrics are now deprecated and will be deleted in a future release. Please use their replacements.

Component  Metric Name          Replaced By
vttablet   QueryCacheLength     QueryEnginePlanCacheLength
vttablet   QueryCacheSize       QueryEnginePlanCacheSize
vttablet   QueryCacheCapacity   QueryEnginePlanCacheCapacity
vttablet   QueryCacheEvictions  QueryEnginePlanCacheEvictions
vttablet   QueryCacheHits       QueryEnginePlanCacheHits
vttablet   QueryCacheMisses     QueryEnginePlanCacheMisses
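
When updating dashboards or alert rules, the renames in the table above can be applied mechanically. The following is a minimal Python sketch of such a migration helper. The function name migrate_metric_name and the name SomeOtherMetric are illustrative only and are not part of Vitess.

```python
# Deprecated vttablet metric names and their replacements,
# copied from the table in these release notes.
DEPRECATED_VTTABLET_METRICS = {
    "QueryCacheLength": "QueryEnginePlanCacheLength",
    "QueryCacheSize": "QueryEnginePlanCacheSize",
    "QueryCacheCapacity": "QueryEnginePlanCacheCapacity",
    "QueryCacheEvictions": "QueryEnginePlanCacheEvictions",
    "QueryCacheHits": "QueryEnginePlanCacheHits",
    "QueryCacheMisses": "QueryEnginePlanCacheMisses",
}


def migrate_metric_name(name: str) -> str:
    """Return the replacement for a deprecated metric, or the name unchanged."""
    return DEPRECATED_VTTABLET_METRICS.get(name, name)


if __name__ == "__main__":
    # "SomeOtherMetric" is a made-up name that is not in the table.
    for old in ("QueryCacheHits", "SomeOtherMetric"):
        print(old, "->", migrate_metric_name(old))
```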

Traffic Mirroring

Traffic mirroring is intended to help reduce some of the uncertainty inherent to MoveTables SwitchTraffic. When traffic mirroring is enabled, VTGate will mirror a percentage of traffic from one keyspace to another.

Mirror rules may be enabled through vtctldclient with MoveTables MirrorTraffic. For example:

$ vtctldclient --server :15999 MoveTables --target-keyspace customer --workflow commerce2customer MirrorTraffic --percent 5.0

Mirror rules can be inspected with GetMirrorRules.
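
To build intuition for what a percentage-based mirror rule means, here is a small Python simulation. It is not VTGate's implementation: should_mirror and mirrored_fraction are hypothetical helpers, and the sketch assumes each query is sampled independently with probability percent / 100.

```python
import random


def should_mirror(percent: float, rng: random.Random) -> bool:
    """Decide whether one query is mirrored, given a rule percentage (0-100)."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    # rng.random() is uniform in [0.0, 1.0), so percent=0 never mirrors
    # and percent=100 always mirrors.
    return rng.random() * 100.0 < percent


def mirrored_fraction(percent: float, queries: int, seed: int = 0) -> float:
    """Simulate `queries` independent decisions and return the mirrored share."""
    rng = random.Random(seed)
    mirrored = sum(should_mirror(percent, rng) for _ in range(queries))
    return mirrored / queries


if __name__ == "__main__":
    # With a 5.0 percent rule, the mirrored share should be close to 0.05.
    print(f"{mirrored_fraction(5.0, 100_000):.3f}")
```

Under this assumption, the mirrored share converges on the configured percentage as the number of queries grows, which is why a small percentage is enough to exercise the target keyspace without doubling its load.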

Atomic Distributed Transaction Support

We have introduced atomic distributed transactions as an experimental feature. Users can now run multi-shard transactions with stronger guarantees. Vitess now provides two modes of transactional guarantees for multi-shard transactions: Best Effort and Atomic. These can be selected based on the user’s requirements and the trade-offs they are willing to make.

Follow the documentation to enable atomic distributed transactions.

For more details on the implementation and trade-offs, please refer to the RFC

New VTGate Shutdown Behavior

We added a new option to VTGate that disallows new connections during shutdown while letting existing connections finish their work. Existing connections stay open until they disconnect on their own or until the --onterm_timeout is reached, and they do not receive a Server shutdown in progress error.

This new behavior can be enabled by specifying the new --mysql-server-drain-onterm flag to VTGate.

You can find more information about this option in the RFC.
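
The drain behavior described above can be pictured with a toy state machine. This is a sketch under stated assumptions, not VTGate code: the DrainingServer class and its method names are hypothetical, and elapsed time is passed in explicitly instead of being read from a clock.

```python
class DrainingServer:
    """Toy model of a server that drains connections on shutdown."""

    def __init__(self, onterm_timeout_s: float) -> None:
        self.onterm_timeout_s = onterm_timeout_s
        self.draining = False
        self.open_connections = set()

    def accept(self, conn_id: str) -> bool:
        """Accept a new connection unless shutdown has started."""
        if self.draining:
            return False
        self.open_connections.add(conn_id)
        return True

    def close(self, conn_id: str) -> None:
        """Record that a client disconnected on its own."""
        self.open_connections.discard(conn_id)

    def begin_shutdown(self) -> None:
        """Start draining: existing connections keep working."""
        self.draining = True

    def may_exit(self, elapsed_s: float) -> bool:
        """Exit once all connections are gone or the timeout has passed."""
        if not self.draining:
            return False
        return not self.open_connections or elapsed_s >= self.onterm_timeout_s


if __name__ == "__main__":
    server = DrainingServer(onterm_timeout_s=30.0)
    server.accept("conn-1")
    server.begin_shutdown()
    print(server.accept("conn-2"))   # new connections are refused
    print(server.may_exit(1.0))      # conn-1 is still open
    server.close("conn-1")
    print(server.may_exit(1.0))      # nothing left to drain
```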

Tablet Throttler: Multi-Metric support

Up until v20, the tablet throttler would only monitor and use a single metric. That would be replication lag, by default, or could be the result of a custom query. In this release, we introduce a major redesign so that the throttler monitors and uses multiple metrics at the same time, including the above two.

The default behavior now is to monitor all metrics, but only use lag (if the custom query is undefined) or the custom metric (if the custom query is defined). This is backwards-compatible with v20. A v20 PRIMARY is compatible with a v21 REPLICA, and a v21 PRIMARY is compatible with a v20 REPLICA.

However, it is now possible to assign any combination of one or more metrics to a given app. The throttler then accepts or rejects the app’s requests based on the health of all assigned metrics. We have provided a pre-defined list of metrics:

  • lag: replication lag based on heartbeat injection.
  • threads_running: concurrent active threads on the MySQL server.
  • loadavg: per core load average measured on the tablet instance/pod.
  • custom: the result of a custom query executed on the MySQL server.

Each metric has a default threshold which can be overridden by the UpdateThrottlerConfig command.

The throttler also supports the catch-all "all" app name, and it is thus possible to assign metrics to all apps. Explicit app to metric assignments will override the catch-all configuration.

Metrics are assigned a default scope, which could be self (isolated to the tablet) or shard (max, aka worst value among shard tablets). It is further possible to require a different scope for each metric.
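
The assignment rules above (explicit app assignment, then the "all" catch-all, then the v20-compatible default) can be sketched in Python. This is an illustration, not the throttler's code: the function names, the app names, and the threshold values are hypothetical, scopes are ignored, and the sketch assumes a metric is healthy when its value is at or below its threshold.

```python
from typing import Dict, List


def metrics_for_app(app: str, assignments: Dict[str, List[str]],
                    custom_query_defined: bool = False) -> List[str]:
    """Pick the metrics that apply to `app`."""
    if app in assignments:       # explicit assignment overrides the catch-all
        return assignments[app]
    if "all" in assignments:     # catch-all assignment
        return assignments["all"]
    # Default: custom if a custom query is defined, otherwise lag.
    return ["custom"] if custom_query_defined else ["lag"]


def accepts_request(app: str, values: Dict[str, float],
                    thresholds: Dict[str, float],
                    assignments: Dict[str, List[str]],
                    custom_query_defined: bool = False) -> bool:
    """Accept only if every assigned metric is at or below its threshold."""
    names = metrics_for_app(app, assignments, custom_query_defined)
    return all(values[name] <= thresholds[name] for name in names)


if __name__ == "__main__":
    # Made-up thresholds and readings, for illustration only.
    thresholds = {"lag": 5.0, "threads_running": 100.0, "loadavg": 1.0}
    values = {"lag": 1.2, "threads_running": 250.0, "loadavg": 0.4}
    assignments = {"all": ["lag"], "batch-app": ["lag", "threads_running"]}
    print(accepts_request("web-app", values, thresholds, assignments))
    print(accepts_request("batch-app", values, thresholds, assignments))
```

In this example the hypothetical web-app falls back to the catch-all and is judged on lag alone, while batch-app is also judged on threads_running and is rejected because that reading exceeds its threshold.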

Allow Cross Cell Promotion in PRS

Until now, users who wanted to promote a replica in a different cell from the current primary using PlannedReparentShard had to specify the new primary with the --new-primary flag.

We have now added a new flag --allow-cross-cell-promotion that lets PlannedReparentShard choose a primary in a different cell even if no new primary is provided explicitly.
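
As a rough mental model of the flag's effect, the Python sketch below filters promotion candidates by cell. It is not the PlannedReparentShard selection logic, which considers more than the cell. The function name, the dictionary layout, and the tablet aliases are hypothetical.

```python
from typing import Dict, List


def promotion_candidates(replicas: List[Dict[str, str]], primary_cell: str,
                         allow_cross_cell_promotion: bool = False) -> List[str]:
    """Return aliases of replicas eligible for automatic promotion."""
    return [
        replica["alias"]
        for replica in replicas
        # Without the flag, only replicas in the primary's cell qualify.
        if allow_cross_cell_promotion or replica["cell"] == primary_cell
    ]


if __name__ == "__main__":
    # Made-up tablets in two cells.
    replicas = [
        {"alias": "zone1-101", "cell": "zone1"},
        {"alias": "zone2-201", "cell": "zone2"},
    ]
    print(promotion_candidates(replicas, "zone1"))
    print(promotion_candidates(replicas, "zone1", allow_cross_cell_promotion=True))
```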

Experimental support for recursive CTEs

We have added experimental support for recursive CTEs in Vitess. We are marking it as experimental because it is not yet fully tested and may have some limitations. We are looking for feedback from the community to improve this feature.

VTGate Tablet Balancer

When a VTGate routes a query and has multiple available tablets for a given shard / tablet type (e.g. REPLICA), the current default behavior routes the query with local cell affinity and round robin policy. The VTGate Tablet Balancer provides an alternate mechanism that routes queries to maintain an even distribution of query load to each tablet, while preferentially routing to tablets in the same cell as the VTGate.

The tablet balancer is enabled by a new flag --enable-balancer and configured by --balancer-vtgate-cells and --balancer-keyspaces.

See the RFC for more details on the design and configuration of this feature.

Query Timeout Override

VTGate sends an authoritative query timeout to VTTablet when the QUERY_TIMEOUT_MS comment directive, query_timeout session system variable, or query-timeout flag is set. The order of precedence is: comment directive > session variable > VTGate flag. VTTablet overrides its default query timeout with the value received from VTGate. All timeouts are specified in milliseconds.

When a query is executed inside a transaction, there is an additional nuance. The actual timeout used will be the smaller of the transaction timeout and the query timeout.

A query can also be set to have no timeout by using the QUERY_TIMEOUT_MS comment directive with a value of 0.

Example usage: select /*vt+ QUERY_TIMEOUT_MS=30 */ col from tbl
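
The precedence and transaction rules above can be written out as a short Python sketch. The function names are hypothetical, and None is used here to mean "not set", which is an assumption of the sketch. How a value of 0 (no timeout) combines with a transaction timeout is not covered by these notes, so the sketch does not model it.

```python
from typing import Optional


def resolve_query_timeout_ms(directive_ms: Optional[int],
                             session_ms: Optional[int],
                             flag_ms: Optional[int]) -> Optional[int]:
    """Apply the precedence: comment directive > session variable > VTGate flag.

    Returns None when nothing is set. A directive value of 0 is returned
    as 0, which means the query has no timeout.
    """
    for value in (directive_ms, session_ms, flag_ms):
        if value is not None:
            return value
    return None


def timeout_in_transaction_ms(transaction_timeout_ms: int,
                              query_timeout_ms: int) -> int:
    """Inside a transaction, the smaller of the two timeouts applies."""
    return min(transaction_timeout_ms, query_timeout_ms)


if __name__ == "__main__":
    # The directive from the example usage wins over the other two sources.
    print(resolve_query_timeout_ms(30, 500, 1000))
    # Only the VTGate flag is set.
    print(resolve_query_timeout_ms(None, None, 1000))
    # A 30 ms query timeout inside a transaction with a 30 s timeout.
    print(timeout_in_transaction_ms(30_000, 30))
```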

New Backup Engine (EXPERIMENTAL)

We are introducing a new backup engine for logical backups in order to support use cases that require something other than physical backups. This feature is experimental and is based on MySQL Shell.

The new engine is enabled by using --backup_engine_implementation=mysqlshell. There are other options that are required, so please read the documentation to learn which options are required and how to configure them.

Dynamic VReplication Configuration

Previously, many of the configuration options for VReplication Workflows had to be provided using VTTablet flags. This meant that any change to VReplication configuration required restarting VTTablets. We now allow these to be overridden while creating a workflow or dynamically after the workflow is already in progress.

Reference Table Materialization

There is a new option in Materialize workflows to keep a synced copy of reference or lookup tables (countries, states, zip codes, etc.) in all shards of a sharded keyspace. The copy is sourced from an unsharded keyspace, which holds the source of truth for the reference table.

New VEXPLAIN Modes: TRACE and KEYS

VEXPLAIN TRACE

The new TRACE mode for VEXPLAIN provides a detailed execution trace of queries, showing how they’re processed through various operators and interactions with tablets. This mode is particularly useful for:

  • Identifying performance bottlenecks
  • Understanding query execution patterns
  • Optimizing complex queries
  • Debugging unexpected query behavior

TRACE mode runs the query and logs all interactions, returning a JSON representation of the query execution plan with additional statistics like number of calls, average rows processed, and number of shards queried.

VEXPLAIN KEYS

The KEYS mode for VEXPLAIN offers a concise summary of query structure, highlighting columns used in joins, filters, and grouping operations. This information is crucial for:

  • Identifying potential sharding key candidates
  • Optimizing query performance
  • Analyzing query patterns to inform database design decisions

KEYS mode analyzes the query structure without executing it, providing JSON output that includes grouping columns, join columns, filter columns (potential candidates for indexes, primary keys, or sharding keys), and the statement type.

These new VEXPLAIN modes enhance Vitess’s query analysis capabilities, allowing for more informed decisions about sharding strategies and query optimization.

Automatically Replace MySQL auto_increment Clauses with Vitess Sequences

In https://github.com/vitessio/vitess/pull/16860 we added support for replacing MySQL auto_increment clauses with Vitess Sequences, performing all of the setup and initialization work automatically during the MoveTables workflow. As part of that work we have deprecated the --remove-sharded-auto-increment boolean flag and you should begin using the new --sharded-auto-increment-handling flag instead. Please see the new MoveTables Auto Increment Handling documentation for additional details.

Experimental MySQL 8.4 support

We have added experimental support for MySQL 8.4. It passes the Vitess test suite, but it is otherwise not yet tested. We are looking for feedback from the community so that we can move this support out of the experimental phase in a future release.

Current Errant GTIDs Count Metric

A new metric called CurrentErrantGTIDCount has been added to the VTOrc component. This metric shows the current count of the errant GTIDs in the tablets.

vtctldclient ChangeTabletTags command

The vtctldclient command ChangeTabletTags was added to allow the tags of a tablet to be changed dynamically.

Support specifying expected primary in reparents

The EmergencyReparentShard and PlannedReparentShard commands and RPCs now support specifying a primary we expect to still be the current primary in order for a reparent operation to be processed. This allows reparents to be conditional on a specific state being true.
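
The expected-primary condition works like a compare-and-set guard: the operation proceeds only if the current state matches what the caller believes it to be. Below is a minimal Python sketch of that idea. The exception, the function name, and the tablet aliases are hypothetical, and this is not the Vitess reparent code.

```python
from typing import Optional


class ExpectedPrimaryMismatch(Exception):
    """Raised when the current primary is not the one the caller expected."""


def reparent_if_expected(current_primary: str, new_primary: str,
                         expected_primary: Optional[str] = None) -> str:
    """Return the new primary, refusing to act if the expectation fails."""
    if expected_primary is not None and current_primary != expected_primary:
        raise ExpectedPrimaryMismatch(
            f"expected primary {expected_primary}, found {current_primary}"
        )
    return new_primary


if __name__ == "__main__":
    # The expectation holds, so the reparent proceeds.
    print(reparent_if_expected("zone1-100", "zone1-101", expected_primary="zone1-100"))
    # The primary changed in the meantime, so the reparent is refused.
    try:
        reparent_if_expected("zone1-102", "zone1-101", expected_primary="zone1-100")
    except ExpectedPrimaryMismatch as err:
        print("refused:", err)
```

A guard like this helps when two operators or automation loops race: the second one fails fast instead of reparenting a shard whose primary has already changed.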


The entire changelog for this release can be found here.

The release includes 364 merged Pull Requests.

Thanks to all our contributors: @GrahamCampbell, @GuptaManan100, @Utkar5hM, @anshikavashistha, @app/dependabot, @app/vitess-bot, @arthurschreiber, @beingnoble03, @brendar, @cameronmccord2, @chrism1001, @cuishuang, @dbussink, @deepthi, @demmer, @frouioui, @harshit-gangal, @harshitasao, @icyflame, @kirtanchandak, @mattlord, @mattrobenolt, @maxenglander, @mcrauwel, @notfelineit, @perminov, @rafer, @rohit-nayak-ps, @runewake2, @rvrangel, @shanth96, @shlomi-noach, @systay, @timvaillancourt, @vitess-bot

Download links