Release date: 2024-10-15
Version: v21.0.0-rc1

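Vitess v21.0.0 includes the following main updates. Traffic mirroring is new: VTGate can copy a specified percentage of traffic to a target keyspace to lower migration risk. A new VTGate shutdown behavior, enabled with the --mysql-server-drain-onterm flag, provides a graceful shutdown in which existing connections can finish their requests. The tablet throttler supports multiple metrics (replication lag, running threads, load average, and others) and lets each app be assigned its own combination of metrics. PlannedReparentShard can now promote a primary across cells (requires --allow-cross-cell-promotion). Recursive CTE queries have experimental support, and a new VTGate tablet balancer is available (enabled with --enable-balancer). The query timeout mechanism is extended so that the timeout can be set dynamically through a comment directive, a session variable, or a flag. An experimental logical backup engine based on MySQL Shell is introduced (requires --backup_engine_implementation=mysqlshell). Dynamic VReplication configuration allows parameters to be adjusted at runtime without restarting VTTablet. Reference table materialization is new and syncs reference tables from an unsharded keyspace to a sharded keyspace. VEXPLAIN gains TRACE and KEYS modes, for query execution tracing and query structure analysis respectively. VTTablet adds errant GTID detection so that a replica with errant GTIDs does not join the replication stream. MySQL auto_increment columns can be replaced automatically with Vitess sequences; the old flag is deprecated in favor of --sharded-auto-increment-handling. MySQL 8.4 has experimental support, the new VTOrc metric CurrentErrantGTIDCount tracks the number of errant GTIDs, and the new vtctldclient ChangeTabletTags command changes tablet tags dynamically. In addition, one VTTablet flag is deprecated, previously deprecated VTOrc metrics are deleted, and several vttablet metrics are deprecated with their replacements listed. Other changes include various stability improvements.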

Release notes (translated)

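Release of Vitess v21.0.0

Summary

Table of Contents

<a id="major-changes">Major Changes</a>

<a id="deprecations-and-deletions">Deprecations and Deletions</a>

<a id="vttablet-flags">Deprecated VTTablet Flags</a>

  • The queryserver-enable-settings-pool flag, added in v15, has been on by default since v17. It is now deprecated and will be removed in a future release.

<a id="metric-deletion">Deletion of Deprecated Metrics</a>

The following VTOrc metrics were deprecated in v20 and have now been deleted:

Metric Name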
analysis.change.write
audit.write
discoveries.attempt
discoveries.fail
discoveries.instance_poll_seconds_exceeded
discoveries.queue_length
discoveries.recent_count
instance.read
instance.read_topology
emergency_reparent_counts
planned_reparent_counts
reparent_shard_operation_timings

<a id="deprecations-metrics">Deprecated Metrics</a>

The following metrics are now deprecated and will be deleted in a future release. Please use their replacements:

Component | Metric Name | Replaced By
vttablet | QueryCacheLength | QueryEnginePlanCacheLength
vttablet | QueryCacheSize | QueryEnginePlanCacheSize
vttablet | QueryCacheCapacity | QueryEnginePlanCacheCapacity
vttablet | QueryCacheEvictions | QueryEnginePlanCacheEvictions
vttablet | QueryCacheHits | QueryEnginePlanCacheHits
vttablet | QueryCacheMisses | QueryEnginePlanCacheMisses

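<a id="traffic-mirroring">Traffic Mirroring</a>

Traffic mirroring is intended to reduce some of the uncertainty inherent to MoveTables SwitchTraffic. When it is enabled, VTGate mirrors a specified percentage of traffic from one keyspace to another.

Mirror rules can be enabled through vtctldclient with MoveTables MirrorTraffic. For example:

$ vtctldclient --server :15999 MoveTables --target-keyspace customer --workflow commerce2customer MirrorTraffic --percent 5.0

Mirror rules can be inspected with GetMirrorRules.

<a id="new-vtgate-shutdown-behavior">New VTGate Shutdown Behavior</a>

VTGate has a new option that disallows new connections while it is shutting down but lets existing connections finish their work until they disconnect or until --onterm_timeout is reached, without receiving a Server shutdown in progress error.

This behavior is enabled by passing the new --mysql-server-drain-onterm flag to VTGate.

More information about this option is available in the RFC.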

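<a id="tablet-throttler">Tablet Throttler: Multi-Metric Support</a>

Up to and including v20, the tablet throttler monitored and used a single metric: replication lag by default, or the result of a custom query. This release is a major redesign: the throttler now monitors and uses multiple metrics at the same time.

The default behavior is now to monitor all metrics but to use only lag (if no custom query is defined) or the custom metric (if a custom query is defined). This is backwards-compatible with v20. A v20 PRIMARY is compatible with a v21 REPLICA, and a v21 PRIMARY is compatible with a v20 REPLICA.

Any combination of one or more metrics can now be assigned to a given app. The throttler then accepts or rejects the app's requests based on the health of all assigned metrics. The pre-defined metrics are:

  • lag: replication lag based on heartbeat injection.
  • threads_running: concurrent active threads on the MySQL server.
  • loadavg: per-core load average measured on the tablet instance/pod.
  • custom: the result of a custom query executed on the MySQL server.

Each metric has a default threshold, which can be overridden with UpdateThrottlerConfig.

The throttler also supports the catch-all "all" app name, which assigns metrics to all apps. Explicit app-to-metric assignments override the catch-all configuration.

Each metric has a default scope, either self (isolated to the tablet) or shard (the maximum, that is the worst, value among the shard's tablets). A different scope can be required for each metric.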

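<a id="allow-cross-cell">Allow Cross Cell Promotion in PRS</a>

Previously, promoting a replica in a different cell from the current primary with PlannedReparentShard required naming the new primary explicitly with --new-primary. The new --allow-cross-cell-promotion flag lets PlannedReparentShard choose a primary in a different cell even when no new primary is given explicitly.

<a id="recursive-cte">Experimental Support for Recursive CTEs</a>

Experimental support for recursive common table expressions (CTEs) has been added. It is not yet fully tested and may have limitations. Community feedback to improve this feature is welcome.

<a id="tablet-balancer">VTGate Tablet Balancer</a>

When VTGate routes a query and several tablets are available for a given shard and tablet type (for example REPLICA), the default behavior uses local cell affinity with a round robin policy. The new tablet balancer is an alternative: it keeps query load evenly distributed across tablets while still preferring tablets in the same cell as the VTGate.

It is enabled with the new --enable-balancer flag and configured with --balancer-vtgate-cells and --balancer-keyspaces.

See the RFC for details on the design and configuration.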

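<a id="query-timeout">Query Timeout Override</a>

VTGate sends an authoritative query timeout to VTTablet when the QUERY_TIMEOUT_MS comment directive, the query_timeout session variable, or the query-timeout flag is set. The order of precedence is: comment directive > session variable > VTGate flag. VTTablet overrides its default query timeout with this value. All timeouts are in milliseconds.

For a query inside a transaction, the actual timeout is the smaller of the transaction timeout and the query timeout.

A query can be given no timeout by using the comment directive QUERY_TIMEOUT_MS=0.

Example usage: select /*vt+ QUERY_TIMEOUT_MS=30 */ col from tbl

<a id="new-backup-engine">New Backup Engine (Experimental)</a>

To support logical backup use cases, a new experimental backup engine based on MySQL Shell has been added.

It is enabled with --backup_engine_implementation=mysqlshell. Other options are also required; see the documentation for details.

<a id="dynamic-vreplication-configuration">Dynamic VReplication Configuration</a>

Previously, most configuration for VReplication workflows had to be set through VTTablet flags, so changes required restarting VTTablet. Configuration can now be overridden when a workflow is created or dynamically while it is running.

<a id="reference-table-materialization">Reference Table Materialization</a>

Materialize workflows have a new option to sync reference or lookup tables from an unsharded keyspace (the source of truth for the reference table) to all shards of a sharded keyspace.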

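<a id="new-vexplain-modes">New VEXPLAIN Modes: TRACE and KEYS</a>

VEXPLAIN TRACE

TRACE mode provides a detailed execution trace of a query, showing how it is processed through the various operators and how it interacts with tablets. It is useful for:

  • Identifying performance bottlenecks
  • Understanding query execution patterns
  • Optimizing complex queries
  • Debugging unexpected query behavior

This mode actually runs the query and logs all interactions. It returns a JSON execution plan with statistics such as number of calls, average rows processed, and number of shards queried.

VEXPLAIN KEYS

KEYS mode provides a concise summary of query structure, highlighting the columns used in joins, filters, and grouping operations. It is useful for:

  • Identifying potential sharding key candidates
  • Optimizing query performance
  • Analyzing query patterns to inform database design decisions

This mode does not execute the query. It returns JSON output that includes grouping columns, join columns, filter columns (potential candidates for indexes, primary keys, or sharding keys), and the statement type.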

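<a id="errant-gtid-vttablet">Errant GTID Detection on VTTablets</a>

VTTablets now run errant GTID detection before they join the replication stream. A replica that has an errant GTID will not start replicating, which avoids situations that are very difficult to recover from.

Users running the vitess-operator on Kubernetes should note that replica tablets with errant GTIDs will have broken replication and will report as unready. They need to be replaced and cleaned up manually.

<a id="auto-replace-mysql-autoinc-with-seq">Automatically Replace MySQL auto_increment Clauses with Vitess Sequences</a>

PR 16860 added support for automatically replacing MySQL auto_increment clauses with Vitess sequences during a MoveTables workflow. The --remove-sharded-auto-increment boolean flag is deprecated; use the new --sharded-auto-increment-handling flag instead. See the documentation for details.

<a id="experimental-mysql-84">Experimental MySQL 8.4 Support</a>

Experimental support for MySQL 8.4 has been added. It passes the Vitess test suite but has not otherwise been tested. Community feedback to improve this support is welcome.

<a id="errant-gtid-metric">Current Errant GTIDs Count Metric</a>

A new CurrentErrantGTIDCount metric has been added to the VTOrc component. It shows the current count of errant GTIDs in the tablets.

<a id="vtctldclient-changetablettags">vtctldclient ChangeTabletTags Command</a>

The new vtctldclient ChangeTabletTags command allows the tags of a tablet to be changed dynamically.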


The entire changelog for this release can be found here.

The release includes 338 merged Pull Requests.

Thanks to all our contributors: @GrahamCampbell, @GuptaManan100, @Utkar5hM, @anshikavashistha, @app/dependabot, @app/vitess-bot, @arthurschreiber, @beingnoble03, @brendar, @cameronmccord2, @chrism1001, @cuishuang, @dbussink, @deepthi, @demmer, @frouioui, @harshit-gangal, @harshitasao, @icyflame, @kirtanchandak, @mattlord, @mattrobenolt, @maxenglander, @mcrauwel, @notfelineit, @perminov, @rafer, @rohit-nayak-ps, @runewake2, @rvrangel, @shanth96, @shlomi-noach, @systay, @timvaillancourt, @vitess-bot

Release notes (original)

Release of Vitess v21.0.0

Summary

Table of Contents

Major Changes

Deprecations and Deletions

Deprecated VTTablet Flags

  • The queryserver-enable-settings-pool flag, added in v15, has been on by default since v17. It is now deprecated and will be removed in a future release.

Deletion of deprecated metrics

The following VTOrc metrics were deprecated in v20. They have now been deleted.

Metric Name
analysis.change.write
audit.write
discoveries.attempt
discoveries.fail
discoveries.instance_poll_seconds_exceeded
discoveries.queue_length
discoveries.recent_count
instance.read
instance.read_topology
emergency_reparent_counts
planned_reparent_counts
reparent_shard_operation_timings

Deprecated Metrics

The following metrics are now deprecated and will be deleted in a future release. Please use their replacements.

Component | Metric Name | Replaced By
vttablet | QueryCacheLength | QueryEnginePlanCacheLength
vttablet | QueryCacheSize | QueryEnginePlanCacheSize
vttablet | QueryCacheCapacity | QueryEnginePlanCacheCapacity
vttablet | QueryCacheEvictions | QueryEnginePlanCacheEvictions
vttablet | QueryCacheHits | QueryEnginePlanCacheHits
vttablet | QueryCacheMisses | QueryEnginePlanCacheMisses
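
The six renames follow one pattern: the QueryCache prefix becomes QueryEnginePlanCache. A minimal Python sketch of a lookup that could be used when updating dashboards or alert rules, assuming those rules reference the metrics by their bare names; the helper name replacement_for is hypothetical and not part of Vitess:

```python
# Deprecated vttablet metric -> replacement, copied from the table above.
RENAMED_METRICS = {
    "QueryCacheLength": "QueryEnginePlanCacheLength",
    "QueryCacheSize": "QueryEnginePlanCacheSize",
    "QueryCacheCapacity": "QueryEnginePlanCacheCapacity",
    "QueryCacheEvictions": "QueryEnginePlanCacheEvictions",
    "QueryCacheHits": "QueryEnginePlanCacheHits",
    "QueryCacheMisses": "QueryEnginePlanCacheMisses",
}


def replacement_for(metric_name: str) -> str:
    """Return the replacement for a deprecated metric, or the name unchanged."""
    return RENAMED_METRICS.get(metric_name, metric_name)


print(replacement_for("QueryCacheHits"))
```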

Traffic Mirroring

Traffic mirroring is intended to help reduce some of the uncertainty inherent to MoveTables SwitchTraffic. When traffic mirroring is enabled, VTGate will mirror a percentage of traffic from one keyspace to another.

Mirror rules may be enabled through vtctldclient with MoveTables MirrorTraffic. For example:

$ vtctldclient --server :15999 MoveTables --target-keyspace customer --workflow commerce2customer MirrorTraffic --percent 5.0

Mirror rules can be inspected with GetMirrorRules.
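
To give a feel for what --percent 5.0 means in practice, here is a small Python simulation. It assumes each query is sampled independently with the configured probability; the notes do not say how VTGate actually selects queries to mirror, so should_mirror is a hypothetical stand-in rather than Vitess code:

```python
import random


def should_mirror(percent: float, rng: random.Random) -> bool:
    """Decide whether one query is mirrored, for a mirror percentage in [0, 100]."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    # rng.random() is uniform in [0.0, 1.0), so percent=0 never mirrors
    # and percent=100 always mirrors.
    return rng.random() * 100.0 < percent


rng = random.Random(42)  # fixed seed so repeated runs give the same count
mirrored = sum(should_mirror(5.0, rng) for _ in range(100_000))
print(f"mirrored {mirrored} of 100000 queries")
```

Under this assumption roughly one query in twenty is duplicated to the target keyspace, which is the extra load to plan for before raising the percentage.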

New VTGate Shutdown Behavior

We added a new option to VTGate to disallow new connections while VTGate is shutting down, while allowing existing connections to finish their work until they manually disconnect or until the --onterm_timeout is reached, without getting a Server shutdown in progress error.

This new behavior can be enabled by specifying the new --mysql-server-drain-onterm flag to VTGate.

You can find more information about this option in the RFC.
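
The interaction between draining and --onterm_timeout can be sketched with a small Python model. This is an illustration of the behavior described above, not VTGate's implementation; the function name drain and its inputs are assumptions made for the example:

```python
def drain(disconnect_times: list[float], onterm_timeout: float) -> tuple[float, int]:
    """Model a drain that starts at t=0, when shutdown begins.

    disconnect_times holds, for each existing connection, the number of
    seconds after shutdown begins at which it would disconnect on its own.
    Returns (seconds until shutdown completes, connections cut off by the timeout).
    """
    if not disconnect_times:
        return 0.0, 0
    last = max(disconnect_times)
    if last <= onterm_timeout:
        # Every connection finished its work before the timeout.
        return last, 0
    cut_off = sum(1 for t in disconnect_times if t > onterm_timeout)
    return onterm_timeout, cut_off


print(drain([1.0, 2.0, 3.0], onterm_timeout=10.0))
print(drain([1.0, 20.0, 30.0], onterm_timeout=10.0))
```

In the second call two connections outlive the timeout, so shutdown completes at the timeout and those two are closed without having finished.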

Tablet Throttler: Multi-Metric support

Up until v20, the tablet throttler would only monitor and use a single metric. That would be replication lag, by default, or could be the result of a custom query. In this release, we introduce a major redesign so that the throttler monitors and uses multiple metrics at the same time, including the above two.

The default behavior now is to monitor all metrics, but only use lag (if the custom query is undefined) or the custom metric (if the custom query is defined). This is backwards-compatible with v20. A v20 PRIMARY is compatible with a v21 REPLICA, and a v21 PRIMARY is compatible with a v20 REPLICA.

However, it is now possible to assign any combination of one or more metrics for a given app. The throttler would then accept or reject the app’s requests based on the health of all assigned metrics. We have provided a pre-defined list of metrics:

  • lag: replication lag based on heartbeat injection.
  • threads_running: concurrent active threads on the MySQL server.
  • loadavg: per core load average measured on the tablet instance/pod.
  • custom: the result of a custom query executed on the MySQL server.

Each metric has a default threshold which can be overridden by the UpdateThrottlerConfig command.

The throttler also supports the catch-all "all" app name, and it is thus possible to assign metrics to all apps. Explicit app to metric assignments will override the catch-all configuration.

Metrics are assigned a default scope, which could be self (isolated to the tablet) or shard (max, aka worst value among shard tablets). It is further possible to require a different scope for each metric.
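
The assignment rules above (per-app metrics, the "all" catch-all, and shard scope as the worst value) can be sketched in Python. This is a simplified model, not the throttler's code: the class and function names are hypothetical, the values and thresholds are made up, and it assumes a metric is healthy while its value is at or below its threshold.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    value: float
    threshold: float

    def healthy(self) -> bool:
        # Assumption: healthy while the value is at or below the threshold.
        return self.value <= self.threshold


def shard_scope(values: list[float]) -> float:
    """Shard scope: the max (worst) value among the shard's tablets."""
    return max(values)


def metrics_for(app: str, assignments: dict[str, list[str]], default=("lag",)):
    """An explicit app assignment overrides the catch-all "all" entry."""
    if app in assignments:
        return assignments[app]
    return assignments.get("all", default)


def check(app: str, metrics: dict[str, Metric], assignments: dict[str, list[str]]) -> bool:
    """Accept the request only if every metric assigned to the app is healthy."""
    return all(metrics[name].healthy() for name in metrics_for(app, assignments))


# Example values and thresholds, invented for illustration.
metrics = {
    "lag": Metric(value=shard_scope([0.4, 2.5, 0.9]), threshold=5.0),
    "threads_running": Metric(value=120, threshold=100),
    "loadavg": Metric(value=0.7, threshold=1.0),
}
assignments = {"all": ["lag"], "batch-job": ["lag", "threads_running"]}

print(check("batch-job", metrics, assignments))  # False: threads_running is over its threshold
print(check("web", metrics, assignments))        # True: falls back to "all", which uses lag only
```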

Allow Cross Cell Promotion in PRS

Previously, to promote a replica in a different cell from the current primary with PlannedReparentShard, users had to specify the new primary with the --new-primary flag.

We have now added a new flag --allow-cross-cell-promotion that lets PlannedReparentShard choose a primary in a different cell even if no new primary is provided explicitly.
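
The effect of the flag on which replicas are eligible can be sketched as a filter. This Python model covers only the cell restriction described above; any other criteria PlannedReparentShard applies when choosing a primary are outside the sketch, and the function name and tablet aliases are invented for the example:

```python
from typing import Optional


def eligible_candidates(
    replicas: list[tuple[str, str]],  # (tablet alias, cell)
    primary_cell: str,
    new_primary: Optional[str] = None,
    allow_cross_cell_promotion: bool = False,
) -> list[str]:
    """Return the aliases that could be promoted under the described rules."""
    if new_primary is not None:
        # An explicitly named primary is used regardless of its cell.
        return [alias for alias, _ in replicas if alias == new_primary]
    if allow_cross_cell_promotion:
        return [alias for alias, _ in replicas]
    return [alias for alias, cell in replicas if cell == primary_cell]


replicas = [("zone1-101", "zone1"), ("zone2-201", "zone2")]
print(eligible_candidates(replicas, "zone1"))
print(eligible_candidates(replicas, "zone1", allow_cross_cell_promotion=True))
print(eligible_candidates(replicas, "zone1", new_primary="zone2-201"))
```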

Experimental support for recursive CTEs

We have added experimental support for recursive CTEs in Vitess. We are marking it as experimental because it is not yet fully tested and may have some limitations. We are looking for feedback from the community to improve this feature.
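
For readers unfamiliar with the construct, the following Python snippet runs a typical recursive CTE that walks an employee hierarchy. It uses the standard library's SQLite driver purely as a stand-in to show the query shape; the emp table is invented for the example, and because the Vitess support is experimental, a query of this shape may hit limitations there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, manager_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, None, "ceo"), (2, 1, "vp"), (3, 2, "eng"), (4, 1, "cfo")],
)

# The anchor member selects the root; the recursive member joins each
# employee to the row already produced for their manager.
REPORTS_SQL = """
WITH RECURSIVE reports (id, name, depth) AS (
    SELECT id, name, 0 FROM emp WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, r.depth + 1
    FROM emp AS e JOIN reports AS r ON e.manager_id = r.id
)
SELECT id, name, depth FROM reports ORDER BY depth, id
"""

rows = conn.execute(REPORTS_SQL).fetchall()
for row in rows:
    print(row)
```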

VTGate Tablet Balancer

When a VTGate routes a query and has multiple available tablets for a given shard / tablet type (e.g. REPLICA), the current default behavior routes the query with local cell affinity and round robin policy. The VTGate Tablet Balancer provides an alternate mechanism that routes queries to maintain an even distribution of query load to each tablet, while preferentially routing to tablets in the same cell as the VTGate.

The tablet balancer is enabled by a new flag --enable-balancer and configured by --balancer-vtgate-cells and --balancer-keyspaces.

See the RFC for more details on the design and configuration of this feature.
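
For contrast, the default policy described above (local cell affinity with round robin) can be modeled in a few lines of Python. This is a reading of the wording in these notes, not VTGate's routing code; the fallback to other cells when no local tablet exists is an assumption, and the names are hypothetical.

```python
import itertools
from typing import Callable


def make_picker(tablets: list[tuple[str, str]], vtgate_cell: str) -> Callable[[], str]:
    """Return a function that yields the next tablet alias for a query.

    tablets is a list of (alias, cell) pairs for one shard and tablet type.
    """
    if not tablets:
        raise ValueError("no tablets available")
    local = [alias for alias, cell in tablets if cell == vtgate_cell]
    # Assumption: fall back to every tablet when none is in the VTGate's cell.
    pool = local or [alias for alias, _ in tablets]
    cycle = itertools.cycle(pool)
    return lambda: next(cycle)


tablets = [("a", "zone1"), ("b", "zone1"), ("c", "zone2")]
pick = make_picker(tablets, vtgate_cell="zone2")
print([pick() for _ in range(4)])
```

In this model a VTGate in zone2 sends every query to tablet c, so the per-tablet load depends on how VTGates and tablets are spread across cells. That unevenness is the case the balancer's even-distribution goal addresses.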

Query Timeout Override

VTGate sends an authoritative query timeout to VTTablet when the QUERY_TIMEOUT_MS comment directive, query_timeout session system variable, or query-timeout flag is set. The order of precedence is: comment directive > session variable > VTGate flag. VTTablet overrides its default query timeout with the value received from VTGate. All timeouts are specified in milliseconds.

When a query is executed inside a transaction, there is an additional nuance. The actual timeout used will be the smaller of the transaction timeout and the query timeout.

A query can also be set to have no timeout by using the QUERY_TIMEOUT_MS comment directive with a value of 0.

Example usage: select /*vt+ QUERY_TIMEOUT_MS=30 */ col from tbl
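
The precedence rule and the transaction rule can be written out as two small Python functions. This is a sketch of the rules as stated, not Vitess code: the function names are hypothetical, and an unset source is modeled as None, which is an assumption about how "set" should be read.

```python
from typing import Optional


def effective_query_timeout_ms(
    comment_directive: Optional[int],
    session_variable: Optional[int],
    vtgate_flag: Optional[int],
) -> Optional[int]:
    """Pick the timeout by precedence: comment directive > session variable > VTGate flag.

    Returns None when no source is set, in which case VTTablet keeps its default.
    """
    for value in (comment_directive, session_variable, vtgate_flag):
        if value is not None:
            return value
    return None


def timeout_in_transaction_ms(transaction_timeout_ms: int, query_timeout_ms: int) -> int:
    """Inside a transaction the smaller of the two timeouts applies."""
    return min(transaction_timeout_ms, query_timeout_ms)


print(effective_query_timeout_ms(30, 500, 1000))     # 30: the comment directive wins
print(effective_query_timeout_ms(None, 500, 1000))   # 500: the session variable wins
print(timeout_in_transaction_ms(2000, 30))           # 30
```

A directive value of 0 also wins by precedence and means no timeout; how that case combines with a transaction timeout is not stated above, so the sketch leaves it out.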

New Backup Engine (EXPERIMENTAL)

We are introducing a new backup engine for logical backups in order to support use cases that require something other than physical backups. This feature is experimental and is based on MySQL Shell.

The new engine is enabled by using --backup_engine_implementation=mysqlshell. There are other options that are required, so please read the documentation to learn which options are required and how to configure them.

Dynamic VReplication Configuration

Previously, many of the configuration options for VReplication Workflows had to be provided using VTTablet flags. This meant that any change to VReplication configuration required restarting VTTablets. We now allow these to be overridden while creating a workflow or dynamically after the workflow is already in progress.

Reference Table Materialization

There is a new option in Materialize workflows to keep a synced copy of reference or lookup tables (countries, states, zip codes, etc) from an unsharded keyspace, which holds the source of truth for the reference table, to all shards in a sharded keyspace.

New VEXPLAIN Modes: TRACE and KEYS

VEXPLAIN TRACE

The new TRACE mode for VEXPLAIN provides a detailed execution trace of queries, showing how they’re processed through various operators and interactions with tablets. This mode is particularly useful for:

  • Identifying performance bottlenecks
  • Understanding query execution patterns
  • Optimizing complex queries
  • Debugging unexpected query behavior

TRACE mode runs the query and logs all interactions, returning a JSON representation of the query execution plan with additional statistics like number of calls, average rows processed, and number of shards queried.

VEXPLAIN KEYS

The KEYS mode for VEXPLAIN offers a concise summary of query structure, highlighting columns used in joins, filters, and grouping operations. This information is crucial for:

  • Identifying potential sharding key candidates
  • Optimizing query performance
  • Analyzing query patterns to inform database design decisions

KEYS mode analyzes the query structure without executing it, providing JSON output that includes grouping columns, join columns, filter columns (potential candidates for indexes, primary keys, or sharding keys), and the statement type.

These new VEXPLAIN modes enhance Vitess’s query analysis capabilities, allowing for more informed decisions about sharding strategies and query optimization.

Errant GTID Detection on VTTablets

VTTablets now run errant GTID detection before they join the replication stream. If a replica has an errant GTID, it will not start replicating from the primary. This protects against situations that are very difficult to recover from.

For users running with the vitess-operator on Kubernetes, this change means that replica tablets with errant GTIDs will have broken replication and will report as unready. Users will need to manually replace and clean up these errant replica tablets.
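
An errant GTID is a transaction that exists on a replica but not on the primary. The idea can be shown with a set difference in Python. This is a conceptual sketch, not the VTTablet detection logic: it models a GTID set as a mapping from server UUID to a set of transaction numbers (MySQL stores intervals instead), and the UUIDs are placeholders.

```python
def errant_gtids(
    replica: dict[str, set[int]],
    primary: dict[str, set[int]],
) -> dict[str, set[int]]:
    """Return the transactions present on the replica but absent from the primary."""
    errant: dict[str, set[int]] = {}
    for server_uuid, transactions in replica.items():
        missing = transactions - primary.get(server_uuid, set())
        if missing:
            errant[server_uuid] = missing
    return errant


primary = {"uuid-primary": set(range(1, 101))}
replica = {
    "uuid-primary": set(range(1, 96)),  # lagging behind, which is not errant
    "uuid-replica": {1, 2},             # written directly on the replica: errant
}

print(errant_gtids(replica, primary))
```

A replica that is merely behind the primary produces an empty result; only transactions the primary has never seen count as errant.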

Automatically Replace MySQL auto_increment Clauses with Vitess Sequences

In https://github.com/vitessio/vitess/pull/16860 we added support for replacing MySQL auto_increment clauses with Vitess Sequences, performing all of the setup and initialization work automatically during the MoveTables workflow. As part of that work we have deprecated the --remove-sharded-auto-increment boolean flag and you should begin using the new --sharded-auto-increment-handling flag instead. Please see the new MoveTables Auto Increment Handling documentation for additional details.

Experimental MySQL 8.4 support

We have added experimental support for MySQL 8.4. It passes the Vitess test suite but is otherwise not yet tested. We are looking for feedback from the community so that support can move out of the experimental phase in a future release.

Current Errant GTIDs Count Metric

A new metric called CurrentErrantGTIDCount has been added to the VTOrc component. This metric shows the current count of the errant GTIDs in the tablets.

vtctldclient ChangeTabletTags command

The vtctldclient command ChangeTabletTags was added to allow the tags of a tablet to be changed dynamically.


The entire changelog for this release can be found here.

The release includes 338 merged Pull Requests.

Thanks to all our contributors: @GrahamCampbell, @GuptaManan100, @Utkar5hM, @anshikavashistha, @app/dependabot, @app/vitess-bot, @arthurschreiber, @beingnoble03, @brendar, @cameronmccord2, @chrism1001, @cuishuang, @dbussink, @deepthi, @demmer, @frouioui, @harshit-gangal, @harshitasao, @icyflame, @kirtanchandak, @mattlord, @mattrobenolt, @maxenglander, @mcrauwel, @notfelineit, @perminov, @rafer, @rohit-nayak-ps, @runewake2, @rvrangel, @shanth96, @shlomi-noach, @systay, @timvaillancourt, @vitess-bot

Download links