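Vitess v21.0.0-rc2 release overview
Release date: 2024-10-23
Version: v21.0.0-rc2
Vitess v21.0.0 has been released. The main changes are as follows. The VTTablet flag queryserver-enable-settings-pool is deprecated and scheduled for removal, several deprecated VTOrc metrics have been deleted, and some vttablet metrics have been renamed. A new traffic mirroring feature lets VTGate copy a specified percentage of traffic to a target keyspace to reduce cutover risk. VTGate gains a new shutdown behavior: with the --mysql-server-drain-onterm flag it refuses new connections during shutdown while letting existing connections finish. The Tablet Throttler supports multiple metrics (such as replication lag, running threads, and load) and can assign metric combinations per app. Cross-cell primary promotion is improved and can be enabled with the --allow-cross-cell-promotion flag. Experimental support for recursive CTEs has been added. A VTGate tablet balancer distributes query load more evenly across tablets. The query timeout can be overridden by a comment directive, a session variable, or a command-line flag, in decreasing order of precedence. A new experimental logical backup engine based on MySQL Shell has been added. Dynamic VReplication configuration allows parameters to be changed without a restart. Reference table materialization can sync unsharded tables to every shard of a sharded keyspace. VEXPLAIN gains TRACE and KEYS modes, which provide an execution trace and a key-column analysis respectively. Errant GTID detection is strengthened so that errant replicas are prevented from joining replication. MySQL auto_increment clauses are automatically replaced with Vitess sequences, with a new related flag. MySQL 8.4 is supported experimentally. VTOrc gains the CurrentErrantGTIDCount metric, which tracks the number of errant GTIDs. The new vtctldclient ChangeTabletTags command changes tablet tags dynamically. Reparent operations can specify an expected primary so that they run conditionally. The release contains 354 merged pull requests.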
Release notes
Release of Vitess v21.0.0
Summary
Table of Contents
- Major Changes
- Deprecations and Deletions
- Traffic Mirroring
- New VTGate Shutdown Behavior
- Tablet Throttler: Multi-Metric support
- Allow Cross Cell Promotion in PRS
- Support for recursive CTEs
- VTGate Tablet Balancer
- Query Timeout Override
- New Backup Engine
- Dynamic VReplication Configuration
- Reference Table Materialization
- New VEXPLAIN Modes: TRACE and KEYS
- Errant GTID Detection on VTTablets
- Automatically Replace MySQL auto_increment Clauses with Vitess Sequences
- Experimental MySQL 8.4 support
- Current Errant GTIDs Count Metric
- vtctldclient ChangeTabletTags
- Support for specifying expected primary in reparents
Major Changes
Deprecations and Deletions
Deprecated VTTablet Flags
The queryserver-enable-settings-pool flag, added in v15, has been on by default since v17. It is now deprecated and will be removed in a future release.
Deletion of deprecated metrics
The following VTOrc metrics were deprecated in v20. They have now been deleted.
Metric Name |
---|
analysis.change.write |
audit.write |
discoveries.attempt |
discoveries.fail |
discoveries.instance_poll_seconds_exceeded |
discoveries.queue_length |
discoveries.recent_count |
instance.read |
instance.read_topology |
emergency_reparent_counts |
planned_reparent_counts |
reparent_shard_operation_timings |
Deprecated Metrics
The following metrics are now deprecated and will be deleted in a future release. Please use their replacements.
Component | Metric Name | Replaced By |
---|---|---|
vttablet | QueryCacheLength | QueryEnginePlanCacheLength |
vttablet | QueryCacheSize | QueryEnginePlanCacheSize |
vttablet | QueryCacheCapacity | QueryEnginePlanCacheCapacity |
vttablet | QueryCacheEvictions | QueryEnginePlanCacheEvictions |
vttablet | QueryCacheHits | QueryEnginePlanCacheHits |
vttablet | QueryCacheMisses | QueryEnginePlanCacheMisses |
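When updating dashboards or alert rules, the renames listed above can be applied mechanically. The following is a minimal sketch, assuming the metric names appear verbatim in your expressions; the helper name `migrate_metric_names` is hypothetical, and a monitoring backend may export the names in a transformed form that this sketch does not handle.

```python
import re

# Deprecated vttablet metric name -> replacement, copied from the list above.
RENAMED_VTTABLET_METRICS = {
    "QueryCacheLength": "QueryEnginePlanCacheLength",
    "QueryCacheSize": "QueryEnginePlanCacheSize",
    "QueryCacheCapacity": "QueryEnginePlanCacheCapacity",
    "QueryCacheEvictions": "QueryEnginePlanCacheEvictions",
    "QueryCacheHits": "QueryEnginePlanCacheHits",
    "QueryCacheMisses": "QueryEnginePlanCacheMisses",
}

# Match whole identifiers only, so already-migrated names are left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMED_VTTABLET_METRICS) + r")\b"
)


def migrate_metric_names(expression: str) -> str:
    """Rewrite deprecated metric names in a dashboard or alert expression."""
    return _PATTERN.sub(lambda m: RENAMED_VTTABLET_METRICS[m.group(1)], expression)


print(migrate_metric_names("rate(QueryCacheHits[5m])"))
```

Running the helper twice gives the same result as running it once, because the replacement names do not contain any deprecated name as a whole word.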
Traffic Mirroring
Traffic mirroring is intended to help reduce some of the uncertainty inherent to MoveTables SwitchTraffic. When traffic mirroring is enabled, VTGate will mirror a percentage of traffic from one keyspace to another.
Mirror rules may be enabled through vtctldclient with MoveTables MirrorTraffic. For example:
$ vtctldclient --server :15999 MoveTables --target-keyspace customer --workflow commerce2customer MirrorTraffic --percent 5.0
Mirror rules can be inspected with GetMirrorRules.
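To make the meaning of a mirroring percentage concrete, here is a minimal sketch of percentage-based sampling. It is not VTGate's implementation; the function name `should_mirror` and the use of a seeded random generator are assumptions made for illustration only.

```python
import random


def should_mirror(percent: float, rng: random.Random) -> bool:
    """Return True for roughly `percent` percent of calls (hypothetical helper)."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    # rng.random() is uniform in [0.0, 1.0), so 0 never mirrors and 100 always does.
    return rng.random() * 100.0 < percent


rng = random.Random(42)
mirrored = sum(should_mirror(5.0, rng) for _ in range(100_000))
print(f"mirrored {mirrored} of 100000 queries")
```

With a value of 5.0, as in the command above, about one query in twenty would be selected for mirroring under this model.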
New VTGate Shutdown Behavior
We added a new option to VTGate to disallow new connections while VTGate is shutting down, while allowing existing connections to finish their work until they manually disconnect or until the --onterm_timeout is reached, without getting a Server shutdown in progress error.
This new behavior can be enabled by specifying the new --mysql-server-drain-onterm flag to VTGate.
You can find more information about this option in the RFC.
Tablet Throttler: Multi-Metric support
Up until v20, the tablet throttler would only monitor and use a single metric. That would be replication lag, by default, or could be the result of a custom query. In this release, we introduce a major redesign so that the throttler monitors and uses multiple metrics at the same time, including the above two.
The default behavior now is to monitor all metrics, but only use lag (if the custom query is undefined) or the custom metric (if the custom query is defined). This is backwards-compatible with v20. A v20 PRIMARY is compatible with a v21 REPLICA, and a v21 PRIMARY is compatible with a v20 REPLICA.
However, it is now possible to assign any combination of one or more metrics for a given app. The throttler then accepts or rejects the app’s requests based on the health of all assigned metrics. We have provided a pre-defined list of metrics:
- lag: replication lag based on heartbeat injection.
- threads_running: concurrent active threads on the MySQL server.
- loadavg: per core load average measured on the tablet instance/pod.
- custom: the result of a custom query executed on the MySQL server.
Each metric has a default threshold which can be overridden by the UpdateThrottlerConfig command.
The throttler also supports the catch-all "all" app name, and it is thus possible to assign metrics to all apps. Explicit app to metric assignments will override the catch-all configuration.
Metrics are assigned a default scope, which could be self (isolated to the tablet) or shard (max, aka worst value among shard tablets). It is further possible to require a different scope for each metric.
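The assignment rules described above can be sketched as follows. This is an illustrative model, not the throttler's code: the class name `ThrottlerSketch`, the threshold values, the app names, and the assumption that a metric is healthy when its value is at or below its threshold are all hypothetical. Scopes (self versus shard) are not modeled.

```python
from dataclasses import dataclass

CATCH_ALL_APP = "all"


@dataclass
class ThrottlerSketch:
    thresholds: dict   # metric name -> highest value still considered healthy
    app_metrics: dict  # app name -> list of metric names assigned to that app

    def metrics_for(self, app: str) -> list:
        # An explicit assignment overrides the catch-all "all" assignment.
        if app in self.app_metrics:
            return self.app_metrics[app]
        # With no assignment at all, fall back to lag only (no custom query).
        return self.app_metrics.get(CATCH_ALL_APP, ["lag"])

    def check(self, app: str, current: dict) -> bool:
        # Accept the request only if every assigned metric is healthy.
        return all(current[m] <= self.thresholds[m] for m in self.metrics_for(app))


throttler = ThrottlerSketch(
    thresholds={"lag": 5.0, "threads_running": 100, "loadavg": 1.0},  # illustrative values
    app_metrics={"all": ["lag"], "online-ddl": ["lag", "loadavg"]},
)
current = {"lag": 1.2, "threads_running": 40, "loadavg": 1.7}
print(throttler.check("vreplication", current))  # True: only lag is checked
print(throttler.check("online-ddl", current))    # False: loadavg is over its threshold
```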
Allow Cross Cell Promotion in PRS
Up until now, if users wanted to promote a replica in a different cell from the current primary using PlannedReparentShard, they had to specify the new primary with the --new-primary flag.
We have now added a new flag --allow-cross-cell-promotion that lets PlannedReparentShard choose a primary in a different cell even if no new primary is provided explicitly.
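The effect of the flag on which replicas are eligible can be pictured with a small filter. This is a sketch under stated assumptions: the function name, the tablet aliases, and the cell names are hypothetical, and how PlannedReparentShard ranks the eligible candidates is not modeled here.

```python
def promotion_candidates(replicas, primary_cell, allow_cross_cell):
    """Return aliases of replicas eligible for promotion (hypothetical helper).

    replicas is a list of (alias, cell) pairs. Without cross-cell promotion,
    only replicas in the current primary's cell are kept.
    """
    return [
        alias
        for alias, cell in replicas
        if allow_cross_cell or cell == primary_cell
    ]


replicas = [("zone1-0000000101", "zone1"), ("zone2-0000000201", "zone2")]
print(promotion_candidates(replicas, "zone1", allow_cross_cell=False))
print(promotion_candidates(replicas, "zone1", allow_cross_cell=True))
```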
Experimental support for recursive CTEs
We have added experimental support for recursive CTEs in Vitess. We are marking it as experimental because it is not yet fully tested and may have some limitations. We are looking for feedback from the community to improve this feature.
VTGate Tablet Balancer
When a VTGate routes a query and has multiple available tablets for a given shard / tablet type (e.g. REPLICA), the current default behavior routes the query with local cell affinity and round robin policy. The VTGate Tablet Balancer provides an alternate mechanism that routes queries to maintain an even distribution of query load to each tablet, while preferentially routing to tablets in the same cell as the VTGate.
The tablet balancer is enabled by a new flag --enable-balancer and configured by --balancer-vtgate-cells and --balancer-keyspaces.
See the RFC for more details on the design and configuration of this feature.
Query Timeout Override
VTGate sends an authoritative query timeout to VTTablet when the QUERY_TIMEOUT_MS comment directive, query_timeout session system variable, or query-timeout flag is set.
The order of precedence is: comment directive > session variable > VTGate flag.
VTTablet overrides its default query timeout with the value received from VTGate.
All timeouts are specified in milliseconds.
When a query is executed inside a transaction, there is an additional nuance. The actual timeout used will be the smaller of the transaction timeout and the query timeout.
A query can also be set to have no timeout by using the QUERY_TIMEOUT_MS comment directive with a value of 0.
Example usage:
select /*vt+ QUERY_TIMEOUT_MS=30 */ col from tbl
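The precedence rules above can be expressed as a small resolution function. This is a sketch, not VTGate's code. The function name is hypothetical, and two points are assumptions that the text does not settle: an unset or zero session variable or flag is treated as "not set", and inside a transaction a query without its own timeout is treated as bounded by the transaction timeout alone.

```python
from typing import Optional


def effective_timeout_ms(
    directive_ms: Optional[int] = None,
    session_ms: Optional[int] = None,
    flag_ms: Optional[int] = None,
    transaction_timeout_ms: Optional[int] = None,
) -> Optional[int]:
    """Resolve the timeout in milliseconds; None means no timeout."""
    # Precedence: comment directive > session variable > VTGate flag.
    if directive_ms is not None:
        query_timeout = directive_ms or None  # QUERY_TIMEOUT_MS=0 means no timeout
    elif session_ms:
        query_timeout = session_ms
    elif flag_ms:
        query_timeout = flag_ms
    else:
        query_timeout = None

    if transaction_timeout_ms is None:
        return query_timeout
    if query_timeout is None:
        return transaction_timeout_ms  # assumption, see the note above
    # Inside a transaction, the smaller of the two timeouts applies.
    return min(query_timeout, transaction_timeout_ms)


print(effective_timeout_ms(directive_ms=30, session_ms=500, flag_ms=1000))   # 30
print(effective_timeout_ms(session_ms=500, flag_ms=1000))                    # 500
print(effective_timeout_ms(directive_ms=5000, transaction_timeout_ms=1000))  # 1000
```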
New Backup Engine (EXPERIMENTAL)
We are introducing a new backup engine for logical backups in order to support use cases that require something other than physical backups. This feature is experimental and is based on MySQL Shell.
The new engine is enabled by using --backup_engine_implementation=mysqlshell. There are other options that are required, so please read the documentation to learn which options are required and how to configure them.
Dynamic VReplication Configuration
Previously, many of the configuration options for VReplication Workflows had to be provided using VTTablet flags. This meant that any change to VReplication configuration required restarting VTTablets. We now allow these to be overridden while creating a workflow or dynamically after the workflow is already in progress.
Reference Table Materialization
There is a new option in Materialize workflows to keep a synced copy of reference or lookup tables (countries, states, zip codes, etc) from an unsharded keyspace, which holds the source of truth for the reference table, to all shards in a sharded keyspace.
New VEXPLAIN Modes: TRACE and KEYS
VEXPLAIN TRACE
The new TRACE mode for VEXPLAIN provides a detailed execution trace of queries, showing how they’re processed through various operators and interactions with tablets. This mode is particularly useful for:
- Identifying performance bottlenecks
- Understanding query execution patterns
- Optimizing complex queries
- Debugging unexpected query behavior
TRACE mode runs the query and logs all interactions, returning a JSON representation of the query execution plan with additional statistics like number of calls, average rows processed, and number of shards queried.
VEXPLAIN KEYS
The KEYS mode for VEXPLAIN offers a concise summary of query structure, highlighting columns used in joins, filters, and grouping operations. This information is crucial for:
- Identifying potential sharding key candidates
- Optimizing query performance
- Analyzing query patterns to inform database design decisions
KEYS mode analyzes the query structure without executing it, providing JSON output that includes grouping columns, join columns, filter columns (potential candidates for indexes, primary keys, or sharding keys), and the statement type.
These new VEXPLAIN modes enhance Vitess’s query analysis capabilities, allowing for more informed decisions about sharding strategies and query optimization.
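One way to use KEYS output when looking for sharding key candidates is to count how often each column appears across many queries. The sketch below assumes that each output is a JSON object and that its join and filter columns are lists under the keys `joinColumns` and `filterColumns`; those key names, the sample payloads, and the helper name are assumptions for illustration, so check them against the real output before relying on this.

```python
import json
from collections import Counter


def rank_columns(outputs, keys=("joinColumns", "filterColumns")):
    """Count column occurrences across several KEYS outputs (hypothetical helper)."""
    counts = Counter()
    for raw in outputs:
        doc = json.loads(raw)
        for key in keys:
            # A missing key simply contributes nothing to the counts.
            counts.update(doc.get(key, []))
    return counts.most_common()


# Hypothetical sample payloads, not captured from a real VTGate.
samples = [
    '{"statementType": "SELECT", "joinColumns": ["orders.customer_id"], "filterColumns": ["customer.id"]}',
    '{"statementType": "SELECT", "filterColumns": ["customer.id"]}',
]
for column, count in rank_columns(samples):
    print(column, count)
```

Columns that show up most often in joins and filters are natural starting points when evaluating a sharding key, though the final choice also depends on data distribution.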
Errant GTID Detection on VTTablets
VTTablets now run an errant GTID detection logic before they join the replication stream. So, if a replica has an errant GTID, it will not start replicating from the primary. This protects us from running into situations which are very difficult to recover from.
For users running with the vitess-operator on Kubernetes, this change means that replica tablets with errant GTIDs will have broken replication and will report as unready. Users will need to manually replace and clean up these errant replica tablets.
Automatically Replace MySQL auto_increment Clauses with Vitess Sequences
In https://github.com/vitessio/vitess/pull/16860 we added support for replacing MySQL auto_increment clauses with Vitess Sequences, performing all of the setup and initialization work automatically during the MoveTables workflow. As part of that work we have deprecated the --remove-sharded-auto-increment boolean flag and you should begin using the new --sharded-auto-increment-handling flag instead. Please see the new MoveTables Auto Increment Handling documentation for additional details.
Experimental MySQL 8.4 support
We have added experimental support for MySQL 8.4. It passes the Vitess test suite, but it is otherwise not yet tested. We are looking for feedback from the community to improve this to move support out of the experimental phase in a future release.
Current Errant GTIDs Count Metric
A new metric called CurrentErrantGTIDCount has been added to the VTOrc component. This metric shows the current count of the errant GTIDs in the tablets.
vtctldclient ChangeTabletTags command
The vtctldclient command ChangeTabletTags was added to allow the tags of a tablet to be changed dynamically.
Support specifying expected primary in reparents
The EmergencyReparentShard and PlannedReparentShard commands and RPCs now support specifying a primary we expect to still be the current primary in order for a reparent operation to be processed. This allows reparents to be conditional on a specific state being true.
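The expected-primary condition behaves like a compare-and-set guard: the operation proceeds only if the current primary is the one the caller believes it to be. The sketch below illustrates that idea only; the function name, the exception type, and the tablet aliases are hypothetical and do not reflect the actual command or RPC signatures.

```python
from typing import Optional


class UnexpectedPrimaryError(Exception):
    """Raised when the current primary differs from the expected one."""


def reparent_if_expected(
    current_primary: str,
    new_primary: str,
    expected_primary: Optional[str] = None,
) -> str:
    """Return the primary after the operation (hypothetical helper)."""
    # With no expectation given, the reparent is unconditional.
    if expected_primary is not None and current_primary != expected_primary:
        raise UnexpectedPrimaryError(
            f"expected {expected_primary} to be primary, found {current_primary}"
        )
    return new_primary


print(reparent_if_expected("zone1-0000000100", "zone1-0000000101",
                           expected_primary="zone1-0000000100"))
```

A guard of this kind helps when two operators or automation loops race: the second request fails cleanly instead of reparenting away from a primary that was just promoted.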
The entire changelog for this release can be found here.
The release includes 354 merged Pull Requests.
Thanks to all our contributors: @GrahamCampbell, @GuptaManan100, @Utkar5hM, @anshikavashistha, @app/dependabot, @app/vitess-bot, @arthurschreiber, @beingnoble03, @brendar, @cameronmccord2, @chrism1001, @cuishuang, @dbussink, @deepthi, @demmer, @frouioui, @harshit-gangal, @harshitasao, @icyflame, @kirtanchandak, @mattlord, @mattrobenolt, @maxenglander, @mcrauwel, @notfelineit, @perminov, @rafer, @rohit-nayak-ps, @runewake2, @rvrangel, @shanth96, @shlomi-noach, @systay, @timvaillancourt, @vitess-bot