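Vitess v22.0.0-rc1 release overview
Release date: 2025-04-08
Version: v22.0.0-rc1
Vitess v22.0.0 has been released with a number of major changes and optimizations. The main updates are:
- Deprecations and deletions:
  - Deprecated some VTGate metrics and CLI flags
  - Removed support for the gh-ost and pt-osc Online DDL strategies
  - Deleted obsolete vttablet metrics and CLI flags
- New metrics:
  - VTGate adds monitoring metrics for query execution, routing, and more
  - VTTablet adds database metrics such as table row counts and index sizes
- Configuration changes:
  - VTOrc supports dynamic configuration updates
  - VTGate configuration keys renamed to match their flag names
- VTOrc improvements:
  - New stalled disk recovery
  - Support for specifying key ranges when choosing clusters to watch
- New support:
  - More efficient JSON replication
  - Support for the LAST_INSERT_ID(x) syntax
  - New maximum idle connection count setting for connection pools
  - New query log mode that records only queries that fail
- Optimizations:
  - Deferred optimization for prepared statements
  - New RPC for reading transaction state
  - Improved semi-sync monitoring
  - Improved transaction error handling
- New default versions:
  - Default MySQL version upgraded to 8.0.40
  - Docker images based on Debian Bookworm
- Other improvements:
  - Improved concurrency control for topology reads
  - Improved VTTablet ACL reloading
  - VTAdmin upgraded to Node.js v22.13.1 LTS
This release includes 457 merged Pull Requests. The full changelog is available on GitHub.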
Release notes (original)
Release of Vitess v22.0.0
Summary
Table of Contents
- Major Changes
- Minor Changes
Major Changes
Deprecations
Metrics
Component | Metric Name | Deprecation PR |
---|---|---|
vtgate | QueriesProcessed | #17727 |
vtgate | QueriesRouted | #17727 |
vtgate | QueriesProcessedByTable | #17727 |
vtgate | QueriesRoutedByTable | #17727 |
CLI Flags
Component | Flag Name | Notes | Deprecation PR |
---|---|---|---|
vttablet | twopc_enable | Usage of TwoPC commit will be determined by the transaction_mode set on VTGate via flag or session variable. | #17279 |
vtgate | grpc-send-session-in-streaming | Session will be sent as part of response on StreamExecute API call. | #17907 |
Deletions
Metrics
Component | Metric Name | Was Deprecated In | Deprecation PR |
---|---|---|---|
vttablet | QueryCacheLength | v21.0.0 | #16289 |
vttablet | QueryCacheSize | v21.0.0 | #16289 |
vttablet | QueryCacheCapacity | v21.0.0 | #16289 |
vttablet | QueryCacheEvictions | v21.0.0 | #16289 |
vttablet | QueryCacheHits | v21.0.0 | #16289 |
vttablet | QueryCacheMisses | v21.0.0 | #16289 |
CLI Flags
Component | Flag Name | Was Deprecated In | Deprecation PR |
---|---|---|---|
vttablet | queryserver-enable-settings-pool | v21.0.0 | #16280 |
vttablet | remove-sharded-auto-increment | v21.0.0 | #16860 |
vttablet | disable_active_reparents | v20.0.0 | #14871 |
vtgate, vtcombo, vtctld | healthcheck-dial-concurrency | v21.0.0 | #16378 |
gh-ost and pt-osc Online DDL strategies
Vitess no longer recognizes the gh-ost and pt-osc (pt-online-schema-change) Online DDL strategies. The vitess strategy is the recommended way to make schema changes at scale. The mysql and direct strategies continue to be supported.
These vttablet flags have been removed:
--gh-ost-path
--pt-osc-path
Using gh-ost or pt-osc as a strategy, as in the following commands, yields an error:
$ vtctldclient ApplySchema --ddl-strategy="gh-ost" ...
$ vtctldclient ApplySchema --ddl-strategy="pt-osc" ...
New Metrics
VTGate
Name | Dimensions | Description | PR |
---|---|---|---|
QueryExecutions | Query, Plan, Tablet | Number of queries executed. | #17727 |
QueryRoutes | Query, Plan, Tablet | Number of vttablets the query was executed on. | #17727 |
QueryExecutionsByTable | Query, Table | Queries executed at vtgate, with counts recorded per table. | #17727 |
VStreamsCount | Keyspace, ShardName, TabletType | Number of active vstream. | #17858 |
VStreamsEventsStreamed | Keyspace, ShardName, TabletType | Number of events sent across all vstreams. | #17858 |
VStreamsEndedWithErrors | Keyspace, ShardName, TabletType | Number of vstreams that ended with errors. | #17858 |
CommitModeTimings | Mode | Timing metrics for commit (Single, Multi, TwoPC). | #16939 |
CommitUnresolved | N/A | Counter for failure after Prepare. | #16939 |
The work done in #17727 introduces new metrics for queries and deprecates several vtgate metrics; see the Metrics table under Deprecations. Here is an example of what the new metrics report:
Query: select t1.a, t2.b from t1 join t2 on t1.id = t2.id
Shards: 2
Sharding Key: id for both tables
Metrics Published:
1. QueryExecutions – {select, scatter, primary}, 1
2. QueryRoutes – {select, scatter, primary}, 2
3. QueryExecutionsByTable – {select, t1}, 1 and {select, t2}, 1
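The example above can be reproduced with plain counters. This is a minimal sketch, not Vitess code: the `record_query` helper and the three `Counter` objects are hypothetical stand-ins for the metrics, and the label tuples mirror the dimensions listed in the table.

```python
from collections import Counter

# Hypothetical stand-ins for the three vtgate metrics; not Vitess code.
query_executions = Counter()
query_routes = Counter()
query_executions_by_table = Counter()


def record_query(query_type, plan_type, tablet_type, tables, shards_hit):
    """Record one vtgate query that was sent to `shards_hit` vttablets."""
    key = (query_type, plan_type, tablet_type)
    query_executions[key] += 1        # one execution at vtgate
    query_routes[key] += shards_hit   # one route per vttablet reached
    for table in tables:
        query_executions_by_table[(query_type, table)] += 1


# select t1.a, t2.b from t1 join t2 on t1.id = t2.id, scattered over 2 shards
record_query("select", "scatter", "primary", ["t1", "t2"], shards_hit=2)
```

The point of the split is that QueryExecutions counts the query once, while QueryRoutes grows with the number of vttablets it touched.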
VTTablet
Name | Dimensions | Description | PR |
---|---|---|---|
TableRows | Table | Estimated number of rows in the table. | #17570 |
TableClusteredIndexSize | Table | Byte size of the clustered index (i.e. row data). | #17570 |
IndexCardinality | Table, Index | Estimated number of unique values in the index | #17570 |
IndexBytes | Table, Index | Byte size of the index. | #17570 |
UnresolvedTransaction | ManagerType | Number of events sent across all vstreams. | #16939 |
CommitPreparedFail | FailureType | Number of vstreams that ended with errors. | #16939 |
RedoPreparedFail | FailureType | Timing metrics for commit (Single, Multi, TwoPC) | #16939 |
Config File Changes
VTOrc
The configuration file for VTOrc has been updated to now support dynamic fields. The old --config parameter has been removed. The alternative is to use the --config-file parameter. The configuration can now be provided in json, yaml or any other format that viper supports.
The following fields can be dynamically changed:
instance-poll-time
prevent-cross-cell-failover
snapshot-topology-interval
reasonable-replication-lag
audit-to-backend
audit-to-syslog
audit-purge-duration
wait-replicas-timeout
tolerable-replication-lag
topo-information-refresh-duration
recovery-poll-duration
allow-emergency-reparent
change-tablets-with-errant-gtid-to-drained
To upgrade to the newer version of the configuration file, first switch to using the flags in your current deployment before upgrading. Then you can switch to using the configuration file in the newer release.
VTGate
The Viper configuration keys for the following flags have been changed to match their flag names. Previously the keys used a discovery. prefix, whereas discovery_ is now part of the key name itself.
Flag Name | Old Configuration Key | New Configuration Key |
---|---|---|
discovery_low_replication_lag | discovery.low_replication_lag | discovery_low_replication_lag |
discovery_high_replication_lag_minimum_serving | discovery.high_replication_lag_minimum_serving | discovery_high_replication_lag_minimum_serving |
discovery_min_number_serving_vttablets | discovery.min_number_serving_vttablets | discovery_min_number_serving_vttablets |
discovery_legacy_replication_lag_algorithm | discovery.legacy_replication_lag_algorithm | discovery_legacy_replication_lag_algorithm |
To upgrade to the newer version of the configuration keys, first switch to using the flags in your current deployment before upgrading. Then you can switch to using the new configuration keys in the newer release.
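The key rename in the table above can be illustrated on an already-parsed configuration. This is a rough sketch under assumptions: the `migrate_discovery_keys` helper is hypothetical, the old dotted keys are assumed to appear as a nested `discovery` mapping (as they would in a YAML or JSON file), and the sample values are made up.

```python
import copy

# The four renamed keys, taken from the table above.
RENAMED = [
    "low_replication_lag",
    "high_replication_lag_minimum_serving",
    "min_number_serving_vttablets",
    "legacy_replication_lag_algorithm",
]


def migrate_discovery_keys(config):
    """Return a copy of `config` with discovery.<name> moved to discovery_<name>."""
    out = copy.deepcopy(config)
    nested = out.get("discovery")
    if not isinstance(nested, dict):
        return out
    for name in RENAMED:
        if name in nested:
            out["discovery_" + name] = nested.pop(name)
    if not nested:  # drop the section once it is empty
        del out["discovery"]
    return out


# Illustrative values only.
old = {"discovery": {"low_replication_lag": "30s", "min_number_serving_vttablets": 2}}
new = migrate_discovery_keys(old)
```

Any other keys under `discovery` are left in place, since only the four listed keys changed.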
VTOrc
Stalled Disk Recovery
VTOrc can now identify and recover from stalled disk errors. VTTablets test whether the disk is writable and they send this information in the full status output to VTOrc. If the disk is not writable on the primary tablet, VTOrc will attempt to recover the cluster by promoting a new primary. This is useful in scenarios where the disk is stalled and the primary vttablet is unable to accept writes because of it.
To opt into this feature, specify the --enable-primary-disk-stalled-recovery flag on VTOrc and the --disk-write-dir flag on the vttablets.
The --disk-write-interval and --disk-write-timeout flags can be used to configure the polling interval and timeout respectively.
KeyRanges in --clusters_to_watch
VTOrc now supports specifying keyranges in the --clusters_to_watch flag. This means that there is no need to restart a VTOrc instance with a different flag value when you reshard a keyspace.
For example, if a VTOrc is configured to watch ks/-80, then it would watch all the shards that fall under the keyrange -80.
If a reshard is performed and -80 is split into new shards -40 and 40-80, the VTOrc instance will automatically start watching the new shards without needing a restart.
In the previous logic, specifying ks/-80 for the flag would mean that VTOrc would watch only 1 (or no) shard.
In the new system, since we interpret -80 as a key range, it can watch multiple shards as described in the example.
Users can continue to specify exact keyranges. The new feature is backward compatible.
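The "falls under the keyrange" check can be sketched as a containment test on the range bounds. This is an illustration, not VTOrc's implementation: `parse_key_range` and `contains` are hypothetical helpers, and the sketch assumes shard names of the form start-end with hex bounds, where an empty bound means "unbounded".

```python
def parse_key_range(spec):
    """Split 'start-end' into (start, end) bytes; end=None means no upper bound."""
    start_hex, end_hex = spec.split("-")
    start = bytes.fromhex(start_hex)
    end = bytes.fromhex(end_hex) if end_hex else None
    return start, end


def contains(watched, shard):
    """True if the shard's key range lies entirely within the watched range."""
    w_start, w_end = parse_key_range(watched)
    s_start, s_end = parse_key_range(shard)
    if s_start < w_start:
        return False
    if w_end is None:
        return True
    return s_end is not None and s_end <= w_end


# After -80 is split, both new shards still fall under the watched range.
watched_after_reshard = [s for s in ["-40", "40-80", "80-"] if contains("-80", s)]
```

With a watch on -80, the new shards -40 and 40-80 are selected, while 80- is not.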
New Default Versions
MySQL 8.0.40
The default MySQL version used by our vitess/lite:latest image is going from 8.0.30 to 8.0.40.
This change was brought by #17552.
VTGate also advertises MySQL version 8.0.40 by default instead of 8.0.30 if no explicit version is set. Users can set the mysql_server_version flag to advertise the correct version.
⚠️ Upgrading to this release with vitess-operator:
If you are using the vitess-operator, considering that we are bumping the patch version of MySQL 80 from 8.0.30 to 8.0.40, you will have to manually upgrade:
- Add innodb_fast_shutdown=0 to your extra cnf in your YAML file.
- Apply this file.
- Wait for all the pods to be healthy.
- Then change your YAML file to use the new Docker Images (vitess/lite:v22.0.0).
- Remove innodb_fast_shutdown=0 from your extra cnf in your YAML file.
- Apply this file.
This is the last time this will be needed in the 8.0.x series, as starting with MySQL 8.0.35 it is possible to upgrade and downgrade between 8.0.x versions without needing to run innodb_fast_shutdown=0.
Docker vitess/lite images with Debian Bookworm
The base system now uses Debian Bookworm instead of Debian Bullseye for the vitess/lite images. This change was brought by #17552.
New Support
More Efficient JSON Replication
In #7345 we added support for --binlog-row-value-options=PARTIAL_JSON. This option was introduced in MySQL 8.0; see the MySQL documentation for details.
If you are using MySQL 8.0 or later and using JSON columns, you can now enable this MySQL feature across your Vitess cluster(s) to lower the disk space needed for binary logs and improve the CPU and memory usage in both mysqld (standard intrashard MySQL replication) and vttablet (VReplication) without losing any capabilities or features.
LAST_INSERT_ID(x)
In #17408 and #17409, we added the ability to use LAST_INSERT_ID(x) in Vitess directly at vtgate. This improvement allows certain queries, such as SELECT last_insert_id(123); or SELECT last_insert_id(count(*)) ..., to be handled without relying on MySQL for the final value.
Limitations:
- When using LAST_INSERT_ID(x) in ordered queries (e.g., SELECT last_insert_id(col) FROM table ORDER BY foo), MySQL sets the session’s last-insert-id value according to the last row returned. Vitess does not guarantee the same behavior.
Maximum Idle Connections in the Pool
In #17443 we introduced a new configurable max-idle-count parameter for connection pools. This allows you to specify the maximum number of idle connections retained in each connection pool to optimize performance and resource efficiency.
You can control idle connection retention for the query server’s query pool, stream pool, and transaction pool with the following flags:
- --queryserver-config-query-pool-max-idle-count: Defines the maximum number of idle connections retained in the query pool.
- --queryserver-config-stream-pool-max-idle-count: Defines the maximum number of idle connections retained in the stream pool.
- --queryserver-config-txpool-max-idle-count: Defines the maximum number of idle connections retained in the transaction pool.
This feature ensures that, during traffic spikes, idle connections are available for faster responses, while minimizing overhead in low-traffic periods by limiting the number of idle connections retained. It helps strike a balance between performance, efficiency, and cost.
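The effect of an idle cap can be shown with a toy pool. This is a sketch of the general technique only, not the vttablet pool: `IdleCappedPool`, its `max_idle` argument, and the `closed` counter are hypothetical names used for illustration.

```python
class IdleCappedPool:
    """Toy connection pool that retains at most `max_idle` idle connections."""

    def __init__(self, max_idle, connect):
        self.max_idle = max_idle
        self.connect = connect  # factory that opens a new connection
        self.idle = []          # connections waiting to be reused
        self.closed = 0         # count of connections dropped on return

    def get(self):
        # Reuse an idle connection when one is available, else open a new one.
        return self.idle.pop() if self.idle else self.connect()

    def put(self, conn):
        # Keep the connection only while the idle cap has room.
        if len(self.idle) < self.max_idle:
            self.idle.append(conn)
        else:
            self.closed += 1    # a real pool would close the connection here


pool = IdleCappedPool(max_idle=2, connect=object)
burst = [pool.get() for _ in range(5)]  # traffic spike: 5 connections in use
for conn in burst:
    pool.put(conn)                       # spike ends: only 2 stay idle
```

After the spike, two connections remain ready for reuse and the other three are released, which is the trade-off the flags above let you tune per pool.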
Filtering Query logs on Error
The querylog-mode setting can be configured to error to log only queries that result in errors. This option is supported in both VTGate and VTTablet.
Optimization
Prepared Statement
Prepared statements now benefit from Deferred Optimization, enabling parameter-aware query plans. Initially, a baseline plan is created at prepare-time, and on first execution, a more efficient parameter-optimized plan is generated. Subsequent executions dynamically switch between these plans based on input values, improving query performance while ensuring correctness.
RPC Changes
These are the RPC changes made in this release:
The GetTransactionInfo RPC has been added to both the VtctldServer and TabletManagerClient interfaces. These RPCs let users read the state of an unresolved distributed transaction, which can be useful in debugging what went wrong and how to fix the problem.
Prefer not promoting a replica that is currently taking a backup
Emergency reparents now prefer not promoting replicas that are currently taking backups with a backup engine other than builtin. Note that if there’s only one suitable replica to promote, and it is taking a backup, it will still be promoted.
For planned reparents, hosts taking backups with a backup engine other than builtin are filtered out of the list of valid candidates. This means they will never get promoted, even if there are no other candidates.
Note that behavior for builtin backups remains unchanged: a replica that is currently taking a builtin backup will never be promoted, neither by planned nor by emergency reparents.
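The candidate rules described above can be summarized in a short sketch. It is illustrative only: the `Replica` type and both helper functions are hypothetical, real reparenting weighs many other factors, and "xtrabackup" is used merely as an example of a non-builtin engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    name: str
    backup_engine: Optional[str] = None  # engine of a running backup, or None


def planned_candidates(replicas):
    """Planned reparent: any replica that is taking a backup is excluded."""
    return [r for r in replicas if r.backup_engine is None]


def emergency_candidates(replicas):
    """Emergency reparent: builtin backups are excluded, others are a last resort."""
    eligible = [r for r in replicas if r.backup_engine != "builtin"]
    # Stable sort: replicas with no backup running come first.
    return sorted(eligible, key=lambda r: r.backup_engine is not None)


replicas = [Replica("r1", "xtrabackup"), Replica("r2", "builtin"), Replica("r3")]
planned = [r.name for r in planned_candidates(replicas)]
emergency = [r.name for r in emergency_candidates(replicas)]
```

Here a planned reparent can only pick r3, while an emergency reparent prefers r3 but could still fall back to r1.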
Semi-sync monitor in vttablet
A new component has been added to the vttablet binary to monitor the semi-sync status of primary vttablets. We’ve observed cases where a brief network disruption can cause the primary to get stuck indefinitely waiting for semi-sync ACKs. In rare scenarios, this can block reparent operations and render the primary unresponsive. More information can be found in the issues #17709 and #17749.
To address this, the new component continuously monitors the semi-sync status. If the primary becomes stuck on semi-sync ACKs, it generates writes to unblock it. If this fails, VTOrc is notified of the issue and initiates an emergency reparent operation.
The monitoring interval can be adjusted using the --semi-sync-monitor-interval flag, which defaults to 10 seconds.
Wrapped fatal transaction errors
When a query fails inside a transaction because the transaction is no longer valid (e.g. PRS, rollout, primary down, etc), the original error is now wrapped in a VT15001 error.
For non-transactional queries that produce a VT15001, VTGate will try to roll back and clear the transaction.
Any new queries on the same connection will fail with a VT09032 error until a ROLLBACK is received to acknowledge that the transaction was automatically rolled back and cleared by VTGate.
VT09032 is returned to clients to avoid applications blindly sending queries to VTGate thinking they are still in a transaction.
This change was introduced by #17669.
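From the client's point of view, the error sequence behaves like a small state machine. The sketch below is a simplified model under assumptions, not VTGate's implementation: the `Session` class and its `backend_ok` parameter are hypothetical, and errors are represented as plain strings.

```python
class Session:
    """Toy model of a client session; not the real VTGate implementation."""

    def __init__(self):
        self.in_transaction = False
        self.needs_rollback_ack = False  # set after a VT15001 failure

    def execute(self, sql, backend_ok=True):
        stmt = sql.strip().rstrip(";").upper()
        if self.needs_rollback_ack:
            if stmt == "ROLLBACK":
                self.needs_rollback_ack = False  # client acknowledged
                return "OK"
            return "VT09032"
        if stmt == "BEGIN":
            self.in_transaction = True
            return "OK"
        if stmt in ("COMMIT", "ROLLBACK"):
            self.in_transaction = False
            return "OK"
        if not backend_ok:
            if self.in_transaction:
                # The transaction is no longer valid: clear it and flag the session.
                self.in_transaction = False
                self.needs_rollback_ack = True
                return "VT15001"
            return "ERROR"  # ordinary failure outside a transaction
        return "OK"


s = Session()
results = [
    s.execute("begin"),
    s.execute("update t set a = 1", backend_ok=False),
    s.execute("select 1"),
    s.execute("rollback"),
    s.execute("select 1"),
]
```

An application that sees VT09032 should issue ROLLBACK and then retry the whole transaction rather than continue where it left off.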
Minor Changes
--topo_read_concurrency behaviour changes
The --topo_read_concurrency flag was added to all components that access the topology and the provided limit is now applied separately for each global or local cell (default 32).
All topology read calls (Get, GetVersion, List and ListDir) now respect this per-cell limit. Before this version, a single limit was applied to all cell calls and it was not respected by many topology calls.
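A per-cell limit of this kind is commonly built from one semaphore per cell. The following is a minimal sketch of that pattern, not the Vitess implementation: `PerCellLimiter` and its methods are hypothetical names, and the default of 32 is taken from the text above.

```python
import threading


class PerCellLimiter:
    """Caps concurrent topology reads separately for each cell."""

    def __init__(self, limit=32):
        self._limit = limit
        self._lock = threading.Lock()
        self._sems = {}

    def _sem(self, cell):
        # Lazily create one semaphore per cell, guarded by a lock.
        with self._lock:
            if cell not in self._sems:
                self._sems[cell] = threading.BoundedSemaphore(self._limit)
            return self._sems[cell]

    def read(self, cell, fn):
        """Run fn() while holding one of the cell's read slots."""
        with self._sem(cell):
            return fn()


limiter = PerCellLimiter(limit=32)
value = limiter.read("global", lambda: "topo-data")
```

Because each cell has its own semaphore, a burst of reads against one local cell cannot starve reads against the global cell.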
VTTablet
CLI Flags
- The twopc_abandon_age flag now supports values in the time.Duration format (e.g., 1s, 2m, 1h). While the flag will continue to accept float values (interpreted as seconds) for backward compatibility, float inputs are deprecated and will be removed in a future release.
- The --consolidator-query-waiter-cap flag sets the maximum number of clients allowed to wait on the consolidator. The default value is 0, which means unlimited waiters. Users can adjust this value based on the performance of VTTablet to avoid excessive memory usage and the risk of being OOMKilled, particularly in Kubernetes deployments.
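The dual input format accepted by twopc_abandon_age can be sketched as a parser that tries the deprecated float form first and then a duration string. This is an approximation for illustration: `parse_abandon_age` is a hypothetical helper, and it covers only a simple subset of Go's time.Duration syntax.

```python
import re

_UNITS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")


def parse_abandon_age(value):
    """Return seconds from a duration string such as '2m', or a bare number."""
    try:
        return float(value)  # deprecated form: plain number of seconds
    except ValueError:
        pass
    parts = _PART.findall(value)
    # Reject input that has characters outside the matched duration parts.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"invalid duration: {value!r}")
    return sum(float(n) * _UNITS[u] for n, u in parts)


ages = [parse_abandon_age(v) for v in ["1s", "2m", "1h", "30"]]
```

Writing the value with an explicit unit avoids relying on the float form that is scheduled for removal.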
ACL enforcement and reloading
When a tablet is started with --enforce-tableacl-config, it will exit with an error if the contents of the file are not valid. After the changes made in #17485, the tablet will no longer exit when reloading the contents of the file after receiving a SIGHUP. When the file contents are invalid on reload, the tablet will now log an error and the active in-memory ACLs remain in effect.
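The reload behaviour follows a "validate, then swap" pattern. The sketch below is an assumption-laden illustration, not the vttablet code: `AclHolder` is hypothetical, the ACL file is assumed to be JSON, and "valid" is reduced to "parses as JSON".

```python
import json


class AclHolder:
    """Keeps the last valid ACL config in memory across reloads."""

    def __init__(self, initial_text):
        # Startup with enforcement: invalid content raises and is fatal.
        self.active = json.loads(initial_text)
        self.errors = []

    def reload(self, text):
        """Called on SIGHUP: swap in new ACLs only if they parse."""
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError as exc:
            # Log the problem and keep the ACLs that are already active.
            self.errors.append(f"ACL reload failed: {exc}")
            return False
        self.active = parsed
        return True


holder = AclHolder('{"table_groups": []}')
reload_ok = holder.reload("{not json")
```

A failed reload leaves `holder.active` untouched, so a bad edit to the file cannot take down a running tablet or silently drop its ACLs.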
VTAdmin
vtadmin-web updated to node v22.13.1 (LTS)
Building vtadmin-web now requires node >= v22.13.0 (LTS). Breaking changes from v20 to v22 can be found at https://nodejs.org/en/blog/release/v22.13.0; none are known to affect VTAdmin.
Full details on the node v22.13.1 release can be found at https://nodejs.org/en/blog/release/v22.13.1.
The entire changelog for this release is available on GitHub.
The release includes 457 merged Pull Requests.
Thanks to all our contributors: @GrahamCampbell, @GuptaManan100, @L3o-pold, @akagami-harsh, @anirbanmu, @app/dependabot, @app/vitess-bot, @arthmis, @arthurschreiber, @beingnoble03, @c-r-dev, @corbantek, @dbussink, @deepthi, @derekperkins, @ejortegau, @frouioui, @garfthoffman, @gmpify, @gopoto, @harshit-gangal, @huochexizhan, @jeefy, @jwangace, @kbutz, @lmorduch, @mattlord, @mattrobenolt, @maxenglander, @mcrauwel, @mounicasruthi, @niladrix719, @notfelineit, @rafer, @rohit-nayak-ps, @rvrangel, @shailpujan88, @shanth96, @shlomi-noach, @siadat, @systay, @timvaillancourt, @twthorn, @vitess-bot, @vmg, @wiebeytec, @wukuai