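Vitess v22.0.0-rc2 release overview
Release date: 2025-04-11
Version: v22.0.0-rc2
The main changes in Vitess v22.0.0 include:
- Major changes:
  - Deprecates some metrics and CLI flags, and removes the unsupported gh-ost and pt-osc Online DDL strategies
  - Adds several new monitoring metrics for VTGate and VTTablet
  - VTOrc configuration now supports dynamic fields; VTGate configuration keys are renamed
- VTOrc improvements:
  - Adds stalled disk recovery and support for KeyRanges in --clusters_to_watch
- New default versions:
  - Default MySQL version upgraded to 8.0.40
  - Docker images now use Debian Bookworm as the base system
- New support:
  - More efficient JSON replication
  - Support for the LAST_INSERT_ID(x) syntax
  - New maximum idle connection count setting for connection pools
  - Query logs can be filtered by error
  - New MultiQuery RPC interface
  - CREATE PROCEDURE support for unsharded keyspaces
- Optimizations:
  - Deferred optimization for prepared statements
  - Primary promotion avoids replicas that are currently taking a backup
  - New semi-sync monitor component
- Error handling:
  - Transaction errors are now wrapped in a VT15001 error
  - After an automatic rollback, the connection returns a VT09032 error
- Other improvements:
  - Changed behavior of topology read concurrency control
  - VTTablet supports duration-format parameters
  - VTAdmin upgraded to Node.js v22.13.1
This release merges 466 Pull Requests, including numerous performance optimizations and feature enhancements.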
Release notes (original)
Release of Vitess v22.0.0
Summary
Table of Contents
- Major Changes
- Minor Changes
Major Changes
Deprecations
Metrics
Component | Metric Name | Deprecation PR |
---|---|---|
vtgate | QueriesProcessed | #17727 |
vtgate | QueriesRouted | #17727 |
vtgate | QueriesProcessedByTable | #17727 |
vtgate | QueriesRoutedByTable | #17727 |
CLI Flags
Component | Flag Name | Notes | Deprecation PR |
---|---|---|---|
vttablet | twopc_enable | Usage of TwoPC commit will be determined by the transaction_mode set on VTGate via flag or session variable. | #17279 |
vtgate | grpc-send-session-in-streaming | Session will be sent as part of response on StreamExecute API call. | #17907 |
Deletions
Metrics
Component | Metric Name | Was Deprecated In | Deprecation PR |
---|---|---|---|
vttablet | QueryCacheLength | v21.0.0 | #16289 |
vttablet | QueryCacheSize | v21.0.0 | #16289 |
vttablet | QueryCacheCapacity | v21.0.0 | #16289 |
vttablet | QueryCacheEvictions | v21.0.0 | #16289 |
vttablet | QueryCacheHits | v21.0.0 | #16289 |
vttablet | QueryCacheMisses | v21.0.0 | #16289 |
CLI Flags
Component | Flag Name | Was Deprecated In | Deprecation PR |
---|---|---|---|
vttablet | queryserver-enable-settings-pool | v21.0.0 | #16280 |
vttablet | remove-sharded-auto-increment | v21.0.0 | #16860 |
vttablet | disable_active_reparents | v20.0.0 | #14871 |
vtgate, vtcombo, vtctld | healthcheck-dial-concurrency | v21.0.0 | #16378 |
gh-ost and pt-osc Online DDL strategies
Vitess no longer recognizes the gh-ost and pt-osc (pt-online-schema-change) Online DDL strategies. The vitess strategy is the recommended way to make schema changes at scale. The mysql and direct strategies continue to be supported.
These vttablet flags have been removed:
- --gh-ost-path
- --pt-osc-path
Using gh-ost or pt-osc as a strategy, as in the following commands, yields an error:
$ vtctldclient ApplySchema --ddl-strategy="gh-ost" ...
$ vtctldclient ApplySchema --ddl-strategy="pt-osc" ...
New Metrics
VTGate
Name | Dimensions | Description | PR |
---|---|---|---|
QueryExecutions | Query, Plan, Tablet | Number of queries executed. | #17727 |
QueryRoutes | Query, Plan, Tablet | Number of vttablets the query was executed on. | #17727 |
QueryExecutionsByTable | Query, Table | Queries executed at vtgate, with counts recorded per table. | #17727 |
VStreamsCount | Keyspace, ShardName, TabletType | Number of active vstreams. | #17858 |
VStreamsEventsStreamed | Keyspace, ShardName, TabletType | Number of events sent across all vstreams. | #17858 |
VStreamsEndedWithErrors | Keyspace, ShardName, TabletType | Number of vstreams that ended with errors. | #17858 |
CommitModeTimings | Mode | Timing metrics for commit (Single, Multi, TwoPC). | #16939 |
CommitUnresolved | N/A | Counter for failure after Prepare. | #16939 |
The work done in #17727 introduces new query metrics and deprecates several existing vtgate metrics (see the Metrics table under Deprecations). Here is an example of what the new metrics report:
Query: select t1.a, t2.b from t1 join t2 on t1.id = t2.id
Shards: 2
Sharding Key: id for both tables
Metrics Published:
1. QueryExecutions – {select, scatter, primary}, 1
2. QueryRoutes – {select, scatter, primary}, 2
3. QueryExecutionsByTable – {select, t1}, 1 and {select, t2}, 1
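The accounting in this example can be reproduced with a small sketch. The Python below is illustrative only: record_query, new_metrics and the counter layout are hypothetical names, not vtgate code, and the label tuples simply mirror the dimensions listed in the table above.

```python
from collections import Counter

METRIC_NAMES = ("QueryExecutions", "QueryRoutes", "QueryExecutionsByTable")


def new_metrics():
    """Return one empty counter per metric name (hypothetical layout)."""
    return {name: Counter() for name in METRIC_NAMES}


def record_query(metrics, query, plan, tablet, tables, shards_hit):
    """Update the counters the way the worked example above describes."""
    # One execution per query, no matter how many shards it touches.
    metrics["QueryExecutions"][(query, plan, tablet)] += 1
    # One route per vttablet the query was executed on.
    metrics["QueryRoutes"][(query, plan, tablet)] += shards_hit
    # One count per table referenced by the query.
    for table in tables:
        metrics["QueryExecutionsByTable"][(query, table)] += 1


metrics = new_metrics()
# select t1.a, t2.b from t1 join t2 on t1.id = t2.id, scattered over 2 shards
record_query(metrics, "select", "scatter", "primary", ["t1", "t2"], shards_hit=2)
for name in METRIC_NAMES:
    print(name, dict(metrics[name]))
```

QueryRoutes grows with the number of shards touched, while QueryExecutions counts the query once.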
VTTablet
Name | Dimensions | Description | PR |
---|---|---|---|
TableRows | Table | Estimated number of rows in the table. | #17570 |
TableClusteredIndexSize | Table | Byte size of the clustered index (i.e. row data). | #17570 |
IndexCardinality | Table, Index | Estimated number of unique values in the index. | #17570 |
IndexBytes | Table, Index | Byte size of the index. | #17570 |
UnresolvedTransaction | ManagerType | Number of events sent across all vstreams. | #16939 |
CommitPreparedFail | FailureType | Number of vstreams that ended with errors. | #16939 |
RedoPreparedFail | FailureType | Timing metrics for commit (Single, Multi, TwoPC) | #16939 |
Config File Changes
VTOrc
The configuration file for VTOrc has been updated to support dynamic fields. The old --config parameter has been removed; use the --config-file parameter instead. The configuration can now be provided in JSON, YAML, or any other format that Viper supports.
The following fields can be changed dynamically:
- instance-poll-time
- prevent-cross-cell-failover
- snapshot-topology-interval
- reasonable-replication-lag
- audit-to-backend
- audit-to-syslog
- audit-purge-duration
- wait-replicas-timeout
- tolerable-replication-lag
- topo-information-refresh-duration
- recovery-poll-duration
- allow-emergency-reparent
- change-tablets-with-errant-gtid-to-drained
To upgrade to the newer version of the configuration file, first switch to using the flags in your current deployment before upgrading. Then you can switch to using the configuration file in the newer release.
VTGate
The Viper configuration keys for the following flags have been changed to match their flag names. Previously, discovery was a separate key prefix (discovery.) rather than part of the key name.
Flag Name | Old Configuration Key | New Configuration Key |
---|---|---|
discovery_low_replication_lag | discovery.low_replication_lag | discovery_low_replication_lag |
discovery_high_replication_lag_minimum_serving | discovery.high_replication_lag_minimum_serving | discovery_high_replication_lag_minimum_serving |
discovery_min_number_serving_vttablets | discovery.min_number_serving_vttablets | discovery_min_number_serving_vttablets |
discovery_legacy_replication_lag_algorithm | discovery.legacy_replication_lag_algorithm | discovery_legacy_replication_lag_algorithm |
To upgrade to the newer version of the configuration keys, first switch to using the flags in your current deployment before upgrading. Then you can switch to using the new configuration keys in the newer release.
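If a configuration is held as a flat mapping of Viper key paths to values, the rename in the table above is a plain key substitution. The sketch below assumes that flat representation; migrate_keys and the some.other_key entry are hypothetical, and the code is not part of any Vitess tooling.

```python
# Old Viper key -> new Viper key, taken from the table above.
RENAMED_KEYS = {
    "discovery.low_replication_lag": "discovery_low_replication_lag",
    "discovery.high_replication_lag_minimum_serving": "discovery_high_replication_lag_minimum_serving",
    "discovery.min_number_serving_vttablets": "discovery_min_number_serving_vttablets",
    "discovery.legacy_replication_lag_algorithm": "discovery_legacy_replication_lag_algorithm",
}


def migrate_keys(flat_config):
    """Return a copy of flat_config with the renamed keys rewritten.

    flat_config is assumed to be a dict keyed by dotted Viper key paths.
    Keys that are not in RENAMED_KEYS are copied through unchanged.
    """
    return {RENAMED_KEYS.get(key, key): value for key, value in flat_config.items()}


# "some.other_key" is a made-up entry showing that unrelated keys survive.
old = {"discovery.low_replication_lag": "30s", "some.other_key": 1}
print(migrate_keys(old))
```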
VTOrc
Stalled Disk Recovery
VTOrc can now identify and recover from stalled disk errors. VTTablets test whether the disk is writable and they send this information in the full status output to VTOrc. If the disk is not writable on the primary tablet, VTOrc will attempt to recover the cluster by promoting a new primary. This is useful in scenarios where the disk is stalled and the primary vttablet is unable to accept writes because of it.
To opt into this feature, specify the --enable-primary-disk-stalled-recovery flag on VTOrc and the --disk-write-dir flag on the vttablets. The --disk-write-interval and --disk-write-timeout flags configure the polling interval and timeout, respectively.
KeyRanges in --clusters_to_watch
VTOrc now supports specifying keyranges in the --clusters_to_watch flag. This means that there is no need to restart a VTOrc instance with a different flag value when you reshard a keyspace.
For example, if a VTOrc is configured to watch ks/-80, then it would watch all the shards that fall under the keyrange -80. If a reshard is performed and -80 is split into new shards -40 and 40-80, the VTOrc instance will automatically start watching the new shards without needing a restart.
In the previous logic, specifying ks/-80 for the flag would mean that VTOrc would watch only 1 (or no) shard. In the new system, since we interpret -80 as a key range, it can watch multiple shards as described in the example.
Users can continue to specify exact keyranges. The new feature is backward compatible.
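The idea of a shard falling under a watched key range can be sketched as a containment check. The Python below is a simplified illustration, not VTOrc's implementation: the function names are hypothetical, bounds are compared as raw hex-decoded bytes, and shard names without a dash (such as an unsharded 0) are rejected rather than handled.

```python
def parse_key_range(spec):
    """Split 'start-end' into (start, end) byte strings; empty means unbounded."""
    start_hex, sep, end_hex = spec.partition("-")
    if not sep:
        raise ValueError(f"not a key range: {spec!r}")
    return bytes.fromhex(start_hex), bytes.fromhex(end_hex)


def key_range_contains(outer, inner):
    """Report whether key range `inner` lies entirely inside `outer`."""
    outer_start, outer_end = parse_key_range(outer)
    inner_start, inner_end = parse_key_range(inner)
    # An empty start is the minimum key, and b"" sorts before everything.
    start_ok = inner_start >= outer_start
    # An empty end is unbounded, so it needs explicit handling.
    end_ok = outer_end == b"" or (inner_end != b"" and inner_end <= outer_end)
    return start_ok and end_ok


def watched_shards(watch_range, shards):
    """Return the shards that fall under the watched key range."""
    return [shard for shard in shards if key_range_contains(watch_range, shard)]


print(watched_shards("-80", ["-80", "80-"]))          # before the reshard
print(watched_shards("-80", ["-40", "40-80", "80-"]))  # after the reshard
```

With the same watch value of -80, the result moves from the single shard -80 to the two new shards -40 and 40-80, which is why no restart is needed.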
New Default Versions
MySQL 8.0.40
The default MySQL version used by our vitess/lite:latest image is going from 8.0.30 to 8.0.40. This change was brought by #17552.
VTGate also advertises MySQL version 8.0.40 by default instead of 8.0.30 if no explicit version is set. Users can set the mysql_server_version flag to advertise the correct version.
⚠️ Upgrading to this release with vitess-operator:
If you are using the vitess-operator, considering that we are bumping the patch version of MySQL 8.0 from 8.0.30 to 8.0.40, you will have to manually upgrade:
1. Add innodb_fast_shutdown=0 to your extra cnf in your YAML file.
2. Apply this file.
3. Wait for all the pods to be healthy.
4. Then change your YAML file to use the new Docker Images (vitess/lite:v22.0.0).
5. Remove innodb_fast_shutdown=0 from your extra cnf in your YAML file.
6. Apply this file.
This is the last time this will be needed in the 8.0.x series, as starting with MySQL 8.0.35 it is possible to upgrade and downgrade between 8.0.x versions without needing to run innodb_fast_shutdown=0.
Docker vitess/lite images with Debian Bookworm
The base system now uses Debian Bookworm instead of Debian Bullseye for the vitess/lite images. This change was brought by #17552.
New Support
More Efficient JSON Replication
In #7345 we added support for --binlog-row-value-options=PARTIAL_JSON, a feature added in MySQL 8.0.
If you are using MySQL 8.0 or later with JSON columns, you can now enable this MySQL feature across your Vitess cluster(s) to lower the disk space needed for binary logs and improve CPU and memory usage in both mysqld (standard intrashard MySQL replication) and vttablet (VReplication) without losing any capabilities or features.
LAST_INSERT_ID(x)
In #17408 and #17409, we added the ability to use LAST_INSERT_ID(x) in Vitess directly at vtgate. This improvement allows certain queries, like SELECT last_insert_id(123); or SELECT last_insert_id(count(*)) ..., to be handled without relying on MySQL for the final value.
Limitations:
- When using LAST_INSERT_ID(x) in ordered queries (e.g., SELECT last_insert_id(col) FROM table ORDER BY foo), MySQL sets the session's last-insert-id value according to the last row returned. Vitess does not guarantee the same behavior.
Maximum Idle Connections in the Pool
In #17443 we introduced a new configurable max-idle-count parameter for connection pools. This allows you to specify the maximum number of idle connections retained in each connection pool to optimize performance and resource efficiency.
You can control idle connection retention for the query server's query pool, stream pool, and transaction pool with the following flags:
- --queryserver-config-query-pool-max-idle-count: Defines the maximum number of idle connections retained in the query pool.
- --queryserver-config-stream-pool-max-idle-count: Defines the maximum number of idle connections retained in the stream pool.
- --queryserver-config-txpool-max-idle-count: Defines the maximum number of idle connections retained in the transaction pool.
This feature ensures that, during traffic spikes, idle connections are available for faster responses, while minimizing overhead in low-traffic periods by limiting the number of idle connections retained. It helps strike a balance between performance, efficiency, and cost.
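The effect of an idle cap can be shown with a toy pool. This is a minimal sketch under stated assumptions, not vttablet's pool: IdleCappedPool and FakeConn are hypothetical, total pool capacity and timeouts are not modelled, and the only rule implemented is that a returned connection is closed once max_idle connections are already idle.

```python
class FakeConn:
    """Stand-in for a MySQL connection; only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class IdleCappedPool:
    """Toy pool that retains at most max_idle idle connections."""

    def __init__(self, factory, max_idle):
        self._factory = factory
        self._max_idle = max_idle
        self._idle = []

    def get(self):
        # Reuse an idle connection when one is available, else open a new one.
        return self._idle.pop() if self._idle else self._factory()

    def put(self, conn):
        # Keep the connection only while the idle list is under the cap.
        if len(self._idle) < self._max_idle:
            self._idle.append(conn)
        else:
            conn.close()

    def idle_count(self):
        return len(self._idle)


pool = IdleCappedPool(FakeConn, max_idle=2)
burst = [pool.get() for _ in range(5)]  # a traffic spike opens 5 connections
for conn in burst:
    pool.put(conn)
print(pool.idle_count(), sum(c.closed for c in burst))
```

After the spike, two connections stay warm for the next request and the other three are closed.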
Filtering Query logs on Error
Setting querylog-mode to error logs only queries that result in errors. This option is supported in both VTGate and VTTablet.
MultiQuery RPC in vtgate
New RPCs have been added to vtgate that allow users to pass multiple queries in a single SQL string. They behave the same way MySQL does: the result sets are returned in the same order as the queries were passed, until an error is encountered. The new RPCs are ExecuteMulti and StreamExecuteMulti.
A new flag, --mysql-server-multi-query-protocol, makes the server use this new implementation. The flag defaults to false, so the old implementation is used unless it is set. The new implementation is more efficient and allows for better performance when executing multiple queries in a single RPC call.
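The ordering rule (results in submission order until the first error) can be sketched as follows. This is an illustration of the described semantics only, not the vtgate RPC API: execute_multi, split_statements and fake_execute are hypothetical, and the splitter is deliberately naive.

```python
def split_statements(sql):
    """Naive splitter: ignores semicolons inside string literals or comments."""
    return [part.strip() for part in sql.split(";") if part.strip()]


def execute_multi(sql, execute_one):
    """Run each statement in order and stop at the first error.

    Returns (results, error): results holds one entry per statement that
    succeeded, in submission order; error is None if all of them succeeded.
    """
    results = []
    for statement in split_statements(sql):
        try:
            results.append(execute_one(statement))
        except Exception as err:
            return results, err
    return results, None


def fake_execute(statement):
    # Hypothetical executor: fails on any statement mentioning "missing".
    if "missing" in statement:
        raise RuntimeError(f"table not found: {statement}")
    return f"rows for <{statement}>"


results, error = execute_multi("select 1; select * from missing; select 2", fake_execute)
print(results, error)
```

The third statement is never executed, because the second one failed.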
Unsharded CREATE PROCEDURE support
Until now, Vitess didn't allow users to create procedures through vtgate; they had to be created by running DDL directly against the vttablets. In this release, we have started adding support for running CREATE PROCEDURE statements through vtgate for unsharded keyspaces. Not all procedure constructs are currently supported in the parser, so there are still some limitations, which will be addressed in future releases.
Optimization
Prepared Statement
Prepared statements now benefit from Deferred Optimization, enabling parameter-aware query plans. Initially, a baseline plan is created at prepare-time, and on first execution, a more efficient parameter-optimized plan is generated. Subsequent executions dynamically switch between these plans based on input values, improving query performance while ensuring correctness.
RPC Changes
These are the RPC changes made in this release:
- The GetTransactionInfo RPC has been added to both the VtctldServer and TabletManagerClient interfaces. These RPCs let users read the state of an unresolved distributed transaction, which can be useful in debugging what went wrong and how to fix the problem.
Prefer not promoting a replica that is currently taking a backup
Emergency reparents now prefer not promoting replicas that are currently taking backups with a backup engine other than builtin. Note that if there's only one suitable replica to promote, and it is taking a backup, it will still be promoted.
For planned reparents, hosts taking backups with a backup engine other than builtin are filtered out of the list of valid candidates. This means they will never get promoted, not even if there are no other candidates.
Note that behavior for builtin backups remains unchanged: a replica that is currently taking a builtin backup will never be promoted, neither by planned nor by emergency reparents.
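The three rules above can be condensed into a small candidate filter. This sketch covers only the backup-related part of the decision (real reparents also weigh replication position and other criteria), and the Replica type, function names and tablet aliases are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    alias: str
    backup_engine: Optional[str] = None  # None means no backup is running


def planned_candidates(replicas: List[Replica]) -> List[Replica]:
    """Planned reparent: any replica that is taking a backup is filtered out."""
    return [r for r in replicas if r.backup_engine is None]


def emergency_candidates(replicas: List[Replica]) -> List[Replica]:
    """Emergency reparent: prefer idle replicas, fall back to non-builtin backups."""
    # Replicas taking a builtin backup are never promoted.
    eligible = [r for r in replicas if r.backup_engine != "builtin"]
    idle = [r for r in eligible if r.backup_engine is None]
    return idle or eligible


busy = [Replica("zone1-101", "xtrabackup"), Replica("zone1-102", "builtin")]
print([r.alias for r in planned_candidates(busy)])
print([r.alias for r in emergency_candidates(busy)])
print([r.alias for r in emergency_candidates(busy + [Replica("zone1-103")])])
```

With only busy replicas, a planned reparent has no candidates while an emergency reparent still falls back to the replica running the non-builtin backup.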
Semi-sync monitor in vttablet
A new component has been added to the vttablet binary to monitor the semi-sync status of primary vttablets. We’ve observed cases where a brief network disruption can cause the primary to get stuck indefinitely waiting for semi-sync ACKs. In rare scenarios, this can block reparent operations and render the primary unresponsive. More information can be found in the issues #17709 and #17749.
To address this, the new component continuously monitors the semi-sync status. If the primary becomes stuck on semi-sync ACKs, it generates writes to unblock it. If this fails, VTOrc is notified of the issue and initiates an emergency reparent operation.
The monitoring interval can be adjusted using the --semi-sync-monitor-interval flag, which defaults to 10 seconds.
Wrapped fatal transaction errors
When a query fails while in a transaction because the transaction is no longer valid (e.g. PRS, rollout, primary down, etc.), the original error is now wrapped in a VT15001 error.
For non-transactional queries that produce a VT15001, VTGate will try to roll back and clear the transaction.
Any new queries on the same connection will fail with a VT09032 error until a ROLLBACK is received to acknowledge that the transaction was automatically rolled back and cleared by VTGate.
VT09032 is returned to clients to avoid applications blindly sending queries to VTGate thinking they are still in a transaction.
This change was introduced by #17669.
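From the client's point of view, the behavior above is a small state machine on the connection. The following toy model is an assumption-laden sketch, not VTGate code: Session, VitessError and TransactionLost are hypothetical names, and TransactionLost stands in for whatever condition makes the transaction invalid.

```python
class TransactionLost(Exception):
    """Hypothetical signal that the open transaction is no longer valid."""


class VitessError(Exception):
    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


class Session:
    """Toy model of one client connection to VTGate."""

    def __init__(self, backend):
        self._backend = backend  # callable that runs SQL and may raise
        self.in_transaction = False
        self.needs_rollback_ack = False

    def execute(self, sql):
        statement = sql.strip().rstrip(";").upper()
        if self.needs_rollback_ack:
            if statement == "ROLLBACK":
                self.needs_rollback_ack = False  # client acknowledged the rollback
                return "OK"
            raise VitessError("VT09032", "transaction was rolled back; send ROLLBACK")
        if statement == "BEGIN":
            self.in_transaction = True
            return "OK"
        if statement in ("COMMIT", "ROLLBACK"):
            self.in_transaction = False
            return "OK"
        try:
            return self._backend(sql)
        except TransactionLost as err:
            if not self.in_transaction:
                raise
            # Clear the dead transaction and wrap the original error.
            self.in_transaction = False
            self.needs_rollback_ack = True
            raise VitessError("VT15001", str(err)) from err


def failing_backend(sql):
    raise TransactionLost("primary is down")


session = Session(failing_backend)
session.execute("BEGIN")
for sql in ("insert into t values (1)", "select 1", "ROLLBACK"):
    try:
        print(sql, "->", session.execute(sql))
    except VitessError as err:
        print(sql, "->", err.code)
```

The first statement reports VT15001, the next one is rejected with VT09032, and ROLLBACK returns the connection to a usable state.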
Minor Changes
--topo_read_concurrency behaviour changes
The --topo_read_concurrency flag was added to all components that access the topology, and the provided limit is now applied separately for each global or local cell (default 32).
All topology read calls (Get, GetVersion, List and ListDir) now respect this per-cell limit. Before this version, a single limit was applied to all cell calls and it was not respected by many topology calls.
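A per-cell limit amounts to keeping one semaphore per cell instead of one shared semaphore. The sketch below illustrates that idea only; PerCellLimiter is a hypothetical class rather than the Vitess topology client, and it uses non-blocking acquisition so that the effect is visible without threads.

```python
import threading


class PerCellLimiter:
    """Keeps one independent concurrency limit per cell name."""

    def __init__(self, limit=32):
        self._limit = limit
        self._lock = threading.Lock()
        self._semaphores = {}

    def _semaphore(self, cell):
        # Create the cell's semaphore lazily, guarded against races.
        with self._lock:
            if cell not in self._semaphores:
                self._semaphores[cell] = threading.BoundedSemaphore(self._limit)
            return self._semaphores[cell]

    def try_acquire(self, cell):
        """Claim a read slot for `cell` without blocking; False if it is full."""
        return self._semaphore(cell).acquire(blocking=False)

    def release(self, cell):
        self._semaphore(cell).release()


limiter = PerCellLimiter(limit=2)
print([limiter.try_acquire("global") for _ in range(3)])
print(limiter.try_acquire("zone1"))
```

Exhausting the slots for the global cell does not stop reads against zone1, which is the difference from a single shared limit.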
VTTablet
CLI Flags
- The twopc_abandon_age flag now supports values in the time.Duration format (e.g., 1s, 2m, 1h). While the flag will continue to accept float values (interpreted as seconds) for backward compatibility, float inputs are deprecated and will be removed in a future release.
- The --consolidator-query-waiter-cap flag sets the maximum number of clients allowed to wait on the consolidator. The default value is 0, meaning unlimited waiters. Users can adjust this value based on the performance of VTTablet to avoid excessive memory usage and the risk of being OOMKilled, particularly in Kubernetes deployments.
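Accepting both a duration string and a legacy float can be sketched as a two-step parse. This is an illustration, not vttablet's flag parsing: parse_abandon_age is a hypothetical helper, the order of the two attempts is an assumption, and the duration grammar is a simplified subset of Go's time.Duration format (no sign, no µs unit).

```python
import re

# Seconds per unit for the subset of Go duration units handled here.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")


def parse_abandon_age(value):
    """Return (seconds, deprecated) for a duration string or a bare float."""
    try:
        # Legacy form: a bare number interpreted as seconds (deprecated).
        return float(value), True
    except ValueError:
        pass
    total, position = 0.0, 0
    for match in _TOKEN.finditer(value):
        if match.start() != position:
            break  # unparseable text between tokens
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        position = match.end()
    if position == 0 or position != len(value):
        raise ValueError(f"invalid duration: {value!r}")
    return total, False


for raw in ("1s", "2m", "1h", "30"):
    print(raw, parse_abandon_age(raw))
```

The second element of the result marks the float form, so a caller could log a deprecation warning when it is used.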
ACL enforcement and reloading
When a tablet is started with --enforce-tableacl-config it will exit with an error if the contents of the file are not valid. After the changes made in #17485, the tablet no longer exits when reloading the contents of the file after receiving a SIGHUP. When the file contents are invalid on reload, the tablet now logs an error and the active in-memory ACLs remain in effect.
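The difference between startup and reload can be sketched as follows. This is a minimal illustration, not vttablet's ACL loader: ACLStore and parse_acl are hypothetical, and the validation (a JSON object containing a table_groups list) is a deliberately minimal stand-in for the real checks.

```python
import json
import logging


def parse_acl(text):
    """Hypothetical minimal validation: JSON object with a table_groups list."""
    config = json.loads(text)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(config, dict) or not isinstance(config.get("table_groups"), list):
        raise ValueError("ACL config must contain a table_groups list")
    return config


class ACLStore:
    def __init__(self, text):
        # At startup an invalid file is fatal: the exception propagates.
        self.active = parse_acl(text)

    def reload(self, text):
        """Called on SIGHUP; keeps the active ACLs if the new contents are bad."""
        try:
            new_config = parse_acl(text)
        except ValueError as err:
            logging.error("ACL reload failed, keeping active ACLs: %s", err)
            return False
        self.active = new_config
        return True


store = ACLStore('{"table_groups": [{"name": "g1"}]}')
print(store.reload("not json"), store.active)
print(store.reload('{"table_groups": []}'), store.active)
```

A failed reload returns False and leaves the previously loaded ACLs in place; only a valid file replaces them.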
VTAdmin
vtadmin-web updated to node v22.13.1 (LTS)
Building vtadmin-web now requires node >= v22.13.0 (LTS). Breaking changes from v20 to v22 can be found at https://nodejs.org/en/blog/release/v22.13.0, with no known issues that apply to VTAdmin.
Full details on the node v22.13.1 release can be found at https://nodejs.org/en/blog/release/v22.13.1.
The entire changelog for this release can be found here.
The release includes 466 merged Pull Requests.
Thanks to all our contributors: @GrahamCampbell, @GuptaManan100, @L3o-pold, @akagami-harsh, @anirbanmu, @app/dependabot, @app/vitess-bot, @arthmis, @arthurschreiber, @beingnoble03, @c-r-dev, @corbantek, @dbussink, @deepthi, @derekperkins, @ejortegau, @frouioui, @garfthoffman, @gmpify, @gopoto, @harshit-gangal, @huochexizhan, @jeefy, @jwangace, @kbutz, @lmorduch, @mattlord, @mattrobenolt, @maxenglander, @mcrauwel, @mounicasruthi, @niladrix719, @notfelineit, @rafer, @rohit-nayak-ps, @rvrangel, @shailpujan88, @shanth96, @shlomi-noach, @siadat, @systay, @timvaillancourt, @twthorn, @vitess-bot, @vmg, @wiebeytec, @wukuai