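Longhorn v1.6.3-rc2 release notes
Release date: 2024-09-18
Version: v1.6.3-rc2
This is a release candidate (rc2) for Longhorn v1.6.3. It mainly fixes bugs and refines existing features. Note: upgrading to or from any RC/Preview/Sprint release is not supported and may lead to unexpected behavior.
New feature: additional monitoring settings for the ServiceMonitor resource.
Improvements: fix contradicting node status events; always update the built-in system packages when building component images; update sizes in Engine and Volume resources less frequently; reduce the flood of engine proxy errors in the Longhorn Manager log; apply the BackingImage name when restoring the latest backup; simplify the Helm chart values.yaml; improve the BackingImage UI; improve how changes on the Settings page are saved; raise the client request rate limit in the CSI sidecar components; add a setting for the support bundle node collection timeout; fix mounting of XFS volume clones and restored snapshots; fix expansion of volumes created through the Longhorn UI; have environment_check.sh check for the iscsi_tcp kernel module; make the too-many-snapshots UI warning more prominent.
Bug fixes: a performance regression in 1.6.x; backup retention not being honored; an error after canceling a volume expansion; node drain blocked by orphan engine processes; missing instance manager selector labels; encrypted volumes not reopened correctly after a service restart; StorageClass settings being reset; single-replica backups getting stuck; nodes wrongly treated as unschedulable; snapshots missing the status field; system backup failures that put DR volumes into an attach-detach loop; image security issues; replica rebuild failures; instance manager probe failures; excessive share manager reconciliation; an incorrect snapshot count in the UI; failure to create small XFS volumes; filesystem trim recurring job timeouts; snapshot revert problems after upgrade; inconsistent letter case in the Replica Auto Balance options. These are among the 37 bug entries listed in this release.
Other: the best practice page now mentions the known broken kernel versions.
Contributors: 14 developers, including @ChanYiLin, @PhanLe1010, @c3y1huang, and @ejweber.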
Release notes (original)
DON’T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.
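The rule above can be turned into a simple pre-flight check. The following is a minimal sketch, not part of Longhorn: it assumes version tags look like `vMAJOR.MINOR.PATCH` with an optional `-suffix` (as in `v1.6.3-rc2` or `v1.7.0-dev`, both of which appear in these notes) and treats any suffixed tag as a non-stable build. The function names `is_prerelease` and `touches_prerelease` are hypothetical. It only flags pre-release tags and does not validate the upgrade path in any other way.

```python
import re

# Assumption: tags follow "vMAJOR.MINOR.PATCH" with an optional "-suffix"
# (for example "-rc2" or "-dev") marking a non-stable build.
_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def is_prerelease(tag: str) -> bool:
    """Return True if the tag carries a pre-release suffix such as -rc2."""
    match = _TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    return match.group(4) is not None


def touches_prerelease(current: str, target: str) -> bool:
    """Return True if either end of a planned upgrade is a pre-release build."""
    return is_prerelease(current) or is_prerelease(target)


if __name__ == "__main__":
    for current, target in [("v1.6.2", "v1.6.3-rc2"), ("v1.6.2", "v1.6.3")]:
        if touches_prerelease(current, target):
            verdict = "not supported (pre-release involved)"
        else:
            verdict = "no pre-release involved"
        print(f"{current} -> {target}: {verdict}")
```

A check like this could run in a deployment script before the installed version is changed, so that a release candidate is never upgraded in place.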
Resolved Issues in this release
Feature
- [BACKPORT][v1.6.3][FEATURE] Add additional monitoring settings to ServiceMonitor resource. 8984 - @ejweber @chriscchien
Improvement
- [BACKPORT][v1.6.3][IMPROVEMENT] Fix contradicting node status events 9327 - @ejweber @roger-ryao
- [BACKPORT][v1.6.3][IMPROVEMENT] Always update the built-in installed system packages when building component images 8722 - @yangchiu @c3y1huang
- [BACKPORT][v1.6.3][IMPROVEMENT] Update sizes in Engine and Volume resources less frequently 8684 - @ejweber @roger-ryao
- [BACKPORT][v1.6.3][IMPROVEMENT] Longhorn Manager flooded with “Failed to get engine proxy of … cannot get client for engine” messages 8729 - @derekbit @roger-ryao
- [BACKPORT][v1.6.3][IMPROVEMENT] Restore Latest Backup should be applied with BackingImage name value 8671 - @a110605 @roger-ryao
- [BACKPORT][v1.6.3][IMPROVEMENT] Improve and simplify chart values.yaml 8636 - @ChanYiLin @chriscchien
- [BACKPORT][v1.6.3][IMPROVEMENT] BackingImage UI improvement 8655 - @a110605 @roger-ryao
- [BACKPORT][v1.6.3][IMPROVEMENT] Saving Settings page changes 8602 - @a110605 @roger-ryao
- [BACKPORT][v1.6.3][IMPROVEMENT] The client-go rest client rate limit inside the csi sidecar components might be too small (csi-provisioner, csi-attacher, csi-snapshotter) 8726 - @PhanLe1010
- [BACKPORT][v1.6.3][IMPROVEMENT] Add setting to configure support bundle timeout for node bundle collection 8624 - @c3y1huang @chriscchien
- [BACKPORT][v1.6.3][IMPROVEMENT] Problems mounting XFS volume clones / restored snapshots 8797 - @PhanLe1010 @chriscchien
- [BACKPORT][v1.6.3][IMPROVEMENT] Cannot expand a volume created by Longhorn UI 8828 - @mantissahz
- [BACKPORT][v1.6.3][IMPROVEMENT] environment_check.sh should check for the iscsi_tcp kernel module 8720 - @tserong @roger-ryao
- [BACKPORT][v1.6.3][IMPROVEMENT] toomanysnapshots UI element not prominent enough to prevent runaway snapshots 8672 - @a110605 @roger-ryao
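One of the improvements above has environment_check.sh check for the iscsi_tcp kernel module. As an illustration of what such a check involves, and not the script's actual implementation, here is a minimal sketch that looks for a module name in `/proc/modules`, where each line starts with the name of a loaded module. The helper names `loaded_modules` and `is_module_loaded` are hypothetical. Modules compiled directly into the kernel do not appear in `/proc/modules`, so a negative result here is only a hint.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Parse /proc/modules content; the module name is the first field of each line."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_module_loaded(name: str, proc_modules_text: str) -> bool:
    """Return True if the named module is listed as a loadable module."""
    return name in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    try:
        text = Path("/proc/modules").read_text()
    except FileNotFoundError:
        print("/proc/modules not available on this system")
    else:
        state = "loaded" if is_module_loaded("iscsi_tcp", text) else "not listed"
        print(f"iscsi_tcp: {state}")
```

Passing the file content in as a string keeps the parsing logic testable on machines that have no `/proc` filesystem.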
Bug
- [BUG] Regression in 1.6.x-head, significant increase in execution time 9439 - @ChanYiLin @roger-ryao
- [BUG] test case test_recurring_job the backup recurring job’s retain is not working on v1.6.x-head for amd64 9454 - @mantissahz @chriscchien
- [BACKPORT][v1.6.3][BUG] Canceling expansion results in a volume expansion error 9469 - @derekbit
- [BUG] test case test_support_bundle_should_not_timeout timeout on v1.6.x-head for amd64 9452 - @yangchiu @derekbit
- [BACKPORT][v1.6.3][BUG] Replica Auto Balance options under General Setting and under Volume section should have similar case 8786 - @yangchiu @a110605
- [BACKPORT][v1.6.3][BUG] kubectl drain node is blocked by unexpected orphan engine processes 9446 - @ejweber @chriscchien @roger-ryao
- [BUG][UI][v1.6.x] blank dropdown menu in update volume property modals 9465 - @a110605
- [BACKPORT][v1.6.3][BUG] Instance manager missing required selector labels after manager crash 9472 - @c3y1huang @chriscchien
- [BACKPORT][v1.6.3][BUG] Engine Upgrade to 1.7.1 fails on volumes with strict-local data locality 9447 - @james-munson @chriscchien
- [BUG][v1.6.x] Abnormal snapshot missing status field 9438 - @yangchiu @derekbit
- [BACKPORT][v1.6.3][BUG] Single replica volume backup stuck at progress 17% indefinitely after a node rebooted 9399 - @yangchiu @ChanYiLin
- [BUG] Security issues in longhorn 1.6.2 version images 9132 - @c3y1huang
- [BACKPORT][v1.6.3][BUG] Longhorn did not close and open encrypted volumes correctly when the service k3s-agent restarted for a while 9386 - @mantissahz @roger-ryao
- [BACKPORT][v1.6.3][BUG] Non-existing block device causes longhorn-manager to enter CrashLoopBackOff state 9074 - @yangchiu @derekbit
- [BACKPORT][v1.6.3][BUG] Longhorn keeps resetting my storageClass 9395 - @mantissahz @roger-ryao
- [BACKPORT][v1.6.3][BUG] Volume failed to create healthy replica after data locality and replica count changed and got stuck in degraded state forever 8561 - @ejweber @chriscchien @roger-ryao
- [BACKPORT][v1.6.3][BUG] [Backupstore] Need to close the reader after downloading files for the Azure backup store driver. 9283 - @yangchiu @mantissahz
- [BACKPORT][v1.6.3][BUG] LH fails silently when node has attached volumes 9211 - @yangchiu @ejweber
- [BACKPORT][v1.6.3][BUG] Fix longhorn-manager TestCleanupRedundantInstanceManagers 8670 - @derekbit @roger-ryao
- [BACKPORT][v1.6.3][BUG] Longhorn thinks node is unschedulable 9052 - @c3y1huang @roger-ryao
- [BACKPORT][v1.6.3][BUG] Volume stuck in degraded 9295 - @PhanLe1010 @roger-ryao
- [BACKPORT][v1.6.3][BUG] test case test_system_backup_and_restore_volume_with_backingimage failed on sle-micro ARM64 9227 - @ChanYiLin @roger-ryao
- [BACKPORT][v1.6.3][BUG] System Backup Fails and DR Volume Enters Attach-Detach Loop When Volume Backup Policy is Set to Always 9339 - @c3y1huang @roger-ryao
- [BACKPORT][v1.6.3][BUG] v1 volume replica rebuild fails after upgrade from v1.7.0 to v1.7.1-rc1 9336 - @PhanLe1010 @chriscchien
- [BACKPORT][v1.6.3][BUG] instance-manager is stuck at starting state 8678 - @derekbit
- [BACKPORT][v1.6.3][BUG] instance-manager pod for v2 volume is killed due to a failed liveness probe. 8808 - @derekbit @chriscchien
- [BACKPORT][v1.6.3][BUG] Pod auto-deletion may cause thousands of logs 9020 - @ejweber @roger-ryao
- [BACKPORT][v1.6.3][BUG] Share manager controller reconciles tens of thousands of times 9088 - @ejweber @roger-ryao
- [BACKPORT][v1.6.3][BUG] Scale replica snapshots warning 8851 - @ejweber
- [BACKPORT][v1.6.3][BUG] Longhorn can no longer create XFS volumes smaller than 300 MiB 8560 - @ejweber @chriscchien
- [BACKPORT][v1.6.3][BUG] Filesystem trim RecurringJob times out (volumes where files are frequently created and deleted) 9048 - @c3y1huang @chriscchien
- [BACKPORT][v1.6.3][BUG] Can not revert V2 volume snapshot after upgrade from v1.6.2 to v1.7.0-dev 9066 - @chriscchien @DamiaSan
- [BACKPORT][v1.6.3][BUG] Rebuilding Replica fails on larger volumes 8949 -
- [BACKPORT][v1.6.3][BUG] Orphan longhorn-engine-manager and longhorn-replica-manager services 8858 - @PhanLe1010 @chriscchien
- [BACKPORT][v1.6.3][BUG] When revision counter is disabled, the engine might choose a replica with a smaller head size to be the source of truth for auto-salvage 8661 - @PhanLe1010
- [BACKPORT][v1.6.3][BUG] toomanysnapshots UI message displays incorrect snapshot count 8700 - @ejweber
- [BACKPORT][v1.6.3][BUG] Uninstallation will fail if invalid backuptarget is set. 8793 - @mantissahz @chriscchien
Misc
- [BACKPORT][v1.6.3][TASK] Update the best practice page to mention these broken kernels 8882 - @PhanLe1010
Contributors
- @ChanYiLin
- @DamiaSan
- @PhanLe1010
- @a110605
- @c3y1huang
- @chriscchien
- @derekbit
- @ejweber
- @innobead
- @james-munson
- @mantissahz
- @roger-ryao
- @tserong
- @yangchiu