Release date: 2025-03-05
Version: v1.15.3

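Dapr 1.15.3 fixes three issues. First, a failed timer invocation no longer deactivates the timer: periodic timers keep firing even when the app returns a non-2xx status code. Second, Daprd no longer grows in memory without bound when using Workflows; the internal Actor lock object is now released when a Workflow Activity completes, which prevents out-of-memory crashes. Third, Scheduler no longer grows in memory without bound under heavy load; every 10 minutes each Scheduler host checks its Etcd database and defragments it when needed, releasing unused memory.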

Release notes (translated from Chinese)

Dapr 1.15.3

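This update includes the following bug fixes:

Fix timers deactivating after a timer invocation fails

Problem

Fixes this issue.
An app returning a non-2xx status code from a timer invocation would cause a periodic timer to stop triggering.

Impact

An Actor app that restarted, crashed, or was otherwise busy would cause a timer to stop triggering.
This breaks backwards compatibility: a periodic Actor timer should continue to trigger at its defined period even if the Actor is busy or returns an error.

Root cause

The Actor timer handling logic deactivated the timer if any timer invocation failed, regardless of whether the timer had further ticks in its period schedule.

Solution

As before v1.15.0, treat successful and failed timer invocations the same, and tick the Actor timer forward so that future invocations can occur.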
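
The behaviour described in this fix can be illustrated with a small model. This is a minimal sketch, not Dapr's implementation: `PeriodicTimer`, `fire`, and the `remaining` counter are hypothetical names introduced here, and the app call is represented by any callable that returns an HTTP-style status code or raises.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class PeriodicTimer:
    """Hypothetical model of a periodic Actor timer (not Dapr's real type)."""
    next_fire: datetime
    period: timedelta
    remaining: Optional[int] = None  # None means "repeat indefinitely"


def fire(timer: PeriodicTimer, invoke: Callable[[], int]) -> bool:
    """Invoke the app once, then advance the schedule regardless of the outcome.

    Returns True if the timer still has future ticks, False if its schedule
    is exhausted.
    """
    try:
        invoke()  # the status code is deliberately not inspected
    except Exception:
        pass      # an unreachable or crashing app is treated like any failure

    # Success and failure take the same path from here on.
    if timer.remaining is not None:
        timer.remaining -= 1
        if timer.remaining <= 0:
            return False
    timer.next_fire += timer.period
    return True
```

In this model, only an exhausted schedule stops the timer. A 500 response or a connection error still moves `next_fire` forward by one period.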

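Fix Daprd continuously growing in memory

Problem

Fixes an issue where Daprd would continually grow in memory when using Workflows.

Impact

Daprd would eventually use all available memory on the node or cgroup, causing an OOM crash.

Root cause

An internal Actor lock object used by Workflow Activities was not being released.

Solution

Release the lock memory when a Workflow Activity completes.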
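
The release notes do not show the internal lock type, so the following is only a sketch of the general class of leak: a per-key lock registry whose entries are never removed grows with every new key. `KeyedLocks` and `run_activity` are hypothetical names, and the reference-counting scheme is an assumption made for illustration, not Dapr's actual mechanism.

```python
import threading
from typing import Dict


class KeyedLocks:
    """Hypothetical per-key lock registry; not Dapr's internal type."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: Dict[str, threading.Lock] = {}
        self._users: Dict[str, int] = {}

    def acquire(self, key: str) -> None:
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
            self._users[key] = self._users.get(key, 0) + 1
        lock.acquire()  # wait outside the guard so other keys are not blocked

    def release(self, key: str) -> None:
        with self._guard:
            lock = self._locks[key]
            self._users[key] -= 1
            if self._users[key] == 0:
                # Last user is done: drop the entry so it can be collected.
                del self._locks[key]
                del self._users[key]
        lock.release()

    def __len__(self) -> int:
        with self._guard:
            return len(self._locks)


def run_activity(locks: KeyedLocks, activity_id: str) -> str:
    """Run a stand-in activity and always release its lock on completion."""
    locks.acquire(activity_id)
    try:
        return f"done:{activity_id}"
    finally:
        locks.release(activity_id)
```

Without the two `del` statements in `release`, the registry would keep one entry per activity ever run, which is the kind of unbounded growth the fix addresses.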

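Fix Scheduler continuously growing in memory

Problem

Scheduler would continuously grow in memory under heavy usage, for example when using Workflows.

Impact

Scheduler would eventually use all available memory on the node or cgroup, causing an OOM crash.

Root cause

Etcd does not automatically defragment after compaction, so unused memory was not released.

Solution

Every 10 minutes, each Scheduler host checks whether the total memory is twice the size of the used memory and, if so, defragments that host's Etcd database.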
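
The check described above can be sketched as a small decision function. This is an illustration under stated assumptions: `DbStatus`, `should_defragment`, and `maintenance_tick` are hypothetical names, "twice the size" is read as "at least twice", and the defragment step is passed in as a callable rather than calling any real Etcd API.

```python
from dataclasses import dataclass
from typing import Callable

CHECK_INTERVAL_SECONDS = 10 * 60  # "every 10 minutes" per the release note


@dataclass
class DbStatus:
    """Hypothetical snapshot of one host's Etcd sizes, in bytes."""
    total_bytes: int  # space currently allocated
    used_bytes: int   # space actually holding live data


def should_defragment(status: DbStatus) -> bool:
    """Return True when allocated space is at least twice the space in use.

    Assumption: the note's "is twice the size" is interpreted as ">=".
    """
    if status.total_bytes == 0:
        return False  # nothing allocated, nothing to reclaim
    return status.total_bytes >= 2 * status.used_bytes


def maintenance_tick(status: DbStatus, defragment: Callable[[], None]) -> bool:
    """Run one periodic check; call `defragment` only when it is warranted."""
    if should_defragment(status):
        defragment()
        return True
    return False
```

A caller would invoke `maintenance_tick` once per `CHECK_INTERVAL_SECONDS` on each host, so that a host whose allocation has not doubled relative to its live data skips the defragmentation step.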

Release notes (original)

Dapr 1.15.3

This update includes bug fixes:

Fix Timers Deactivating after timer invocation fails

Problem

Fixes this issue. An app returning a non-2xx status code from a timer invocation would cause a periodic timer to no longer trigger.

Impact

An Actor app which restarted/crashed, or was otherwise busy, would cause a timer to no longer trigger. This breaks backwards compatibility where a periodic Actor timer would continue to trigger at the defined period, even if the actor was busy or had an error.

Root cause

The Actor timer handle logic deactivates the timer if any timer invocation failed, regardless of whether the timer had further ticks defined in its period schedule.

Solution

As was the case before v1.15.0, treat any successful or failed timer invocation as the same, and tick the Actor timer forward, allowing for future invocations.

Fix Daprd continuously growing in memory

Problem

Fixes an issue where Daprd would continually grow in memory when using Workflows.

Impact

Daprd would eventually use all available memory on the node or cgroup, causing an OOM crash.

Root cause

An internal Actor lock object was not being released from Workflow Activities.

Solution

Release lock memory during Workflow Activity completion.

Fix Scheduler continuously growing in memory

Problem

Scheduler would continuously grow in memory when under heavy usage, for example using Workflows.

Impact

Scheduler would eventually use all available memory on the node or cgroup, causing an OOM crash.

Root cause

Etcd does not automatically Defragment after compaction, causing unused memory to not be released.

Solution

Every 10 minutes, each Scheduler host will check whether the total memory is twice the size of used memory and if so, will defragment that host’s Etcd database.

Download links