Fix benchmark with multiple deposits at the same time
4,401 commits this week (Mar 09, 2026 - Mar 16, 2026)
Fix offline mode emulation of a deposit
The UTxO is not instantly available, so e2e tests need to wait for the snapshot to be confirmed.
Make the hydra-cluster bench use deposits
This does not yet work for cluster sizes > 1, though.
Drop initial utxo from InitialSnapshot
This is always empty now, as any deposit/increment would require a signed snapshot.
Fix deposit silently dropped when it activates during in-flight snapshot
When onOpenChainTick fires and a snapshot is already in-flight, it correctly returns noop (no ReqSn). But currentDepositTxId stayed Nothing, so neither the timer nor maybeRequestNextSnapshot ever picked up the deposit afterwards; it would expire uncommitted.

Fix: set currentDepositTxId in the DepositActivated aggregate when it is currently Nothing, so the timer can include the deposit in the next available ReqSn once the in-flight snapshot confirms.

Signed-off-by: Sasha Bogicevic <[email protected]>
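The aggregate change above can be sketched with simplified, illustrative types (TxId is a plain String here; the real hydra-node types differ):

```haskell
-- Sketch of the DepositActivated aggregate fix: when a deposit activates
-- while a snapshot is in flight, remember its TxId if none is pending,
-- so the timer can include it in the next ReqSn. (Illustrative types.)
type TxId = String

aggregateDepositActivated :: Maybe TxId -> TxId -> Maybe TxId
aggregateDepositActivated current depositTxId = case current of
  Nothing -> Just depositTxId  -- previously this stayed Nothing: the bug
  Just _  -> current           -- an earlier deposit is already tracked

main :: IO ()
main = do
  print (aggregateDepositActivated Nothing "deposit-1")      -- Just "deposit-1"
  print (aggregateDepositActivated (Just "d0") "deposit-1")  -- Just "d0"
```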
Fix restartedNodeCanObserveCommitTx losing Committed during sync wait
withHydraNode waits for NodeSynced before yielding the client, consuming all messages (including Committed) in the process. Switch the restarted node to withHydraNodeCatchingUp so the Committed message is observed directly during chain replay.

Also scale all wait timeouts by blockTime.

Signed-off-by: Sasha Bogicevic <[email protected]>
Fix model test fanout failure when contest races with fanout
When a head is closed on an older snapshot (e.g. before a post-decommit snapshot confirms), all nodes contest. If HeadIsReadyToFanout fires before the contest settles, the first FanoutTx spends the pre-contest head UTXO and fails on-chain once the ContestTx is processed in the same block.

Re-sending Input.Fanout on each retry models the correct user behaviour: after a failed fanout attempt, the user resubmits, and the node builds a fresh FanoutTx against the updated localChainState (post-contest head UTXO), which then succeeds.

Signed-off-by: Sasha Bogicevic <[email protected]>
Update changelog related to back pressure change
Signed-off-by: Sasha Bogicevic <[email protected]>
Add back pressure for NewTx via HTTP and WebSocket APIs
Introduce tryEnqueueClient on InputQueue: a non-blocking enqueue that returns False immediately when the TBQueue is full, without the timer-coalescing side-effect of tryEnqueue. Wire it through as tryWireClientInput on DraftHydraNode.

HTTP POST /transaction now returns 503 immediately if the queue is full rather than blocking. WebSocket NewTx sends an InvalidInput error message back to the client instead of silently dropping or blocking.

Adds WaitOnDepositActivation to WaitReason (restoring parity with master, where it was regressed to a noop on this branch).

Signed-off-by: Sasha Bogicevic <[email protected]>
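The non-blocking enqueue could look roughly like this, assuming a plain TBQueue (a sketch; the actual InputQueue and tryWireClientInput wiring in hydra-node differ):

```haskell
import Control.Concurrent.STM

-- Sketch of a non-blocking enqueue: returns False immediately when the
-- bounded queue is full, instead of blocking the caller. The caller can
-- then answer 503 (HTTP) or send InvalidInput (WebSocket).
-- (Illustrative name; the real tryEnqueueClient differs.)
tryEnqueueClient :: TBQueue a -> a -> IO Bool
tryEnqueueClient q x = atomically $ do
  full <- isFullTBQueue q
  if full
    then pure False
    else True <$ writeTBQueue q x

main :: IO ()
main = do
  q <- newTBQueueIO 2
  r1 <- tryEnqueueClient q (1 :: Int)
  r2 <- tryEnqueueClient q 2
  r3 <- tryEnqueueClient q 3   -- queue is full now: rejected, not blocked
  print (r1, r2, r3)           -- (True,True,False)
```

The check and the write happen in one STM transaction, so another thread cannot fill the queue between them.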
Fix close by issuing an empty snapshot with the correct version
Signed-off-by: Sasha Bogicevic <[email protected]>
Update the changelog and documentation
Signed-off-by: Sasha Bogicevic <[email protected]>
Fix isEmpty deadlock when asyncTracked threads outlive the queue
isEmpty returned False while async threads were running but the queue was empty, causing runToCompletion to block forever in dequeue waiting for items that would never arrive. Use STM retry to wait until all tracked threads complete before declaring the queue empty.

Signed-off-by: Sasha Bogicevic <[email protected]>
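A minimal sketch of the tracked-thread idea, with illustrative names (the real asyncTracked/isEmpty in hydra-node differ): isEmpty retries until the tracked-thread count drops to zero, then answers for the queue itself.

```haskell
{-# LANGUAGE NamedFieldPuns #-}
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM
import Control.Monad (unless)

-- A queue paired with a counter of still-running background threads.
data Queue a = Queue { items :: TBQueue a, running :: TVar Int }

-- Run an action in a background thread, tracked by the counter.
asyncTracked :: Queue a -> IO () -> IO ()
asyncTracked Queue{running} act = do
  atomically $ modifyTVar' running (+ 1)
  _ <- forkIO (act >> atomically (modifyTVar' running (subtract 1)))
  pure ()

-- Retries (blocks) while any tracked thread is alive, so callers never
-- see "empty" while background work may still enqueue items.
isEmpty :: Queue a -> STM Bool
isEmpty Queue{items, running} = do
  n <- readTVar running
  unless (n == 0) retry
  isEmptyTBQueue items

main :: IO ()
main = do
  q <- Queue <$> (newTBQueueIO 10 :: IO (TBQueue Int)) <*> newTVarIO 0
  asyncTracked q (threadDelay 50000)  -- simulated background postTx
  e <- atomically (isEmpty q)         -- waits for the thread to finish
  print e                             -- True
```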
Fix BehaviorSpec tests broken by immediate ReqSn and deposit activation changes
Restore WaitOnDepositActivation (was regressed to noop on this branch) so
non-leader nodes retry a ReqSn that includes a deposit they haven't yet
activated, matching master behaviour.
Update four BehaviorSpec tests to reflect the immediate-ReqSn model introduced
in 7e570004b:
- "snapshots are created": tx 40 lands in sn=1 alone; txs 41+42 batch into sn=2
- "depending transactions confirmed in order": firstTx → sn=1, secondTx → sn=2
- "conflicting transactions": SnapshotConfirmed fires before TxInvalid
- "commit snapshot only approved when deposit settled": relax positive assertion
to [n1] — n2 dropped n1's AckSn while waiting for deposit activation and
never independently confirms; n2 still observes CommitFinalized on-chain
Signed-off-by: Sasha Bogicevic <[email protected]>
Fix canCommit test timeout for versionNeedsSnapshot contest race
The versionNeedsSnapshot timer fires an empty version-bump snapshot after each IncrementTx. When Close is sent before that snapshot confirms at the closing node, the node correctly contests with the better snapshot, extending the contestation deadline by one contestation period (10 * blockTime).

The previous buffer of 3 * blockTime was too short in that case, causing a timeout before ReadyToFanout arrived. Extended to 13 * blockTime to cover the contest round plus block-latency buffer.

Signed-off-by: Sasha Bogicevic <[email protected]>
Fix snapshot flooding by removing SeenSnapshot retry from onOpenTimer
The SeenSnapshot case in onOpenTimer re-broadcast ReqSn+AckSn on every timer tick while waiting for AckSns. With a 5ms timer interval and etcd delivering each message back as a NetworkInput, this created a feedback loop: each broadcast → etcd echo → resets lastWasTimer → timer fires again → more broadcasts → floods the etcd PersistentQueue (capacity 100) → blocks processEffects → delays incoming AckSns → exponential slowdown (200+ ReqSn per snapshot, 5s+ gaps at snapshot 7+).

Since etcd guarantees reliable delivery, the retry is unnecessary for normal L2 operation. Edge cases (deposit activation, version bumps) are handled by CommitFinalized/DecommitFinalized resetting seenSnapshot to LastSeenSnapshot, which lets the timer's fresh-send path handle them.

Restore maybeRequestNextSnapshot in onAckSn so the leader for sn+1 immediately chains the next snapshot after SnapshotConfirmed, avoiding an idle timer interval between snapshots.

Signed-off-by: Sasha Bogicevic <[email protected]>
Fix deadlock in processEffects when postTx fails
When two nodes race to post the same on-chain transaction (e.g. IncrementTx), the losing node's submission fails after a 1-second delay in txSubmissionClient. During this blocking wait, the timer (200 Hz) fills the bounded input queue (100 items), causing the subsequent enqueue of PostTxError to deadlock permanently. The main loop never recovers and chain events (OnIncrementTx → CommitFinalized) are never processed.

Fix by running postTx in a background thread via asyncTracked, which is tracked by isEmpty so runToCompletion in tests correctly waits for the background work to finish. This also makes PostTxError delivery guaranteed for all tx types (CollectComTx, CloseTx, etc.), not just the racing ones.

With the main loop no longer blocking on postTx, the 1-second threadDelay in txSubmissionClient is no longer needed: the chain observer naturally processes the winning transaction before PostTxError is enqueued.

Signed-off-by: Sasha Bogicevic <[email protected]>
Ignore CollectComTx PostTxError in Initial state
When postTx runs async (via asyncTracked), the losing node's
CollectComTx submission can fail and enqueue PostTxError before
OnCollectComTx has been observed — leaving the node still in Initial
state. The existing Open{} guard did not match, causing the wildcard
PostTxOnChainFailed to fire.
Fix by adding an Initial{} case mirroring the existing Open{} noop:
a CollectComTx failure in Initial simply means another party already
collected the head; OnCollectComTx will arrive shortly.
Signed-off-by: Sasha Bogicevic <[email protected]>
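The added case can be sketched with simplified, illustrative types (not the actual hydra-node HeadLogic state machine): a CollectComTx PostTxError is a noop both when the head is already Open and when it is still in Initial, because either way another party won the race to collect.

```haskell
-- Sketch of the PostTxError guard (illustrative types and names).
data HeadState = Idle | Initial | Open | Closed deriving (Eq, Show)
data PostChainTx = CollectComTx | CloseTx deriving (Eq, Show)
data Outcome = Noop | ClientError String deriving (Eq, Show)

onPostTxError :: HeadState -> PostChainTx -> Outcome
onPostTxError st tx = case (st, tx) of
  (Open, CollectComTx) -> Noop    -- existing case: head already collected
  (Initial, CollectComTx) -> Noop -- new case: OnCollectComTx arrives shortly
  _ -> ClientError ("PostTxOnChainFailed: " <> show tx)

main :: IO ()
main = do
  print (onPostTxError Initial CollectComTx)  -- Noop
  print (onPostTxError Initial CloseTx)
```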
Fix cluster test failures caused by timer-driven snapshot retry
The timer fires ~200 times per second once a head is Open. Items 1-3 below
stalled the main processing loop and caused HeadIsOpen to miss its timeout;
item 4 updates a test affected by the new timer behaviour:
1. InputQueue.tryEnqueue: io-classes has no tryWriteTBQueue (that is in
Katip). Implement non-blocking enqueue semantics using isFullTBQueue
+ conditional writeTBQueue in a single STM transaction. This prevents
the timer thread from blocking when the input queue is full.
2. Node.hs - skip tracing for TimerInput: 200 ticks/s × 3 traceWith
calls = 600 blocking writes/s into the 500-slot log queue. The queue
fills in < 1 s, causing traceWith to block and stall the main loop.
Meaningful outcomes (SnapshotRequested etc.) are still traced through
their own state-change and effect traces.
3. Node.hs - only enqueue TimerInput when Open: in Idle/Initial/Closed
the timer is a no-op in HeadLogic, so firing at 200 Hz only creates
STM contention and log pressure that delays chain event processing
(e.g. OnCollectComTx → HeadIsOpen). A queryNodeState STM read in the
timer thread does not slow the main loop — GHC STM writers never
retry due to unrelated readers.
4. canSideLoadSnapshot: the timer now auto-resolves the previously-stuck
snapshot (Alice re-broadcasts ReqSn, Carol signs on reconnect).
Remove the assertion that the snapshot remains stuck, update the
side-load to use the dynamically confirmed snapshot number and a
fresh UTxO so the test is valid regardless of timing.
Signed-off-by: Sasha Bogicevic <[email protected]>
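Item 3 can be sketched like this, with illustrative names standing in for queryNodeState and the input queue (the real hydra-node wiring differs):

```haskell
import Control.Concurrent.STM

-- Sketch: the timer thread reads the node state with a cheap STM read
-- and only enqueues a TimerInput while the head is Open. Outside Open
-- the tick is a no-op, avoiding queue and log pressure at 200 Hz.
data HeadState = Idle | Initial | Open | Closed deriving (Eq, Show)

timerTick :: TVar HeadState -> TBQueue String -> IO Bool
timerTick stateVar queue = atomically $ do
  st <- readTVar stateVar
  case st of
    Open -> True <$ writeTBQueue queue "TimerInput"
    _    -> pure False  -- Idle/Initial/Closed: skip entirely

main :: IO ()
main = do
  st <- newTVarIO Initial
  q <- newTBQueueIO 100
  r1 <- timerTick st q          -- Initial: skipped
  atomically (writeTVar st Open)
  r2 <- timerTick st q          -- Open: enqueued
  print (r1, r2)                -- (False,True)
```

As noted in the commit, the read-only transaction in the timer thread does not slow the main loop, since readers do not force writers to retry.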
Add property tests for timer behaviour in RequestedSnapshot and SeenSnapshot states
Two property-based tests added to HeadLogicSnapshotSpec to lock down correct
timer behaviour introduced in earlier commits:
- prop_timerIsNoopInRequestedSnapshotState: for any arbitrary
RequestedSnapshot{lastSeen, requested} state the timer must emit no ReqSn.
Re-broadcasting in this state uses stale state (currentDepositTxId is not set
until the leader receives its own echo) and would race with the original ReqSn,
causing peers to sign different snapshot contents.
- prop_timerSeenSnapshotRebroadcastMatchesInFlight: for any in-flight
SeenSnapshot the timer re-broadcast must use the in-flight snapshot's version
and number, not the speculatively-computed nextSn. Ensures all parties sign
the same content after a re-broadcast.
Also includes the regression unit test added in HeadLogicSpec ("timer does not
re-broadcast ReqSn without deposit while waiting for echo") and the onOpenTimer
nextSn fix (max confSn latestSeenSnapshotNumber + 1).
Signed-off-by: Sasha Bogicevic <[email protected]>
Update docs
Signed-off-by: Sasha Bogicevic <[email protected]>
Make snapshot protocol resilient to races and stale messages
Replace the immediate per-ReqTx snapshot trigger with a timer-driven
model: the periodic timer (TimerInput/onOpenTimer) batches pending work
into ReqSn requests, while AckSn confirmation still chains consecutive
snapshots for throughput.
- Remove maybeRequestSnapshot from onOpenNetworkReqTx
- Drop stale/duplicate ReqSn and AckSn silently (noop) instead of
waiting or erroring; remove the now-unused WaitOn* variants
- Reset localUTxO/localTxs/allTxs to confirmed state on version bump
when a snapshot was in-flight, so the timer can build a fresh ReqSn
- onOpenTimer re-broadcasts ReqSn + own AckSn when stuck in SeenSnapshot
Signed-off-by: Sasha Bogicevic <[email protected]>
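The timer-driven batching model can be sketched with simplified, illustrative types (the real onOpenTimer operates on the full head state):

```haskell
-- Sketch: instead of requesting a snapshot per ReqTx, the periodic timer
-- batches all pending local transactions into a single ReqSn, and stays
-- quiet while a snapshot is already in flight. (Illustrative types.)
data Effect = ReqSn Int [String] deriving (Eq, Show)

onOpenTimer :: Bool -> Int -> [String] -> Maybe Effect
onOpenTimer inFlight confirmedSn pendingTxs
  | inFlight = Nothing          -- a snapshot is in flight: noop
  | null pendingTxs = Nothing   -- nothing to snapshot yet
  | otherwise = Just (ReqSn (confirmedSn + 1) pendingTxs)

main :: IO ()
main = do
  print (onOpenTimer False 3 ["tx1", "tx2"])  -- Just (ReqSn 4 ["tx1","tx2"])
  print (onOpenTimer True 3 ["tx1"])          -- Nothing
```

Throughput is preserved by the AckSn path: once a snapshot confirms, the next leader chains the following ReqSn immediately rather than waiting for the next tick.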
Implement onTimer function to replace the way we do snapshots
Signed-off-by: Sasha Bogicevic <[email protected]>
Remove StrictData from all of hydra-node (and keep it in lib)
Signed-off-by: Sasha Bogicevic <[email protected]>
Re-add StrictData
Signed-off-by: Sasha Bogicevic <[email protected]>