Check for fully removed pieces if not found in deletion scheduling queue #947
ZenGround0 wants to merge 2 commits into pdpv0 from
rvagg left a comment
lgtm, except for that log needing some nuance
|
@rvagg With Zen OOO for the day (2026-02-06)/Monday, any concerns with me committing the suggestion and merging this so that we can cut a Curio tag?
|
@rjan90 make the changes here and ping me on Slack, and I'll come and re-review.
|
I am still seeing the errors in my logs after updating to this branch. Note that these |
|
@TippyFlitsUK would you mind grabbing the latest from this branch and giving it a go to see if you get those errors again, or anything like it? I believe this is good to go but would like to see confirmation.
|
Still seeing these warning messages, I'm afraid:
|
    return xerrors.Errorf("failed to check if piece is live: %w", err)
}
if live {
    return xerrors.Errorf("piece %d is not scheduled for removal", piece.PieceID)
- return xerrors.Errorf("piece %d is not scheduled for removal", piece.PieceID)
+ log.Warnw("piece is live but not in scheduled removals despite successful delete tx; (possible chain reorg) clearing stale delete tracking",
+     "dataSetId", piece.DataSetID, "pieceID", piece.PieceID, "txHash", piece.TxHash)
+ _, err := db.Exec(ctx, `UPDATE pdp_data_set_pieces SET rm_message_hash = NULL
+     WHERE data_set = $1 AND piece_id = $2 AND rm_message_hash = $3`,
+     piece.DataSetID, piece.PieceID, piece.TxHash)
+ if err != nil {
+     return xerrors.Errorf("failed to clear stale rm_message_hash: %w", err)
+ }
+ continue
I believe we might be seeing this for only a single piece, so I'm going to go out on a limb and say this instance was due to a chain reorg. We got a successful delete tx but now the chain says it's not a delete.
@ZenGround0 plausible?
oooh yeah that's a nice idea. Curio is executing these fast and reorging a block or two is pretty normal.
I'll file an issue to review Curio for these more generally. In the past we were handling these pretty well, but I suspect we haven't been as rigorous as we need to be.
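For anyone following along, here is a rough sketch of the decision this thread converges on for a piece whose delete tx succeeded. The names `Action`, `ProceedAsScheduled`, `TreatAsRemoved`, `ClearStaleDelete`, and `decide` are hypothetical and do not come from the Curio codebase. The two booleans stand in for the on-chain checks ("is it in the scheduled removals?" and "is it live?"). What the watcher does in the scheduled and fully removed cases is an assumption here. Only the live-but-unscheduled case follows the suggestion above.

```go
package main

import "fmt"

// Action is a hypothetical label for how a watcher could treat a piece
// after its delete transaction has landed. Not a Curio type.
type Action int

const (
	// ProceedAsScheduled: the piece is in the scheduled removals, so the
	// normal removal flow applies (assumed, not shown in this thread).
	ProceedAsScheduled Action = iota
	// TreatAsRemoved: not scheduled and not live, so the piece is assumed
	// to be fully removed on chain already.
	TreatAsRemoved
	// ClearStaleDelete: not scheduled but still live, so the delete was
	// possibly reorged out; clear rm_message_hash and move on.
	ClearStaleDelete
)

func (a Action) String() string {
	switch a {
	case ProceedAsScheduled:
		return "proceed-as-scheduled"
	case TreatAsRemoved:
		return "treat-as-removed"
	case ClearStaleDelete:
		return "clear-stale-delete"
	default:
		return "unknown"
	}
}

// decide maps the two on-chain observations to an action.
// scheduled: the piece appears in the scheduled removals.
// live: the piece is still reported as live.
func decide(scheduled, live bool) Action {
	switch {
	case scheduled:
		return ProceedAsScheduled
	case live:
		return ClearStaleDelete
	default:
		return TreatAsRemoved
	}
}

func main() {
	cases := []struct{ scheduled, live bool }{
		{true, true},
		{false, false},
		{false, true},
	}
	for _, c := range cases {
		fmt.Printf("scheduled=%t live=%t -> %s\n", c.scheduled, c.live, decide(c.scheduled, c.live))
	}
}
```

The third case is the one the suggested change turns from a hard error into a warning plus a guarded `UPDATE`, so a single reorged delete no longer stops the loop for the remaining pieces.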
Addressing #924
I suspect that this will fix the problem, but I could use your help confirming that it works, @TippyFlitsUK.