Add ERA file producer on a running beacon node#60
Open
Fix a bug in `verify_header_signature` which tripped up some Lighthouse nodes at the Fusaka fork. The bug was latent in a function that has been present for a long time but was only used by slashers. With Fulu it entered the critical path of blob/column verification. Call stack:

- `FetchBlobsBeaconAdapter::process_engine_blobs`
- `BeaconChain::process_engine_blobs`
- `BeaconChain::check_engine_blobs_availability_and_import`
- `BeaconChain::check_blob_header_signature_and_slashability`
- `verify_header_signature`

Thanks @eserilev for quickly diagnosing the root cause.

Change `verify_header_signature` to use `ChainSpec::fork_at_epoch` to compute the `Fork`, rather than using the head state's fork. At a fork boundary the head state's fork is stale and lacks the data for the new fork. Using `fork_at_epoch` ensures that we use the correct fork data and validate the transition block's signature correctly.

Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
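The difference between the two lookups can be illustrated with a minimal, self-contained sketch. The `Fork` and `ForkSchedule` types, the version bytes, and the activation epochs below are hypothetical stand-ins rather than Lighthouse's real `ChainSpec`; only the idea (derive the fork from the block's epoch instead of reading it from the head state) comes from the description above.

```rust
/// Simplified stand-in for the consensus `Fork` container (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Fork {
    pub previous_version: [u8; 4],
    pub current_version: [u8; 4],
    pub epoch: u64,
}

/// Hypothetical fork schedule: `(activation_epoch, version)` pairs sorted by
/// epoch. Assumed to be non-empty, with the first entry activating at epoch 0.
pub struct ForkSchedule {
    pub forks: Vec<(u64, [u8; 4])>,
}

impl ForkSchedule {
    /// Compute the `Fork` in effect at `epoch` from the schedule alone,
    /// without consulting any (possibly stale) beacon state.
    pub fn fork_at_epoch(&self, epoch: u64) -> Fork {
        // Index of the last fork whose activation epoch is <= `epoch`.
        let idx = self
            .forks
            .iter()
            .rposition(|(activation, _)| *activation <= epoch)
            .unwrap_or(0);
        let (activation, current) = self.forks[idx];
        // At genesis the previous version equals the current version.
        let previous = if idx == 0 { current } else { self.forks[idx - 1].1 };
        Fork {
            previous_version: previous,
            current_version: current,
            epoch: activation,
        }
    }
}
```

With a schedule that activates a new version at epoch 10, a head state still in epoch 9 carries the old fork, while `fork_at_epoch(10)` yields the new one, which is the fork a block in the first slot of epoch 10 is signed under.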
None

I noticed that `observed_column_sidecars` is missing its prune call in the finalization handler, which results in a slow memory leak (**7MB/day**) on long-running nodes: https://github.com/sigp/lighthouse/blob/13dfa9200f822c41ccd81b95a3f052df54c888e9/beacon_node/beacon_chain/src/canonical_head.rs#L940-L959

Both caches use the same generic type `ObservedDataSidecars<T>`: https://github.com/sigp/lighthouse/blob/22ec4b327186c4a4a87d2c8c745caf3b36cb6dd6/beacon_node/beacon_chain/src/beacon_chain.rs#L413-L416

The type's documentation explicitly requires manual pruning:

> "*The cache supports pruning based upon the finalized epoch. It does not automatically prune, you must call Self::prune manually.*"

https://github.com/sigp/lighthouse/blob/b4704eab4ac8edf0ea0282ed9a5758b784038dd2/beacon_node/beacon_chain/src/observed_data_sidecars.rs#L66-L74

Currently:

- `observed_blob_sidecars` => pruned
- `observed_column_sidecars` => **NOT** pruned

Without pruning, the underlying HashMap accumulates entries indefinitely, causing continuous memory growth until the node restarts.

Co-Authored-By: Antoine James <antoine@ethereum.org>
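As a rough illustration of why the missing call leaks, here is a minimal sketch of a manually pruned observation cache. `ObservedCache` and its fields are hypothetical and far simpler than the real `ObservedDataSidecars<T>`; the sketch only assumes the documented behaviour that nothing is dropped unless `prune` is called.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified observation cache keyed by slot.
#[derive(Default)]
pub struct ObservedCache {
    finalized_slot: u64,
    /// slot -> set of sidecar indices already observed at that slot
    items: HashMap<u64, HashSet<u64>>,
}

impl ObservedCache {
    /// Record an observation. Returns `true` if it had already been seen.
    pub fn observe(&mut self, slot: u64, index: u64) -> bool {
        !self.items.entry(slot).or_default().insert(index)
    }

    /// Drop every entry at or before `finalized_slot`.
    /// Nothing calls this automatically; the owner must invoke it.
    pub fn prune(&mut self, finalized_slot: u64) {
        if finalized_slot <= self.finalized_slot {
            return; // finalization never moves backwards
        }
        self.finalized_slot = finalized_slot;
        self.items.retain(|slot, _| *slot > finalized_slot);
    }

    /// Number of slots currently tracked.
    pub fn len(&self) -> usize {
        self.items.len()
    }
}
```

If the finalization handler calls `prune` on one such cache but not on the other, the second keeps one map entry per slot forever, which matches the slow growth described above.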
I was resolving CI issues for my gloas block production [PR](sigp#8313), and noticed the `make audit-CI` [check](https://github.com/sigp/lighthouse/actions/runs/20588442102/job/59129268003) was failing due to:

```
Crate: ruint
Version: 1.17.0
Title: Unsoundness of safe `reciprocal_mg10`
Date: 2025-12-22
ID: RUSTSEC-2025-0137
URL: https://rustsec.org/advisories/RUSTSEC-2025-0137
Solution: Upgrade to >=1.17.1
```

Using the latest stable rust, `1.92.0`, I ran `cargo update ruint` -> `cargo check` -> `make audit-CI`, which passed.

Co-Authored-By: shane-moore <skm1790@gmail.com>
Which issue # does this PR address? sigp#8586 Please list or describe the changes introduced by this PR. Remove `service_name` from `TaskExecutor` Co-Authored-By: Abhivansh <31abhivanshj@gmail.com>
…igp#8614) This PR does two small things: - Removes the allocations that were happening on each loop - Makes it more explicit that the bit in the index is only being used to specify the order of the inputs for the hash function Co-Authored-By: Kevaundray Wedderburn <kevtheappdev@gmail.com>
Closes sigp#8569 Updates the HTTP API error when the node cannot reconstruct blobs due to "Insufficient data columns". Changes the response from 500 Internal Server Error to 400 Bad Request and adds a hint to run with --supernode or --semi-supernode. Co-Authored-By: Andrurachi <andruvrch@gmail.com>
Fixes attester cache write lock contention. Alternative to sigp#8463. Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Co-Authored-By: shane-moore <skm1790@gmail.com>
sigp#8547 This unifies the following `crypto` dependencies to a single version each: - `sha2` - `hmac` - `pbkdf2` - `aes` - `cipher` - `ctr` - `scrypt` - `digest` Co-Authored-By: Mac L <mjladson@pm.me>
```bash
$ lcli mock-el ....
...
...
Dec 15 11:52:06.002 INFO Metrics HTTP server started listen_address: "127.0.0.1:8551"
...
```

The log message "Metrics HTTP server" was misleading, as the server is actually a Mock Execution Client that provides a JSON-RPC API for testing purposes, not a metrics server.

Co-Authored-By: ackintosh <sora.akatsuki@gmail.com>
Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>
…gp#8498) Which issue # does this PR address? None. Discussed in private with @jimmygchen: Lighthouse's `earliest_available_slot` is guaranteed to always align with epoch boundaries, but as a safety measure we should use `start_slot` in case other clients differ in their implementations. We agreed it would be safer for `synced_peers_for_epoch`; I also made the change in `has_good_custody_range_sync_peer`, but that one still needs review. Co-Authored-By: Antoine James <antoine@ethereum.org> Co-Authored-By: Jimmy Chen <jimmy@sigmaprime.io>
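A small sketch of why comparing against `start_slot` is the safer check. The helper names and the comparison itself are hypothetical (the actual logic in `synced_peers_for_epoch` is not shown here), and `SLOTS_PER_EPOCH = 32` is the assumed mainnet value.

```rust
/// Assumed mainnet value.
pub const SLOTS_PER_EPOCH: u64 = 32;

/// Epoch that contains `slot`.
pub fn epoch_of(slot: u64) -> u64 {
    slot / SLOTS_PER_EPOCH
}

/// First slot of `epoch`.
pub fn start_slot(epoch: u64) -> u64 {
    epoch * SLOTS_PER_EPOCH
}

/// Hypothetical check: can a peer advertising `earliest_available_slot`
/// serve every slot of `epoch`? Comparing slots (not epochs) stays correct
/// even if the peer's earliest slot is not aligned to an epoch boundary.
pub fn peer_covers_epoch(earliest_available_slot: u64, epoch: u64) -> bool {
    earliest_available_slot <= start_slot(epoch)
}
```

For a peer whose earliest slot is 100, an epoch-level comparison (`epoch_of(100) <= 3`) would accept it for epoch 3, even though slots 96 to 99 are missing; the slot-level check rejects it.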
Just visual clean-up, making logging statements look uniform. There's no reason to use `tracing::debug` instead of `debug`. If we ever need to migrate our logging lib in the future it would make things easier too. Co-Authored-By: dapplion <35266934+dapplion@users.noreply.github.com> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
…#8141) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu>
N/A The `beacon_data_column_sidecar_computation_seconds` used to record the full kzg proof generation times before we changed getBlobsV2 to just return the full proofs + cells. This metric should be taking way less time than 100ms which was the minimum bucket previously. Update the metric to use the default buckets for better granularity. Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
N/A Add standardized metrics for getBlobsV2 from ethereum/beacon-metrics#14. Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
…gp#8653) sigp#8652 - This removes instances of `BeaconStateError` from `eth_spec.rs`, and replaces them directly with `ArithError` which can be trivially converted back to `BeaconStateError` at the call site. - Also moves the state related methods on `ChainSpec` to be methods on `BeaconState` instead. I think this might be a more natural place for them to exist anyway. Co-Authored-By: Mac L <mjladson@pm.me>
Removes some of the temporary re-exports in `consensus/types`. I am doing this in multiple parts to keep each diff small. Co-Authored-By: Mac L <mjladson@pm.me>
[Missing values in /eth/v1/config/spec sigp#8571](sigp#8571) - there will be a follow-up PR for the re-org props.

1. As per the above issue from EF DevOps, I added

   ```
   "EPOCHS_PER_SUBNET_SUBSCRIPTION": "256",
   "ATTESTATION_SUBNET_COUNT": "64",
   "ATTESTATION_SUBNET_EXTRA_BITS": "0",
   "UPDATE_TIMEOUT": "8192",
   "DOMAIN_BLS_TO_EXECUTION_CHANGE": "0x0a000000"
   ```

   to `/eth/v1/config/spec`.
2. Had to change the minimal config for `UPDATE_TIMEOUT` to get the current tests to pass. This is OK given `UPDATE_TIMEOUT` is not used in Lighthouse, as this config is for the light client spec from Altair.
3. `ATTESTATION_SUBNET_PREFIX_BITS` is now dynamically calculated and shimmed into the `/eth/v1/config/spec` output as advised by @michaelsproul.

Co-Authored-By: Joseph Patchen <josephmipatchen@gmail.com>
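For reference, the consensus p2p spec defines `ATTESTATION_SUBNET_PREFIX_BITS` as `ceillog2(ATTESTATION_SUBNET_COUNT) + ATTESTATION_SUBNET_EXTRA_BITS`. The sketch below shows that derivation with the values listed above; the function names are illustrative and are not taken from the Lighthouse code.

```rust
/// Ceiling of log2(x) for x >= 1.
pub fn ceil_log2(x: u64) -> u32 {
    assert!(x > 0, "ceil_log2 is undefined for 0");
    // Bit length of (x - 1): 0 for x == 1, 6 for x == 64, 7 for x == 65.
    64 - (x - 1).leading_zeros()
}

/// Derive the prefix bits from the two configured values
/// (illustrative helper, not the Lighthouse implementation).
pub fn attestation_subnet_prefix_bits(subnet_count: u64, extra_bits: u32) -> u32 {
    ceil_log2(subnet_count) + extra_bits
}
```

With `ATTESTATION_SUBNET_COUNT = 64` and `ATTESTATION_SUBNET_EXTRA_BITS = 0`, `attestation_subnet_prefix_bits(64, 0)` evaluates to 6.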
Remove more of the temporary re-exports from `consensus/types` Co-Authored-By: Mac L <mjladson@pm.me>
…sigp#8666) sigp#8652 This moves the `ExecutionBlockHash` from the `execution` module to the `core` module. This allows `core` to not depend on the `execution` module, and the `ExecutionBlockHash` is a pretty core part of our types so I think it makes sense. Co-Authored-By: Mac L <mjladson@pm.me>
…p#8672) Removes the remaining facade re-exports from `consensus/types`. I have left `graffiti` as I think it has some utility so am leaning towards keeping it in the final API design. Co-Authored-By: Mac L <mjladson@pm.me>
Co-Authored-By: Tan Chee Keong <tanck@sigmaprime.io>
…#7944) ethereum/consensus-specs#4476 Co-Authored-By: Barnabas Busa <barnabas.busa@ethereum.org> Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com> Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Closes: - sigp#8667 Use the `early_attester_cache` to serve the head block root (if present). This should be faster than waiting for the head to finish importing. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Adds support for payload envelopes in the db. This is the minimum we'll need to store and fetch payloads. Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com>
Co-Authored-By: Michael Sproul <michael@sigmaprime.io> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
Co-Authored-By: hopinheimer <knmanas6@gmail.com> Co-Authored-By: hopinheimer <48147533+hopinheimer@users.noreply.github.com> Co-Authored-By: Eitan Seri-Levi <eserilev@ucsc.edu> Co-Authored-By: Michael Sproul <michael@sigmaprime.io> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
…pha.2 (sigp#8725) Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Michael Sproul <michael@sigmaprime.io> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
Co-Authored-By: Eitan Seri- Levi <eserilev@gmail.com> Co-Authored-By: Michael Sproul <michael@sigmaprime.io> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
N/A In sigp#4801 , we added a state lru cache to avoid having too many states in memory which was a concern with 200mb+ states pre tree-states. With sigp#5891 , we made the overflow cache a simpler in memory lru cache that can only hold 32 pending states at the most and doesn't flush anything to disk. As noted in sigp#5891, we can always fetch older blocks which never became available over rpc if they become available later. Since we merged tree states, I don't think the state lru cache is relevant anymore. Instead of having the `DietAvailabilityPendingExecutedBlock` that stores only the state root, we can just store the full state in the `AvailabilityPendingExecutedBlock`. Given entries in the cache can span max 1 epoch (cache size is 32), the underlying `BeaconState` objects in the cache share most of their memory. The state_lru_cache is one level of indirection that doesn't give us any benefit. Please check me on this cc @dapplion Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com>
- Ensure all ssz_static tests are running and passing for Gloas 🎉 - Refine file ignores for Gloas EF tests Co-Authored-By: Michael Sproul <michael@sigmaprime.io> Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Closes sigp#8681 Co-Authored-By: Jimmy Chen <jchen.tc@gmail.com>
Update `time` to fix [ RUSTSEC-2026-0009 ](https://rustsec.org/advisories/RUSTSEC-2026-0009.html) Co-Authored-By: Mac L <mjladson@pm.me>
Swaps out the `local_ip_address` dependency for `if-addrs`. The reason is that `local_ip_address` is a relatively heavy dependency (it depends on `neli`) compared to `if-addrs`, and we only use it to check for the presence of an IPv6 interface. This is an experiment to see if we can use the more lightweight `if-addrs` instead. Co-Authored-By: Mac L <mjladson@pm.me>
sigp#8547 This updates a few of our crates to remove the older `syn 1` crate. This updates: - `criterion` -> `0.8` - `itertools` -> `0.14` And also certain `sigp` crates: - `xdelta3` -> [`fe39066`](sigp/xdelta3-rs@fe39066) - `superstruct` -> `0.10.1` - `ethereum_ssz` -> `0.10.1` - `tree_hash` -> `0.12.1` - `metastruct` -> `0.1.4` - `context_deserialize` -> `0.2.1` - `compare_fields` -> `0.1.1` Co-Authored-By: Mac L <mjladson@pm.me>
I accidentally broke `unstable` while merging some missed commits from `release-v8.0`. The merge was clean but semantically broken, and I didn't notice because I pushed without running CI 😬 - Fix the regression test added for sigp#8528, for compatibility with the recent `RpcBlock` changes. I'm passing `is_available = false` which seems correct for this test. Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
N/A Fixes the issue where we were setting block observed timings for blocks that were potentially gossip invalid. Thanks @gitToki for the find Co-Authored-By: Pawan Dhananjay <pawandhananjay@gmail.com> Co-Authored-By: Michael Sproul <michaelsproul@users.noreply.github.com>
Co-Authored-By: Michael Sproul <michael@sigmaprime.io>
Co-Authored-By: João Oliveira <hello@jxs.pt> Co-Authored-By: ackintosh <sora.akatsuki@gmail.com>
sigp#8756 Only the Web3Signer actually needs OpenSSL in order to parse PKCS12 certificates. This updates the function to instead manually parse the cert (using the `p12-keystore` crate) and converts it to a `PEM` certificate (using the `pem` crate) which can be directly converted to a `reqwest::tls::Identity` as this can be done directly in `rustls`. Co-Authored-By: Mac L <mjladson@pm.me>
Implements an ERA file producer in Lighthouse on a running beacon node. It does not require stopping the node and will produce ERA files for all available states regardless of the mode the node is run in:
Keeps producing ERA files as the chain advances.
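The "keeps producing as the chain advances" behaviour can be sketched as a small scheduling helper. Everything here is an assumption for illustration: that one era spans 8192 slots (`SLOTS_PER_HISTORICAL_ROOT` on mainnet), that era `n` is closed by the state at slot `n * 8192`, and that an era is only written once that slot is finalized. None of these names come from this PR.

```rust
use std::ops::Range;

/// Assumed era length in slots (SLOTS_PER_HISTORICAL_ROOT on mainnet).
pub const SLOTS_PER_ERA: u64 = 8192;

/// Slot of the state that closes era `era` (assumption: era n ends at n * 8192).
pub fn era_state_slot(era: u64) -> u64 {
    era * SLOTS_PER_ERA
}

/// Eras that can be written now: every era from `next_era` onwards whose
/// closing state slot is at or before `finalized_slot`. Returns an empty
/// range when nothing new is ready.
pub fn pending_eras(next_era: u64, finalized_slot: u64) -> Range<u64> {
    let end = finalized_slot / SLOTS_PER_ERA + 1;
    next_era..end.max(next_era)
}
```

A producer loop under these assumptions would call `pending_eras` on each finalization event, write one file per returned era, and then advance `next_era` past the last era written.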
Testing
TODO