Should NEAR Validators Have Kernel-Level Network Protection? An Infrastructure Discussion


Author: Abhiraj
Type: Discussion / Community Input


Who We Are (Brief Context)

We’re MontaQ Labs. We come from a low-level systems engineering background; our previous work includes building cryptographic libraries in no_std Rust for novel VM targets. We’re writing this not as a product pitch but as validator operators who’ve been staring at our node’s network traffic and thinking about a gap that nobody seems to be discussing.


The Observation

Every packet that arrives at a NEAR validator, whether it’s a legitimate chunk from a peer validator or junk from a misbehaving node, travels the same expensive path through the system:

Packet arrives at network interface
  → Linux kernel allocates memory for it (sk_buff)
  → Kernel parses IP headers
  → TCP state machine processes the connection
  → Data handed up to nearcore
  → nearcore reads the message
  → nearcore decides whether the peer is legitimate
  → If not: message dropped, peer potentially banned

The key detail: by the time nearcore makes the “drop this” decision, the system has already spent 15-150 microseconds of CPU time on that single junk packet. The memory was allocated. The headers were parsed. The TCP state was tracked. All of that work was wasted.

At low volumes, who cares. At high volumes (and “high” might just mean a few misbehaving peers during a congested period), this becomes CPU time that was supposed to be spent on chunk production.


Why This Might Matter More for NEAR Than Other Chains

We’ve been thinking about why NEAR’s architecture makes this potentially more significant than it would be on, say, Ethereum.

Chunk production timing is tight. NEAR chunk producers work in roughly 1-second windows per shard. This isn’t like Ethereum’s 12-second block time where a validator has breathing room. CPU contention during a NEAR chunk production window has a more direct path to a missed chunk than it does on most other networks.

Sharding concentrates the target. Each shard has a relatively small set of chunk producers. If you wanted to disrupt a specific shard, you don’t need to overwhelm the entire network, just the handful of chunk producers for that shard during their production windows. As NEAR adds more shards, each shard’s producer set gets smaller.

Economic incentives for disruption grow with TVL. This isn’t a problem today in the way it’s a problem on Ethereum, where validators get DDoS’d around MEV opportunities. But NEAR’s DeFi ecosystem is growing. MEV dynamics are emerging. At some point, there may be economic value in knocking a specific chunk producer offline for a few seconds. Every other major PoS network has eventually faced this.

nearcore handles peer management at the application layer. This is well-implemented — peer banning, connection limits, rate limiting on certain message types. The protocol team has clearly thought about peer abuse. But there’s a structural ceiling: application-level filtering can only reject packets after the kernel has already processed them. The cost is paid before the decision is made.

We want to be clear: this is not a criticism of nearcore. Handling peer management in the application is the normal and sensible default. We’re raising the question of whether there’s value in an additional layer below it.


The Technology That Made Us Think About This: eBPF/XDP

For those unfamiliar with this space, a brief technical sketch.

eBPF is a Linux kernel technology that allows running small, verified programs inside the kernel without modifying the kernel itself. The programs pass through a formal verifier before they are loaded: it proves they terminate, don’t access invalid memory, and have bounded execution. They literally cannot crash the kernel.

XDP (eXpress Data Path) is an eBPF hook at the network driver level. This is the earliest possible interception point for a packet in Linux, before memory allocation, before the TCP/IP stack, before iptables, before everything. A packet dropped at XDP costs roughly 50-100 nanoseconds. Compare that to the 15-150 microseconds for a packet that travels the full path to nearcore.

That’s a 150x to 1500x difference in CPU cost per dropped packet.

This isn’t exotic technology. Cloudflare uses XDP to handle DDoS mitigation at millions of packets per second. Meta uses it for load balancing. The Linux kernel community has been maturing eBPF/XDP for over a decade. What’s less explored is applying it specifically to blockchain validator infrastructure.
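To make the mechanism concrete, here is the shape of the decision an XDP program makes, written as plain userspace C for illustration. A real XDP program does the same header parsing but is compiled to BPF bytecode, attached via libbpf, and returns XDP_DROP/XDP_PASS; the struct and function names here are ours, not from any existing tool.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Verdicts mirroring XDP's return codes. */
enum verdict { PASS = 0, DROP = 1 };

/* Minimal Ethernet and IPv4 header layouts (fields in network byte order). */
struct eth_hdr { uint8_t dst[6], src[6]; uint16_t ethertype; };
struct ip_hdr  { uint8_t  ver_ihl, tos; uint16_t len, id, frag;
                 uint8_t  ttl, proto;   uint16_t csum;
                 uint32_t saddr, daddr; };

/* Stand-in for a BPF hash map of banned source IPs. */
static uint32_t blocklist[16];
static int blocklist_len = 0;

static int blocked(uint32_t saddr) {
    for (int i = 0; i < blocklist_len; i++)
        if (blocklist[i] == saddr) return 1;
    return 0;
}

/* The per-packet decision: parse just enough to read the source IP, then
 * pass or drop. This is all an XDP drop path does, which is why it costs
 * on the order of 100ns rather than tens of microseconds. */
static enum verdict filter(const uint8_t *pkt, size_t len) {
    if (len < sizeof(struct eth_hdr) + sizeof(struct ip_hdr))
        return PASS;                       /* too short to judge; let it through */
    struct eth_hdr eth; memcpy(&eth, pkt, sizeof eth);
    if (eth.ethertype != htons(0x0800))    /* not IPv4 */
        return PASS;
    struct ip_hdr ip; memcpy(&ip, pkt + sizeof eth, sizeof ip);
    return blocked(ip.saddr) ? DROP : PASS;
}
```

Note what is absent: no memory allocation, no TCP state, no copies up to userspace. The packet is inspected in place at the driver and discarded.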


What a Validator Protection Layer Could Look Like

We’re not proposing a specific design; we’re sketching what the architecture might look like to see if it resonates.

The interesting pattern isn’t just “block bad IPs at the NIC.” That’s trivially achievable with existing tools. The interesting pattern is what we’d call application intelligence with kernel-speed enforcement — a feedback loop where nearcore’s understanding of peer behavior drives packet-level filtering.

Conceptually:

nearcore knows:                     The kernel could enforce:
────────────────                    ─────────────────────────

"Peer X has sent 50 invalid         Drop all future packets from
 messages this minute"               Peer X's IP at the NIC.
                                     Cost: ~100ns per packet.
                                     nearcore never sees them again.

"Peer Y is a validator I'm          NEVER drop Peer Y's traffic,
 co-producing chunks with            even during a flood. Whitelist
 this epoch"                         at the kernel level.

"I'm about to enter my chunk        Tighten filtering thresholds
 production window in 200ms"         during production windows.
                                     Relax them otherwise.

"Network is calm, no issues"        Pass everything through.
                                     XDP programs still attached
                                     but effectively no-ops.
                                     Zero overhead when not needed.

The mechanism for this feedback loop would be eBPF maps: shared data structures that userspace processes (like a nearcore sidecar) and kernel-space programs (like XDP filters) can both read and write with negligible overhead. nearcore flags a bad peer, the eBPF map updates, and within microseconds the kernel is dropping that peer’s packets before they consume any meaningful resources.
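The loop can be sketched in a few lines, with the eBPF map stood in by a plain array. In a real deployment this would be a BPF hash map that the sidecar writes via libbpf and the XDP program reads per packet; the names and the 50-message threshold here are illustrative, taken from the example above.

```c
#include <assert.h>
#include <stdint.h>

#define BAN_THRESHOLD 50   /* invalid messages per minute before banning */
#define MAP_SLOTS 1024

/* Stand-in for the shared eBPF map of banned source IPs. */
static uint32_t banned_ips[MAP_SLOTS];
static int banned_count = 0;

/* Kernel side (the XDP program): per-packet lookup against the shared map. */
static int should_drop(uint32_t saddr) {
    for (int i = 0; i < banned_count; i++)
        if (banned_ips[i] == saddr) return 1;
    return 0;
}

/* Userspace side (the nearcore sidecar): called each time nearcore reports
 * another invalid message from a peer. Once the threshold is crossed, one
 * map update makes the ban take effect for every subsequent packet. */
static int report_invalid(uint32_t peer_ip, uint32_t *invalid_count) {
    if (++(*invalid_count) < BAN_THRESHOLD || should_drop(peer_ip))
        return 0;                          /* not yet banned, or already banned */
    if (banned_count < MAP_SLOTS)
        banned_ips[banned_count++] = peer_ip;  /* "map update": ban is live */
    return 1;
}
```

The point of the split is that the expensive judgment (is this peer misbehaving?) happens once, in userspace, while the cheap enforcement (drop this IP) happens millions of times, in the kernel.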

The layered approach we’ve been thinking about:

Layer 1 — XDP (NIC-level, nanosecond decisions)
Pattern-based filtering on packet metadata: source IP, port, packet rate, connection rate. The blunt instrument that handles volumetric attacks.

Layer 2 — TC-eBPF (traffic control layer, microsecond analysis)
More nuanced analysis on established connections: bandwidth abuse detection, connection behavior scoring, anomalous traffic patterns.

Layer 3 — Userspace sidecar (intelligence and coordination)
Reads nearcore’s peer data, writes to eBPF maps, exposes monitoring metrics, handles alert escalation. This is the “brain” that makes the kernel-level filtering smart rather than static.
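For the Layer 1 rate filtering, the standard primitive is a per-source token bucket. A minimal sketch follows; the parameters are illustrative, not tuned for NEAR traffic, and timestamps are in nanoseconds as the kernel's clock helpers would provide.

```c
#include <assert.h>
#include <stdint.h>

/* Per-source token bucket: allow a sustained `rate` packets/sec with bursts
 * of up to `burst` packets. Token counts are scaled by 1e9 so everything
 * stays in integer arithmetic, as it must inside an eBPF program. */
struct bucket {
    uint64_t tokens;   /* remaining allowance, in nano-tokens (1 pkt = 1e9) */
    uint64_t last_ns;  /* timestamp of the previous packet */
    uint64_t rate;     /* sustained packets per second */
    uint64_t burst;    /* maximum bucket depth, in packets */
};

static int allow_packet(struct bucket *b, uint64_t now_ns) {
    /* Refill: elapsed_ns * rate is exactly the scaled token count earned. */
    b->tokens += (now_ns - b->last_ns) * b->rate;
    b->last_ns = now_ns;
    if (b->tokens > b->burst * 1000000000ULL)
        b->tokens = b->burst * 1000000000ULL;   /* cap at the burst depth */
    if (b->tokens >= 1000000000ULL) {           /* one whole token available? */
        b->tokens -= 1000000000ULL;
        return 1;                               /* pass */
    }
    return 0;                                   /* drop: source is over rate */
}
```

A source that stays under its rate never notices the filter; a flooding source gets exactly `burst` packets through and then nothing until it slows down.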


Honest Limitations We See

We want to be upfront about what this approach cannot do.

NEAR’s P2P traffic is encrypted. NEAR uses the Noise Protocol Framework for peer communication. After the initial handshake, all messages are encrypted. This means:

  • XDP cannot inspect message content. We cannot look at an encrypted packet and determine whether it’s a valid chunk, an invalid attestation, or garbage. Deep packet inspection of NEAR gossip at the NIC level is not possible without breaking the encryption, and we’re not proposing to break the encryption.
  • Filtering is pattern-based, not content-based. XDP can filter on: source/destination IP, port, packet sizes, packet rates, connection patterns, timing. This catches volumetric attacks (floods), misbehaving peers (excessive connections), and known-bad actors (IP blocklist). It does not catch a peer that sends correctly-formatted but semantically invalid messages at normal rates.
  • Content-aware detection stays in nearcore. For detecting things like “this peer is sending invalid chunks,” the detection logic must remain at the application layer where messages are decrypted. What the eBPF layer adds is faster enforcement of nearcore’s decisions, not replacement of nearcore’s intelligence.

Not all hardware supports XDP. XDP requires driver-level support. Bare-metal servers (Hetzner, OVH, etc., where most serious validators run) generally support it. Cloud instances vary: AWS’s ENA driver supports XDP, GCP support depends on the instance type, and some providers don’t support it at all. A TC-eBPF fallback would be needed for environments without XDP driver support. TC-eBPF is slower than XDP (it runs after the kernel has allocated packet memory, rather than at the driver) but still much faster than application-level filtering.

This doesn’t replace good operational security. Sentry node architecture, firewall configuration, peer selection, monitoring all still matter. An XDP layer would be one component of defense in depth, not a silver bullet.


Rough Numbers to Ground the Discussion

We’ve been doing back-of-envelope calculations to understand whether this is meaningful or marginal.

Scenario: Moderate spam attack — 100,000 junk packets/second

WITHOUT kernel-level filtering:
├── Each packet costs ~15-150μs of CPU time
├── Total CPU consumed by junk: 1.5 to 15 CPU-seconds per second
├── That's 1.5 to 15 entire cores consumed by garbage
├── On a typical 8-16 core validator: ~10% to over 100% of capacity wasted
├── During chunk production: high probability of missed chunks
└── Missed chunks = lost rewards + reduced network throughput

WITH XDP filtering:
├── Each dropped packet costs ~100ns
├── Total CPU consumed: 0.01 CPU-seconds per second
├── Effectively invisible to the rest of the system
├── Chunk production completely unaffected
└── Legitimate peer traffic passes through unchanged

Improvement factor: 150x to 1500x reduction in CPU waste

These aren’t speculative numbers. XDP filtering at 100K+ packets/second with negligible CPU overhead is well-documented in production systems (Cloudflare’s published benchmarks show millions of packets per second handled at the XDP layer). The question isn’t whether XDP can handle this; it’s whether NEAR validators face enough network-level pressure to make it worth deploying.
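The arithmetic behind those figures is short enough to check directly. The function name below is ours; the per-packet costs are the ranges cited above.

```c
#include <assert.h>
#include <stdio.h>

/* CPU-seconds consumed per wall-clock second, i.e. how many cores the
 * junk traffic fully occupies. */
static double cores_burned(double packets_per_sec, double cost_per_packet_sec) {
    return packets_per_sec * cost_per_packet_sec;
}

/* The scenario from the text: 100,000 junk packets per second. */
static void print_scenario(void) {
    const double pps = 100000.0;
    printf("without filtering: %.2f to %.2f cores\n",
           cores_burned(pps, 15e-6), cores_burned(pps, 150e-6));
    printf("with XDP drops:    %.2f cores\n",
           cores_burned(pps, 100e-9));
}
```

Run against the scenario above, this gives 1.5 to 15 cores consumed without filtering versus about 0.01 with XDP, which is where the 150x to 1500x range comes from.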


What We Don’t Know (And Want to Learn)

There are several things we don’t know and would like community input on.

1. What does network traffic actually look like across NEAR validators?

We’d love to hear from other validators:

  • Do you see unexpected network traffic patterns?
  • Have you experienced CPU spikes that correlated with peer activity rather than normal block processing?
  • Have you ever had to manually ban peer IPs?
  • During high-activity periods (large mints, DeFi events, etc.), does your network layer show stress?

Even “I’ve never noticed any issues” is useful data. If the majority of validators see clean traffic, the urgency is lower and this becomes more of a proactive/insurance conversation.

2. How does the nearcore team think about the kernel boundary?

nearcore handles peer management at the application layer. This might be intentional and well-reasoned: perhaps the team considered kernel-level filtering and decided the complexity wasn’t justified. Or perhaps it hasn’t been explored because the team’s expertise is (reasonably) focused on consensus and runtime rather than Linux kernel networking.

We’d genuinely like to understand the design thinking here. If there’s a reason application-level filtering is preferred, we’d rather learn that than duplicate effort.

3. Would validators actually deploy kernel-level tooling?

eBPF programs run in the Linux kernel. Even though they’re formally verified before execution (they cannot crash the kernel), some operators might be uncomfortable running non-standard kernel programs on their validator machines. We understand that concern.

Is the validator community’s posture “we’ll run well-tested open-source kernel tools if they protect our infrastructure” or “we don’t want anything non-standard touching our kernel”? This significantly affects whether building such a tool is useful.

4. Is there already work happening in this area that we’re not aware of?

It’s possible that the nearcore team, Pagoda, or another infrastructure provider is already thinking about or building kernel-level network protection. If so, we’d rather contribute to that effort than start a parallel one. We haven’t found any public discussion of this, but that doesn’t mean it’s not happening.

5. What’s the priority relative to other validator infrastructure needs?

Every engineering effort has an opportunity cost. Maybe validators need better monitoring tooling more than they need network protection. Maybe the biggest pain point is state sync performance, or storage costs, or something else entirely. We’re interested in the community’s honest ranking of infrastructure priorities.


How We Think About the Priority

Our honest assessment:

This is not an emergency. NEAR hasn’t had a publicized validator DDoS incident. The network is healthy. nearcore’s peer management works for current conditions.

This is infrastructure insurance. Every major PoS network has eventually faced validator-targeted network attacks as economic stakes grew. Ethereum validators get DDoS’d around MEV opportunities. Solana has experienced network degradation from message flooding. Cosmos developed sentry node architecture specifically in response to validator DDoS.

NEAR will likely face this eventually as DeFi TVL grows and MEV dynamics mature. The question is whether the ecosystem builds the defense before or after the first incident.

The cost of building it proactively is much lower than building it reactively. After an incident, there’s pressure to ship something fast, which usually means shipping something poorly designed. Building it during peacetime means getting the architecture right.

That said, we might be wrong about the priority. Maybe the threat model is further away than we think. Maybe nearcore’s application-level defenses will scale further than we expect. That’s why we’re posting for discussion rather than building in isolation.


We’re also posting this to get feedback from the NEAR validator community directly.