refactor(iroh-dns): Replace hickory with a simpledns based DNS resolver#4036

Draft
dignifiedquire wants to merge 68 commits into main from refactor-hickory

@dignifiedquire

Contributor

No description provided.

@rklaehn

rklaehn commented Mar 23, 2026

Contributor

I don't remember the exact details, but there was something that hickory could do that simple-dns can't. @Frando ?

@github-actions

github-actions Bot commented Mar 23, 2026

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/4036/docs/iroh/

Last updated: 2026-06-16T20:35:48Z

@github-actions

github-actions Bot commented Mar 23, 2026

Netsim report & logs for this PR have been generated and are available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 59682fe

@dignifiedquire dignifiedquire moved this from 🚑 Needs Triage to 👍 Ready in iroh Apr 7, 2026
Frando added 16 commits April 14, 2026 15:05

Recursive resolvers commonly return CNAME records when a queried name
is an alias. The previous code only looked for the exact queried record
type (A/AAAA/TXT) in the answer section, silently returning empty
results for any name behind a CNAME (common with CDNs, cloud LBs).

This adds two levels of CNAME following:

(a) In-response: resolve_cname_chain() walks CNAME records within a
    single response packet to find the canonical name, then collects
    records matching either the original or canonical name.

(b) Recursive: send_query_following_cnames() detects when a response
    contains only a CNAME with no target records, and issues a new
    query for the CNAME target. Limited to 8 hops to prevent loops.
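The in-response walk described in (a) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function signature, the `HashMap` of owner name to CNAME target, and the lowercase-key convention are assumptions made for the example. Only the name `resolve_cname_chain` and the 8-hop limit come from the commit message (which states the limit for the recursive case).

```rust
use std::collections::HashMap;

/// Hop limit; the commit message names 8 hops for the recursive case.
const MAX_CNAME_HOPS: usize = 8;

/// Follows CNAME aliases starting at `name` and returns the canonical name.
///
/// `cnames` maps a lowercase owner name to its CNAME target, as collected
/// from the answer section of one response (hypothetical representation).
/// Returns `None` when the hop limit is exceeded, which usually means a loop.
fn resolve_cname_chain(name: &str, cnames: &HashMap<String, String>) -> Option<String> {
    let mut current = name.to_ascii_lowercase();
    // One extra iteration so that a chain of exactly MAX_CNAME_HOPS aliases
    // still gets to observe that the final name has no further alias.
    for _ in 0..=MAX_CNAME_HOPS {
        match cnames.get(&current) {
            // Another alias: keep walking.
            Some(target) => current = target.to_ascii_lowercase(),
            // No further alias: `current` is the canonical name.
            None => return Some(current),
        }
    }
    None
}
```

Records of the requested type would then be collected for either the original or the returned canonical name.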
Without EDNS(0), well-behaved DNS servers limit UDP responses to 512
bytes (RFC 1035). Iroh endpoint TXT records with multiple addresses
can easily exceed this, forcing a TCP fallback round-trip on every
endpoint discovery query.

Add an OPT pseudo-record to all outgoing queries advertising 1232-byte
UDP payload support (the recommended safe value per RFC 6891 and DNS
flag day 2020). This avoids unnecessary TCP fallbacks while staying
under common path MTU limits.
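For reference, an OPT pseudo-record without options is only 11 bytes on the wire. The sketch below encodes it by hand following the RFC 6891 layout; the function and constant names are made up for illustration, and the PR presumably builds the record through the simple-dns types instead.

```rust
/// UDP payload size advertised via EDNS(0), as named in the commit message.
const EDNS_UDP_PAYLOAD_SIZE: u16 = 1232;

/// Encodes an OPT pseudo-record (RFC 6891) with no options in wire format.
/// Hypothetical helper for illustration only.
fn encode_opt_record(udp_payload_size: u16) -> [u8; 11] {
    let mut out = [0u8; 11];
    // out[0]: owner name, the root domain (a single zero byte).
    // out[1..3]: TYPE = 41 (OPT).
    out[1..3].copy_from_slice(&41u16.to_be_bytes());
    // out[3..5]: the CLASS field carries the requestor's UDP payload size.
    out[3..5].copy_from_slice(&udp_payload_size.to_be_bytes());
    // out[5..9]: the TTL field holds extended RCODE, version and flags, all zero.
    // out[9..11]: RDLENGTH = 0, since no options are attached.
    out
}
```

The record goes into the additional section, so the header's ARCOUNT has to be incremented accordingly.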
Previously, send_query tried nameservers sequentially with a 5-second
per-nameserver timeout. Since the outer DNS_TIMEOUT is 3 seconds, only
the first nameserver was ever reached. A single UDP packet loss meant
immediate failure.

Now:
- All nameservers are queried in parallel via FuturesUnorderedBounded,
  with staggered starts (100ms between each) so the preferred
  nameserver gets a head start.
- UDP queries retry once per nameserver (2 attempts total) before
  giving up, matching hickory-resolver's default attempts:2 behavior.
- Per-nameserver timeout reduced to 2s so individual attempts complete
  quickly and don't block the overall query.
- First successful response from any nameserver wins.

Also fixes CNAME name-matching to gracefully handle responses without
a question section (accept all records of the target type).

Use recv_from instead of recv to verify that the DNS response came from
the expected nameserver. Without this check, a local network attacker
could race a spoofed response before the real one arrives. The random
16-bit query ID provides some defense, but source address validation
is standard defense-in-depth against cache poisoning.
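A minimal sketch of the source check, using the blocking `std::net::UdpSocket` for brevity (the PR itself is async). The helper names `check_source` and `recv_from_expected` are hypothetical, and rejecting the datagram with an error is an assumption; an implementation could equally ignore the packet and keep waiting.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Accepts a datagram only if its source matches the queried nameserver.
fn check_source(source: SocketAddr, expected: SocketAddr) -> io::Result<()> {
    if source == expected {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("response from unexpected source {source}, expected {expected}"),
        ))
    }
}

/// Receives one datagram with `recv_from` and validates where it came from.
fn recv_from_expected(
    socket: &UdpSocket,
    expected: SocketAddr,
    buf: &mut [u8],
) -> io::Result<usize> {
    let (len, source) = socket.recv_from(buf)?;
    check_source(source, expected)?;
    Ok(len)
}
```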
DNS-over-TLS and DNS-over-HTTPS currently derive the TLS server name
from the IP address. This works for providers with IP SANs in their
certificates (Google, Cloudflare) but will fail for servers with
hostname-only certificates. Document this as a known limitation.

TxtRecordData was changed from Box<[Box<[u8]>]> to Box<[String]>,
which is lossy for non-UTF-8 TXT record content and breaks the public
API. Restore the original bytes representation to preserve binary TXT
record fidelity. Display still uses from_utf8_lossy for rendering.

Keep From<Vec<String>> for convenience at construction sites.
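The shape described above might look roughly like this. The type name and the two conversions come from the commit message; the field layout as a tuple struct and the choice to concatenate the character-strings in `Display` are assumptions for the sketch.

```rust
use std::fmt;

/// Sketch of a TXT record that keeps each character-string as raw bytes,
/// so non-UTF-8 content survives a round trip.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TxtRecordData(Box<[Box<[u8]>]>);

/// Convenience constructor for call sites that only have strings.
impl From<Vec<String>> for TxtRecordData {
    fn from(strings: Vec<String>) -> Self {
        Self(
            strings
                .into_iter()
                .map(|s| s.into_bytes().into_boxed_slice())
                .collect(),
        )
    }
}

/// Rendering is lossy on purpose: invalid UTF-8 becomes U+FFFD,
/// while the stored bytes stay untouched.
impl fmt::Display for TxtRecordData {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for chunk in self.0.iter() {
            write!(f, "{}", String::from_utf8_lossy(chunk))?;
        }
        Ok(())
    }
}
```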
The previous extract_txt_record_data used TXT::attributes() which
returns a HashMap, losing ordering and deduplicating keys. This is
destructive for iroh's endpoint records which publish multiple addr=
entries as separate TXT records.

Replace with String::try_from(txt) which preserves the raw concatenated
content of each TXT record faithfully.

The resolv.conf parser previously only extracted nameserver lines,
ignoring search and domain directives. This meant search domain
completion for short hostnames was silently broken.

Parse both directives per resolv.conf(5) semantics: search and domain
are mutually exclusive, last one wins. Introduce SystemDnsConfig struct
to carry both nameservers and search domains.

The resolver does not yet apply search domains to queries -- this
commit just ensures the configuration is read and available.
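The "last one wins" rule can be illustrated with a small standalone parser. This is a sketch under simplifying assumptions (no `options` handling, no limit on the number of search domains), and `parse_search_domains` is a hypothetical name rather than a function from the PR.

```rust
/// Extracts the search list from resolv.conf text.
///
/// `search` and `domain` are mutually exclusive; whichever appears last
/// replaces any earlier value. Comment lines never match either keyword.
fn parse_search_domains(resolv_conf: &str) -> Vec<String> {
    let mut search = Vec::new();
    for line in resolv_conf.lines() {
        let mut fields = line.split_whitespace();
        match fields.next() {
            // `search` takes a whitespace-separated list of domains.
            Some("search") => search = fields.map(str::to_owned).collect(),
            // `domain` takes a single local domain name.
            Some("domain") => search = fields.take(1).map(str::to_owned).collect(),
            _ => {}
        }
    }
    search
}
```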
Some setups (Docker, VPNs, custom resolvers) use non-standard DNS
ports. Previously, entries like "nameserver 8.8.8.8:5353" would silently
fail to parse as IpAddr and be skipped.

Now try parsing as SocketAddr first (which supports port), falling back
to IpAddr with the default port 53.
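The two-step parse is short enough to show in full. The helper name `parse_nameserver` and the constant are illustrative; only the order (SocketAddr first, then IpAddr with port 53) is taken from the commit message.

```rust
use std::net::{IpAddr, SocketAddr};

/// Standard DNS port, used when the entry carries no explicit port.
const DEFAULT_DNS_PORT: u16 = 53;

/// Parses a `nameserver` value, accepting an optional port.
fn parse_nameserver(value: &str) -> Option<SocketAddr> {
    value
        // "8.8.8.8:5353" or "[2001:db8::1]:5353"
        .parse::<SocketAddr>()
        .ok()
        // Bare "8.8.8.8" or "2001:db8::1": fall back to the default port.
        .or_else(|| {
            value
                .parse::<IpAddr>()
                .ok()
                .map(|ip| SocketAddr::new(ip, DEFAULT_DNS_PORT))
        })
}
```

Trying `SocketAddr` first is safe for bare IPv6 addresses, because `SocketAddr` parsing requires brackets around an IPv6 host and therefore rejects them.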
NXDOMAIN (domain doesn't exist) and SERVFAIL (server error) were lumped
into the same ServerError variant. Add a dedicated NxDomain variant so
callers can distinguish "this domain doesn't exist" from "DNS is broken"
and skip retries for definitive NXDOMAIN responses.
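A rough sketch of the distinction, assuming a simplified error enum. The real `DnsError` in the PR has more variants and a different payload; `check_rcode` and `should_retry` are hypothetical helpers. The RCODE values themselves (0 = NOERROR, 2 = SERVFAIL, 3 = NXDOMAIN) are from RFC 1035.

```rust
/// Simplified stand-in for the resolver's error type.
#[derive(Debug, PartialEq, Eq)]
enum DnsError {
    /// RCODE 3: the name definitively does not exist.
    NxDomain,
    /// Any other non-zero RCODE, e.g. 2 (SERVFAIL).
    ServerError(u8),
}

/// Maps a response code to a result.
fn check_rcode(rcode: u8) -> Result<(), DnsError> {
    match rcode {
        0 => Ok(()),
        3 => Err(DnsError::NxDomain),
        other => Err(DnsError::ServerError(other)),
    }
}

/// NXDOMAIN is a definitive answer, so asking again is pointless;
/// other server errors may be transient.
fn should_retry(err: &DnsError) -> bool {
    !matches!(err, DnsError::NxDomain)
}
```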
dedup_by_key only removes consecutive duplicates, so if the same DNS
server appears on non-adjacent network adapters, it would survive
deduplication. Use a HashSet to properly deduplicate regardless of
ordering.
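An order-preserving HashSet deduplication can be sketched as below; the function name is illustrative and the element type is assumed to be `IpAddr`.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

/// Removes duplicate nameservers while keeping first-seen order.
fn dedup_nameservers(servers: Vec<IpAddr>) -> Vec<IpAddr> {
    let mut seen = HashSet::new();
    // `insert` returns false for values already in the set,
    // so later duplicates are filtered out wherever they appear.
    servers.into_iter().filter(|ip| seen.insert(*ip)).collect()
}
```

By contrast, `Vec::dedup` and `dedup_by_key` leave a list like `[a, b, a]` unchanged, because the two `a` entries are not adjacent.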
Document two known limitations:

1. No negative caching: NXDOMAIN/NODATA responses are never cached,
   which can cause thundering herd under high concurrency for
   non-existent domains. This matches the old hickory-resolver config.

2. No TCP/TLS connection reuse: each query opens a fresh connection,
   adding a full TLS handshake per DoT query. Only affects non-default
   DoT/DoH configurations.

The field is intentionally parsed from resolv.conf but not yet consumed
by the resolver. Add allow(dead_code) with a comment explaining why.

Implement resolv.conf search domain semantics per resolv.conf(5):

- Short hostnames (fewer dots than ndots, default 1) try each search
  domain suffix first, then the bare name.
- Names with enough dots try the bare name first, then search domains.
- FQDNs (trailing dot) bypass search domains entirely.
- NXDOMAIN responses advance to the next candidate name rather than
  failing immediately.

This makes the custom resolver behave like system resolvers for short
hostnames, which matters for Docker, Kubernetes, and corporate network
setups where search domains are commonly configured.
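The candidate ordering in the list above can be written as a pure function. This is a sketch with a hypothetical name and signature; it only produces the ordered list of names to try and leaves out the NXDOMAIN-driven advance to the next candidate.

```rust
/// Returns the fully qualified names to try, in order, for `name`.
fn candidate_names(name: &str, search_domains: &[&str], ndots: usize) -> Vec<String> {
    // A trailing dot marks an FQDN: search domains are bypassed entirely.
    if name.ends_with('.') {
        return vec![name.to_owned()];
    }
    let dots = name.bytes().filter(|&b| b == b'.').count();
    let bare = format!("{}.", name);
    let suffixed = search_domains
        .iter()
        .map(|domain| format!("{}.{}.", name, domain.trim_end_matches('.')));
    if dots < ndots {
        // Short hostname: search domains first, bare name last.
        suffixed.chain(std::iter::once(bare)).collect()
    } else {
        // Enough dots: bare name first, then search domains.
        std::iter::once(bare).chain(suffixed).collect()
    }
}
```

With `ndots = 1`, `"db"` yields the suffixed names before `"db."`, while `"db.internal"` is tried as-is first.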
- Collapse nested if/if-let into if-let chains (resolve_cname_chain)
- Use .then() and .ok()? for cname_target
- Use is_ok_and for is_truncated
- Use matches! for record type checking
- Use bytes() instead of chars() for dot counting
- Extract with_timeout helper to deduplicate timeout wrapping
- Simplify stagger logging in send_query
- Use elapsed() instead of manual Instant arithmetic in cache
- Remove dead system_nameservers() wrapper
- Remove stale #[allow(dead_code)] on search_domains

- with_timeout: use `?` on Elapsed to get DnsError::Timeout directly
  instead of wrapping in a fake io::Error -> DnsError::Transport
- InvalidPacket: use `?` on SimpleDnsError directly (from_sources
  generates the From impl)
- UDP source validation: use `?` on io::Error instead of e!() wrapper
- TLS config missing: use io::Error::other for brevity
- Remove unused n0_error::e import from transport.rs

@rklaehn

rklaehn commented Jun 11, 2026

Contributor

> I don't remember the exact details, but there was something that hickory could do that simple-dns can't. @Frando ?

I think it can do recursive resolution, but that is disabled by default. So I don't think we have to worry about it.

Frando added 16 commits June 11, 2026 14:24

Move the per-platform system DNS readers into unix, windows and android submodules under dns::system_config, each exposing read_system_dns().

Android no longer forwards to hickory_resolver: its JNI reader, which reads the active network's DNS servers from LinkProperties.getDnsServers(), is inlined and adapted to return SystemDnsConfig. install_android_jni_context moves alongside it and is re-exported at the crate root unchanged.

Add jni (android) and ipconfig (windows) as platform dependencies.

Replace the fixed-stagger fan-out in send_query with a bounded happy-eyeballs loop: try the historically fastest nameserver first, start the next after a short delay or as soon as the in-flight one fails, and cap concurrency so a long nameserver list no longer blasts every server.

Track a per-nameserver smoothed RTT (EWMA on success, penalty on failure, read-time decay) to order servers and re-probe demoted ones, so the list is self-healing.
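As a rough illustration of the smoothing, the sketch below keeps one estimate per nameserver. The weight of 1/4 for new samples, the 500 ms failure penalty, and all names are assumptions chosen for the example; the commit message does not state the actual constants, and the read-time decay is left out.

```rust
use std::time::Duration;

/// Assumed penalty added to the estimate when a query fails.
const FAILURE_PENALTY: Duration = Duration::from_millis(500);

/// Per-nameserver smoothed round-trip time (hypothetical type).
#[derive(Debug, Clone, Copy, Default)]
struct SmoothedRtt {
    /// `None` until the server has been probed at least once.
    srtt: Option<Duration>,
}

impl SmoothedRtt {
    /// EWMA update with an assumed weight of 1/4 for the new sample:
    /// srtt = (3 * srtt + sample) / 4.
    fn record_success(&mut self, sample: Duration) {
        self.srtt = Some(match self.srtt {
            None => sample,
            Some(prev) => (prev * 3 + sample) / 4,
        });
    }

    /// A failure pushes the estimate up so the server sorts later.
    fn record_failure(&mut self) {
        self.srtt = Some(self.srtt.unwrap_or(Duration::ZERO) + FAILURE_PENALTY);
    }
}
```

Integer arithmetic on `Duration` keeps the update exact, which makes the ordering easy to test.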

Expand the fallback to Cloudflare, Google and Quad9 over UDP (v4 and v6, primary and secondary) plus DNS-over-HTTPS when a crypto provider is available, so resolution still works when one provider is down or plain DNS is blocked.

Refactor the system config type to DnsConfig with system, fallback, from_nameservers and system_with_fallback constructors.

Return a DnsConfig instead of a 3-tuple and fold the public-resolver fallback into a single trailing check applied in one place.

Order nameservers by smoothed RTT relative to a neutral baseline so a measured-fast server stays ahead of an idle or recovering one, instead of unprobed or decayed servers sorting as fastest.

Verify the host and query type on a DNS cache hit, so a 64-bit key collision returns a miss rather than another name's records.
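The collision check can be sketched as follows. The entry layout, the `QueryType` enum and the method signatures are assumptions; expiry handling is omitted. The key is passed in explicitly here so that a collision is easy to demonstrate, whereas a real cache would derive it by hashing the host and query type.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QueryType {
    A,
    Aaaa,
    Txt,
}

/// A cached answer that remembers which question it belongs to.
struct CacheEntry {
    host: String,
    query_type: QueryType,
    records: Vec<String>,
}

/// Cache keyed by a 64-bit hash of (host, query type).
#[derive(Default)]
struct DnsCache {
    entries: HashMap<u64, CacheEntry>,
}

impl DnsCache {
    fn insert(&mut self, key: u64, host: &str, query_type: QueryType, records: Vec<String>) {
        let entry = CacheEntry { host: host.to_owned(), query_type, records };
        self.entries.insert(key, entry);
    }

    /// Treats a key match as a hit only if the stored question matches too,
    /// so a hash collision yields a miss instead of another name's records.
    fn get(&self, key: u64, host: &str, query_type: QueryType) -> Option<&[String]> {
        let entry = self.entries.get(&key)?;
        if entry.host == host && entry.query_type == query_type {
            Some(entry.records.as_slice())
        } else {
            None
        }
    }
}
```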

Fall back to public resolvers only when neither the system nor the builder provides a nameserver, so explicit servers are no longer mixed behind the fallback set; drop the now-redundant system_with_fallback. Correct the DoH-by-IP certificate comment.

Confirmed via the live certificates that Cloudflare, Google and Quad9 all carry their anycast IPs as iPAddress SANs, so IP-addressed DNS-over-HTTPS validates for every fallback entry.

Carry an optional TLS server name per nameserver internally, so DNS-over-TLS and DNS-over-HTTPS can be configured against providers whose certificates cover a hostname rather than the IP.

Add Builder::with_tls_nameserver and Builder::with_https_nameserver; the existing tuple-based with_nameserver/with_nameservers and the DnsProtocol enum are unchanged, so the change is additive. DoT uses the name for SNI; DoH addresses the URL by hostname with the connection pinned to the configured IP via reqwest, avoiding a bootstrap resolution loop.

Narrow items that are only reachable within their own module to private: the internal Nameserver type (and its fields and constructor), DnsConfig::from_nameservers, attrs::endpoint_id_from_txt_name, and TxtAttrs::from_strings.

On a fresh resolver, fallback nameservers are queried in list order, and the DoH entries sat behind all twelve UDP servers. On a network that silently drops UDP/53 the lookup timed out before DoH was ever tried -- the exact case it exists for.

Move the DoH entries just after the two fastest UDP primaries so they land within the happy-eyeballs first wave (MAX_CONCURRENT_QUERIES). On a working network UDP still answers before the staggered DoH attempts start, so no TLS handshake is wasted; when port 53 is blocked DoH is raced immediately. A test pins a DoH entry within the first wave.
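The reordering and the property the test pins can be sketched like this. The value `3` for `MAX_CONCURRENT_QUERIES` is an assumption (the conversation does not state it), and the `Protocol` enum, the string labels and `order_fallback` are hypothetical stand-ins for the real nameserver type.

```rust
/// Assumed size of the first happy-eyeballs wave; the real value is not
/// given in this conversation.
const MAX_CONCURRENT_QUERIES: usize = 3;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Protocol {
    Udp,
    Https,
}

/// Places the DoH entries right after the first two UDP entries,
/// followed by the remaining UDP entries in their original order.
fn order_fallback(
    udp: Vec<&'static str>,
    doh: Vec<&'static str>,
) -> Vec<(Protocol, &'static str)> {
    let mut udp = udp.into_iter().map(|label| (Protocol::Udp, label));
    let mut ordered: Vec<_> = udp.by_ref().take(2).collect();
    ordered.extend(doh.into_iter().map(|label| (Protocol::Https, label)));
    ordered.extend(udp);
    ordered
}
```

A test in this spirit would assert that the index of the first `Https` entry is below `MAX_CONCURRENT_QUERIES`.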
@Frando Frando force-pushed the refactor-hickory branch from a384c38 to 96cd8eb June 16, 2026 13:30
@Frando Frando changed the title from "[wip] Replace hickory with simpledns based DNS resolver" to "refactor(iroh-dns): Replace hickory with a simpledns based DNS resolver" Jun 16, 2026

A major network change rebuilds the resolver via `reset()` to pick up new
nameservers. It used to start with an empty cache, so lookups went cold
exactly during the transition (e.g. WiFi to 5G) and reconnects stranded
while DNS was still in flux. Make `DnsCache` Arc-backed and carry it into
the rebuilt resolver, so cached records keep serving until the new
nameservers settle.
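The idea of carrying the cache across a rebuild can be sketched with a clonable handle around shared state. The `Resolver` struct, its fields and the `reset` signature here are simplified assumptions; only the names `DnsCache` and `reset()` and the Arc-backed design come from the commit message.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Cheaply clonable cache handle; all clones share the same map.
#[derive(Clone, Default)]
struct DnsCache {
    inner: Arc<Mutex<HashMap<String, Vec<String>>>>,
}

impl DnsCache {
    fn insert(&self, host: &str, records: Vec<String>) {
        self.inner.lock().unwrap().insert(host.to_owned(), records);
    }

    fn get(&self, host: &str) -> Option<Vec<String>> {
        self.inner.lock().unwrap().get(host).cloned()
    }
}

/// Simplified resolver: a nameserver list plus the shared cache.
struct Resolver {
    nameservers: Vec<String>,
    cache: DnsCache,
}

impl Resolver {
    fn new(nameservers: Vec<String>) -> Self {
        Self { nameservers, cache: DnsCache::default() }
    }

    /// Rebuilds the resolver with new nameservers but keeps the cache,
    /// so existing records keep serving during a network change.
    fn reset(&self, nameservers: Vec<String>) -> Self {
        Self { nameservers, cache: self.cache.clone() }
    }
}
```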

Closes #4037