
Contributor

@jdslab jdslab commented Jan 5, 2026

Background

SCION packet headers contain a SRC address to which packets should be returned. This address must be reachable from the first-hop border router, assuming the peer simply reverses the path. It may not be easy to discover if the sender is separated from the receiver by a NAT.

A solution has been designed that uses STUN for NAT traversal: clients behind a NAT query the first-hop border router to obtain their external mapped address, which can then be used as the SRC address in the SCION header. The relevant design document can be found here. The STUN server at the border router has been implemented here. This PR and its implementation details were originally discussed here.

Summary of Changes

This PR covers the client-side implementation of the NAT traversal mechanism in the snet library. It adds a configurable option to SCIONNetwork that enables transparent NAT traversal. If enabled, SCION connections obtained through SCIONNetwork.Dial automatically query the first-hop border router for the external address using a STUN request. The external address is then used as the client's SRC address instead of its local address when sending packets.

The STUN mechanism is implemented in compliance with RFC 8489, including retransmissions on timeouts. Changes to the behavior of the SCION connection's methods are kept to a minimum. Specifically, the ReadFrom and WriteTo methods closely mimic the behavior of net.UDPConn. The STUN exchange happens in the background and is transparent to client applications.

In summary, the changes in this PR allow client applications to connect to the SCION internet even in the presence of a NAT device.

Testing

A new integration test has been added in /acceptance/stun. The test simulates a SCION network topology with a NAT device present. A test client behind the NAT sends a test packet using snet's SCIONNetwork.Dial to a test server outside of the NAT, in a different AS. The test server, in turn, responds with a packet to the client. The test succeeds if the returning packet arrives at the client without issue. As usual, this test is part of the CI/CD pipeline.

The integration test can be run manually with the following steps:

  • Set up development environment following these instructions, replacing the SCION repository with code from this PR.
  • Run make, then make docker-images
  • Run bazel test --config=integration --test-output=streamed //acceptance/stun:test

if timeout <= 0 {
	return nil, os.ErrDeadlineExceeded
}
deadlineExceeded = time.After(timeout)
Contributor

@romshark romshark Jan 13, 2026

time.After is quite expensive because every call allocates a new timer. Maybe we can reuse a time.Timer, potentially even allocated once as a member of *stunConn (it probably can't be shared by all callers, but it could at least be reused across retransmissions)? This would reduce the number of runtime allocations.

Contributor

package main

import (
	"testing"
	"time"
)

func BenchmarkTimeAfter(b *testing.B) {
	for b.Loop() {
		for range 7 {
			<-time.After(1 * time.Microsecond)
		}
	}
}

func BenchmarkReuseTimer(b *testing.B) {
	t := time.NewTimer(0)
	<-t.C // wait for initial fire

	for b.Loop() {
		for range 7 {
			t.Reset(1 * time.Microsecond)
			<-t.C
		}
	}
}
goos: darwin
goarch: arm64
pkg: benchfmt
cpu: Apple M4 Pro
BenchmarkTimeAfter-14              39968             29366 ns/op            1736 B/op         21 allocs/op
BenchmarkReuseTimer-14             42736             28290 ns/op               0 B/op          0 allocs/op
PASS
ok      benchfmt        2.624s

Reusing a time.Timer results in no garbage being generated for the GC to clean up.

Contributor Author

I implemented the change: a time.Timer is now reused across retransmissions and deadline changes, which reduces the number of timer allocations. Allocating it as a member of stunConn does not work, though, since concurrent calls need separate timers.

Comment on lines +327 to +331
close(c.readDeadlineChanged)
close(c.writeDeadlineChanged)
c.readDeadlineChanged = make(chan struct{})
c.writeDeadlineChanged = make(chan struct{})
}
Contributor

This looks rather wasteful as it probably can be done with sync.Cond without runtime allocations. But I can't tell straight away how complicated that would be to do here and whether it's worth it. This is just an idea, so feel free to resolve this comment without action.

Contributor Author

We cannot wait on sync.Cond directly inside a select statement. If we use a Cond, we would still need to bridge it to a channel in the waiting call, e.g. like this:

deadlineChanged := make(chan struct{})
go func() {
	c.mutex.Lock()
	c.readDeadlineCond.Wait()
	c.mutex.Unlock()
	deadlineChanged <- struct{}{}
}()

select {
...

I don't know which is better idiomatically. Any thoughts?

Contributor

@romshark romshark left a comment

LGTM, stated my suggestions in the comments.

// getSTUNMappedSource returns the NAT mapped address for the source if the connection is
// using STUN and the destination is in a different IA. Otherwise, it returns the original
// address unchanged.
func (c *scionConnWriter) getSTUNMappedSource(
Contributor

Use stunMappedSource instead of getSTUNMappedSource: Go naming convention omits the Get prefix on getters.
