Opened 10 years ago

Closed 9 years ago

Last modified 9 years ago

#236 closed enhancement (fixed)

pluggable connection hints, plan for Tor

Reported by: Brian Warner
Owned by:
Priority: major
Milestone: 0.9.0
Component: network
Version: 0.7.0
Keywords:
Cc:

Description

dawuud and I worked out a plan for making it easier to run Foolscap over Tor and other network layers.

The first API change will be to make Tub.listenOn() accept a new (optional) advertise= argument. The idea is that each Listener knows how to figure out what connection-hints will reach it, and these hints are used when advertise= is set to its default of "AUTO". Any other value overrides that figure-it-out-yourself logic. The built-in TCP Listener uses IP-address autodetection to compute this. While doing this, we should also backport the improved ipaddr-autodetection changes from Tahoe.

Then we introduce "foolscap connection plugins". There are two categories: one for Listeners, and a second for outbound connection hints.

The Listener plugins are used to translate the string passed into Tub.listenOn() into a Twisted server endpoint (IStreamServerEndpoint). The basic TCP listener uses tcp:0 or tcp:PORT or tcp:PORT:interface=IFADDR as usual, and the plugin is a no-op. These plugins are also responsible for figuring out the right connection-hints (for advertise=AUTO). They'll implement an interface vaguely like this:

class IFoolscapServerConnectionHelper(Interface):
    typename = "onion"
    def how_to_listen(listenspec):
        return IStreamServerEndpoint # or str
    def what_to_advertise(IListeningPort):
        return connection_hint

The Connection plugins are used to translate a FURL's connection-hints into actual Twisted client endpoints. Each connection-hint is parsed enough to figure out the hint type (e.g. "tcp"), then passed into the matching plugin. The interface looks something like:

class IFoolscapClientConnectionHelper(Interface):
    typename = "tor"
    def make_client_endpoint(connection_hint):
        # add client-private stuff, like SOCKS port
        return IStreamClientEndpoint # or str. Endpoint must be able to .startTLS
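
To make the dispatch step concrete, here is a minimal sketch of "parse enough to find the hint type, then hand the hint to the matching helper". It is not Foolscap code: SketchTorHelper, hint_type() and endpoint_for_hint() are hypothetical names, and a plain dict stands in for the IStreamClientEndpoint so the example runs without Twisted.

```python
def hint_type(hint):
    """Return the text before the first colon, e.g. 'tcp' or 'tor'."""
    prefix, sep, _rest = hint.partition(":")
    if not sep or not prefix:
        raise ValueError("hint has no type prefix: %r" % (hint,))
    return prefix


class SketchTorHelper:
    """Hypothetical stand-in for an IFoolscapClientConnectionHelper provider."""
    typename = "tor"

    def __init__(self, socks_port):
        # Client-private setting: supplied locally, never read from the hint.
        self.socks_port = socks_port

    def make_client_endpoint(self, connection_hint):
        _type, host, port = connection_hint.split(":")
        # A real helper would return an IStreamClientEndpoint here.
        return {"socks_port": self.socks_port, "host": host, "port": int(port)}


def endpoint_for_hint(helpers, hint):
    """Pick the helper registered for this hint's type, or None if unknown."""
    helper = helpers.get(hint_type(hint))
    if helper is None:
        return None
    return helper.make_client_endpoint(hint)


helpers = {h.typename: h for h in [SketchTorHelper(socks_port=9050)]}
endpoint = endpoint_for_hint(helpers, "tor:abc.onion:80")
```

Hints whose type has no registered helper are simply skipped in this sketch; whether they should instead raise an error is a separate design decision.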

The txtorcon package makes it possible to create a client Endpoint that uses a local Tor daemon to connect to an arbitrary DNS name, IPv4 address, or .onion hidden service. By adding a Connection plugin that handles tor:XYZ.onion:80, Foolscap becomes capable of using tor:-prefixed connection hints in FURLs.

Note that Twisted already has functions to convert from string to endpoint: twisted.internet.endpoints.serverFromString() and clientFromString(). These use a set of *Twisted* plugins (using Twisted's slightly-weird non-pip/setuptools-based plugin mechanism, the one with a .plugins file in $PYTHONPATH). The txtorcon package installs a Twisted plugin so that endpoints.clientFromString("tor:HOST:PORT:otheroptions") will use Tor to connect to HOST/PORT. The endpoints can (or will, some day) be configured to use a pre-existing Tor relay (with a SOCKS and/or control port), or to launch a new instance (by providing a path to the Tor executable, and maybe a persistent directory for it to store state). txtorcon also makes it pretty easy to launch hidden services.

We used to think that it was a good idea to interpret the FURL's connection hints directly as Twisted client endpoint specification strings, but then came to our senses. The problem is that the "otheroptions" fields are powerful: for txtorcon, these fields are used to tell the plugin where to find the local Tor proxy, and control which directory is used for persistent state, etc. These fields must not be controlled by an external party. So the endpoint should either be constructed as a normal python object (combining host/port arguments from the FURL, with locally-defined proxy settings), or the endpoint specification string must be carefully assembled from the same pieces (guarding against attacks like host="XYZ.onion:socksProxy=attacker.com:1234", which would reveal the client's address to the attacker).
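
A rough sketch of the "combine host/port from the FURL with locally-defined proxy settings" approach follows. The function names and the returned dict are illustrative assumptions, not Foolscap or txtorcon API; the point is only that nothing beyond a validated host and port is taken from the remote-supplied hint.

```python
import re

# Hostnames may contain only letters, digits, dots and hyphens, so a value
# such as "XYZ.onion:socksProxy=..." cannot smuggle in extra option fields.
_HOST_RE = re.compile(r"[A-Za-z0-9.-]+")
_PORT_RE = re.compile(r"[0-9]{1,5}")


def parse_tor_hint(hint):
    """Split 'tor:HOST:PORT' into (host, port), rejecting everything else."""
    fields = hint.split(":")
    if len(fields) != 3 or fields[0] != "tor":
        raise ValueError("not a tor:HOST:PORT hint: %r" % (hint,))
    _type, host, port_s = fields
    if not _HOST_RE.fullmatch(host):
        raise ValueError("bad hostname in hint: %r" % (hint,))
    if not _PORT_RE.fullmatch(port_s) or not 0 < int(port_s) < 65536:
        raise ValueError("bad port in hint: %r" % (hint,))
    return host, int(port_s)


def endpoint_args(hint, local_socks_port):
    """Merge remote-supplied host/port with locally-configured proxy settings."""
    host, port = parse_tor_hint(hint)
    return {"host": host, "port": port, "socks_port": local_socks_port}


safe = endpoint_args("tor:XYZ.onion:80", local_socks_port=9050)

try:
    endpoint_args("tor:XYZ.onion:socksProxy=attacker.com:1234", 9050)
    attack_rejected = False
except ValueError:
    attack_rejected = True
```

The attack string from the paragraph above splits into four fields rather than three, so it is rejected before any endpoint is built.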

The server specification string, on the other hand, is entirely controlled by the local admin. So it would nominally be ok to use e.g. Tub.listenOn("onion:80:controlPort=9052:hiddenServiceDir=/path"). But if the socks-port/control-port must be given to the Foolscap plugin for client purposes, then it probably makes sense to give them to the Foolscap plugin for Listener purposes too.

So the next step is to write Tor-for-Foolscap plugins. Application code could then do

tub.registerPlugin(TorListenerPlugin(control_port=XYZ))
tub.registerPlugin(TorConnectionPlugin(control_port=XYZ))
# or TorConnectionPlugin(tor_exe=TORPATH, state_dir=DIR)
tub.listenOn("onion:80")
tub.getReference(furl_with_tor_hints)

All information about the Tor configuration is stored in the plugins, and used when constructing the client/server Endpoints. The listener's what_to_advertise() method would figure out the hidden-service onion address, and return a connection hint of tor:FOO.onion:80.

This Tor-for-Foolscap package would include a connection-plugin which used Tor client endpoints to talk to all hosts, not just .onion services. We'll add another API to foolscap to clear the plugin table. Then, an application which wants to *only* use Tor for everything (to hide its own IP address) would do:

tub.removeAllPlugins()
tub.registerPlugin(TorListenerPlugin(tor_stuff))
tub.registerPlugin(TorConnectionPlugin(tor_stuff))
tub.registerPlugin(TorForTCPPlugin(tor_stuff))

and all subsequent .listenOn() and .getReference() calls would use Tor exclusively.

Finally, to help Tahoe use this, we'd change Tahoe to install these plugins if it sees a [tor] section in tahoe.cfg, and to replace its Tub.setLocation() call with a corresponding Tub.listenOn(.., advertise=) argument. Tahoe needs to delegate its what-is-my-ip-address code to Foolscap.

Change History (19)

comment:1 Changed 9 years ago by Brian Warner

Having thought about it some more, I think the client-side plugins (the ones tentatively named IFoolscapClientConnectionHelper) will work, but the server/listener side ones are more troublesome. The problem lies in the relative timing (and synchronization) between six events:

  • 1: the application tells Foolscap how to listen, with Tub.listenOn()
  • 2: Foolscap's Listeners get their sockets in order
  • 3: the Listeners report locations
  • 4: the Tub assembles the composite connection-hint
  • 5: (somehow) the application learns that the Tub is ready
  • 6: the application calls Tub.registerReference()

At the moment, we have maybe three rules:

  • the application must call Tub.setLocation() before it can call registerReference()
  • if you use listenOn("tcp:0"), you must use startService before you can call listener.getPortnum()
  • you can use Tub.setLocationAutomatically(), which handles "tcp:0", and when its Deferred fires, it's safe to use registerReference()

Tahoe uses Tub.setLocationAutomatically() for the auxiliary "key generator" and "stats gatherer" daemons, but not for the main Tahoe client/server node (it was added for Tahoe's benefit, but we didn't finish making the transition). I'm not sure Tahoe's current logic can use this anymore (where tub.location = ...:AUTO:... gets modified to include the computed location hints).

The listener-side plugin design (above) moves us to a world where apps aren't calling Tub.setLocation(), but instead the location hints are coming from the listener plugins and then getting concatenated together. We'd need to add an observer to tell the application that the plugins are finished working and it's safe to start registering references. The application-side API for this might look like:

t = Tub()
t.listenOn("specification-1")
d = t.when_ready()
d.addCallback(lambda _: t.registerReference(ref))

We currently simulate when_ready() in Tahoe, and it'd be nice to move that into Foolscap proper (or remove the need for it altogether). And we could have what_to_advertise() return a Deferred to handle the synchronization between events 2 and 3/4. But that leaves two problems.

The first is that we might have multiple Listeners, and they could be added one at a time. Without a clear signal that we've added the last one, it's hard for the Tub to know when it should fire when_ready. We might fix this by rejecting multiple listenOn calls, and maybe add a new call-once listenOnMany() or addListeners() that takes a list of specifications.

t = Tub()
t.listenOnMany(["specification-1", "specification-2"])
d = t.when_ready()
d.addCallback(lambda _: t.registerReference(ref))

The second problem is that I'm not sure simple concatenation is quite what we want. Do we always want one hint per listener? We might want multiple hints if the listener is attached to multiple interfaces (say, a multi-homed box). And there might be reasons to have one hint for multiple listeners (maybe round-robin DNS or something?). Plus, I don't think users would want to configure this one-Listener-at-a-time. It seems more natural to me to assign one set of connection hints for the Tub as a whole, even if you build it out of multiple listeners. As nice as it'd be to have the plugin take care of Tor Hidden-Service construction/allocation, the multiple-listener thing makes it messy.

So here's a thought:

  • build the client-side plugins
  • omit the server/listener-side ones
  • deprecate Tub.setLocationAutomatically()
  • replace it with a pair of utility functions that 1: allocate a port, and 2: collect the local IP addresses
  • deprecate Foolscap's native handling of tcp:0 while we're at it

Foolscap would then be fully-explicit, and any port-allocation or address-autodetection would need to be done by the application code before setting up the Tub. In particular, Tahoe would handle tcp:0 (by allocating a port before calling listenOn()), rather than Foolscap.
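
For illustration, the port-allocation half of that utility pair could be sketched with only the standard library as below. The name allocate_tcp_port() and the loopback default are assumptions for this example, not the function that eventually landed in Foolscap.

```python
import socket


def allocate_tcp_port(interface="127.0.0.1"):
    """Ask the kernel for a currently-unused TCP port on the given interface.

    Binding to port 0 lets the OS pick a free port; we read the number back
    and release the socket. Another process could grab the port before the
    caller listens on it, so this is best-effort rather than a reservation.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((interface, 0))
        return s.getsockname()[1]
    finally:
        s.close()


port = allocate_tcp_port()
listen_spec = "tcp:%d" % port  # what the application would pass to listenOn()
```

Because the call is synchronous, the application can compute its listen specification (and write it to disk) before the Tub is created.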

Tahoe would need to do something special to listen on a Tor hidden service, but we know Tor users who want to configure that externally anyways (e.g. tell Tahoe to listen on 127.0.0.1/port=X, tell Tor to forward Y.onion at X, then advertise onion:Y.onion). We could make a simpler option (tahoe create-node --tor) that uses Tahoe-side code to allocate the HS.

*If* we could get rid of address autodetection entirely, and *if* we could either get rid of port allocation or push it up into tahoe create-node, then Tahoe's startup would look like this (which would be awesome, and would help some errors get delivered synchronously during tahoe start, which would be doubly awesome):

t = Tub()
t.listenOn(cfg.get("tub.listen"))
t.setLocation(cfg.get("tub.location"))
do_rest_of_setup()

If we could get rid of address autodetection, but had to allow tub.listen = tcp:0 (noting that, in general, this only ever happens the first time the node is launched, because after that the port number is read from NODEDIR/client.port instead), then it'd look like:

t = Tub()
listen_spec = merge_tcp_0(cfg.get("tub.listen"), read_client_dot_port)
d = defer.succeed(listen_spec)
if listen_spec.split(":")[:2] == ["tcp", "0"]:
    d.addCallback(lambda _: foolscap.allocate_tcp_port())
    d.addCallback(lambda port: "tcp:%d" % port)
d.addCallback(lambda new_listen_spec: t.listenOn(new_listen_spec))
d.addCallback(lambda _: do_rest_of_setup())

which still has the messy split between setup that's done before we can get a port, and setup that can be done afterwards. I really want to get rid of that split.
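
The merge_tcp_0() helper used above is not defined anywhere in this ticket. One plausible reading, assuming its second argument is the port previously saved in NODEDIR/client.port (or None on first launch), is:

```python
def merge_tcp_0(listen_spec, saved_port):
    """Replace the '0' in a 'tcp:0[:...]' spec with a previously saved port.

    Hypothetical helper: saved_port is the integer read from client.port,
    or None if the node has never been started. Specs that already name an
    explicit port, or that are not TCP, are returned unchanged.
    """
    fields = listen_spec.split(":")
    if fields[:2] == ["tcp", "0"] and saved_port is not None:
        fields[1] = str(saved_port)
    return ":".join(fields)
```

With this reading, the tcp:0 branch in the snippet above is only taken on the very first launch, when no saved port exists yet.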

If we have to tolerate both, it grows to something like:

t = Tub()
location_spec = cfg.get("tub.location")
d1 = defer.succeed(location_spec)
if "AUTO" in location_spec.split(","):
    d1.addCallback(lambda _: autodetect_address_and_interpolate(location_spec))
listen_spec = merge_tcp_0(cfg.get("tub.listen"), read_client_dot_port)
d2 = defer.succeed(listen_spec)
if listen_spec.split(":")[:2] == ["tcp", "0"]:
    d2.addCallback(lambda _: foolscap.allocate_tcp_port())
    d2.addCallback(lambda port: "tcp:%d" % port)
d2.addCallback(lambda new_listen_spec: t.listenOn(new_listen_spec))
DeferredList([d1, d2]).addCallback(lambda _: do_rest_of_setup())

So, we have (at least) two basic directions to choose from:

  • 1: remove detection/allocation from Foolscap, use tub.setLocation(), no Listener plugins, application provides connection hints
  • 2: add more detection/allocation into Foolscap, use Listener plugins, no tub.setLocation(), plugin provides connection hints

I'll keep thinking about this.

comment:2 Changed 9 years ago by Brian Warner

In today's meeting, we basically settled on the "static" approach described above. The Foolscap-specific parts of this are:

  • specify the API for the client-side plugins (how the app registers plugins, and how the plugins get called)
  • write the "default" (TCP) plugin
  • deprecate and maybe remove Tub.setLocationAutomatically()
  • add an allocate_port() utility function

I'm thinking the plugins should be per-Tub, rather than global to the whole process. This fits more with our "no ambient globals" style, but one downside is that a process that uses multiple Tubs (e.g. Tahoe, using one Tub for the storage server, and a second for the logport/controlport) might forget to switch to Tor-only plugins on all the Tubs, and might have configured a log-gatherer FURL pointing to a regular IP address, and the non-plugined Tub would then leak its address to the gatherer.

Per-Tub plugins might also be a tool for implementing the "connect directly to certain servers" override that Leif mentioned in the mailing list thread (https://tahoe-lafs.org/pipermail/tahoe-dev/2015-June/009448.html). The StorageFarmBroker would maintain two Tubs (one for Tor, one for direct TCP), and would switch between them according to the local override rules. I think I'm slightly in favor of a different approach, where a "MaybeTorForTCP" plugin knows about the override rules and can emit SOCKS/Tor-ified endpoints or regular ones accordingly.
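
A very rough sketch of the "MaybeTorForTCP" idea, assuming the override rules are just a set of hostnames that may be contacted directly. The class shape, the describe_endpoint() method and the tuple results are all hypothetical; a real plugin would return Twisted endpoints instead.

```python
class MaybeTorForTCP:
    """Route tcp: hints directly or through a local SOCKS (Tor) port."""

    def __init__(self, direct_hosts, socks_port):
        self.direct_hosts = frozenset(direct_hosts)  # local override rules
        self.socks_port = socks_port

    def describe_endpoint(self, hint):
        hint_type, host, port = hint.split(":")
        if hint_type != "tcp":
            raise ValueError("not a tcp hint: %r" % (hint,))
        if host in self.direct_hosts:
            # Explicitly allowed: connect without Tor.
            return ("direct", host, int(port))
        # Default: hide our address by going through the SOCKS proxy.
        return ("socks", self.socks_port, host, int(port))


plugin = MaybeTorForTCP(direct_hosts=["storage.lan"], socks_port=9050)
```

Keeping the decision inside one plugin means a single Tub suffices, and the default for unlisted hosts stays "use Tor".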

comment:3 Changed 9 years ago by dawuud

OK so I copied Meejah's/txtorcon's available_tcp_port; added to util.py: https://github.com/david415/foolscap/tree/236.allocate-port.0

This helper function returns available ports for the loopback interface... but this might not be what we want?

comment:4 Changed 9 years ago by dawuud

It seems we should use the Twisted plugin api to write our Foolscap client plugin system: https://twistedmatrix.com/documents/current/core/howto/plugin.html

But what should our client plugin Zope interface look like? It could just have a connect method... and then use the plugin to dynamically create an endpoint object... and then call this endpoint's connect method.

comment:5 Changed 9 years ago by dawuud

tracking progress of synchronous ip auto-detection in #238

comment:6 Changed 9 years ago by Brian Warner

I'm looking at converting negotiation.py from ClientConnectionFactory to (Client)Endpoints. It's not trivial, but not impossible. Once that's done, we can start on the plugins.

For the immediate goal (which is just client-side Tor support), we only need the connection plugins. We can defer the allocate_port() work for later.

comment:7 Changed 9 years ago by dawuud

Maybe some of my dev branches can be used as a reference when writing this... I obviously didn't design the API changes properly but the twisted endpoint stuff did work and I learned some things when getting rid of all the dirty reactor errors... mostly having to do with connection deferred cancellation or lack thereof.

comment:8 Changed 9 years ago by Brian Warner

WIP: https://github.com/warner/foolscap/tree/endpoints , seems to be working. dawuud, thanks for the reference, that helped a bunch!

comment:9 Changed 9 years ago by Brian Warner

ok, the endpoint stuff got landed in trunk in [d61360ef]. There's a Twisted bug (twisted#8014) that means we can't use HostnameEndpoint right now, which is a shame because that'd probably give us IPv6 client-side support (as well as handling round-robin DNS responses), but we can do without it for now.

Next step is to change the way connection hints are managed: do less parsing ahead of time, leave more of the work up to the plugins.

comment:10 Changed 9 years ago by str4d

Reviewing [d61360ef].

foolscap/connection.py:

  • Is there a reason for separately adding self._connectionSuccess and self._connectionFailed on lines 182-183? Logically they are a pair, and I don't see any functional reason why self._connectionSuccess should fail and error to self._connectionFailed. So wouldn't d.addCallbacks() make more sense here? Or is it simply that the method signature for d.addCallbacks() isn't as readable when both callback and errback need additional arguments?

Otherwise, LGTM 👍

comment:11 Changed 9 years ago by Brian Warner

All your observations are correct. I was being lazy and using the log.err in _connectionFailed to capture any errors that happened during _connectionSuccess. If/when we manage to clean up that control flow (replacing the redirectReceived/negotiationFailed/negotiationComplete calls with a normal Deferred callback), I think I'll make the two connection failed/success calls into siblings like you suggest, and put a single log-weird-stuff errback at the end of the whole chain.

comment:12 Changed 9 years ago by Brian Warner

I've pushed some more changes in [20f867a4], now we manage connection hints as strings internally instead of tuples. This delivers the string connection hint to a standalone function named hint_to_endpoint(). The plugin will take the place of that function, or be called by it, or something.

comment:13 Changed 9 years ago by Brian Warner

The remaining pieces:

  • define an Interface for the plugins
  • maybe call them Handlers instead of Plugins, since we aren't proposing to use twisted.plugin or setuptools plugins for these (an application might, but Foolscap itself won't, and apps must explicitly run Tub.addConnectionPlugin() if they want non-default behavior)
  • tests, docs

We also need to think about the future, where we might pass additional information to the plugin (which tubid we're connecting to, maybe some opaque pointer or options that accompanied the tub.getReference() call with e.g. a Tahoe serverid). We could add a new method in the future (hint_to_endpoint2) with additional arguments, and fall back to the old type if we get an AttributeError while invoking that one. We could define the arguments now, and tell plugin authors that they'll get None until some future release that starts using them. Or we could tell plugin authors to accept/discard **kwargs in their arguments now, but to not expect additional arguments until some later release.
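
Two of these forward-compatibility options can be sketched side by side. The handler classes and call_handler() below are illustrative only; hint_to_endpoint2 and tubid come from the paragraph above, and the tuple return values stand in for real endpoints.

```python
class OldHandler:
    """Written against the original interface only."""
    def hint_to_endpoint(self, hint, reactor):
        return ("old", hint)


class NewHandler:
    """Also provides the richer, later-added method."""
    def hint_to_endpoint(self, hint, reactor):
        return ("old", hint)

    def hint_to_endpoint2(self, hint, reactor, tubid=None):
        return ("new", hint, tubid)


class KwargsHandler:
    """Accepts and discards extra keyword arguments from future releases."""
    def hint_to_endpoint(self, hint, reactor, **kwargs):
        return ("kwargs", hint)


def call_handler(handler, hint, reactor, tubid):
    """Prefer hint_to_endpoint2 when the handler has it, else fall back."""
    newer = getattr(handler, "hint_to_endpoint2", None)
    if newer is not None:
        return newer(hint, reactor, tubid=tubid)
    return handler.hint_to_endpoint(hint, reactor)
```

Probing with getattr() avoids catching an exception that might really have come from inside the handler's own code.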

comment:14 Changed 9 years ago by Brian Warner

Component: unknown → network
Milestone: undecided → 0.9.0

comment:15 Changed 9 years ago by Brian Warner

Resolution: fixed
Status: new → closed

Added the Interface and some tests in [cb5a7c048]. Calling this one done.

comment:16 Changed 9 years ago by Brian Warner

Resolution: fixed
Status: closed → reopened

After some feedback from str4d in https://tahoe-lafs.org/trac/tahoe-lafs/ticket/517#comment:48 , I'm going to change the interface before the 0.9.0 release:

  • Tub.addConnectionHintHandler(hint_type, handler)
  • the Tub maintains a dict that maps hint_type to handler
  • handlers are only called with hints of the registered types
  • hint_to_endpoint(hint,reactor) raises InvalidHintError rather than returning None

The old-style host:port hints will be transformed into tcp:host:port before looking through the handlers, instead of being processed in DefaultTCP.
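
As a sketch of how the dict-based design fits together (not the code that landed): HandlerTable stands in for the Tub, SketchTCP for the default handler, and the legacy pattern is deliberately simplified. Treating "no registered handler" as InvalidHintError is an assumption made for this example.

```python
import re


class InvalidHintError(Exception):
    """The hint is malformed or no handler is registered for its type."""


# Simplified legacy pattern for this sketch: HOST:PORT with exactly one colon.
_LEGACY_RE = re.compile(r"[A-Za-z0-9.-]+:[0-9]{1,5}")


def convert_legacy_hint(hint):
    """Turn old-style 'host:port' into 'tcp:host:port'; leave others alone."""
    if _LEGACY_RE.fullmatch(hint):
        return "tcp:" + hint
    return hint


class SketchTCP:
    def hint_to_endpoint(self, hint, reactor):
        fields = hint.split(":")
        if len(fields) != 3 or not fields[2].isdigit():
            raise InvalidHintError("unparseable tcp hint: %r" % (hint,))
        return (fields[1], int(fields[2]))  # a real handler returns an endpoint


class HandlerTable:
    def __init__(self):
        self._handlers = {}  # maps hint_type -> handler

    def addConnectionHintHandler(self, hint_type, handler):
        self._handlers[hint_type] = handler

    def hint_to_endpoint(self, hint, reactor=None):
        hint = convert_legacy_hint(hint)  # pre-conversion happens first
        hint_type, sep, _rest = hint.partition(":")
        handler = self._handlers.get(hint_type) if sep else None
        if handler is None:
            raise InvalidHintError("no handler for hint: %r" % (hint,))
        return handler.hint_to_endpoint(hint, reactor)


table = HandlerTable()
table.addConnectionHintHandler("tcp", SketchTCP())
legacy_result = table.hint_to_endpoint("example.org:1234")

try:
    table.hint_to_endpoint("tor:abc.onion:80")  # no "tor" handler registered
    unknown_rejected = False
except InvalidHintError:
    unknown_rejected = True
```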

comment:17 Changed 9 years ago by Brian Warner

One slight downside of using a dictionary: legacy hints like localhost:12345 are ambiguous. Is this type=localhost and stuff=12345, or should it be translated into the non-ambiguous tcp:localhost:12345? Allowing plugins to claim-or-decline each hint meant a type=localhost handler could choose to grab it, or allow it to pass through to the legacy TCP handler. Using a dictionary, and the necessary pre-conversion routine, means this hypothetical type=localhost handler would never see it.

The regexp that matches legacy hints has three rules:

  • exactly one colon
  • the first part must be a dotted-quad IPv4 address, or legal DNS name
  • the second part must be 1-5 digits
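
An approximation of those three rules as a Python regexp is shown below. This is a reconstruction for illustration, not the pattern from the Foolscap source, and the "legal DNS name" part is deliberately loose.

```python
import re

_IPV4 = r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}"            # dotted-quad address
_DNS = r"[A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?"  # loose DNS-name match
# Exactly one colon: neither side of the pattern can itself contain ':'.
LEGACY_HINT_RE = re.compile(r"(%s|%s):([0-9]{1,5})" % (_IPV4, _DNS))


def is_legacy_hint(hint):
    """True if the whole hint looks like an old-style HOST:PORT pair."""
    return LEGACY_HINT_RE.fullmatch(hint) is not None


examples = {
    "localhost:12345": is_legacy_hint("localhost:12345"),
    "127.0.0.1:80": is_legacy_hint("127.0.0.1:80"),
    "tor:abc.onion:80": is_legacy_hint("tor:abc.onion:80"),
    "fd:1": is_legacy_hint("fd:1"),
    "fd:fd=1": is_legacy_hint("fd:fd=1"),
}
```

Here fd:1 is classified as legacy (a short hostname "fd" with port 1), while fd:fd=1 and tor:abc.onion:80 are not.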

Since we generally expect hint types to be [a-z]+, the only points of overlap will be short (non-fully-qualified) hostnames. But I don't think that helps. New hint types will need to do one of the following to avoid having their hints get mis-classified as legacy:

  • use two or more colons, e.g. tor:abc.onion:80. Anything with a "port number" is safe.
  • put something non-alphanumeric/./- in the hint type. However comma and slash are claimed by the FURL the hints are embedded in. Underscore, asterisk, equals, and plus might be options.
  • put something non-digit in the second half. e.g. fd:fd=1 instead of just fd:1.

Kind of a drag, but maybe not too onerous. Also, I'm not entirely convinced the claim-or-decline approach would have worked anyways: we'd need to let plugins get inserted *before* DefaultTCP, and that complicates the registration functions.

comment:18 Changed 9 years ago by Brian Warner

Resolution: fixed
Status: reopened → closed

Updated in [6cb27f14].

comment:19 Changed 9 years ago by str4d

Here's an alternative: Look up the handlers first, and if none are found, then do the pre-conversion and look up again (which will then find DefaultTCP if it is there).

  • If the first lookup succeeds, it's a hint for a registered handler.
  • If the pre-conversion fails, it's an invalid hint.
  • If the second lookup succeeds, it's a hint for DefaultTCP (or another handler with type=tcp).
  • If the second lookup fails, it's an invalid hint.

This means that if a type=localhost plugin is installed, it will get the hint. It also removes the need for other plugins to be careful about how their own hints are prepared, because they are asked before the pre-conversion.
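
A minimal sketch of this lookup-first ordering, using a plain dict for the handler table, strings in place of handler objects, and a simplified legacy pattern. find_handler() and InvalidHintError as defined here are illustrative, not the Foolscap implementation.

```python
import re


class InvalidHintError(Exception):
    """No handler could be found for the hint."""


# Simplified legacy pattern: HOST:PORT with exactly one colon.
_LEGACY_RE = re.compile(r"[A-Za-z0-9.-]+:[0-9]{1,5}")


def find_handler(handlers, hint):
    """Return (handler, hint_to_pass), trying registered types first."""
    hint_type, sep, _rest = hint.partition(":")
    # First lookup: a registered handler claims the hint as-is.
    if sep and hint_type in handlers:
        return handlers[hint_type], hint
    # Otherwise try the legacy pre-conversion.
    if not _LEGACY_RE.fullmatch(hint):
        raise InvalidHintError("unrecognized hint: %r" % (hint,))
    # Second lookup: the converted hint goes to whatever handles "tcp".
    if "tcp" not in handlers:
        raise InvalidHintError("no tcp handler for: %r" % (hint,))
    return handlers["tcp"], "tcp:" + hint


with_localhost = find_handler({"tcp": "TCP", "localhost": "LOCAL"}, "localhost:12345")
without_localhost = find_handler({"tcp": "TCP"}, "localhost:12345")

try:
    find_handler({"tcp": "TCP"}, "tor:abc.onion:80")
    tor_rejected = False
except InvalidHintError:
    tor_rejected = True
```

With a type=localhost handler installed, localhost:12345 reaches that handler unchanged; without one, the same hint is converted and reaches the tcp handler.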
