Opened 12 years ago

Closed 9 years ago

#203 closed task (fixed)

change Tub.listenOn() to use Endpoints

Reported by: Brian Warner
Owned by: Brian Warner
Priority: major
Milestone: 0.12.0
Component: network
Version: 0.6.4
Keywords:
Cc: reg.trac.foolscap.4v9Xu@…

Description (last modified by Brian Warner)

Starting in Twisted-10.1, "Endpoints" are the preferred way to set up servers and clients. They are intended to encapsulate a variety of ways of making streaming network connections (of which TCP is one possibility, but unix-sockets, SOCKS proxies, TOR .onion addresses, i2p connections, and cjdns connections are others).

The "type" extension field in Foolscap's "connection hints" is intended to reflect this variety, starting with accommodating both ipv4 and ipv6, but meant to cover other things in the future too (like .i2p addresses which aren't associated with port numbers).

It would probably be a good idea to change Foolscap's internal socket-handling code to use Endpoints rather than internet.TCPServer and reactor.connectTCP directly. It might also be good to allow our connection hints to include any string that endpoints.clientFromString is capable of parsing.

The current plan is to make Tub.listenOn() accept a "stream server endpoint description string", which is then parsed by Twisted to build a "stream server endpoint". This will get us listen-on-IPv6 support for free.

We no longer intend to allow FURLs to specify client endpoint strings directly: instead we added connection-hint handling plugins (in #236) to safely map externally-supplied connection hints to locally-generated client endpoints.

This process started in #167 (where early Endpoints support broke foolscap and needed to be worked around), and continues in #155 (where Endpoints might make ipv6 support easier).

Attachments (2)

Change History (60)

comment:1 Changed 12 years ago by Brian Warner

Incidentally, one thing we still need out of this is the ability to learn which kernel-allocated TCP port number we got, so we can use it again next time. In an internet.TCPServer we got this by peeking at the internal ._port attribute. We'll need something similar from our endpoint objects.

comment:2 Changed 11 years ago by Zooko

I just posted a comment on #150 wondering whether this feature -- Endpoints -- would satisfy the goals of #150 but better.

comment:3 Changed 11 years ago by Zooko

I'm not sure how to write the first unit test for this. Would the test do something like passing a new "self address" to foolscap which says "i2p:somethingsomething" instead of saying "192.168.0.1:somethingsomething" ?

comment:4 Changed 11 years ago by str4d

Replying to warner:

Incidentally, one thing we still need out of this is the ability to learn which kernel-allocated TCP port number we got, so we can use it again next time. In an internet.TCPServer we got this by peeking at the internal ._port attribute. We'll need something similar from our endpoint objects.

AFAICT the way to do this is to get your IStreamServerEndpoint (with serverFromString()) and call listen() on it, which returns a Deferred that fires with an IListeningPort on success. You can then call IListeningPort.getHost(), which returns an IAddress provider. I haven't looked into whether the port set here is the kernel-allocated one when you ask to bind to port 0, but API logic dictates that it should be.

comment:5 Changed 11 years ago by str4d

From IRC:

<str4d> Looks like a rewrite of foolscap.pb.Listener to use Endpoints instead of TCPServer will work.
<str4d> zooko: pb.Listener leverages twisted.application.internet.TCPServer, which is a wrapper around reactor.listenTCP
<str4d> Instead of storing the TCPServer instance in Listener, you want to store a twisted.internet.interfaces.IStreamServerEndpoint instance
<str4d> (provided by serverFromString)
<str4d> Calling listen() on that will directly give you the IListeningPort, which is what pb.Listener is currently "stealing" via s._port
<str4d> Except... hmm.
<str4d> IStreamServerEndpoint.listen() takes an IProtocolFactory as a parameter, and returns the IListeningPort.
<str4d> So I'm not certain whether managing the IStreamServerEndpoint inside the IProtocolFactory (pb.Listener) is the right "flow".
<str4d> TCPServer is a proper Service, whereas IStreamServerEndpoints are not.
<str4d> FWICT addTub() / removeTub() is where the listening logic occurs, hidden behind s.setServiceParent()
<str4d> With Endpoints, I think addTub() would call IStreamServerEndpoint.listen(self) and store the returned IListeningPort, and removeTub() would call IListeningPort.stopListening() once all Tubs were gone.
* str4d is not yet clear on the exact relationship between Tubs and Listeners, because each seems to have a list or map of the other.
<zooko> I know that a single Tub can have multiple Listeners.
<zooko> Because it can accept connections on multiple interfaces+portnumbers.
<zooko> I assume that a single Listener handles a single interface+portnumber.
<zooko> I doubt a single Listener can have multiple tubs ?
<str4d> zooko: it looks like it can.
<str4d> The first Tub becomes the "parent", and when the parent tub gets removed another tub is made the parent.
<str4d> But that only makes sense to me as being necessary because the current system uses TCPServer which requires a parent Service.
<str4d> Single Listener handles single endpoint, yes.
<str4d> Tub.listenOn():
<str4d> @return: The Listener object that was created. This can be used to
<str4d> stop listening later on, to have another Tub listen on the same port,
<str4d> and to figure out which port was allocated when you used a strports
<str4d> specification of 'tcp:0'.

comment:6 Changed 11 years ago by str4d

Attached is my first pass at migrating foolscap.pb.Listener to use Twisted endpoints. Calling Tub.listenOn() should work with any valid endpoint description. Non-TCP endpoint descriptions will listen but not be used because the client side still only knows TCP.

Modifying the client side will take more time, because I don't yet understand how connections are managed by foolscap. reactor.connectTCP() (used in TubConnector.connectToAll()) returns an IConnector which I think can be restarted, but IStreamClientEndpoint.connect() returns an IProtocol which cannot.

Last edited 11 years ago by str4d

comment:7 Changed 11 years ago by str4d

This patch breaks six tests. Five of these are tests that use port-only descriptions (e.g. Tub.listenOn(0)) that previously were accepted and defaulted to TCP, but that the endpoint API considers invalid. It needs to be decided whether foolscap should continue to accept "old-style" descriptions (and detect them and prepend "tcp:"), or whether users should be informed that they should use proper endpoint descriptions now.
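If the old-style descriptions are kept, the detection could be as small as the sketch below. The helper name normalize_listen_spec is hypothetical and not part of foolscap; it only rewrites bare port numbers and leaves everything else for Twisted to parse.

```python
def normalize_listen_spec(what):
    """Upgrade an old-style port-only spec to a server endpoint description."""
    if isinstance(what, int):
        # Tub.listenOn(0) style: a bare integer port
        return "tcp:%d" % what
    if what.isdigit():
        # Tub.listenOn("0") style: a string holding only digits
        return "tcp:" + what
    # Anything else is assumed to already be an endpoint description
    return what
```

With this in front of serverFromString, listenOn(0), listenOn("0") and listenOn("tcp:0") would all end up as the same description.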

The other broken test fails because Listener no longer has a parentTub attribute. I stripped the parentTub logic out because AFAICT it is unnecessary with the endpoint API - there is no Service to add as a child of the Tub. If I am mistaken, and the parentTub attribute is used for more than running the TCPServer, then that can be replaced (but grep -r parentTub turned up nothing outside of the original pb.py and the now-broken test).

comment:8 Changed 11 years ago by nejucomo

I read over the patch diff, but I'm not intimately familiar with foolscap code. I have a concern about how hosts and ports are used by foolscap:

The Listener can be queried for the host and/or port. Why?

It's hard to tell how applications will use this information, though it seems a common pattern for "decentralized apps" will be to transmit this information elsewhere so that other systems can connect to the Listener.

For a use case of I2P/Tor this seems dangerous. Instead, those applications should almost certainly not be transmitting any local details like localhost:8080. So the first thought is to ensure we replace getHost() / getPort() -style APIs with getConnectionAddress() which returns an endpoint client spec string.

Notice I said client spec. For Tor, the server endpoint must include the private key (or a way to discover it), whereas the connection details passed around to other peers must not include that, and instead must be a public .onion.

So now the API must be (approximately, I'm not certain of the current API): Tub.listen(serverspec) versus Listener.getClientSpec(). For simple TCP cases, these happen to be the same host/port pair (although for things like NAT they may already differ).

This bears some thought about endpoints, API design, and application management of endpoints.

I can't tell if this patch already makes this distinction clearly and correctly. One way it could help is to stop referring to hosts and ports, and start referring to connector specs and listener specs.

Last edited 9 years ago by Brian Warner

comment:9 in reply to:  8 Changed 11 years ago by str4d

Replying to nejucomo:

I read over the patch diff, but I'm not intimately familiar with foolscap code. I have a concern about how hosts and ports are used by foolscap:

The Listener can be queried for the host and/or port. Why?

From the docstring for getPortnum():

When this Listener was created with a port string of '0' or
'tcp:0' (meaning 'please allocate me something'), and if the Listener
is active (it is attached to a Tub which is in the 'running' state),
this method will return the port number that was allocated. This is
useful for the following pattern::

    t = Tub()
    l = t.listenOn('tcp:0')
    t.setLocation('localhost:%d' % l.getPortnum())

This is only useful for IP-style endpoints; the IAddress returned by IListeningPort will not always have a port attribute. I added getHost() so that foolscap users could get the current listening-on information (as provided by the particular endpoint - TCP/SSL/I2P/Tor/etc). Ideally, I would remove getPortnum entirely, and the foolscap user would determine how to use the provided IAddress - if using endpoints, foolscap should be endpoint-agnostic.

It's hard to tell how applications will use this information, though it seems a common pattern for "decentralized apps" will be to transmit this information elsewhere so that other systems can connect to the Listener.

For a use case of I2P/Tor this seems dangerous. Instead, those applications should almost certainly not be transmitting any local details like localhost:8080. So the first thought is to ensure we replace getHost() / getPort() -style APIs with getConnectionAddress() which returns an endpoint client spec string.

As above, getHost() returns the IAddress provided by the endpoint, which should contain this info. I am not certain what is returned for TCP apps (and if there is NAT etc. then the foolscap user will still need to manually set the externally-accessible location) but the I2P/Tor endpoint plugins should be implemented such that the IAddress they return contains the B32/onion being listened on.

Incidentally, it will be easier to not leak local details once endpoints are implemented. Using Tahoe over I2P right now needs careful configuration to not leak details because the underlying foolscap uses TCP. With endpoints, foolscap will have no need to "know" specific local details.

Notice I said client spec. For Tor, the server endpoint must include the private key (or a way to discover it), whereas the connection details passed around to other peers must not include that, and instead must be a public .onion.

The server endpoint description passed into the Listener would necessarily contain the I2P/Tor private key location, passed in however the I2P/Tor endpoint expects it (B64 repr, file location etc. - however the I2P/Tor endpoint plugin is implemented).

This distinction is equivalent to the difference between client endpoint descriptions and server endpoint descriptions for the core endpoint types (described here).

So now the API must be (approximately, I'm not certain of the current API): Tub.listen(serverspec) versus Listener.getClientSpec(). For simple TCP cases, these happen to be the same host/port pair (although for things like NAT they may already differ).

This bears some thought about endpoints, API design, and application management of endpoints.

I can't tell if this patch already makes this distinction clearly and correctly. One way it could help is to stop referring to hosts and ports, and start referring to connector specs and listener specs.

I didn't want to move too far away from that until I understood the foolscap code better, but I agree.

comment:10 Changed 11 years ago by Zooko

I got this message from tom prince on IRC:

Perhaps you are looking for twisted.application.internet.StreamServerEndpointService (in that foolscap ticket)

comment:11 Changed 11 years ago by Zooko

Owner: set to Brian Warner

I'm still stuck on "What are the tests?". Maybe if we're switching to Endpoints internally while still offering the same functionality through the foolscap API to the foolscap user, then we don't need any new tests! Refactoring internally and not breaking any of the current tests is sufficient. Brian: does that sound right to you?

comment:12 Changed 11 years ago by Zooko

… and then, yes, if we're adding functionality like I2P endpoints, then it is easier for me to see what the tests should be: constructing a foolscap tub while configuring it to listen on an I2P server-endpoint, for example.

comment:13 in reply to:  11 ; Changed 11 years ago by nejucomo

Replying to zooko:

I'm still stuck on "What are the tests?". Maybe if we're switching to Endpoints internally while still offering the same functionality through the foolscap API to the foolscap user, then we don't need any new tests! Refactoring internally and not breaking any of the current tests is sufficient. Brian: does that sound right to you?

That sounds acceptable to me, FWIW.

However, perhaps the string spec parsing can be tested. Are there valid endpoint string specs which the foolscap strports format does not accept? In that case a test could be added that fails against trunk foolscap but passes with endpoint-enabled foolscap.

Is there a "null" endpoint or a "process-internal endpoint" that can be used in a deterministic unittest?

Last edited 11 years ago by nejucomo

comment:14 in reply to:  13 Changed 11 years ago by str4d

Replying to nejucomo:

Replying to zooko:

I'm still stuck on "What are the tests?". Maybe if we're switching to Endpoints internally while still offering the same functionality through the foolscap API to the foolscap user, then we don't need any new tests! Refactoring internally and not breaking any of the current tests is sufficient. Brian: does that sound right to you?

That sounds acceptable to me, FWIW.

However, perhaps the string spec parsing can be tested. Are there valid endpoint string specs which the foolscap strports format does not accept? In that case a test could be added that fails against trunk foolscap but passes with endpoint-enabled foolscap.

I don't think there are endpoint description formats that strports doesn't accept, but "0" is a strport format that endpoints (specifically, clientFromString and serverFromString) don't accept (as above).

Is there a "null" endpoint or a "process-internal endpoint" that can be used in a deterministic unittest?

My understanding of Twisted is that you would write a Fake endpoint plugin to use to test that endpoint descriptions were being used correctly in foolscap and being passed into the endpoint API correctly. Anything past the endpoint API should be tested by Twisted or the endpoint plugin developers.

comment:15 in reply to:  12 Changed 11 years ago by str4d

Replying to zooko:

… and then, yes, if we're adding functionality like I2P endpoints, then it is easier for me to see what the tests should be: constructing a foolscap tub while configuring it to listen on an I2P server-endpoint, for example.

One change I am not sure how to handle is to setLocation(). AFAICT the user must call setLocation() with the publicly-reachable locations before they can use the Tub, and then the Tub URL is created from these hints. For I2P/Tor this is not necessary, and the publicly-reachable location should be obtained from the endpoint. I think this is what the getHost() method of IListeningPort (returned by IStreamServerEndpoint.listen()) is for, and maybe it should be used for all endpoint types? How reliable is it for TCP/UDP endpoints?

comment:16 in reply to:  10 ; Changed 11 years ago by str4d

Replying to zooko:

I got this message from tom prince on IRC:

Perhaps you are looking for twisted.application.internet.StreamServerEndpointService (in that foolscap ticket)

This looks like a good replacement for TCPServer. But is it necessary?

comment:17 in reply to:  16 Changed 11 years ago by Zooko

Replying to str4d:

This looks like a good replacement for TCPServer. But is it necessary?

No idea! I was just repeating what tom prince had told me.

comment:18 Changed 11 years ago by Mike Kazantsev

Cc: reg.trac.foolscap.4v9Xu@… added

Minor nitpick - "and cjdns connections are others" in the description (referring to cjdns needing a special endpoint type) seems to be incorrect - cjdns maps keys to ipv6 addresses ("cjdns ipv6 = 0xfc + sha512(sha512(pubkey))[:15]", essentially), so you have:

11: hyperboria: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> ...
    link/none 
    inet6 fc7b:d79a:4eb5:28f6:5686:637d:9f75:5db6/8 scope global 
       valid_lft forever preferred_lft forever
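
The address in this listing is an ordinary IPv6 address inside fc00::/8, which can be checked with the standard library alone. The sketch below also builds a tcp6: server description for it; the helper name is hypothetical, and the backslash escaping of colons follows Twisted's endpoint string syntax.

```python
import ipaddress

# Every address starting with the byte 0xfc falls inside this network.
CJDNS_RANGE = ipaddress.ip_network("fc00::/8")

CJDNS_ADDR = "fc7b:d79a:4eb5:28f6:5686:637d:9f75:5db6"


def tcp6_server_description(address, port):
    """Build a 'tcp6:' server description bound to one IPv6 address."""
    parsed = ipaddress.ip_address(address)
    if parsed.version != 6:
        raise ValueError("not an IPv6 address: %r" % (address,))
    # Colons inside an endpoint argument must be backslash-escaped.
    escaped = str(parsed).replace(":", "\\:")
    return "tcp6:port=%d:interface=%s" % (port, escaped)
```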

"tcp6:" endpoint should work there, no need for anything special for that particular case.

comment:19 Changed 11 years ago by str4d

<str4d> tl;dr - foolscap cannot IMHO be completely ignorant of what wire it is on.
<zooko> str4d: why not?
<str4d> Because (I think) the Twisted APIs for TCP/UDP to get the public url a server is listening on are unreliable, but for I2P/Tor they are reliable.
<str4d> But for a user, they can easily work out their public url for TCP/UDP, but not for I2P/Tor
<zooko> What do you mean unreliable?
<str4d> IListeningPort.getHost()
<str4d> Has the standard problems with NAT etc.
<str4d> foolscap does not use it to get the host, only to get the TCP port.
<str4d> foolscap requires the user (i.e. Tahoe) to explicitly specify the public address that the server will be reachable on.
<str4d> But it cannot require that for I2P/Tor because for the general use case, the I2P endpoint will create and store the Destination keys itself.
<zooko> Hm.
<str4d> This is the key simplification over the existing process that requires users to manually set up tunnels. If the user has to still manually set tunnels up, they would just use a TCP endpoint.
<str4d> So, IListeningPort.getHost() returns an IAddress. For TCP/UDP the IAddress is unreliable. For I2P/UDP the IAddress is reliable. (IMHO)

For clients: ITransport.getPeer() gets the public address of the remote peer but cautions its use, and ITransport.getHost() states that it is the same but for the local side of the connection.

For servers: IListeningPort.getHost() gets the host the IListeningPort is listening for.

For TCP/UDP, the IAddress returned by the getHost() methods is an IP address:port that may not be publicly reachable (e.g. if behind NAT or port forwarding). For I2P, the IAddress is an I2P Destination(:port) that is guaranteed to be the publicly reachable address of the client/server (I assume similar for Tor HSs).

Result: Tub.listenOn() should accept any endpoint description. Tub.setLocation() is still required before a reference can be registered, but only if a TCP/UDP endpoint description was provided.

I think something needs to be special-cased (when an endpoint description is passed in, it is checked against a list of endpoint types to determine the required behavior). Do we special-case IP-style endpoints (TCP/UDP/SSL/SSH/...) or non-IP-style endpoints (I2P/Tor/whatever-else-might-be-created)?

How should the foolscap API be changed to accommodate this? Do foolscap clients (e.g. Tahoe) need to perform checks on their config to ensure that they have the required information? Can they call Tub.listenOn() and then only call Tub.setLocation() if they were provided that information (putting the requirement for checks onto the user configuring the foolscap client)?

comment:20 Changed 11 years ago by Randall Mason

Starting at line 793, there is a comment about how to support future protocols. It requires that every new type start with TYPE: where type is some alphanumeric string.

Twisted endpoints already have a location-hint-like syntax that would allow us to piggyback on any protocol additions that they make. This would be a good reason to break from the TYPE: syntax; we can make using twisted endpoints easier. The most concise way to include a twisted endpoint in our current referenceable.py would be to make all twisted endpoint definitions start with ":". This would still work with new possible protocols that twisted didn't support, but it would make most protocols be supported because twisted endpoints support them.

Legacy IPv4 addresses would continue to work because they are detected by this RE:

IPV4_HINT_RE = re.compile(r"^([^:]+):(\d+)$")

which requires that the first character not be a colon for IPv4.

Anything that we wanted to add that wasn't supported by twisted endpoints would still be able to use the TYPE: format according to the same format mentioned in the above comment.
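To make the three cases concrete, here is a small sketch of how a hint could be classified under this proposal. Only IPV4_HINT_RE comes from the existing code; classify_hint and the category labels are invented for illustration.

```python
import re

# Existing legacy pattern: HOST:PORT, where HOST contains no colon
IPV4_HINT_RE = re.compile(r"^([^:]+):(\d+)$")


def classify_hint(hint):
    """Return (kind, payload) for one connection hint string."""
    if hint.startswith(":"):
        # Proposed: a leading ":" marks a Twisted endpoint description
        return ("endpoint", hint[1:])
    mo = IPV4_HINT_RE.match(hint)
    if mo:
        return ("legacy-ipv4", (mo.group(1), int(mo.group(2))))
    # Everything else stays in the TYPE: extension space
    return ("typed", hint)
```

Because the legacy pattern needs a non-colon first character, a hint such as ":tcp:example.com:80" can never be mistaken for a HOST:PORT pair.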

comment:21 Changed 11 years ago by Zooko

Clash's idea in comment:20 was something that we came up with at Tahoe-LAFS Weekly Dev Chat today, and I like it. This is because people manipulate furls manually, e.g. with cut-and-paste in terminals and web browsers and so on, and so conciseness is important. That's why I said during Dev Chat today that I didn't want to have something like "endpoint:" at the beginning of every connection-hint that was in the endpoints scheme. Clash and I came up with the idea of just having a leading ":" be the indicator that this hint is an endpoint hint.

comment:22 in reply to:  20 Changed 11 years ago by str4d

Replying to ClashTheBunny:

Starting at line 793, there is a comment about how to support future protocols. It requires that every new type start with TYPE: where type is some alphanumeric string.

Twisted endpoints already have a location-hint-like syntax that would allow us to piggyback on any protocol additions that they make. This would be a good reason to break from the TYPE: syntax; we can make using twisted endpoints easier. The most concise way to include a twisted endpoint in our current referenceable.py would be to make all twisted endpoint definitions start with ":". This would still work with new possible protocols that twisted didn't support, but it would make most protocols be supported because twisted endpoints support them.

I don't understand why a break from the TYPE: syntax is necessary - that is exactly the syntax that Twisted endpoint descriptions use. These are all valid Twisted client endpoint descriptions:

tcp:host=twistedmatrix.com:port=80:timeout=15
tcp:twistedmatrix.com:80
ssl:host=twistedmatrix.com:port=443:caCertsDir=/etc/ssl/certs
i2pbob:stats.i2p

(The timeout and caCertsDir parameters are examples of optional parameters. Real FURLs would not contain them, and would use the concise formats e.g. ssl:twistedmatrix.com:443)

A FURL contains a series of comma-separated location hints for a client to use. Any of the above endpoint descriptions could be provided in the FURL, extracted by decode_location_hints and passed directly to clientFromString(). I don't see anything preventing commas from being used in endpoint descriptions, but Twisted supports escaped colons in parameters, so encode_location_hint could escape commas and decode_location_hints could unescape them.
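One possible shape for that escaping, as a sketch only. The names join_hints and split_hints are hypothetical (they are not the existing encode_location_hint / decode_location_hints). Backslashes are escaped as well as commas, so that Twisted's own "\:" escapes survive the round trip.

```python
def join_hints(hints):
    """Join endpoint descriptions with commas, escaping '\\' and ','."""
    escaped = [h.replace("\\", "\\\\").replace(",", "\\,") for h in hints]
    return ",".join(escaped)


def split_hints(encoded):
    """Inverse of join_hints: split on unescaped commas and unescape."""
    hints, current = [], []
    chars = iter(encoded)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally, whatever it is
            current.append(next(chars, ""))
        elif ch == ",":
            hints.append("".join(current))
            current = []
        else:
            current.append(ch)
    hints.append("".join(current))
    return hints
```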

Legacy IPv4 addresses would continue to work because they are detected by this RE:

IPV4_HINT_RE = re.compile(r"^([^:]+):(\d+)$")

which requires that the first character not be a colon for IPv4.

If a Twisted endpoint cannot be found to match an endpoint description, clientFromString() raises a ValueError. So a much simpler/dumber way to support legacy IPv4 addresses is to pass all location hints through clientFromString(), and if any return a ValueError instead of an endpoint, add tcp: to the start and try again. If a ValueError is still raised, then either the location hint is corrupted/invalid or the Twisted endpoint (plugin) it requires is not installed.
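A sketch of that retry logic. The parser is passed in as a callable so the snippet stays independent of any reactor; in real use it would be something like functools.partial(clientFromString, reactor). The function name is made up for this example.

```python
def endpoint_for_hint(hint, parse):
    """Parse a location hint, retrying once as a legacy TCP hint.

    `parse` takes a client endpoint description string and either returns
    an endpoint or raises ValueError.
    """
    try:
        return parse(hint)
    except ValueError:
        # Legacy "host:port" hints have no type prefix: retry as TCP.
        # A second ValueError propagates: the hint is invalid, or the
        # endpoint plugin it needs is not installed.
        return parse("tcp:" + hint)
```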

Anything that we wanted to add that wasn't supported by twisted endpoints would still be able to use the TYPE: format according to the same format mentioned in the above comment.

The reason we are considering using Twisted endpoints in foolscap is because it removes the need to have code for custom wires. If Twisted doesn't inherently support something, a plugin can be written that does - the API is generic enough to support anything (or it will be once this Twisted ticket lands).

Last edited 11 years ago by str4d

comment:23 Changed 11 years ago by str4d

Another pass at migrating Listener to use Twisted endpoints. This patch uses StreamServerEndpointService as a replacement for TCPServer, instead of using the endpoint directly. I do not know which method is better, but I do not understand the Twisted child Service concept yet.

This patch badly fails tests, because the reactor is left unclean. I have no idea why, the reactor is only used the same way as the previous patch (and TCPServer also uses the global reactor internally).

comment:24 Changed 11 years ago by Zooko

Brian and I just worked up this patch at the GSoC Mentor Summit party.

https://github.com/warner/foolscap/pull/15

comment:25 Changed 11 years ago by Randall Mason

I do definitely think that str4d has a very good idea of the biggest problem with this. Many services will have different types of advertising. With a server on all protocols, you want maximum advertising. TOR (and I think i2p) servers need to be able to specify that they only want to make connections over one type of protocol and that they will not advertise other protocols. As for a client, they NEED the ability to say that they will NOT advertise non-anonymized addresses. There are halfway cases such as IPv6 addresses that embed a MAC address, which is constant and traceable for the life of the computer; most people will not want to advertise that address and will go with the constantly rotating anonymous IPv6 addresses.

What does this mean for i2p/TOR services:

The new parsing rules will have a legacy-ipv4 rule that matches hints with a single colon, so anything else will be in the form TYPE:ADDRESS:PORT. (future schemes that don't have two-part address+port identifiers, like Tor hidden services or #151 I2P endpoints will need an extra colon, like i2p::I2PHASH.b32.i2p, to avoid being confused with legacy IPV4ADDR:PORT pairs).

Was this erroneous because the second part is not numeric? Is it impossible to have a named port like clashthebunny.com:smtp in the connection strings that could require this double colon?

comment:26 Changed 11 years ago by str4d

After much more thought while implementing I2P endpoints, I think that my comment:22 is flawed. But I don't have a clear picture of the best way. This comment is a dump of my notes.

Foolscap can't be ignorant of the wire for several reasons:

  • Authenticated Tubs use TLS, but this is not needed for all wires (I2P, Tor).
    • Is the TubID only for ensuring authenticity of the location hints? pbu:// doesn't use TLS?
    • I2P, Tor don't need TLS, the B32 hash/onion is inherently authenticated.
  • Location hints can't always be passed directly to clientFromString().
    • I2P endpoints will have different types depending on the backend API (i2pbob:, i2psam:), but any I2P endpoint type should connect to the provided location hint.
      • This was recommended by Twisted devs over having one i2p: type and making the API choice a parameter.
    • User may want to only use some location hints, or modify connection parameters (see below).
    • I may go back to having a single i2p: type if it makes converting location hints to client endpoint descriptions easier.
  • Foolscap needs wire-specific knowledge to generate correct I2P/Tor location hints for FURLs.

Multiple location hints:

  • Are there use cases for a service accessible
    • by multiple I2P Destinations?
      • IMHO no.
    • across multiple wires?
      • IMHO yes generally, no for Tahoe storage networks (some storage nodes would be inaccessible from some wires).
      • I2P and Tor - access a service from both networks.
      • I2P and clearnet?
        • Non-anon users get fast access, anon users get anon access.
        • But TLS needed for clearnet, not needed for I2P.
          • Have TubID to check TLS, ignore it for I2P?

Ideas for possible FURLs containing I2P location hints:

  1. pbu://i2p:[b32hash.b32.i2p|example.i2p][:port]/
    • Treat the location hint as unauthenticated, even though it is
  2. pb://b32hash@i2p:[b32hash.b32.i2p|example.i2p][:port]/
    • Use the B32 hash as the TubID, only works for a single location hint
    • Look at location hint type to determine if TLS is needed
  3. pb://b32hash@i2p[:port]/
    • Reduced form, use B32s only, FURLs could not contain domains
      • But domains in I2P are all "local" (now)
      • This only matters if the Tub operator wants their FURL to be identified (e.g. using same I2P Destination as their eepsite for their Tub node, or to make their FURL shorter)
  4. pbi://i2p:[b32hash.b32.i2p|example.i2p][:port]/
    • Use a different URL protocol to indicate that the Tub is authenticated, but not by TLS
    • Use this for any supported wire with its own keys (Tor)
  5. pbi://b32hash@i2p:[b32hash.b32.i2p|example.i2p][:port]/
    • Explicitly specify the B32 hash as TubID, only works for a single location hint
      • If the wire handles authenticity itself, is this necessary?

How should the foolscap API change?

  • Tub.listenOn(what)
    • what should be a Twisted server endpoint description string
  • Tub.setLocation(hints)
    • Only necessary for regular wires, foolscap can look up valid reachable location for I2P/Tor.
    • Optional for I2P to pass in a domain? (see FURL ideas above)
  • Converting from location hints to client endpoint descriptions
    • User must specify which location hints they support
      • If not specified, should foolscap use all (fail-open) or raise an error (fail-closed)?
      • If multiple endpoints can be used for a single location hint, user must specify which to use, e.g. i2pbob
        • Foolscap would need to keep an accurate mapping of location hints to Twisted endpoints.
        • Talk to Twisted devs again, get their opinion on correlating wires to endpoints/APIs.
    • User can pass in keyword parameters to append, e.g. timeout=15
      • Twisted ignores invalid parameters, so leave validity up to the configuring user?
Last edited 11 years ago by str4d
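
The hint-to-endpoint conversion in these notes could look roughly like the sketch below. Everything in it is hypothetical: the function name, the (type, reqParams) tuple layout and the handlers mapping are assumptions for illustration, and fail-closed is chosen as the default only to show the difference between the two options.

```python
def client_descriptions(hints, handlers=None, fail_open=False):
    """Map (hint_type, req_params) pairs to client endpoint descriptions.

    handlers maps a hint type to the endpoint type that should serve it,
    e.g. {"i2p": "i2pbob"}. Hints whose type has no handler are skipped.
    """
    if handlers is None:
        if not fail_open:
            # Fail-closed: refuse to guess which wires are acceptable
            raise ValueError("no hint handlers configured")
        # Fail-open: use every hint exactly as given
        return ["%s:%s" % (hint_type, req) for hint_type, req in hints]
    descriptions = []
    for hint_type, req in hints:
        if hint_type in handlers:
            descriptions.append("%s:%s" % (handlers[hint_type], req))
    return descriptions
```

A node that must stay on I2P would pass only {"i2p": "i2pbob"}, so a clearnet tcp hint in the same FURL is never turned into an endpoint.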

comment:27 in reply to:  24 Changed 11 years ago by str4d

Replying to zooko:

Brian and I just worked up this patch at the GSoC Mentor Summit party.

https://github.com/warner/foolscap/pull/15

Sorry, I didn't get around to reading this earlier.

This patch is incomplete. It assumes that the Twisted Endpoints syntax only allows for keyword arguments. But required parameters don't need to be keyword arguments. "tcp:127.0.0.1:9999" is valid syntax for a TCP endpoint description string.

That decode_location_hints() returns the host and port as a tuple is symptomatic of the larger rewrite that is necessary to move to endpoints. Many wires do have a host/port concept (TCP, UDP, I2P, Tor), but others may not (Unix sockets).

A FURL contains the required hints for a client to find a server. I think that should directly translate to the required arguments for the client endpoints. Instead of storing (type, host, port) for each hint, I propose that either locationHint, (type, locationHint) or (type, reqParams) is stored (where reqParams is locationHint with type: stripped from the front).

This proposal simplifies your patch. You don't need to parse the location hints looking for type-specific parameters, just pass the entire hint through. The only place I see location hints are used is in foolscap.negotiate.TubConnector.connectToAll(), which pops a (host, port) pair and calls reactor.connectTCP() on that pair. This is where clientFromString() would instead be used, based on a generated client endpoint description string:

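from twisted.internet import reactor
from twisted.internet.endpoints import clientFromString

hint_type, reqParams = ('tcp', 'example.com:8080') # From FURL

userOpts = ['tcp:timeout=15', r'i2p:bobEndpoint=127.0.0.1\:2827'] # From config file somewhere
userParams = ''
for opt in userOpts:
    # Match the full "type:" prefix so that 'tcp' does not also match e.g. 'tcp6'
    if opt.startswith(hint_type + ':'):
        userParams = opt[len(hint_type):] # keeps the leading ':'
        break

description = hint_type + ':' + reqParams + userParams
endpoint = clientFromString(reactor, description)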
Last edited 11 years ago by str4d

comment:28 Changed 11 years ago by Zooko

Good catch, str4d. Thanks! I wonder if it would help to spawn off a sub-ticket for just the forward-compatibility patch? The goal is to make a foolscap-0.7.0 which doesn't add functionality, but which reliably ignores weirdo connection-hint-types-from-the-future, and which correctly interprets tcp connection-hints of both current and future formats. It sounds like the current patch doesn't achieve that latter aim.

Anyway, I think there should probably be a sub-ticket, to go into 0.7.0, with this forward-compatibility sub-goal, and then we should come back to #203 after that is fixed.

comment:29 in reply to:  28 Changed 11 years ago by str4d

Replying to zooko:

Good catch, str4d. Thanks! I wonder if it would help to spawn off a sub-ticket for just the forward-compatibility patch? The goal is to make a foolscap-0.7.0 which doesn't add functionality, but which reliably ignores weirdo connection-hint-types-from-the-future, and which correctly interprets tcp connection-hints of both current and future formats. It sounds like the current patch doesn't achieve that latter aim.

Anyway, I think there should probably be a sub-ticket, to go into 0.7.0, with this forward-compatibility sub-goal, and then we should come back to #203 after that is fixed.

Sub-ticket #217 created.

comment:30 Changed 11 years ago by dawuud

I recently asked #twisted on irc:

15:29 < dawuud> my goal is to sockify foolscap (a twisted crypto network api); can i get an IConnector given only a TCP4Endpoint?
16:11 < glyph> dawuud: Why would you want one?  IConnector is kind of a goofy interface.
16:15 < dawuud> glyph: foolscap uses the IConnector https://github.com/warner/foolscap/blob/master/foolscap/negotiate.py
16:15 < dawuud> i guess i could port it to use endpoints
16:15 < glyph> dawuud: uhh
16:16 < glyph> dawuud: the thing called 'connector' in that file does not appear to be a Twisted connector.
16:16 < glyph> in fact it specifically says TubConnector?
16:17 < dawuud>             c = reactor.connectTCP(host, port, f)
16:17 < dawuud> on line 1385
16:19 < glyph> dawuud: just change 1370 to 'c.cancel()' and you're pretty much good
16:19 < glyph> (Except you might also need to change TubConnectorClientFactory to make sure it's not relying on clientConnectionMade/clientConnectionLost)
16:20 < dawuud> glyph: oh perfect! thanks
16:21 < glyph> dawuud: ClientFactory and IConnector are both somewhat silly as interfaces go
16:21 < glyph> dawuud: endpoints are much more consistent

Last edited 11 years ago by dawuud

comment:31 Changed 11 years ago by dawuud

I was compelled to ask about endpoints because I wanted to use txsocksx to make foolscap use a socks proxy to tor : https://github.com/hellais/txsocksx

I have operated a tahoe-lafs client over tor connecting to the onion grid without usewithtor (torsocks). However this was an ugly hack that makes foolscap only use tor: https://github.com/david415/foolscap/commit/cfb8d9734d368245516fcd30b349da2b45a933c3

Here is my unfinished attempt to port to twisted endpoints for the foolscap client: https://github.com/david415/foolscap/commit/3f194b0d40977de9b91b54c947b157ce030ea720

This change converts old-style connection hints to new style and then passes around endpoint descriptors instead of the "connection hints" 3-tuple (type, host, port); we don't want to be parsing the endpoint descriptor and we don't want to strip data from the endpoint descriptor. Instead the intact endpoint descriptors are passed to clientFromString, which returns the endpoint object that foolscap uses to connect; see leif's comment in Tahoe-LAFS trac #517: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/517#comment:15

Along the way I also experimented with a sort of intermediate step; foolscap over tcp only... except that the foolscap client uses the TCP4ClientEndpoint instead of connectTCP : https://github.com/david415/foolscap/commits/tor

These last two patches fail many tests... and in particular first fail foolscap.test.test_appserver.RunCommand...

I suspected that TubConnector's clientConnectionFailed doesn't get called and then I found this twisted doc: https://twistedmatrix.com/documents/current/core/howto/endpoints.html

"""Note: If you've used ClientFactory before, keep in mind that the connect method takes a Factory, not a ClientFactory. Even if you pass a ClientFactory to endpoint.connect, its clientConnectionFailed and clientConnectionLost methods will not be called."""

I'm not sure how to solve this problem... so if anyone knows then please chime in. I think that the next step might be for me to look into how to properly use endpoints in the client, and the difference between clients that use Factory and ClientFactory, so I can understand the advice that Glyph (http://foolscap.lothar.com/trac/ticket/203#comment:30) was giving me.

comment:32 Changed 11 years ago by str4d

zooko, warner and I (and others who I unfortunately don't know handles for) had a lengthy discussion about this at RWC. Proper notes to follow, either here or on https://tahoe-lafs.org/trac/tahoe-lafs/ticket/517

comment:33 Changed 11 years ago by dawuud

i recently posted this to the twisted mailing list:

Greetings,

My goal is to get Brian Warner's foolscap library ported to Twisted endpoints instead of the older Twisted API interfaces it currently uses (ClientFactory and IConnector).

This effort has been documented in foolscap trac ticket #203: http://foolscap.lothar.com/trac/ticket/203

and also part of Tahoe-LAFS trac ticket #517 - make tahoe Tor- and I2P-friendly: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/517

My question is this:

Since Foolscap's TubConnectorClientFactory does rely on clientConnectionFailed... What is the equivalent to this for the new interfaces?

I have tried to braindump all my foolscap twisted endpoint thoughts to foolscap trac ticket #203 : http://foolscap.lothar.com/trac/ticket/203#comment:30 http://foolscap.lothar.com/trac/ticket/203#comment:31

If we look at line 1223 of https://github.com/warner/foolscap/blob/master/foolscap/negotiate.py (at 0395476c7cb154f925d67abf6858a8200126352b)

we see that there's this TubConnectorClientFactory:

class TubConnectorClientFactory(protocol.ClientFactory, object):

and later at line 1384 it is used with connectTCP like this:

f = TubConnectorClientFactory(self, host, lp)
c = reactor.connectTCP(host, port, f)

In my endpoints2 branch (https://github.com/david415/foolscap/tree/endpoints2) I changed it to:

class TubConnectorFactory(protocol.Factory, object):

I suspected that TubConnector's clientConnectionFailed doesn't get called and then I found a relevant quote in this twisted doc: https://twistedmatrix.com/documents/current/core/howto/endpoints.html

here's the quote: """Note: If you've used ClientFactory before, keep in mind that the connect method takes a Factory, not a ClientFactory. Even if you pass a ClientFactory to endpoint.connect, its clientConnectionFailed and clientConnectionLost methods will not be called."""

Currently foolscap's extensive collection of unittests are exercising bugs in my code when I try to port foolscap to twisted endpoints.

foolscap$ trial foolscap.test.test_appserver.RunCommand.test_run
foolscap.test.test_appserver
  RunCommand
    test_run ... [ERROR]

===============================================================================
[ERROR]
Traceback (most recent call last):
Failure: foolscap.tokens.NegotiationError: no connection established within client timeout

foolscap.test.test_appserver.RunCommand.test_run

Ran 1 tests in 60.576s

FAILED (errors=1)
foolscap$

Cheers!

comment:34 Changed 11 years ago by dawuud

and jp kindly responded:

It looks like TubConnector is the primary intended user of TubConnectorClientFactory. In this case, you can simply move the logic from TubConnector.clientConnectionFailed to an errback on the Deferred returned by IStreamClientEndpoint.connect. Or you can leave it where it is and just make clientConnectionFailed be that errback.

For example:

f = TubConnectorClientFactory(self, host, lp)
d = endpoint.connect(f)
d.addErrback(self.clientConnectionFailed)

(removing the connector argument from clientConnectionFailed, of course).

You may want to apply a similar transformation to the *success* case here as well. That is, move the code that initiates protocol actions on the connection to a callback on this Deferred instead of automatically starting those actions in buildProtocol (this gives greater flexibility in how the protocol is used - if you want to change the details of initialization you can do so by using a different callback on the endpoint's Deferred instead of having to mess around with the factory or the protocol to disable this stuff).

Jean-Paul

comment:35 Changed 11 years ago by dawuud

The result of all this talk was that I wrote some code that gets foolscap to use endpoints (client side) and passes the majority of the unit tests; check it out here: https://github.com/david415/foolscap/blob/endpoints_trial1/foolscap/negotiate.py

I have tried to keep this code change as small as possible to make it easier to debug... Here is a small sample that is similar to JP's code suggestion:

            f = TubConnectorFactory(self, host, lp)

            endpointDesc = "tcp:host=%s:port=%s" % (host, port)
            endpoint = clientFromString(reactor, endpointDesc)

            connectDeferred = endpoint.connect(f)
            self.pendingConnections[f] = connectDeferred
            connectDeferred.addErrback(lambda r: self.clientConnectionFailed(f, r))

I also tried to fix some of the unit test results by making this minor change, where we try to decide whether to call clientConnectionFailed or negotiationFailed...

            f = TubConnectorFactory(self, host, lp)

            endpointDesc = "tcp:host=%s:port=%s" % (host, port)
            endpoint = clientFromString(reactor, endpointDesc)

            connectDeferred = endpoint.connect(f)
            self.pendingConnections[f] = connectDeferred

            def handleFailures(factory, reason):
                # reason is a Failure, so use check() rather than isinstance()
                if reason.check(NegotiationError):
                    self.negotiationFailed(factory, reason)
                else:
                    self.clientConnectionFailed(factory, reason)

            # handleFailures is a local function, not a method on self
            connectDeferred.addErrback(lambda r: handleFailures(f, r))

comment:36 Changed 11 years ago by dawuud

OK... I finally fixed my client side port; this makes foolscap use twisted endpoints for the client side: https://github.com/david415/foolscap/tree/endpoint_descriptors3

It passes all the unit tests... I had to fix several unit tests.

And here is another branch where I combine my client side patch and str4d's server side patch: https://github.com/david415/foolscap/tree/endpoint_descriptors_server2

In theory I should be able to use this branch of foolscap with Tahoe-LAFS... and make a small code change to Tahoe so that it registers the txsocksx endpoint parser; that way when foolscap calls clientFromString or serverFromString it will get back the txsocksx endpoint object.

I will try this soon and report back here with my findings...

comment:37 Changed 11 years ago by dawuud

Yes we should have an "always_use_tor" option... however instead of rewriting furls we could perhaps more simply just filter out non-tor connection hints (twisted endpoint descriptors) before passing them to foolscap... AND there seems to be at least one fairly slick and twisted way to do this; we can let clientFromString/serverFromString make our endpoint objects and then check their type.

I wrote a simple tor client endpoint parser... but it's buggy and I need to fix it: https://github.com/david415/txsocksx/blob/endpoint_parsers/examples/tor_client_endpoint_parser.py

After I get my client tor endpoint parser working I'll write the Tor Hidden Service endpoint parser for meejah's txtorcon - https://github.com/meejah/txtorcon

After that I'll try to figure out what all the coding and deployment issues are for Tahoe-LAFS to support these Tor endpoint parsers. I've already done a bunch of research regarding endpoints however I need to figure out what the "best" way to deploy an endpoint parser is...

comment:38 in reply to:  26 Changed 11 years ago by davidsarah

Replying to str4d:

Foolscap can't be ignorant of the wire for several reasons:

  • Authenticated Tubs use TLS, but this is not needed for all wires (I2P, Tor).
    • Is the TubID only for ensuring authenticity of the location hints? pbu:// doesn't use TLS?
    • I2P, Tor don't need TLS, the B32 hash/onion is inherently authenticated.

Foolscap requires a confidential channel (not just authenticated, although confidentiality against MITM attacks requires authentication). It was my impression that Tor doesn't by itself provide a confidential channel. I don't know about I2P.

comment:39 Changed 11 years ago by dawuud

replying to davidsarah:

In the specific case of tor client to tor hidden service, confidentiality is provided. However there are other use cases such as tor client to non-tor service which of course does not provide confidentiality.

In general I disagree with str4d's suggestion that authenticated Tubs are not needed for Tor.

On a slight tangent: It seems likely that the heartbleed bug allowed Tor to be broken in various ways... perhaps there exist powerful spy-entities with more Tor 0-days... and in that case it would be better to have redundant confidentiality layers (not that it would make a difference in the face of heartbleed bug... since it seems likely that both tahoe-lafs and tor could have possibly been broken by it). In the future tor and foolscap may stop using TLS and use something else instead.

comment:40 Changed 11 years ago by dawuud

TL;DR With your continued permission and approval I am going to make sure Tahoe-LAFS has the most beautiful Twisted native Tor integration the world has ever seen... or die trying.

I promise to make the Foolscap changes 100% backwards compatible.

I promise to provide at least 99.99% unit test coverage (not hard... right?) for all code involved in the Tor integration.

I promise to rigorously stick with the design and interface patterns that I mention at the bottom of this comment.

---

replying to warner, his ticket descriptor: " The "type" extension field in Foolscap's "connection hints" are intended to reflect this variety, starting with accomodating both ipv4 and ipv6, but meant to cover other things in the future too (like .i2p addresses which aren't associated with port numbers). "

If we look at how twisted endpoints are used... the idea is to pass an endpoint descriptor string to either clientFromString or serverFromString; we therefore need to keep the endpoint string intact and not have this connection hint tuple situation. My backwards compatibility changes address this issue of Foolscap api interface while maintaining the internal storage of the endpoint strings.

About your other comment: "It might also be good to allow our connection hints to include any string that endpoints.clientFromString is capable of parsing."

Yes, that is precisely the whole point of using Twisted endpoints; to gloriously decouple the application from the wire protocol below it. My foolscap client side endpoints branch does just this.

replying to warner's comment 1: Str4d's comment #4 precisely solves this problem... Let me comment:

comment 1 - http://foolscap.lothar.com/trac/ticket/203#comment:1 comment 4 - http://foolscap.lothar.com/trac/ticket/203#comment:4

Yes for the tcp server endpoint... the object implementing IListeningPort returned by the listen method call does indeed tell you the kernel-chosen tcp port it decided to listen on.

Replying to nejucomo's comment 8 http://foolscap.lothar.com/trac/ticket/203#comment:8 Replying to str4d's comment 9 http://foolscap.lothar.com/trac/ticket/203#comment:9

--->

An IStreamServerEndpoint's [1] listen method returns a Deferred which fires with an IListeningPort [2] whose getHost method returns an IAddress [3].

  1. https://twistedmatrix.com/documents/13.2.0/api/twisted.internet.interfaces.IStreamServerEndpoint.html
  2. https://twistedmatrix.com/documents/13.2.0/api/twisted.internet.interfaces.IListeningPort.html
  3. https://twistedmatrix.com/documents/13.2.0/api/twisted.internet.interfaces.IAddress.html

nejucomo... I fully agree that the Tor Hidden Service endpoint should not expose the internal interface (usually 127.0.0.1) and the local TCP port that was chosen to listen on.

However each component in this beautiful Twisted forest of classes and zope interfaces has a very simple job to do so let's not confuse the behavior of different endpoints:

  • TCP server endpoint: *should* provide an IAddress object that contains the interface IP address and TCP port that the listen is occurring on.

  • Tor Hidden service endpoint: *should* provide an IAddress object that contains the Tor Hidden Service onion address and port. In this case the hidden service port can be different from the TCP port that the application is listening on locally.

We need to provide a way for Tahoe-LAFS/Foolscap to learn its own client-side endpoint descriptor string; a string which any Tahoe/Foolscap client can use with clientFromString to receive a valid endpoint object that lets them talk to the server's listening endpoint.

I am working on writing the new txtorcon Tor Hidden Service endpoint... nejucomo... I am taking your comments *very* seriously. I wrote code because of it:

In this branch https://github.com/david415/txtorcon/tree/endpoint_parser_plugin-rewrite3_tor_onion_address these two commits address the issue of not leaking ip and port information when using Tor Hidden Service endpoints: https://github.com/david415/txtorcon/commit/d21d263032a0013129a730ae6c4d2ef11f5d6eb9 https://github.com/david415/txtorcon/commit/cb8fc25ea5c82bfa884b0f25b33f0f94504cec2a

I wrote the TorOnionAddress class, which implements the IAddress interface, and the TorOnionListeningPort class, which implements the IListeningPort interface. All the unit tests pass and I have personally tested the code to verify that it works correctly. The total design of the endpoint is not spec'ed out... however this small part is complete.

The other design issues with the Tor hidden service endpoint I will not discuss in this ticket because it will not affect Foolscap or Tahoe-LAFS at all.

In the TCP NAT situation we have three choices... in order of correctness :

  1. Create a TCP-Server-NAT endpoint class to encapsulate the address logic. The parser for this endpoint receives the publicly accessible IP address and port fields from the endpoint descriptor string.
  2. Provide a Foolscap API call that manually sets the tub's location to a provided client endpoint descriptor string. In this case Foolscap is still ignorant of the wire protocol... but provides an optional API call to correct the endpoint string that may have been generated automatically. This requires a code change to Tahoe-LAFS: a tahoe.cfg config option that specifies an endpoint descriptor string!
  3. Fail. Fail horribly... the Tahoe-LAFS announced endpoint string will be a valid endpoint string, however the attempted connects will fail because it is an rfc1918 IP address... due to NAT routing.

I will write this TCP NAT endpoint class and parser if I must in order to keep us from choosing a design that breaks the Twisted endpoints philosophy. If I am wrong then please give me an example that helps me understand.

Now replying to str4d's comment 19: http://foolscap.lothar.com/trac/ticket/203#comment:19

"foolscap cannot IMHO be completely ignorant of what wire"

I strongly disagree. We have many moving parts... but each part is only responsible for performing a very simple task:

  1. Tahoe-LAFS - responsible for gathering endpoint descriptor strings from the user and passing them to Foolscap to talk on the network. Tahoe-LAFS will filter or transform endpoint strings based on user-set policy... For instance "Tor only" mode will transform all non-Tor endpoint strings to Tor endpoint strings. In this way "bad acting" introducer nodes cannot trick the client into using a non-Tor endpoint.
  2. Foolscap - responsible for providing endpoint connections given the user-supplied endpoint descriptor strings.
  3. Endpoint objects - responsible for ALL details of the wire protocol.
Last edited 11 years ago by dawuud

comment:41 Changed 10 years ago by dawuud

Revisited a branch that I previously verified as having passed all the unit tests and to my dismay found that it failed several unit tests with a dirty reactor. Last night I fixed all the dirty reactor bugs that the foolscap unit tests told me about.

Here's my latest working code; broken up into client and server: https://github.com/david415/foolscap/tree/endpoint_client1

https://github.com/david415/foolscap/tree/endpoint_server1

comment:42 Changed 10 years ago by Brian Warner

This is awesome stuff. I'm sorry I haven't been participating in the discussion. I just read through this ticket (understanding probably 70% of it, I'll try to improve that with another pass later). A few thoughts:

  • I think we should run TLS over all connections, even if the lower-level provides crypto. It's redundant, but it gives FURLs a stable TubID and gets us consistent security properties, independent of the underlying channel.
  • yes, Tub.listenOn() should take a server endpoint descriptor, or maybe even an actual pre-instantiated Endpoint object.
  • Tub.setLocation() should be called with a list of client endpoint descriptors (pre-colon-concatenated, I guess), regardless of what kinds of listeners / server-endpoints you set up. The location hints provided here are just used to construct FURLs in Tub.registerReference() (they're interpolated into the generated FURL string). In narrow circumstances (publicly-routable ipv4/ipv6 TCP sockets), we could automatically deduce these (and that's what Tub.setLocationAutomatically() is for), but in general it's better to have the application configure this.
    • in particular, cryptographic endpoints are going to have some sort of private key in the server endpoint descriptor, and some sort of public key in the client endpoint descriptor. The function that converts the former to the latter is tightly coupled to the endpoint technology in use. For some exotic types, it might even be the case that you cannot convert the private bits to the public bits (that you must generate both at the same time, from some common seed that you then throw away).
    • even for plain TCP, to convert from tcp:0 (aka "hello kernel, please pick a free port for me") to a client endpoint descriptor requires that we spin up the server and ask it what got allocated
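The tcp:0 point can be illustrated with a plain socket, outside Twisted entirely. This is only a sketch using the standard library's socket module, not Foolscap's or Twisted's API; the helper name listen_on_kernel_port and the tcp:HOST:PORT hint it builds are assumptions made for illustration.

```python
import socket

def listen_on_kernel_port(interface="127.0.0.1"):
    """Bind to port 0 and return (listening socket, port the kernel chose)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((interface, 0))      # port 0: "kernel, please pick a free port"
    s.listen(1)
    port = s.getsockname()[1]   # the allocated port is only known after bind()
    return s, port

sock, port = listen_on_kernel_port()
hint = "tcp:127.0.0.1:%d" % port   # hypothetical client-side hint
print(hint)
sock.close()
```

The client-side string cannot be computed from "tcp:0" alone: the listener has to exist first, which is why the conversion is tied to actually starting the server.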

I like the idea of a safety-filter function (maybe a Tub option) that says "here is a set of endpoint types that I like. Refuse to connect to anything that's not on the list". I don't know how client endpoints are built: can you construct one (to learn its type) without letting it start to connect? If so, great. Maybe the option should just take a predicate function, to pass judgement on each potential endpoint, and then we provide a helper that builds a predicate out of a list of endpoint types. That'd make it easy to limit hostnames and stuff too.

And... maybe make the function take a connection hint and return a client endpoint descriptor (like E's coercing Guards instead of a simple precondition), throwing an exception or returning None to filter things out. That would make it pretty easy to convert i2p:blah into i2pbob:blah to deal with the alternative-APIs that you mentioned earlier (although I'm still not entirely sure where the which-API-to-use decision should be made or implemented).
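A rough sketch of both ideas, in plain Python with no Twisted dependency: a helper that builds a predicate from a list of accepted types, and a "coercing" translator that returns a descriptor string or None. All names here (make_type_predicate, make_coercing_translator, the rewrites mapping) are hypothetical and not part of Foolscap. The translator passes the rest of the hint through unchanged, which a real implementation would need to validate.

```python
def make_type_predicate(allowed_types):
    """Build a predicate that accepts only hints whose TYPE is on the list."""
    allowed = frozenset(allowed_types)
    def predicate(hint):
        return hint.split(":", 1)[0] in allowed
    return predicate

def make_coercing_translator(rewrites):
    """Return a function mapping a hint to a descriptor string, or None.

    `rewrites` maps an accepted hint TYPE to the endpoint type to use,
    e.g. {"i2p": "i2pbob"}. Hints of any other type are filtered out.
    """
    def translate(hint):
        hint_type, sep, rest = hint.partition(":")
        if not sep or hint_type not in rewrites:
            return None             # unknown or malformed: filter it out
        return rewrites[hint_type] + ":" + rest
    return translate

only_tor = make_type_predicate(["tor"])
translate = make_coercing_translator({"i2p": "i2pbob", "tor": "tor"})

hints = ["tcp:example.com:8080", "i2p:blah", "tor:abc.onion:80"]
accepted = [h for h in hints if only_tor(h)]
descriptors = [d for d in map(translate, hints) if d is not None]
```

The predicate form can only accept or reject; the translator form can also rewrite, which is what makes the i2p to i2pbob conversion possible in the same hook.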

comment:43 in reply to:  42 ; Changed 10 years ago by str4d

Replying to warner:

  • I think we should run TLS over all connections, even if the lower-level provides crypto. It's redundant, but it gives FURLs a stable TubID and gets us consistent security properties, independent of the underlying channel.

Having thought much more on this since I last commented here, I am inclined to agree. A stable TubID is important, particularly for nodes available over multiple channels. How this is done is a more complicated issue, because Foolscap can't rely on every endpoint supporting STARTTLS.

  • yes, Tub.listenOn() should take a server endpoint descriptor, or maybe even an actual pre-instantiated Endpoint object.

To support multiple channels, it should take a comma-separated list of server endpoint descriptors. The different endpoints should be considered independent of each other, so each will require its own listener.

  • Tub.setLocation() should be called with a list of client endpoint descriptors (pre-colon-concatenated, I guess), regardless of what kinds of listeners / server-endpoints you set up. The location hints provided here are just used to construct FURLs in Tub.registerReference() (they're interpolated into the generated FURL string). In narrow circumstances (publicly-routable ipv4/ipv6 TCP sockets), we could automatically deduce these (and that's what Tub.setLocationAutomatically() is for), but in general it's better to have the application configure this.
    • in particular, cryptographic endpoints are going to have some sort of private key in the server endpoint descriptor, and some sort of public key in the client endpoint descriptor. The function that converts the former to the latter is tightly coupled to the endpoint technology in use. For some exotic types, it might even be the case that you cannot convert the private bits to the public bits (that you must generate both at the same time, from some common seed that you then throw away).
    • even for plain TCP, to convert from tcp:0 (aka "hello kernel, please pick a free port for me") to a client endpoint descriptor requires that we spin up the server and ask it what got allocated

Generating client endpoint descriptors from server Endpoint ListeningPorts is the subject of dawuud's Twisted ticket #7603.

I like the idea of a safety-filter function (maybe a Tub option) that says "here is a set of endpoint types that I like. Refuse to connect to anything that's not on the list". I don't know how client endpoints are built: can you construct one (to learn its type) without letting it start to connect?

Type is encoded in the endpoint string. All that clientFromString() does is search for a Twisted Client Endpoint plugin that provides that type.

And... maybe make the function take a connection hint and return a client endpoint descriptor (like E's coercing Guards instead of a simple precondition), throwing an exception or returning None to filter things out. That would make it pretty easy to convert i2p:blah into i2pbob:blah to deal with the alternative-APIs that you mentioned earlier (although I'm still not entirely sure where the which-API-to-use decision should be made or implemented).

The which-API-to-use decision is a client decision, and currently the way to specify it is as a keyword parameter in the client endpoint string. I have tried to turn it into configuration options for Endpoints Twisted ticket #7634, but no success convincing them yet.

Last edited 10 years ago by Brian Warner

comment:44 in reply to:  43 Changed 10 years ago by Brian Warner

To support multiple channels, it should take a comma-separated list of server endpoint descriptors. The different endpoints should be considered independent of each other, so each will require its own listener.

Hm, Tub.listenOn() only (currently) takes a single connection string. But you can call it multiple times (a fact that should be added to the docs). So maybe we don't actually need to accommodate multiple descriptors in a single call.

Type is encoded in the endpoint string. All that clientFromString() does is search for a Twisted Client Endpoint plugin that provides that type.

So I guess the "protect me from connecting via anything other than Tor" filter would just be something like this?

hints = [h for h in hints if h.split(":")[0] == "tor"]

Hm, I think that'd work, although if there's any uncertainty about the mapping from hint to endpoint type, I'd worry that an attacker might give me a spiked FURL with the intention of tricking me into connecting via a method that doesn't protect my location. A whitelist works much better than a blacklist here, but it might be even safer to match on actual endpoint class. I dunno.

It's interesting to think about it in those antagonistic terms: as we've discussed before, if you're trying to protect your anonymity, then you probably want a lower-level defense against mistakes, like putting the entire process in a cage (VM with Tor being the only way to get out). Having application-level protections is nice, but I don't know if it'd be sufficient for the serious user.

comment:45 Changed 10 years ago by dawuud

Speaking of lower-level defenses... there's this ptrace and seccomp/BPF based Mbox sandbox thing that isis suggested we use... here's the link: https://lists.torproject.org/pipermail/tor-dev/2014-May/006911.html

comment:46 Changed 10 years ago by dawuud

Warner... with regards to your comments in http://foolscap.lothar.com/trac/ticket/203#comment:42

Yes 2 layers of crypto is fine... Tor on top of Foolscap's transport crypto makes sense I think.

These changesets are 100% backwards compatible. If we want to improve the API shouldn't we do that after we introduce the endpoint descriptor API? All of the Foolscap API calls in these new changesets behave just as they used to, except that all functions that previously accepted "connection hints" now also accept Twisted endpoint descriptor strings. When connection hints are passed, they get converted into a TCP endpoint descriptor string.

Clearly satisfying this twisted trac ticket https://twistedmatrix.com/trac/ticket/7603 would make it possible to get rid of stuff like this: https://github.com/warner/foolscap/blob/master/foolscap/pb.py#L68-L81 or this: https://github.com/david415/foolscap/blob/endpoint_server1/foolscap/pb.py#L97-L114 ...which is an example of Twisted code that uses endpoints... but not in an agnostic way.

comment:47 Changed 10 years ago by Brian Warner

Nathan, dawuud, and I met this afternoon to figure this stuff out. The conclusions we came to:

  • Twisted's "Endpoint Descriptor Strings" are too powerful: applications like Tahoe should not blindly accept and use descriptor strings provided by external parties (e.g. an Introducer). They might e.g. pass ssl:host=foo:port=foo:certKey=/etc/knowncert.pem and thus discover whether you have a certificate at that location or not, and possibly its contents. Or they could use the proposed Tor endpoint syntax (tor:host=:port=) and include socksProxyHost=evil.com to force your device to reveal its IP address to their server.
  • Instead, applications should accept a limited subset of the descriptor language (e.g. just host= and port=) and build a real descriptor from that, carefully preventing injection vulnerabilities that would give the attacker access to the full set of attributes.
  • The flexibility of "any kind of endpoint descriptor" should be reserved for locally-supplied (trusted) strings.
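A minimal sketch of the "limited subset" idea from the list above, assuming the untrusted party supplies only a host and a port as strings. The function name build_tcp_descriptor and the hostname pattern are hypothetical, not part of Foolscap or Twisted. By allowing only letters, digits, dots, and hyphens in the host, no ":" or "=" can reach the descriptor, so extra attributes cannot be injected (the cost is that IPv6 literals are rejected by this particular sketch).

```python
import re

# Hypothetical whitelist: letters, digits, dots and hyphens only.
# No ':' or '=' can get through, so no extra attributes can be injected.
_HOSTNAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?")


def build_tcp_descriptor(host, port):
    """Build a TCP client descriptor string from untrusted host/port strings.

    Raises ValueError instead of passing anything suspicious through.
    """
    if not _HOSTNAME.fullmatch(host):
        raise ValueError("unacceptable host: %r" % (host,))
    if not re.fullmatch(r"[0-9]{1,5}", port):
        raise ValueError("unacceptable port: %r" % (port,))
    port_number = int(port)
    if not 0 < port_number < 65536:
        raise ValueError("port out of range: %r" % (port,))
    return "tcp:host=%s:port=%d" % (host, port_number)
```

The resulting string uses only the host= and port= attributes, so a locally trusted caller could hand it to Twisted's endpoints.clientFromString.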

So instead of changing Foolscap's connection-hint syntax to equal Twisted's endpoint-descriptor syntax (thus encouraging people to blindly accept FURLs' hints and pass them into Twisted's endpoints.clientFromString), our plan is:

  • stick with a connection-hint syntax that says "TYPE:STUFF", and when TYPE=tcp defines STUFF to be HOST:PORT (where PORT is numeric and contains everything past the *last* colon, so HOST can include IPv6 colon-hex addresses), and ignores all other TYPE values
  • use Endpoints *internally* in Foolscap: convert hints into TCP4ClientEndpoint (or TCP6ClientEndpoint, really), change negotiation.py to use endpoint.connect() instead of reactor.connectTCP()
  • add a Tub.registerConnectionHintTranslator() (or something similar) which lets the application override the function that turns hints into endpoints. The default function handles TYPE=tcp and ignores the rest.
  • The Endpoint is required to accept endpoint.startTLS(), and we require that twisted.internet.ssl.Certificate.peerFromTransport(endpoint) does the right thing. This should be trivial for endpoints that use a real TCP socket (but point it at a proxy of some sort), and may be impossible for more complex transports that provide a non-TCP-backed stream connection.
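The "TYPE:STUFF" parsing rule in the plan above could look roughly like this. It is only a sketch: parse_tcp_hint is a hypothetical helper name, and it returns a plain (host, port) tuple rather than a real Twisted endpoint object. The point it illustrates is splitting on the *last* colon so that IPv6 colon-hex hosts survive, and ignoring every TYPE other than tcp.

```python
import re


def parse_tcp_hint(hint):
    """Return (host, port) for a "tcp:HOST:PORT" hint, or None to ignore it."""
    hint_type, sep, stuff = hint.partition(":")
    if not sep or hint_type != "tcp":
        return None  # unknown TYPE values are ignored, not errors
    # PORT is everything past the last colon; HOST may itself contain colons.
    host, sep, port = stuff.rpartition(":")
    if not sep or not host or not re.fullmatch(r"[0-9]+", port):
        return None
    return host, int(port)
```

A default hint translator would then build a TCP client endpoint from the returned pair, while an application-registered translator could recognize additional TYPE values.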

Then Tahoe (or other applications that want more than TCP) can register a handler function which accepts e.g. tor:HOST:PORT or i2p:HOST:PORT and constructs the appropriate Endpoint object. This handler will combine static/secret/powerful information from the Tahoe config (like the torSocksHost= and torSocksPort= to use) with the per-connection/less-powerful information from the FURL's connection hint. If the handler gets a hint type it doesn't recognize, it will return None to ignore it.

When Tahoe is configured to use Tor for everything, it should return a TorEndpoint for both tcp: and tor: hints, and never ever return a privacy-leaking TCP4ClientEndpoint.
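As an illustration of that policy, here is a sketch of a Tor-only handler. Everything in it is hypothetical: EndpointPlan stands in for a real endpoint object, and make_tor_only_handler is an invented name. It shows the two properties described above: the SOCKS settings come only from local configuration, and both tcp: and tor: hints are routed through Tor, so a plain TCP connection is never produced.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for a real endpoint object.
EndpointPlan = namedtuple("EndpointPlan", "kind host port socks_host socks_port")


def make_tor_only_handler(socks_host, socks_port):
    """Build a hint handler from trusted, locally configured SOCKS settings."""

    def handler(hint):
        hint_type, sep, stuff = hint.partition(":")
        if not sep or hint_type not in ("tcp", "tor"):
            return None  # unrecognized hint type: ignore it
        host, sep, port = stuff.rpartition(":")
        if not sep or not host or not re.fullmatch(r"[0-9]+", port):
            return None
        # Only host and port come from the (untrusted) hint; the proxy
        # settings are never taken from it.
        return EndpointPlan("tor", host, int(port), socks_host, socks_port)

    return handler
```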

If an application really wants to, it can just use twisted.internet.endpoints.clientFromString as the handler, and get full forward-compatible flexibility (and security vulnerabilities galore), assuming foolscap's legacy HOST:PORT would still work (which it probably wouldn't).

Later, we'll look into allowing Tub.listenOn() to accept a server-type Endpoint, or maybe an endpoint descriptor string.

comment:48 Changed 10 years ago by Brian Warner

Milestone: eventually → 0.8.0

comment:49 Changed 10 years ago by dawuud

Hey I'd like to point out that Warner's idea to create a registry of handlers can also be applied to the problem described here:

https://twistedmatrix.com/trac/ticket/7603

I was originally thinking to get this changed in Twisted... and even if that were possible I think we'd best add this functionality to Foolscap so that server-side Foolscap endpoint usage will not be blocked on Twisted changes getting merged upstream.

That is... the problem of producing a client connection hint out of an abstract server-side "hint" such that the dynamic sections of the client connection hint are accurately produced.

I have these two Tahoe-LAFS/Foolscap use cases in mind:

  1. an ephemeral IP-addressed TCP storage server that upon startup reads its tahoe.cfg and sees a TCP server endpoint or connection hint that specifies a zero port... meaning that the kernel should randomly select an available port. After we start listening we generate the "client side endpoint string" and announce it to the Introducer.
  2. an ephemeral Tor hidden service storage server - upon startup a server side endpoint string like this is read:

onion:port=80

which means that we do not already have any Tor hidden service key material... so we ask the tor process to generate it for us AND add this tor hidden service. After this Tor hidden service is created we procure the client side endpoint string for this new ephemeral Tor hidden service, and announce it to the Introducer.

This should produce a client endpoint string that looks something like this:

tor:veryFluffy.onion:669

A handler plugin registry system would make it possible for the server side of Foolscap to use random address generation features of server endpoints. At least TCP and Tor hidden services have this feature... but perhaps others as well.

comment:50 Changed 9 years ago by Brian Warner

I'm in the process of switching Tub.listenOn() to accept a twisted "server endpoint description string", instead of the current tcp:%d[:interface=%s] subset. The one restriction is that connections on the resulting IStreamServerEndpoint must support startTLS(). That happens to be the case for TCP-ish server endpoints, but it's possible to specify an endpoint descriptor that won't actually work when a connection is finally established.

comment:51 Changed 9 years ago by Brian Warner

Component: negotiation → network

comment:52 Changed 9 years ago by Brian Warner

Description: modified (diff)
Milestone: 0.9.0 → 0.9.1
Summary: switch to using Endpoints → change Tub.listenOn() to use Endpoints

comment:53 Changed 9 years ago by Brian Warner

Milestone: 0.10.0 → 0.11.0

the logging fix takes priority, pushing this out a release

comment:54 Changed 9 years ago by Brian Warner

Milestone: 0.11.0 → 0.12.0

Milestone renamed

comment:55 Changed 9 years ago by Brian Warner

Reviewing this just now, I'm much more in favor of passing a real Endpoint to Tub.listenOn(ep), instead of a string. The caller can be responsible for mapping from whatever configuration language their application uses (e.g. tahoe.cfg strings, or some --listen-on= CLI argument), into an endpoint of their choosing. No need for listening-side plugins.

At the same time, I think we'll remove the dynamic what-is-my-address what-is-my-port stuff that Listeners were providing before. We'll deprecate the listenOn("tcp:0") case (as discussed in tahoe#517, and now tracked in foolscap #252). (To be picky, you can still call tub.listenOn("tcp:0"), but it's a bad idea because you'll have no way to find out what port got allocated, so you won't know what to pass into tub.setLocation().)
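One way an application might cope without "tcp:0" is to pick a port itself before calling listenOn(). The following is only a sketch using the standard library. The helper name allocate_tcp_port, the hostname example.org, and the exact strings built at the end are assumptions for illustration. There is also an unavoidable race: another process could grab the port between the probe and the real listen.

```python
import socket


def allocate_tcp_port(interface="0.0.0.0"):
    """Ask the kernel for a currently free TCP port and return its number.

    The probe socket is closed before returning, so the port is only
    *probably* still free when the caller listens on it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((interface, 0))  # port 0: let the kernel choose
        return probe.getsockname()[1]


port = allocate_tcp_port()
# Strings the application could then pass on (illustrative values only):
listen_description = "tcp:%d" % port            # for tub.listenOn()
location_hint = "tcp:example.org:%d" % port     # for tub.setLocation()
```

Because the application chose the port, it knows what to advertise, and it can save the number to reuse on the next start.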

Last edited 9 years ago by Brian Warner

comment:56 Changed 9 years ago by Brian Warner

Oh, I should mention that tub.listenOn() can still accept a string, which will be passed to the normal twisted.internet.endpoints.serverFromString() function. We'll make sure that existing applications keep working as usual.
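A sketch of how an "endpoint or string" argument could be normalized, assuming Python 3. The function to_server_endpoint and its parse_description parameter are hypothetical and are not Foolscap's actual implementation. The only real Twisted calls are serverFromString(reactor, description) and the reactor import, which are deferred until a string is actually passed.

```python
def to_server_endpoint(where, parse_description=None):
    """Return a server endpoint for either a description string or an endpoint.

    parse_description is a hypothetical hook (useful for tests); by default
    strings go to twisted.internet.endpoints.serverFromString.
    """
    if isinstance(where, str):
        if parse_description is None:
            # Imported lazily so endpoint objects work without this path.
            from twisted.internet import reactor
            from twisted.internet.endpoints import serverFromString
            return serverFromString(reactor, where)
        return parse_description(where)
    # Duck-typed check: a stream server endpoint provides listen(factory).
    if callable(getattr(where, "listen", None)):
        return where
    raise TypeError("expected a description string or an endpoint, got %r"
                    % (where,))
```

With this shape, existing callers that pass "tcp:12345" keep working, while new callers can construct any endpoint they like and pass it in directly.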

comment:58 Changed 9 years ago by Brian Warner

Resolution: fixed
Status: new → closed

Oops, this was supposed to be automatically closed by the 203-server-endpoint branch (landed in [98c28e08]). Something must have gone wrong with the trac/git notification hook.

Closing it manually now.
