Opened 14 years ago

Closed 11 years ago

#150 closed enhancement (wontfix)

HTTP proxy support

Reported by: duck
Owned by: duck
Priority: minor
Milestone: undecided
Component: negotiation
Version: 0.5.0
Keywords: i2p http proxy transport test-needed i2p-collab
Cc: Zooko, zooko@…, killyourtv@…

Description

To use Tahoe-LAFS over the I2P anonymous network I have added HTTP proxy support to the Foolscap library.

The HTTP proxy would be configured by specifying:

    tub.setOption("httpProxy", "127.0.0.1:4444")

The Tahoe-LAFS ticket for this is #1007.

Attachments (3)

150_http_proxy.txt (3.1 KB) - added by duck 14 years ago.
HTTP proxy support in Negotiate
150-http-proxy.patch (3.0 KB) - added by duck 13 years ago.
HTTP proxy support in Negotiate v2
150-proxy.patch (3.4 KB) - added by killyourtv 11 years ago.

Change History (20)

Changed 14 years ago by duck

Attachment: 150_http_proxy.txt added

HTTP proxy support in Negotiate

comment:1 Changed 14 years ago by duck

See also #151 about accepting I2P destinations.

comment:2 Changed 14 years ago by duck

Keywords: review-needed added

Please review whether you agree with the approach in general. Suggestions on how to unit test this would be most welcome!

comment:3 Changed 14 years ago by Brian Warner

I agree with the approach in general. It's not clear to me that a standard HTTP proxy can handle the binary foolscap protocol: the first "hello" message is made to look HTTPish, but unless the proxy immediately jumps into a byte-relay mode after the HTTP response, the rest of the protocol will get confused. OTOH, if you got this to work with I2P's proxy, then I guess it's good enough.

I'd just be worried about labeling this option "httpProxy" if "i2p-proxy" would be a more accurate name.

Also, I'd prefer "http-proxy" over camel-case "httpProxy", since the other options use words-joined-with-hyphens style.

To test it, you'll need to build a similarly-behaving HTTP proxy (probably using twisted.web) that can be launched at the start of a unit test, then do a Tub.getReference(i2p_furl) and confirm that you can send data across the connection, then shut down the Tubs and the proxy. Not trivial, I'm afraid. It might be reasonable to instead make a modified Listener that interprets the incoming tubid differently (recording the value it receives for later checking, then stripping the http.. prefix before doing the normal tubid lookup). But I'm not sure that would exercise enough.

Sorry to be so slow in responding!

comment:4 in reply to:  3 Changed 14 years ago by Zooko

Owner: set to duck

Replying to warner:

To test it, you'll need to build a similarly-behaving HTTP proxy (probably using twisted.web) that can be launched at the start of a unit test, then do a Tub.getReference(i2p_furl) and confirm that you can send data across the connection, then shut down the Tubs and the proxy. Not trivial, I'm afraid. It might be reasonable to instead make a modified Listener that interprets the incoming tubid differently (recording the value it receives for later checking, then stripping the http.. prefix before doing the normal tubid lookup). But I'm not sure that would exercise enough.

Hm... it shouldn't be so hard to test this patch. The patch is merely changing the text of the GET requests and changing the arguments that get passed to TubConnectorClientFactory and to connectTCP, right? So couldn't it be tested just by putting mocks in place of TubConnectorClientFactory, connectTCP, and Negotiation.transport.write? Then this code could run, and then the test code could ask those three mock objects if this code had done the right thing.

(This is a general theme I've noticed in my own work recently—if you're willing to just test this particular patch using mock objects, instead of testing the full stack of code which gets called from this code or which receives network traffic sent by this code, then testing is often much easier.)
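A minimal sketch of the mock-based approach described above. It assumes a hypothetical helper, `connect_through_proxy`, that stands in for the patched connection code; it is not foolscap's actual API, and the `http_proxy` argument name is made up for illustration. The reactor and the factory are replaced by mocks, and the test only asks the mock whether `connectTCP` was called with the proxy's address instead of the target's.

```python
from unittest import mock


def connect_through_proxy(reactor, factory, target_host, target_port,
                          http_proxy=None):
    """Hypothetical stand-in for the patched connection code.

    If http_proxy ("host:port") is set, connect to the proxy instead of
    the target; otherwise connect to the target directly.
    """
    if http_proxy is not None:
        host, _, port = http_proxy.rpartition(":")
        return reactor.connectTCP(host, int(port), factory)
    return reactor.connectTCP(target_host, target_port, factory)


def check_proxy_is_used():
    reactor = mock.Mock()   # replaces the real Twisted reactor
    factory = mock.Mock()   # replaces TubConnectorClientFactory
    connect_through_proxy(reactor, factory, "example.i2p", 80,
                          http_proxy="127.0.0.1:4444")
    # The mock records the call, so no network traffic is needed.
    reactor.connectTCP.assert_called_once_with("127.0.0.1", 4444, factory)
    return reactor.connectTCP.call_args


if __name__ == "__main__":
    check_proxy_is_used()
```

The same pattern would extend to a mock in place of `Negotiation.transport.write`, checking the text of the GET request that the patched code writes.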

comment:5 Changed 14 years ago by Zooko

Cc: Zooko zooko@… added

comment:6 Changed 13 years ago by Zooko

duck seems to have disappeared. I hope they are okay and will return!

In the meantime, maybe Brian and I should use this patch as an opportunity to try mock-based testing.

Changed 13 years ago by duck

Attachment: 150-http-proxy.patch added

HTTP proxy support in Negotiate v2

comment:7 Changed 13 years ago by duck

Brian's suggestion to use http-proxy instead of httpProxy has been taken.

I disagree with allowing a connection to bypass this proxy for non-I2P addresses while the setting is given; this would allow for anonymity leaks.

I disagree with calling this i2p-proxy. In theory this should work with other HTTP 1.1 proxies that support keep-alive connections (those are rare), except that the Banana protocol and/or the way the HTTP protocol is implemented in foolscap doesn't pass the HTTP parsers of a proxy like polipo. I consider this to be outside the scope of this ticket.

Unit tests as suggested by zooko have yet to be written.

Changed 11 years ago by killyourtv

Attachment: 150-proxy.patch added

comment:8 Changed 11 years ago by killyourtv

Cc: killyourtv@… added

The patch has been refactored to apply to 0.6.4. It also reflects the naming change requested at https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1007

comment:9 Changed 11 years ago by killyourtv

Keywords: test-needed added

comment:10 Changed 11 years ago by killyourtv

Keywords: review-needed removed

comment:11 Changed 11 years ago by Zooko

Adding the i2p-collab tag, which is the same spelling used on the Tahoe-LAFS trac to indicate a ticket that is currently blocking the I2P project from using trunk.

comment:12 Changed 11 years ago by Zooko

Keywords: i2p-collab added

comment:13 Changed 11 years ago by Zooko

I looked at this a little today, with Jeff "psi"'s help. My current question is: would it be better if foolscap supported Twisted endpoints (foolscap ticket #203) instead of this http proxy setting? In that case, foolscap would, if I understand correctly, be able to talk over I2P, as well as with little or no extra work on our part, talk over Tor and maybe other cool stuff like cjdns.

Also, Twisted endpoints feel like a more appropriate tool for this, to me, than an http proxy, because I don't think of foolscap's communications needs as being "HTTP-like", i.e. request-response transactions. I think of them as being "TCP-like", i.e. a bidirectional byte-pipe. Are there some performance problems or implementation-complexity problems to using an http proxy as a way to get foolscap to talk over I2P? Or is this just a misunderstanding on my part?

comment:14 Changed 11 years ago by str4d

Replying to zooko:

I looked at this a little today, with Jeff "psi"'s help. My current question is: would it be better if foolscap supported Twisted endpoints (foolscap ticket #203) instead of this http proxy setting? In that case, foolscap would, if I understand correctly, be able to talk over I2P, as well as with little or no extra work on our part, talk over Tor and maybe other cool stuff like cjdns.

Based on my understanding of Twisted endpoints (obtained from https://twistedmatrix.com/documents/12.3.0/core/howto/endpoints.html):

Yes, Twisted endpoints would be better than this http proxy setting. It makes life for the foolscap (and Tahoe) devs easier because they don't need to care about what transport is used, and it makes life for users easier because setup becomes much simpler.

Right now, to use Tahoe over I2P, users need to manually configure their patched Tahoe to point to an HTTP proxy client tunnel so they can make outgoing connections, and carefully configure their client to not leak IP information. For incoming connections to storage nodes, they need to separately configure a server tunnel, and carefully modify their config to both not leak IP information and to broadcast the correct I2P address. Tor would involve a similar process (but using HSs). For the average user, this is Not Simple (TM).

With endpoint support, the user would only need to install the third-party endpoint plugin for their transport (I2P/Tor/...) and paste a single, likely-unchanging line into their config file. The I2P plugin would handle creation and management of the required I2P tunnels, and there would be no danger of IP leaks because the TCP endpoint would not be used at all and the app would not have (or need) any concept of local IP.

Also, Twisted endpoints feel like a more appropriate tool for this, to me, than an http proxy, because I don't think of foolscap's communications needs as being "HTTP-like", i.e. request-response transactions. I think of them as being "TCP-like", i.e. a bidirectional byte-pipe. Are there some performance problems or implementation-complexity problems to using an http proxy as a way to get foolscap to talk over I2P? Or is this just a misunderstanding on my part?

As far as I2P is concerned, there is no performance issue requiring HTTP proxy use - the HTTP client tunnel is just a special-cased regular tunnel with proxying support for ease of use, and header filtering to help protect users. The main barrier for I2P is implementation complexity - apps should ideally be written with knowledge of I2P, both for privacy/anonymity concerns and for best performance. To my knowledge, the reason duck used the HTTP proxy implementation was because it was a simple but effective patch. There is in fact another (abandoned) version of Tahoe for I2P, where the dev made more fundamental modifications to talk with I2P that could be turned on/off in the config.

All this is superseded by Twisted endpoints - the transport-specific knowledge can be abstracted into the plugin. This is very beneficial for I2P/Tor because any app using Twisted endpoints could immediately be set up inside the network. Of course, developers should still be aware that their software could be used in anonymity networks and design appropriately.

comment:15 in reply to:  3 Changed 11 years ago by nejucomo

Replying to warner:

I agree with the approach in general. It's not clear to me that a standard HTTP proxy can handle the binary foolscap protocol: the first "hello" message is made to look HTTPish, but unless the proxy immediately jumps into a byte-relay mode after the HTTP response, the rest of the protocol will get confused. OTOH, if you got this to work with I2P's proxy, then I guess it's good enough.

I wrote a long comment about how foolscap should follow websockets by using an Upgrade: header, then read negotiate.py where it's clearly commented that it does.

So the unknown seems to be: can "standard" HTTP proxies correctly proxy this upgraded protocol? It seems the same issue should apply to websockets, unless specs or implementations have hard-coded special cases specific to websockets.

A quick skim of the first part of "How HTML5 Web Sockets Interact With Proxy Servers" suggests that browsers, when they know that a connection is for websockets, will use the proxy CONNECT method with non-transparent (explicit) proxies. So I would assume a general foolscap "web proxy" setting to have the same behavior and similar interoperability.

This patch is different, and I have less confidence it will interoperate correctly with general HTTP proxies, although it may work specifically with an I2P-specific proxy.

Therefore:

I'd just be worried about labeling this option "httpProxy" if "i2p-proxy" would be a more accurate name.

+1

Relying on the behavior of some implementations or custom protocols calls for a more specific name, IMO.

EDIT: Removed a lot of verbosity about rfc2616 and speculation about how foolscap negotiation should do what it already does. ;-)

Last edited 11 years ago by nejucomo

comment:16 in reply to:  14 Changed 11 years ago by nejucomo

Replying to str4d:

Replying to zooko:

I looked at this a little today, with Jeff "psi"'s help. My current question is: would it be better if foolscap supported Twisted endpoints (foolscap ticket #203) instead of this http proxy setting? In that case, foolscap would, if I understand correctly, be able to talk over I2P, as well as with little or no extra work on our part, talk over Tor and maybe other cool stuff like cjdns.

Based on my understanding of Twisted endpoints (obtained from https://twistedmatrix.com/documents/12.3.0/core/howto/endpoints.html):

Yes, Twisted endpoints would be better than this http proxy setting. It makes life for the foolscap (and Tahoe) devs easier because they don't need to care about what transport is used, and it makes life for users easier because setup becomes much simpler.

Right now, to use Tahoe over I2P, users need to manually configure their patched Tahoe to point to an HTTP proxy client tunnel so they can make outgoing connections, and carefully configure their client to not leak IP information. For incoming connections to storage nodes, they need to separately configure a server tunnel, and carefully modify their config to both not leak IP information and to broadcast the correct I2P address. Tor would involve a similar process (but using HSs). For the average user, this is Not Simple (TM).

With endpoint support, the user would only need to install the third-party endpoint plugin for their transport (I2P/Tor/...) and paste a single, likely-unchanging line into their config file. The I2P plugin would handle creation and management of the required I2P tunnels, and there would be no danger of IP leaks because the TCP endpoint would not be used at all and the app would not have (or need) any concept of local IP.

Agreed with all of this: for Tor / I2P use cases, specialized endpoints are greatly advantageous for both user configuration simplicity and, importantly, IP address confidentiality correctness.

However, there may be other reasons to use an HTTP proxy. The safest / most interoperable usage of an HTTP proxy should be through a CONNECT method. Foolscap should *not* deal with this. Instead, there should be a twisted endpoint for HTTP proxies that makes a simple CONNECT and then uses the resulting stream.
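To illustrate what "a simple CONNECT" involves, here is a rough sketch of the two pieces such an endpoint would need: building the request and checking the proxy's reply. The function names are hypothetical and nothing here is Twisted or foolscap API; it only shows the wire format. After a 2xx reply the connection behaves as a plain bidirectional byte pipe to the target, which is what foolscap wants.

```python
def build_connect_request(host, port):
    """Return the bytes of an HTTP/1.1 CONNECT request for host:port."""
    authority = "%s:%d" % (host, port)
    lines = [
        "CONNECT %s HTTP/1.1" % authority,
        "Host: %s" % authority,
        "",   # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def connect_succeeded(response_head):
    """Return True if the proxy's status line carries a 2xx code.

    response_head is the raw bytes of the proxy's reply, starting with
    the status line (e.g. b"HTTP/1.1 200 Connection established").
    """
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/")
        and len(parts[1]) == 3
        and parts[1][:1] == b"2"
    )
```

An endpoint built this way would write the request, read up to the blank line, and only then hand the transport to the wrapped protocol.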

Also, Twisted endpoints feel like a more appropriate tool for this, to me, than an http proxy, because I don't think of foolscap's communications needs as being "HTTP-like", i.e. request-response transactions. I think of them as being "TCP-like", i.e. a bidirectional byte-pipe. Are there some performance problems or implementation-complexity problems to using an http proxy as a way to get foolscap to talk over I2P? Or is this just a misunderstanding on my part?

As far as I2P is concerned, there is no performance issue requiring HTTP proxy use - the HTTP client tunnel is just a special-cased regular tunnel with proxying support for ease of use, and header filtering to help protect users.

I feel like there's some confusion about what kind of HTTP proxying is happening here. If the I2P HTTP proxy is filtering headers, then it is operating on HTTP requests and responses, such as is typical of HTTP proxies for all methods except CONNECT. OTOH, foolscap wants a generic bi-directional stream, so it wants functionality similar to an HTTP proxy's CONNECT support.

Anyway, this is the kind of detail a twisted endpoint plugin for I2P would manage, rather than foolscap, and it sounds like we all agree that's the best approach.

The main barrier for I2P is implementation complexity - apps should ideally be written with knowledge of I2P, both for privacy/anonymity concerns and for best performance. To my knowledge, the reason duck used the HTTP proxy implementation was because it was a simple but effective patch. There is in fact another (abandoned) version of Tahoe for I2P, where the dev made more fundamental modifications to talk with I2P that could be turned on/off in the config.

All this is superseded by Twisted endpoints - the transport-specific knowledge can be abstracted into the plugin. This is very beneficial for I2P/Tor because any app using Twisted endpoints could immediately be set up inside the network. Of course, developers should still be aware that their software could be used in anonymity networks and design appropriately.

+1

So... should we close this ticket as wontfix and focus on the twisted endpoints feature?

There may be users who need actual HTTP proxies, not I2P (or Tor), but I'm of the opinion HTTP proxy support should also be a twisted endpoint. If one does not already exist, I'd want to file a feature request over on twisted (or better yet, just write one).

There's actually a wrinkle with an "HTTP Proxy Endpoint": often applications want to connect to many different destinations with the same proxy setting, so to provide this kind of functionality the application would need to take a proxy config setting and then mangle any to-be-proxied request. For example, by changing: tcp:127.0.0.1:80 to http-proxy:127.0.0.1:80...
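A rough sketch of that mangling step, assuming endpoint descriptors are plain strings and that `http-proxy` is a hypothetical endpoint type (it is the name used in the example above, not one known to exist in Twisted). The helper name is also made up. It only swaps the type prefix; how the proxy's own address would be carried in the descriptor is left open here, as it is in the example.

```python
def to_proxied_descriptor(descriptor, proxy_type="http-proxy"):
    """Rewrite a 'tcp:...' endpoint descriptor to use a proxy endpoint type.

    'http-proxy' is a hypothetical endpoint name. Descriptors that are
    not of type 'tcp' are returned unchanged.
    """
    prefix, sep, rest = descriptor.partition(":")
    if sep and prefix == "tcp":
        return "%s:%s" % (proxy_type, rest)
    return descriptor


if __name__ == "__main__":
    print(to_proxied_descriptor("tcp:127.0.0.1:80"))
```

Leaving non-TCP descriptors untouched is a design choice of this sketch; an application concerned about anonymity leaks might prefer to reject them instead.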

comment:17 Changed 11 years ago by Zooko

Resolution: wontfix
Status: new → closed

Closing as "wontfix" because of the consensus (among people who've posted to this ticket, notably not Brian) that Endpoints would be better.
