Opened 15 years ago
Closed 11 years ago
#150 closed enhancement (wontfix)
HTTP proxy support
Reported by: | duck | Owned by: | duck |
---|---|---|---|
Priority: | minor | Milestone: | undecided |
Component: | negotiation | Version: | 0.5.0 |
Keywords: | i2p http proxy transport test-needed i2p-collab | Cc: | Zooko, zooko@…, killyourtv@… |
Description
To use Tahoe-LAFS over the I2P anonymous network I have added HTTP proxy support to the Foolscap library.
The HTTP proxy would be configured by specifying:
tub.setOption("httpProxy", "127.0.0.1:4444")
The Tahoe-LAFS ticket for this is #1007.
Attachments (3)
Change History (20)
Changed 15 years ago by
Attachment: | 150_http_proxy.txt added |
---|
comment:2 Changed 15 years ago by
Keywords: | review-needed added |
---|
Please review if you agree with the approach in general. Your suggestion how to unit test would be most welcome!
comment:3 follow-ups: 4 15 Changed 14 years ago by
I agree with the approach in general. It's not clear to me that a standard HTTP proxy can handle the binary foolscap protocol: the first "hello" message is made to look HTTPish, but unless the proxy immediately jumps into a byte-relay mode after the HTTP response, the rest of the protocol will get confused. OTOH, if you got this to work with I2P's proxy, then I guess it's good enough.
I'd just be worried about labeling this option "httpProxy" if "i2p-proxy" would be a more accurate name.
Also, I'd prefer "http-proxy" over camel-case "httpProxy", since the other options use words-joined-with-hyphens style.
To test it, you'll need to build a similarly-behaving HTTP proxy (probably using twisted.web) that can be launched at the start of a unit test, then do a Tub.getReference(i2p_furl) and confirm that you can send data across the connection, then shut down the Tubs and the proxy. Not trivial, I'm afraid. It might be reasonable to instead make a modified Listener that interprets the incoming tubid differently (recording the value it receives for later checking, then stripping the http.. prefix before doing the normal tubid lookup). But I'm not sure that would exercise enough.
Sorry to be so slow in responding!
comment:4 Changed 14 years ago by
Owner: | set to duck |
---|
Replying to warner:
To test it, you'll need to build a similarly-behaving HTTP proxy (probably using twisted.web) that can be launched at the start of a unit test, then do a Tub.getReference(i2p_furl) and confirm that you can send data across the connection, then shut down the Tubs and the proxy. Not trivial, I'm afraid. It might be reasonable to instead make a modified Listener that interprets the incoming tubid differently (recording the value it receives for later checking, then stripping the http.. prefix before doing the normal tubid lookup). But I'm not sure that would exercise enough.
Hm... it shouldn't be so hard to test this patch. The patch is merely changing the text of the GET requests and changing the arguments that get passed to TubConnectorClientFactory and to connectTCP, right? So couldn't it be tested just by putting mocks in place of TubConnectorClientFactory, connectTCP, and Negotiation.transport.write? Then this code could run, and then the test code could ask those three mock objects if this code had done the right thing.
(This is a general theme I've noticed in my own work recently—if you're willing to just test this particular patch using mock objects, instead of testing the full stack of code which gets called from this code or which receives network traffic sent by this code, then testing is often much easier.)
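A minimal sketch of the mock-based approach described above, using Python's standard unittest.mock. The helpers connect_via_proxy and send_hello, and the exact layout of the GET line, are hypothetical stand-ins for the patched code paths, not foolscap's actual implementation. The point is only that the test inspects what was passed to connectTCP and transport.write, without opening a socket.

```python
from unittest import mock


def connect_via_proxy(reactor, proxy_setting, factory):
    # Hypothetical stand-in for the patched connector: split the
    # "host:port" proxy setting and dial the proxy, not the target.
    host, _, port = proxy_setting.rpartition(":")
    return reactor.connectTCP(host, int(port), factory)


def send_hello(transport, tubid, target_host):
    # Hypothetical stand-in for the patched negotiation hello: the GET
    # line names the real destination so the proxy knows where to relay.
    line = "GET http://%s/id/%s HTTP/1.1\r\n" % (target_host, tubid)
    transport.write(line.encode("ascii"))


def test_dials_proxy_not_target():
    reactor = mock.Mock()              # replaces the real reactor
    factory = mock.sentinel.factory    # opaque placeholder object
    connect_via_proxy(reactor, "127.0.0.1:4444", factory)
    reactor.connectTCP.assert_called_once_with("127.0.0.1", 4444, factory)


def test_hello_names_target():
    transport = mock.Mock()            # replaces Negotiation.transport
    send_hello(transport, "abc123", "example.i2p")
    written = transport.write.call_args[0][0]
    assert written.startswith(b"GET http://example.i2p/id/abc123 ")
```

Each test exercises only the code under change, so neither a running proxy nor a second Tub is needed.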
comment:5 Changed 14 years ago by
Cc: | Zooko zooko@… added |
---|
comment:6 Changed 14 years ago by
duck seems to have disappeared. I hope they are okay and will return!
In the meantime, maybe Brian and I should use this patch as an opportunity to try mock-based testing.
comment:7 Changed 14 years ago by
Brian's suggestion to use http-proxy instead of httpProxy has been taken.
I disagree with allowing a connection to bypass this proxy for non-I2P addresses while the setting is given; this would allow for anonymity leaks.
I disagree with calling this i2p-proxy. In theory this should work with other HTTP 1.1 proxies that support keep-alive connections (those are rare), except that the Banana protocol and/or the way the HTTP protocol is implemented in foolscap doesn't pass the HTTP parsers of a proxy like polipo. I consider this to be outside the scope of this ticket.
Unit tests as suggested by zooko have yet to be written.
Changed 11 years ago by
Attachment: | 150-proxy.patch added |
---|
comment:8 Changed 11 years ago by
Cc: | killyourtv@… added |
---|
The patch has been refactored to apply to 0.6.4. It also reflects the naming change requested at https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1007
comment:9 Changed 11 years ago by
Keywords: | test-needed added |
---|
comment:10 Changed 11 years ago by
Keywords: | review-needed removed |
---|
comment:11 Changed 11 years ago by
Adding the i2p-collab tag, which is the same spelling used on the Tahoe-LAFS trac to indicate a ticket that is currently blocking the I2P project from using trunk.
comment:12 Changed 11 years ago by
Keywords: | i2p-collab added |
---|
comment:13 Changed 11 years ago by
I looked at this a little today, with Jeff "psi"'s help. My current question is: would it be better if foolscap supported Twisted endpoints (foolscap ticket #203) instead of this http proxy setting? In that case, foolscap would, if I understand correctly, be able to talk over I2P, as well as with little or no extra work on our part, talk over Tor and maybe other cool stuff like cjdns.
Also, Twisted endpoints feel like a more appropriate tool for this, to me, than an http proxy, because I don't think of foolscap's communications needs as being "HTTP-like", i.e. request-response transactions. I think of them as being "TCP-like", i.e. a bidirectional byte-pipe. Are there some performance problems or implementation-complexity problems to using an http proxy as a way to get foolscap to talk over I2P? Or is this just a misunderstanding on my part?
comment:14 follow-up: 16 Changed 11 years ago by
Replying to zooko:
I looked at this a little today, with Jeff "psi"'s help. My current question is: would it be better if foolscap supported Twisted endpoints (foolscap ticket #203) instead of this http proxy setting? In that case, foolscap would, if I understand correctly, be able to talk over I2P, as well as with little or no extra work on our part, talk over Tor and maybe other cool stuff like cjdns.
Based on my understanding of Twisted endpoints (obtained from https://twistedmatrix.com/documents/12.3.0/core/howto/endpoints.html):
Yes, Twisted endpoints would be better than this http proxy setting. It makes life for the foolscap (and Tahoe) devs easier because they don't need to care about what transport is used, and it makes life for users easier because setup becomes much simpler.
Right now, to use Tahoe over I2P, users need to manually configure their patched Tahoe to point to an HTTP proxy client tunnel so they can make outgoing connections, and carefully configure their client to not leak IP information. For incoming connections to storage nodes, they need to separately configure a server tunnel, and carefully modify their config to both not leak IP information and to broadcast the correct I2P address. Tor would involve a similar process (but using HSs). For the average user, this is Not Simple (TM).
With endpoint support, the user would only need to install the third-party endpoint plugin for their transport (I2P/Tor/...) and paste a single, likely-unchanging line into their config file. The I2P plugin would handle creation and management of the required I2P tunnels, and there would be no danger of IP leaks because the TCP endpoint would not be used at all and the app would not have (or need) any concept of local IP.
Also, Twisted endpoints feel like a more appropriate tool for this, to me, than an http proxy, because I don't think of foolscap's communications needs as being "HTTP-like", i.e. request-response transactions. I think of them as being "TCP-like", i.e. a bidirectional byte-pipe. Are there some performance problems or implementation-complexity problems to using an http proxy as a way to get foolscap to talk over I2P? Or is this just a misunderstanding on my part?
As far as I2P is concerned, there is no performance issue requiring HTTP proxy use - the HTTP client tunnel is just a special-cased regular tunnel with proxying support for ease of use, and header filtering to help protect users. The main barrier for I2P is implementation complexity - apps should ideally be written with knowledge of I2P, both for privacy/anonymity concerns and for best performance. To my knowledge, the reason duck used the HTTP proxy implementation was because it was a simple but effective patch. There is in fact another (abandoned) version of Tahoe for I2P, where the dev made more fundamental modifications to talk with I2P that could be turned on/off in the config.
All this is superseded by Twisted endpoints - the transport-specific knowledge can be abstracted into the plugin. This is very beneficial for I2P/Tor because any app using Twisted endpoints could immediately be set up inside the network. Of course, developers should still be aware that their software could be used in anonymity networks and design appropriately.
comment:15 Changed 11 years ago by
Replying to warner:
I agree with the approach in general. It's not clear to me that a standard HTTP proxy can handle the binary foolscap protocol: the first "hello" message is made to look HTTPish, but unless the proxy immediately jumps into a byte-relay mode after the HTTP response, the rest of the protocol will get confused. OTOH, if you got this to work with I2P's proxy, then I guess it's good enough.
I wrote a long comment about how foolscap should follow websockets by using an Upgrade: header, then read negotiate.py where it's clearly commented that it does.
So the unknown seems to be: can "standard" HTTP proxies correctly proxy this upgraded protocol? It seems the same issue should apply to websockets, unless specs or implementations have hard-coded special cases specific to websockets.
A quick skim of the first part of "How HTML5 Web Sockets Interact With Proxy Servers" suggests that browsers using knowledge that a connection is for websockets will use the proxy CONNECT method with non-transparent (explicit) proxies. So I would assume a general foolscap "web proxy" setting to have the same behavior and similar interoperability.
This patch is different, and I have less confidence it will interoperate correctly with general HTTP proxies, although it may work specifically with an i2p specific proxy.
Therefore:
I'd just be worried about labeling this option "httpProxy" if "i2p-proxy" would be a more accurate name.
+1
Relying on the behavior of some implementations or custom protocols calls for a more specific name, IMO.
EDIT: Removed a lot of verbosity about rfc2616 and speculation about how foolscap negotiation should do what it already does. ;-)
comment:16 Changed 11 years ago by
Replying to str4d:
Replying to zooko:
I looked at this a little today, with Jeff "psi"'s help. My current question is: would it be better if foolscap supported Twisted endpoints (foolscap ticket #203) instead of this http proxy setting? In that case, foolscap would, if I understand correctly, be able to talk over I2P, as well as with little or no extra work on our part, talk over Tor and maybe other cool stuff like cjdns.
Based on my understanding of Twisted endpoints (obtained from https://twistedmatrix.com/documents/12.3.0/core/howto/endpoints.html):
Yes, Twisted endpoints would be better than this http proxy setting. It makes life for the foolscap (and Tahoe) devs easier because they don't need to care about what transport is used, and it makes life for users easier because setup becomes much simpler.
Right now, to use Tahoe over I2P, users need to manually configure their patched Tahoe to point to an HTTP proxy client tunnel so they can make outgoing connections, and carefully configure their client to not leak IP information. For incoming connections to storage nodes, they need to separately configure a server tunnel, and carefully modify their config to both not leak IP information and to broadcast the correct I2P address. Tor would involve a similar process (but using HSs). For the average user, this is Not Simple (TM).
With endpoint support, the user would only need to install the third-party endpoint plugin for their transport (I2P/Tor/...) and paste a single, likely-unchanging line into their config file. The I2P plugin would handle creation and management of the required I2P tunnels, and there would be no danger of IP leaks because the TCP endpoint would not be used at all and the app would not have (or need) any concept of local IP.
Agreed with all of this: for Tor / I2P use cases, specialized endpoints are greatly advantageous for both user configuration simplicity and, importantly, IP address confidentiality correctness.
However, there may be other reasons to use an HTTP proxy. The safest / most interoperable usage of an HTTP proxy should be through a CONNECT method. Foolscap should *not* deal with this. Instead, there should be a twisted endpoint for HTTP proxies that makes a simple CONNECT and then uses the resulting stream.
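To make the CONNECT idea concrete, here is a rough sketch of the handshake such an endpoint would perform before handing the socket over as a plain byte stream. The function names build_connect_request and tunnel_established are made up for illustration and are not part of Twisted or foolscap. The sketch assumes the usual HTTP rule that any 2xx reply to CONNECT means the tunnel is open.

```python
def build_connect_request(host, port):
    """Return the bytes of an HTTP/1.1 CONNECT request for host:port."""
    authority = "%s:%d" % (host, port)
    lines = [
        "CONNECT %s HTTP/1.1" % authority,
        "Host: %s" % authority,
        "",   # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def tunnel_established(response_head):
    """Return True if the proxy's status line carries a 2xx code."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code[:1] == b"2"
```

After a successful reply the proxy relays bytes in both directions without interpreting them, which is the "TCP-like" pipe foolscap needs.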
Also, Twisted endpoints feel like a more appropriate tool for this, to me, than an http proxy, because I don't think of foolscap's communications needs as being "HTTP-like", i.e. request-response transactions. I think of them as being "TCP-like", i.e. a bidirectional byte-pipe. Are there some performance problems or implementation-complexity problems to using an http proxy as a way to get foolscap to talk over I2P? Or is this just a misunderstanding on my part?
As far as I2P is concerned, there is no performance issue requiring HTTP proxy use - the HTTP client tunnel is just a special-cased regular tunnel with proxying support for ease of use, and header filtering to help protect users.
I feel like there's some confusion about what kind of HTTP proxying is happening here. If the I2P HTTP proxy is filtering headers, then it is operating on HTTP requests and responses, such as is typical of HTTP proxies for all methods except CONNECT. OTOH, foolscap wants a generic bi-directional stream, so it wants functionality similar to an HTTP proxy's CONNECT support.
Anyway, this is the kind of detail a twisted endpoint plugin for I2P would manage, rather than foolscap, and it sounds like we all agree that's the best approach.
The main barrier for I2P is implementation complexity - apps should ideally be written with knowledge of I2P, both for privacy/anonymity concerns and for best performance. To my knowledge, the reason duck used the HTTP proxy implementation was because it was a simple but effective patch. There is in fact another (abandoned) version of Tahoe for I2P, where the dev made more fundamental modifications to talk with I2P that could be turned on/off in the config.
All this is superseded by Twisted endpoints - the transport-specific knowledge can be abstracted into the plugin. This is very beneficial for I2P/Tor because any app using Twisted endpoints could immediately be set up inside the network. Of course, developers should still be aware that their software could be used in anonymity networks and design appropriately.
+1
So... should we close this ticket as wontfix and focus on the twisted endpoints feature?
There may be users who need actual HTTP proxies, not I2P (or Tor), but I'm of the opinion HTTP proxy support should also be a twisted endpoint. If one does not already exist, I'd want to file a feature request over on twisted (or better yet, just write one).
There's actually a wrinkle with an "HTTP Proxy Endpoint": often applications want to connect to many different destinations with the same proxy setting, so to provide this kind of functionality the application would need to take a proxy config setting and then mangle any to-be-proxied request. For example, by changing:
tcp:127.0.0.1:80
to
http-proxy:127.0.0.1:80
...
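The mangling step could look roughly like the sketch below. Note that "http-proxy" is the hypothetical endpoint type from the example above, not an endpoint Twisted is known to ship, and the function name proxy_endpoint_description is invented here. The sketch only rewrites the leading type of a colon-separated endpoint string and leaves anything that is not a tcp description alone.

```python
def proxy_endpoint_description(description, scheme="http-proxy"):
    """Rewrite a 'tcp:...' endpoint string to use a proxy endpoint type.

    Descriptions of any other type are returned unchanged.
    """
    prefix, sep, rest = description.partition(":")
    if prefix != "tcp" or not sep:
        return description
    return "%s:%s" % (scheme, rest)


# The application applies one proxy setting to every outgoing destination.
destinations = ["tcp:127.0.0.1:80", "unix:/tmp/sock"]
rewritten = [proxy_endpoint_description(d) for d in destinations]
```

A real endpoint would also need to know the proxy's own address, which this string rewrite alone does not carry.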
comment:17 Changed 11 years ago by
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Closing as "wontfix" because of the consensus (among people who've posted to this ticket, notably not Brian) that Endpoints would be better.
HTTP proxy support in Negotiate