Opened 8 years ago

Closed 8 years ago

#258 closed defect (fixed)

allocate_tcp_port sometimes returns in-use port

Reported by: Brian Warner
Owned by:
Priority: major
Milestone: 0.12.3
Component: network
Version: 0.12.2
Keywords:
Cc:

Description

I've noticed intermittent unit test failures (in foolscap and in tahoe, which uses the same code) in which the port number returned by allocate_tcp_port() causes an EADDRINUSE when actually used to listen.

I did some quick experiments, but I still don't entirely understand what's going on. I know one issue is that allocate_tcp_port() binds the port to 127.0.0.1, and it seems like some operating systems will give us a port that someone else is already listening on (but bound to 0.0.0.0). It feels a bit like a kernel bug: it's comparing the interface identifiers and concluding that they don't overlap, when in fact 0.0.0.0 overlaps with everything.

But I'm pretty sure that's not the only failure mode. I think I tried changing our function to bind to 0.0.0.0, and there were other ports that got returned incorrectly. Also I think it behaves differently on Linux and OS-X.

It's a rare failure, as there are only a handful of ports in use by other processes, and the kernel allocation uses a range of at least 10000 port numbers. But it's a drag to re-run the tests each time it happens.
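
For illustration, the kind of allocation described above can be sketched as follows. This is not the actual foolscap code: the function name `naive_allocate_tcp_port` is hypothetical, and setting SO_REUSEADDR is an assumption based on the later discussion in this ticket.

```python
import socket

def naive_allocate_tcp_port(interface="127.0.0.1"):
    """Ask the kernel for a port by binding a throwaway socket to port 0.

    Hypothetical sketch: the socket is bound but never listen()ed on, so
    nothing here verifies that a later listen() on the port will succeed.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Assumption: the test socket sets SO_REUSEADDR, as a listening
    # socket typically would.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        s.bind((interface, 0))      # port 0 means "kernel, pick one"
        return s.getsockname()[1]   # the port the kernel assigned
    finally:
        s.close()
```

The caller receives only a number. Whether that port is really free for the caller's eventual interface binding is exactly what this ticket is about.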

Change History (5)

comment:1 Changed 8 years ago by Brian Warner

Resolution: fixed
Status: new → closed

Fixed, in [6a1de25d].

There turned out to be multiple issues, some of them platform-specific.

  • We get the port by creating a test socket, binding it to port 0, then asking what port got assigned. I wanted to further bind this socket to 127.0.0.1, rather than 0.0.0.0 (INADDR_ANY), to follow a general philosophy of not being visible on external interfaces unless we really want to be visible. Finally, I wanted to not have to call listen() on the socket, both to follow the same philosophy, and to avoid causing security alerts for monitoring tools like LittleSnitch or SELinux.
  • I had to give up on all of those goals.
  • Binding the socket to 127.0.0.1 allows a BSD kernel to give us a port that is already in use by a LISTEN socket, as long as that other listener bound it to a different interface (like 0.0.0.0)
    • our real listen(), performed by whoever called allocate_tcp_port(), will usually fail (it may work if our final interface binding is smaller than the other LISTEN socket, but allocate_tcp_port() doesn't know what that will be)
    • if we bind the test socket to 0.0.0.0, the kernel can give us ports used by 127.0.0.1-bound LISTEN sockets
    • Linux doesn't do this: if anybody has a LISTEN socket using a port, the kernel won't give you that same port number
    • note that we aren't using SO_REUSEPORT here (who knows what that would do)
  • Regardless of how you bind it, a BSD kernel will still give you ports in use by ESTABLISHED sockets, even if there are no LISTEN sockets on that port (e.g. the near side of an outbound connection). The subsequent listen() will fail, *if* your UID is not the owner of that other ESTABLISHED socket (at least on OS-X).
    • Linux only returns ESTABLISHED ports if you bind to 127.0.0.1, and the listen() fails even if you also own the other socket
  • It's not sufficient to test listen() on the test socket we used to learn the port number: sometimes that will work, then a new socket (bound to the same port and interface) will fail.

If we only had to worry about Linux, we could just bind to 0.0.0.0. To handle OS-X too, we need a two-phase test:

  • create a socket, bind to 0.0.0.0 port 0, get the assigned port number, close the socket
  • create a new socket, bind to 0.0.0.0 and the assigned port, attempt to listen
    • if the listen fails, close the socket and try again, up to some hard-coded retry limit (100)
    • if the listen succeeds, close the socket and return the port number
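
The two-phase test listed above might look roughly like this. It is a minimal sketch, not the code from [6a1de25d]: the names `_new_tcp_socket` and `allocate_tcp_port_two_phase` are hypothetical, and the use of SO_REUSEADDR on both sockets is an assumption.

```python
import socket

def _new_tcp_socket():
    # Hypothetical helper. SO_REUSEADDR is an assumption, chosen to
    # mimic what a real listening socket would usually set.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    return s

def allocate_tcp_port_two_phase(retries=100):
    for _ in range(retries):
        # Phase 1: let the kernel pick a port, then release the socket.
        s = _new_tcp_socket()
        try:
            s.bind(("0.0.0.0", 0))
            port = s.getsockname()[1]
        finally:
            s.close()

        # Phase 2: a *new* socket must be able to bind and listen there.
        s = _new_tcp_socket()
        try:
            s.bind(("0.0.0.0", port))
            s.listen(1)
        except OSError:
            continue  # port is unusable; ask the kernel for another
        finally:
            s.close()
        return port
    raise RuntimeError("no usable TCP port found after %d tries" % retries)
```

There is still an unavoidable window between closing the phase-2 socket and the caller's own listen(), so this reduces the failure rate rather than eliminating it.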

One final note: on OS-X this process seems to give us sequential ports in the range 49152-65535. On Linux we get random ports from 32768-49152 (ish), because when SO_REUSEADDR is in use, the kernel tries to assign a port from the lower half of the /proc/sys/net/ipv4/ip_local_port_range (which defaults to 32768-60999), and only uses the upper half if the lower half is full. Sockets that don't set SO_REUSEADDR use the whole range.
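
To see which range applies on a given Linux machine, the contents of /proc/sys/net/ipv4/ip_local_port_range (two whitespace-separated integers) can be parsed as in the sketch below. The helper names are hypothetical, and the halfway point computed here is only an approximation: the kernel's exact split may differ slightly.

```python
def parse_port_range(text):
    """Parse the 'low high' pair as it appears in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high

def approximate_halves(low, high):
    """Split the range at its arithmetic midpoint (an approximation only)."""
    mid = (low + high) // 2
    return (low, mid), (mid + 1, high)

def read_local_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return (low, high) on Linux, or None if the file is unavailable."""
    try:
        with open(path) as f:
            return parse_port_range(f.read())
    except (OSError, ValueError):
        return None
```

For example, `parse_port_range("32768\t60999\n")` gives `(32768, 60999)`, and `approximate_halves(32768, 60999)` puts the boundary near 46883, in line with the "(ish)" upper bound observed above.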

comment:2 Changed 8 years ago by Brian Warner

Incidentally Tahoe#2795 is about copying this fix into the Tahoe tree.

comment:3 Changed 8 years ago by Brian Warner

Resolution: fixed
Status: closed → reopened
Version: 0.9.1 → 0.12.2

I'm seeing this happen again. I think it's because some other process is listening on a socket that is bound to 127.0.0.1, and the kernel is willing to give us that same number. (In this case, it's a Tahoe node's controlport/logport causing the conflict.)

I think the fix might be to make it a three-phase test:

  • create socket, bind to 0.0.0.0 port 0, get port number, close
  • create socket, bind to 0.0.0.0 port NUMBER, attempt to listen, close
  • create socket, bind to 127.0.0.1 port NUMBER, attempt to listen, close
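
The three phases above could be sketched like this. It is an illustration only, not the code that was eventually committed: `can_listen`, `pick_candidate_port` and `allocate_tcp_port_three_phase` are hypothetical names, and the retry limit of 100 is carried over from the earlier two-phase description.

```python
import socket

def can_listen(interface, port):
    """Return True if a fresh socket can bind and listen on (interface, port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Assumption: SO_REUSEADDR is set, as a real listener usually would.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        s.bind((interface, port))
        s.listen(1)
        return True
    except OSError:
        return False
    finally:
        s.close()

def pick_candidate_port():
    """Phase 1: bind to 0.0.0.0 port 0 and report what the kernel assigned."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        s.bind(("0.0.0.0", 0))
        return s.getsockname()[1]
    finally:
        s.close()

def allocate_tcp_port_three_phase(retries=100):
    for _ in range(retries):
        port = pick_candidate_port()
        # Phases 2 and 3: the port must be listenable on both the
        # wildcard address and the loopback address.
        if can_listen("0.0.0.0", port) and can_listen("127.0.0.1", port):
            return port
    raise RuntimeError("no usable TCP port found after %d tries" % retries)
```

Checking 127.0.0.1 separately is what catches a listener bound only to loopback, such as the controlport/logport case described in this comment.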

comment:4 Changed 8 years ago by Brian Warner

Milestone: 0.12.0 → 0.12.3

comment:5 Changed 8 years ago by Brian Warner

Resolution: fixed
Status: reopened → closed

Re-closed in [a1cde254e3c8f22334bb6de595b66462ae0ceb96], which does the three-phase test described above. Some manual testing suggests that this should work.
