Tools/Manuals/TS200



Considerations for worker nodes on a private network

For various reasons it may be desirable to put a site's worker nodes on a private network. A NAT box is then needed to let them reach grid services at other sites, e.g. for WMS input and output sandbox handling or for interactions with catalogs and storage elements.

The site's CE nodes typically would have an interface on the private network to interact with the worker nodes, as well as a public interface that can be contacted from the grid.

For the SE head node(s) and disk servers a similar setup is desirable, so that worker nodes can use the private network for local data transfers and avoid the NAT bottleneck.

Here one has to pay attention to incoming third-party GridFTP transfers that are initiated from a worker node via the private network. The following scenario must be avoided:

  1. The client connects to the local GridFTP server via the private network.
  2. It puts the local server in passive mode for the transfer.
  3. The server returns its private IP address and a port for the transfer.
  4. The client puts the remote GridFTP server in active mode.
  5. The remote server cannot connect to the private IP address that was supplied.
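The failure in step 5 hinges on the address that the server returns in step 3. As a rough illustration, the sketch below assumes the classic RFC 959 reply format "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" and uses two hypothetical helpers, pasv_endpoint and is_private_ipv4, that are not part of any GridFTP toolkit. It extracts the advertised address and port from such a reply and checks whether the address lies in an RFC 1918 private range:

```shell
# Hypothetical helpers for illustration only; not part of any GridFTP toolkit.

# Print "ip port" for a 227 passive-mode reply in the RFC 959 format.
pasv_endpoint() {
    # Split on "(", ")" and ","; the port is encoded as high byte * 256 + low byte.
    printf '%s\n' "$1" | awk -F'[(),]' 'NF >= 7 {
        printf "%s.%s.%s.%s %d\n", $2, $3, $4, $5, $6 * 256 + $7
    }'
}

# Succeed if the IPv4 address is in an RFC 1918 private range.
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example reply with a placeholder private address.
reply='227 Entering Passive Mode (192,168,0,1,195,80)'
endpoint=$(pasv_endpoint "$reply")
ip=${endpoint% *}
if is_private_ipv4 "$ip"; then
    echo "advertised address $ip is private: a remote server cannot connect to it"
fi
```

A reply carrying a private address like this one is exactly what the remote server in active mode would be told to connect to.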

In principle this problem could be avoided by putting the local server in active mode and the remote server in passive mode, but there are two issues with that idea:

  1. The client cannot always know which of the two servers is unreachable from outside.
  2. When multiple streams are used in the data transfer, the GridFTP 1 protocol requires the sender to initiate the connections.
    • The GridFTP 2 protocol removes that limitation, but is not yet supported by all server implementations.

For those reasons, GridFTP clients currently put the destination server in passive mode and the source server in active mode.

Solution

First, the globus-gridftp-server has an option to make it use the machine's public IP address for data connections. For example:

globus-gridftp-server -data-interface 123.45.67.89

It is probably better to define the option in gridftp.conf instead:

data_interface 123.45.67.89

The server will thus advertise its public address in the response to the passive mode command, but it will still accept data connections on any of its interfaces: it binds the data socket to the advertised port on the wildcard address 0.0.0.0 rather than on the advertised address.

Second, to keep local transfers from going through the NAT box, one would define on each worker node an iptables NAT rule per disk server that translates the server's public IP address to its private one. For example:
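To double-check which address a server is configured to advertise, one could read the option back from the configuration file. The sketch below assumes the "option value" line format shown above; the function name get_data_interface and the path in the usage comment are hypothetical and depend on the installation:

```shell
# Hypothetical helper: print the data_interface value from a gridftp.conf-style file.
# Assumes one "option value" pair per line; if the option is repeated, the last one is printed.
get_data_interface() {
    awk '$1 == "data_interface" { value = $2 }
         END { if (value != "") print value }' "$1"
}

# Usage (the path is an assumption, adjust it to the local installation):
#   get_data_interface /etc/gridftp.conf
```

An empty result means the option is not set in that file, in which case the passive mode reply is not forced to the public address by this setting.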

iptables -t nat -A OUTPUT -d 123.45.67.89 -j DNAT --to 192.168.0.1

A similar rule can be defined for the SE head node(s), such that the local SRM traffic etc. also goes over the private network.
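With several disk servers and head nodes, the rules can be generated from a list of address pairs. The following is a minimal sketch: the helper name print_dnat_rules and the second address pair are made up for illustration, and the helper only prints the commands so that they can be reviewed before being run as root on the worker nodes:

```shell
# Hypothetical helper: read "public private" pairs from stdin and
# print one DNAT rule per pair. Nothing is applied to the system.
print_dnat_rules() {
    while read -r public private; do
        # Skip blank lines and comment lines.
        case "$public" in
            ''|'#'*) continue ;;
        esac
        printf 'iptables -t nat -A OUTPUT -d %s -j DNAT --to %s\n' "$public" "$private"
    done
}

# Placeholder addresses; replace them with the site's own servers.
print_dnat_rules <<'EOF'
# public        private
123.45.67.89    192.168.0.1
123.45.67.90    192.168.0.2
EOF
```

Keeping the address pairs in one list makes it easier to apply the same set of rules on every worker node.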

Note

For SRM traffic one could let a private name resolution mechanism (e.g. /etc/hosts) on the worker nodes resolve the SE's public host name to its private address. That trick will not work for GridFTP traffic, because the GridFTP server's reply to the passive command contains the IP address to be used, not the host name.

dCache

For dCache installations please consult e.g. the following documentation and ask the dCache team for advice if needed: