San Francisco Hadoop Users Message Board › July 2011 - Network configurations for Hadoop deployment
A former member
1 GBit to the top-of-rack switch is usual.
Multi-rack deployments often use 10 GBit to the core switch, possibly with LACP-bonded dual uplink ports if the switches support them.
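To put rough numbers on the uplink choice, here is a minimal sketch of the oversubscription arithmetic. The 40-node rack and the function name are assumptions for illustration only, not figures from this thread.

```python
def oversubscription(nodes_per_rack, node_gbit, uplink_gbit, uplink_count=1):
    """Ratio of worst-case rack demand to uplink capacity (1.0 means non-blocking)."""
    demand = nodes_per_rack * node_gbit      # every node sending off-rack at once
    capacity = uplink_gbit * uplink_count    # total uplink bandwidth to the core
    return demand / capacity

# Hypothetical rack: 40 nodes at 1 GBit each.
single = oversubscription(40, 1, 10)       # one 10 GBit uplink
bonded = oversubscription(40, 1, 10, 2)    # two LACP-bonded 10 GBit uplinks
print(f"single uplink: {single:.1f}:1, bonded: {bonded:.1f}:1")
```

Bear in mind that LACP typically balances traffic per flow, so a single large transfer usually stays on one member link even when the bond doubles the aggregate capacity.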
When comparing switches, look at:
* L2 forwarding rate
* switching backplane bandwidth (fully switched is best, but you'll pay for this)
* packets/second throughput.
Under high contention, TCP backoff produces more, shorter packets, so the switch's packets/second rate, rather than its overall bandwidth, becomes the limit on throughput.
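A quick way to see why packet rate matters is to compute how many packets/second a link carries at line rate for different frame sizes. This is a rough sketch using standard Ethernet framing overhead (8 bytes of preamble and start delimiter plus a 12-byte inter-frame gap); the function name and the frame sizes chosen are illustrative.

```python
# Per-frame wire overhead outside the frame itself:
# 8 bytes preamble/start delimiter + 12 bytes inter-frame gap.
WIRE_OVERHEAD_BYTES = 20

def line_rate_pps(link_bps, frame_bytes):
    """Packets/second needed to saturate a link with frames of one size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)

GBIT = 1_000_000_000
for frame_bytes in (64, 512, 1518):
    pps = line_rate_pps(GBIT, frame_bytes)
    print(f"{frame_bytes:>5} byte frames: {pps:>12,.0f} packets/s")
```

On a 1 GBit link, minimum-size frames need roughly 18 times the packet rate of full-size frames, so a switch that cannot forward at that rate will fall short of its rated bandwidth when packets get small.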
Enabling jumbo frames (if your switches support them) may help; one experiment showed a 5% gain, but that seems low. More testing is necessary.
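For a baseline on the jumbo frame question, the sketch below estimates how much of the wire carries TCP payload at a given MTU. It assumes IPv4 and TCP headers without options (40 bytes) and standard Ethernet framing. It models wire efficiency only, so any saving in per-packet processing on hosts or switches is not counted.

```python
ETH_HEADER_BYTES = 14      # destination MAC, source MAC, EtherType
FCS_BYTES = 4              # frame check sequence
PREAMBLE_IFG_BYTES = 20    # preamble/start delimiter + inter-frame gap
IP_TCP_HEADER_BYTES = 40   # assumption: IPv4 (20) + TCP (20), no options

def payload_efficiency(mtu):
    """Fraction of on-wire bytes that is TCP payload for full-size packets."""
    payload = mtu - IP_TCP_HEADER_BYTES
    on_wire = mtu + ETH_HEADER_BYTES + FCS_BYTES + PREAMBLE_IFG_BYTES
    return payload / on_wire

standard = payload_efficiency(1500)
jumbo = payload_efficiency(9000)
gain = jumbo / standard - 1
print(f"MTU 1500: {standard:.4f}, MTU 9000: {jumbo:.4f}, gain: {gain:.1%}")
```

Under these assumptions the framing overhead alone is worth a few percent, so a measured gain well above that would have to come from reduced per-packet work rather than from the wire.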
Spread ZK nodes and ETL tasks across all racks to get the most even distribution of data flow.
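As a minimal illustration of spreading services evenly, here is a round-robin placement sketch. The host and rack names are made up, and a real deployment would also weigh rack capacity and failure domains.

```python
from itertools import cycle

def spread_across_racks(items, racks):
    """Assign items to racks round-robin so rack counts differ by at most one."""
    placement = {rack: [] for rack in racks}
    for item, rack in zip(items, cycle(racks)):
        placement[rack].append(item)
    return placement

# Hypothetical ZooKeeper ensemble of five nodes over three racks.
zk_nodes = ["zk1", "zk2", "zk3", "zk4", "zk5"]
racks = ["rack-a", "rack-b", "rack-c"]
for rack, nodes in spread_across_racks(zk_nodes, racks).items():
    print(rack, nodes)
```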
It's often best to plug the namenode and jobtracker (HMaster, etc.) directly into the core switch with 10 GBit connections, to ensure low-latency access to these services from all racks.