
San Francisco Hadoop Users Message Board › July 2011 - Network configurations for Hadoop deployment


A former member
Post #: 17
1 GbE to the top-of-rack switch is usual;
multi-rack deployments often use 10 GbE uplinks to the core switch -- maybe LACP-bonded dual uplink ports, if the switches support them.
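A rough sketch of what an LACP bond looks like on a 2011-era Linux host (interface names and the address are placeholders, and the switch ports have to be configured for 802.3ad on their side too):

```shell
# Load the bonding driver in 802.3ad (LACP) mode; miimon polls link state.
modprobe bonding mode=802.3ad miimon=100
# Bring up the bond interface and enslave both physical uplinks.
# 10.0.0.5 and eth0/eth1 are placeholders for your own setup.
ifconfig bond0 10.0.0.5 netmask 255.255.255.0 up
ifenslave bond0 eth0 eth1
```

Note that LACP hashes each flow onto one physical link, so a single TCP stream still tops out at one link's bandwidth; the win is aggregate throughput across many concurrent transfers.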

Key metrics:
* L2 forwarding rate
* switching backplane bandwidth (fully switched is best, but you'll pay for it)
* packets/second throughput

Under high contention, TCP backoff and retransmission produce more, smaller packets; this means packets/second capacity dominates throughput rather than overall bandwidth.
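The back-of-envelope arithmetic behind that point, assuming standard Ethernet framing overhead (each frame carries 20 extra bytes of preamble plus inter-frame gap on the wire):

```shell
# Line-rate packets/sec on 1 GbE at minimum vs. maximum frame size.
awk 'BEGIN {
  rate = 1e9                                            # 1 Gbit/s
  printf "64B frames:   %d pps\n", rate / ((64   + 20) * 8)
  printf "1518B frames: %d pps\n", rate / ((1518 + 20) * 8)
}'
```

So a switch has to forward roughly 18x more packets per second to sustain line rate with small frames than with full-size ones -- which is why a cheap switch that advertises "1 Gbps per port" can still fall over under a congested shuffle.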

Enabling jumbo frames (if your switches support them) may help; one experiment showed a 5% gain, but that seems low -- more testing is needed.
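If you do try it, something like the following on each node (interface name and target host are placeholders); every host and switch on the path must agree on the MTU or you'll see drops instead of gains:

```shell
# Raise the MTU to 9000 on this node's data interface.
ifconfig eth0 mtu 9000
# Verify end-to-end with a do-not-fragment ping:
# 8972 = 9000 minus 28 bytes of IP + ICMP headers.
ping -M do -s 8972 datanode2
```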

Spread ZooKeeper nodes and ETL tasks across all racks to get the most even distribution of data flow.
It's often best to plug the namenode and jobtracker (HMaster, etc.) directly into the core switch with 10 GbE connections, to ensure low-latency access to these services from all racks.
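Related to keeping cross-rack traffic sane: Hadoop only knows your rack layout if you give it a topology script (wired in via topology.script.file.name in core-site.xml). A minimal sketch, with made-up subnets standing in for your real per-rack addressing:

```shell
#!/bin/sh
# Hypothetical rack-awareness script: Hadoop invokes it with one or
# more host IPs and expects one rack path per line on stdout.
# The subnet-to-rack mapping here is illustrative only.
resolve_rack() {
  case "$1" in
    10.1.*) echo "/rack1" ;;
    10.2.*) echo "/rack2" ;;
    10.3.*) echo "/rack3" ;;
    *)      echo "/default-rack" ;;
  esac
}

for host in "$@"; do
  resolve_rack "$host"
done
```

With that in place, HDFS will put replicas on more than one rack and the scheduler can prefer rack-local reads, which takes a lot of pressure off the core uplinks.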
