5522fa37c5797bfda1fef63542764cc0.ppt
- Number of slides: 64
The Performance Bottleneck: Application, Computer, or Network Richard Carlson <rcarlson@internet2.edu> eVLBI Workshop – Performance Tuning Tutorial September 17, 2006
Outline • Why there is a problem • What can be done to find/fix problems • Tools you can use
Basic Premise • An application’s performance should meet your expectations! • If it doesn’t, you should complain! • But you have to complain effectively.
Questions • How many times have you said: • What’s wrong with the network? • Why is the network so slow? • Do you have any way to find out? • Tools to check local host • Tools to check local network • Tools to check end-to-end path
Unfortunate Reality • Every problem, regardless of cause, exhibits the same symptom • The application’s performance doesn’t meet the user’s expectations!
Possible Bottlenecks • Network infrastructure • Host computer/appliance • Application design
Simple Network Picture • [Diagram: Bob’s Host, Network Infrastructure, Carol’s Host]
Network Infrastructure • [Diagram: Switches 1 to 4 and routers R1 to R9]
Network Infrastructure Bottlenecks • Links too small • Using FastEthernet instead of Gigabit Ethernet • Links congested • Too many hosts crossing this link • Scenic routing • End-to-end path is longer than it needs to be • Broken equipment • Bad NIC, broken wire/cable, cross-talk • Administrative restrictions • Firewalls, filters, shapers, restrictors
Host Computer Bottlenecks • CPU utilization • What else is the processor doing? • Memory limitations • Main memory and network buffers • I/O bus speed • Getting data into and out of the NIC • Disk access speed
Application Behavior Bottlenecks • Chatty protocol • Lots of short messages between peers • High reliability protocol • Send packet and wait for reply before continuing • No run-time tuning options • Use only default settings • Blaster protocol • Ignore congestion control feedback
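To see why a "send and wait for reply" protocol suffers, here is a rough sketch of its throughput ceiling. The 1500-byte message size and 50 msec round-trip time are illustrative assumptions, and the model ignores transmission time and loss.

```python
def stop_and_wait_mbps(message_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput when only one message is sent per round trip."""
    return message_bytes * 8 / rtt_seconds / 1e6

# Assumed values: one 1500-byte message per 50 msec round trip.
print(f"{stop_and_wait_mbps(1500, 0.050):.2f} Mbps")
```

Under these assumptions the ceiling is a fraction of a megabit per second no matter how fast the links are, because the sender spends nearly all of its time waiting.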
Problems, Problems • Problems can exist at multiple levels • Network infrastructure • Host computer • Application design • Multiple problems can exist at the same time • All problems must be found and fixed before things get better
Transport Protocols 101 • Transmission Control Protocol (TCP) • Provides applications with a reliable in-order delivery service • The most widely used Internet transport protocol • Web, file transfers, email, P2P, remote login • User Datagram Protocol (UDP) • Provides applications with an unreliable delivery service • RTP, DVTS, DNS
Outline • Why there is a problem • What can be done to find/fix problems • Tools you can use
Remote Image Processing • Carol is analyzing astronomical images. Bob needs to send a data file containing digital images (50 MB per file) to Carol every ½ hour. Bob and Carol are 2,000 miles apart. How long should each transfer take? • 5 minutes? • 1 minute? • 5 seconds?
What should we expect? • Assumptions: • 100 Mbps Fast Ethernet is the slowest link • 50 msec round trip time • Bob & Carol calculate: • 50 MB * 8 = 400 Mbits • 400 Mb / 100 Mb/sec = 4 seconds
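Bob and Carol's estimate can be written as a small function. This is only a sketch of the arithmetic on the slide; it ignores protocol overhead and latency, and the function name is made up for illustration.

```python
def ideal_transfer_seconds(file_megabytes: float, link_mbps: float) -> float:
    """Best-case transfer time: file size in megabits divided by link rate."""
    megabits = file_megabytes * 8
    return megabits / link_mbps

# 50 MB file over a 100 Mbps bottleneck link.
print(ideal_transfer_seconds(50, 100))  # 4.0
```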
Initial Test Results
Initial Test Results • 18 Minutes!!! This is unacceptable! • First look for a network infrastructure problem • Use the NDT tester to examine both hosts
Initial NDT testing shows Duplex Mismatch at one end
NDT Found Duplex Mismatch • Investigation shows that the switch port is configured for 100 Mbps Full-Duplex operation. • The network administrator corrects the configuration and asks for a re-test
Duplex Mismatch Corrected
SCP results after Duplex Mismatch Corrected
Intermediate Results • Time dropped from 18 minutes to 40 seconds. • Is this acceptable??? • Remember your calculations said it should take 4 seconds. • 400 Mb / 40 sec = 10 Mbps • Why are we limited to 10 Mbps? • Are you satisfied with 1/10th of the possible performance?
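Turning each measured transfer time back into a rate makes the gap easy to see. The sketch below uses the times reported so far (18 minutes, 40 seconds) and the 4 second target; the function name and labels are illustrative.

```python
def achieved_mbps(file_megabytes: float, elapsed_seconds: float) -> float:
    """Average throughput in megabits per second for one completed transfer."""
    return file_megabytes * 8 / elapsed_seconds

for label, seconds in [("duplex mismatch", 18 * 60),
                       ("mismatch fixed", 40),
                       ("target", 4)]:
    print(f"{label}: {achieved_mbps(50, seconds):.2f} Mbps")
```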
Default TCP window size
Calculating the Window Size • Remember Bob found the round-trip time was 50 msec • Calculate window size limit • 85.3 KB * 8 b/B = 698777 b • 698777 b / .050 s = 13.98 Mbps • Stated another way • 698777 b / 100 Mb/s = 6.99 msec • 43 msec of idle time every RTT
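The same window-limit arithmetic as a sketch in code. It treats 1 KB as 1024 bytes, which matches the 698777-bit figure on the slide; the function names are illustrative.

```python
def window_limited_mbps(window_bytes: float, rtt_seconds: float) -> float:
    """Max TCP throughput when at most one window can be sent per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6

def window_send_time_ms(window_bytes: float, link_mbps: float) -> float:
    """Time needed to put one full window onto the link, in milliseconds."""
    return window_bytes * 8 / (link_mbps * 1e6) * 1e3

window = 85.3 * 1024   # default window from the slide, in bytes
rtt = 0.050            # 50 msec round-trip time

busy_ms = window_send_time_ms(window, 100)
print(f"limit: {window_limited_mbps(window, rtt):.2f} Mbps")
print(f"sending: {busy_ms:.2f} msec, idle: {rtt * 1e3 - busy_ms:.2f} msec per RTT")
```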
Calculating the Window Size • Calculate new window size • (100 Mb/s * .050 s) / 8 b/B = 610.3 KB • Use 8 MB for testing purposes
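The required window is the bandwidth-delay product of the path. A minimal sketch of that calculation, again treating 1 KB as 1024 bytes (the function name is illustrative):

```python
def bandwidth_delay_product_bytes(link_mbps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the link busy for a full round trip."""
    return link_mbps * 1e6 * rtt_seconds / 8

bdp = bandwidth_delay_product_bytes(100, 0.050)
print(f"{bdp:.0f} bytes, about {bdp / 1024:.0f} KB")
```

Any window at or above this size removes the window limit on this path, which is why a generous 8 MB is a safe value for testing.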
Resetting Window Buffer
Intermediate Results • Use application specific options to manually reset buffer size • Fixes problem for this application • Doesn’t fix problem for other applications • Need better ‘default behavior’ for all applications
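As an illustration of per-application tuning, a program can request larger socket buffers itself. The sketch below uses the standard `SO_SNDBUF` and `SO_RCVBUF` socket options; the helper name is hypothetical, and the operating system may adjust or cap the requested size according to its own limits.

```python
import socket

def make_tuned_socket(buffer_bytes: int) -> socket.socket:
    """Create a TCP socket and request send/receive buffers of the given size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the buffers before connecting, since the TCP window scale
    # is negotiated during connection setup.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock

sock = make_tuned_socket(8 * 1024 * 1024)  # the 8 MB test value
# Report what the OS actually granted, which may differ from the request.
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
sock.close()
```

This only helps the one application that makes the call, which is the limitation the slide points out.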
With TCP window size tuned
Steps so far • Found and fixed Duplex Mismatch • Network Infrastructure problem • Found and fixed TCP window size values • Host configuration problem • Are we done yet?
SCP results with auto-tuning enabled
Intermediate Results • SCP still runs slower than expected • Hint: SSH uses internal buffers • Design choice by application developers limits performance • Patch available from PSC
SCP Results with tuned SCP
Final Results • Fixed infrastructure problem • Fixed host configuration problem • Fixed Application configuration problem • Achieved target time of 4 seconds to transfer 50 MB file over 2000 miles
Follow-up questions • What would have happened if I tried the patched SCP version before fixing the TCP buffer problem? • Would not have been able to see improvement. • Discard patch because “it didn’t work”?
Why is it hard to Find/Fix Problems? • Network infrastructure is complex • Network infrastructure is shared • Network infrastructure consists of multiple components
Shared Infrastructure • Other applications accessing the network • Remote disk access • Automatic email checking • Heartbeat facilities • Other computers are attached to the closet switch • Uplink to facility infrastructure • Other users on and off site • Uplink from facility to gigapop/backbone
Other Network Components • DHCP (Dynamic Host Configuration Protocol) • At least 2 packets exchanged to configure your host • DNS (Domain Name System) • At least 2 packets exchanged to translate FQDN into IP address • Multiple addresses require a sequential search • Network Security Devices • Intrusion Detection, VPN, Firewall
Why is it hard to Find/Fix Problems? • Computers have multiple components • Each Operating System (OS) has a unique set of tools to tune the network stack • Network Interface Cards also have tuning options • Application Appliances come with few knobs and limited options
Computer Components • Main CPU (clock speed) • Front & Back side bus • Main Memory • I/O Bus (ATA, SCSI, SATA) • Disk (access speed and size)
Computer Issues • Lots of internal components with multitasking OS • Lots of tunable TCP/IP parameters that need to be ‘right’ for each possible connection
Why is it hard to Find/Fix Problems? • Applications depend on default system settings • Problems scale with distance • More access to remote resources • 80/20 rule: since the early 1990s, 80% of your traffic leaves your local network
Default System Settings • For Linux 2.6.13 there are: • 11 tunable IP parameters • 45 tunable TCP parameters • 148 Web100 variables (TCP MIB) • Currently no OS ships with default settings that work well over trans-continental distances • Some applications allow run-time setting of some options • 30 settable/viewable IP parameters • 24 settable/viewable TCP parameters • There are no standard ways to set run-time option ‘flags’
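On a Linux host, one way to look at the system-wide TCP tunables is to read the files under `/proc/sys/net/ipv4`. The sketch below is an illustration only: the helper name is made up, the set of files varies by kernel version, and on other operating systems the directory does not exist, so the function returns an empty result.

```python
from pathlib import Path

def read_tcp_tunables(base: str = "/proc/sys/net/ipv4") -> dict:
    """Return {name: value} for every tcp_* setting found under `base`."""
    tunables = {}
    root = Path(base)
    if not root.is_dir():
        return tunables          # not Linux, or /proc is unavailable
    for entry in sorted(root.glob("tcp_*")):
        try:
            tunables[entry.name] = entry.read_text().strip()
        except OSError:
            continue             # skip entries we are not allowed to read
    return tunables

settings = read_tcp_tunables()
print(f"{len(settings)} TCP tunables found")
for name in ("tcp_rmem", "tcp_wmem"):   # receive/send buffer limits
    print(name, settings.get(name, "not available"))
```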
Application Issues • Setting tunable parameters to the ‘right’ value • Getting the protocol ‘right’
Outline • Why there is a problem • What can be done to find/fix problems • Tools you can use
Tools, Tools • Ping • Traceroute • Iperf • Tcpdump • Tcptrace • BWCTL • NDT • OWAMP • AMP • Advisor • Thrulay • Web100 • MonaLisa • pathchar • NPAD • Pathdiag • Surveyor • Ethereal • CoralReef • MRTG • Skitter • Cflowd • Cricket • Net100
Active Measurement Tools • Tools that inject packets into the network to measure some value • Available Bandwidth • Delay/Jitter • Loss • May require bi-directional traffic or synchronized hosts • May require running test program on both hosts
Passive Measurement Tools • Tools that monitor existing traffic on the network and extract some information • Bandwidth used • Jitter • Loss rate • May generate some privacy and/or security concerns
How do you set realistic Expectations? • Assume network bandwidth exists or find out what the limits are • Local LAN connection • Site access link • Monitor the link utilization occasionally • Weathermap • MRTG graphs • Look at your host config/utilization • What is the CPU utilization?
Distance Matters • It’s harder to go fast over a long distance • TCP congestion control requires numerous round trips to prevent flooding network • TCP buffer limits can stop sender from injecting new data into the network • Application can exhibit poor behavior when used over long distances
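A rough model shows how round trips add up during TCP slow start. The sketch assumes an idealized sender that doubles its window every round trip with no loss, an initial window of 2 segments, and a 1460-byte segment size; all three are illustrative assumptions, and real stacks behave differently in detail.

```python
def slow_start_round_trips(target_bytes: float,
                           initial_segments: int = 2,
                           mss_bytes: int = 1460) -> int:
    """Round trips of idealized window doubling needed to reach target_bytes."""
    window = initial_segments * mss_bytes
    trips = 0
    while window < target_bytes:
        window *= 2
        trips += 1
    return trips

# 625000 bytes is the window needed to fill 100 Mbps at a 50 msec RTT.
trips = slow_start_round_trips(625_000)
print(f"{trips} round trips, about {trips * 0.050:.1f} s at 50 msec RTT")
```

The number of round trips depends only on the window size, so the ramp-up time grows in direct proportion to the round-trip time of the path.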
Ethernet, FastEthernet, Gigabit Ethernet, 10 GE • 10/100/1000 auto-sensing NICs are common today • Most facilities have installed 10/100 switched infrastructure • Access network links are currently the limiting factor in most networks • Backbone networks are 10 Gigabit/sec
Wireless LANs • 802.11b – 11 Mbps (expect 5) • 802.11a – 54 Mbps (expect 15) • 802.11g – 54 Mbps (expect 25) • Expect large variations in speed due to radio signal propagation
Focus on 2 tools • Existing NDT tool • Allows users to test network path for a limited number of common problems • Emerging PerfSonar tool • Allows users to retrieve network path data from major national and international REN networks
Network Diagnostic Tool (NDT) • Measure performance to the user’s desktop • Identify real problems for real users • Network infrastructure is the problem • Host tuning issues are the problem • Make tool simple to use and understand • Make tool useful for users and network administrators • Web-based Java applet allows testing from any browser
Installing your own server • All Internet2 tools are FREE • Visit http://e2epi.internet2.edu/ for details • Workshops are available to help your administrator get them up and running ( http://e2epi.internet2.edu/net-perf-wkshp/ ) • Encourage your peers to start testing • Encourage your vendors to include the client programs
NPToolkit • Bootable CD • Knoppix based Live-CD • Contains listed tools • Download from Internet2 • Ask for a pre-built CD-ROM • http://e2epi.internet2.edu/network-performance-toolkit.iso
PerfSonar – Next Steps in Performance Monitoring • New initiative involving multiple partners • ESnet (DOE labs) • GEANT (European Research and Education network) • Internet2 (Abilene and connectors) • Sample tool (Joe Metzger, ESnet) https://performance.es.net/cgi-bin/perfsonar-trace.cgi
Traceroute Visualizer
Abilene Weather Map http://loadrunner.uits.iu.edu/weathermaps/abilene/
Windows XP Performance
Google it! • Enter “tuning tcp” into the Google search engine. • Top 2 hits are: http://www.psc.edu/networking/perf_tune.html http://www-didc.lbl.gov/TCP-tuning.html
PSC Tuning Page
LBNL Tuning Page
Conclusions • Applications can fully utilize the network • All problems have a single symptom • All problems must be found and fixed before things get better • Some people stop investigating before finding all problems • Tools exist, and more are being developed, to make it easier to find problems