
Gathering Measurements
CS 239: Experimental Methodologies for System Software
Peter Reiher
April 26, 2007
Lecture 7


Lecture 7, CS 239, Spring 2007

Outline
• Monitors
• Tools for measurement
• Applying workloads to systems
• Common mistakes in benchmarking

Monitors
• A monitor is a tool used to observe system activity
• Proper use of monitors is key to performance analysis
• Also useful for other system observation purposes

Classifications of Monitors
• Hardware vs. software monitors
• Event-driven vs. sampling monitors
• On-line vs. batch monitors

Hardware Vs. Software Monitors
• Hardware monitors used primarily by hardware designers
  – Requires substantial knowledge of hardware details
  – VLSI limits monitoring possibilities
• Software monitors used (mostly) by everyone else

Event-Driven Vs. Sampling Monitors
• Event-driven monitors notice every time a particular type of event occurs
  – Ideal for rare events
  – Require low per-invocation overheads
• Sampling monitors check the state of the system periodically
  – Good for frequent events
  – Can afford higher overheads
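The contrast between event-driven and sampling monitors can be sketched in a few lines. This is a minimal illustration, not anything from the lecture: the simulated queue, the function name `simulate`, and the arrival and departure probabilities are all hypothetical. The event-driven hook runs once per arrival, so it must be cheap; the sampling hook runs only every `period` ticks, so it could afford more work per invocation.

```python
import random

def simulate(ticks=1000, period=50, seed=0):
    """Toy system: a queue with random arrivals and departures (hypothetical)."""
    rng = random.Random(seed)
    queue_len = 0
    true_arrivals = 0   # ground truth, kept only to check the monitor
    arrivals_seen = 0   # event-driven monitor: one cheap increment per event
    samples = []        # sampling monitor: periodic snapshots of system state
    for t in range(ticks):
        if rng.random() < 0.3:          # an arrival event occurs
            queue_len += 1
            true_arrivals += 1
            arrivals_seen += 1          # event-driven hook fires on every event
        if queue_len and rng.random() < 0.35:
            queue_len -= 1              # a departure
        if t % period == 0:
            samples.append(queue_len)   # sampling hook fires on a timer only
    return true_arrivals, arrivals_seen, samples

true_arrivals, arrivals_seen, samples = simulate()
print(arrivals_seen, len(samples))
```

The event-driven count is exact, while the sampling monitor sees only 20 snapshots of the queue length and can miss whatever happens between them.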

On-Line Vs. Batch Monitors
• On-line monitors can display their information continuously
  – Or, at least, frequently
• Batch monitors save it for later
  – Usually using separate analysis procedures

Issues in Monitor Design
• Activation mechanism
• Buffer issues
• Data compression/analysis
• Enabling/disabling monitors
• Priority issues
• Abnormal events monitoring

Activation Mechanism
• When do you collect the data?
  – When an interesting event occurs, trap to data collection routine
  – Analyze every step taken by system
  – Go to data collection routine when timer expires

Buffer Issues
• Buffer size
  – Big enough to avoid frequent disk writes
  – Small enough to make disk writes cheap
• Number of buffers
  – At least two, typically
  – One to fill up, one to record
• Buffer overflow
  – Overwrite old data you haven’t recorded
  – Or lose new data you don’t have room for

Data Compression or Analysis
• Data can be literally compressed
• Or can be reduced to a summary form
• Both methods save space for holding data
• But at the cost of extra overhead in gathering it
• Sometimes can use idle time for this
  – But might be better spent dumping data to disk

Enabling/Disabling Monitors
• Most system monitors have some overhead
• So users should be able to turn them off, if high performance is required
• Not necessary if overhead is truly trivial
• Or if purpose of system is primarily gathering data
  – As is case with many research systems
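The two-buffer arrangement from the Buffer Issues slide (one buffer to fill, one to record) might look roughly like the sketch below. The class name `DoubleBufferedLog` and the `sink` callable are hypothetical; `sink` stands in for the disk write, and it is called synchronously here for simplicity, whereas a real monitor would normally hand the full buffer to a background writer.

```python
class DoubleBufferedLog:
    """Minimal two-buffer sketch: fill one buffer while the other is recorded."""

    def __init__(self, capacity, sink):
        self.capacity = capacity  # entries per buffer (size trade-off from the slide)
        self.sink = sink          # hypothetical callable standing in for a disk write
        self.active = []          # buffer currently being filled
        self.spare = []           # empty buffer ready to take over

    def record(self, entry):
        self.active.append(entry)
        if len(self.active) >= self.capacity:
            self._swap()

    def _swap(self):
        # Switch to the spare buffer first so new entries have somewhere to go,
        # then hand the full buffer to the sink and recycle it as the new spare.
        full, self.active = self.active, self.spare
        self.sink(list(full))
        full.clear()
        self.spare = full

    def flush(self):
        if self.active:
            self._swap()

written = []
log = DoubleBufferedLog(capacity=2, sink=written.append)
for i in range(5):
    log.record(i)
log.flush()
print(written)
```

This sketch never overflows because the sink is synchronous; with an asynchronous writer, the overflow policy from the slide (overwrite old data or drop new data) would have to be chosen explicitly.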

Priority of Monitor
• How high a priority should the monitor’s operations have?
• Again, trading off performance impact against timely and complete data gathering
• Not always a simple question

Monitoring Abnormal Events
• Sometimes, failures and errors are most important thing to observe
• Can require special attention
  – System may not be operating very well at the time of the failure

Monitoring Distributed Systems
• Monitoring a distributed system is like designing a distributed system
• Must deal with
  – Distributed state
  – Unsynchronized clocks
  – Partial failures

Layered View of Distributed Monitor
  Management      Make system changes, as necessary
  Console         Control the overall system
  Interpretation  Decide what the results mean
  Presentation    Present your results
  Analysis        Analyze what you’ve stored
  Collection      Store what you’ve seen for later
  Observation     Watch what happens

The Observation Layer
• Layer that actually gathers the data
• Implicit spying - watching what other sites do without disturbing the activity
• Explicit instrumentation - inserting code to monitor activities
• Probing - making feeler requests into system to discover what’s happening

The Collection Layer
• Data can be collected at one or several points in the distributed system
• How does the data get from observer to collector (if not co-located)?
  – Advertising - observers send it out, collectors listen and grab it
  – Soliciting - collectors ask observers to send it
• Clock issues can be key, here
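Why clock issues matter at the collection layer can be shown with a small sketch. Everything here is made up for illustration: the site names, the event names, the times, and the 5-unit clock offset. Two observers stamp events with their own unsynchronized clocks, and a collector that merges purely by timestamp ends up with the events in the wrong order.

```python
from dataclasses import dataclass

@dataclass
class Record:
    site: str
    event: str
    local_time: float  # timestamp taken from the observer's own clock

def observe(site, event, true_time, clock_offset):
    """An observer stamps the event with its local (possibly skewed) clock."""
    return Record(site, event, true_time + clock_offset)

# Hypothetical scenario: site A's clock runs 5 units ahead of site B's.
records = [
    observe("A", "send", true_time=10.0, clock_offset=5.0),     # stamped 15.0
    observe("B", "receive", true_time=12.0, clock_offset=0.0),  # stamped 12.0
]

# The collector merges by timestamp, trusting the observers' clocks.
merged = sorted(records, key=lambda r: r.local_time)
order = [r.event for r in merged]
print(order)
```

The merged log shows the receive before the send, even though the send actually happened first.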

The Analysis Layer
• In distributed system, may be more feasible to analyze on the fly
• Can sometimes dedicate one (or more) machines to analysis
• Often requires gathering all data to one point, though

Tools and Methods For Software Measurement
• OK, so how do I actually measure a piece of software?
• What are the practical tools and methods available to me?
• How do I get my damn project done?

Tools For Software Measurement
• Code instrumentation
• Tracing packages
• System-provided metrics and utilities
• Profiling

Code Instrumentation
• Adding monitoring code to the system under study
• Basically, just add the code that does what you want

Advantages of Code Instrumentation
+ Usually the most direct way to gather data
+ Complete flexibility of where to insert monitoring code
+ Strong control over costs of monitoring
+ Resulting measurements always available

Disadvantages of Instrumenting the Code
– Requires access to the source
– Requires strong knowledge of the design and many details of the code
– Requires recompilation to change monitoring facility
– If overdone, strong potential to affect performance
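As one possible illustration of code instrumentation, the sketch below adds a call counter to a function under study. The names `instrumented`, `call_counts`, and `handle_request` are hypothetical. In Python a decorator makes this convenient, but it is still monitoring code added to the source, and every instrumented call pays for the extra increment and function call.

```python
import functools
from collections import Counter

call_counts = Counter()  # monitoring state: calls per instrumented function

def instrumented(func):
    """Wrap func so that each call is counted before it runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[func.__name__] += 1  # the added monitoring code
        return func(*args, **kwargs)
    return wrapper

@instrumented
def handle_request(payload):
    # Stand-in for the real work of the system under study.
    return len(payload)

for p in ("a", "bb", "ccc"):
    handle_request(p)

print(call_counts["handle_request"])
```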

Typical Types of Instrumentation
• Counters and accumulators
  – Cheap and fast
  – But low level of detail
• Logs
  – More detail
  – But more costly
  – Require occasional dumping or digesting
• Timers
  – To determine elapsed time for operations
  – Typically using OS-provided system calls

Counters
• Useful if number of times an event occurs is of interest
• Can be used to accumulate totals
  – E.g., total bytes read by file system
• In modern systems, make them wide enough so they won’t overflow

Examples of Counters
• Count number of times a network protocol transmits packets
• Count number of times programs are swapped out due to exceeding their time slices
• Count number of incoming requests to a Web server

Logs
• Can log arbitrarily complex data about an event
• But more complex data takes more space
• Typically, log data into a reserved buffer
• When full, request for buffer to be written to disk
  – Often want a second buffer to gather data while awaiting disk write

Designing a Log Entry
• What form should a log entry take?
• Designing for compactness vs. human readability
  – Former better for most purposes
  – Latter useful for system debugging
  – Make sure no important information is lost in compacting the log entry

Timers
• Many OSs provide system calls that start and stop timers
• Allowing you to time how long things took
• Usually, only elapsed time measurable
  – Not necessarily time spent running particular process
• So care required to capture real meaning of timings
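The caution on the Timers slide, that elapsed time is not necessarily time spent running a particular process, can be seen by timing the same operation two ways. This is a sketch using Python's standard `time` module; the helper name `timed` and the 50 ms sleep are arbitrary choices for illustration. `time.perf_counter()` measures elapsed (wall-clock) time, while `time.process_time()` counts only CPU time consumed by the current process.

```python
import time

def timed(func, *args):
    """Run func and return (result, elapsed wall-clock seconds, CPU seconds)."""
    wall_start = time.perf_counter()   # elapsed-time clock
    cpu_start = time.process_time()    # CPU time of this process only
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu

# Sleeping takes elapsed time but almost no CPU time.
_, wall, cpu = timed(time.sleep, 0.05)
print(f"elapsed: {wall:.3f}s, cpu: {cpu:.3f}s")
```

The elapsed figure includes the whole sleep, while the CPU figure stays near zero, so which clock is read determines what the timing actually means.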
