************************************************************************** * The following are additional notes on the iosnoop program. * * $Id: iosnoop_notes.txt,v 1.1.1.1 2015/09/30 22:01:09 christos Exp $ * * COPYRIGHT: Copyright (c) 2007 Brendan Gregg. ************************************************************************** * What does the output represent? The output is disk events - I/O operations that cause the disk to physically read or write data. The output is not application I/O events which may be absorbed by memory caches - many of which will be. The output really is physical disk events. iosnoop uses probes from the "io" provider - which traces the block device driver before disk events happen. disk events. The stack goes like this, application | V syscall | V vfs | V ufs/zfs/... | V block device driver | V physical device driver | V disk Due to caching (after vfs) few events will make it to the disk for iosnoop to see. If you want to trace all I/O activity, try using syscall provider based scripts first. * What do the elapsed and delta times mean? Glad you asked! The times may *not* be as useful as they appear. I should also add that this quickly becomes a very complex topic, There are two different delta times reported. -D prints the elapsed time from the disk request (strategy) to the disk completion iodone); -o prints the time for the disk to complete that event since it's last event (time between iodones, or since idle->strategy). The elapsed time is equivalent to the response time from the application request to the application completion. The delta time resembles the service time for this request (resembles means it will be generally correct, but not 100% accurate). The service time is the the time for the disk to complete the request, after it has travelled through any bus or queue. buuuttt.... you need to think carefully about what these times mean before jumping to conclusions. For example, You troubleshoot an application by running iosnoop and filtering on your application's PID. You notice large times for the disk events (responce, service, for this example it doesn't matter). Does this mean there is a problem with that application? What could be happening is that a different application is also using the disks at the same time, and is causing the disk heads to seek to elsewhere on the disk surface - increasing both service and response time. hmmm! so you can't just look at one application, one set of numbers, and understand fully what is going on. But it gets worse. Disks implement "tagged queueing", where events in the queue are reshuffeled to promote "elevator seeking" of the disk heads (this reduces head seeking). So the time for a disk event can be effected not just by the previous event (and previous location the heads had seeked to), but the surrounding events that enter the queue. So the good and the bad. The good news is that iosnoop makes it easy to fetch disk event data on a live system, the bad news is that understanding all the data is not really easy. For further information on disk measurements see, "How do disks really work?" - Adrian Cockcroft, SunWorld Online, June 1996 "Sun Performance and Tuning" - Adrian Cockcroft, Richard Pettit "Solaris Internals" - Richard McDougall, Jim Mauro * The output appears shuffled? Read the answer to this in ALLsnoop_notes.txt. |