the “Track Shipment” task must complete in less than .5 second in at least 99.9 percent of executions.
Response time of X is more than 20 seconds
in many cases. We’ll be happy when response time is one second or less in at least 95 percent of executions.
The sequence diagram is a good tool for conceptualizing flow of control and the corresponding flow of time. To think clearly about response time, however, you need something else.
A profile shows where your code has spent your time and—sometimes even more importantly—where it has not. There is tremendous value in not having to guess about these things.
With a profile, you can begin to formulate the answer to the question, “How long should this task run?” which, by now, you know is an important question in the first step of any good problem diagnosis.
Performance improvement is proportional to how much a program uses the thing you improved. If the thing you’re trying to improve contributes only 5 percent to your task’s total response time, then the maximum impact you’ll be able to make is 5 percent of your total response time. This means that the closer to the top of a profile that you work (assuming that the profile is sorted in descending response-time order), the bigger the benefit potential for your overall response time.
when everyone is happy except for you, make sure your local stuff is in order before you go messing around with the global stuff that affects everyone else, too.
One measure of load is utilization, which is resource usage divided by resource capacity for a specified time interval.
There are two reasons that systems get slower as load increases: queuing delay and coherency delay.
Response time (R), in the perfect scalability M/M/m model, consists of two components: service time (S) and queuing delay (Q), or R = S + Q.
Your system doesn’t have theoretically perfect scalability. Coherency delay is the factor that you can use to model the imperfection. It is the duration that a task spends communicating and coordinating access to a shared resource.
The utilization value at which this optimal balance occurs is called the knee. This is the point at which throughput is maximized with minimal negative impact to response times.