# HG changeset patch
# User Some Random Person
# Date 1336149854 25200
# Node ID 980d375417a31656baa4ed31126060e516a2ba4b
# Parent a4e0504b60f69bf661247e26010a5f76b6b80fd9
Perf tuning -- push of related work minor fiddling

diff -r a4e0504b60f6 -r 980d375417a3 0__Papers/Holistic_Model/Perf_Tune/latex/Holistic_Perf_Tuning.tex
--- a/0__Papers/Holistic_Model/Perf_Tune/latex/Holistic_Perf_Tuning.tex	Fri May 04 08:21:46 2012 -0700
+++ b/0__Papers/Holistic_Model/Perf_Tune/latex/Holistic_Perf_Tuning.tex	Fri May 04 09:44:14 2012 -0700
@@ -352,6 +352,10 @@
 
 Paragraph also follows an event-based model, and represents the large collection of simpler tools that instrument the MPI library. It shows whether cores are busy, and indicates communication overhead, but lacks any features that tie the communication pattern realized to application code features, which are what is under programmer control. It also fails to show runtime overhead, and which portions of idle time are caused by runtime internal constraints.
 
+Paraver and Vampir are just painting tools that take event measurements and paint them on the screen.
+
+Other approaches concentrate on performance counter data to identify hot-spots and potential false sharing. These suffer from the same lack of an encompassing computation model, leaving the user to guess at what might be the cause of the measured numbers. They do a good job of saying that something might be wrong, but a poor job of pointing to what is causing the problem, and hence leave the user baffled as to what to change in their code to get better performance.
+
 The commonality among the classic approaches is the lack of a model of parallel computation. One difficulty faced by early tools is that parallel applications written in MPI or threads effectively end up implementing a runtime system in the application code. In such a case, the units of work are implied in the code, and difficult for tools to recognize. 
 Likewise, constraints on scheduling are enforced by the code, but never stated in any explicit form.