VMS/0__Writings/kshalle

changeset 68:7e903acb5f64

perf tuning -- small changes to authors and so forth
author Sean Halle <seanhalle@yahoo.com>
date Sun, 15 Jul 2012 00:16:04 -0700
parents 4618b5af3b04
children 0e9165cb2c52
files 0__Papers/Holistic_Model/Perf_Tune/latex/Holistic_Perf_Tuning.pdf 0__Papers/Holistic_Model/Perf_Tune/latex/Holistic_Perf_Tuning.tex
diffstat 2 files changed, 2 insertions(+), 2 deletions(-)
Binary file 0__Papers/Holistic_Model/Perf_Tune/latex/Holistic_Perf_Tuning.pdf has changed
--- a/0__Papers/Holistic_Model/Perf_Tune/latex/Holistic_Perf_Tuning.tex	Fri Jul 13 15:36:29 2012 -0700
+++ b/0__Papers/Holistic_Model/Perf_Tune/latex/Holistic_Perf_Tuning.tex	Sun Jul 15 00:16:04 2012 -0700
@@ -56,7 +56,7 @@
            {nengel@mailbox.tu-berlin.de}
 \authorinfo{Sean Halle}
            {Open Source Research Institute}
-           {Email1}
+           {seanhalle@OpenSourceResearchInstitute.org}
 \authorinfo{Ben Juurlink}
            {TU Berlin}
            {b.juurlink@tu-berlin.de}
@@ -541,7 +541,7 @@
 \subsubsection{Recording time, instructions, and cache misses }
  Just recording the units and connections between them is not enough. Because the SCG represents core usage, it also needs  the cycles spent on each activity, including internal runtime activities. The size of each interval of core usage is recorded and assigned to a  segment of a particular unit's life-line.
 
-The UCC also makes use  of the number of instructions in a unit, as an estimate of size of work in the unit, as illustrated by Fig [fig:UCC_expl]. Without knowing the relative size of the units, it is hard to estimate the amount of parallelism \emph{usefully} available in the application.
+The UCC also makes use  of the number of instructions in a unit, as an estimate of size of work in the unit, as illustrated by Fig \ref{fig:UCC_expl}. Without knowing the relative size of the units, it is hard to estimate the amount of parallelism \emph{usefully} available in the application.
 
 To measure the instructions, cycles, and communication (cache misses), we use hardware performance counters. Readings are inserted into the runtime code to capture core time spent on each segment of the life-line of a unit: 
 \begin{enumerate}