changeset 30:0d7177551bff perf_tuning_paper lots of cores version
optimization for large number of cores: dedicate core 0 to create/receive
| author | Nina Engelhardt <nengel@mailbox.tu-berlin.de> |
|---|---|
| date | Wed, 18 Apr 2012 15:50:14 +0200 |
| parents | c35cb1f48f89 |
| children | 3477d7443620 |
| files | SSR_Matrix_Mult/Divide_Pr.c |
| diffstat | 1 files changed, 2 insertions(+), 2 deletions(-) |
line diff

    --- a/SSR_Matrix_Mult/Divide_Pr.c Tue Apr 17 20:13:26 2012 +0200
    +++ b/SSR_Matrix_Mult/Divide_Pr.c Wed Apr 18 15:50:14 2012 +0200
    @@ -261,7 +261,7 @@
     idealNumWorkUnits = SSR__giveIdealNumWorkUnits();
     
     idealSizeOfSide2 = leftMatrix->numRows / rint(cbrt( idealNumWorkUnits ));
    -idealSizeOfSide2 *= 0.5; //finer granularity to help load balance
    +idealSizeOfSide2 *= 0.6; //finer granularity to help load balance
     
     if( idealSizeOfSide1 > idealSizeOfSide2 )
     idealSizeOfSide = idealSizeOfSide1;
    @@ -412,7 +412,7 @@
     
     //Move to next core, max core-value to incr to is numCores -1
     coreToAssignOnto += 1;
    -if( coreToAssignOnto >= numCores ) coreToAssignOnto = 0;
    +if( coreToAssignOnto >= numCores ) coreToAssignOnto = 1;
     } //if
     } //for( vecIdx
     } //for( resColIdx
