# HG changeset patch
# User Nina Engelhardt
# Date 1332324551 -3600
# Node ID b95711c6965ce5725273c6edbc51642fcbb050ea
# Parent ce1f57e10face23de0653b3cb40861badb99e50d
counters work now

diff -r ce1f57e10fac -r b95711c6965c AnimationMaster.c
--- a/AnimationMaster.c Mon Mar 19 10:03:45 2012 -0700
+++ b/AnimationMaster.c Wed Mar 21 11:09:11 2012 +0100
@@ -130,7 +130,7 @@
    RequestHandler requestHandler;
    void *semanticEnv;
    int32 thisCoresIdx;
-   
+
    //======================== Initializations ========================
    masterEnv = (MasterEnv*)_VMSMasterEnv;
 
@@ -140,7 +140,8 @@
    requestHandler = masterEnv->requestHandler;
    slaveAssigner = masterEnv->slaveAssigner;
    semanticEnv = masterEnv->semanticEnv;
-   
+
+   HOLISTIC__Insert_Master_Global_Vars;
    //======================== animationMaster ========================
    while(1){
 
@@ -158,17 +159,20 @@
           {
             currSlot->workIsDone = FALSE;
             currSlot->needsSlaveAssigned = TRUE;
-            
+
+            HOLISTIC__Record_AppResponder_start;
             MEAS__startReqHdlr;
                //process the requests made by the slave (held inside slave struc)
             (*requestHandler)( currSlot->slaveAssignedToSlot, semanticEnv );
 
+            HOLISTIC__Record_AppResponder_end;
             MEAS__endReqHdlr;
           }
 
             //If slot empty, hand to Assigner to fill with a slave
          if( currSlot->needsSlaveAssigned )
           { //Call plugin's Assigner to give slot a new slave
+            HOLISTIC__Record_Assigner_start;
             assignedSlaveVP =
              (*slaveAssigner)( semanticEnv, currSlot );
 
@@ -178,6 +182,8 @@
                assignedSlaveVP->animSlotAssignedTo = currSlot;
                currSlot->needsSlaveAssigned = FALSE;
                numSlotsFilled += 1;
+
+               HOLISTIC__Record_Assigner_end;
              }
           }
        }
diff -r ce1f57e10fac -r b95711c6965c CoreController.c
--- a/CoreController.c Mon Mar 19 10:03:45 2012 -0700
+++ b/CoreController.c Wed Mar 21 11:09:11 2012 +0100
@@ -77,7 +77,7 @@
    volatile int32 *addrOfMasterLock; //thing pointed to is volatile, not ptr
    SlaveVP *thisCoresMasterVP;
       //Variables used for pthread related things
-   ThdParams *coreCtlrThdParams;
+   ThdParams *thisCoresThdParams;
    cpu_set_t coreMask; //used during pinning pthread to CPU core
    int32 errorCode;
       //Variables used during measurements
@@ -88,8 +88,8 @@
 
 
    //=============== Initializations ===================
-   coreCtlrThdParams = (ThdParams *)paramsIn;
-   thisCoresIdx = coreCtlrThdParams->coreNum;
+   thisCoresThdParams = (ThdParams *)paramsIn;
+   thisCoresIdx = thisCoresThdParams->coreNum;
 
       //Assembly that saves addr of label of return instr -- label in assmbly
    recordCoreCtlrReturnLabelAddr((void**)&(_VMSMasterEnv->coreCtlrReturnPt));
@@ -105,7 +105,7 @@
       //Linux requires pinning to be done inside the thread-function
       //Designate a core by a 1 in bit-position corresponding to the core
    CPU_ZERO(&coreMask); //initialize mask bits to zero
-   CPU_SET(coreCtlrThdParams->coreNum,&coreMask); //set bit repr the coreNum
+   CPU_SET(thisCoresThdParams->coreNum,&coreMask); //set bit repr the coreNum
    pthread_t selfThd = pthread_self();
    errorCode =
    pthread_setaffinity_np( selfThd, sizeof(coreMask), &coreMask);
@@ -118,8 +118,10 @@
     }
    pthread_mutex_unlock( &suspendLock );
 
+   HOLISTIC__CoreCtrl_Setup;
+
    DEBUG__printf1(TRUE, "started coreCtrlr", thisCoresIdx );
-   
+
    //====================== The Core Controller ======================
    while(1) //An endless loop is just one way of doing the control structure
     { //Assembly code switches the core between animating a VP and
@@ -141,6 +143,7 @@
        { numRepetitionsWithNoWork = 0; //reset back2back master count
          currSlotIdx ++;
          currVP = currSlot->slaveAssignedToSlot;
+         HOLISTIC__Record_last_work;
        }
       else //slot is empty, so switch to master
        {
@@ -149,6 +152,7 @@
          currVP = NULL;
 
          MEAS__Capture_Pre_Master_Lock_Point;
+         HOLISTIC__Record_AppResponderInvocation_start;
          int numTriesToGetLock = 0;
          int gotLock = 0;
          while( currVP == NULL ) //keep going until get master lock
@@ -189,10 +193,13 @@
          MEAS__Capture_Post_Master_Lock_Point;
        }
 
+      HOLISTIC__Record_Work_start;
 
       switchToSlv(currVP); //Slave suspend makes core "return" from this call
       flushRegisters(); //prevent GCC optimization from doing bad things
 
+      HOLISTIC__Record_Work_end;
+
       MEAS__Capture_End_Susp_in_CoreCtlr_ForSys;
 
     }//while(1)
diff -r ce1f57e10fac -r b95711c6965c Hardware_Dependent/VMS__HW_measurement.c
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Hardware_Dependent/VMS__HW_measurement.c Wed Mar 21 11:09:11 2012 +0100
@@ -0,0 +1,74 @@
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "../VMS.h"
+
+void setup_perf_counters(){
+#ifdef HOLISTIC__TURN_ON_PERF_COUNTERS
+    struct perf_event_attr hw_event;
+    memset(&hw_event,0,sizeof(hw_event));
+    hw_event.type = PERF_TYPE_HARDWARE;
+    hw_event.size = sizeof(hw_event);
+    hw_event.disabled = 1;
+    hw_event.freq = 0;
+    hw_event.inherit = 1; /* children inherit it */
+    hw_event.pinned = 1; /* must always be on PMU */
+    hw_event.exclusive = 0; /* only group on PMU */
+    hw_event.exclude_user = 0; /* don't count user */
+    hw_event.exclude_kernel = 0; /* ditto kernel */
+    hw_event.exclude_hv = 0; /* ditto hypervisor */
+    hw_event.exclude_idle = 0; /* don't count when idle */
+    hw_event.mmap = 0; /* include mmap data */
+    hw_event.comm = 0; /* include comm data */
+
+    int coreIdx;
+    for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ )
+    {
+        hw_event.config = 0x0000000000000000; //cycles
+        _VMSMasterEnv->cycles_counter_fd[coreIdx] = syscall(__NR_perf_event_open, &hw_event,
+            0,//pid_t pid,
+            coreIdx,//int cpu,
+            -1,//int group_fd,
+            0//unsigned long flags
+        );
+        if (_VMSMasterEnv->cycles_counter_fd[coreIdx]<0){
+            fprintf(stderr,"On core %d: ",coreIdx);
+            perror("Failed to open cycles counter");
+        }
+        hw_event.config = 0x0000000000000001; //instrs
+        _VMSMasterEnv->instrs_counter_fd[coreIdx] = syscall(__NR_perf_event_open, &hw_event,
+            0,//pid_t pid,
+            coreIdx,//int cpu,
+            -1,//int group_fd,
+            0//unsigned long flags
+        );
+        if (_VMSMasterEnv->instrs_counter_fd[coreIdx]<0){
+            fprintf(stderr,"On core %d: ",coreIdx);
+            perror("Failed to open instrs counter");
+        }
+    }
+
+    prctl(PR_TASK_PERF_EVENTS_ENABLE);
+#endif
+}
+
+__inline__ uint64_t rdtsc(){
+    uint32_t lo, hi;
+    __asm__ __volatile__ ( // serialize
+        "xorl %%eax,%%eax \n cpuid"
+        ::: "%rax", "%rbx", "%rcx", "%rdx");
+    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
+    /* asm volatile("RDTSC;"
+        "movl %%eax, %0;"
+        "movl %%edx, %1;"
+        : "=m" (lo), "=m" (hi)
+        :
+        : "%eax", "%edx"
+    ); */
+    return (uint64_t)hi << 32 | lo;
+}
\ No newline at end of file
diff -r ce1f57e10fac -r b95711c6965c Hardware_Dependent/VMS__HW_measurement.h
--- a/Hardware_Dependent/VMS__HW_measurement.h Mon Mar 19 10:03:45 2012 -0700
+++ b/Hardware_Dependent/VMS__HW_measurement.h Wed Mar 21 11:09:11 2012 +0100
@@ -58,5 +58,6 @@
 //#define NUM_TSC_ROUND_TRIPS 10
 
 void setup_perf_counters();
+uint64_t rdtsc(void);
 
 #endif /* */
diff -r ce1f57e10fac -r b95711c6965c Services_Offered_by_VMS/Measurement_and_Stats/MEAS__macros.h
--- a/Services_Offered_by_VMS/Measurement_and_Stats/MEAS__macros.h Mon Mar 19 10:03:45 2012 -0700
+++ b/Services_Offered_by_VMS/Measurement_and_Stats/MEAS__macros.h Wed Mar 21 11:09:11 2012 +0100
@@ -313,6 +313,23 @@
        Timestamp_end
    };
 
+   #define saveCyclesAndInstrs(core,cycles,instrs) do{ \
+      int cycles_fd = _VMSMasterEnv->cycles_counter_fd[core]; \
+      int instrs_fd = _VMSMasterEnv->instrs_counter_fd[core]; \
+      int nread; \
+      \
+      nread = read(cycles_fd,&(cycles),sizeof(cycles)); \
+      if(nread<0){ \
+         perror("Error reading cycles counter"); \
+         cycles = 0; \
+      } \
+      \
+      nread = read(instrs_fd,&(instrs),sizeof(instrs)); \
+      if(nread<0){ \
+         perror("Error reading cycles counter"); \
+         instrs = 0; \
+      } \
+   } while (0)
 
    #define MEAS__Insert_Counter_Meas_Fields_into_MasterEnv \
      int cycles_counter_fd[NUM_CORES]; \
@@ -320,22 +337,130 @@
      uint64 start_master_lock[NUM_CORES][2]; \
      CounterHandler counterHandler;
 
-   #define HOLISTIC__Setup_Perf_Counters void setup_perf_counters();
+   #define HOLISTIC__Setup_Perf_Counters setup_perf_counters();
 
-   #define HOLISTIC__Start_Perf_Counters prctl(PR_TASK_PERF_EVENTS_ENABLE);
+
+   #define HOLISTIC__CoreCtrl_Setup \
+      CounterHandler counterHandler = _VMSMasterEnv->counterHandler; \
+      SlaveVP *lastVPBeforeMaster = NULL; \
+      /*if(thisCoresThdParams->coreNum == 0){ \
+         uint64 initval = tsc_offset_send(thisCoresThdParams,0); \
+         while(!coreCtlrThdParams[NUM_CORES - 2]->ret_tsc); \
+      } \
+      if(0 < (thisCoresThdParams->coreNum) && (thisCoresThdParams->coreNum) < (NUM_CORES - 1)){ \
+         ThdParams* sendCoresThdParams = coreCtlrThdParams[thisCoresThdParams->coreNum - 1]; \
+         int sndctr = tsc_offset_resp(sendCoresThdParams, 0); \
+         uint64 initval = tsc_offset_send(thisCoresThdParams,0); \
+         while(!coreCtlrThdParams[NUM_CORES - 2]->ret_tsc); \
+      } \
+      if(thisCoresThdParams->coreNum == (NUM_CORES - 1)){ \
+         ThdParams* sendCoresThdParams = coreCtlrThdParams[thisCoresThdParams->coreNum - 1]; \
+         int sndctr = tsc_offset_resp(sendCoresThdParams,0); \
+      }*/
+
+#define HOLISTIC__Record_last_work lastVPBeforeMaster = currVP;
+
+   #define HOLISTIC__Insert_Master_Global_Vars \
+      int vpid,task; \
+      CounterHandler counterHandler = masterEnv->counterHandler;
+
+   #define HOLISTIC__Record_AppResponderInvocation_start \
+      uint64 cycles,instrs; \
+      saveCyclesAndInstrs(thisCoresIdx,cycles, instrs); \
+      if(lastVPBeforeMaster){ \
+         (*counterHandler)(AppResponderInvocation_start,lastVPBeforeMaster->slaveID,lastVPBeforeMaster->assignCount,lastVPBeforeMaster,cycles,instrs); \
+         lastVPBeforeMaster = NULL; \
+      } else { \
+         _VMSMasterEnv->start_master_lock[thisCoresIdx][0] = cycles; \
+         _VMSMasterEnv->start_master_lock[thisCoresIdx][1] = instrs; \
+      }
+
+   /* Request Handler may call resume() on the VP, but we want to
+    * account the whole interval to the same task. Therefore, need
+    * to save task ID at the beginning.
+    *
+    * Using this value as "end of AppResponder Invocation Time"
+    * is possible if there is only one SchedSlot per core -
+    * invoking processor is last to be treated here! If more than
+    * one slot, MasterLoop processing time for all but the last VP
+    * would be erroneously counted as invocation time.
+    */
+   #define HOLISTIC__Record_AppResponder_start \
+      vpid = currSlot->slaveAssignedToSlot->slaveID; \
+      task = currSlot->slaveAssignedToSlot->assignCount; \
+      uint64 cycles, instrs; \
+      saveCyclesAndInstrs(thisCoresIdx,cycles, instrs); \
+      (*counterHandler)(AppResponder_start,vpid,task,currSlot->slaveAssignedToSlot,cycles,instrs);
+
+   #define HOLISTIC__Record_AppResponder_end \
+      uint64 cycles2,instrs2; \
+      saveCyclesAndInstrs(thisCoresIdx,cycles2, instrs2); \
+      (*counterHandler)(AppResponder_end,vpid,task,currSlot->slaveAssignedToSlot,cycles2,instrs2); \
+      (*counterHandler)(Timestamp_end,vpid,task,currSlot->slaveAssignedToSlot,rdtsc(),0);
+
+
+   /* Don't know who to account time to yet - goes to assigned VP
+    * after the call.
+    */
+   #define HOLISTIC__Record_Assigner_start \
+      int empty = FALSE; \
+      if(currSlot->slaveAssignedToSlot == NULL){ \
+         empty= TRUE; \
+      } \
+      uint64 tmp_cycles; \
+      uint64 tmp_instrs; \
+      saveCyclesAndInstrs(thisCoresIdx,tmp_cycles,tmp_instrs); \
+      uint64 tsc = rdtsc(); \
+      if(vpid > 0) { \
+         (*counterHandler)(NextAssigner_start,vpid,task,currSlot->slaveAssignedToSlot,tmp_cycles,tmp_instrs); \
+         vpid = 0; \
+         task = 0; \
+      }
+
+   #define HOLISTIC__Record_Assigner_end \
+      uint64 cycles; \
+      uint64 instrs; \
+      saveCyclesAndInstrs(thisCoresIdx,cycles,instrs); \
+      if(empty){ \
+         (*counterHandler)(AssignerInvocation_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,masterEnv->start_master_lock[thisCoresIdx][0],masterEnv->start_master_lock[thisCoresIdx][1]); \
+      } \
+      (*counterHandler)(Timestamp_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,tsc,0); \
+      (*counterHandler)(Assigner_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,tmp_cycles,tmp_instrs); \
+      (*counterHandler)(Assigner_end,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,cycles,instrs);
+
+   #define HOLISTIC__Record_Work_start \
+      if(currVP){ \
+         uint64 cycles,instrs; \
+         saveCyclesAndInstrs(thisCoresIdx,cycles, instrs); \
+         (*counterHandler)(Work_start,currVP->slaveID,currVP->assignCount,currVP,cycles,instrs); \
+      }
+
+   #define HOLISTIC__Record_Work_end \
+      if(currVP){ \
+         uint64 cycles,instrs; \
+         saveCyclesAndInstrs(thisCoresIdx,cycles, instrs); \
+         (*counterHandler)(Work_end,currVP->slaveID,currVP->assignCount,currVP,cycles,instrs); \
+      }
 
 
    #define HOLISTIC__Record_HwResponderInvocation_start \
      uint64 cycles,instrs; \
-     saveCyclesAndInstrs(animatingPr->coreAnimatedBy,cycles, instrs); \
-     (*(_VMSMasterEnv->counterHandler))(HwResponderInvocation_start,animatingPr->procrID,animatingPr->numTimesScheduled,animatingPr,cycles,instrs);
+     saveCyclesAndInstrs(animatingSlv->coreAnimatedBy,cycles, instrs); \
+     (*(_VMSMasterEnv->counterHandler))(HwResponderInvocation_start,animatingSlv->slaveID,animatingSlv->assignCount,animatingSlv,cycles,instrs);
+
-   
-   
+
 #else
    #define MEAS__Insert_Counter_Handler
    #define MEAS__Insert_Counter_Meas_Fields_into_MasterEnv
    #define HOLISTIC__Setup_Perf_Counters
-   #define HOLISTIC__Start_Perf_Counters
+   #define HOLISTIC__Record_AppResponderInvocation_start
+   #define HOLISTIC__Record_AppResponder_start
+   #define HOLISTIC__Record_AppResponder_end
+   #define HOLISTIC__Record_Assigner_start
+   #define HOLISTIC__Record_Assigner_end
+   #define HOLISTIC__Record_Work_start
+   #define HOLISTIC__Record_Work_end
+   #define HOLISTIC__Record_HwResponderInvocation_start
 #endif
 
 //Experiment in two-step macros -- if doesn't work, insert each separately
diff -r ce1f57e10fac -r b95711c6965c VMS__int.c
--- a/VMS__int.c Mon Mar 19 10:03:45 2012 -0700
+++ b/VMS__int.c Wed Mar 21 11:09:11 2012 +0100
@@ -81,6 +81,7 @@
       //return ownership of the Slv and anim slot to Master virt pr
    animatingSlv->animSlotAssignedTo->workIsDone = TRUE;
 
+   HOLISTIC__Record_HwResponderInvocation_start;
    MEAS__Capture_Pre_Susp_Point;
    switchToCoreCtlr(animatingSlv);
    flushRegisters();
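
Review note on the counter-reading code in this changeset. The saveCyclesAndInstrs macro in MEAS__macros.h reads each perf counter fd with read() and zeroes the value when read() returns a negative result. Its second perror() reports "cycles counter" although it is reading the instrs counter, and a short read is not checked. The sketch below is not part of the changeset. It shows one way the same read-with-fallback step might be written as a function, assuming a counter is delivered as a single 64-bit value. The names read_counter and combine_hi_lo are hypothetical and exist only for this illustration. combine_hi_lo is the return expression of rdtsc() with explicit parentheses, which yields the same value because << binds tighter than |.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not in the changeset: read one 64-bit counter value
 * from fd. Returns 0 on success. On a read error or a short read it prints
 * a message that names the counter, sets *value to 0 and returns -1. */
int read_counter(int fd, const char *what, uint64_t *value)
{
    assert(value != NULL);
    ssize_t nread = read(fd, value, sizeof *value);
    if (nread != (ssize_t)sizeof *value) {
        if (nread < 0)
            perror(what);                 /* errno is set by read() */
        else
            fprintf(stderr, "%s: short read\n", what);
        *value = 0;                       /* same fallback as the macro */
        return -1;
    }
    return 0;
}

/* Hypothetical helper: join the two 32-bit halves returned by the rdtsc
 * instruction (EDX = high half, EAX = low half) into one 64-bit value. */
uint64_t combine_hi_lo(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

A call site could then read, for example, read_counter(cycles_fd, "Error reading cycles counter", &cycles), so each message names the counter that actually failed. Whether a function call is acceptable on these measurement paths is a judgment for the authors, since the macro form avoids call overhead.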