Mercurial > cgi-bin > hgwebdir.cgi > VMS > VMS_Implementations > VMS_impls > VMS__MC_shared_impl
changeset 260:999f2966a3e5 Dev_ML
new branch -- Dev_ML -- for making VMS take langlets whose constructs can be mixed
| author | Sean Halle <seanhalle@yahoo.com> |
|---|---|
| date | Wed, 19 Sep 2012 23:12:44 -0700 |
| parents | 0dc0b8653902 |
| children | dafae55597ce |
| files | AnimationMaster.c CoreController.c Defines/MEAS__macros_to_be_moved_to_langs.h Defines/PR_defs.h Defines/PR_defs__HW_constants.h Defines/VMS_defs.h Defines/VMS_defs__HW_constants.h HW_Dependent_Primitives/PR__HW_measurement.c HW_Dependent_Primitives/PR__HW_measurement.h HW_Dependent_Primitives/PR__primitives.c HW_Dependent_Primitives/PR__primitives.h HW_Dependent_Primitives/PR__primitives_asm.s HW_Dependent_Primitives/VMS__HW_measurement.c HW_Dependent_Primitives/VMS__HW_measurement.h HW_Dependent_Primitives/VMS__primitives.c HW_Dependent_Primitives/VMS__primitives.h HW_Dependent_Primitives/VMS__primitives_asm.s PR.h PR__PI.c PR__WL.c PR__int.c PR__startup_and_shutdown.c PR_primitive_data_types.h Services_Offered_by_PR/Measurement_and_Stats/MEAS__macros.h Services_Offered_by_PR/Measurement_and_Stats/probes.c Services_Offered_by_PR/Measurement_and_Stats/probes.h Services_Offered_by_PR/Memory_Handling/vmalloc.c Services_Offered_by_PR/Memory_Handling/vmalloc.h Services_Offered_by_VMS/Debugging/DEBUG__macros.h Services_Offered_by_VMS/Lang_Constructs/VMS_Lang.h Services_Offered_by_VMS/Measurement_and_Stats/MEAS__macros.h Services_Offered_by_VMS/Measurement_and_Stats/probes.c Services_Offered_by_VMS/Measurement_and_Stats/probes.h Services_Offered_by_VMS/Memory_Handling/vmalloc.c Services_Offered_by_VMS/Memory_Handling/vmalloc.h VMS.h VMS__PI.c VMS__WL.c VMS__int.c VMS__startup_and_shutdown.c VMS_primitive_data_types.h __README__Code_Overview.txt |
| diffstat | 42 files changed, 4729 insertions(+), 3923 deletions(-) |
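The comments this changeset adds to AnimationMaster.c settle on giving each language its own semantic environment and assigner, with the master taking work from the first environment that reports any. The following is a minimal, self-contained sketch of that search only. It is not code from the repository: `Work`, `SemEnv`, `take_next`, and `assign_from_first_env_with_work` are hypothetical stand-ins chosen for illustration, and the real `assignWork` in the diff also handles slot slaves, task stubs, and shutdown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a unit of work (a task or a resumable slave). */
typedef struct Work { int id; } Work;

typedef struct SemEnv SemEnv;
typedef Work *(*Assigner)(SemEnv *env);

/* Hypothetical stand-in for one language's semantic environment. */
struct SemEnv {
    int      hasWork;   /* nonzero when this language has ready work   */
    Assigner assigner;  /* language-supplied function that hands it out */
    Work    *next;      /* the single pending unit, to keep this small  */
};

/* Toy assigner: hands out the pending unit and marks the env empty. */
Work *take_next(SemEnv *env)
{
    Work *w = env->next;
    env->next = NULL;
    env->hasWork = 0;
    return w;
}

/* Scan the environments in order and ask the first one that has work
 * to supply a unit. Returns NULL when no environment has work. */
Work *assign_from_first_env_with_work(SemEnv **envs, int numEnvs)
{
    for (int i = 0; i < numEnvs; i++) {
        SemEnv *env = envs[i];
        if (env->hasWork)
            return env->assigner(env);
    }
    return NULL;
}
```

Because each assigner sees only its own environment, the same assigner can serve single-language and multi-language modes unchanged. The cost is the extra scan in the master, and assigners from different languages cannot coordinate with each other.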
line diff
--- a/AnimationMaster.c Mon Sep 03 03:34:54 2012 -0700
+++ b/AnimationMaster.c Wed Sep 19 23:12:44 2012 -0700
@@ -9,7 +9,7 @@
 #include <stdio.h>
 #include <stddef.h>
 
-#include "VMS.h"
+#include "PR.h"
 
 
 
@@ -20,11 +20,39 @@
  *
  *Within the code, this is the top-level-function of the masterVPs, and
  * runs when the coreController has no more slave VPs. It's job is to
- * refill the animation slots with slaves.
+ * refill the animation slots with slaves that have work.
  *
- *To do this, it scans the animation slots for just-completed slaves.
- * Each of these has a request in it. So, the master hands each to the
- * plugin's request handler.
+ *There are multiple versions of the master, each tuned to a specific
+ * combination of modes. This keeps the master simple, with reduced overhead,
+ * when the application is not using the extra complexity.
+ *
+ *As of Sept 2012, the versions available will be:
+ * 1) Single langauge, which only exposes slaves (such as SSR or Vthread)
+ * 2) Single language, which only exposes tasks (such as pure dataflow)
+ * 3) Single language, which exposes both (like Cilk, StarSs, and OpenMP)
+ * 4) Multi-language, which always assumes both tasks and slaves
+ * 5) Multi-language and multi-process, which also assumes both tasks and slaves
+ *
+ *
+ *
+ */
+
+
+//===================== The versions of the Animation Master =================
+//
+//==============================================================================
+
+/* 1) This version is for a single language, that has only slaves, no tasks,
+ * such as Vthread or SSR.
+ *This version is for when an application has only a single language, and
+ * that language exposes slaves explicitly (as opposed to a task based
+ * language like pure dataflow).
+ *
+ *
+ *It scans the animation slots for just-completed slaves.
+ * Each completed slave has a request in it. So, the master hands each to
+ * the plugin's request handler (there is only one plugin, because only one
+ * lang).
  *Each request represents a language construct that has been encountered
  * by the application code in the slave. Passing the request to the
  * request handler is how that language construct's behavior gets invoked.
@@ -77,24 +105,24 @@
  *There is a separate masterVP for each core, but a single semantic
  * environment shared by all cores. Each core also has its own scheduling
  * slots, which are used to communicate slaves between animationMaster and
- * coreController. There is only one global variable, _VMSMasterEnv, which
+ * coreController. There is only one global variable, _PRMasterEnv, which
  * holds the semantic env and other things shared by the different
  * masterVPs. The request handler and Assigner are registered with
  * the animationMaster by the language's init function, and a pointer to
- * each is in the _VMSMasterEnv. (There are also some pthread related global
- * vars, but they're only used during init of VMS).
- *VMS gains control over the cores by essentially "turning off" the OS's
+ * each is in the _PRMasterEnv. (There are also some pthread related global
+ * vars, but they're only used during init of PR).
+ *PR gains control over the cores by essentially "turning off" the OS's
  * scheduler, using pthread pin-to-core commands.
  *
  *The masterVPs are created during init, with this animationMaster as their
  * top level function. The masterVPs use the same SlaveVP data structure,
  * even though they're not slave VPs.
  *A "seed slave" is also created during init -- this is equivalent to the
- * "main" function in C, and acts as the entry-point to the VMS-language-
+ * "main" function in C, and acts as the entry-point to the PR-language-
  * based application.
- *The masterVPs shared a single system-wide master-lock, so only one
+ *The masterVPs share a single system-wide master-lock, so only one
  * masterVP may be animated at a time.
- *The core controllers access _VMSMasterEnv to get the masterVP, and when
+ *The core controllers access _PRMasterEnv to get the masterVP, and when
  * they start, the slots are all empty, so they run their associated core's
  * masterVP. The first of those to get the master lock sees the seed slave
  * in the shared semantic environment, so when it runs the Assigner, that
@@ -104,14 +132,14 @@
  * constructs to create more slaves, and so on. Each of those constructs
  * causes the seed slave to suspend, switching over to the core controller,
  * which eventually switches to the masterVP, which executes the
- * request handler, which uses VMS primitives to carry out the creation of
+ * request handler, which uses PR primitives to carry out the creation of
  * new slave VPs, which are marked as ready for the Assigner, and so on..
  *
  *On animation slots, and system behavior:
- * A request may linger in a animation slot for a long time while
+ * A request may linger in an animation slot for a long time while
  * the slaves in the other slots are animated. This only becomes a problem
  * when such a request is a choke-point in the constraints, and is needed
- * to free work for *other* cores. To reduce this occurance, the number
+ * to free work for *other* cores. To reduce this occurrence, the number
  * of animation slots should be kept low. In balance, having multiple
  * animation slots amortizes the overhead of switching to the masterVP and
  * executing the animationMaster code, which drives for more than one. In
@@ -163,7 +191,29 @@
             HOLISTIC__Record_AppResponder_start;
             MEAS__startReqHdlr;
 
-               //process the requests made by the slave (held inside slave struc)
+            currSlot->workIsDone = FALSE;
+            currSlot->needsSlaveAssigned = TRUE;
+            SlaveVP *currSlave = currSlot->slaveAssignedToSlot;
+
+            justAddedReqHdlrChg();
+               //handle the request, either by VMS or by the language
+            if( currSlave->requests->reqType != LangReq )
+             { //The request is a standard VMS one, not one defined by the
+               // language, so VMS handles it, then queues slave to be assigned
+               handleReqInVMS( currSlave );
+               writePrivQ( currSlave, VMSReadyQ ); //Q slave to be assigned below
+             }
+            else
+             { MEAS__startReqHdlr;
+
+               //Language handles request, which is held inside slave struc
+               (*requestHandler)( currSlave, semanticEnv );
+
+               MEAS__endReqHdlr;
+             }
+          }
+
+               //process the requests made by the slave (held inside slave struc)
             (*requestHandler)( currSlot->slaveAssignedToSlot, semanticEnv );
 
             HOLISTIC__Record_AppResponder_end;
@@ -196,3 +246,756 @@
    }//while(1)
  }
 
+
+/* 2) This version is for a single language that has only tasks, which
+ * cannot be suspended.
+ */
+void animationMaster( void *initData, SlaveVP *masterVP )
+ {
+      //Used while scanning and filling animation slots
+   int32 slotIdx, numSlotsFilled;
+   AnimSlot *currSlot, **animSlots;
+   SlaveVP *assignedSlaveVP; //the slave chosen by the assigner
+
+      //Local copies, for performance
+   MasterEnv *masterEnv;
+   SlaveAssigner slaveAssigner;
+   RequestHandler requestHandler;
+   PRSemEnv *semanticEnv;
+   int32 thisCoresIdx;
+
+   //#ifdef MODE__MULTI_LANG
+   SlaveVP *slave;
+   PRProcess *process;
+   PRConstrEnvHolder *constrEnvHolder;
+   int32 langMagicNumber;
+   //#endif
+
+   //======================== Initializations ========================
+   masterEnv = (MasterEnv*)_PRMasterEnv;
+
+   thisCoresIdx = masterVP->coreAnimatedBy;
+   animSlots = masterEnv->allAnimSlots[thisCoresIdx];
+
+   requestHandler = masterEnv->requestHandler;
+   slaveAssigner = masterEnv->slaveAssigner;
+   semanticEnv = masterEnv->semanticEnv;
+
+      //initialize, for non-multi-lang, non multi-proc case
+      // default handler gets put into master env by a registration call by lang
+   endTaskHandler = masterEnv->defaultTaskHandler;
+
+   HOLISTIC__Insert_Master_Global_Vars;
+
+   //======================== animationMaster ========================
+   //Do loop gets requests handled and work assigned to slots..
+   // work can either be a task or a resumed slave
+   //Having two cases makes this logic complex.. can be finishing either, and
+   // then the next available work may be either.. so really have two distinct
+   // loops that are inter-twined..
+   while(1){
+
+      MEAS__Capture_Pre_Master_Point
+
+      //Scan the animation slots
+      numSlotsFilled = 0;
+      for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
+       {
+         currSlot = animSlots[ slotIdx ];
+
+         //Check if newly-done slave in slot, which will need request handled
+         if( currSlot->workIsDone )
+          { currSlot->workIsDone = FALSE;
+
+            HOLISTIC__Record_AppResponder_start; //TODO: update to check which process for each slot
+            MEAS__startReqHdlr;
+
+
+            //process the request made by the slave (held inside slave struc)
+            slave = currSlot->slaveAssignedToSlot;
+
+            //check if the completed work was a task..
+            if( slave->taskMetaInfo->isATask )
+             {
+               if( slave->reqst->type == TaskEnd )
+                { //do task end handler, which is registered separately
+                  //note, end hdlr may use semantic data from reqst..
+                  //#ifdef MODE__MULTI_LANG
+                  //get end-task handler
+                  //taskEndHandler = lookup( slave->reqst->langMagicNumber, processEnv );
+                  taskEndHandler = slave->taskMetaInfo->endTaskHandler;
+                  //#endif
+                  (*taskEndHandler)( slave, semanticEnv );
+
+                  goto AssignWork;
+                }
+               else //is a task, and just suspended
+                { //turn slot slave into free task slave & make replacement
+                  if( slave->typeOfVP == TaskSlotSlv ) changeSlvType();
+
+                  //goto normal slave request handling
+                  goto SlaveReqHandling;
+                }
+             }
+            else //is a slave that suspended
+             {
+               SlaveReqHandling:
+               (*requestHandler)( slave, semanticEnv ); //(note: indirect Fn call more efficient when use fewer params, instead re-fetch from slave)
+
+               HOLISTIC__Record_AppResponder_end;
+               MEAS__endReqHdlr;
+
+               goto AssignWork;
+             }
+          } //if has suspended slave that needs handling
+
+         //if slot empty, hand to Assigner to fill with a slave
+         if( currSlot->needsSlaveAssigned )
+          { //Call plugin's Assigner to give slot a new slave
+            HOLISTIC__Record_Assigner_start;
+
+            AssignWork:
+
+            assignedSlaveVP = assignWork( semanticEnv, currSlot );
+
+            //put the chosen slave into slot, and adjust flags and state
+            if( assignedSlaveVP != NULL )
+             { currSlot->slaveAssignedToSlot = assignedSlaveVP;
+               assignedSlaveVP->animSlotAssignedTo = currSlot;
+               currSlot->needsSlaveAssigned = FALSE;
+               numSlotsFilled += 1;
+             }
+            else
+             {
+               currSlot->needsSlaveAssigned = TRUE; //local write
+             }
+            HOLISTIC__Record_Assigner_end;
+          }//if slot needs slave assigned
+       }//for( slotIdx..
+
+      MEAS__Capture_Post_Master_Point;
+
+      masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
+      flushRegisters();
+   }//while(1)
+ }
+
+
+/*This is the master when just multi-lang, but not multi-process mode is on.
+ * This version has to handle both tasks and slaves, and do extra work of
+ * looking up the semantic env and handlers to use, for each completed bit of
+ * work.
+ *It also has to search through the semantic envs to find one with work,
+ * then ask that env's assigner to return a unit of that work.
+ *
+ *The language is written to startup in the same way as if it were the only
+ * language in the app, and it operates in the same way,
+ * the only difference between single language and multi-lang is here, in the
+ * master.
+ *This invisibility to mode is why the language has to use registration calls
+ * for everything during startup -- those calls do different things depending
+ * on whether it's single-language or multi-language mode.
+ *
+ *In this version of the master, work can either be a task or a resumed slave
+ *Having two cases makes this logic complex.. can be finishing either, and
+ * then the next available work may be either.. so really have two distinct
+ * loops that are inter-twined..
+ *
+ *Some special cases:
+ * A task-end is a special case for a few reasons (below).
+ * A task-end can't block a slave (can't cause it to "logically suspend")
+ * A task available for work can only be assigned to a special slave, which
+ * has been set aside for doing tasks, one such task-slave is always
+ * assigned to each slot. So, when a task ends, a new task is assigned to
+ * that slot's task-slave right away.
+ * But if no tasks are available, then have to switch over to looking at
+ * slaves to find one ready to resume, to find work for the slot.
+ * If a task just suspends, not ends, then its task-slave is no longer
+ * available to take new tasks, so a new task-slave has to be assigned to
+ * that slot. Then the slave of the suspended task is turned into a free
+ * task-slave and request handling is done on it as if it were a slave
+ * that suspended.
+ * After request handling, do the same sequence of looking for a task to be
+ * work, and if none, look for a slave ready to resume, as work for the slot.
+ * If a slave suspends, handle its request, then look for work.. first for a
+ * task to assign, and if none, slaves ready to resume.
+ * Another special case is when task-end is done on a free task-slave.. in
+ * that case, the slave has no more work and no way to get more.. so place
+ * it into a recycle queue.
+ * If no work is found of either type, then do a special thing to prune down
+ * the extra slaves in the recycle queue, just so don't get too many..
+ *
+ *The multi-lang thing complicates matters..
+ *
+ *For request handling, it means have to first fetch the semantic environment
+ * of the language, and then do the request handler pointed to by that
+ * semantic env.
+ *For assigning, things get more complex because of competing goals.. One
+ * goal is for language specific stuff to be used during assignment, so
+ * assigner can make higher quality decisions.. but with multiple languages,
+ * which only get mixed in the application, the assigners can't be written
+ * with knowledge of each other. So, they can only make localized decisions,
+ * and so different language's assigners may interfere with each other..
+ *
+ *So, have some possibilities available:
+ *1) can have a fixed scheduler in the proto-runtime, that all the
+ * languages give their work to.. (but then lose language-specific info,
+ * there is a standard PR format for assignment info, and the langauge
+ * attaches this to the work-unit when it gives it to PR.. also have issue
+ * with HWSim, which uses a priority Q instead of FIFO, and requests can
+ * "undo" previous work put in, so request handlers need way to manipulate
+ * the work-holding Q..) (this might be fudgeable with
+ * HWSim, if the master did a lang-supplied callback each time it assigns a
+ * unit to a slot.. then HWSim can keep exactly one unit of work in PR's
+ * queue at a time.. but this is quite hack-like.. or perhaps HWSim supplies
+ * a task-end handler that kicks the next unit of work from HWSim internal
+ * priority queue, over to PR readyQ)
+ *2) can have each language have its own semantic env, that holds its own
+ * work, which is assigned by its own assigner.. then the master searches
+ * through all the semantic envs to find one with work and asks it give work..
+ * (this has downside of blinding assigners to each other.. but does work
+ * for HWSim case)
+ *3) could make PR have a different readyQ for each core, and ask the lang
+ * to put work to the core it prefers.. but the work may be moved by PR if
+ * needed, say if one core idles for too long. This is a hybrid approach,
+ * letting the language decide which core, but PR keeps the work and does it
+ * FIFO style.. (this might als be fudgeable with HWSim, in similar fashion,
+ * but it would be complicated by having to track cores separately)
+ *
+ *Choosing 2, to keep compatibility with single-lang mode.. it allows the same
+ * assigner to be used for single-lang as for multi-lang.. the overhead of
+ * the extra master search for work is part of the price of the flexibility,
+ * but should be fairly small.. takes the first env that has work available,
+ * and whatever it returns is assigned to the slot..
+ *
+ *As a hybrid, giving an option for a unified override assigner to be registered
+ * and used.. This allows something like a static analysis to detect
+ * which languages are grouped together, and then analyze the pattern of
+ * construct calls, and generate a custom assigner that uses info from all
+ * the languages in a unified way.. Don't really expect this to happen,
+ * but making it possible.
+ */
+#ifdef MODE__MULTI_LANG
+void animationMaster( void *initData, SlaveVP *masterVP )
+ {
+      //Used while scanning and filling animation slots
+   int32 slotIdx, numSlotsFilled;
+   AnimSlot *currSlot, **animSlots;
+   SlaveVP *assignedSlaveVP; //the slave chosen by the assigner
+
+      //Local copies, for performance
+   MasterEnv *masterEnv;
+   SlaveAssigner slaveAssigner;
+   RequestHandler requestHandler;
+   PRSemEnv *semanticEnv;
+   int32 thisCoresIdx;
+
+   //#ifdef MODE__MULTI_LANG
+   SlaveVP *slave;
+   PRProcess *process;
+   PRConstrEnvHolder *constrEnvHolder;
+   int32 langMagicNumber;
+   //#endif
+
+   //======================== Initializations ========================
+   masterEnv = (MasterEnv*)_PRMasterEnv;
+
+   thisCoresIdx = masterVP->coreAnimatedBy;
+   animSlots = masterEnv->allAnimSlots[thisCoresIdx];
+
+   requestHandler = masterEnv->requestHandler;
+   slaveAssigner = masterEnv->slaveAssigner;
+   semanticEnv = masterEnv->semanticEnv;
+
+      //initialize, for non-multi-lang, non multi-proc case
+      // default handler gets put into master env by a registration call by lang
+   endTaskHandler = masterEnv->defaultTaskHandler;
+
+   HOLISTIC__Insert_Master_Global_Vars;
+
+   //======================== animationMaster ========================
+   //Do loop gets requests handled and work assigned to slots..
+   // work can either be a task or a resumed slave
+   //Having two cases makes this logic complex.. can be finishing either, and
+   // then the next available work may be either.. so really have two distinct
+   // loops that are inter-twined..
+   while(1){
+
+      MEAS__Capture_Pre_Master_Point
+
+      //Scan the animation slots
+      numSlotsFilled = 0;
+      for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
+       {
+         currSlot = animSlots[ slotIdx ];
+
+         //Check if newly-done slave in slot, which will need request handled
+         if( currSlot->workIsDone )
+          { currSlot->workIsDone = FALSE;
+
+            HOLISTIC__Record_AppResponder_start; //TODO: update to check which process for each slot
+            MEAS__startReqHdlr;
+
+
+            //process the request made by the slave (held inside slave struc)
+            slave = currSlot->slaveAssignedToSlot;
+
+            //check if the completed work was a task..
+            if( slave->taskMetaInfo->isATask )
+             {
+               if( slave->reqst->type == TaskEnd )
+                { //do task end handler, which is registered separately
+                  //note, end hdlr may use semantic data from reqst..
+                  //#ifdef MODE__MULTI_LANG
+                  //get end-task handler
+                  //taskEndHandler = lookup( slave->reqst->langMagicNumber, processEnv );
+                  taskEndHandler = slave->taskMetaInfo->endTaskHandler;
+                  //#endif
+                  (*taskEndHandler)( slave, semanticEnv );
+
+                  goto AssignWork;
+                }
+               else //is a task, and just suspended
+                { //turn slot slave into free task slave & make replacement
+                  if( slave->typeOfVP == TaskSlotSlv ) changeSlvType();
+
+                  //goto normal slave request handling
+                  goto SlaveReqHandling;
+                }
+             }
+            else //is a slave that suspended
+             {
+               SlaveReqHandling:
+               (*requestHandler)( slave, semanticEnv ); //(note: indirect Fn call more efficient when use fewer params, instead re-fetch from slave)
+
+               HOLISTIC__Record_AppResponder_end;
+               MEAS__endReqHdlr;
+
+               goto AssignWork;
+             }
+          } //if has suspended slave that needs handling
+
+         //if slot empty, hand to Assigner to fill with a slave
+         if( currSlot->needsSlaveAssigned )
+          { //Call plugin's Assigner to give slot a new slave
+            HOLISTIC__Record_Assigner_start;
+
+            AssignWork:
+
+            assignedSlaveVP = assignWork( semanticEnv, currSlot );
+
+            //put the chosen slave into slot, and adjust flags and state
+            if( assignedSlaveVP != NULL )
+             { currSlot->slaveAssignedToSlot = assignedSlaveVP;
+               assignedSlaveVP->animSlotAssignedTo = currSlot;
+               currSlot->needsSlaveAssigned = FALSE;
+               numSlotsFilled += 1;
+             }
+            else
+             {
+               currSlot->needsSlaveAssigned = TRUE; //local write
+             }
+            HOLISTIC__Record_Assigner_end;
+          }//if slot needs slave assigned
+       }//for( slotIdx..
+
+      MEAS__Capture_Post_Master_Point;
+
+      masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
+      flushRegisters();
+   }//while(1)
+ }
+#endif //MODE__MULTI_LANG
+
+
+
+//This is the master when both multi-lang and multi-process modes are turned on
+//#ifdef MODE__MULTI_LANG
+//#ifdef MODE__MULTI_PROCESS
+void animationMaster( void *initData, SlaveVP *masterVP )
+ {
+      //Used while scanning and filling animation slots
+   int32 slotIdx, numSlotsFilled;
+   AnimSlot *currSlot, **animSlots;
+   SlaveVP *assignedSlaveVP; //the slave chosen by the assigner
+
+      //Local copies, for performance
+   MasterEnv *masterEnv;
+   SlaveAssigner slaveAssigner;
+   RequestHandler requestHandler;
+   PRSemEnv *semanticEnv;
+   int32 thisCoresIdx;
+
+   SlaveVP *slave;
+   PRProcess *process;
+   PRConstrEnvHolder *constrEnvHolder;
+   int32 langMagicNumber;
+
+   //======================== Initializations ========================
+   masterEnv = (MasterEnv*)_PRMasterEnv;
+
+   thisCoresIdx = masterVP->coreAnimatedBy;
+   animSlots = masterEnv->allAnimSlots[thisCoresIdx];
+
+   requestHandler = masterEnv->requestHandler;
+   slaveAssigner = masterEnv->slaveAssigner;
+   semanticEnv = masterEnv->semanticEnv;
+
+      //initialize, for non-multi-lang, non multi-proc case
+      // default handler gets put into master env by a registration call by lang
+   endTaskHandler = masterEnv->defaultTaskHandler;
+
+   HOLISTIC__Insert_Master_Global_Vars;
+
+   //======================== animationMaster ========================
+   //Do loop gets requests handled and work assigned to slots..
+   // work can either be a task or a resumed slave
+   //Having two cases makes this logic complex.. can be finishing either, and
+   // then the next available work may be either.. so really have two distinct
+   // loops that are inter-twined..
+   while(1){
+
+      MEAS__Capture_Pre_Master_Point
+
+      //Scan the animation slots
+      numSlotsFilled = 0;
+      for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
+       {
+         currSlot = animSlots[ slotIdx ];
+
+         //Check if newly-done slave in slot, which will need request handled
+         if( currSlot->workIsDone )
+          { currSlot->workIsDone = FALSE;
+
+            HOLISTIC__Record_AppResponder_start; //TODO: update to check which process for each slot
+            MEAS__startReqHdlr;
+
+
+            //process the request made by the slave (held inside slave struc)
+            slave = currSlot->slaveAssignedToSlot;
+
+            //check if the completed work was a task..
+            if( slave->taskMetaInfo->isATask )
+             {
+               if( slave->reqst->type == TaskEnd )
+                { //do task end handler, which is registered separately
+                  //note, end hdlr may use semantic data from reqst..
+                  //get end-task handler
+                  //taskEndHandler = lookup( slave->reqst->langMagicNumber, processEnv );
+                  taskEndHandler = slave->taskMetaInfo->endTaskHandler;
+
+                  (*taskEndHandler)( slave, semanticEnv );
+
+                  goto AssignWork;
+                }
+               else //is a task, and just suspended
+                { //turn slot slave into free task slave & make replacement
+                  if( slave->typeOfVP == TaskSlotSlv ) changeSlvType();
+
+                  //goto normal slave request handling
+                  goto SlaveReqHandling;
+                }
+             }
+            else //is a slave that suspended
+             {
+
+               SlaveReqHandling:
+               (*requestHandler)( slave, semanticEnv ); //(note: indirect Fn call more efficient when use fewer params, instead re-fetch from slave)
+
+               HOLISTIC__Record_AppResponder_end;
+               MEAS__endReqHdlr;
+
+               goto AssignWork;
+             }
+          } //if has suspended slave that needs handling
+
+         //if slot empty, hand to Assigner to fill with a slave
+         if( currSlot->needsSlaveAssigned )
+          { //Scan sem environs, looking for one with ready work.
+            // call the Assigner for that sem Env, to give slot a new slave
+            HOLISTIC__Record_Assigner_start;
+
+            AssignWork:
+
+            assignedSlaveVP = assignWork( semanticEnv, currSlot );
+
+            //put the chosen slave into slot, and adjust flags and state
+            if( assignedSlaveVP != NULL )
+             { currSlot->slaveAssignedToSlot = assignedSlaveVP;
+               assignedSlaveVP->animSlotAssignedTo = currSlot;
+               currSlot->needsSlaveAssigned = FALSE;
+               numSlotsFilled += 1;
+             }
+            else
+             {
+               currSlot->needsSlaveAssigned = TRUE; //local write
+             }
+            HOLISTIC__Record_Assigner_end;
+          }//if slot needs slave assigned
+       }//for( slotIdx..
+
+      MEAS__Capture_Post_Master_Point;
+
+      masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
+      flushRegisters();
+   }//while(1)
+ }
+#endif //MODE__MULTI_LANG
+#endif //MODE__MULTI_PROCESS
+
+
+/*This does three things:
+ * 1) ask for a slave ready to resume
+ * 2) if none, then ask for a task, and assign to the slot slave
+ * 3) if none, then prune former task slaves waiting to be recycled.
+ *
+ //Have two separate assigners in each semantic env,
+ // which keeps its own work in its own structures.. the master, here,
+ // searches through the semantic environs, takes the first that has work
+ // available, and whatever it returns is assigned to the slot..
+ //However, also have an override assigner.. because static analysis tools know
+ // which languages are grouped together.. and the override enables them to
+ // generate a custom assigner that uses info from all the languages in a
+ // unified way.. Don't really expect this to happen, but making it possible.
+ */
+inline SlaveVP *
+assignWork( PRProcessEnv *processEnv, AnimSlot *slot )
+ { SlaveVP *returnSlv;
+   //VSsSemEnv *semEnv;
+   //VSsSemData *semData;
+   int32 coreNum, slotNum;
+   PRTaskMetaInfo *newTaskStub;
+   SlaveVP *freeTaskSlv;
+
+
+   //master has to handle slot slaves.. so either assigner returns
+   // taskMetaInfo or else two assigners, one for slaves, other for tasks..
+   semEnvs = processEnv->semEnvs;
+   numEnvs = processEnv->numSemEnvs;
+   for( envIdx = 0; envIdx < numEnvs; envIdx++ )
+    { semEnv = semEnvs[envIdx];
+      if( semEnv->hasWork )
+       { assigner = semEnv->assigner;
+         retTaskMetaInfo = (*assigner)( semEnv, slot );
+
+         return retTaskMetaInfo; //quit, have work
+       }
+    }
+
+   coreNum = slot->coreSlotIsOn;
+   slotNum = slot->slotIdx;
+
+   //first try to get a ready slave
+   returnSlv = getReadySlave();
+
+   if( returnSlv != NULL )
+    { returnSlv->coreAnimatedBy = coreNum;
+
+      //have work, so reset Done flag (when work generated on other core)
+      if( processEnv->coreIsDone[coreNum] == TRUE ) //reads are higher perf
+         processEnv->coreIsDone[coreNum] = FALSE; //don't just write always
+
+      goto ReturnTheSlv;
+    }
+
+   //were no slaves, so try to get a ready task..
+   newTaskStub = getTaskStub();
+
+   if( newTaskStub != NULL )
+    {
+      //get the slot slave to assign the task to..
1.699 + returnSlv = processEnv->slotTaskSlvs[coreNum][slotNum]; 1.700 + 1.701 + //point slave to task's function, and mark slave as having task 1.702 + PR_int__reset_slaveVP_to_TopLvlFn( returnSlv, 1.703 + newTaskStub->taskType->fn, newTaskStub->args ); 1.704 + returnSlv->taskStub = newTaskStub; 1.705 + newTaskStub->slaveAssignedTo = returnSlv; 1.706 + returnSlv->needsTaskAssigned = FALSE; //slot slave is a "Task" slave type 1.707 + 1.708 + //have work, so reset Done flag, if was set 1.709 + if( processEnv->coreIsDone[coreNum] == TRUE ) //reads are higher perf 1.710 + processEnv->coreIsDone[coreNum] = FALSE; //don't just write always 1.711 + 1.712 + goto ReturnTheSlv; 1.713 + } 1.714 + else 1.715 + { //no task, so prune the recycle pool of free task slaves 1.716 + freeTaskSlv = readPrivQ( processEnv->freeTaskSlvRecycleQ ); 1.717 + if( freeTaskSlv != NULL ) 1.718 + { //delete to bound the num extras, and deliver shutdown cond 1.719 + handleDissipate( freeTaskSlv, processEnv ); 1.720 + //then return NULL 1.721 + returnSlv = NULL; 1.722 + 1.723 + goto ReturnTheSlv; 1.724 + } 1.725 + else 1.726 + { //candidate for shutdown.. 
if all extras dissipated, and no tasks 1.727 + // and no ready to resume slaves, then no way to generate 1.728 + // more tasks (on this core -- other core might have task still) 1.729 + if( processEnv->numLiveExtraTaskSlvs == 0 && 1.730 + processEnv->numLiveThreadSlvs == 0 ) 1.731 + { //This core sees no way to generate more tasks, so say it 1.732 + if( processEnv->coreIsDone[coreNum] == FALSE ) 1.733 + { processEnv->numCoresDone += 1; 1.734 + processEnv->coreIsDone[coreNum] = TRUE; 1.735 + #ifdef DEBUG__TURN_ON_SEQUENTIAL_MODE 1.736 + processEnv->shutdownInitiated = TRUE; 1.737 + 1.738 + #else 1.739 + if( processEnv->numCoresDone == NUM_CORES ) 1.740 + { //means no cores have work, and none can generate more 1.741 + processEnv->shutdownInitiated = TRUE; 1.742 + } 1.743 + #endif 1.744 + } 1.745 + } 1.746 + //check if shutdown has been initiated by this or other core 1.747 + if(processEnv->shutdownInitiated) 1.748 + { returnSlv = PR_SS__create_shutdown_slave(); 1.749 + } 1.750 + else 1.751 + returnSlv = NULL; 1.752 + 1.753 + goto ReturnTheSlv; //don't need, but completes pattern 1.754 + } //if( freeTaskSlv != NULL ) 1.755 + } //if( newTaskStub == NULL ) 1.756 + //outcome: 1)slave was just pointed to task, 2)no tasks, so slave NULL 1.757 + 1.758 + 1.759 + ReturnTheSlv: //All paths goto here.. to provide single point for holistic.. 
1.760 + 1.761 + #ifdef HOLISTIC__TURN_ON_OBSERVE_UCC 1.762 + if( returnSlv == NULL ) 1.763 + { returnSlv = processEnv->idleSlv[coreNum][slotNum]; 1.764 + 1.765 + //things that would normally happen in resume(), but idle VPs 1.766 + // never go there 1.767 + returnSlv->assignCount++; //gives each idle unit a unique ID 1.768 + Unit newU; 1.769 + newU.vp = returnSlv->slaveID; 1.770 + newU.task = returnSlv->assignCount; 1.771 + addToListOfArrays(Unit,newU,processEnv->unitList); 1.772 + 1.773 + if (returnSlv->assignCount > 1) //make a dependency from prev idle unit 1.774 + { Dependency newD; // to this one 1.775 + newD.from_vp = returnSlv->slaveID; 1.776 + newD.from_task = returnSlv->assignCount - 1; 1.777 + newD.to_vp = returnSlv->slaveID; 1.778 + newD.to_task = returnSlv->assignCount; 1.779 + addToListOfArrays(Dependency, newD ,processEnv->ctlDependenciesList); 1.780 + } 1.781 + } 1.782 + else //have a slave will be assigned to the slot 1.783 + { //assignSlv->numTimesAssigned++; 1.784 + //get previous occupant of the slot 1.785 + Unit prev_in_slot = 1.786 + processEnv->last_in_slot[coreNum * NUM_ANIM_SLOTS + slotNum]; 1.787 + if(prev_in_slot.vp != 0) //if not first slave in slot, make dependency 1.788 + { Dependency newD; // is a hardware dependency 1.789 + newD.from_vp = prev_in_slot.vp; 1.790 + newD.from_task = prev_in_slot.task; 1.791 + newD.to_vp = returnSlv->slaveID; 1.792 + newD.to_task = returnSlv->assignCount; 1.793 + addToListOfArrays(Dependency,newD,processEnv->hwArcs); 1.794 + } 1.795 + prev_in_slot.vp = returnSlv->slaveID; //make new slave the new previous 1.796 + prev_in_slot.task = returnSlv->assignCount; 1.797 + processEnv->last_in_slot[coreNum * NUM_ANIM_SLOTS + slotNum] = 1.798 + prev_in_slot; 1.799 + } 1.800 + #endif 1.801 + 1.802 + return( returnSlv ); 1.803 + } 1.804 + 1.805 + 1.806 +//================================================================= 1.807 + //#else //is MODE__MULTI_LANG 1.808 + //For multi-lang mode, first, get the constraint-env 
holder out of 1.809 + // the process, which is in the slave. 1.810 + //Second, get the magic number out of the request, use it to look up 1.811 + // the constraint Env within the constraint-env holder. 1.812 + //Then get the request handler out of the constr env 1.813 + constrEnvHolder = slave->process->constrEnvHolder; 1.814 + reqst = slave->request; 1.815 + langMagicNumber = reqst->langMagicNumber; 1.816 + semanticEnv = lookup( langMagicNumber, constrEnvHolder ); //a macro 1.817 + if( slave->reqst->type == taskEnd ) //end-task is special 1.818 + { //need to know what lang's task ended 1.819 + taskEndHandler = semanticEnv->taskEndHandler; 1.820 + (*taskEndHandler)( slave, reqst, semanticEnv ); //can put semantic data into task end reqst, for continuation, etc 1.821 + //this is a slot slave, get a new task for it 1.822 + if( !existsOverrideAssigner )//if exists, is set above, before loop 1.823 + { //search for task assigner that has work 1.824 + for( a = 0; a < num_assigners; a++ ) 1.825 + { if( taskAssigners[a]->hasWork ) 1.826 + { newTaskAssigner = taskAssigners[a]; 1.827 + (*newTaskAssigner)( slave, semanticEnv ); 1.828 + goto GotTask; 1.829 + } 1.830 + } 1.831 + goto NoTasks; 1.832 + } 1.833 + 1.834 + GotTask: 1.835 + continue; //have work, so do next iter of loop, don't call slave assigner 1.836 + } 1.837 + if( slave->typeOfVP == taskSlotSlv ) changeSlvType();//is suspended task 1.838 + //now do normal suspended slave request handler 1.839 + requestHandler = semanticEnv->requestHandler; 1.840 + //#endif 1.841 + 1.842 + 1.843 + } 1.844 + //If make it here, then was no task for this slot 1.845 + //slot empty, hand to Assigner to fill with a slave 1.846 + if( currSlot->needsSlaveAssigned ) 1.847 + { //Call plugin's Assigner to give slot a new slave 1.848 + HOLISTIC__Record_Assigner_start; 1.849 + 1.850 + //#ifdef MODE__MULTI_LANG 1.851 + NoTasks: 1.852 + //First, choose an Assigner.. 1.853 + //There are several Assigners, one for each langlet.. 
they all 1.854 + // indicate whether they have work available.. just pick the first 1.855 + // one that has work.. Or, if there's a Unified Assigner, call 1.856 + // that one.. So, go down array, checking.. 1.857 + if( !existsOverrideAssigner ) 1.858 + { for( a = 0; a < num_assigners; a++ ) 1.859 + { if( assigners[a]->hasWork ) 1.860 + { slaveAssigner = assigners[a]; 1.861 + goto GotAssigner; 1.862 + } 1.863 + } 1.864 + //no work, so just continue to next iter of scan loop 1.865 + continue; 1.866 + } 1.867 + //when exists override, the assigner is set, once, above, so do nothing 1.868 + GotAssigner: 1.869 + //#endif 1.870 + 1.871 + assignedSlaveVP = 1.872 + (*slaveAssigner)( semanticEnv, currSlot ); 1.873 + 1.874 + //put the chosen slave into slot, and adjust flags and state 1.875 + if( assignedSlaveVP != NULL ) 1.876 + { currSlot->slaveAssignedToSlot = assignedSlaveVP; 1.877 + assignedSlaveVP->animSlotAssignedTo = currSlot; 1.878 + currSlot->needsSlaveAssigned = FALSE; 1.879 + numSlotsFilled += 1; 1.880 + 1.881 + HOLISTIC__Record_Assigner_end; 1.882 + } 1.883 + }//if slot needs slave assigned 1.884 + }//for( slotIdx.. 1.885 + 1.886 + MEAS__Capture_Post_Master_Point; 1.887 + 1.888 + masterSwitchToCoreCtlr( masterVP ); 1.889 + flushRegisters(); 1.890 + DEBUG__printf(FALSE,"came back after switch to core -- so lock released!"); 1.891 + }//while(1) 1.892 + } 1.893 +
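The assigner selection described in the comments of `assignWork` (walk the semantic environments in order and call the assigner of the first one that reports work) can be sketched in isolation. This is only an illustration under stated assumptions, not the PR code: `DemoSemEnv`, `DemoSlot`, `DemoAssigner`, `demo_take_work` and `demo_assign_work` are hypothetical names, and the real `PRProcessEnv` and `AnimSlot` types carry far more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real PR types; names are illustrative only. */
typedef struct DemoSlot   { int coreNum; int slotIdx; } DemoSlot;
typedef struct DemoSemEnv DemoSemEnv;
typedef void *(*DemoAssigner)( DemoSemEnv *env, DemoSlot *slot );

struct DemoSemEnv
 { int           hasWork;   /* set by the langlet when it has ready work */
   DemoAssigner  assigner;  /* langlet-specific chooser of the next work unit */
   void         *nextWork;  /* toy payload handed back by the toy assigner */
 };

/* Toy assigner: hands out the env's single pending work unit. */
void *
demo_take_work( DemoSemEnv *env, DemoSlot *slot )
 { void *work = env->nextWork;
   (void)slot;              /* a real assigner would use slot affinity */
   env->nextWork = NULL;
   env->hasWork  = 0;
   return work;
 }

/* Scan the environments in order; the first one with work fills the slot.
 * Returns NULL when no environment has work. */
void *
demo_assign_work( DemoSemEnv **envs, int numEnvs, DemoSlot *slot )
 { int envIdx;
   assert( envs != NULL && slot != NULL );
   for( envIdx = 0; envIdx < numEnvs; envIdx++ )
    { DemoSemEnv *env = envs[envIdx];
      if( env->hasWork )
         return (*env->assigner)( env, slot );
    }
   return NULL;
 }
```

A fixed-order scan favors whichever environment sits first in the array. The override assigner mentioned in the source comments would be one place to apply a different policy.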
--- a/CoreController.c Mon Sep 03 03:34:54 2012 -0700
+++ b/CoreController.c Wed Sep 19 23:12:44 2012 -0700
@@ -5,7 +5,7 @@
  */
 
 
-#include "VMS.h"
+#include "PR.h"
 
 #include <stdlib.h>
 #include <stdio.h>
@@ -55,9 +55,9 @@
  * amortize the overhead of switching to the master VP and running it. With
  * multiple animation slots, the time to switch-to-master and the code in
  * the animation master is divided by the number of animation slots.
- *The core controller and animation slots are not fundamental parts of VMS,
+ *The core controller and animation slots are not fundamental parts of PR,
  * but rather optimizations put into the shared-semantic-state version of
- * VMS. Other versions of VMS will not have a core controller nor scheduling
+ * PR. Other versions of PR will not have a core controller nor scheduling
  * slots.
  *
  *The core controller "owns" the physical core, in effect, and is the
@@ -92,13 +92,13 @@
    thisCoresIdx = thisCoresThdParams->coreNum;
 
    //Assembly that saves addr of label of return instr -- label in assmbly
-   recordCoreCtlrReturnLabelAddr((void**)&(_VMSMasterEnv->coreCtlrReturnPt));
+   recordCoreCtlrReturnLabelAddr((void**)&(_PRMasterEnv->coreCtlrReturnPt));
 
-   animSlots = _VMSMasterEnv->allAnimSlots[thisCoresIdx];
+   animSlots = _PRMasterEnv->allAnimSlots[thisCoresIdx];
    currSlotIdx = 0; //start at slot 0, go up until one empty, then do master
    numRepetitionsWithNoWork = 0;
-   addrOfMasterLock = &(_VMSMasterEnv->masterLock);
-   thisCoresMasterVP = _VMSMasterEnv->masterVPs[thisCoresIdx];
+   addrOfMasterLock = &(_PRMasterEnv->masterLock);
+   thisCoresMasterVP = _PRMasterEnv->masterVPs[thisCoresIdx];
 
    //==================== pthread related stuff ======================
    //pin the pthread to the core -- takes away Linux control
@@ -113,7 +113,7 @@
 
    //make sure the controllers all start at same time, by making them wait
    pthread_mutex_lock( &suspendLock );
-   while( !(_VMSMasterEnv->setupComplete) )
+   while( !(_PRMasterEnv->setupComplete) )
     { pthread_cond_wait( &suspendCond, &suspendLock );
     }
    pthread_mutex_unlock( &suspendLock );
@@ -209,7 +209,7 @@
     }//while(1)
  }
 
-/*Shutdown of VMS involves several steps, of which this is the last. This
+/*Shutdown of PR involves several steps, of which this is the last. This
  * function is jumped to from the asmTerminateCoreCtrl, which is in turn
  * called from endOSThreadFn, which is the top-level-fn of the shutdown
  * slaves.
@@ -218,18 +218,18 @@
 terminateCoreCtlr(SlaveVP *currSlv)
  {
    //first, free shutdown Slv that jumped here, then end the pthread
-   VMS_int__dissipate_slaveVP( currSlv );
+   PR_int__dissipate_slaveVP( currSlv );
    pthread_exit( NULL );
  }
 
 inline uint32_t
 randomNumber()
  {
-   _VMSMasterEnv->seed1 = (uint32)(36969 * (_VMSMasterEnv->seed1 & 65535) +
-                          (_VMSMasterEnv->seed1 >> 16) );
-   _VMSMasterEnv->seed2 = (uint32)(18000 * (_VMSMasterEnv->seed2 & 65535) +
-                          (_VMSMasterEnv->seed2 >> 16) );
-   return (_VMSMasterEnv->seed1 << 16) + _VMSMasterEnv->seed2;
+   _PRMasterEnv->seed1 = (uint32)(36969 * (_PRMasterEnv->seed1 & 65535) +
+                         (_PRMasterEnv->seed1 >> 16) );
+   _PRMasterEnv->seed2 = (uint32)(18000 * (_PRMasterEnv->seed2 & 65535) +
+                         (_PRMasterEnv->seed2 >> 16) );
+   return (_PRMasterEnv->seed1 << 16) + _PRMasterEnv->seed2;
  }
 
 
@@ -292,14 +292,14 @@
 
    //=============== Initializations ===================
    thisCoresIdx = 0; //sequential version
-   animSlots = _VMSMasterEnv->allAnimSlots[thisCoresIdx];
+   animSlots = _PRMasterEnv->allAnimSlots[thisCoresIdx];
    currSlotIdx = 0; //start at slot 0, go up until one empty, then do master
    numRepetitionsWithNoWork = 0;
-   addrOfMasterLock = &(_VMSMasterEnv->masterLock);
-   thisCoresMasterVP = _VMSMasterEnv->masterVPs[thisCoresIdx];
+   addrOfMasterLock = &(_PRMasterEnv->masterLock);
+   thisCoresMasterVP = _PRMasterEnv->masterVPs[thisCoresIdx];
 
    //Assembly that saves addr of label of return instr -- label in assmbly
-   recordCoreCtlrReturnLabelAddr((void**)&(_VMSMasterEnv->coreCtlrReturnPt));
+   recordCoreCtlrReturnLabelAddr((void**)&(_PRMasterEnv->coreCtlrReturnPt));
 
 
    //====================== The Core Controller ======================
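The `randomNumber()` touched by this diff has the shape of a two-stream multiply-with-carry generator (the multipliers 36969 and 18000 are the pair popularized by George Marsaglia). A standalone sketch makes the update easy to check in isolation. It assumes the seeds live in a hypothetical `DemoRng` struct rather than in `_PRMasterEnv`, and `demo_random_number` is likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the two seeds; PR keeps them in _PRMasterEnv. */
typedef struct { uint32_t seed1; uint32_t seed2; } DemoRng;

/* One step of the two-stream multiply-with-carry update from the diff:
 * each stream keeps its low 16 bits as state and its high 16 bits as carry. */
uint32_t
demo_random_number( DemoRng *rng )
 { assert( rng != NULL );
   rng->seed1 = 36969u * (rng->seed1 & 65535u) + (rng->seed1 >> 16);
   rng->seed2 = 18000u * (rng->seed2 & 65535u) + (rng->seed2 >> 16);
   return (rng->seed1 << 16) + rng->seed2;
 }
```

Two seeds of zero are a fixed point of this update (both streams stay at zero), so whatever initializes the seeds should avoid that value.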
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Defines/MEAS__macros_to_be_moved_to_langs.h Wed Sep 19 23:12:44 2012 -0700
@@ -0,0 +1,57 @@
+/*
+ * Copyright 2009 OpenSourceStewardshipFoundation.org
+ * Licensed under GNU General Public License version 2
+ *
+ * Author: seanhalle@yahoo.com
+ *
+ */
+
+#ifndef _PR_LANG_SPEC_DEFS_H
+#define _PR_LANG_SPEC_DEFS_H
+
+
+
+//=================== Language-specific Measurement Stuff ===================
+//
+//TODO: move these into the language implementation directories
+//
+
+
+//===========================================================================
+//VCilk
+
+#ifdef VCILK
+
+#define spawnHistIdx 1 //note: starts at 1
+#define syncHistIdx 2
+
+#define MEAS__Make_Meas_Hists_for_Language() \
+   _PRMasterEnv->measHistsInfo = \
+   makePrivDynArrayOfSize( (void***)&(_PRMasterEnv->measHists), 200); \
+   makeAMeasHist( spawnHistIdx, "Spawn", 50, 0, 200 ) \
+   makeAMeasHist( syncHistIdx, "Sync", 50, 0, 200 )
+
+
+#define Meas_startSpawn \
+   int32 startStamp, endStamp; \
+   saveLowTimeStampCountInto( startStamp ); \
+
+#define Meas_endSpawn \
+   saveLowTimeStampCountInto( endStamp ); \
+   addIntervalToHist( startStamp, endStamp, \
+                      _PRMasterEnv->measHists[ spawnHistIdx ] );
+
+#define Meas_startSync \
+   int32 startStamp, endStamp; \
+   saveLowTimeStampCountInto( startStamp ); \
+
+#define Meas_endSync \
+   saveLowTimeStampCountInto( endStamp ); \
+   addIntervalToHist( startStamp, endStamp, \
+                      _PRMasterEnv->measHists[ syncHistIdx ] );
+#endif
+
+//===========================================================================
+
+#endif /* _PR_DEFS_H */
+
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Defines/PR_defs.h Wed Sep 19 23:12:44 2012 -0700
@@ -0,0 +1,43 @@
+/*
+ * Copyright 2009 OpenSourceStewardshipFoundation.org
+ * Licensed under GNU General Public License version 2
+ *
+ * Author: seanhalle@yahoo.com
+ *
+ */
+
+#ifndef _PR_DEFS_MAIN_H
+#define _PR_DEFS_MAIN_H
+#define _GNU_SOURCE
+
+//=========================== PR-wide defs ===============================
+
+#define SUCCESS 0
+
+   //only after macro-expansion are the defs of writePrivQ, aso looked up
+   // so these defs can be at the top, and writePrivQ defined later on..
+#define writePRQ writePrivQ
+#define readPRQ readPrivQ
+#define makePRQ makePrivQ
+#define numInPRQ numInPrivQ
+#define PRQueueStruc PrivQueueStruc
+
+
+/*The language should re-define this, but need a default in case it doesn't*/
+#ifndef _LANG_NAME_
+#define _LANG_NAME_ ""
+#endif
+
+//====================== Hardware Constants ============================
+#include "PR_defs__HW_constants.h"
+
+//====================== Macros ======================
+   //for turning macros and other PR features on and off
+#include "PR_defs__turn_on_and_off.h"
+
+#include "../Services_Offered_by_PR/Debugging/DEBUG__macros.h"
+#include "../Services_Offered_by_PR/Measurement_and_Stats/MEAS__macros.h"
+
+//===========================================================================
+#endif /* */
+
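The comment in `PR_defs.h` about macro expansion relies on the preprocessor substituting names textually: an alias can be defined before the function it names is declared, provided the declaration is visible where the alias is finally used. A minimal sketch of that ordering, using hypothetical names (`demoWriteQ`, `demoWritePrivQ`, `demo_store`) rather than the real queue API:

```c
#include <assert.h>
#include <stddef.h>

/* Alias defined first: the preprocessor does not need demoWritePrivQ to exist yet. */
#define demoWriteQ demoWritePrivQ

/* The function the alias refers to is only defined later in the file. */
int
demoWritePrivQ( int *slot, int value )
 { assert( slot != NULL );
   *slot = value;
   return 0;
 }

/* The alias is expanded here, after the definition above, so this compiles normally. */
int
demo_store( int *slot, int value )
 { return demoWriteQ( slot, value );
 }
```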
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Defines/PR_defs__HW_constants.h Wed Sep 19 23:12:44 2012 -0700
@@ -0,0 +1,54 @@
+/*
+ * Copyright 2012 OpenSourceStewardshipFoundation
+ * Licensed under BSD
+ *
+ * Author: seanhalle@yahoo.com
+ *
+ */
+
+#ifndef _PR_HW_SPEC_DEFS_H
+#define _PR_HW_SPEC_DEFS_H
+#define _GNU_SOURCE
+
+
+//========================= Hardware related Constants =====================
+   //This value is the number of hardware threads in the shared memory
+   // machine
+#define NUM_CORES 4
+
+   // tradeoff amortizing master fixed overhead vs imbalance potential
+   // when work-stealing, can make bigger, at risk of losing cache affinity
+#define NUM_ANIM_SLOTS 1
+
+   //These are for backoff inside core-loop, which reduces lock contention
+#define NUM_REPS_W_NO_WORK_BEFORE_YIELD 10
+#define NUM_REPS_W_NO_WORK_BEFORE_BACKOFF 2
+#define MASTERLOCK_RETRIES_BEFORE_YIELD 100
+#define NUM_TRIES_BEFORE_DO_BACKOFF 10
+#define GET_LOCK_BACKOFF_WEIGHT 100
+
+   // stack size in virtual processors created
+#define VIRT_PROCR_STACK_SIZE 0x8000 /* 32K */
+
+   // memory for PR_int__malloc
+#define MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE 0x8000000 /* 128M */
+
+   //Frequency of TS counts -- have to do tests to verify
+   //NOTE: turn off (in BIOS) TURBO-BOOST and SPEED-STEP else won't be const
+#define TSCOUNT_FREQ 3180000000
+
+#define CACHE_LINE_SZ 256
+#define PAGE_SIZE 4096
+
+//To prevent false-sharing, aligns a variable to a cache-line boundary.
+//No need to use for local vars because those are never shared between cores
+#define __align_to_cacheline__ __attribute__ ((aligned(CACHE_LINE_SZ)))
+
+//aligns a pointer to cacheline. The memory area has to contain at least
+//CACHE_LINE_SZ bytes more then needed
+#define __align_address(ptr) ((void*)(((uintptr_t)(ptr))&((uintptr_t)(~0x0FF))))
+
+//===========================================================================
+
+#endif /* _PR_DEFS_H */
+
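The `__align_address` macro rounds a pointer down by clearing its low eight bits, which is why its comment asks for a memory area at least `CACHE_LINE_SZ` bytes larger than needed. A sketch of one way such a macro could be used follows. `DEMO_CACHE_LINE_SZ`, `demo_align_down` and `demo_alloc_aligned` are hypothetical names, and the mask is derived from the line size here, an assumption that differs from the literal `0x0FF` in the header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_CACHE_LINE_SZ 256   /* mirrors CACHE_LINE_SZ in the diff */

/* Round an address down to a cache-line boundary. The mask is derived from
 * the line size, so it stays correct if the size changes (power of two only). */
void *
demo_align_down( void *ptr )
 { uintptr_t mask = ~(uintptr_t)(DEMO_CACHE_LINE_SZ - 1);
   return (void *)( (uintptr_t)ptr & mask );
 }

/* Allocate size bytes whose start is cache-line aligned. Over-allocates by one
 * line, steps forward almost one line, then rounds down. The raw pointer is
 * handed back through rawOut because that is what free() must receive. */
void *
demo_alloc_aligned( size_t size, void **rawOut )
 { uint8_t *raw;
   assert( rawOut != NULL );
   raw = malloc( size + DEMO_CACHE_LINE_SZ );
   *rawOut = raw;
   if( raw == NULL ) return NULL;
   return demo_align_down( raw + DEMO_CACHE_LINE_SZ - 1 );
 }
```

Because the result is rounded down from `raw + DEMO_CACHE_LINE_SZ - 1`, it never falls before `raw`, and `size` bytes starting there still fit inside the allocation.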
--- a/Defines/VMS_defs.h Mon Sep 03 03:34:54 2012 -0700
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,43 +0,0 @@
-/*
- * Copyright 2009 OpenSourceStewardshipFoundation.org
- * Licensed under GNU General Public License version 2
- *
- * Author: seanhalle@yahoo.com
- *
- */
-
-#ifndef _VMS_DEFS_MAIN_H
-#define _VMS_DEFS_MAIN_H
-#define _GNU_SOURCE
-
-//=========================== VMS-wide defs ===============================
-
-#define SUCCESS 0
-
-   //only after macro-expansion are the defs of writePrivQ, aso looked up
-   // so these defs can be at the top, and writePrivQ defined later on..
-#define writeVMSQ writePrivQ
-#define readVMSQ readPrivQ
-#define makeVMSQ makePrivQ
-#define numInVMSQ numInPrivQ
-#define VMSQueueStruc PrivQueueStruc
-
-
-/*The language should re-define this, but need a default in case it doesn't*/
-#ifndef _LANG_NAME_
-#define _LANG_NAME_ ""
-#endif
-
-//====================== Hardware Constants ============================
-#include "VMS_defs__HW_constants.h"
-
-//====================== Macros ======================
-   //for turning macros and other VMS features on and off
-#include "VMS_defs__turn_on_and_off.h"
-
-#include "../Services_Offered_by_VMS/Debugging/DEBUG__macros.h"
-#include "../Services_Offered_by_VMS/Measurement_and_Stats/MEAS__macros.h"
-
-//===========================================================================
-#endif /* */
-
--- a/Defines/VMS_defs__HW_constants.h Mon Sep 03 03:34:54 2012 -0700
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,54 +0,0 @@
-/*
- * Copyright 2012 OpenSourceStewardshipFoundation
- * Licensed under BSD
- *
- * Author: seanhalle@yahoo.com
- *
- */
-
-#ifndef _VMS_HW_SPEC_DEFS_H
-#define _VMS_HW_SPEC_DEFS_H
-#define _GNU_SOURCE
-
-
-//========================= Hardware related Constants =====================
-   //This value is the number of hardware threads in the shared memory
-   // machine
-#define NUM_CORES 4
-
-   // tradeoff amortizing master fixed overhead vs imbalance potential
-   // when work-stealing, can make bigger, at risk of losing cache affinity
-#define NUM_ANIM_SLOTS 1
-
-   //These are for backoff inside core-loop, which reduces lock contention
-#define NUM_REPS_W_NO_WORK_BEFORE_YIELD 10
-#define NUM_REPS_W_NO_WORK_BEFORE_BACKOFF 2
-#define MASTERLOCK_RETRIES_BEFORE_YIELD 100
-#define NUM_TRIES_BEFORE_DO_BACKOFF 10
-#define GET_LOCK_BACKOFF_WEIGHT 100
-
-   // stack size in virtual processors created
-#define VIRT_PROCR_STACK_SIZE 0x8000 /* 32K */
-
-   // memory for VMS_int__malloc
-#define MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE 0x8000000 /* 128M */
-
-   //Frequency of TS counts -- have to do tests to verify
-   //NOTE: turn off (in BIOS) TURBO-BOOST and SPEED-STEP else won't be const
-#define TSCOUNT_FREQ 3180000000
-
-#define CACHE_LINE_SZ 256
-#define PAGE_SIZE 4096
-
-//To prevent false-sharing, aligns a variable to a cache-line boundary.
-//No need to use for local vars because those are never shared between cores
-#define __align_to_cacheline__ __attribute__ ((aligned(CACHE_LINE_SZ)))
-
-//aligns a pointer to cacheline. The memory area has to contain at least
-//CACHE_LINE_SZ bytes more then needed
-#define __align_address(ptr) ((void*)(((uintptr_t)(ptr))&((uintptr_t)(~0x0FF))))
-
-//===========================================================================
-
-#endif /* _VMS_DEFS_H */
-
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/HW_Dependent_Primitives/PR__HW_measurement.c Wed Sep 19 23:12:44 2012 -0700
@@ -0,0 +1,87 @@
+#include <unistd.h>
+#include <fcntl.h>
+#include <linux/types.h>
+#include <linux/perf_event.h>
+#include <errno.h>
+#include <sys/syscall.h>
+#include <linux/prctl.h>
+
+#include "../PR.h"
+
+void setup_perf_counters(){
+#ifdef HOLISTIC__TURN_ON_PERF_COUNTERS
+   struct perf_event_attr hw_event;
+   memset(&hw_event,0,sizeof(hw_event));
+   hw_event.size = sizeof(struct perf_event_attr);
+   hw_event.disabled = 1;
+   hw_event.inherit = 1; /* children inherit it */
+   hw_event.pinned = 1; /* must always be on PMU */
+   hw_event.exclusive = 0; /* only group on PMU */
+   hw_event.exclude_user = 0; /* don't count user */
+   hw_event.exclude_kernel = 0; /* ditto kernel */
+   hw_event.exclude_hv = 0; /* ditto hypervisor */
+   hw_event.exclude_idle = 0; /* don't count when idle */
+
+   int coreIdx;
+   for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ )
+    {
+      hw_event.type = PERF_TYPE_HARDWARE;
+      hw_event.config = PERF_COUNT_HW_CPU_CYCLES; //cycles
+      _PRMasterEnv->cycles_counter_fd[coreIdx] = syscall(__NR_perf_event_open, &hw_event,
+         0,//pid_t pid,
+         coreIdx,//int cpu,
+         -1,//int group_fd,
+         0//unsigned long flags
+         );
+      if (_PRMasterEnv->cycles_counter_fd[coreIdx]<0){
+         fprintf(stderr,"On core %d: ",coreIdx);
+         perror("Failed to open cycles counter");
+      }
+      hw_event.type = PERF_TYPE_HARDWARE;
+      hw_event.config = PERF_COUNT_HW_INSTRUCTIONS; //instrs
+      _PRMasterEnv->instrs_counter_fd[coreIdx] = syscall(__NR_perf_event_open, &hw_event,
+         0,//pid_t pid,
+         coreIdx,//int cpu,
+         -1,//int group_fd,
+         0//unsigned long flags
+         );
+      if (_PRMasterEnv->instrs_counter_fd[coreIdx]<0){
+         fprintf(stderr,"On core %d: ",coreIdx);
+         perror("Failed to open instrs counter");
+      }
+      hw_event.type = PERF_TYPE_HW_CACHE;
+      hw_event.config = PERF_COUNT_HW_CACHE_L1D << 0 |
+         (PERF_COUNT_HW_CACHE_OP_READ << 8) |
+         (PERF_COUNT_HW_CACHE_RESULT_MISS << 16); //cache misses
+      _PRMasterEnv->cachem_counter_fd[coreIdx] = syscall(__NR_perf_event_open, &hw_event,
+         0,//pid_t pid,
+         coreIdx,//int cpu,
+         -1,//int group_fd,
+         0//unsigned long flags
+         );
+      if (_PRMasterEnv->cachem_counter_fd[coreIdx]<0){
+         fprintf(stderr,"On core %d: ",coreIdx);
+         perror("Failed to open cache miss counter");
+         exit(1);
+      }
+    }
+
+   prctl(PR_TASK_PERF_EVENTS_ENABLE);
+#endif
+}
+
+__inline__ uint64_t rdtsc(){
+   uint32_t lo, hi;
+   __asm__ __volatile__ ( // serialize
+      "xorl %%eax,%%eax \n cpuid"
+      ::: "%rax", "%rbx", "%rcx", "%rdx");
+   __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
+   /* asm volatile("RDTSC;"
+      "movl %%eax, %0;"
+      "movl %%edx, %1;"
+      : "=m" (lo), "=m" (hi)
+      :
+      : "%eax", "%edx"
+      ); */
+   return (uint64_t)hi << 32 | lo;
+}
\ No newline at end of file
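The `rdtsc()` wrapper in this file returns the two 32-bit halves of the counter joined into one 64-bit value. A portable sketch of that joining step follows, together with a conversion from ticks to nanoseconds. Both function names are hypothetical, `DEMO_TSCOUNT_FREQ` simply copies the `TSCOUNT_FREQ` value from `PR_defs__HW_constants.h`, and a constant counter frequency is assumed (the header itself notes this has to be verified per machine).

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TSCOUNT_FREQ 3180000000ULL  /* same value as TSCOUNT_FREQ; machine-specific */

/* Join the two 32-bit halves delivered by RDTSC (EDX:EAX) into 64 bits.
 * hi must be widened before the shift; a 32-bit shift by 32 would be undefined. */
uint64_t
demo_combine_tsc( uint32_t hi, uint32_t lo )
 { return ((uint64_t)hi << 32) | lo;
 }

/* Convert the interval between two counter readings into nanoseconds,
 * assuming the counter ticks at a constant DEMO_TSCOUNT_FREQ. */
uint64_t
demo_ticks_to_ns( uint64_t start, uint64_t end )
 { uint64_t ticks;
   assert( end >= start );
   ticks = end - start;
   /* whole seconds and remainder handled separately to avoid 64-bit overflow */
   return (ticks / DEMO_TSCOUNT_FREQ) * 1000000000ULL
        + (ticks % DEMO_TSCOUNT_FREQ) * 1000000000ULL / DEMO_TSCOUNT_FREQ;
 }
```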
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/HW_Dependent_Primitives/PR__HW_measurement.h Wed Sep 19 23:12:44 2012 -0700
@@ -0,0 +1,63 @@
+/*
+ * Copyright 2009 OpenSourceStewardshipFoundation.org
+ * Licensed under GNU General Public License version 2
+ *
+ * Author: seanhalle@yahoo.com
+ *
+ */
+
+#ifndef _PR__HW_MEASUREMENT_H
+#define _PR__HW_MEASUREMENT_H
+#define _GNU_SOURCE
+
+
+//=================== Macros to Capture Measurements ======================
+
+typedef union
+ { uint32 lowHigh[2];
+   uint64 longVal;
+ }
+TSCountLowHigh;
+
+
+//=================== Macros to Capture Measurements ======================
+//
+//===== RDTSC wrapper =====
+//Also runs with x86_64 code
+#define saveTSCLowHigh(lowHighIn) \
+   asm volatile("RDTSC; \
+                 movl %%eax, %0; \
+                 movl %%edx, %1;" \
+   /* outputs */ : "=m" (lowHighIn.lowHigh[0]), "=m" (lowHighIn.lowHigh[1])\
+   /* inputs */ : \
+   /* clobber */ : "%eax", "%edx" \
+                );
+
+#define saveTimeStampCountInto(low, high) \
+   asm volatile("RDTSC; \
+                 movl %%eax, %0; \
+                 movl %%edx, %1;" \
+   /* outputs */ : "=m" (low), "=m" (high)\
+   /* inputs */ : \
+   /* clobber */ : "%eax", "%edx" \
+                );
+
+#define saveLowTimeStampCountInto(low) \
+   asm volatile("RDTSC; \
+                 movl %%eax, %0;" \
+   /* outputs */ : "=m" (low) \
+   /* inputs */ : \
+   /* clobber */ : "%eax", "%edx" \
+                );
+
+inline TSCount getTSCount();
+
+
+   //For code that calculates normalization-offset between TSC counts of
+   // different cores.
+//#define NUM_TSC_ROUND_TRIPS 10
+
+void setup_perf_counters();
+uint64_t rdtsc(void);
+#endif /* */
+
10.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 10.2 +++ b/HW_Dependent_Primitives/PR__primitives.c Wed Sep 19 23:12:44 2012 -0700 10.3 @@ -0,0 +1,137 @@ 10.4 +/* 10.5 + * This File contains all hardware dependent C code. 10.6 + */ 10.7 + 10.8 + 10.9 +#include "../PR.h" 10.10 + 10.11 +/*Reset the stack then set it up with __cdecl structure on it 10.12 + * Except doing a trick for 64 bits, where point slave to helper assembly 10.13 + * that copies the function pointer off stack and into a reg, then 10.14 + * jumps to it. So, set the resumeInstrPtr to the helper-assembly. 10.15 + *This is for first-time startup of slave.. it trashes the stack. 10.16 + *No registers saved into old stack frame, and no animator state to 10.17 + * return to 10.18 + * 10.19 + *This was factored into separate function because it's used stand-alone in 10.20 + * some wrapper-libraries (but only "int" version, to warn users to check 10.21 + * carefully that it's safe) 10.22 + */ 10.23 +inline void 10.24 +PR_int__reset_slaveVP_to_TopLvlFn( SlaveVP *slaveVP, TopLevelFnPtr fnPtr, 10.25 + void *dataParam) 10.26 + { void *stackPtr; 10.27 + 10.28 +// Start of Hardware dependent part 10.29 + 10.30 + //Set slave's instr pointer to a helper Fn that copies params from stack 10.31 + slaveVP->resumeInstrPtr = (TopLevelFnPtr)&startUpTopLevelFn; 10.32 + 10.33 + //fnPtr takes two params -- void *dataParam & void *animSlv 10.34 + // Stack grows *down*, so start it at highest stack addr, minus room 10.35 + // for 2 params + return addr. Do ptr arith in terms of bytes.. 10.36 + stackPtr = 10.37 + (uint8 *)slaveVP->startOfStack + VIRT_PROCR_STACK_SIZE - 4*sizeof(void*); 10.38 + 10.39 + //setup __cdecl on stack 10.40 + //Normally, return Addr is in loc pointed to by stackPtr, but doing a 10.41 + // trick for 64 bit arch, where put ptr to top-level fn there instead, 10.42 + // and set resumeInstrPtr to a helper-fn that copies the top-level 10.43 + // fn ptr and params into registers. 
10.44 + //Then, dataParam is at stackPtr + 8 bytes, & animating SlaveVP above 10.45 + //Do ptr arith in terms of pointers 10.46 + *((SlaveVP**)stackPtr + 2 ) = slaveVP; //rightmost param 10.47 + *((void**)stackPtr + 1 ) = dataParam; //next param to left 10.48 + *((void**)stackPtr) = (void*)fnPtr; //copied to reg by helper Fn 10.49 + 10.50 + 10.51 +// end of Hardware dependent part 10.52 + 10.53 + //core controller will switch to stack & frame pointers stored in slave, 10.54 + // can't use this fn if have state on stack that needs preserving. 10.55 + slaveVP->stackPtr = stackPtr; 10.56 + slaveVP->framePtr = stackPtr; 10.57 + } 10.58 + 10.59 + 10.60 +/*Preserve the stack, pushing the __cdecl structure onto it 10.61 + * For 64 bits, params passed in regs, so point slave to helper assembly 10.62 + * that copies the arguments off stack and into regs, then 10.63 + * jumps to Fn. So, set the resumeInstrPtr to the helper-assembly. 10.64 + * 10.65 + *This preserves the stack state existed at time slave was suspended. 10.66 + */ 10.67 +inline void 10.68 +PR_int__point_slaveVP_to_OneParamFn( SlaveVP *slaveVP, void *fnPtr, 10.69 + void *param) 10.70 + { void *stackPtr; 10.71 + 10.72 +// Start of Hardware dependent part 10.73 + 10.74 + // Get the slave's current stack ptr, and make room for param + ret addr 10.75 + stackPtr = ((void **)slaveVP->stackPtr - 2); 10.76 + 10.77 + //save slave's current instr ptr as the return addr, so stack looks 10.78 + // just like it does after a call instr. 
10.79 + //Put argument plus fn addr onto stack -- helper will copy into regs 10.80 + // then jump to the fn 10.81 + //fnPtr is just below top of stack, param is above at stackPtr + 8 bytes 10.82 + *((void**)stackPtr + 1 ) = param; 10.83 + *((void**)stackPtr) = slaveVP->resumeInstrPtr; //acts as return addr 10.84 + *((void**)stackPtr - 1) = (void*)fnPtr; //what helper jmps to 10.85 + 10.86 + //Set slave's instr pointer to a helper Fn that copies params from stack 10.87 + slaveVP->resumeInstrPtr = (TopLevelFnPtr)&jmpToOneParamFn; 10.88 + 10.89 +// end of Hardware dependent part 10.90 + 10.91 + //core controller will switch to stack & frame pointers stored in slave, 10.92 + // then jmp to helper Fn, which will then move param to register used 10.93 + // to pass argument and jmp to fnPtr saved on stack. 10.94 + //That fn should save the framePtr on stack and make room 10.95 + // for its own frame, as normal. So don't modify framePtr, only stack 10.96 + slaveVP->stackPtr = stackPtr; 10.97 + } 10.98 + 10.99 + 10.100 +/*Same as for one-parameter function, but puts two arguments on stack 10.101 + *Preserve the stack, pushing the __cdecl structure onto it 10.102 + * For 64 bits, params passed in regs, so point slave to helper assembly 10.103 + * that copies the arguments off stack and into regs, then 10.104 + * jumps to Fn. So, set the resumeInstrPtr to the helper-assembly. 10.105 + * 10.106 + *This preserves the stack state existed at time slave was suspended. 10.107 + */ 10.108 +inline void 10.109 +PR_int__point_slaveVP_to_TwoParamFn( SlaveVP *slaveVP, void *fnPtr, 10.110 + void *param1, void *param2) 10.111 + { void *stackPtr; 10.112 + 10.113 +// Start of Hardware dependent part 10.114 + 10.115 + // Get the slave's current stack ptr, and make room for param + ret addr 10.116 + stackPtr = slaveVP->stackPtr - 3; 10.117 + 10.118 + //save slave's current instr ptr as the return addr, so stack looks 10.119 + // just like it does after a call instr. 
10.120 + //Put argument plus fn addr onto stack -- helper will copy into regs 10.121 + // then jump to the fn 10.122 + //fnPtr is just below top of stack, param1 is above at stackPtr + 8 bytes 10.123 + *((void**)stackPtr + 2 ) = param2; 10.124 + *((void**)stackPtr + 1 ) = param1; 10.125 + *((void**)stackPtr) = slaveVP->resumeInstrPtr; //acts as return addr 10.126 + *((void**)stackPtr - 1) = (void*)fnPtr; //what helper jmps to 10.127 + 10.128 + //Set slave's instr pointer to a helper Fn that copies params from stack 10.129 + slaveVP->resumeInstrPtr = (TopLevelFnPtr)&jmpToTwoParamFn; 10.130 + 10.131 +// end of Hardware dependent part 10.132 + 10.133 + //core controller will switch to stack & frame pointers stored in slave, 10.134 + // then jmp to helper Fn, which will then move param to register used 10.135 + // to pass argument and jmp to fnPtr saved on stack. 10.136 + //That fn should save the framePtr on stack and make room 10.137 + // for its own frame, as normal. So don't modify framePtr, only stack 10.138 + slaveVP->stackPtr = stackPtr; 10.139 + } 10.140 +
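The stack layout that the two-parameter variant builds, and that jmpToTwoParamFn later reads back at -0x08(%rsp), 0x08(%rsp) and 0x10(%rsp), can be checked on an ordinary array without any context switching. The following is only a sketch under stated assumptions: SketchVP, sketch_helper_marker and sketch_point_to_two_param_fn are hypothetical stand-ins invented for illustration, not part of PR, and an 8-byte pointer is assumed when relating slots to the byte offsets used in the assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the first three fields of SlaveVP. */
typedef struct {
    void *stackPtr;
    void *framePtr;
    void *resumeInstrPtr;
} SketchVP;

/* Hypothetical marker standing in for the address of jmpToTwoParamFn. */
static char sketch_helper_marker;

/* Builds this layout, in pointer-sized slots relative to the new stack ptr:
 *   sp[-1] = fnPtr          (helper reads -0x08(%rsp) and jumps there)
 *   sp[ 0] = old resume ptr (looks like the return addr left by a call)
 *   sp[ 1] = param1         (helper reads  0x08(%rsp) into %rdi)
 *   sp[ 2] = param2         (helper reads  0x10(%rsp) into %rsi)
 */
static void
sketch_point_to_two_param_fn(SketchVP *vp, void *fnPtr,
                             void *param1, void *param2)
{
    /* Cast before subtracting, so that "- 3" moves three slots. GCC treats
     * arithmetic on a bare void* as byte arithmetic, which would leave the
     * stack pointer misaligned. */
    void **sp = (void **)vp->stackPtr - 3;

    sp[2]  = param2;
    sp[1]  = param1;
    sp[0]  = vp->resumeInstrPtr;   /* acts as the return address */
    sp[-1] = fnPtr;                /* what the helper would jump to */

    vp->resumeInstrPtr = &sketch_helper_marker;
    vp->stackPtr       = sp;       /* framePtr is deliberately left alone */
}
```

Calling the function with stackPtr pointing into the middle of a void* array, then inspecting the four slots, shows whether the C side and the offsets in the assembly helper agree.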
11.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 11.2 +++ b/HW_Dependent_Primitives/PR__primitives.h Wed Sep 19 23:12:44 2012 -0700 11.3 @@ -0,0 +1,55 @@ 11.4 +/* 11.5 + * Copyright 2009 OpenSourceStewardshipFoundation.org 11.6 + * Licensed under GNU General Public License version 2 11.7 + * 11.8 + * Author: seanhalle@yahoo.com 11.9 + * 11.10 + */ 11.11 + 11.12 +#ifndef _PR__PRIMITIVES_H 11.13 +#define _PR__PRIMITIVES_H 11.14 +#define _GNU_SOURCE 11.15 + 11.16 +void 11.17 +recordCoreCtlrReturnLabelAddr(void **returnAddress); 11.18 + 11.19 +void 11.20 +switchToSlv(SlaveVP *nextSlave); 11.21 + 11.22 +void 11.23 +switchToCoreCtlr(SlaveVP *nextSlave); 11.24 + 11.25 +void 11.26 +masterSwitchToCoreCtlr(SlaveVP *nextSlave); 11.27 + 11.28 +void 11.29 +startUpTopLevelFn(); 11.30 + 11.31 +void 11.32 +jmpToOneParamFn(); 11.33 + 11.34 +void 11.35 +jmpToTwoParamFn(); 11.36 + 11.37 +void * 11.38 +asmTerminateCoreCtlr(SlaveVP *currSlv); 11.39 + 11.40 +#define flushRegisters() \ 11.41 + asm volatile ("":::"%rbx", "%r12", "%r13","%r14","%r15") 11.42 + 11.43 +void 11.44 +PR_int__save_return_into_ptd_to_loc_then_do_ret(void *ptdToLoc); 11.45 + 11.46 +void 11.47 +PR_int__return_to_addr_in_ptd_to_loc(void *ptdToLoc); 11.48 + 11.49 +inline void 11.50 +PR_int__point_slaveVP_to_OneParamFn( SlaveVP *slaveVP, void *fnPtr, 11.51 + void *param); 11.52 + 11.53 +inline void 11.54 +PR_int__point_slaveVP_to_TwoParamFn( SlaveVP *slaveVP, void *fnPtr, 11.55 + void *param1, void *param2); 11.56 + 11.57 +#endif /* _PR__HW_DEPENDENT_H */ 11.58 +
12.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 12.2 +++ b/HW_Dependent_Primitives/PR__primitives_asm.s Wed Sep 19 23:12:44 2012 -0700 12.3 @@ -0,0 +1,189 @@ 12.4 +.data 12.5 + 12.6 + 12.7 +.text 12.8 + 12.9 +//Save return label address for the coreCtlr to pointer 12.10 +//Arguments: Pointer to variable holding address 12.11 +.globl recordCoreCtlrReturnLabelAddr 12.12 +recordCoreCtlrReturnLabelAddr: 12.13 + movq $coreCtlrReturn, %rcx #load label address 12.14 + movq %rcx, (%rdi) #save address to pointer 12.15 + ret 12.16 + 12.17 + 12.18 +//Trick for 64 bit arch -- copies args from stack into regs, then does jmp to 12.19 +// the top-level function, which was pointed to by the stack-ptr 12.20 +.globl startUpTopLevelFn 12.21 +startUpTopLevelFn: 12.22 + movq %rdi , %rsi #get second argument from first argument of switchSlv 12.23 + movq 0x08(%rsp), %rdi #get first argument from stack 12.24 + movq (%rsp) , %rax #get top-level function's addr from stack 12.25 + jmp *%rax #jump to the top-level function 12.26 + 12.27 + 12.28 +//Args passed in regs in 64 bit arch. This copies args from stack into regs, 12.29 +// then does jmp to the function, whose addr is on stack. 
12.30 +//For 64bit, %rdi is first arg, %rsi is second arg to function 12.31 +//The top of stack is a valid return addr (old value of slaveVP's instrPtr), 12.32 +// and the fnPtr is just below the top of stack (will be overwritten when 12.33 +// fn saves the frame ptr) 12.34 +.globl jmpToOneParamFn 12.35 +jmpToOneParamFn: 12.36 + movq 0x08(%rsp), %rdi #get the argument from stack 12.37 + movq -0x08(%rsp), %rax #get function's addr from stack 12.38 + jmp *%rax #jump to the function 12.39 + 12.40 +.globl jmpToTwoParamFn 12.41 +jmpToTwoParamFn: 12.42 + movq 0x10(%rsp), %rsi #get the second argument from stack 12.43 + movq 0x08(%rsp), %rdi #get the first argument from stack 12.44 + movq -0x08(%rsp), %rax #get function's addr from stack 12.45 + jmp *%rax #jump to the function 12.46 + 12.47 + 12.48 +//Switches form CoreCtlr to either a normal Slv VP or the Master VP 12.49 +//switch to VP's stack and frame ptr then jump to VP's next-instr-ptr 12.50 +/* SlaveVP offsets: 12.51 + * 0x00 stackPtr 12.52 + * 0x08 framePtr 12.53 + * 0x10 resumeInstrPtr 12.54 + * 0x18 coreCtlrFramePtr 12.55 + * 0x20 coreCtlrStackPtr 12.56 + * 12.57 + * _PRMasterEnv offsets: 12.58 + * 0x00 coreCtlrReturnPt 12.59 + * 0x100 masterLock 12.60 + */ 12.61 +.globl switchToSlv 12.62 +switchToSlv: 12.63 + #SlaveVP in %rdi 12.64 + movq %rsp , 0x20(%rdi) #save core ctlr stack pointer 12.65 + movq %rbp , 0x18(%rdi) #save core ctlr frame pointer 12.66 + movq 0x00(%rdi), %rsp #restore stack pointer 12.67 + movq 0x08(%rdi), %rbp #restore frame pointer 12.68 + movq 0x10(%rdi), %rax #get jmp pointer 12.69 + jmp *%rax #jmp to Slv 12.70 +coreCtlrReturn: 12.71 + ret 12.72 + 12.73 + 12.74 +//switches to core controller. 
saves return address 12.75 +/* SlaveVP offsets: 12.76 + * 0x00 stackPtr 12.77 + * 0x08 framePtr 12.78 + * 0x10 resumeInstrPtr 12.79 + * 0x18 coreCtlrFramePtr 12.80 + * 0x20 coreCtlrStackPtr 12.81 + * 12.82 + * _PRMasterEnv offsets: 12.83 + * 0x00 coreCtlrReturnPt 12.84 + * 0x100 masterLock 12.85 + */ 12.86 +.globl switchToCoreCtlr 12.87 +switchToCoreCtlr: 12.88 + #SlaveVP in %rdi 12.89 + movq $SlvReturn, 0x10(%rdi) #store return address 12.90 + movq %rsp , 0x00(%rdi) #save stack pointer 12.91 + movq %rbp , 0x08(%rdi) #save frame pointer 12.92 + movq 0x20(%rdi), %rsp #restore stack pointer 12.93 + movq 0x18(%rdi), %rbp #restore frame pointer 12.94 + movq $_PRMasterEnv, %rcx 12.95 + movq (%rcx), %rcx #_PRMasterEnv is pointer to struct 12.96 + movq 0x00(%rcx), %rax #get CoreCtlrStartPt 12.97 + jmp *%rax #jmp to CoreCtlr 12.98 +SlvReturn: 12.99 + ret 12.100 + 12.101 + 12.102 + 12.103 +//switches to core controller from master. saves return address 12.104 +//Releases masterLock so the next AnimationMaster can be executed 12.105 +/* SlaveVP offsets: 12.106 + * 0x00 stackPtr 12.107 + * 0x08 framePtr 12.108 + * 0x10 resumeInstrPtr 12.109 + * 0x18 coreCtlrFramePtr 12.110 + * 0x20 coreCtlrStackPtr 12.111 + * 12.112 + * _PRMasterEnv offsets: 12.113 + * 0x00 coreCtlrReturnPt 12.114 + * 0x100 masterLock 12.115 + */ 12.116 +.globl masterSwitchToCoreCtlr 12.117 +masterSwitchToCoreCtlr: 12.118 + #SlaveVP in %rdi 12.119 + movq $MasterReturn, 0x10(%rdi) #store return address 12.120 + movq %rsp , 0x00(%rdi) #save stack pointer 12.121 + movq %rbp , 0x08(%rdi) #save frame pointer 12.122 + movq 0x20(%rdi), %rsp #restore stack pointer 12.123 + movq 0x18(%rdi), %rbp #restore frame pointer 12.124 + movq $_PRMasterEnv, %rcx 12.125 + movq (%rcx), %rcx #_PRMasterEnv is pointer to struct 12.126 + movq 0x00(%rcx), %rax #get CoreCtlr return pt 12.127 + movl $0x0 , 0x100(%rcx) #release lock 12.128 + jmp *%rax #jmp to CoreCtlr 12.129 +MasterReturn: 12.130 + ret 12.131 + 12.132 + 12.133 +/*Switch 
to terminateCoreCtlr 12.134 + *This is called by endOSThreadFn, which is the top-level function given 12.135 + * to a shutdown slave. When such a slave gets switched to, by the core 12.136 + * controller, it runs the top-level function, which calls this, which 12.137 + * then calls terminateCoreCtlr, which ends the pthread. Note, when get 12.138 + * here, stack is already set up for switchSlv and Slv ptr is in %rdi. 12.139 + *Do not save registers of Slv because this function will never return 12.140 + * 12.141 + * SlaveVP offsets: 12.142 + * 0x00 stackPtr 12.143 + * 0x08 framePtr 12.144 + * 0x10 resumeInstrPtr 12.145 + * 0x18 coreCtlrFramePtr 12.146 + * 0x20 coreCtlrStackPtr 12.147 + * 12.148 + * _PRMasterEnv offsets: 12.149 + * 0x00 coreCtlrReturnPt 12.150 + * 0x100 masterLock 12.151 + */ 12.152 +.globl asmTerminateCoreCtlr 12.153 +asmTerminateCoreCtlr: #SlaveVP ptr is in %rdi 12.154 + movq 0x20(%rdi), %rsp #restore stack pointer 12.155 + movq 0x18(%rdi), %rbp #restore frame pointer 12.156 + movq $terminateCoreCtlr, %rax 12.157 + jmp *%rax #jmp to fn that ends the pthread 12.158 + 12.159 + 12.160 +/* 12.161 + * This one for the sequential version is special. It discards the current stack 12.162 + * and returns directly from the coreCtlr after PR_WL__dissipate_slaveVP was called 12.163 + */ 12.164 +.globl asmTerminateCoreCtlrSeq 12.165 +asmTerminateCoreCtlrSeq: 12.166 + #SlaveVP in %rdi 12.167 + movq 0x20(%rdi), %rsp #restore stack pointer 12.168 + movq 0x18(%rdi), %rbp #restore frame pointer 12.169 + #argument is in %rdi 12.170 + call PR_int__dissipate_slaveVP 12.171 + movq %rbp , %rsp #goto the coreCtlrs stack 12.172 + pop %rbp #restore the old framepointer 12.173 + ret #return from core controller 12.174 + 12.175 + 12.176 +//Takes the return addr off the stack and saves into the loc pointed to by 12.177 +// by the parameter passed in via rdi. 
Return addr is at 0x8(%rbp) for 64bit 12.178 +.globl PR_int__save_return_into_ptd_to_loc_then_do_ret 12.179 +PR_int__save_return_into_ptd_to_loc_then_do_ret: 12.180 + movq 0x08(%rbp), %rax #get ret address, rbp is the same as in the calling function 12.181 + movq %rax, (%rdi) #write ret addr into addr passed as param field 12.182 + ret 12.183 + 12.184 + 12.185 +//Assembly code changes the return addr on the stack to the one 12.186 +// pointed to by the parameter, then returns. Stack's return addr is at 0x8(%rbp) 12.187 +.globl PR_int__return_to_addr_in_ptd_to_loc 12.188 +PR_int__return_to_addr_in_ptd_to_loc: 12.189 + movq (%rdi), %rax #get return addr from addr passed as param 12.190 + movq %rax, 0x08(%rbp) #write return addr to the stack of the caller 12.191 + ret 12.192 +
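The assembly above hard-codes the SlaveVP field offsets 0x00 through 0x20, so any reordering of those fields in the C struct silently breaks the switch routines. One way to guard against that is an offsetof check, sketched below. SketchSlaveVP and sketch_offsets_match_asm are hypothetical names made up for this illustration; the struct only mirrors the five fields listed in the assembly comments, and the expected values assume 8-byte pointers (x86-64, LP64).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the leading SlaveVP fields used by the assembly. */
typedef struct {
    void *stackPtr;          /* asm uses 0x00(%rdi) */
    void *framePtr;          /* asm uses 0x08(%rdi) */
    void *resumeInstrPtr;    /* asm uses 0x10(%rdi) */
    void *coreCtlrFramePtr;  /* asm uses 0x18(%rdi) */
    void *coreCtlrStackPtr;  /* asm uses 0x20(%rdi) */
    int   slaveID;           /* first field not touched by the assembly */
} SketchSlaveVP;

/* Returns 1 when the C layout matches the offsets written in the assembly,
 * 0 otherwise. Only meaningful where sizeof(void*) == 8. */
static int
sketch_offsets_match_asm(void)
{
    return offsetof(SketchSlaveVP, stackPtr)         == 0x00
        && offsetof(SketchSlaveVP, framePtr)         == 0x08
        && offsetof(SketchSlaveVP, resumeInstrPtr)   == 0x10
        && offsetof(SketchSlaveVP, coreCtlrFramePtr) == 0x18
        && offsetof(SketchSlaveVP, coreCtlrStackPtr) == 0x20;
}
```

In a real build the same comparisons could be written as compile-time assertions next to the struct definition, so that a mismatch stops the build instead of corrupting a stack at run time.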
13.1 --- a/HW_Dependent_Primitives/VMS__HW_measurement.c Mon Sep 03 03:34:54 2012 -0700 13.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 13.3 @@ -1,87 +0,0 @@ 13.4 -#include <unistd.h> 13.5 -#include <fcntl.h> 13.6 -#include <linux/types.h> 13.7 -#include <linux/perf_event.h> 13.8 -#include <errno.h> 13.9 -#include <sys/syscall.h> 13.10 -#include <linux/prctl.h> 13.11 - 13.12 -#include "../VMS.h" 13.13 - 13.14 -void setup_perf_counters(){ 13.15 -#ifdef HOLISTIC__TURN_ON_PERF_COUNTERS 13.16 - struct perf_event_attr hw_event; 13.17 - memset(&hw_event,0,sizeof(hw_event)); 13.18 - hw_event.size = sizeof(struct perf_event_attr); 13.19 - hw_event.disabled = 1; 13.20 - hw_event.inherit = 1; /* children inherit it */ 13.21 - hw_event.pinned = 1; /* must always be on PMU */ 13.22 - hw_event.exclusive = 0; /* only group on PMU */ 13.23 - hw_event.exclude_user = 0; /* don't count user */ 13.24 - hw_event.exclude_kernel = 0; /* ditto kernel */ 13.25 - hw_event.exclude_hv = 0; /* ditto hypervisor */ 13.26 - hw_event.exclude_idle = 0; /* don't count when idle */ 13.27 - 13.28 - int coreIdx; 13.29 - for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ ) 13.30 - { 13.31 - hw_event.type = PERF_TYPE_HARDWARE; 13.32 - hw_event.config = PERF_COUNT_HW_CPU_CYCLES; //cycles 13.33 - _VMSMasterEnv->cycles_counter_fd[coreIdx] = syscall(__NR_perf_event_open, &hw_event, 13.34 - 0,//pid_t pid, 13.35 - coreIdx,//int cpu, 13.36 - -1,//int group_fd, 13.37 - 0//unsigned long flags 13.38 - ); 13.39 - if (_VMSMasterEnv->cycles_counter_fd[coreIdx]<0){ 13.40 - fprintf(stderr,"On core %d: ",coreIdx); 13.41 - perror("Failed to open cycles counter"); 13.42 - } 13.43 - hw_event.type = PERF_TYPE_HARDWARE; 13.44 - hw_event.config = PERF_COUNT_HW_INSTRUCTIONS; //instrs 13.45 - _VMSMasterEnv->instrs_counter_fd[coreIdx] = syscall(__NR_perf_event_open, &hw_event, 13.46 - 0,//pid_t pid, 13.47 - coreIdx,//int cpu, 13.48 - -1,//int group_fd, 13.49 - 0//unsigned long flags 13.50 - ); 13.51 - if 
(_VMSMasterEnv->instrs_counter_fd[coreIdx]<0){ 13.52 - fprintf(stderr,"On core %d: ",coreIdx); 13.53 - perror("Failed to open instrs counter"); 13.54 - } 13.55 - hw_event.type = PERF_TYPE_HW_CACHE; 13.56 - hw_event.config = PERF_COUNT_HW_CACHE_L1D << 0 | 13.57 - (PERF_COUNT_HW_CACHE_OP_READ << 8) | 13.58 - (PERF_COUNT_HW_CACHE_RESULT_MISS << 16); //cache misses 13.59 - _VMSMasterEnv->cachem_counter_fd[coreIdx] = syscall(__NR_perf_event_open, &hw_event, 13.60 - 0,//pid_t pid, 13.61 - coreIdx,//int cpu, 13.62 - -1,//int group_fd, 13.63 - 0//unsigned long flags 13.64 - ); 13.65 - if (_VMSMasterEnv->cachem_counter_fd[coreIdx]<0){ 13.66 - fprintf(stderr,"On core %d: ",coreIdx); 13.67 - perror("Failed to open cache miss counter"); 13.68 - exit(1); 13.69 - } 13.70 - } 13.71 - 13.72 - prctl(PR_TASK_PERF_EVENTS_ENABLE); 13.73 -#endif 13.74 -} 13.75 - 13.76 -__inline__ uint64_t rdtsc(){ 13.77 - uint32_t lo, hi; 13.78 - __asm__ __volatile__ ( // serialize 13.79 - "xorl %%eax,%%eax \n cpuid" 13.80 - ::: "%rax", "%rbx", "%rcx", "%rdx"); 13.81 - __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi)); 13.82 - /* asm volatile("RDTSC;" 13.83 - "movl %%eax, %0;" 13.84 - "movl %%edx, %1;" 13.85 - : "=m" (lo), "=m" (hi) 13.86 - : 13.87 - : "%eax", "%edx" 13.88 - ); */ 13.89 - return (uint64_t)hi << 32 | lo; 13.90 -} 13.91 \ No newline at end of file
14.1 --- a/HW_Dependent_Primitives/VMS__HW_measurement.h Mon Sep 03 03:34:54 2012 -0700 14.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 14.3 @@ -1,63 +0,0 @@ 14.4 -/* 14.5 - * Copyright 2009 OpenSourceStewardshipFoundation.org 14.6 - * Licensed under GNU General Public License version 2 14.7 - * 14.8 - * Author: seanhalle@yahoo.com 14.9 - * 14.10 - */ 14.11 - 14.12 -#ifndef _VMS__HW_MEASUREMENT_H 14.13 -#define _VMS__HW_MEASUREMENT_H 14.14 -#define _GNU_SOURCE 14.15 - 14.16 - 14.17 -//=================== Macros to Capture Measurements ====================== 14.18 - 14.19 -typedef union 14.20 - { uint32 lowHigh[2]; 14.21 - uint64 longVal; 14.22 - } 14.23 -TSCountLowHigh; 14.24 - 14.25 - 14.26 -//=================== Macros to Capture Measurements ====================== 14.27 -// 14.28 -//===== RDTSC wrapper ===== 14.29 -//Also runs with x86_64 code 14.30 -#define saveTSCLowHigh(lowHighIn) \ 14.31 - asm volatile("RDTSC; \ 14.32 - movl %%eax, %0; \ 14.33 - movl %%edx, %1;" \ 14.34 - /* outputs */ : "=m" (lowHighIn.lowHigh[0]), "=m" (lowHighIn.lowHigh[1])\ 14.35 - /* inputs */ : \ 14.36 - /* clobber */ : "%eax", "%edx" \ 14.37 - ); 14.38 - 14.39 -#define saveTimeStampCountInto(low, high) \ 14.40 - asm volatile("RDTSC; \ 14.41 - movl %%eax, %0; \ 14.42 - movl %%edx, %1;" \ 14.43 - /* outputs */ : "=m" (low), "=m" (high)\ 14.44 - /* inputs */ : \ 14.45 - /* clobber */ : "%eax", "%edx" \ 14.46 - ); 14.47 - 14.48 -#define saveLowTimeStampCountInto(low) \ 14.49 - asm volatile("RDTSC; \ 14.50 - movl %%eax, %0;" \ 14.51 - /* outputs */ : "=m" (low) \ 14.52 - /* inputs */ : \ 14.53 - /* clobber */ : "%eax", "%edx" \ 14.54 - ); 14.55 - 14.56 -inline TSCount getTSCount(); 14.57 - 14.58 - 14.59 - //For code that calculates normalization-offset between TSC counts of 14.60 - // different cores. 14.61 -//#define NUM_TSC_ROUND_TRIPS 10 14.62 - 14.63 -void setup_perf_counters(); 14.64 -uint64_t rdtsc(void); 14.65 -#endif /* */ 14.66 -
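The RDTSC wrappers in the header above deliver the counter as two 32-bit halves (EAX low, EDX high), either into separate variables or into the lowHigh array of the TSCountLowHigh union. How the halves recombine into one 64-bit count can be sketched in portable C without executing RDTSC at all. The names below are hypothetical and exist only for this illustration; the union variant gives the same result as the shift only on a little-endian machine such as x86.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of TSCountLowHigh, using <stdint.h> types. */
typedef union {
    uint32_t lowHigh[2];   /* [0] = low half (EAX), [1] = high half (EDX) */
    uint64_t longVal;
} SketchTSCountLowHigh;

/* Endian-independent combination, as done at the end of rdtsc(). */
static uint64_t
sketch_tsc_from_halves(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Combination through the union; equals the shift form only when the
 * low-order half is stored first (little-endian). */
static uint64_t
sketch_tsc_from_union(uint32_t lo, uint32_t hi)
{
    SketchTSCountLowHigh t;
    t.lowHigh[0] = lo;
    t.lowHigh[1] = hi;
    return t.longVal;
}

/* Elapsed count between two readings; unsigned subtraction stays correct
 * even if the counter wrapped once between them. */
static uint64_t
sketch_tsc_elapsed(uint64_t start, uint64_t end)
{
    return end - start;
}
```

Only the low half is needed for short intervals, which is presumably why saveLowTimeStampCountInto exists, but an interval measured that way must stay below 2^32 cycles.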
15.1 --- a/HW_Dependent_Primitives/VMS__primitives.c Mon Sep 03 03:34:54 2012 -0700 15.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 15.3 @@ -1,137 +0,0 @@ 15.4 -/* 15.5 - * This File contains all hardware dependent C code. 15.6 - */ 15.7 - 15.8 - 15.9 -#include "../VMS.h" 15.10 - 15.11 -/*Reset the stack then set it up with __cdecl structure on it 15.12 - * Except doing a trick for 64 bits, where point slave to helper assembly 15.13 - * that copies the function pointer off stack and into a reg, then 15.14 - * jumps to it. So, set the resumeInstrPtr to the helper-assembly. 15.15 - *This is for first-time startup of slave.. it trashes the stack. 15.16 - *No registers saved into old stack frame, and no animator state to 15.17 - * return to 15.18 - * 15.19 - *This was factored into separate function because it's used stand-alone in 15.20 - * some wrapper-libraries (but only "int" version, to warn users to check 15.21 - * carefully that it's safe) 15.22 - */ 15.23 -inline void 15.24 -VMS_int__reset_slaveVP_to_TopLvlFn( SlaveVP *slaveVP, TopLevelFnPtr fnPtr, 15.25 - void *dataParam) 15.26 - { void *stackPtr; 15.27 - 15.28 -// Start of Hardware dependent part 15.29 - 15.30 - //Set slave's instr pointer to a helper Fn that copies params from stack 15.31 - slaveVP->resumeInstrPtr = (TopLevelFnPtr)&startUpTopLevelFn; 15.32 - 15.33 - //fnPtr takes two params -- void *dataParam & void *animSlv 15.34 - // Stack grows *down*, so start it at highest stack addr, minus room 15.35 - // for 2 params + return addr. Do ptr arith in terms of bytes.. 15.36 - stackPtr = 15.37 - (uint8 *)slaveVP->startOfStack + VIRT_PROCR_STACK_SIZE - 4*sizeof(void*); 15.38 - 15.39 - //setup __cdecl on stack 15.40 - //Normally, return Addr is in loc pointed to by stackPtr, but doing a 15.41 - // trick for 64 bit arch, where put ptr to top-level fn there instead, 15.42 - // and set resumeInstrPtr to a helper-fn that copies the top-level 15.43 - // fn ptr and params into registers. 
15.44 - //Then, dataParam is at stackPtr + 8 bytes, & animating SlaveVP above 15.45 - //Do ptr arith in terms of pointers 15.46 - *((SlaveVP**)stackPtr + 2 ) = slaveVP; //rightmost param 15.47 - *((void**)stackPtr + 1 ) = dataParam; //next param to left 15.48 - *((void**)stackPtr) = (void*)fnPtr; //copied to reg by helper Fn 15.49 - 15.50 - 15.51 -// end of Hardware dependent part 15.52 - 15.53 - //core controller will switch to stack & frame pointers stored in slave, 15.54 - // can't use this fn if have state on stack that needs preserving. 15.55 - slaveVP->stackPtr = stackPtr; 15.56 - slaveVP->framePtr = stackPtr; 15.57 - } 15.58 - 15.59 - 15.60 -/*Preserve the stack, pushing the __cdecl structure onto it 15.61 - * For 64 bits, params passed in regs, so point slave to helper assembly 15.62 - * that copies the arguments off stack and into regs, then 15.63 - * jumps to Fn. So, set the resumeInstrPtr to the helper-assembly. 15.64 - * 15.65 - *This preserves the stack state existed at time slave was suspended. 15.66 - */ 15.67 -inline void 15.68 -VMS_int__point_slaveVP_to_OneParamFn( SlaveVP *slaveVP, void *fnPtr, 15.69 - void *param) 15.70 - { void *stackPtr; 15.71 - 15.72 -// Start of Hardware dependent part 15.73 - 15.74 - // Get the slave's current stack ptr, and make room for param + ret addr 15.75 - stackPtr = ((void **)slaveVP->stackPtr - 2); 15.76 - 15.77 - //save slave's current instr ptr as the return addr, so stack looks 15.78 - // just like it does after a call instr. 
15.79 - //Put argument plus fn addr onto stack -- helper will copy into regs 15.80 - // then jump to the fn 15.81 - //fnPtr is just below top of stack, param is above at stackPtr + 8 bytes 15.82 - *((void**)stackPtr + 1 ) = param; 15.83 - *((void**)stackPtr) = slaveVP->resumeInstrPtr; //acts as return addr 15.84 - *((void**)stackPtr - 1) = (void*)fnPtr; //what helper jmps to 15.85 - 15.86 - //Set slave's instr pointer to a helper Fn that copies params from stack 15.87 - slaveVP->resumeInstrPtr = (TopLevelFnPtr)&jmpToOneParamFn; 15.88 - 15.89 -// end of Hardware dependent part 15.90 - 15.91 - //core controller will switch to stack & frame pointers stored in slave, 15.92 - // then jmp to helper Fn, which will then move param to register used 15.93 - // to pass argument and jmp to fnPtr saved on stack. 15.94 - //That fn should save the framePtr on stack and make room 15.95 - // for its own frame, as normal. So don't modify framePtr, only stack 15.96 - slaveVP->stackPtr = stackPtr; 15.97 - } 15.98 - 15.99 - 15.100 -/*Same as for one-parameter function, but puts two arguments on stack 15.101 - *Preserve the stack, pushing the __cdecl structure onto it 15.102 - * For 64 bits, params passed in regs, so point slave to helper assembly 15.103 - * that copies the arguments off stack and into regs, then 15.104 - * jumps to Fn. So, set the resumeInstrPtr to the helper-assembly. 15.105 - * 15.106 - *This preserves the stack state existed at time slave was suspended. 15.107 - */ 15.108 -inline void 15.109 -VMS_int__point_slaveVP_to_TwoParamFn( SlaveVP *slaveVP, void *fnPtr, 15.110 - void *param1, void *param2) 15.111 - { void *stackPtr; 15.112 - 15.113 -// Start of Hardware dependent part 15.114 - 15.115 - // Get the slave's current stack ptr, and make room for param + ret addr 15.116 - stackPtr = slaveVP->stackPtr - 3; 15.117 - 15.118 - //save slave's current instr ptr as the return addr, so stack looks 15.119 - // just like it does after a call instr. 
15.120 - //Put argument plus fn addr onto stack -- helper will copy into regs 15.121 - // then jump to the fn 15.122 - //fnPtr is just below top of stack, param1 is above at stackPtr + 8 bytes 15.123 - *((void**)stackPtr + 2 ) = param2; 15.124 - *((void**)stackPtr + 1 ) = param1; 15.125 - *((void**)stackPtr) = slaveVP->resumeInstrPtr; //acts as return addr 15.126 - *((void**)stackPtr - 1) = (void*)fnPtr; //what helper jmps to 15.127 - 15.128 - //Set slave's instr pointer to a helper Fn that copies params from stack 15.129 - slaveVP->resumeInstrPtr = (TopLevelFnPtr)&jmpToTwoParamFn; 15.130 - 15.131 -// end of Hardware dependent part 15.132 - 15.133 - //core controller will switch to stack & frame pointers stored in slave, 15.134 - // then jmp to helper Fn, which will then move param to register used 15.135 - // to pass argument and jmp to fnPtr saved on stack. 15.136 - //That fn should save the framePtr on stack and make room 15.137 - // for its own frame, as normal. So don't modify framePtr, only stack 15.138 - slaveVP->stackPtr = stackPtr; 15.139 - } 15.140 -
16.1 --- a/HW_Dependent_Primitives/VMS__primitives.h Mon Sep 03 03:34:54 2012 -0700 16.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 16.3 @@ -1,55 +0,0 @@ 16.4 -/* 16.5 - * Copyright 2009 OpenSourceStewardshipFoundation.org 16.6 - * Licensed under GNU General Public License version 2 16.7 - * 16.8 - * Author: seanhalle@yahoo.com 16.9 - * 16.10 - */ 16.11 - 16.12 -#ifndef _VMS__PRIMITIVES_H 16.13 -#define _VMS__PRIMITIVES_H 16.14 -#define _GNU_SOURCE 16.15 - 16.16 -void 16.17 -recordCoreCtlrReturnLabelAddr(void **returnAddress); 16.18 - 16.19 -void 16.20 -switchToSlv(SlaveVP *nextSlave); 16.21 - 16.22 -void 16.23 -switchToCoreCtlr(SlaveVP *nextSlave); 16.24 - 16.25 -void 16.26 -masterSwitchToCoreCtlr(SlaveVP *nextSlave); 16.27 - 16.28 -void 16.29 -startUpTopLevelFn(); 16.30 - 16.31 -void 16.32 -jmpToOneParamFn(); 16.33 - 16.34 -void 16.35 -jmpToTwoParamFn(); 16.36 - 16.37 -void * 16.38 -asmTerminateCoreCtlr(SlaveVP *currSlv); 16.39 - 16.40 -#define flushRegisters() \ 16.41 - asm volatile ("":::"%rbx", "%r12", "%r13","%r14","%r15") 16.42 - 16.43 -void 16.44 -VMS_int__save_return_into_ptd_to_loc_then_do_ret(void *ptdToLoc); 16.45 - 16.46 -void 16.47 -VMS_int__return_to_addr_in_ptd_to_loc(void *ptdToLoc); 16.48 - 16.49 -inline void 16.50 -VMS_int__point_slaveVP_to_OneParamFn( SlaveVP *slaveVP, void *fnPtr, 16.51 - void *param); 16.52 - 16.53 -inline void 16.54 -VMS_int__point_slaveVP_to_TwoParamFn( SlaveVP *slaveVP, void *fnPtr, 16.55 - void *param1, void *param2); 16.56 - 16.57 -#endif /* _VMS__HW_DEPENDENT_H */ 16.58 -
17.1 --- a/HW_Dependent_Primitives/VMS__primitives_asm.s Mon Sep 03 03:34:54 2012 -0700 17.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 17.3 @@ -1,189 +0,0 @@ 17.4 -.data 17.5 - 17.6 - 17.7 -.text 17.8 - 17.9 -//Save return label address for the coreCtlr to pointer 17.10 -//Arguments: Pointer to variable holding address 17.11 -.globl recordCoreCtlrReturnLabelAddr 17.12 -recordCoreCtlrReturnLabelAddr: 17.13 - movq $coreCtlrReturn, %rcx #load label address 17.14 - movq %rcx, (%rdi) #save address to pointer 17.15 - ret 17.16 - 17.17 - 17.18 -//Trick for 64 bit arch -- copies args from stack into regs, then does jmp to 17.19 -// the top-level function, which was pointed to by the stack-ptr 17.20 -.globl startUpTopLevelFn 17.21 -startUpTopLevelFn: 17.22 - movq %rdi , %rsi #get second argument from first argument of switchSlv 17.23 - movq 0x08(%rsp), %rdi #get first argument from stack 17.24 - movq (%rsp) , %rax #get top-level function's addr from stack 17.25 - jmp *%rax #jump to the top-level function 17.26 - 17.27 - 17.28 -//Args passed in regs in 64 bit arch. This copies args from stack into regs, 17.29 -// then does jmp to the function, whose addr is on stack. 
17.30 -//For 64bit, %rdi is first arg, %rsi is second arg to function 17.31 -//The top of stack is a valid return addr (old value of slaveVP's instrPtr), 17.32 -// and the fnPtr is just below the top of stack (will be overwritten when 17.33 -// fn saves the frame ptr) 17.34 -.globl jmpToOneParamFn 17.35 -jmpToOneParamFn: 17.36 - movq 0x08(%rsp), %rdi #get the argument from stack 17.37 - movq -0x08(%rsp), %rax #get function's addr from stack 17.38 - jmp *%rax #jump to the function 17.39 - 17.40 -.globl jmpToTwoParamFn 17.41 -jmpToTwoParamFn: 17.42 - movq 0x10(%rsp), %rsi #get the second argument from stack 17.43 - movq 0x08(%rsp), %rdi #get the first argument from stack 17.44 - movq -0x08(%rsp), %rax #get function's addr from stack 17.45 - jmp *%rax #jump to the function 17.46 - 17.47 - 17.48 -//Switches form CoreCtlr to either a normal Slv VP or the Master VP 17.49 -//switch to VP's stack and frame ptr then jump to VP's next-instr-ptr 17.50 -/* SlaveVP offsets: 17.51 - * 0x00 stackPtr 17.52 - * 0x08 framePtr 17.53 - * 0x10 resumeInstrPtr 17.54 - * 0x18 coreCtlrFramePtr 17.55 - * 0x20 coreCtlrStackPtr 17.56 - * 17.57 - * _VMSMasterEnv offsets: 17.58 - * 0x00 coreCtlrReturnPt 17.59 - * 0x100 masterLock 17.60 - */ 17.61 -.globl switchToSlv 17.62 -switchToSlv: 17.63 - #SlaveVP in %rdi 17.64 - movq %rsp , 0x20(%rdi) #save core ctlr stack pointer 17.65 - movq %rbp , 0x18(%rdi) #save core ctlr frame pointer 17.66 - movq 0x00(%rdi), %rsp #restore stack pointer 17.67 - movq 0x08(%rdi), %rbp #restore frame pointer 17.68 - movq 0x10(%rdi), %rax #get jmp pointer 17.69 - jmp *%rax #jmp to Slv 17.70 -coreCtlrReturn: 17.71 - ret 17.72 - 17.73 - 17.74 -//switches to core controller. 
saves return address 17.75 -/* SlaveVP offsets: 17.76 - * 0x00 stackPtr 17.77 - * 0x08 framePtr 17.78 - * 0x10 resumeInstrPtr 17.79 - * 0x18 coreCtlrFramePtr 17.80 - * 0x20 coreCtlrStackPtr 17.81 - * 17.82 - * _VMSMasterEnv offsets: 17.83 - * 0x00 coreCtlrReturnPt 17.84 - * 0x100 masterLock 17.85 - */ 17.86 -.globl switchToCoreCtlr 17.87 -switchToCoreCtlr: 17.88 - #SlaveVP in %rdi 17.89 - movq $SlvReturn, 0x10(%rdi) #store return address 17.90 - movq %rsp , 0x00(%rdi) #save stack pointer 17.91 - movq %rbp , 0x08(%rdi) #save frame pointer 17.92 - movq 0x20(%rdi), %rsp #restore stack pointer 17.93 - movq 0x18(%rdi), %rbp #restore frame pointer 17.94 - movq $_VMSMasterEnv, %rcx 17.95 - movq (%rcx), %rcx #_VMSMasterEnv is pointer to struct 17.96 - movq 0x00(%rcx), %rax #get CoreCtlrStartPt 17.97 - jmp *%rax #jmp to CoreCtlr 17.98 -SlvReturn: 17.99 - ret 17.100 - 17.101 - 17.102 - 17.103 -//switches to core controller from master. saves return address 17.104 -//Releases masterLock so the next AnimationMaster can be executed 17.105 -/* SlaveVP offsets: 17.106 - * 0x00 stackPtr 17.107 - * 0x08 framePtr 17.108 - * 0x10 resumeInstrPtr 17.109 - * 0x18 coreCtlrFramePtr 17.110 - * 0x20 coreCtlrStackPtr 17.111 - * 17.112 - * _VMSMasterEnv offsets: 17.113 - * 0x00 coreCtlrReturnPt 17.114 - * 0x100 masterLock 17.115 - */ 17.116 -.globl masterSwitchToCoreCtlr 17.117 -masterSwitchToCoreCtlr: 17.118 - #SlaveVP in %rdi 17.119 - movq $MasterReturn, 0x10(%rdi) #store return address 17.120 - movq %rsp , 0x00(%rdi) #save stack pointer 17.121 - movq %rbp , 0x08(%rdi) #save frame pointer 17.122 - movq 0x20(%rdi), %rsp #restore stack pointer 17.123 - movq 0x18(%rdi), %rbp #restore frame pointer 17.124 - movq $_VMSMasterEnv, %rcx 17.125 - movq (%rcx), %rcx #_VMSMasterEnv is pointer to struct 17.126 - movq 0x00(%rcx), %rax #get CoreCtlr return pt 17.127 - movl $0x0 , 0x100(%rcx) #release lock 17.128 - jmp *%rax #jmp to CoreCtlr 17.129 -MasterReturn: 17.130 - ret 17.131 - 17.132 - 17.133 
-/*Switch to terminateCoreCtlr 17.134 - *This is called by endOSThreadFn, which is the top-level function given 17.135 - * to a shutdown slave. When such a slave gets switched to, by the core 17.136 - * controller, it runs the top-level function, which calls this, which 17.137 - * then calls terminateCoreCtlr, which ends the pthread. Note, when get 17.138 - * here, stack is already set up for switchSlv and Slv ptr is in %rdi. 17.139 - *Do not save registers of Slv because this function will never return 17.140 - * 17.141 - * SlaveVP offsets: 17.142 - * 0x00 stackPtr 17.143 - * 0x08 framePtr 17.144 - * 0x10 resumeInstrPtr 17.145 - * 0x18 coreCtlrFramePtr 17.146 - * 0x20 coreCtlrStackPtr 17.147 - * 17.148 - * _VMSMasterEnv offsets: 17.149 - * 0x00 coreCtlrReturnPt 17.150 - * 0x100 masterLock 17.151 - */ 17.152 -.globl asmTerminateCoreCtlr 17.153 -asmTerminateCoreCtlr: #SlaveVP ptr is in %rdi 17.154 - movq 0x20(%rdi), %rsp #restore stack pointer 17.155 - movq 0x18(%rdi), %rbp #restore frame pointer 17.156 - movq $terminateCoreCtlr, %rax 17.157 - jmp *%rax #jmp to fn that ends the pthread 17.158 - 17.159 - 17.160 -/* 17.161 - * This one for the sequential version is special. It discards the current stack 17.162 - * and returns directly from the coreCtlr after VMS_WL__dissipate_slaveVP was called 17.163 - */ 17.164 -.globl asmTerminateCoreCtlrSeq 17.165 -asmTerminateCoreCtlrSeq: 17.166 - #SlaveVP in %rdi 17.167 - movq 0x20(%rdi), %rsp #restore stack pointer 17.168 - movq 0x18(%rdi), %rbp #restore frame pointer 17.169 - #argument is in %rdi 17.170 - call VMS_int__dissipate_slaveVP 17.171 - movq %rbp , %rsp #goto the coreCtlrs stack 17.172 - pop %rbp #restore the old framepointer 17.173 - ret #return from core controller 17.174 - 17.175 - 17.176 -//Takes the return addr off the stack and saves into the loc pointed to by 17.177 -// by the parameter passed in via rdi. 
Return addr is at 0x8(%rbp) for 64bit 17.178 -.globl VMS_int__save_return_into_ptd_to_loc_then_do_ret 17.179 -VMS_int__save_return_into_ptd_to_loc_then_do_ret: 17.180 - movq 0x08(%rbp), %rax #get ret address, rbp is the same as in the calling function 17.181 - movq %rax, (%rdi) #write ret addr into addr passed as param field 17.182 - ret 17.183 - 17.184 - 17.185 -//Assembly code changes the return addr on the stack to the one 17.186 -// pointed to by the parameter, then returns. Stack's return addr is at 0x8(%rbp) 17.187 -.globl VMS_int__return_to_addr_in_ptd_to_loc 17.188 -VMS_int__return_to_addr_in_ptd_to_loc: 17.189 - movq (%rdi), %rax #get return addr from addr passed as param 17.190 - movq %rax, 0x08(%rbp) #write return addr to the stack of the caller 17.191 - ret 17.192 -
18.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 18.2 +++ b/PR.h Wed Sep 19 23:12:44 2012 -0700 18.3 @@ -0,0 +1,442 @@ 18.4 +/* 18.5 + * Copyright 2009 OpenSourceStewardshipFoundation.org 18.6 + * Licensed under GNU General Public License version 2 18.7 + * 18.8 + * Author: seanhalle@yahoo.com 18.9 + * 18.10 + */ 18.11 + 18.12 +#ifndef _PR_H 18.13 +#define _PR_H 18.14 +#define _GNU_SOURCE 18.15 + 18.16 +#include "DynArray/DynArray.h" 18.17 +#include "Hash_impl/PrivateHash.h" 18.18 +#include "Histogram/Histogram.h" 18.19 +#include "Queue_impl/PrivateQueue.h" 18.20 + 18.21 +#include "PR_primitive_data_types.h" 18.22 +#include "Services_Offered_by_PR/Memory_Handling/vmalloc.h" 18.23 + 18.24 +#include <pthread.h> 18.25 +#include <sys/time.h> 18.26 + 18.27 +//================= Defines: included from separate files ================= 18.28 +// 18.29 +// Note: ALL defines are in other files, none are in here 18.30 +// 18.31 +#include "Defines/PR_defs.h" 18.32 + 18.33 + 18.34 +//================================ Typedefs ================================= 18.35 +// 18.36 +typedef unsigned long long TSCount; 18.37 + 18.38 +typedef struct _AnimSlot AnimSlot; 18.39 +typedef struct _PRReqst PRReqst; 18.40 +typedef struct _SlaveVP SlaveVP; 18.41 +typedef struct _MasterVP MasterVP; 18.42 +typedef struct _IntervalProbe IntervalProbe; 18.43 + 18.44 + 18.45 +typedef SlaveVP *(*SlaveAssigner) ( void *, AnimSlot*); //semEnv, slot for HW info 18.46 +typedef void (*RequestHandler) ( SlaveVP *, void * ); //prWReqst, semEnv 18.47 +typedef void (*TopLevelFnPtr) ( void *, SlaveVP * ); //initData, animSlv 18.48 +typedef void TopLevelFn ( void *, SlaveVP * ); //initData, animSlv 18.49 +typedef void (*ResumeSlvFnPtr) ( SlaveVP *, void * ); 18.50 + //=========== MEASUREMENT STUFF ========== 18.51 + MEAS__Insert_Counter_Handler 18.52 + //======================================== 18.53 + 18.54 +//============================ HW Dependent Fns ================================ 18.55 + 18.56 +#include 
"HW_Dependent_Primitives/PR__HW_measurement.h" 18.57 +#include "HW_Dependent_Primitives/PR__primitives.h" 18.58 + 18.59 + 18.60 +//============= Request Related =========== 18.61 +// 18.62 + 18.63 +enum PRReqstType //avoid starting enums at 0, for debug reasons 18.64 + { 18.65 + semantic = 1, 18.66 + createReq, 18.67 + dissipate, 18.68 + PRSemantic //goes with PRSemReqst below 18.69 + }; 18.70 + 18.71 +struct _PRReqst 18.72 + { 18.73 + enum PRReqstType reqType;//used for dissipate and in future for IO requests 18.74 + void *semReqData; 18.75 + 18.76 + PRReqst *nextReqst; 18.77 + }; 18.78 +//PRReqst 18.79 + 18.80 +enum PRSemReqstType //These are equivalent to semantic requests, but for 18.81 + { // PR's services available directly to app, like OS 18.82 + make_probe = 1, // and probe services -- like a PR-wide built-in lang 18.83 + throw_excp, 18.84 + openFile, 18.85 + otherIO 18.86 + }; 18.87 + 18.88 +typedef struct 18.89 + { enum PRSemReqstType reqType; 18.90 + SlaveVP *requestingSlv; 18.91 + char *nameStr; //for create probe 18.92 + char *msgStr; //for exception 18.93 + void *exceptionData; 18.94 + } 18.95 + PRSemReq; 18.96 + 18.97 + 18.98 +//==================== Core data structures =================== 18.99 + 18.100 +typedef struct 18.101 + { 18.102 + //for future expansion 18.103 + } 18.104 +SlotPerfInfo; 18.105 + 18.106 +struct _AnimSlot 18.107 + { 18.108 + int workIsDone; 18.109 + int needsSlaveAssigned; 18.110 + SlaveVP *slaveAssignedToSlot; 18.111 + 18.112 + int slotIdx; //needed by Holistic Model's data gathering 18.113 + int coreSlotIsOn; 18.114 + SlotPerfInfo *perfInfo; //used by assigner to pick best slave for core 18.115 + }; 18.116 +//AnimSlot 18.117 + 18.118 +enum VPtype 18.119 + { TaskSlotSlv = 1,//Slave tied to an anim slot, only animates tasks 18.120 + TaskExtraSlv, //When a suspended task ends, the slave becomes this 18.121 + PersistentSlv, //the VP is explicitly seen in the app code, or task suspends 18.122 + Slave, //to be removed 18.123 + 
Master, 18.124 + Shutdown, 18.125 + Idle 18.126 + }; 18.127 + 18.128 +/*This structure embodies the state of a slaveVP. It is reused for masterVP 18.129 + * and shutdownVPs. 18.130 + */ 18.131 +struct _SlaveVP 18.132 + { //The offsets of these fields are hard-coded into assembly 18.133 + void *stackPtr; //save the core's stack ptr when suspend 18.134 + void *framePtr; //save core's frame ptr when suspend 18.135 + void *resumeInstrPtr; //save core's program-counter when suspend 18.136 + void *coreCtlrFramePtr; //restore before jmp back to core controller 18.137 + void *coreCtlrStackPtr; //restore before jmp back to core controller 18.138 + 18.139 + //============ below this, no fields are used in asm ============= 18.140 + 18.141 + int slaveID; //each slave given a globally unique ID 18.142 + int coreAnimatedBy; 18.143 + void *startOfStack; //used to free, and to point slave to Fn 18.144 + enum VPtype typeOfVP; //Slave vs Master vs Shutdown.. 18.145 + int assignCount; //Each assign is for one work-unit, so IDs it 18.146 + //note, a scheduling decision is uniquely identified by the triple: 18.147 + // <slaveID, coreAnimatedBy, assignCount> -- used in record & replay 18.148 + 18.149 + //for comm -- between master and coreCtlr & btwn wrapper lib and plugin 18.150 + AnimSlot *animSlotAssignedTo; 18.151 + PRReqst *request; //wrapper lib puts in requests, plugin takes out 18.152 + void *dataRetFromReq;//Return vals from plugin to Wrapper Lib 18.153 + 18.154 + //For using Slave as carrier for data 18.155 + void *semanticData; //Lang saves lang-specific things in slave here 18.156 + 18.157 + //=========== MEASUREMENT STUFF ========== 18.158 + MEAS__Insert_Meas_Fields_into_Slave; 18.159 + float64 createPtInSecs; //time VP created, in seconds 18.160 + //======================================== 18.161 + }; 18.162 +//SlaveVP 18.163 + 18.164 + 18.165 +/* The one and only global variable, holds many odds and ends 18.166 + */ 18.167 +typedef struct 18.168 + { //The offsets of 
these fields are hard-coded into assembly 18.169 + void *coreCtlrReturnPt; //offset to this field used in asm 18.170 + int8 falseSharePad1[256 - sizeof(void*)]; 18.171 + int32 masterLock; //offset to this field used in asm 18.172 + int8 falseSharePad2[256 - sizeof(int32)]; 18.173 + //============ below this, no fields are used in asm ============= 18.174 + 18.175 + //Basic PR infrastructure 18.176 + SlaveVP **masterVPs; 18.177 + AnimSlot ***allAnimSlots; 18.178 + 18.179 + //plugin related 18.180 + PRSemEnv **langlets; 18.181 + 18.182 + //Slave creation -- global count of slaves existing, across langs and processes 18.183 + int32 numSlavesCreated; //used to give unique ID to processor 18.184 +//no reasonable way to do fail-safe when have mult langlets and processes.. have to detect for each langlet separately 18.185 +// int32 numSlavesAlive; //used to detect fail-safe shutdown 18.186 + 18.187 + //Initialization related 18.188 + int32 setupComplete; //use while starting up coreCtlr 18.189 + 18.190 + //Memory management related 18.191 + MallocArrays *freeLists; 18.192 + int32 amtOfOutstandingMem;//total currently allocated 18.193 + 18.194 + //Random number seeds -- random nums used in various places 18.195 + uint32_t seed1; 18.196 + uint32_t seed2; 18.197 + 18.198 + //=========== MEASUREMENT STUFF ============= 18.199 + IntervalProbe **intervalProbes; 18.200 + PtrToPrivDynArray *dynIntervalProbesInfo; 18.201 + HashTable *probeNameHashTbl; 18.202 + int32 masterCreateProbeID; 18.203 + float64 createPtInSecs; //real-clock time PR initialized 18.204 + Histogram **measHists; 18.205 + PtrToPrivDynArray *measHistsInfo; 18.206 + MEAS__Insert_Susp_Meas_Fields_into_MasterEnv; 18.207 + MEAS__Insert_Master_Meas_Fields_into_MasterEnv; 18.208 + MEAS__Insert_Master_Lock_Meas_Fields_into_MasterEnv; 18.209 + MEAS__Insert_Malloc_Meas_Fields_into_MasterEnv; 18.210 + MEAS__Insert_Plugin_Meas_Fields_into_MasterEnv; 18.211 + MEAS__Insert_System_Meas_Fields_into_MasterEnv; 18.212 + 
MEAS__Insert_Counter_Meas_Fields_into_MasterEnv; 18.213 + //========================================== 18.214 + } 18.215 +MasterEnv; 18.216 + 18.217 +//===================== 18.218 +typedef struct 18.219 + { int32 langletID; //acts as index into array of langlets in master env 18.220 + void *langletSemEnv; 18.221 + int32 langMagicNumber; 18.222 + SlaveAssigner slaveAssigner; 18.223 + RequestHandler requestHandler; 18.224 + EndTaskHandler endTaskHandler; 18.225 + 18.226 + //Track slaves created, separately for each langlet (in each process) 18.227 + int32 numSlavesCreated; //gives ordering to processor creation 18.228 + int32 numSlavesAlive; //used to detect fail-safe shutdown 18.229 + 18.230 + //when multi-lang, master polls sem env's to find one with work in it.. 18.231 + // in single-lang case, flag ignored, master always asks lang for work 18.232 + int32 hasWork; 18.233 + } 18.234 +PRSemEnv; 18.235 + 18.236 +//===================== Top Processor level Data Strucs ====================== 18.237 +typedef struct 18.238 + { 18.239 + 18.240 + } 18.241 +PRProcess; 18.242 +/*This structure holds all the information PR needs to manage a program. PR 18.243 + * stores information about what percent of CPU time the program is getting, 18.244 + * 18.245 + */ 18.246 +typedef struct 18.247 + { //void *semEnv; 18.248 + //RequestHdlrFnPtr requestHandler; 18.249 + //SlaveAssignerFnPtr slaveAssigner; 18.250 + int32 numSlavesLive; 18.251 + void *resultToReturn; 18.252 + 18.253 + SlaveVP *seedSlv; 18.254 + 18.255 + //These are used to coordinate within the main function..? 18.256 + bool32 executionIsComplete; 18.257 + pthread_mutex_t doneLock; //? not sure need these..? 
18.258 + pthread_cond_t doneCond; 18.259 + } 18.260 +PRProcess; 18.261 + 18.262 + 18.263 +//========================= Extra Stuff Data Strucs ======================= 18.264 +typedef struct 18.265 + { 18.266 + 18.267 + } 18.268 +PRExcp; //exception 18.269 + 18.270 +//======================= OS Thread related =============================== 18.271 + 18.272 +void * coreController( void *paramsIn ); //standard PThreads fn prototype 18.273 +void * coreCtlr_Seq( void *paramsIn ); //standard PThreads fn prototype 18.274 +void animationMaster( void *initData, SlaveVP *masterVP ); 18.275 + 18.276 + 18.277 +typedef struct 18.278 + { 18.279 + void *endThdPt; 18.280 + unsigned int coreNum; 18.281 + } 18.282 +ThdParams; 18.283 + 18.284 +//============================= Global Vars ================================ 18.285 + 18.286 +volatile MasterEnv *_PRMasterEnv __align_to_cacheline__; 18.287 + 18.288 + //these are global, but only used for startup and shutdown 18.289 +pthread_t coreCtlrThdHandles[ NUM_CORES ]; //pthread's virt-procr state 18.290 +ThdParams *coreCtlrThdParams [ NUM_CORES ]; 18.291 + 18.292 +pthread_mutex_t suspendLock; 18.293 +pthread_cond_t suspendCond; 18.294 + 18.295 +//========================= Function Prototypes =========================== 18.296 +/* MEANING OF WL PI SS int PROS 18.297 + * These indicate which places the function is safe to use. They stand for: 18.298 + * 18.299 + * WL Wrapper Library -- wrapper lib code should only use these 18.300 + * PI Plugin -- plugin code should only use these 18.301 + * SS Startup and Shutdown -- designates these relate to startup & shutdown 18.302 + * int internal to PR -- should not be used in wrapper lib or plugin 18.303 + * PROS means "OS functions for applications to use" 18.304 + * 18.305 + * PR_int__ functions touch internal PR data structs and are only safe 18.306 + * to be used inside the master lock. However, occasionally, they appear 18.307 + * in wrapper-lib or plugin code. 
In those cases, very careful analysis 18.308 + * has been done to be sure no concurrency issues could arise. 18.309 + * 18.310 + * PR_WL__ functions are all safe for use outside the master lock. 18.311 + * 18.312 + * PROS are only safe for applications to use -- they're like a second 18.313 + * language mixed in -- but they can't be used inside plugin code, and 18.314 + * aren't meant for use in wrapper libraries, because they are themselves 18.315 + * wrapper-library calls! 18.316 + */ 18.317 +//========== Startup and shutdown ========== 18.318 +void 18.319 +PR__start(); 18.320 + 18.321 +void 18.322 +PR_SS__start_the_work_then_wait_until_done(); 18.323 + 18.324 +SlaveVP* 18.325 +PR_SS__create_shutdown_slave(); 18.326 + 18.327 +void 18.328 +PR_SS__shutdown(); 18.329 + 18.330 +void 18.331 +PR_SS__cleanup_at_end_of_shutdown(); 18.332 + 18.333 +void 18.334 +PR_SS__register_langlets_semEnv( PRSemEnv *semEnv, int32 VSs_MAGIC_NUMBER, 18.335 + SlaveVP *seedVP ); 18.336 + 18.337 + 18.338 +//============== =============== 18.339 + 18.340 +inline SlaveVP * 18.341 +PR_int__create_slaveVP( TopLevelFnPtr fnPtr, void *dataParam ); 18.342 +#define PR_PI__create_slaveVP PR_int__create_slaveVP 18.343 +#define PR_WL__create_slaveVP PR_int__create_slaveVP 18.344 + 18.345 + //Use this to create processor inside entry point & other places outside 18.346 + // the PR system boundary (IE, don't animate with a SlaveVP or MasterVP) 18.347 +SlaveVP * 18.348 +PR_ext__create_slaveVP( TopLevelFnPtr fnPtr, void *dataParam ); 18.349 + 18.350 +inline SlaveVP * 18.351 +PR_int__create_slaveVP_helper( SlaveVP *newSlv, TopLevelFnPtr fnPtr, 18.352 + void *dataParam, void *stackLocs ); 18.353 + 18.354 +inline void 18.355 +PR_int__reset_slaveVP_to_TopLvlFn( SlaveVP *slaveVP, TopLevelFnPtr fnPtr, 18.356 + void *dataParam); 18.357 + 18.358 +inline void 18.359 +PR_int__point_slaveVP_to_OneParamFn( SlaveVP *slaveVP, void *fnPtr, 18.360 + void *param); 18.361 + 18.362 +inline void 18.363 
+PR_int__point_slaveVP_to_TwoParamFn( SlaveVP *slaveVP, void *fnPtr, 18.364 + void *param1, void *param2); 18.365 + 18.366 +void 18.367 +PR_int__dissipate_slaveVP( SlaveVP *slaveToDissipate ); 18.368 +#define PR_PI__dissipate_slaveVP PR_int__dissipate_slaveVP 18.369 +//WL: dissipate a SlaveVP by sending a request 18.370 + 18.371 +void 18.372 +PR_ext__dissipate_slaveVP( SlaveVP *slaveToDissipate ); 18.373 + 18.374 +void 18.375 +PR_int__throw_exception( char *msgStr, SlaveVP *reqstSlv, PRExcp *excpData ); 18.376 +#define PR_PI__throw_exception PR_int__throw_exception 18.377 +void 18.378 +PR_WL__throw_exception( char *msgStr, SlaveVP *reqstSlv, PRExcp *excpData ); 18.379 +#define PR_App__throw_exception PR_WL__throw_exception 18.380 + 18.381 +void * 18.382 +PR_int__give_sem_env_for( SlaveVP *animSlv ); 18.383 +#define PR_PI__give_sem_env_for PR_int__give_sem_env_for 18.384 +#define PR_SS__give_sem_env_for PR_int__give_sem_env_for 18.385 +//No WL version -- not safe! if use in WL, be sure data rd & wr is stable 18.386 + 18.387 + 18.388 +inline void 18.389 +PR_int__get_master_lock(); 18.390 + 18.391 +#define PR_int__release_master_lock() _PRMasterEnv->masterLock = UNLOCKED 18.392 + 18.393 +inline uint32_t 18.394 +PR_int__randomNumber(); 18.395 + 18.396 +//============== Request Related =============== 18.397 + 18.398 +void 18.399 +PR_int__suspend_slaveVP_and_send_req( SlaveVP *callingSlv ); 18.400 + 18.401 +inline void 18.402 +PR_WL__add_sem_request_in_mallocd_PRReqst( void *semReqData, SlaveVP *callingSlv ); 18.403 + 18.404 +inline void 18.405 +PR_WL__send_sem_request( void *semReqData, SlaveVP *callingSlv ); 18.406 + 18.407 +void 18.408 +PR_WL__send_create_slaveVP_req( void *semReqData, SlaveVP *reqstingSlv ); 18.409 + 18.410 +void inline 18.411 +PR_WL__send_dissipate_req( SlaveVP *prToDissipate ); 18.412 + 18.413 +inline void 18.414 +PR_WL__send_PRSem_request( void *semReqData, SlaveVP *callingSlv ); 18.415 + 18.416 +PRReqst * 18.417 +PR_PI__take_next_request_out_of( 
SlaveVP *slaveWithReq ); 18.418 +//#define PR_PI__take_next_request_out_of( slave ) slave->requests 18.419 + 18.420 +//inline void * 18.421 +//PR_PI__take_sem_reqst_from( PRReqst *req ); 18.422 +#define PR_PI__take_sem_reqst_from( req ) req->semReqData 18.423 + 18.424 +void inline 18.425 +PR_PI__handle_PRSemReq( PRReqst *req, SlaveVP *requestingSlv, void *semEnv, 18.426 + ResumeSlvFnPtr resumeSlvFnPtr ); 18.427 + 18.428 +//======================== MEASUREMENT ====================== 18.429 +uint64 18.430 +PR_WL__give_num_plugin_cycles(); 18.431 +uint32 18.432 +PR_WL__give_num_plugin_animations(); 18.433 + 18.434 + 18.435 +//========================= Utilities ======================= 18.436 +inline char * 18.437 +PR_int__strDup( char *str ); 18.438 + 18.439 + 18.440 +//========================= Probes ======================= 18.441 +#include "Services_Offered_by_PR/Measurement_and_Stats/probes.h" 18.442 + 18.443 +//================================================ 18.444 +#endif /* _PR_H */ 18.445 +
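Note on the MasterEnv layout in PR.h above: the header comments say the offsets of coreCtlrReturnPt and masterLock are hard-coded into assembly, and each of the two fields is followed by a padding array sized to 256 bytes minus the field size. The sketch below is not part of the changeset. It uses a hypothetical, cut-down struct (DemoMasterEnv) and hypothetical DEMO_OFFSET_* constants to show one way such an assumption could be checked at compile time, assuming a C11 compiler and a pointer size of 4 or 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAD_BYTES 256   /* spacing used by the falseSharePad fields in PR.h */

/* Hypothetical stand-in for MasterEnv: only the two fields the assembly
 * touches, each followed by its padding array, plus one ordinary field. */
typedef struct
 { void    *coreCtlrReturnPt;
   int8_t   falseSharePad1[ PAD_BYTES - sizeof(void *) ];
   int32_t  masterLock;
   int8_t   falseSharePad2[ PAD_BYTES - sizeof(int32_t) ];
   int32_t  setupComplete;      /* first field not referenced from asm */
 }
DemoMasterEnv;

/* Offsets that hand-written assembly would have to agree with
 * (illustrative values, not taken from PR__primitives_asm.s). */
#define DEMO_OFFSET_MASTER_LOCK  PAD_BYTES
#define DEMO_OFFSET_AFTER_PADS   (2 * PAD_BYTES)

/* Fail the build, rather than misbehave at run time, if the layout drifts. */
_Static_assert( offsetof( DemoMasterEnv, masterLock ) == DEMO_OFFSET_MASTER_LOCK,
                "masterLock offset must match the value assumed by the asm" );
_Static_assert( offsetof( DemoMasterEnv, setupComplete ) == DEMO_OFFSET_AFTER_PADS,
                "fields after the padding must start on a fresh 256-byte block" );

/* Run-time accessors for the same offsets, handy in a unit test. */
size_t demo_offset_of_master_lock( void )
 { return offsetof( DemoMasterEnv, masterLock ); }

size_t demo_offset_after_pads( void )
 { return offsetof( DemoMasterEnv, setupComplete ); }
```

Because the padding arrays have byte alignment, the lock lands exactly 256 bytes after the return-point field, so a core spinning on the lock does not share a cache line with the field the core controllers read.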
19.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 19.2 +++ b/PR__PI.c Wed Sep 19 23:12:44 2012 -0700 19.3 @@ -0,0 +1,121 @@ 19.4 +/* 19.5 + * Copyright 2010 OpenSourceStewardshipFoundation 19.6 + * 19.7 + * Licensed under BSD 19.8 + */ 19.9 + 19.10 +#include <stdio.h> 19.11 +#include <stdlib.h> 19.12 +#include <string.h> 19.13 +#include <malloc.h> 19.14 +#include <inttypes.h> 19.15 +#include <sys/time.h> 19.16 + 19.17 +#include "PR.h" 19.18 + 19.19 + 19.20 +/* MEANING OF WL PI SS int 19.21 + * These indicate which places the function is safe to use. They stand for: 19.22 + * WL: Wrapper Library 19.23 + * PI: Plugin 19.24 + * SS: Startup and Shutdown 19.25 + * int: internal to the PR implementation 19.26 + */ 19.27 + 19.28 +//========================= Local Declarations ======================== 19.29 +void inline 19.30 +handleMakeProbe( PRSemReq *semReq, void *semEnv, ResumeSlvFnPtr resumeFn ); 19.31 + 19.32 +void inline 19.33 +handleThrowException( PRSemReq *semReq, void *semEnv, ResumeSlvFnPtr resumeFn ); 19.34 +//======================================================================= 19.35 + 19.36 + 19.37 +PRReqst * 19.38 +PR_PI__take_next_request_out_of( SlaveVP *slaveWithReq ) 19.39 + { PRReqst *req; 19.40 + 19.41 + req = slaveWithReq->request; 19.42 + if( req == NULL ) return NULL; 19.43 + 19.44 + slaveWithReq->request = slaveWithReq->request->nextReqst; 19.45 + return req; 19.46 + } 19.47 + 19.48 + 19.49 + 19.50 +/*May 2012 19.51 + *CHANGED IMPL -- now a macro in header file 19.52 + * 19.53 + *Turn function into macro that just accesses the request field 19.54 + * 19.55 +inline void * 19.56 +PR_PI__take_sem_reqst_from( PRReqst *req ) 19.57 + { 19.58 + return req->semReqData; 19.59 + } 19.60 +*/ 19.61 + 19.62 + 19.63 +/* This is for OS requests and PR infrastructure requests, such as to create 19.64 + * a probe -- a probe is inside the heart of PR-core, it's not part of any 19.65 + * language -- but it's also a semantic thing that's triggered from and used 
19.66 + * in the application.. so it crosses abstractions.. so, need some special 19.67 + * pattern here for handling such requests. 19.68 + * Doing this just as if it were a second language sharing PR-core. 19.69 + * 19.70 + * This is called from the language's request handler when it sees a request 19.71 + * of type PRSemReq 19.72 + * 19.73 + * TODO: Later change this, to give probes their own separate plugin & have 19.74 + * PR-core steer the request to appropriate plugin 19.75 + * Do the same for OS calls -- look later at it.. 19.76 + */ 19.77 +void inline 19.78 +PR_PI__handle_PRSemReq( PRReqst *req, SlaveVP *requestingSlv, void *semEnv, 19.79 + ResumeSlvFnPtr resumeFn ) 19.80 + { PRSemReq *semReq; 19.81 + 19.82 + semReq = PR_PI__take_sem_reqst_from(req); 19.83 + if( semReq == NULL ) return; 19.84 + switch( semReq->reqType ) //sem handlers are all in other file 19.85 + { 19.86 + case make_probe: handleMakeProbe( semReq, semEnv, resumeFn); 19.87 + break; 19.88 + case throw_excp: handleThrowException( semReq, semEnv, resumeFn); 19.89 + break; 19.90 + } 19.91 + } 19.92 + 19.93 +/* 19.94 + */ 19.95 +void inline 19.96 +handleMakeProbe( PRSemReq *semReq, void *semEnv, ResumeSlvFnPtr resumeFn ) 19.97 + { IntervalProbe *newProbe; 19.98 + 19.99 + newProbe = PR_int__malloc( sizeof(IntervalProbe) ); 19.100 + newProbe->nameStr = PR_int__strDup( semReq->nameStr ); 19.101 + newProbe->hist = NULL; 19.102 + newProbe->schedChoiceWasRecorded = FALSE; 19.103 + 19.104 + //This runs in masterVP, so no race-condition worries 19.105 + newProbe->probeID = 19.106 + addToDynArray( newProbe, _PRMasterEnv->dynIntervalProbesInfo ); 19.107 + 19.108 + semReq->requestingSlv->dataRetFromReq = newProbe; 19.109 + 19.110 + //This is inside PR, while resume_slaveVP fn is inside language, so pass 19.111 + // pointer from lang to here, then call it. 
19.112 + (*resumeFn)( semReq->requestingSlv, semEnv ); 19.113 + } 19.114 + 19.115 +void inline 19.116 +handleThrowException( PRSemReq *semReq, void *semEnv, ResumeSlvFnPtr resumeFn ) 19.117 + { 19.118 + PR_int__throw_exception( semReq->msgStr, semReq->requestingSlv, semReq->exceptionData ); 19.119 + 19.120 + (*resumeFn)( semReq->requestingSlv, semEnv ); 19.121 + } 19.122 + 19.123 + 19.124 +
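Note on PR__PI.c above: the plugin side pops requests off the slave one at a time, switches on the request type, and resumes the slave through a function pointer supplied by the language. The sketch below is not part of the changeset. It is a minimal stand-alone model of that consumer pattern, with hypothetical Demo* types and demo_* functions standing in for PRReqst, SlaveVP, ResumeSlvFnPtr and the handlers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request kinds; like the enums in PR.h, they start at 1. */
typedef enum { demoSemantic = 1, demoCreate, demoDissipate } DemoReqType;

typedef struct DemoReq
 { DemoReqType      reqType;
   void            *semReqData;
   struct DemoReq  *nextReqst;    /* singly linked chain of pending requests */
 }
DemoReq;

typedef struct
 { DemoReq *request;              /* head of the chain, NULL when empty */
   int      resumeCount;          /* how many times the slave was resumed */
 }
DemoSlave;

typedef void (*DemoResumeFn)( DemoSlave *slv, void *semEnv );

/* Same shape as PR_PI__take_next_request_out_of: unlink and return the head. */
DemoReq *
demo_take_next_request( DemoSlave *slv )
 { DemoReq *req = slv->request;
   if( req == NULL ) return NULL;
   slv->request = req->nextReqst;
   return req;
 }

/* Stand-in for the language's resume function. */
void
demo_resume( DemoSlave *slv, void *semEnv )
 { (void)semEnv;
   slv->resumeCount += 1;
 }

/* Drain every pending request; only semantic requests resume the slave
 * in this sketch. Returns the number of requests taken off the chain. */
int
demo_drain_requests( DemoSlave *slv, void *semEnv, DemoResumeFn resumeFn )
 { int      numHandled = 0;
   DemoReq *req;

   while( (req = demo_take_next_request( slv )) != NULL )
    { switch( req->reqType )
       { case demoSemantic:  (*resumeFn)( slv, semEnv );
                             break;
         case demoCreate:    /* a real handler would create a slave here */
                             break;
         case demoDissipate: /* a real handler would free the slave here */
                             break;
       }
      numHandled += 1;
    }
   return numHandled;
 }
```

Passing the resume function as a pointer keeps the core free of any link-time dependency on a particular language, which matches the comment in handleMakeProbe about passing the pointer from the language into PR.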
20.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 20.2 +++ b/PR__WL.c Wed Sep 19 23:12:44 2012 -0700 20.3 @@ -0,0 +1,160 @@ 20.4 +/* 20.5 + * Copyright 2010 OpenSourceStewardshipFoundation 20.6 + * 20.7 + * Licensed under BSD 20.8 + */ 20.9 + 20.10 +#include <stdio.h> 20.11 +#include <stdlib.h> 20.12 +#include <string.h> 20.13 +#include <malloc.h> 20.14 +#include <inttypes.h> 20.15 +#include <sys/time.h> 20.16 + 20.17 +#include "PR.h" 20.18 + 20.19 + 20.20 +/* MEANING OF WL PI SS int 20.21 + * These indicate which places the function is safe to use. They stand for: 20.22 + * WL: Wrapper Library 20.23 + * PI: Plugin 20.24 + * SS: Startup and Shutdown 20.25 + * int: internal to the PR implementation 20.26 + */ 20.27 + 20.28 + 20.29 + 20.30 +/*For this implementation of PR, it may not make much sense to have the 20.31 + * system of requests for creating a new processor done this way.. but over 20.32 + * the scope of single-master, multi-master, multi-tasking, OS-implementing, 20.33 + * distributed-memory, and so on, this gives PR implementation a chance to 20.34 + * do stuff before suspend, in the SlaveVP, and in the Master before the plugin 20.35 + * is called, as well as in the lang-lib before this is called, and in the 20.36 + * plugin. So, this gives both PR and language implementations a chance to 20.37 + * intercept at various points and do order-dependent stuff. 20.38 + *Having a standard PRNewPrReqData struc allows the language to create and 20.39 + * free the struc, while PR knows how to get the newSlv if it wants it, and 20.40 + * it lets the lang have lang-specific data related to creation transported 20.41 + * to the plugin. 
20.42 + */ 20.43 +void 20.44 +PR_WL__send_create_slaveVP_req( void *semReqData, SlaveVP *reqstingSlv ) 20.45 + { PRReqst req; 20.46 + 20.47 + req.reqType = createReq; 20.48 + req.semReqData = semReqData; 20.49 + req.nextReqst = reqstingSlv->request; 20.50 + reqstingSlv->request = &req; 20.51 + 20.52 + PR_int__suspend_slaveVP_and_send_req( reqstingSlv ); 20.53 + } 20.54 + 20.55 + 20.56 +/* 20.57 + *This adds a request to dissipate, then suspends the processor so that the 20.58 + * request handler will receive the request. The request handler is what 20.59 + * does the work of freeing memory and removing the processor from the 20.60 + * semantic environment's data structures. 20.61 + *The request handler also is what figures out when to shutdown the PR 20.62 + * system -- which causes all the core controller threads to die, and returns from 20.63 + * the call that started up PR to perform the work. 20.64 + * 20.65 + *This form is a bit misleading to understand if one is trying to figure out 20.66 + * how PR works -- it looks like a normal function call, but inside it 20.67 + * sends a request to the request handler and suspends the processor, which 20.68 + * jumps out of the PR_WL__dissipate_slaveVP function, and out of all nestings 20.69 + * above it, transferring the work of dissipating to the request handler, 20.70 + * which then does the actual work -- causing the processor that animated 20.71 + * the call of this function to disappear and the "hanging" state of this 20.72 + * function to just poof into thin air -- the virtual processor's trace 20.73 + * never returns from this call, but instead the virtual processor's trace 20.74 + * gets suspended in this call and all the virt processor's state disap- 20.75 + * pears -- making that suspend the last thing in the Slv's trace. 
20.76 + */ 20.77 +void 20.78 +PR_WL__send_dissipate_req( SlaveVP *slaveToDissipate ) 20.79 + { PRReqst req; 20.80 + 20.81 + req.reqType = dissipate; 20.82 + req.nextReqst = slaveToDissipate->request; 20.83 + slaveToDissipate->request = &req; 20.84 + 20.85 + PR_int__suspend_slaveVP_and_send_req( slaveToDissipate ); 20.86 + } 20.87 + 20.88 + 20.89 + 20.90 +/*This call's name indicates that request is malloc'd -- so req handler 20.91 + * has to free any extra requests tacked on before a send, using this. 20.92 + * 20.93 + * This inserts the semantic-layer's request data into standard PR carrier 20.94 + * request data-struct that is mallocd. The sem request doesn't need to 20.95 + * be malloc'd if this is called inside the same call chain before the 20.96 + * send of the last request is called. 20.97 + * 20.98 + *The request handler has to call PR_int__free_PRReq for any of these 20.99 + */ 20.100 +inline void 20.101 +PR_WL__add_sem_request_in_mallocd_PRReqst( void *semReqData, 20.102 + SlaveVP *callingSlv ) 20.103 + { PRReqst *req; 20.104 + 20.105 + req = PR_int__malloc( sizeof(PRReqst) ); 20.106 + req->reqType = semantic; 20.107 + req->semReqData = semReqData; 20.108 + req->nextReqst = callingSlv->request; 20.109 + callingSlv->request = req; 20.110 + } 20.111 + 20.112 +/*This inserts the semantic-layer's request data into standard PR carrier 20.113 + * request data-struct is allocated on stack of this call & ptr to it sent 20.114 + * to plugin 20.115 + *Then it does suspend, to cause request to be sent. 20.116 + */ 20.117 +inline void 20.118 +PR_WL__send_sem_request( void *semReqData, SlaveVP *callingSlv ) 20.119 + { PRReqst req; 20.120 + 20.121 + req.reqType = semantic; 20.122 + req.semReqData = semReqData; 20.123 + req.nextReqst = callingSlv->request; 20.124 + callingSlv->request = &req; 20.125 + 20.126 + PR_int__suspend_slaveVP_and_send_req( callingSlv ); 20.127 + } 20.128 + 20.129 + 20.130 +/*May 2012 Not sure what this is.. 
looks like old idea for PR semantic 20.131 + * request 20.132 + */ 20.133 +inline void 20.134 +PR_WL__send_PRSem_request( void *semReqData, SlaveVP *callingSlv ) 20.135 + { PRReqst req; 20.136 + 20.137 + req.reqType = PRSemantic; 20.138 + req.semReqData = semReqData; 20.139 + req.nextReqst = callingSlv->request; //grab any other preceding 20.140 + callingSlv->request = &req; 20.141 + 20.142 + PR_int__suspend_slaveVP_and_send_req( callingSlv ); 20.143 + } 20.144 + 20.145 +/*May 2012 20.146 + *To throw exception from wrapper lib or application, first turn 20.147 + * it into a request, then send the request 20.148 + */ 20.149 +void 20.150 +PR_WL__throw_exception( char *msgStr, SlaveVP *reqstSlv, PRExcp *excpData ) 20.151 + { PRReqst req; 20.152 + PRSemReq semReq; 20.153 + 20.154 + req.reqType = PRSemantic; 20.155 + req.semReqData = &semReq; 20.156 + req.nextReqst = reqstSlv->request; //grab any other preceding 20.157 + reqstSlv->request = &req; 20.158 + 20.159 + semReq.msgStr = msgStr; 20.160 + semReq.exceptionData = excpData; 20.161 + 20.162 + PR_int__suspend_slaveVP_and_send_req( reqstSlv ); 20.163 + }
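Note on PR__WL.c above: each send function builds a PRReqst on its own stack, links it in front of any requests already chained on the slave, and then suspends. The request is only valid while that stack frame exists, which holds in PR because the suspended slave's stack is kept until it resumes. The sketch below is not part of the changeset. It models the producer side with hypothetical Demo* names, and replaces the real suspend with a stub (demo_suspend_and_send) that consumes the chain immediately, so the stack-allocated request is never referenced after its frame returns.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoReqst
 { int                reqType;
   void              *semReqData;
   struct DemoReqst  *nextReqst;
 }
DemoReqst;

typedef struct
 { DemoReqst *request;            /* head of the pending-request chain */
   int        typesSeen[ 4 ];     /* order in which the stub saw requests */
   int        numSeen;
 }
DemoSlv;

/* Stub standing in for PR_int__suspend_slaveVP_and_send_req: records the
 * order the handler would see the requests in, and empties the chain. */
static void
demo_suspend_and_send( DemoSlv *slv )
 { while( slv->request != NULL )
    { if( slv->numSeen < 4 )
         slv->typesSeen[ slv->numSeen++ ] = slv->request->reqType;
      slv->request = slv->request->nextReqst;
    }
 }

/* Queue an extra request without sending; the caller owns the storage,
 * playing the role of the malloc'd carrier in the original. */
void
demo_add_request( DemoReqst *storage, int reqType, void *data, DemoSlv *slv )
 { storage->reqType    = reqType;
   storage->semReqData = data;
   storage->nextReqst  = slv->request;   /* push in front: LIFO order */
   slv->request        = storage;
 }

/* Build the final request on this frame's stack, then "suspend". */
void
demo_send_request( int reqType, void *data, DemoSlv *slv )
 { DemoReqst req;

   req.reqType    = reqType;
   req.semReqData = data;
   req.nextReqst  = slv->request;
   slv->request   = &req;

   demo_suspend_and_send( slv );   /* chain is empty again before returning */
 }
```

Since every push goes to the front of the chain, the handler sees the request that triggered the suspend first and earlier queued requests after it.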
21.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 21.2 +++ b/PR__int.c Wed Sep 19 23:12:44 2012 -0700 21.3 @@ -0,0 +1,289 @@ 21.4 +/* 21.5 + * Copyright 2010 OpenSourceStewardshipFoundation 21.6 + * 21.7 + * Licensed under BSD 21.8 + */ 21.9 + 21.10 +#include <stdio.h> 21.11 +#include <stdlib.h> 21.12 +#include <string.h> 21.13 +#include <malloc.h> 21.14 +#include <inttypes.h> 21.15 +#include <sys/time.h> 21.16 + 21.17 +#include "PR.h" 21.18 + 21.19 + 21.20 +/* MEANING OF WL PI SS int 21.21 + * These indicate which places the function is safe to use. They stand for: 21.22 + * WL: Wrapper Library 21.23 + * PI: Plugin 21.24 + * SS: Startup and Shutdown 21.25 + * int: internal to the PR implementation 21.26 + */ 21.27 + 21.28 + 21.29 +inline SlaveVP * 21.30 +PR_int__create_slaveVP( TopLevelFnPtr fnPtr, void *dataParam ) 21.31 + { SlaveVP *newSlv; 21.32 + void *stackLocs; 21.33 + 21.34 + newSlv = PR_int__malloc( sizeof(SlaveVP) ); 21.35 + stackLocs = PR_int__malloc( VIRT_PROCR_STACK_SIZE ); 21.36 + if( stackLocs == 0 ) 21.37 + { perror("PR_int__malloc stack"); exit(1); } 21.38 + 21.39 + _PRMasterEnv->numSlavesAlive += 1; 21.40 + 21.41 + return PR_int__create_slaveVP_helper( newSlv, fnPtr, dataParam, stackLocs ); 21.42 + } 21.43 + 21.44 +/* "ext" designates that it's for use outside the PR system -- should only 21.45 + * be called from main thread or other thread -- never from code animated by 21.46 + * a PR virtual processor. 
21.47 + */ 21.48 +inline SlaveVP * 21.49 +PR_ext__create_slaveVP( TopLevelFnPtr fnPtr, void *dataParam ) 21.50 + { SlaveVP *newSlv; 21.51 + char *stackLocs; 21.52 + 21.53 + newSlv = malloc( sizeof(SlaveVP) ); 21.54 + stackLocs = malloc( VIRT_PROCR_STACK_SIZE ); 21.55 + if( stackLocs == 0 ) 21.56 + { perror("malloc stack"); exit(1); } 21.57 + 21.58 + _PRMasterEnv->numSlavesAlive += 1; 21.59 + 21.60 + return PR_int__create_slaveVP_helper(newSlv, fnPtr, dataParam, stackLocs); 21.61 + } 21.62 + 21.63 + 21.64 +//=========================================================================== 21.65 +/*there is a label inside this function -- save the addr of this label in 21.66 + * the callingSlv struc, as the pick-up point from which to start the next 21.67 + * work-unit for that slave. If turns out have to save registers, then 21.68 + * save them in the slave struc too. Then do assembly jump to the CoreCtlr's 21.69 + * "done with work-unit" label. The slave struc is in the request in the 21.70 + * slave that animated the just-ended work-unit, so all the state is saved 21.71 + * there, and will get passed along, inside the request handler, to the 21.72 + * next work-unit for that slave. 21.73 + */ 21.74 +void 21.75 +PR_int__suspend_slaveVP_and_send_req( SlaveVP *animatingSlv ) 21.76 + { 21.77 + 21.78 + //This suspended Slv will get assigned by Master again at some 21.79 + // future point 21.80 + 21.81 + //return ownership of the Slv and anim slot to Master virt pr 21.82 + animatingSlv->animSlotAssignedTo->workIsDone = TRUE; 21.83 + 21.84 + HOLISTIC__Record_HwResponderInvocation_start; 21.85 + MEAS__Capture_Pre_Susp_Point; 21.86 + //This assembly function is a PR primitive that first saves the 21.87 + // stack and frame pointer, plus an addr inside this assembly code. 21.88 + //When core ctlr later gets this slave out of a sched slot, it 21.89 + // restores the stack and frame and then jumps to the addr.. that 21.90 + // jmp causes return from this function. 
21.91 + //So, in effect, this function takes a variable amount of wall-clock 21.92 + // time to complete -- the amount of time is determined by the 21.93 + // Master, which makes sure the memory is in a consistent state first. 21.94 + switchToCoreCtlr(animatingSlv); 21.95 + flushRegisters(); 21.96 + MEAS__Capture_Post_Susp_Point; 21.97 + 21.98 + return; 21.99 + } 21.100 + 21.101 + 21.102 +/* "ext" designates that it's for use outside the PR system -- should only 21.103 + * be called from main thread or other thread -- never from code animated by 21.104 + * a SlaveVP, nor from a masterVP. 21.105 + * 21.106 + *Use this version to dissipate Slvs created outside the PR system. 21.107 + */ 21.108 +void 21.109 +PR_ext__dissipate_slaveVP( SlaveVP *slaveToDissipate ) 21.110 + { 21.111 + _PRMasterEnv->numSlavesAlive -= 1; 21.112 + if( _PRMasterEnv->numSlavesAlive == 0 ) 21.113 + { //no more work, so shutdown 21.114 + PR_SS__shutdown(); //note, creates shut-down slaves on each core 21.115 + } 21.116 + 21.117 + //NOTE: dataParam was given to the processor, so should either have 21.118 + // been alloc'd with PR_int__malloc, or freed by the level above animSlv. 21.119 + //So, all that's left to free here is the stack and the SlaveVP struc 21.120 + // itself 21.121 + //Note, should not stack-allocate the data param -- no guarantee, in 21.122 + // general that creating processor will outlive ones it creates. 21.123 + free( slaveToDissipate->startOfStack ); 21.124 + free( slaveToDissipate ); 21.125 + } 21.126 + 21.127 + 21.128 + 21.129 +/*This must be called by the request handler plugin -- it cannot be called 21.130 + * from the semantic library "dissipate processor" function -- instead, the 21.131 + * semantic layer has to generate a request, and the plug-in calls this 21.132 + * function. 21.133 + *The reason is that this frees the virtual processor's stack -- which is 21.134 + * still in use inside semantic library calls! 
21.135 + * 21.136 + *This frees or recycles all the state owned by and comprising the PR 21.137 + * portion of the animating virtual procr. The request handler must first 21.138 + * free any semantic data created for the processor that didn't use the 21.139 + * PR_malloc mechanism. Then it calls this, which first asks the malloc 21.140 + * system to disown any state that did use PR_malloc, and then frees the 21.141 + * stack and the processor-struct itself. 21.142 + *If the dissipated processor is the sole (remaining) owner of PR_int__malloc'd 21.143 + * state, then that state gets freed (or sent to recycling) as a side-effect 21.144 + * of dis-owning it. 21.145 + */ 21.146 +void 21.147 +PR_int__dissipate_slaveVP( SlaveVP *animatingSlv ) 21.148 + { 21.149 + DEBUG__printf2(dbgRqstHdlr, "PR int dissipate slaveID: %d, alive: %d",animatingSlv->slaveID, _PRMasterEnv->numSlavesAlive-1); 21.150 + //dis-own all locations owned by this processor, causing to be freed 21.151 + // any locations that it is (was) sole owner of 21.152 + _PRMasterEnv->numSlavesAlive -= 1; 21.153 + if( _PRMasterEnv->numSlavesAlive == 0 ) 21.154 + { //no more work, so shutdown 21.155 + PR_SS__shutdown(); //note, creates shut-down processor on each core 21.156 + } 21.157 + 21.158 + //NOTE: dataParam was given to the processor, so should either have 21.159 + // been alloc'd with PR_int__malloc, or freed by the level above animSlv. 21.160 + //So, all that's left to free here is the stack and the SlaveVP struc 21.161 + // itself 21.162 + //Note, should not stack-allocate initial data -- no guarantee, in 21.163 + // general that creating processor will outlive ones it creates. 
21.164 + PR_int__free( animatingSlv->startOfStack ); 21.165 + PR_int__free( animatingSlv ); 21.166 + } 21.167 + 21.168 +/*Anticipating multi-tasking 21.169 + */ 21.170 +void * 21.171 +PR_int__give_sem_env_for( SlaveVP *animSlv ) 21.172 + { 21.173 + return _PRMasterEnv->semanticEnv; 21.174 + } 21.175 + 21.176 +/* 21.177 + * 21.178 + */ 21.179 +inline SlaveVP * 21.180 +PR_int__create_slaveVP_helper( SlaveVP *newSlv, TopLevelFnPtr fnPtr, 21.181 + void *dataParam, void *stackLocs ) 21.182 + { 21.183 + newSlv->startOfStack = stackLocs; 21.184 + newSlv->slaveID = _PRMasterEnv->numSlavesCreated++; 21.185 + newSlv->request = NULL; 21.186 + newSlv->animSlotAssignedTo = NULL; 21.187 + newSlv->typeOfVP = Slave; 21.188 + newSlv->assignCount = 0; 21.189 + 21.190 + PR_int__reset_slaveVP_to_TopLvlFn( newSlv, fnPtr, dataParam ); 21.191 + 21.192 + //============================= MEASUREMENT STUFF ======================== 21.193 + #ifdef PROBES__TURN_ON_STATS_PROBES 21.194 + //TODO: make this TSCHiLow or generic equivalent 21.195 + //struct timeval timeStamp; 21.196 + //gettimeofday( &(timeStamp), NULL); 21.197 + //newSlv->createPtInSecs = timeStamp.tv_sec +(timeStamp.tv_usec/1000000.0) - 21.198 + // _PRMasterEnv->createPtInSecs; 21.199 + #endif 21.200 + //======================================================================== 21.201 + 21.202 + return newSlv; 21.203 + } 21.204 + 21.205 + 21.206 +/*Later, improve this -- for now, just exits the application after printing 21.207 + * the error message. 
21.208 + */ 21.209 +void 21.210 +PR_int__throw_exception( char *msgStr, SlaveVP *reqstSlv, PRExcp *excpData ) 21.211 + { 21.212 + printf("%s",msgStr); 21.213 + fflush(stdout); 21.214 + exit(1); 21.215 + } 21.216 + 21.217 + 21.218 +inline char * 21.219 +PR_int__strDup( char *str ) 21.220 + { char *retStr; 21.221 + 21.222 + if( str == NULL ) return (char *)NULL; 21.223 + retStr = (char *)PR_int__malloc( strlen(str) + 1 ); 21.224 + strcpy( retStr, str ); 21.225 + 21.226 + return (char *)retStr; 21.227 + } 21.228 + 21.229 + 21.230 +inline void 21.231 +PR_int__backoff_for_TooLongToGetLock( int32 numTriesToGetLock ); 21.232 + 21.233 +inline void 21.234 +PR_int__get_master_lock() 21.235 + { int32 *addrOfMasterLock; 21.236 + 21.237 + addrOfMasterLock = &(_PRMasterEnv->masterLock); 21.238 + 21.239 + int numTriesToGetLock = 0; 21.240 + int gotLock = 0; 21.241 + 21.242 + MEAS__Capture_Pre_Master_Lock_Point; 21.243 + 21.244 + while( !gotLock ) //keep going until get master lock 21.245 + { 21.246 + numTriesToGetLock++; //if too many, means too much contention 21.247 + if( numTriesToGetLock > NUM_TRIES_BEFORE_DO_BACKOFF ) 21.248 + { PR_int__backoff_for_TooLongToGetLock( numTriesToGetLock ); 21.249 + } 21.250 + if( numTriesToGetLock > MASTERLOCK_RETRIES_BEFORE_YIELD ) 21.251 + { numTriesToGetLock = 0; 21.252 + pthread_yield(); 21.253 + } 21.254 + 21.255 + //try to get the lock 21.256 + gotLock = __sync_bool_compare_and_swap( addrOfMasterLock, 21.257 + UNLOCKED, LOCKED ); 21.258 + } 21.259 + MEAS__Capture_Post_Master_Lock_Point; 21.260 + } 21.261 + 21.262 +/*Used by the backoff to pick a random amount of busy-wait. Can't use the 21.263 + * system rand because it takes much too long.
21.264 + *Note, the seeds are stored in _PRMasterEnv and are updated by each call 21.265 + */ 21.266 +inline uint32_t 21.267 +PR_int__randomNumber() 21.268 + { 21.269 + _PRMasterEnv->seed1 = 36969 * (_PRMasterEnv->seed1 & 65535) + 21.270 + (_PRMasterEnv->seed1 >> 16); 21.271 + _PRMasterEnv->seed2 = 18000 * (_PRMasterEnv->seed2 & 65535) + 21.272 + (_PRMasterEnv->seed2 >> 16); 21.273 + return (_PRMasterEnv->seed1 << 16) + _PRMasterEnv->seed2; 21.274 + } 21.275 + 21.276 + 21.277 +/*Busy-waits for a random number of cycles -- chooses number of cycles 21.278 + * differently than for the no-work backoff 21.279 + */ 21.280 +inline void 21.281 +PR_int__backoff_for_TooLongToGetLock( int32 numTriesToGetLock ) 21.282 + { int32 i, waitIterations; 21.283 + volatile double fakeWorkVar = 0.0; //busy-wait fake work 21.284 + 21.285 + waitIterations = 21.286 + PR_int__randomNumber()% (numTriesToGetLock * GET_LOCK_BACKOFF_WEIGHT); 21.287 + //addToHist( wait_iterations, coreLoopThdParams->wait_iterations_hist ); 21.288 + for( i = 0; i < waitIterations; i++ ) 21.289 + { fakeWorkVar += (fakeWorkVar + 32.0) / 2.0; //busy-wait 21.290 + } 21.291 + } 21.292 +
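The master-lock code above combines three pieces: a compare-and-swap spin, a busy-wait backoff whose length is drawn from a cheap multiply-with-carry generator, and an occasional yield to the OS. The following is a minimal standalone sketch of that pattern, not the PR implementation. DemoEnv, the demo_* functions, and the three DEMO_* tuning constants are hypothetical stand-ins for _PRMasterEnv and the constants in the PR headers, and sched_yield() is swapped in for the non-portable pthread_yield().

```c
#include <assert.h>
#include <sched.h>
#include <stdint.h>

#define UNLOCKED 0
#define LOCKED   1

/* Hypothetical tuning values; the real constants live in the PR headers. */
#define DEMO_TRIES_BEFORE_BACKOFF 10
#define DEMO_TRIES_BEFORE_YIELD   100
#define DEMO_BACKOFF_WEIGHT       100

/* Stand-in for the lock and seed fields of the master environment. */
typedef struct {
    volatile int32_t lock;
    uint32_t seed1;
    uint32_t seed2;
} DemoEnv;

/* Two 16-bit multiply-with-carry streams combined into one 32-bit value. */
static uint32_t demo_random(DemoEnv *env)
{
    env->seed1 = 36969u * (env->seed1 & 65535u) + (env->seed1 >> 16);
    env->seed2 = 18000u * (env->seed2 & 65535u) + (env->seed2 >> 16);
    return (env->seed1 << 16) + env->seed2;
}

/* Busy-wait for a random iteration count whose upper bound grows with
 * the number of failed attempts. */
static void demo_backoff(DemoEnv *env, int32_t numTries)
{
    volatile double fakeWork = 0.0;   /* volatile keeps the loop from being removed */
    uint32_t bound = (uint32_t)numTries * DEMO_BACKOFF_WEIGHT;
    uint32_t i, waitIterations;

    if (bound == 0) return;           /* guard against modulo by zero */
    waitIterations = demo_random(env) % bound;
    for (i = 0; i < waitIterations; i++)
        fakeWork += (fakeWork + 32.0) / 2.0;
}

/* Spin until the lock is taken; returns the number of CAS attempts made. */
static int demo_get_lock(DemoEnv *env)
{
    int32_t numTries = 0;
    int attempts = 0;

    for (;;) {
        numTries++;
        if (numTries > DEMO_TRIES_BEFORE_BACKOFF)
            demo_backoff(env, numTries);
        if (numTries > DEMO_TRIES_BEFORE_YIELD) {
            numTries = 0;             /* restart the count after yielding */
            sched_yield();
        }
        attempts++;
        /* GCC/Clang builtin: atomically set lock to LOCKED if it is UNLOCKED */
        if (__sync_bool_compare_and_swap(&env->lock, UNLOCKED, LOCKED))
            return attempts;
    }
}

static void demo_release_lock(DemoEnv *env)
{
    __sync_lock_release(&env->lock);  /* stores 0 (UNLOCKED) with release semantics */
}
```

A caller would initialise a DemoEnv with the lock set to UNLOCKED and two non-zero seeds, then bracket the critical section with demo_get_lock() and demo_release_lock(). Keeping the seeds next to the lock mirrors the original choice of avoiding the system rand() on the contended path.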
22.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 22.2 +++ b/PR__startup_and_shutdown.c Wed Sep 19 23:12:44 2012 -0700 22.3 @@ -0,0 +1,601 @@ 22.4 +/* 22.5 + * Copyright 2010 OpenSourceStewardshipFoundation 22.6 + * 22.7 + * Licensed under BSD 22.8 + */ 22.9 + 22.10 +#include <stdio.h> 22.11 +#include <stdlib.h> 22.12 +#include <string.h> 22.13 +#include <malloc.h> 22.14 +#include <inttypes.h> 22.15 +#include <sys/time.h> 22.16 +#include <pthread.h> 22.17 + 22.18 +#include "PR.h" 22.19 + 22.20 + 22.21 +#define thdAttrs NULL 22.22 + 22.23 + 22.24 +/* MEANING OF WL PI SS int 22.25 + * These indicate which places the function is safe to use. They stand for: 22.26 + * WL: Wrapper Library 22.27 + * PI: Plugin 22.28 + * SS: Startup and Shutdown 22.29 + * int: internal to the PR implementation 22.30 + */ 22.31 + 22.32 + 22.33 +//=========================================================================== 22.34 +AnimSlot ** 22.35 +create_anim_slots( int32 coreSlotsAreOn ); 22.36 + 22.37 +void 22.38 +create_masterEnv(); 22.39 + 22.40 +void 22.41 +create_the_coreCtlr_OS_threads(); 22.42 + 22.43 +MallocProlog * 22.44 +create_free_list(); 22.45 + 22.46 +void 22.47 +endOSThreadFn( void *initData, SlaveVP *animatingSlv ); 22.48 + 22.49 + 22.50 +//=========================================================================== 22.51 + 22.52 +/*Setup has two phases: 22.53 + * 1) Semantic layer first calls init_PR, which creates masterEnv, and puts 22.54 + * the master Slv into the work-queue, ready for first "call" 22.55 + * 2) Semantic layer then does its own init, which creates the seed virt 22.56 + * slave inside the semantic layer, ready to assign it when 22.57 + * asked by the first run of the animationMaster. 22.58 + * 22.59 + *This part is bit weird because PR really wants to be "always there", and 22.60 + * have applications attach and detach.. for now, this PR is part of 22.61 + * the app, so the PR system starts up as part of running the app. 
22.62 + * 22.63 + *The semantic layer is isolated from the PR internals by making the 22.64 + * semantic layer do setup to a state that it's ready with its 22.65 + * initial Slvs, ready to assign them to slots when the animationMaster 22.66 + * asks. Without this pattern, the semantic layer's setup would 22.67 + * have to modify slots directly to assign the initial virt-procrs, and put 22.68 + * them into the readyToAnimateQ itself, breaking the isolation completely. 22.69 + * 22.70 + * 22.71 + *The semantic layer creates the initial Slv(s), and adds its 22.72 + * own environment to masterEnv, and fills in the pointers to 22.73 + * the requestHandler and slaveAssigner plug-in functions 22.74 + */ 22.75 + 22.76 +/*This allocates PR data structures, populates the master PRProc, 22.77 + * and master environment, and returns the master environment to the semantic 22.78 + * layer. 22.79 + */ 22.80 +void 22.81 +PR__start() 22.82 + { 22.83 + #ifdef DEBUG__TURN_ON_SEQUENTIAL_MODE 22.84 + create_masterEnv(); 22.85 + printf( "\n\n Running in SEQUENTIAL mode \n\n" ); 22.86 + #else 22.87 + create_masterEnv(); 22.88 + DEBUG__printf1(dbgInfra,"Offset of lock in masterEnv: %d ", (int32)offsetof(MasterEnv,masterLock) ); 22.89 + create_the_coreCtlr_OS_threads(); 22.90 + #endif 22.91 + } 22.92 + 22.93 +/*This gets the process struct out of the seedVP, then gets the semEnv-holding 22.94 + * struct out of that, then inserts the semantic env into that struct, using 22.95 + * the magic number as the key to the sem env placement. The master will 22.96 + * use the magic number from a request to retrieve the semantic env appropriate 22.97 + * for the construct that made the request. 
22.98 + */ 22.99 +void 22.100 +PR__register_langlets_semEnv( PRSemEnv *semEnv, int32 magicNumber, 22.101 + SlaveVP *seedVP ) 22.102 + { PREnvHolder *envHolder; 22.103 + PRProcess *process; 22.104 + 22.105 + process = seedVP->process; 22.106 + envHolder = process->semEnvHolder; 22.107 + 22.108 + insert( magicNumber, semEnv, envHolder ); 22.109 + } 22.110 + 22.111 + 22.112 +/*TODO: finish implementing 22.113 + *This function returns information about the version of PR, the language 22.114 + * the program is being run in, its version, and information on the 22.115 + * hardware. 22.116 + */ 22.117 +/* 22.118 +char * 22.119 +PR_App__give_environment_string() 22.120 + { 22.121 + //-------------------------- 22.122 + fprintf(output, "#\n# >> Build information <<\n"); 22.123 + fprintf(output, "# GCC VERSION: %d.%d.%d\n",__GNUC__,__GNUC_MINOR__,__GNUC_PATCHLEVEL__); 22.124 + fprintf(output, "# Build Date: %s %s\n", __DATE__, __TIME__); 22.125 + 22.126 + fprintf(output, "#\n# >> Hardware information <<\n"); 22.127 + fprintf(output, "# Hardware Architecture: "); 22.128 + #ifdef __x86_64 22.129 + fprintf(output, "x86_64"); 22.130 + #endif //__x86_64 22.131 + #ifdef __i386 22.132 + fprintf(output, "x86"); 22.133 + #endif //__i386 22.134 + fprintf(output, "\n"); 22.135 + fprintf(output, "# Number of Cores: %d\n", NUM_CORES); 22.136 + //-------------------------- 22.137 + 22.138 + //PR Plugins 22.139 + fprintf(output, "#\n# >> PR Plugins <<\n"); 22.140 + fprintf(output, "# Language : "); 22.141 + fprintf(output, _LANG_NAME_); 22.142 + fprintf(output, "\n"); 22.143 + //Meta info gets set by calls from the language during its init, 22.144 + // and info registered by calls from inside the application 22.145 + fprintf(output, "# Assigner: %s\n", _PRMasterEnv->metaInfo->assignerInfo); 22.146 + 22.147 + //-------------------------- 22.148 + //Application 22.149 + fprintf(output, "#\n# >> Application <<\n"); 22.150 + fprintf(output, "# Name: %s\n", _PRMasterEnv->metaInfo->appInfo); 
22.151 + fprintf(output, "# Data Set:\n%s\n",_PRMasterEnv->metaInfo->inputSet); 22.152 + 22.153 + //-------------------------- 22.154 + } 22.155 + */ 22.156 + 22.157 + 22.158 +/*A pointer to the startup-function for the language is given as the last 22.159 + * argument to the call. Use this to initialize a program in the language. 22.160 + * This creates a data structure that encapsulates the bookkeeping info 22.161 + * PR uses to track and schedule a program run. 22.162 + */ 22.163 +PRProcess * 22.164 +PR__spawn_program_on_data_in_Lang( TopLevelFnPtr seed_fn, void *data ) 22.165 + { PRProcess *newProcess; 22.166 + newProcess = malloc( sizeof(PRProcess) ); 22.167 + 22.168 + newProcess->doneLock = PTHREAD_MUTEX_INITIALIZER; 22.169 + newProcess->doneCond = PTHREAD_COND_INITIALIZER; 22.170 + newProcess->executionIsComplete = FALSE; 22.171 + newProcess->numSlavesLive = 0; 22.172 + 22.173 + newProcess->dataForSeed = data; 22.174 + newProcess->seedFnPtr = seed_fn; 22.175 + 22.176 + //The language's spawn-process function fills in the plugin function-ptrs in 22.177 + // the PRProcess struct, gives the struct to PR, which then makes and 22.178 + // queues the seed SlaveVP, which starts processors made from the code being 22.179 + // animated. 22.180 + 22.181 + (*langInitFnPtr)( newProcess ); 22.182 + 22.183 + return newProcess; 22.184 + } 22.185 + 22.186 + 22.187 +/*When all SlaveVPs owned by the program-run associated with the process have 22.188 + * dissipated, then return from this call. There is no language to cleanup, 22.189 + * and PR does not shutdown.. but the process bookkeeping structure, 22.190 + * which is used by PR to track and schedule the program, is freed. 22.191 + *The PRProcess structure is kept until this call collects the results from it, 22.192 + * then freed. If the process is not done yet when PR gets this 22.193 + * call, then this call waits.. the challenge here is that this call comes from 22.194 + * a live OS thread that's outside PR..
so, inside here, it waits on a 22.195 + * condition.. then it's a PR thread that signals this to wake up.. 22.196 + *First checks whether the process is done, if yes, calls the clean-up fn then 22.197 + * returns the result extracted from the PRProcess struct. 22.198 + *If process not done yet, then performs a wait (in a loop to be sure the 22.199 + * wakeup is not spurious, which can happen). PR registers the wait, and upon 22.200 + * the process ending (last SlaveVP owned by it dissipates), then PR signals 22.201 + * this to wakeup. This then calls the cleanup fn and returns the result. 22.202 + */ 22.203 +/* 22.204 +void * 22.205 +PR_App__give_results_when_done_for( PRProcess *process ) 22.206 + { void *result; 22.207 + 22.208 + pthread_mutex_lock( process->doneLock ); 22.209 + while( !(process->executionIsComplete) ) 22.210 + { 22.211 + pthread_cond_wait( process->doneCond, 22.212 + process->doneLock ); 22.213 + } 22.214 + pthread_mutex_unlock( process->doneLock ); 22.215 + 22.216 + result = process->resultToReturn; 22.217 + 22.218 + PR_int__cleanup_process_after_done( process ); 22.219 + free( process ); //was malloc'd above, so free it here 22.220 + 22.221 + return result; 22.222 + } 22.223 +*/ 22.224 + 22.225 +/*Turns off the PR system, and frees all data associated with it. Does this 22.226 + * by creating shutdown SlaveVPs and inserting them into animation slots. 22.227 + * Will probably have to wake up sleeping cores as part of this -- the fn that 22.228 + * inserts the new SlaveVPs should handle the wakeup.. 
22.229 + */ 22.230 +/* 22.231 +void 22.232 +PR_SS__shutdown(); //already defined -- look at it 22.233 + 22.234 +void 22.235 +PR_App__shutdown() 22.236 + { 22.237 + for( cores ) 22.238 + { slave = PR_int__create_new_SlaveVP( endOSThreadFn, NULL ); 22.239 + PR_int__insert_slave_onto_core( SlaveVP *slave, coreNum ); 22.240 + } 22.241 + } 22.242 +*/ 22.243 + 22.244 +/* PR_App__start_PR_running(); 22.245 + 22.246 + PRProcess matrixMultProcess; 22.247 + 22.248 + matrixMultProcess = 22.249 + PR_App__spawn_program_on_data_in_Lang( &prog_seed_fn, data, Vthread_lang ); 22.250 + 22.251 + resMatrix = PR_App__give_results_when_done_for( matrixMultProcess ); 22.252 + 22.253 + PR_App__shutdown(); 22.254 + */ 22.255 + 22.256 +void 22.257 +create_masterEnv() 22.258 + { MasterEnv *masterEnv; 22.259 + PRQueueStruc **readyToAnimateQs; 22.260 + int coreIdx; 22.261 + SlaveVP **masterVPs; 22.262 + AnimSlot ***allAnimSlots; //ptr to array of ptrs 22.263 + 22.264 + 22.265 + //Make the master env, which holds everything else 22.266 + _PRMasterEnv = malloc( sizeof(MasterEnv) ); 22.267 + 22.268 + //Very first thing put into the master env is the free-list, seeded 22.269 + // with a massive initial chunk of memory. 22.270 + //After this, all other mallocs are PR__malloc. 
22.271 + _PRMasterEnv->freeLists = PR_ext__create_free_list(); 22.272 + 22.273 + 22.274 + //===================== Only PR__malloc after this ==================== 22.275 + masterEnv = (MasterEnv*)_PRMasterEnv; 22.276 + 22.277 + //Make a readyToAnimateQ for each core controller 22.278 + readyToAnimateQs = PR_int__malloc( NUM_CORES * sizeof(PRQueueStruc *) ); 22.279 + masterVPs = PR_int__malloc( NUM_CORES * sizeof(SlaveVP *) ); 22.280 + 22.281 + //One array for each core, several in array, core's masterVP scheds all 22.282 + allAnimSlots = PR_int__malloc( NUM_CORES * sizeof(AnimSlot *) ); 22.283 + 22.284 + _PRMasterEnv->numSlavesAlive = 0; //used to detect shut-down condition 22.285 + 22.286 +//======================================== 22.287 + semEnv->shutdownInitiated = FALSE; 22.288 + semEnv->coreIsDone = PR_int__malloc( NUM_CORES * sizeof( bool32 ) ); 22.289 + 22.290 + //For each animation slot, there is an idle slave, and an initial 22.291 + // slave assigned as the current-task-slave. Create them here. 
22.292 + SlaveVP *idleSlv, *slotTaskSlv; 22.293 + for( coreNum = 0; coreNum < NUM_CORES; coreNum++ ) 22.294 + { semEnv->coreIsDone[coreNum] = FALSE; //use during shutdown 22.295 + 22.296 + for( slotNum = 0; slotNum < NUM_ANIM_SLOTS; ++slotNum ) 22.297 + { idleSlv = VSs__create_slave_helper( &idle_fn, NULL, semEnv, 0); 22.298 + idleSlv->coreAnimatedBy = coreNum; 22.299 + idleSlv->animSlotAssignedTo = 22.300 + _PRMasterEnv->allAnimSlots[coreNum][slotNum]; 22.301 + semEnv->idleSlv[coreNum][slotNum] = idleSlv; 22.302 + 22.303 + slotTaskSlv = VSs__create_slave_helper( &idle_fn, NULL, semEnv, 0); 22.304 + slotTaskSlv->coreAnimatedBy = coreNum; 22.305 + slotTaskSlv->animSlotAssignedTo = 22.306 + _PRMasterEnv->allAnimSlots[coreNum][slotNum]; 22.307 + 22.308 + semData = slotTaskSlv->semanticData; 22.309 + semData->needsTaskAssigned = TRUE; 22.310 + semData->slaveType = SlotTaskSlv; 22.311 + semEnv->slotTaskSlvs[coreNum][slotNum] = slotTaskSlv; 22.312 + } 22.313 + } 22.314 + 22.315 + //create the recycle queue where free task slaves are put after their task ends 22.316 + semEnv->freeTaskSlvRecycleQ = makePRQ(); 22.317 + 22.318 + 22.319 + semEnv->numLiveExtraTaskSlvs = 0; 22.320 + semEnv->numLiveThreadSlvs = 0; //none existent yet.. "create process" creates the seeds 22.321 +//================================================================== 22.322 + 22.323 + _PRMasterEnv->numSlavesCreated = 0; //used by create slave to set slave ID 22.324 + for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ ) 22.325 + { 22.326 + readyToAnimateQs[ coreIdx ] = makePRQ(); 22.327 + 22.328 + //Q: should give masterVP core-specific info as its init data? 
22.329 + masterVPs[ coreIdx ] = PR_int__create_slaveVP( (TopLevelFnPtr)&animationMaster, (void*)masterEnv ); 22.330 + masterVPs[ coreIdx ]->coreAnimatedBy = coreIdx; 22.331 + masterVPs[ coreIdx ]->typeOfVP = Master; 22.332 + allAnimSlots[ coreIdx ] = create_anim_slots( coreIdx ); //makes for one core 22.333 + } 22.334 + _PRMasterEnv->masterVPs = masterVPs; 22.335 + _PRMasterEnv->masterLock = UNLOCKED; 22.336 + _PRMasterEnv->seed1 = rand()%1000; // init random number generator 22.337 + _PRMasterEnv->seed2 = rand()%1000; // init random number generator 22.338 + _PRMasterEnv->allAnimSlots = allAnimSlots; 22.339 + _PRMasterEnv->measHistsInfo = NULL; 22.340 + 22.341 + //============================= MEASUREMENT STUFF ======================== 22.342 + 22.343 + MEAS__Make_Meas_Hists_for_Susp_Meas; 22.344 + MEAS__Make_Meas_Hists_for_Master_Meas; 22.345 + MEAS__Make_Meas_Hists_for_Master_Lock_Meas; 22.346 + MEAS__Make_Meas_Hists_for_Malloc_Meas; 22.347 + MEAS__Make_Meas_Hists_for_Plugin_Meas; 22.348 + MEAS__Make_Meas_Hists_for_Language; 22.349 + 22.350 + PROBES__Create_Probe_Bookkeeping_Vars; 22.351 + 22.352 + HOLISTIC__Setup_Perf_Counters; 22.353 + 22.354 + //======================================================================== 22.355 + } 22.356 + 22.357 +AnimSlot ** 22.358 +create_anim_slots( int32 coreSlotsAreOn ) 22.359 + { AnimSlot **animSlots; 22.360 + int i; 22.361 + 22.362 + animSlots = PR_int__malloc( NUM_ANIM_SLOTS * sizeof(AnimSlot *) ); 22.363 + 22.364 + for( i = 0; i < NUM_ANIM_SLOTS; i++ ) 22.365 + { 22.366 + animSlots[i] = PR_int__malloc( sizeof(AnimSlot) ); 22.367 + 22.368 + //Set state to mean "handling requests done, slot needs filling" 22.369 + animSlots[i]->workIsDone = FALSE; 22.370 + animSlots[i]->needsSlaveAssigned = TRUE; 22.371 + animSlots[i]->slotIdx = i; //quick retrieval of slot pos 22.372 + animSlots[i]->coreSlotIsOn = coreSlotsAreOn; 22.373 + } 22.374 + return animSlots; 22.375 + } 22.376 + 22.377 + 22.378 +void 22.379 +freeAnimSlots( 
AnimSlot **animSlots ) 22.380 + { int i; 22.381 + for( i = 0; i < NUM_ANIM_SLOTS; i++ ) 22.382 + { 22.383 + PR_int__free( animSlots[i] ); 22.384 + } 22.385 + PR_int__free( animSlots ); 22.386 + } 22.387 + 22.388 + 22.389 +void 22.390 +create_the_coreCtlr_OS_threads() 22.391 + { 22.392 + //======================================================================== 22.393 + // Create the Threads 22.394 + int coreIdx, retCode; 22.395 + 22.396 + //Need the threads to be created suspended, and wait for a signal 22.397 + // before proceeding -- gives time after creating to initialize other 22.398 + // stuff before the coreCtlrs set off. 22.399 + _PRMasterEnv->setupComplete = 0; 22.400 + 22.401 + //initialize the cond used to make the new threads wait and sync up 22.402 + //must do this before *creating* the threads.. 22.403 + pthread_mutex_init( &suspendLock, NULL ); 22.404 + pthread_cond_init( &suspendCond, NULL ); 22.405 + 22.406 + //Make the threads that animate the core controllers 22.407 + for( coreIdx=0; coreIdx < NUM_CORES; coreIdx++ ) 22.408 + { coreCtlrThdParams[coreIdx] = PR_int__malloc( sizeof(ThdParams) ); 22.409 + coreCtlrThdParams[coreIdx]->coreNum = coreIdx; 22.410 + 22.411 + retCode = 22.412 + pthread_create( &(coreCtlrThdHandles[coreIdx]), 22.413 + thdAttrs, 22.414 + &coreController, 22.415 + (void *)(coreCtlrThdParams[coreIdx]) ); 22.416 + if(retCode){printf("ERROR creating thread: %d\n", retCode); exit(1);} 22.417 + } 22.418 + } 22.419 + 22.420 + 22.421 +/*This is what causes the PR system to initialize.. then waits for it to 22.422 + * exit. 22.423 + * 22.424 + *Wrapper lib layer calls this when it wants the system to start running.. 22.425 + */ 22.426 +/* 22.427 +void 22.428 +PR_SS__start_the_work_then_wait_until_done() 22.429 + { 22.430 +#ifdef DEBUG__TURN_ON_SEQUENTIAL_MODE 22.431 + //Only difference between version with an OS thread pinned to each core and 22.432 + // the sequential version of PR is PR__init_Seq, this, and coreCtlr_Seq. 
22.433 + // 22.434 + //Instead of un-suspending threads, just call the one and only 22.435 + // core ctlr (sequential version), in the main thread. 22.436 + coreCtlr_Seq( NULL ); 22.437 + flushRegisters(); 22.438 +#else 22.439 + int coreIdx; 22.440 + //Start the core controllers running 22.441 + 22.442 + //tell the core controller threads that setup is complete 22.443 + //get lock, to lock out any threads still starting up -- they'll see 22.444 + // that setupComplete is true before entering while loop, and so never 22.445 + // wait on the condition 22.446 + pthread_mutex_lock( &suspendLock ); 22.447 + _PRMasterEnv->setupComplete = 1; 22.448 + pthread_mutex_unlock( &suspendLock ); 22.449 + pthread_cond_broadcast( &suspendCond ); 22.450 + 22.451 + 22.452 + //wait for all to complete 22.453 + for( coreIdx=0; coreIdx < NUM_CORES; coreIdx++ ) 22.454 + { 22.455 + pthread_join( coreCtlrThdHandles[coreIdx], NULL ); 22.456 + } 22.457 + 22.458 + //NOTE: do not clean up PR env here -- semantic layer has to have 22.459 + // a chance to clean up its environment first, then do a call to free 22.460 + // the Master env and rest of PR locations 22.461 +#endif 22.462 + } 22.463 +*/ 22.464 + 22.465 +SlaveVP* PR_SS__create_shutdown_slave(){ 22.466 + SlaveVP* shutdownVP; 22.467 + 22.468 + shutdownVP = PR_int__create_slaveVP( &endOSThreadFn, NULL ); 22.469 + shutdownVP->typeOfVP = Shutdown; 22.470 + 22.471 + return shutdownVP; 22.472 +} 22.473 + 22.474 +//TODO: look at architecting cleanest separation between request handler 22.475 +// and animation master, for dissipate, create, shutdown, and other non-semantic 22.476 +// requests. Issue is chain: one removes requests from AppSlv, one dispatches 22.477 +// on type of request, and one handles each type.. 
but some types require 22.478 +// action from both request handler and animation master -- maybe just give the 22.479 +// request handler calls like: PR__handle_X_request_type 22.480 + 22.481 + 22.482 +/*This is called by the semantic layer's request handler when it decides it's 22.483 + * time to shut down the PR system. Calling this causes the core controller OS 22.484 + * threads to exit, which unblocks the entry-point function that started up 22.485 + * PR, and allows it to grab the result and return to the original single- 22.486 + * threaded application. 22.487 + * 22.488 + *The _PRMasterEnv is needed by this shut down function, so the create-seed- 22.489 + * and-wait function has to free a bunch of stuff after it detects the 22.490 + * threads have all died: the masterEnv, the thread-related locations, 22.491 + * masterVP, and any AppSlvs that might still be allocated and sitting in the 22.492 + * semantic environment, or have been orphaned in the _PRWorkQ. 22.493 + * 22.494 + *NOTE: the semantic plug-in is expected to use PR__malloc to get all the 22.495 + * locations it needs, and give ownership to masterVP. Then, they will be 22.496 + * automatically freed. 22.497 + * 22.498 + *In here, create one core-loop shut-down processor for each core controller and put 22.499 + * each directly into the first animation slot of its core. 22.500 + *Note, this function can ONLY be called after the semantic environment no 22.501 + * longer cares if AppSlvs get animated after the point this is called. In 22.502 + * other words, this can be used as an abort, or else it should only be 22.503 + * called when all AppSlvs have finished dissipate requests -- only at that 22.504 + * point is it certain that all results have completed.
22.505 + */ 22.506 +void 22.507 +PR_SS__shutdown() 22.508 + { int32 coreIdx; 22.509 + SlaveVP *shutDownSlv; 22.510 + AnimSlot **animSlots; 22.511 + //create the shutdown processors, one for each core controller -- write each 22.512 + // directly into its core's first slot -- each core will die when it gets one 22.513 + for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ ) 22.514 + { //Note, this is running in the master 22.515 + shutDownSlv = PR_SS__create_shutdown_slave(); 22.516 + //last slave has dissipated, so no more in slots, so write 22.517 + // shut down slave into first animation slot. 22.518 + animSlots = _PRMasterEnv->allAnimSlots[ coreIdx ]; 22.519 + animSlots[0]->slaveAssignedToSlot = shutDownSlv; 22.520 + animSlots[0]->needsSlaveAssigned = FALSE; 22.521 + shutDownSlv->coreAnimatedBy = coreIdx; 22.522 + shutDownSlv->animSlotAssignedTo = animSlots[ 0 ]; 22.523 + } 22.524 + } 22.525 + 22.526 + 22.527 +/*Am trying to be cute, avoiding IF statement in coreCtlr that checks for 22.528 + * a special shutdown slaveVP. Ended up with extra-complex shutdown sequence. 22.529 + *This function has the sole purpose of setting the stack and framePtr 22.530 + * to the coreCtlr's stack and framePtr.. it does that then jumps to the 22.531 + * core ctlr's shutdown point -- might be able to just call pthread_exit 22.532 + * from here, but am going back to the pthread's stack and setting everything 22.533 + * up just as if it never jumped out, before calling pthread_exit. 22.534 + *The end-point of core ctlr will free the stack and so forth of the 22.535 + * processor that animates this function, (this fn is transferring the 22.536 + * animator of the AppSlv that is in turn animating this function over 22.537 + * to core controller function -- note that this slices out a level of virtual 22.538 + * processors).
22.539 + */ 22.540 +void 22.541 +endOSThreadFn( void *initData, SlaveVP *animatingSlv ) 22.542 + { 22.543 + #ifdef DEBUG__TURN_ON_SEQUENTIAL_MODE 22.544 + asmTerminateCoreCtlrSeq(animatingSlv); 22.545 + #else 22.546 + asmTerminateCoreCtlr(animatingSlv); 22.547 + #endif 22.548 + } 22.549 + 22.550 + 22.551 +/*This is called from the startup & shutdown 22.552 + */ 22.553 +void 22.554 +PR_SS__cleanup_at_end_of_shutdown() 22.555 + { 22.556 + //Before getting rid of everything, print out any measurements made 22.557 + if( _PRMasterEnv->measHistsInfo != NULL ) 22.558 + { forAllInDynArrayDo( _PRMasterEnv->measHistsInfo, (DynArrayFnPtr)&printHist ); 22.559 + forAllInDynArrayDo( _PRMasterEnv->measHistsInfo, (DynArrayFnPtr)&saveHistToFile); 22.560 + forAllInDynArrayDo( _PRMasterEnv->measHistsInfo, (DynArrayFnPtr)&freeHist ); 22.561 + } 22.562 + 22.563 + MEAS__Print_Hists_for_Susp_Meas; 22.564 + MEAS__Print_Hists_for_Master_Meas; 22.565 + MEAS__Print_Hists_for_Master_Lock_Meas; 22.566 + MEAS__Print_Hists_for_Malloc_Meas; 22.567 + MEAS__Print_Hists_for_Plugin_Meas; 22.568 + 22.569 + 22.570 + //All the environment data has been allocated with PR__malloc, so just 22.571 + // free its internal big-chunk and all inside it disappear. 
22.572 +/* 22.573 + readyToAnimateQs = _PRMasterEnv->readyToAnimateQs; 22.574 + masterVPs = _PRMasterEnv->masterVPs; 22.575 + allAnimSlots = _PRMasterEnv->allAnimSlots; 22.576 + 22.577 + for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ ) 22.578 + { 22.579 + freePRQ( readyToAnimateQs[ coreIdx ] ); 22.580 + //master Slvs were created external to PR, so use external free 22.581 + PR_int__dissipate_slaveVP( masterVPs[ coreIdx ] ); 22.582 + 22.583 + freeAnimSlots( allAnimSlots[ coreIdx ] ); 22.584 + } 22.585 + 22.586 + PR_int__free( _PRMasterEnv->readyToAnimateQs ); 22.587 + PR_int__free( _PRMasterEnv->masterVPs ); 22.588 + PR_int__free( _PRMasterEnv->allAnimSlots ); 22.589 + 22.590 + //============================= MEASUREMENT STUFF ======================== 22.591 + #ifdef PROBES__TURN_ON_STATS_PROBES 22.592 + freeDynArrayDeep( _PRMasterEnv->dynIntervalProbesInfo, &PR_WL__free_probe); 22.593 + #endif 22.594 + //======================================================================== 22.595 +*/ 22.596 + //These are the only two that use system free 22.597 + PR_ext__free_free_list( _PRMasterEnv->freeLists ); 22.598 + free( (void *)_PRMasterEnv ); 22.599 + } 22.600 + 22.601 + 22.602 +//================================ 22.603 + 22.604 +
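The slot bookkeeping in this file follows a simple shape: create_anim_slots builds one array of slot pointers per core, and PR_SS__shutdown later writes a shutdown slave into slot 0 of every core. The sketch below illustrates that shape in isolation, under stated assumptions: DemoSlot, DemoVP, the demo_* functions, and the DEMO_* sizes are hypothetical simplifications, and plain calloc/free stand in for PR_int__malloc and PR_int__free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sizes for the sketch; PR takes these from its headers. */
#define DEMO_NUM_CORES      4
#define DEMO_NUM_ANIM_SLOTS 3

typedef enum { DemoSlave, DemoMaster, DemoShutdown } DemoVPType;

typedef struct DemoSlot DemoSlot;

typedef struct {
    DemoVPType typeOfVP;
    int        coreAnimatedBy;
    DemoSlot  *animSlotAssignedTo;
} DemoVP;

struct DemoSlot {
    int     needsSlaveAssigned;
    int     slotIdx;
    int     coreSlotIsOn;
    DemoVP *slaveAssignedToSlot;
};

/* Zeroing allocator that aborts on failure, to keep the sketch short. */
static void *demo_alloc(size_t numBytes)
{
    void *mem = calloc(1, numBytes);
    if (mem == NULL) abort();
    return mem;
}

/* One array of slot pointers per core; every slot starts out empty. */
static DemoSlot ***demo_create_all_slots(void)
{
    DemoSlot ***allSlots = demo_alloc(DEMO_NUM_CORES * sizeof *allSlots);
    int core, i;

    for (core = 0; core < DEMO_NUM_CORES; core++) {
        allSlots[core] = demo_alloc(DEMO_NUM_ANIM_SLOTS * sizeof *allSlots[core]);
        for (i = 0; i < DEMO_NUM_ANIM_SLOTS; i++) {
            DemoSlot *slot = demo_alloc(sizeof *slot);
            slot->needsSlaveAssigned = 1;     /* "slot needs filling" */
            slot->slotIdx = i;
            slot->coreSlotIsOn = core;
            slot->slaveAssignedToSlot = NULL;
            allSlots[core][i] = slot;
        }
    }
    return allSlots;
}

/* Shutdown: write one shutdown VP into slot 0 of every core. */
static void demo_shutdown(DemoSlot ***allSlots)
{
    int core;

    for (core = 0; core < DEMO_NUM_CORES; core++) {
        DemoVP *vp = demo_alloc(sizeof *vp);
        vp->typeOfVP = DemoShutdown;
        vp->coreAnimatedBy = core;
        vp->animSlotAssignedTo = allSlots[core][0];
        allSlots[core][0]->slaveAssignedToSlot = vp;
        allSlots[core][0]->needsSlaveAssigned = 0;
    }
}

/* Free any VP still sitting in a slot, then the slots and the arrays. */
static void demo_free_all(DemoSlot ***allSlots)
{
    int core, i;

    for (core = 0; core < DEMO_NUM_CORES; core++) {
        for (i = 0; i < DEMO_NUM_ANIM_SLOTS; i++) {
            free(allSlots[core][i]->slaveAssignedToSlot);  /* free(NULL) is a no-op */
            free(allSlots[core][i]);
        }
        free(allSlots[core]);
    }
    free(allSlots);
}
```

Sizing each allocation with `sizeof *ptr` rather than a spelled-out type keeps the element size tied to the pointer being assigned, which avoids having to keep the pointer depth in the sizeof expression in step with the declaration by hand.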
23.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 23.2 +++ b/PR_primitive_data_types.h Wed Sep 19 23:12:44 2012 -0700 23.3 @@ -0,0 +1,42 @@ 23.4 +/* 23.5 + * Copyright 2009 OpenSourceStewardshipFoundation.org 23.6 + * Licensed under GNU General Public License version 2 23.7 + * 23.8 + * Author: seanhalle@yahoo.com 23.9 + * 23.10 + 23.11 + */ 23.12 + 23.13 +#ifndef _PRIMITIVE_DATA_TYPES_H 23.14 +#define _PRIMITIVE_DATA_TYPES_H 23.15 + 23.16 + 23.17 +/*For portability, need primitive data types that have a well defined 23.18 + * size, and well-defined layout into bytes 23.19 + *To do this, provide standard aliases for all primitive data types 23.20 + *These aliases must be used in all functions instead of the ANSI types 23.21 + * 23.22 + *When PR is used together with BLIS, these definitions will be replaced 23.23 + * inside each specialization module according to the compiler used in 23.24 + * that module and the hardware being specialized to. 23.25 + */ 23.26 +typedef char bool8; 23.27 +typedef signed char int8; 23.28 +typedef unsigned char uint8; 23.29 +typedef short int16; 23.30 +typedef unsigned short uint16; 23.31 +typedef int int32; 23.32 +typedef unsigned int uint32; 23.33 +typedef unsigned int bool32; 23.34 +typedef long long int64; 23.35 +typedef unsigned long long uint64; 23.36 +typedef float float32; 23.37 +typedef double float64; 23.38 +//typedef double double float128; //GCC doesn't like this 23.39 +#define float128 double double 23.40 + 23.41 +#define TRUE 1 23.42 +#define FALSE 0 23.43 + 23.44 +#endif /* _PRIMITIVE_DATA_TYPES_H */ 23.45 +
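The header's stated goal is aliases with a well-defined size. One way to make that guarantee checkable, shown here only as a sketch and not as what this changeset does, is to build the aliases on the C99 fixed-width types from stdint.h and verify them at compile time with C11 _Static_assert. The alias names match the header; demo_aliases_ok is a hypothetical helper, and the 4-byte float and 8-byte double sizes are an assumption that holds on common IEEE-754 platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Aliases built on fixed-width types, so sizes do not depend on the ABI. */
typedef uint8_t  bool8;
typedef int8_t   int8;
typedef uint8_t  uint8;
typedef int16_t  int16;
typedef uint16_t uint16;
typedef int32_t  int32;
typedef uint32_t uint32;
typedef uint32_t bool32;
typedef int64_t  int64;
typedef uint64_t uint64;
typedef float    float32;
typedef double   float64;

/* Compile-time checks: the build fails if any alias has the wrong shape. */
_Static_assert(sizeof(int8)  == 1 && sizeof(uint8)  == 1, "8-bit aliases");
_Static_assert(sizeof(int16) == 2 && sizeof(uint16) == 2, "16-bit aliases");
_Static_assert(sizeof(int32) == 4 && sizeof(uint32) == 4, "32-bit aliases");
_Static_assert(sizeof(int64) == 8 && sizeof(uint64) == 8, "64-bit aliases");
_Static_assert((uint8)-1 > 0, "uint8 must be unsigned");
_Static_assert((int8)-1 < 0,  "int8 must be signed");
_Static_assert(sizeof(float32) == 4 && sizeof(float64) == 8,
               "assumes IEEE-754 float and double");

/* Runtime spot check that complements the compile-time checks. */
static int demo_aliases_ok(void)
{
    uint8 wraps = (uint8)(255 + 1);   /* unsigned 8-bit conversion wraps to 0 */
    int8  neg   = (int8)-1;
    return wraps == 0 && neg < 0 && sizeof(bool32) == 4;
}
```

The signedness checks matter because plain char is signed on some targets and unsigned on others, so an alias built on it can silently change behaviour between platforms.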
24.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 24.2 +++ b/Services_Offered_by_PR/Measurement_and_Stats/MEAS__macros.h Wed Sep 19 23:12:44 2012 -0700 24.3 @@ -0,0 +1,514 @@ 24.4 +/* 24.5 + * Copyright 2009 OpenSourceStewardshipFoundation.org 24.6 + * Licensed under GNU General Public License version 2 24.7 + * 24.8 + * Author: seanhalle@yahoo.com 24.9 + * 24.10 + */ 24.11 + 24.12 +#ifndef _PR_MEAS_MACROS_H 24.13 +#define _PR_MEAS_MACROS_H 24.14 +#define _GNU_SOURCE 24.15 + 24.16 +//================== Macros define types of meas want ===================== 24.17 +// 24.18 +/*Generic measurement macro -- has name-space collision potential, which 24.19 + * compiler will catch.. so only use one pair inside a given set of 24.20 + * curly braces. 24.21 + */ 24.22 +//TODO: finish generic capture interval in hist 24.23 +enum histograms 24.24 + { generic1 24.25 + }; 24.26 + #define MEAS__Capture_Pre_Point \ 24.27 + int32 startStamp, endStamp; \ 24.28 + saveLowTimeStampCountInto( startStamp ); 24.29 + 24.30 + #define MEAS__Capture_Post_Point( histName ) \ 24.31 + saveLowTimeStampCountInto( endStamp ); \ 24.32 + addIntervalToHist( startStamp, endStamp, _PRMasterEnv->histName ); 24.33 + 24.34 + 24.35 + 24.36 + 24.37 +//================== Macros define types of meas want ===================== 24.38 + 24.39 +#ifdef MEAS__TURN_ON_SUSP_MEAS 24.40 + #define MEAS__Insert_Susp_Meas_Fields_into_Slave \ 24.41 + uint32 preSuspTSCLow; \ 24.42 + uint32 postSuspTSCLow; 24.43 + 24.44 + #define MEAS__Insert_Susp_Meas_Fields_into_MasterEnv \ 24.45 + Histogram *suspLowTimeHist; \ 24.46 + Histogram *suspHighTimeHist; 24.47 + 24.48 + #define MEAS__Make_Meas_Hists_for_Susp_Meas \ 24.49 + _PRMasterEnv->suspLowTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 24.50 + "master_low_time_hist");\ 24.51 + _PRMasterEnv->suspHighTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 24.52 + "master_high_time_hist"); 24.53 + 24.54 + //record time stamp: compare to time-stamp recorded below 24.55 + #define 
MEAS__Capture_Pre_Susp_Point \ 24.56 + saveLowTimeStampCountInto( animatingSlv->preSuspTSCLow ); 24.57 + 24.58 + //NOTE: only take low part of count -- do sanity check when take diff 24.59 + #define MEAS__Capture_Post_Susp_Point \ 24.60 + saveLowTimeStampCountInto( animatingSlv->postSuspTSCLow );\ 24.61 + addIntervalToHist( preSuspTSCLow, postSuspTSCLow,\ 24.62 + _PRMasterEnv->suspLowTimeHist ); \ 24.63 + addIntervalToHist( preSuspTSCLow, postSuspTSCLow,\ 24.64 + _PRMasterEnv->suspHighTimeHist ); 24.65 + 24.66 + #define MEAS__Print_Hists_for_Susp_Meas \ 24.67 + printHist( _PRMasterEnv->pluginTimeHist ); 24.68 + 24.69 +#else 24.70 + #define MEAS__Insert_Susp_Meas_Fields_into_Slave 24.71 + #define MEAS__Insert_Susp_Meas_Fields_into_MasterEnv 24.72 + #define MEAS__Make_Meas_Hists_for_Susp_Meas 24.73 + #define MEAS__Capture_Pre_Susp_Point 24.74 + #define MEAS__Capture_Post_Susp_Point 24.75 + #define MEAS__Print_Hists_for_Susp_Meas 24.76 +#endif 24.77 + 24.78 +#ifdef MEAS__TURN_ON_MASTER_MEAS 24.79 + #define MEAS__Insert_Master_Meas_Fields_into_Slave \ 24.80 + uint32 startMasterTSCLow; \ 24.81 + uint32 endMasterTSCLow; 24.82 + 24.83 + #define MEAS__Insert_Master_Meas_Fields_into_MasterEnv \ 24.84 + Histogram *masterLowTimeHist; \ 24.85 + Histogram *masterHighTimeHist; 24.86 + 24.87 + #define MEAS__Make_Meas_Hists_for_Master_Meas \ 24.88 + _PRMasterEnv->masterLowTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 24.89 + "master_low_time_hist");\ 24.90 + _PRMasterEnv->masterHighTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 24.91 + "master_high_time_hist"); 24.92 + 24.93 + //Total Master time includes one coreloop time -- just assume the core 24.94 + // loop time is same for Master as for AppSlvs, even though it may be 24.95 + // smaller due to higher predictability of the fixed jmp. 
24.96 + #define MEAS__Capture_Pre_Master_Point\ 24.97 + saveLowTimeStampCountInto( masterVP->startMasterTSCLow ); 24.98 + 24.99 + #define MEAS__Capture_Post_Master_Point \ 24.100 + saveLowTimeStampCountInto( masterVP->endMasterTSCLow );\ 24.101 + addIntervalToHist( startMasterTSCLow, endMasterTSCLow,\ 24.102 + _PRMasterEnv->masterLowTimeHist ); \ 24.103 + addIntervalToHist( startMasterTSCLow, endMasterTSCLow,\ 24.104 + _PRMasterEnv->masterHighTimeHist ); 24.105 + 24.106 + #define MEAS__Print_Hists_for_Master_Meas \ 24.107 + printHist( _PRMasterEnv->pluginTimeHist ); 24.108 + 24.109 +#else 24.110 + #define MEAS__Insert_Master_Meas_Fields_into_Slave 24.111 + #define MEAS__Insert_Master_Meas_Fields_into_MasterEnv 24.112 + #define MEAS__Make_Meas_Hists_for_Master_Meas 24.113 + #define MEAS__Capture_Pre_Master_Point 24.114 + #define MEAS__Capture_Post_Master_Point 24.115 + #define MEAS__Print_Hists_for_Master_Meas 24.116 +#endif 24.117 + 24.118 + 24.119 +#ifdef MEAS__TURN_ON_MASTER_LOCK_MEAS 24.120 + #define MEAS__Insert_Master_Lock_Meas_Fields_into_MasterEnv \ 24.121 + Histogram *masterLockLowTimeHist; \ 24.122 + Histogram *masterLockHighTimeHist; 24.123 + 24.124 + #define MEAS__Make_Meas_Hists_for_Master_Lock_Meas \ 24.125 + _PRMasterEnv->masterLockLowTimeHist = makeFixedBinHist( 50, 0, 2, \ 24.126 + "master lock low time hist");\ 24.127 + _PRMasterEnv->masterLockHighTimeHist = makeFixedBinHist( 50, 0, 100,\ 24.128 + "master lock high time hist"); 24.129 + 24.130 + #define MEAS__Capture_Pre_Master_Lock_Point \ 24.131 + int32 startStamp, endStamp; \ 24.132 + saveLowTimeStampCountInto( startStamp ); 24.133 + 24.134 + #define MEAS__Capture_Post_Master_Lock_Point \ 24.135 + saveLowTimeStampCountInto( endStamp ); \ 24.136 + addIntervalToHist( startStamp, endStamp,\ 24.137 + _PRMasterEnv->masterLockLowTimeHist ); \ 24.138 + addIntervalToHist( startStamp, endStamp,\ 24.139 + _PRMasterEnv->masterLockHighTimeHist ); 24.140 + 24.141 + #define 
MEAS__Print_Hists_for_Master_Lock_Meas \ 24.142 + printHist( _PRMasterEnv->masterLockLowTimeHist ); \ 24.143 + printHist( _PRMasterEnv->masterLockHighTimeHist ); 24.144 + 24.145 +#else 24.146 + #define MEAS__Insert_Master_Lock_Meas_Fields_into_MasterEnv 24.147 + #define MEAS__Make_Meas_Hists_for_Master_Lock_Meas 24.148 + #define MEAS__Capture_Pre_Master_Lock_Point 24.149 + #define MEAS__Capture_Post_Master_Lock_Point 24.150 + #define MEAS__Print_Hists_for_Master_Lock_Meas 24.151 +#endif 24.152 + 24.153 + 24.154 +#ifdef MEAS__TURN_ON_MALLOC_MEAS 24.155 + #define MEAS__Insert_Malloc_Meas_Fields_into_MasterEnv\ 24.156 + Histogram *mallocTimeHist; \ 24.157 + Histogram *freeTimeHist; 24.158 + 24.159 + #define MEAS__Make_Meas_Hists_for_Malloc_Meas \ 24.160 + _PRMasterEnv->mallocTimeHist = makeFixedBinHistExt( 100, 0, 30,\ 24.161 + "malloc_time_hist");\ 24.162 + _PRMasterEnv->freeTimeHist = makeFixedBinHistExt( 100, 0, 30,\ 24.163 + "free_time_hist"); 24.164 + 24.165 + #define MEAS__Capture_Pre_Malloc_Point \ 24.166 + int32 startStamp, endStamp; \ 24.167 + saveLowTimeStampCountInto( startStamp ); 24.168 + 24.169 + #define MEAS__Capture_Post_Malloc_Point \ 24.170 + saveLowTimeStampCountInto( endStamp ); \ 24.171 + addIntervalToHist( startStamp, endStamp,\ 24.172 + _PRMasterEnv->mallocTimeHist ); 24.173 + 24.174 + #define MEAS__Capture_Pre_Free_Point \ 24.175 + int32 startStamp, endStamp; \ 24.176 + saveLowTimeStampCountInto( startStamp ); 24.177 + 24.178 + #define MEAS__Capture_Post_Free_Point \ 24.179 + saveLowTimeStampCountInto( endStamp ); \ 24.180 + addIntervalToHist( startStamp, endStamp,\ 24.181 + _PRMasterEnv->freeTimeHist ); 24.182 + 24.183 + #define MEAS__Print_Hists_for_Malloc_Meas \ 24.184 + printHist( _PRMasterEnv->mallocTimeHist ); \ 24.185 + saveHistToFile( _PRMasterEnv->mallocTimeHist ); \ 24.186 + printHist( _PRMasterEnv->freeTimeHist ); \ 24.187 + saveHistToFile( _PRMasterEnv->freeTimeHist ); \ 24.188 + freeHistExt( _PRMasterEnv->mallocTimeHist ); \ 24.189 
+ freeHistExt( _PRMasterEnv->freeTimeHist ); 24.190 + 24.191 +#else 24.192 + #define MEAS__Insert_Malloc_Meas_Fields_into_MasterEnv 24.193 + #define MEAS__Make_Meas_Hists_for_Malloc_Meas 24.194 + #define MEAS__Capture_Pre_Malloc_Point 24.195 + #define MEAS__Capture_Post_Malloc_Point 24.196 + #define MEAS__Capture_Pre_Free_Point 24.197 + #define MEAS__Capture_Post_Free_Point 24.198 + #define MEAS__Print_Hists_for_Malloc_Meas 24.199 +#endif 24.200 + 24.201 + 24.202 + 24.203 +#ifdef MEAS__TURN_ON_PLUGIN_MEAS 24.204 + #define MEAS__Insert_Plugin_Meas_Fields_into_MasterEnv \ 24.205 + Histogram *reqHdlrLowTimeHist; \ 24.206 + Histogram *reqHdlrHighTimeHist; 24.207 + 24.208 + #define MEAS__Make_Meas_Hists_for_Plugin_Meas \ 24.209 + _PRMasterEnv->reqHdlrLowTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 24.210 + "plugin_low_time_hist");\ 24.211 + _PRMasterEnv->reqHdlrHighTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 24.212 + "plugin_high_time_hist"); 24.213 + 24.214 + #define MEAS__startReqHdlr \ 24.215 + int32 startStamp1, endStamp1; \ 24.216 + saveLowTimeStampCountInto( startStamp1 ); 24.217 + 24.218 + #define MEAS__endReqHdlr \ 24.219 + saveLowTimeStampCountInto( endStamp1 ); \ 24.220 + addIntervalToHist( startStamp1, endStamp1, \ 24.221 + _PRMasterEnv->reqHdlrLowTimeHist ); \ 24.222 + addIntervalToHist( startStamp1, endStamp1, \ 24.223 + _PRMasterEnv->reqHdlrHighTimeHist ); 24.224 + 24.225 + #define MEAS__Print_Hists_for_Plugin_Meas \ 24.226 + printHist( _PRMasterEnv->reqHdlrLowTimeHist ); \ 24.227 + saveHistToFile( _PRMasterEnv->reqHdlrLowTimeHist ); \ 24.228 + printHist( _PRMasterEnv->reqHdlrHighTimeHist ); \ 24.229 + saveHistToFile( _PRMasterEnv->reqHdlrHighTimeHist ); \ 24.230 + freeHistExt( _PRMasterEnv->reqHdlrLowTimeHist ); \ 24.231 + freeHistExt( _PRMasterEnv->reqHdlrHighTimeHist ); 24.232 +#else 24.233 + #define MEAS__Insert_Plugin_Meas_Fields_into_MasterEnv 24.234 + #define MEAS__Make_Meas_Hists_for_Plugin_Meas 24.235 + #define MEAS__startReqHdlr 24.236 + 
#define MEAS__endReqHdlr 24.237 + #define MEAS__Print_Hists_for_Plugin_Meas 24.238 + 24.239 +#endif 24.240 + 24.241 + 24.242 +#ifdef MEAS__TURN_ON_SYSTEM_MEAS 24.243 + #define MEAS__Insert_System_Meas_Fields_into_Slave \ 24.244 + TSCountLowHigh startSusp; \ 24.245 + uint64 totalSuspCycles; \ 24.246 + uint32 numGoodSusp; 24.247 + 24.248 + #define MEAS__Insert_System_Meas_Fields_into_MasterEnv \ 24.249 + TSCountLowHigh startMaster; \ 24.250 + uint64 totalMasterCycles; \ 24.251 + uint32 numMasterAnimations; \ 24.252 + TSCountLowHigh startReqHdlr; \ 24.253 + uint64 totalPluginCycles; \ 24.254 + uint32 numPluginAnimations; \ 24.255 + uint64 cyclesTillStartAnimationMaster; \ 24.256 + TSCountLowHigh endAnimationMaster; 24.257 + 24.258 + #define MEAS__startAnimationMaster_forSys \ 24.259 + TSCountLowHigh startStamp1, endStamp1; \ 24.260 + saveTSCLowHigh( endStamp1 ); \ 24.261 + _PRMasterEnv->cyclesTillStartAnimationMaster = \ 24.262 + endStamp1.longVal - masterVP->startSusp.longVal; 24.263 + 24.264 + #define Meas_startReqHdlr_forSys \ 24.265 + saveTSCLowHigh( startStamp1 ); \ 24.266 + _PRMasterEnv->startReqHdlr.longVal = startStamp1.longVal; 24.267 + 24.268 + #define MEAS__endAnimationMaster_forSys \ 24.269 + saveTSCLowHigh( startStamp1 ); \ 24.270 + _PRMasterEnv->endAnimationMaster.longVal = startStamp1.longVal; 24.271 + 24.272 + /*A TSC is stored in VP first thing inside wrapper-lib 24.273 + * Now, measures cycles from there to here 24.274 + * Master and Plugin will add this value to other trace-seg measures 24.275 + */ 24.276 + #define MEAS__Capture_End_Susp_in_CoreCtlr_ForSys\ 24.277 + saveTSCLowHigh(endSusp); \ 24.278 + numCycles = endSusp.longVal - currVP->startSusp.longVal; \ 24.279 + /*sanity check (400K is about 20K iters)*/ \ 24.280 + if( numCycles < 400000 ) \ 24.281 + { currVP->totalSuspCycles += numCycles; \ 24.282 + currVP->numGoodSusp++; \ 24.283 + } \ 24.284 + /*recorded every time, but only read if currVP == MasterVP*/ \ 24.285 + 
_PRMasterEnv->startMaster.longVal = endSusp.longVal; 24.286 + 24.287 +#else 24.288 + #define MEAS__Insert_System_Meas_Fields_into_Slave 24.289 + #define MEAS__Insert_System_Meas_Fields_into_MasterEnv 24.290 + #define MEAS__Make_Meas_Hists_for_System_Meas 24.291 + #define MEAS__startAnimationMaster_forSys 24.292 + #define MEAS__startReqHdlr_forSys 24.293 + #define MEAS__endAnimationMaster_forSys 24.294 + #define MEAS__Capture_End_Susp_in_CoreCtlr_ForSys 24.295 + #define MEAS__Print_Hists_for_System_Meas 24.296 +#endif 24.297 + 24.298 +#ifdef HOLISTIC__TURN_ON_PERF_COUNTERS 24.299 + 24.300 + #define MEAS__Insert_Counter_Handler \ 24.301 + typedef void (*CounterHandler) (int,int,int,SlaveVP*,uint64,uint64,uint64); 24.302 + 24.303 + enum eventType { 24.304 + DebugEvt = 0, 24.305 + AppResponderInvocation_start, 24.306 + AppResponder_start, 24.307 + AppResponder_end, 24.308 + AssignerInvocation_start, 24.309 + NextAssigner_start, 24.310 + Assigner_start, 24.311 + Assigner_end, 24.312 + Work_start, 24.313 + Work_end, 24.314 + HwResponderInvocation_start, 24.315 + Timestamp_start, 24.316 + Timestamp_end 24.317 + }; 24.318 + 24.319 + #define saveCyclesAndInstrs(core,cycles,instrs,cachem) do{ \ 24.320 + int cycles_fd = _PRMasterEnv->cycles_counter_fd[core]; \ 24.321 + int instrs_fd = _PRMasterEnv->instrs_counter_fd[core]; \ 24.322 + int cachem_fd = _PRMasterEnv->cachem_counter_fd[core]; \ 24.323 + int nread; \ 24.324 + \ 24.325 + nread = read(cycles_fd,&(cycles),sizeof(cycles)); \ 24.326 + if(nread<0){ \ 24.327 + perror("Error reading cycles counter"); \ 24.328 + cycles = 0; \ 24.329 + } \ 24.330 + \ 24.331 + nread = read(instrs_fd,&(instrs),sizeof(instrs)); \ 24.332 + if(nread<0){ \ 24.333 + perror("Error reading cycles counter"); \ 24.334 + instrs = 0; \ 24.335 + } \ 24.336 + nread = read(cachem_fd,&(cachem),sizeof(cachem)); \ 24.337 + if(nread<0){ \ 24.338 + perror("Error reading last level cache miss counter"); \ 24.339 + cachem = 0; \ 24.340 + } \ 24.341 + } while (0) 
24.342 + 24.343 + #define MEAS__Insert_Counter_Meas_Fields_into_MasterEnv \ 24.344 + int cycles_counter_fd[NUM_CORES]; \ 24.345 + int instrs_counter_fd[NUM_CORES]; \ 24.346 + int cachem_counter_fd[NUM_CORES]; \ 24.347 + uint64 start_master_lock[NUM_CORES][3]; \ 24.348 + CounterHandler counterHandler; 24.349 + 24.350 + #define HOLISTIC__Setup_Perf_Counters setup_perf_counters(); 24.351 + 24.352 + 24.353 + #define HOLISTIC__CoreCtrl_Setup \ 24.354 + CounterHandler counterHandler = _PRMasterEnv->counterHandler; \ 24.355 + SlaveVP *lastVPBeforeMaster = NULL; \ 24.356 + /*if(thisCoresThdParams->coreNum == 0){ \ 24.357 + uint64 initval = tsc_offset_send(thisCoresThdParams,0); \ 24.358 + while(!coreCtlrThdParams[NUM_CORES - 2]->ret_tsc); \ 24.359 + } \ 24.360 + if(0 < (thisCoresThdParams->coreNum) && (thisCoresThdParams->coreNum) < (NUM_CORES - 1)){ \ 24.361 + ThdParams* sendCoresThdParams = coreCtlrThdParams[thisCoresThdParams->coreNum - 1]; \ 24.362 + int sndctr = tsc_offset_resp(sendCoresThdParams, 0); \ 24.363 + uint64 initval = tsc_offset_send(thisCoresThdParams,0); \ 24.364 + while(!coreCtlrThdParams[NUM_CORES - 2]->ret_tsc); \ 24.365 + } \ 24.366 + if(thisCoresThdParams->coreNum == (NUM_CORES - 1)){ \ 24.367 + ThdParams* sendCoresThdParams = coreCtlrThdParams[thisCoresThdParams->coreNum - 1]; \ 24.368 + int sndctr = tsc_offset_resp(sendCoresThdParams,0); \ 24.369 + }*/ 24.370 + 24.371 + 24.372 + #define HOLISTIC__Insert_Master_Global_Vars \ 24.373 + int vpid,task; \ 24.374 + CounterHandler counterHandler = masterEnv->counterHandler; 24.375 + 24.376 + #define HOLISTIC__Record_last_work lastVPBeforeMaster = currVP; 24.377 + 24.378 + #define HOLISTIC__Record_AppResponderInvocation_start \ 24.379 + uint64 cycles,instrs,cachem; \ 24.380 + saveCyclesAndInstrs(thisCoresIdx,cycles, instrs,cachem); \ 24.381 + if(lastVPBeforeMaster){ \ 24.382 + 
(*counterHandler)(AppResponderInvocation_start,lastVPBeforeMaster->slaveID,lastVPBeforeMaster->assignCount,lastVPBeforeMaster,cycles,instrs,cachem); \ 24.383 + lastVPBeforeMaster = NULL; \ 24.384 + } else { \ 24.385 + _PRMasterEnv->start_master_lock[thisCoresIdx][0] = cycles; \ 24.386 + _PRMasterEnv->start_master_lock[thisCoresIdx][1] = instrs; \ 24.387 + _PRMasterEnv->start_master_lock[thisCoresIdx][2] = cachem; \ 24.388 + } 24.389 + 24.390 + /* Request Handler may call resume() on the VP, but we want to 24.391 + * account the whole interval to the same task. Therefore, need 24.392 + * to save task ID at the beginning. 24.393 + * 24.394 + * Using this value as "end of AppResponder Invocation Time" 24.395 + * is possible if there is only one SchedSlot per core - 24.396 + * invoking processor is last to be treated here! If more than 24.397 + * one slot, MasterLoop processing time for all but the last VP 24.398 + * would be erroneously counted as invocation time. 24.399 + */ 24.400 + #define HOLISTIC__Record_AppResponder_start \ 24.401 + vpid = currSlot->slaveAssignedToSlot->slaveID; \ 24.402 + task = currSlot->slaveAssignedToSlot->assignCount; \ 24.403 + uint64 cycles, instrs, cachem; \ 24.404 + saveCyclesAndInstrs(thisCoresIdx,cycles, instrs,cachem); \ 24.405 + (*counterHandler)(AppResponder_start,vpid,task,currSlot->slaveAssignedToSlot,cycles,instrs,cachem); 24.406 + 24.407 + #define HOLISTIC__Record_AppResponder_end \ 24.408 + uint64 cycles2,instrs2,cachem2; \ 24.409 + saveCyclesAndInstrs(thisCoresIdx,cycles2, instrs2,cachem2); \ 24.410 + (*counterHandler)(AppResponder_end,vpid,task,currSlot->slaveAssignedToSlot,cycles2,instrs2,cachem2); \ 24.411 + (*counterHandler)(Timestamp_end,vpid,task,currSlot->slaveAssignedToSlot,rdtsc(),0,0); 24.412 + 24.413 + 24.414 + /* Don't know who to account time to yet - goes to assigned VP 24.415 + * after the call. 
24.416 + */ 24.417 + #define HOLISTIC__Record_Assigner_start \ 24.418 + int empty = FALSE; \ 24.419 + if(currSlot->slaveAssignedToSlot == NULL){ \ 24.420 + empty= TRUE; \ 24.421 + } \ 24.422 + uint64 tmp_cycles, tmp_instrs, tmp_cachem; \ 24.423 + saveCyclesAndInstrs(thisCoresIdx,tmp_cycles,tmp_instrs,tmp_cachem); \ 24.424 + uint64 tsc = rdtsc(); \ 24.425 + if(vpid > 0) { \ 24.426 + (*counterHandler)(NextAssigner_start,vpid,task,currSlot->slaveAssignedToSlot,tmp_cycles,tmp_instrs,tmp_cachem); \ 24.427 + vpid = 0; \ 24.428 + task = 0; \ 24.429 + } 24.430 + 24.431 + #define HOLISTIC__Record_Assigner_end \ 24.432 + uint64 cycles,instrs,cachem; \ 24.433 + saveCyclesAndInstrs(thisCoresIdx,cycles,instrs,cachem); \ 24.434 + if(empty){ \ 24.435 + (*counterHandler)(AssignerInvocation_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,masterEnv->start_master_lock[thisCoresIdx][0],masterEnv->start_master_lock[thisCoresIdx][1],masterEnv->start_master_lock[thisCoresIdx][2]); \ 24.436 + } \ 24.437 + (*counterHandler)(Timestamp_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,tsc,0,0); \ 24.438 + (*counterHandler)(Assigner_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,tmp_cycles,tmp_instrs,tmp_cachem); \ 24.439 + (*counterHandler)(Assigner_end,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,cycles,instrs,tmp_cachem); 24.440 + 24.441 + #define HOLISTIC__Record_Work_start \ 24.442 + if(currVP){ \ 24.443 + uint64 cycles,instrs,cachem; \ 24.444 + saveCyclesAndInstrs(thisCoresIdx,cycles, instrs,cachem); \ 24.445 + (*counterHandler)(Work_start,currVP->slaveID,currVP->assignCount,currVP,cycles,instrs,cachem); \ 24.446 + } 24.447 + 24.448 + #define HOLISTIC__Record_Work_end \ 24.449 + if(currVP){ \ 24.450 + uint64 cycles,instrs,cachem; \ 24.451 + saveCyclesAndInstrs(thisCoresIdx,cycles, instrs,cachem); \ 24.452 + 
(*counterHandler)(Work_end,currVP->slaveID,currVP->assignCount,currVP,cycles,instrs,cachem); \ 24.453 + } 24.454 + 24.455 + #define HOLISTIC__Record_HwResponderInvocation_start \ 24.456 + uint64 cycles,instrs,cachem; \ 24.457 + saveCyclesAndInstrs(animatingSlv->coreAnimatedBy,cycles, instrs,cachem); \ 24.458 + (*(_PRMasterEnv->counterHandler))(HwResponderInvocation_start,animatingSlv->slaveID,animatingSlv->assignCount,animatingSlv,cycles,instrs,cachem); 24.459 + 24.460 + 24.461 + #define getReturnAddressBeforeLibraryCall(vp_ptr, res_ptr) do{ \ 24.462 +void* frame_ptr0 = vp_ptr->framePtr; \ 24.463 +void* frame_ptr1 = *((void**)frame_ptr0); \ 24.464 +void* frame_ptr2 = *((void**)frame_ptr1); \ 24.465 +void* frame_ptr3 = *((void**)frame_ptr2); \ 24.466 +void* ret_addr = *((void**)frame_ptr3 + 1); \ 24.467 +*res_ptr = ret_addr; \ 24.468 +} while (0) 24.469 + 24.470 +#else 24.471 + #define MEAS__Insert_Counter_Handler 24.472 + #define MEAS__Insert_Counter_Meas_Fields_into_MasterEnv 24.473 + #define HOLISTIC__Setup_Perf_Counters 24.474 + #define HOLISTIC__CoreCtrl_Setup 24.475 + #define HOLISTIC__Insert_Master_Global_Vars 24.476 + #define HOLISTIC__Record_last_work 24.477 + #define HOLISTIC__Record_AppResponderInvocation_start 24.478 + #define HOLISTIC__Record_AppResponder_start 24.479 + #define HOLISTIC__Record_AppResponder_end 24.480 + #define HOLISTIC__Record_Assigner_start 24.481 + #define HOLISTIC__Record_Assigner_end 24.482 + #define HOLISTIC__Record_Work_start 24.483 + #define HOLISTIC__Record_Work_end 24.484 + #define HOLISTIC__Record_HwResponderInvocation_start 24.485 + #define getReturnAddressBeforeLibraryCall(vp_ptr, res_ptr) 24.486 +#endif 24.487 + 24.488 +//Experiment in two-step macros -- if doesn't work, insert each separately 24.489 +#define MEAS__Insert_Meas_Fields_into_Slave \ 24.490 + MEAS__Insert_Susp_Meas_Fields_into_Slave \ 24.491 + MEAS__Insert_Master_Meas_Fields_into_Slave \ 24.492 + MEAS__Insert_System_Meas_Fields_into_Slave 24.493 + 24.494 + 
24.495 +//====================== Histogram Macros -- Create ======================== 24.496 +// 24.497 +// 24.498 + 24.499 +//The language implementation should include a definition of this macro, 24.500 +// which creates all the histograms the language uses to collect measurements 24.501 +// of plugin operation -- so, if the language didn't define it, must 24.502 +// define it here (as empty), to avoid compile error 24.503 +#ifndef MEAS__Make_Meas_Hists_for_Language 24.504 +#define MEAS__Make_Meas_Hists_for_Language 24.505 +#endif 24.506 + 24.507 +#define makeAMeasHist( idx, name, numBins, startVal, binWidth ) \ 24.508 + makeHighestDynArrayIndexBeAtLeast( _PRMasterEnv->measHistsInfo, idx ); \ 24.509 + _PRMasterEnv->measHists[idx] = \ 24.510 + makeFixedBinHist( numBins, startVal, binWidth, name ); 24.511 + 24.512 +//============================== Probes =================================== 24.513 + 24.514 + 24.515 +//=========================================================================== 24.516 +#endif /* _PR_MEAS_MACROS_H */ 24.517 +
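Review note on `MEAS__macros.h`: every measurement comes as a pre/post macro pair that expands to real code when its `MEAS__TURN_ON_...` switch is defined and to nothing otherwise. The pre macro declares the stamp variables, which is why the header warns to use only one pair per set of curly braces. The sketch below illustrates that pattern in isolation. It assumes a fake counter in place of the hardware time-stamp counter so that it runs anywhere, and all `sk_`/`SK_` names and the `SkHist` type are hypothetical, not part of PR. It uses unsigned 32-bit stamps, so that the difference stays correct when the low half of the counter wraps, whereas the generic macros above declare the stamps as `int32`.

```c
#include <stdint.h>

#define SK_TURN_ON_MEAS            /* hypothetical switch; remove to compile the probes out */

/* Tiny fixed-bin histogram standing in for PR's Histogram type. */
typedef struct
 { uint32_t bins[8];
   uint32_t binWidth;
   uint32_t dropped;               /* intervals beyond the last bin */
 } SkHist;

static uint64_t sk_fake_tsc = 0;   /* stand-in for the hardware counter */

static uint32_t
sk_read_low_tsc( void )
 { return (uint32_t)sk_fake_tsc;   /* keep only the low 32 bits */
 }

static void
sk_add_interval( uint32_t start, uint32_t end, SkHist *hist )
 { uint32_t diff = end - start;    /* unsigned subtraction is modulo 2^32 */
   uint32_t idx  = diff / hist->binWidth;
   if( idx < 8 ) hist->bins[idx]++;
   else          hist->dropped++;  /* sanity check: discard outliers */
 }

#ifdef SK_TURN_ON_MEAS
   /* Declares the stamps, so only one pair may appear per scope. */
   #define SK_Capture_Pre_Point \
      uint32_t skStart, skEnd; \
      skStart = sk_read_low_tsc();

   #define SK_Capture_Post_Point( hist ) \
      skEnd = sk_read_low_tsc(); \
      sk_add_interval( skStart, skEnd, (hist) );
#else
   #define SK_Capture_Pre_Point
   #define SK_Capture_Post_Point( hist )
#endif

/* Example of a measured region; 'cost' simulates elapsed cycles. */
static void
sk_measured_work( SkHist *hist, uint64_t cost )
 { SK_Capture_Pre_Point
   sk_fake_tsc += cost;
   SK_Capture_Post_Point( hist )
 }
```

Because the disabled variants are empty, a call site such as `sk_measured_work` compiles to the bare work when the switch is off, which is the zero-overhead property the header is after.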
25.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 25.2 +++ b/Services_Offered_by_PR/Measurement_and_Stats/probes.c Wed Sep 19 23:12:44 2012 -0700 25.3 @@ -0,0 +1,304 @@ 25.4 +/* 25.5 + * Copyright 2010 OpenSourceStewardshipFoundation 25.6 + * 25.7 + * Licensed under BSD 25.8 + */ 25.9 + 25.10 +#include <stdio.h> 25.11 +#include <malloc.h> 25.12 +#include <sys/time.h> 25.13 + 25.14 +#include "PR_impl/PR.h" 25.15 + 25.16 + 25.17 + 25.18 +//==================== Probes ================= 25.19 +/* 25.20 + * In practice, probe operations are called from the app, from inside slaves 25.21 + * -- so have to be sure each probe is single-Slv owned, and be sure that 25.22 + * any place common structures are modified it's done inside the master. 25.23 + * So -- the only place common structures are modified is during creation. 25.24 + * after that, all mods are to individual instances. 25.25 + * 25.26 + * Thinking perhaps should change the semantics to be that probes are 25.27 + * attached to the virtual processor -- and then everything is guaranteed 25.28 + * to be isolated -- except then can't take any intervals that span Slvs, 25.29 + * and would have to transfer the probes to Master env when Slv dissipates.. 25.30 + * gets messy..
25.31 + * 25.32 + * For now, just making so that probe creation causes a suspend, so that 25.33 + * the dynamic array in the master env is only modified from the master 25.34 + * 25.35 + */ 25.36 + 25.37 +//============================ Helpers =========================== 25.38 +inline void 25.39 +doNothing() 25.40 + { 25.41 + } 25.42 + 25.43 +float64 inline 25.44 +giveInterval( struct timeval _start, struct timeval _end ) 25.45 + { float64 start, end; 25.46 + start = _start.tv_sec + _start.tv_usec / 1000000.0; 25.47 + end = _end.tv_sec + _end.tv_usec / 1000000.0; 25.48 + return end - start; 25.49 + } 25.50 + 25.51 +//================================================================= 25.52 +IntervalProbe * 25.53 +create_generic_probe( char *nameStr, SlaveVP *animSlv ) 25.54 + { 25.55 + PRSemReq reqData; 25.56 + 25.57 + reqData.reqType = make_probe; 25.58 + reqData.nameStr = nameStr; 25.59 + 25.60 + PR_WL__send_PRSem_request( &reqData, animSlv ); 25.61 + 25.62 + return animSlv->dataRetFromReq; 25.63 + } 25.64 + 25.65 +/*Use this version from outside PR -- it uses external malloc, and modifies 25.66 + * dynamic array, so can't be animated in a slave Slv 25.67 + */ 25.68 +IntervalProbe * 25.69 +ext__create_generic_probe( char *nameStr ) 25.70 + { IntervalProbe *newProbe; 25.71 + int32 nameLen; 25.72 + 25.73 + newProbe = malloc( sizeof(IntervalProbe) ); 25.74 + nameLen = strlen( nameStr ); 25.75 + newProbe->nameStr = malloc( nameLen ); 25.76 + memcpy( newProbe->nameStr, nameStr, nameLen ); 25.77 + newProbe->hist = NULL; 25.78 + newProbe->schedChoiceWasRecorded = FALSE; 25.79 + newProbe->probeID = 25.80 + addToDynArray( newProbe, _PRMasterEnv->dynIntervalProbesInfo ); 25.81 + 25.82 + return newProbe; 25.83 + } 25.84 + 25.85 +//============================ Fns def in header ======================= 25.86 + 25.87 +int32 25.88 +PR_impl__create_single_interval_probe( char *nameStr, SlaveVP *animSlv ) 25.89 + { IntervalProbe *newProbe; 25.90 + 25.91 + newProbe = 
create_generic_probe( nameStr, animSlv ); 25.92 + 25.93 + return newProbe->probeID; 25.94 + } 25.95 + 25.96 +int32 25.97 +PR_impl__create_histogram_probe( int32 numBins, float64 startValue, 25.98 + float64 binWidth, char *nameStr, SlaveVP *animSlv ) 25.99 + { IntervalProbe *newProbe; 25.100 + 25.101 + newProbe = create_generic_probe( nameStr, animSlv ); 25.102 + 25.103 +#ifdef PROBES__USE_TIME_OF_DAY_PROBES 25.104 + DblHist *hist; 25.105 + hist = makeDblHistogram( numBins, startValue, binWidth ); 25.106 +#else 25.107 + Histogram *hist; 25.108 + hist = makeHistogram( numBins, startValue, binWidth ); 25.109 +#endif 25.110 + newProbe->hist = hist; 25.111 + return newProbe->probeID; 25.112 + } 25.113 + 25.114 + 25.115 +int32 25.116 +PR_impl__record_time_point_into_new_probe( char *nameStr, SlaveVP *animSlv) 25.117 + { IntervalProbe *newProbe; 25.118 + struct timeval *startStamp; 25.119 + float64 startSecs; 25.120 + 25.121 + newProbe = create_generic_probe( nameStr, animSlv ); 25.122 + newProbe->endSecs = 0; 25.123 + 25.124 + 25.125 + gettimeofday( &(newProbe->startStamp), NULL); 25.126 + 25.127 + //turn into a double 25.128 + startStamp = &(newProbe->startStamp); 25.129 + startSecs = startStamp->tv_sec + ( startStamp->tv_usec / 1000000.0 ); 25.130 + newProbe->startSecs = startSecs; 25.131 + 25.132 + return newProbe->probeID; 25.133 + } 25.134 + 25.135 +int32 25.136 +PR_ext_impl__record_time_point_into_new_probe( char *nameStr ) 25.137 + { IntervalProbe *newProbe; 25.138 + struct timeval *startStamp; 25.139 + float64 startSecs; 25.140 + 25.141 + newProbe = ext__create_generic_probe( nameStr ); 25.142 + newProbe->endSecs = 0; 25.143 + 25.144 + gettimeofday( &(newProbe->startStamp), NULL); 25.145 + 25.146 + //turn into a double 25.147 + startStamp = &(newProbe->startStamp); 25.148 + startSecs = startStamp->tv_sec + ( startStamp->tv_usec / 1000000.0 ); 25.149 + newProbe->startSecs = startSecs; 25.150 + 25.151 + return newProbe->probeID; 25.152 + } 25.153 + 25.154 + 25.155 
+/*Only call from inside master or main startup/shutdown thread 25.156 + */ 25.157 +void 25.158 +PR_impl__free_probe( IntervalProbe *probe ) 25.159 + { if( probe->hist != NULL ) freeDblHist( probe->hist ); 25.160 + if( probe->nameStr != NULL) PR_int__free( probe->nameStr ); 25.161 + PR_int__free( probe ); 25.162 + } 25.163 + 25.164 + 25.165 +void 25.166 +PR_impl__index_probe_by_its_name( int32 probeID, SlaveVP *animSlv ) 25.167 + { IntervalProbe *probe; 25.168 + 25.169 + PR_int__get_master_lock(); 25.170 + probe = _PRMasterEnv->intervalProbes[ probeID ]; 25.171 + 25.172 + addValueIntoTable(probe->nameStr, probe, _PRMasterEnv->probeNameHashTbl); 25.173 + PR_int__release_master_lock(); 25.174 + } 25.175 + 25.176 + 25.177 +IntervalProbe * 25.178 +PR_impl__get_probe_by_name( char *probeName, SlaveVP *animSlv ) 25.179 + { 25.180 + //TODO: fix this To be in Master -- race condition 25.181 + return getValueFromTable( probeName, _PRMasterEnv->probeNameHashTbl ); 25.182 + } 25.183 + 25.184 + 25.185 +/*Everything is local to the animating slaveVP, so no need for request, do 25.186 + * work locally, in the anim Slv 25.187 + */ 25.188 +void 25.189 +PR_impl__record_sched_choice_into_probe( int32 probeID, SlaveVP *animatingSlv ) 25.190 + { IntervalProbe *probe; 25.191 + 25.192 + probe = _PRMasterEnv->intervalProbes[ probeID ]; 25.193 + probe->schedChoiceWasRecorded = TRUE; 25.194 + probe->coreNum = animatingSlv->coreAnimatedBy; 25.195 + probe->slaveID = animatingSlv->slaveID; 25.196 + probe->slaveCreateSecs = animatingSlv->createPtInSecs; 25.197 + } 25.198 + 25.199 +/*Everything is local to the animating slaveVP, so no need for request, do 25.200 + * work locally, in the anim Slv 25.201 + */ 25.202 +void 25.203 +PR_impl__record_interval_start_in_probe( int32 probeID ) 25.204 + { IntervalProbe *probe; 25.205 + 25.206 + DEBUG__printf( dbgProbes, "record start of interval" ) 25.207 + probe = _PRMasterEnv->intervalProbes[ probeID ]; 25.208 + 25.209 + //record *start* point as last 
thing, after lookup 25.210 +#ifdef PROBES__USE_TIME_OF_DAY_PROBES 25.211 + gettimeofday( &(probe->startStamp), NULL); 25.212 +#endif 25.213 +#ifdef PROBES__USE_TSC_PROBES 25.214 + probe->startStamp = getTSCount(); 25.215 +#endif 25.216 + } 25.217 + 25.218 + 25.219 +/*Everything is local to the animating slaveVP, except the histogram, so do 25.220 + * work locally, in the anim Slv -- may lose a few histogram counts 25.221 + * 25.222 + *This should be safe to run inside SlaveVP 25.223 + */ 25.224 +void 25.225 +PR_impl__record_interval_end_in_probe( int32 probeID ) 25.226 + { IntervalProbe *probe; 25.227 + 25.228 + //Record first thing -- before looking up the probe to store it into 25.229 +#ifdef PROBES__USE_TIME_OF_DAY_PROBES 25.230 + struct timeval endStamp; 25.231 + gettimeofday( &(endStamp), NULL); 25.232 +#endif 25.233 +#ifdef PROBES__USE_TSC_PROBES 25.234 + TSCount endStamp, interval; 25.235 + endStamp = getTSCount(); 25.236 +#endif 25.237 +#ifdef PROBES__USE_PERF_CTR_PROBES 25.238 + 25.239 +#endif 25.240 + 25.241 + probe = _PRMasterEnv->intervalProbes[ probeID ]; 25.242 + 25.243 +#ifdef PROBES__USE_TIME_OF_DAY_PROBES 25.244 + if( probe->hist != NULL ) 25.245 + { addToDblHist( giveInterval( probe->startStamp, endStamp), probe->hist ); 25.246 + } 25.247 +#endif 25.248 +#ifdef PROBES__USE_TSC_PROBES 25.249 + if( probe->hist != NULL ) 25.250 + { interval = probe->endStamp - probe->startStamp; 25.251 + //Sanity check for TSC counter overflow: if sane, add to histogram 25.252 + if( interval < probe->hist->endOfRange * 10 ) 25.253 + addToHist( interval, probe->hist ); 25.254 + } 25.255 +#endif 25.256 +#ifdef PROBES__USE_PERF_CTR_PROBES 25.257 + 25.258 +#endif 25.259 + 25.260 + DEBUG__printf( dbgProbes, "record end of interval" ) 25.261 + } 25.262 + 25.263 + 25.264 +void 25.265 +print_probe_helper( IntervalProbe *probe ) 25.266 + { 25.267 + printf( "\nprobe: %s, ", probe->nameStr ); 25.268 + 25.269 + 25.270 + if( probe->schedChoiceWasRecorded ) 25.271 + { printf( 
"coreNum: %d, slaveID: %d, slaveVPCreated: %0.6f | ", 25.272 + probe->coreNum, probe->slaveID, probe->slaveCreateSecs ); 25.273 + } 25.274 + 25.275 + if( probe->endSecs == 0 ) //just a single point in time 25.276 + { 25.277 + printf( " time point: %.6f\n", 25.278 + probe->startSecs - _PRMasterEnv->createPtInSecs ); 25.279 + } 25.280 + else if( probe->hist == NULL ) //just an interval 25.281 + { 25.282 + printf( " startSecs: %.6f interval: %.6f\n", 25.283 + (probe->startSecs - _PRMasterEnv->createPtInSecs), probe->interval); 25.284 + } 25.285 + else //a full histogram of intervals 25.286 + { 25.287 + printDblHist( probe->hist ); 25.288 + } 25.289 + } 25.290 + 25.291 +void 25.292 +PR_impl__print_stats_of_probe( IntervalProbe *probe ) 25.293 + { 25.294 + 25.295 +// probe = _PRMasterEnv->intervalProbes[ probeID ]; 25.296 + 25.297 + print_probe_helper( probe ); 25.298 + } 25.299 + 25.300 + 25.301 +void 25.302 +PR_impl__print_stats_of_all_probes() 25.303 + { 25.304 + forAllInDynArrayDo( _PRMasterEnv->dynIntervalProbesInfo, 25.305 + (DynArrayFnPtr) &PR_impl__print_stats_of_probe ); 25.306 + fflush( stdout ); 25.307 + }
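Review note on `probes.c`: in `ext__create_generic_probe` the probe name is copied with `malloc( nameLen )` and `memcpy( ..., nameLen )`, where `nameLen` is `strlen( nameStr )`. That leaves no room for the terminating NUL, while `print_probe_helper` later prints the name with `%s`. The sketch below shows a copy that includes the terminator, together with the seconds-plus-microseconds conversion that `giveInterval` performs. It is a hedged illustration only: `sk_copy_name` and `sk_interval_secs` are hypothetical names, and it uses the C library `malloc` rather than PR's own allocator.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

/* Returns a heap copy of nameStr including its terminating NUL,
 * or NULL if the allocation fails.  The caller frees the result. */
static char *
sk_copy_name( const char *nameStr )
 { size_t len  = strlen( nameStr ) + 1;   /* +1 for the '\0' */
   char  *copy = malloc( len );
   if( copy != NULL ) memcpy( copy, nameStr, len );
   return copy;
 }

/* Converts two gettimeofday() stamps to seconds and returns end - start. */
static double
sk_interval_secs( struct timeval start, struct timeval end )
 { double startSecs = start.tv_sec + start.tv_usec / 1000000.0;
   double endSecs   = end.tv_sec   + end.tv_usec   / 1000000.0;
   return endSecs - startSecs;
 }
```

A caller would stamp with `gettimeofday( &start, NULL )` before the region of interest, stamp again afterwards, and pass both values to `sk_interval_secs`.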
26.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 26.2 +++ b/Services_Offered_by_PR/Measurement_and_Stats/probes.h Wed Sep 19 23:12:44 2012 -0700 26.3 @@ -0,0 +1,192 @@ 26.4 +/* 26.5 + * Copyright 2009 OpenSourceStewardshipFoundation.org 26.6 + * Licensed under GNU General Public License version 2 26.7 + * 26.8 + * Author: seanhalle@yahoo.com 26.9 + * 26.10 + */ 26.11 + 26.12 +#ifndef _PROBES_H 26.13 +#define _PROBES_H 26.14 +#define _GNU_SOURCE 26.15 + 26.16 +#include "PR_impl/PR_primitive_data_types.h" 26.17 + 26.18 +#include <sys/time.h> 26.19 + 26.20 +/*Note on order of include files: 26.21 + * This file relies on #defines that appear in other files, which must come 26.22 + * first in the #include sequence.. 26.23 + */ 26.24 + 26.25 +/*Use these aliases in application code*/ 26.26 +#define PR_App__record_time_point_into_new_probe PR_WL__record_time_point_into_new_probe 26.27 +#define PR_App__create_single_interval_probe PR_WL__create_single_interval_probe 26.28 +#define PR_App__create_histogram_probe PR_WL__create_histogram_probe 26.29 +#define PR_App__index_probe_by_its_name PR_WL__index_probe_by_its_name 26.30 +#define PR_App__get_probe_by_name PR_WL__get_probe_by_name 26.31 +#define PR_App__record_sched_choice_into_probe PR_WL__record_sched_choice_into_probe 26.32 +#define PR_App__record_interval_start_in_probe PR_WL__record_interval_start_in_probe 26.33 +#define PR_App__record_interval_end_in_probe PR_WL__record_interval_end_in_probe 26.34 +#define PR_App__print_stats_of_probe PR_WL__print_stats_of_probe 26.35 +#define PR_App__print_stats_of_all_probes PR_WL__print_stats_of_all_probes 26.36 + 26.37 + 26.38 +//========================== 26.39 +#ifdef PROBES__USE_TSC_PROBES 26.40 + #define PROBES__Insert_timestamps_and_intervals_into_probe_struct \ 26.41 + TSCount startStamp; \ 26.42 + TSCount endStamp; \ 26.43 + TSCount interval; \ 26.44 + Histogram *hist; /*if left NULL, then is single interval probe*/ 26.45 +#endif 26.46 +#ifdef 
PROBES__USE_TIME_OF_DAY_PROBES 26.47 + #define PROBES__Insert_timestamps_and_intervals_into_probe_struct \ 26.48 + struct timeval startStamp; \ 26.49 + struct timeval endStamp; \ 26.50 + float64 startSecs; \ 26.51 + float64 endSecs; \ 26.52 + float64 interval; \ 26.53 + DblHist *hist; /*if NULL, then is single interval probe*/ 26.54 +#endif 26.55 +#ifdef PROBES__USE_PERF_CTR_PROBES 26.56 + #define PROBES__Insert_timestamps_and_intervals_into_probe_struct \ 26.57 + int64 startStamp; \ 26.58 + int64 endStamp; \ 26.59 + int64 interval; \ 26.60 + Histogram *hist; /*if left NULL, then is single interval probe*/ 26.61 +#endif 26.62 + 26.63 +//typedef struct _IntervalProbe IntervalProbe; -- is in PR.h 26.64 +struct _IntervalProbe 26.65 + { 26.66 + char *nameStr; 26.67 + int32 probeID; 26.68 + 26.69 + int32 schedChoiceWasRecorded; 26.70 + int32 coreNum; 26.71 + int32 slaveID; 26.72 + float64 slaveCreateSecs; 26.73 + PROBES__Insert_timestamps_and_intervals_into_probe_struct; 26.74 + }; 26.75 + 26.76 +//=========================== NEVER USE THESE ========================== 26.77 +/*NEVER use these in any code!! These are here only for use in the macros 26.78 + * defined in this file!! 
26.79 + */ 26.80 +int32 26.81 +PR_impl__create_single_interval_probe( char *nameStr, SlaveVP *animSlv ); 26.82 + 26.83 +int32 26.84 +PR_impl__create_histogram_probe( int32 numBins, float64 startValue, 26.85 + float64 binWidth, char *nameStr, SlaveVP *animSlv ); 26.86 + 26.87 +int32 26.88 +PR_impl__record_time_point_into_new_probe( char *nameStr, SlaveVP *animSlv); 26.89 + 26.90 +int32 26.91 +PR_ext_impl__record_time_point_into_new_probe( char *nameStr ); 26.92 + 26.93 +void 26.94 +PR_impl__free_probe( IntervalProbe *probe ); 26.95 + 26.96 +void 26.97 +PR_impl__index_probe_by_its_name( int32 probeID, SlaveVP *animSlv ); 26.98 + 26.99 +IntervalProbe * 26.100 +PR_impl__get_probe_by_name( char *probeName, SlaveVP *animSlv ); 26.101 + 26.102 +void 26.103 +PR_impl__record_sched_choice_into_probe( int32 probeID, SlaveVP *animSlv ); 26.104 + 26.105 +void 26.106 +PR_impl__record_interval_start_in_probe( int32 probeID ); 26.107 + 26.108 +void 26.109 +PR_impl__record_interval_end_in_probe( int32 probeID ); 26.110 + 26.111 +void 26.112 +PR_impl__print_stats_of_probe( IntervalProbe *probe ); 26.113 + 26.114 +void 26.115 +PR_impl__print_stats_of_all_probes(); 26.116 + 26.117 + 26.118 +//======================== Probes ============================= 26.119 +// 26.120 +// Use macros to allow turning probes off with a #define switch 26.121 +// This means probes have zero impact on performance when off 26.122 +//============================================================= 26.123 + 26.124 +#ifdef PROBES__TURN_ON_STATS_PROBES 26.125 + 26.126 + #define PROBES__Create_Probe_Bookkeeping_Vars \ 26.127 + _PRMasterEnv->dynIntervalProbesInfo = \ 26.128 + makePrivDynArrayOfSize( (void***)&(_PRMasterEnv->intervalProbes), 200); \ 26.129 + \ 26.130 + _PRMasterEnv->probeNameHashTbl = makeHashTable( 1000, &PR_int__free ); \ 26.131 + \ 26.132 + /*put creation time directly into master env, for fast retrieval*/ \ 26.133 + struct timeval timeStamp; \ 26.134 + gettimeofday( &(timeStamp), NULL); \ 
26.135 + _PRMasterEnv->createPtInSecs = \ 26.136 + timeStamp.tv_sec +(timeStamp.tv_usec/1000000.0); 26.137 + 26.138 + #define PR_WL__record_time_point_into_new_probe( nameStr, animSlv ) \ 26.139 + PR_impl__record_time_point_in_new_probe( nameStr, animSlv ) 26.140 + 26.141 + #define PR_ext__record_time_point_into_new_probe( nameStr ) \ 26.142 + PR_ext_impl__record_time_point_into_new_probe( nameStr ) 26.143 + 26.144 + #define PR_WL__create_single_interval_probe( nameStr, animSlv ) \ 26.145 + PR_impl__create_single_interval_probe( nameStr, animSlv ) 26.146 + 26.147 + #define PR_WL__create_histogram_probe( numBins, startValue, \ 26.148 + binWidth, nameStr, animSlv ) \ 26.149 + PR_impl__create_histogram_probe( numBins, startValue, \ 26.150 + binWidth, nameStr, animSlv ) 26.151 + #define PR_int__free_probe( probe ) \ 26.152 + PR_impl__free_probe( probe ) 26.153 + 26.154 + #define PR_WL__index_probe_by_its_name( probeID, animSlv ) \ 26.155 + PR_impl__index_probe_by_its_name( probeID, animSlv ) 26.156 + 26.157 + #define PR_WL__get_probe_by_name( probeName, animSlv ) \ 26.158 + PR_impl__get_probe_by_name( probeName, animSlv ) 26.159 + 26.160 + #define PR_WL__record_sched_choice_into_probe( probeID, animSlv ) \ 26.161 + PR_impl__record_sched_choice_into_probe( probeID, animSlv ) 26.162 + 26.163 + #define PR_WL__record_interval_start_in_probe( probeID ) \ 26.164 + PR_impl__record_interval_start_in_probe( probeID ) 26.165 + 26.166 + #define PR_WL__record_interval_end_in_probe( probeID ) \ 26.167 + PR_impl__record_interval_end_in_probe( probeID ) 26.168 + 26.169 + #define PR_WL__print_stats_of_probe( probeID ) \ 26.170 + PR_impl__print_stats_of_probe( probeID ) 26.171 + 26.172 + #define PR_WL__print_stats_of_all_probes() \ 26.173 + PR_impl__print_stats_of_all_probes() 26.174 + 26.175 + 26.176 +#else 26.177 + #define PROBES__Create_Probe_Bookkeeping_Vars 26.178 + #define PR_WL__record_time_point_into_new_probe( nameStr, animSlv ) 0 /* do nothing */ 26.179 + #define
PR_ext__record_time_point_into_new_probe( nameStr ) 0 /* do nothing */ 26.180 + #define PR_WL__create_single_interval_probe( nameStr, animSlv ) 0 /* do nothing */ 26.181 + #define PR_WL__create_histogram_probe( numBins, startValue, \ 26.182 + binWidth, nameStr, animSlv ) \ 26.183 + 0 /* do nothing */ 26.184 + #define PR_WL__index_probe_by_its_name( probeID, animSlv ) /* do nothing */ 26.185 + #define PR_WL__get_probe_by_name( probeID, animSlv ) NULL /* do nothing */ 26.186 + #define PR_WL__record_sched_choice_into_probe( probeID, animSlv ) /* do nothing */ 26.187 + #define PR_WL__record_interval_start_in_probe( probeID ) /* do nothing */ 26.188 + #define PR_WL__record_interval_end_in_probe( probeID ) /* do nothing */ 26.189 + #define PR_WL__print_stats_of_probe( probeID ) ; /* do nothing */ 26.190 + #define PR_WL__print_stats_of_all_probes() ;/* do nothing */ 26.191 + 26.192 +#endif /* defined PROBES__TURN_ON_STATS_PROBES */ 26.193 + 26.194 +#endif /* _PROBES_H */ 26.195 +
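The compile-time switch used in probes.h above can be illustrated in isolation. The following is a minimal sketch, not code from this changeset: the names `DEMO__TURN_ON_PROBES`, `DEMO__record_point`, and `demo_impl__record_point` are hypothetical stand-ins for `PROBES__TURN_ON_STATS_PROBES` and the `PR_WL__*` / `PR_impl__*` pairs. It shows that, with the switch undefined, the macro collapses to the constant 0 and its argument is never evaluated.

```c
#include <stdio.h>

/* Hypothetical implementation function, standing in for a PR_impl__* call.
 * It hands out increasing IDs starting at 1, so a real ID is never 0. */
int demo_impl__record_point(const char *nameStr);

int demo_impl__record_point(const char *nameStr)
{
    static int nextID = 0;
    nextID += 1;
    printf("probe %d: %s\n", nextID, nameStr);
    return nextID;
}

/* Same pattern as probes.h: one macro name, two expansions.
 * DEMO__TURN_ON_PROBES is an assumed switch name for this sketch only. */
#ifdef DEMO__TURN_ON_PROBES
  #define DEMO__record_point( nameStr ) demo_impl__record_point( nameStr )
#else
  #define DEMO__record_point( nameStr ) 0 /* do nothing */
#endif
```

Because the disabled form discards its argument textually, any side effects in the argument expression also vanish when probes are off, so call sites should pass only plain values.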
27.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 27.2 +++ b/Services_Offered_by_PR/Memory_Handling/vmalloc.c Wed Sep 19 23:12:44 2012 -0700 27.3 @@ -0,0 +1,438 @@ 27.4 +/* 27.5 + * Copyright 2009 OpenSourceCodeStewardshipFoundation.org 27.6 + * Licensed under GNU General Public License version 2 27.7 + * 27.8 + * Author: seanhalle@yahoo.com 27.9 + * 27.10 + * Created on November 14, 2009, 9:07 PM 27.11 + */ 27.12 + 27.13 +#include <malloc.h> 27.14 +#include <inttypes.h> 27.15 +#include <stdlib.h> 27.16 +#include <stdio.h> 27.17 +#include <string.h> 27.18 +#include <math.h> 27.19 + 27.20 +#include "PR_impl/PR.h" 27.21 +#include "Histogram/Histogram.h" 27.22 + 27.23 +#define MAX_UINT64 0xFFFFFFFFFFFFFFFF 27.24 + 27.25 +//A MallocProlog is a head element if the HigherInMem variable is NULL 27.26 +//A Chunk is free if the prevChunkInFreeList variable is NULL 27.27 + 27.28 +/* 27.29 + * This calculates the container which fits the given size. 27.30 + */ 27.31 +inline 27.32 +uint32 getContainer(size_t size) 27.33 +{ 27.34 + return (log2(size)-LOG128)/LOG54; 27.35 +} 27.36 + 27.37 +/* 27.38 + * Removes the first chunk of a freeList 27.39 + * The chunk is removed but not set as free. There is no check if 27.40 + * the free list is empty, so make sure this is not the case. 
27.41 + */ 27.42 +inline 27.43 +MallocProlog *removeChunk(MallocArrays* freeLists, uint32 containerIdx) 27.44 +{ 27.45 + MallocProlog** container = &freeLists->bigChunks[containerIdx]; 27.46 + MallocProlog* removedChunk = *container; 27.47 + *container = removedChunk->nextChunkInFreeList; 27.48 + 27.49 + if(removedChunk->nextChunkInFreeList) 27.50 + removedChunk->nextChunkInFreeList->prevChunkInFreeList = 27.51 + (MallocProlog*)container; 27.52 + 27.53 + if(*container == NULL) 27.54 + { 27.55 + if(containerIdx < 64) 27.56 + freeLists->bigChunksSearchVector[0] &= ~((uint64)1 << containerIdx); 27.57 + else 27.58 + freeLists->bigChunksSearchVector[1] &= ~((uint64)1 << (containerIdx-64)); 27.59 + } 27.60 + 27.61 + return removedChunk; 27.62 +} 27.63 + 27.64 +/* 27.65 + * Removes the first chunk of a freeList 27.66 + * The chunk is removed but not set as free. There is no check if 27.67 + * the free list is empty, so make sure this is not the case. 27.68 + */ 27.69 +inline 27.70 +MallocProlog *removeSmallChunk(MallocArrays* freeLists, uint32 containerIdx) 27.71 +{ 27.72 + MallocProlog** container = &freeLists->smallChunks[containerIdx]; 27.73 + MallocProlog* removedChunk = *container; 27.74 + *container = removedChunk->nextChunkInFreeList; 27.75 + 27.76 + if(removedChunk->nextChunkInFreeList) 27.77 + removedChunk->nextChunkInFreeList->prevChunkInFreeList = 27.78 + (MallocProlog*)container; 27.79 + 27.80 + return removedChunk; 27.81 +} 27.82 + 27.83 +inline 27.84 +size_t getChunkSize(MallocProlog* chunk) 27.85 +{ 27.86 + return (uintptr_t)chunk->nextHigherInMem - 27.87 + (uintptr_t)chunk - sizeof(MallocProlog); 27.88 +} 27.89 + 27.90 +/* 27.91 + * Removes a chunk from a free list. 
27.92 + */ 27.93 +inline 27.94 +void extractChunk(MallocProlog* chunk, MallocArrays *freeLists) 27.95 +{ 27.96 + chunk->prevChunkInFreeList->nextChunkInFreeList = chunk->nextChunkInFreeList; 27.97 + if(chunk->nextChunkInFreeList) 27.98 + chunk->nextChunkInFreeList->prevChunkInFreeList = chunk->prevChunkInFreeList; 27.99 + 27.100 + //The last element in the list points to the container. If the container points 27.101 + //to NULL the container is empty 27.102 + if(*((void**)(chunk->prevChunkInFreeList)) == NULL && getChunkSize(chunk) >= BIG_LOWER_BOUND) 27.103 + { 27.104 + //Find the appropriate container because we do not know it 27.105 + uint64 containerIdx = ((uintptr_t)chunk->prevChunkInFreeList - (uintptr_t)freeLists->bigChunks) >> 3; 27.106 + if(containerIdx < (uint32)64) 27.107 + freeLists->bigChunksSearchVector[0] &= ~((uint64)1 << containerIdx); 27.108 + if(containerIdx < 128 && containerIdx >=64) 27.109 + freeLists->bigChunksSearchVector[1] &= ~((uint64)1 << (containerIdx-64)); 27.110 + 27.111 + } 27.112 +} 27.113 + 27.114 +/* 27.115 + * Merges two chunks. 27.116 + * Chunk A has to be before chunk B in memory. Both have to be removed from 27.117 + * a free list 27.118 + */ 27.119 +inline 27.120 +MallocProlog *mergeChunks(MallocProlog* chunkA, MallocProlog* chunkB) 27.121 +{ 27.122 + chunkA->nextHigherInMem = chunkB->nextHigherInMem; 27.123 + chunkB->nextHigherInMem->nextLowerInMem = chunkA; 27.124 + return chunkA; 27.125 +} 27.126 +/* 27.127 + * Inserts a chunk into a free list. 27.128 + */ 27.129 +inline 27.130 +void insertChunk(MallocProlog* chunk, MallocProlog** container) 27.131 +{ 27.132 + chunk->nextChunkInFreeList = *container; 27.133 + chunk->prevChunkInFreeList = (MallocProlog*)container; 27.134 + if(*container) 27.135 + (*container)->prevChunkInFreeList = chunk; 27.136 + *container = chunk; 27.137 +} 27.138 + 27.139 +/* 27.140 + * Divides the chunk so that a new chunk of newSize is created.
27.141 + * There is no size check, so make sure the size value is valid. 27.142 + */ 27.143 +inline 27.144 +MallocProlog *divideChunk(MallocProlog* chunk, size_t newSize) 27.145 +{ 27.146 + MallocProlog* newChunk = (MallocProlog*)((uintptr_t)chunk->nextHigherInMem - 27.147 + newSize - sizeof(MallocProlog)); 27.148 + 27.149 + newChunk->nextLowerInMem = chunk; 27.150 + newChunk->nextHigherInMem = chunk->nextHigherInMem; 27.151 + 27.152 + chunk->nextHigherInMem->nextLowerInMem = newChunk; 27.153 + chunk->nextHigherInMem = newChunk; 27.154 + 27.155 + return newChunk; 27.156 +} 27.157 + 27.158 +/* 27.159 + * Search for chunk in the list of big chunks. Split the block if it's too big 27.160 + */ 27.161 +inline 27.162 +MallocProlog *searchChunk(MallocArrays *freeLists, size_t sizeRequested, uint32 containerIdx) 27.163 +{ 27.164 + MallocProlog* foundChunk; 27.165 + 27.166 + uint64 searchVector = freeLists->bigChunksSearchVector[0]; 27.167 + //set small chunk bits to zero 27.168 + searchVector &= MAX_UINT64 << containerIdx; 27.169 + containerIdx = __builtin_ffsl(searchVector); //least significant 1 bit 27.170 + 27.171 + if(containerIdx == 0) 27.172 + { 27.173 + searchVector = freeLists->bigChunksSearchVector[1]; 27.174 + containerIdx = __builtin_ffsl(searchVector); 27.175 + if(containerIdx == 0) 27.176 + { 27.177 + //TODO: get additional mem and insert into free list 27.178 + //malloc( MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE ); 27.179 + printf("PR malloc failed: low memory"); 27.180 + exit(1); 27.181 + } 27.182 + containerIdx += 64; 27.183 + } 27.184 + containerIdx--; 27.185 + 27.186 + 27.187 + foundChunk = removeChunk(freeLists, containerIdx); 27.188 + size_t chunkSize = getChunkSize(foundChunk); 27.189 + 27.190 + //If the new chunk is larger than the requested size: split 27.191 + if(chunkSize > sizeRequested + 2 * sizeof(MallocProlog) + BIG_LOWER_BOUND) 27.192 + { 27.193 + MallocProlog *newChunk = divideChunk(foundChunk,sizeRequested); 27.194 + containerIdx = 
getContainer(getChunkSize(foundChunk)) - 1; 27.195 + insertChunk(foundChunk,&freeLists->bigChunks[containerIdx]); 27.196 + if(containerIdx < 64) 27.197 + freeLists->bigChunksSearchVector[0] |= ((uint64)1 << containerIdx); 27.198 + else 27.199 + freeLists->bigChunksSearchVector[1] |= ((uint64)1 << (containerIdx-64)); 27.200 + foundChunk = newChunk; 27.201 + } 27.202 + 27.203 + return foundChunk; 27.204 +} 27.205 + 27.206 + 27.207 +/* 27.208 + * This is sequential code, meant to only be called from the Master, not from 27.209 + * any slave Slvs. 27.210 + * 27.211 + *May 2012 27.212 + *ToDo: Improve speed, by using built-in leading 1 detector to calc free-list 27.213 + * index. 27.214 + *Change to two separate arrays, one for free-lists of small fixed-size chunks 27.215 + * other for free lists of exponentially growing chunk sizes 27.216 + *Do simple compare to decide which array of lists to use 27.217 + *For small chunks, size the lists in increments of 16, up to, say, 128 (1024 27.218 + * is max if want less than 64 lists, which allows searching for first 27.219 + * occupied free-list using leading-1 detector on a bit-vector) 27.220 + *To find index, right-shift by 4 bits, and that's the index! (works because 27.221 + * compare says no 1's above 128 position ((bit 7)), and sizes are every 16, 27.222 + * so dividing by 16 equals exactly the position) 27.223 + *For large chunks, have 63 free lists, but split into even and odd indexes. 27.224 + *For even indexes, each list starts with chunks twice the size of previous 27.225 + * even index. 27.226 + *For odd indexes, each list starts with chunks of size half-way between those 27.227 + * of the even indexes on either side. 27.228 + * 27.229 + *To calc the free-list position of a requested size, get pos of leading 1 27.230 + * of the size, call this msbsP (most-significant-bit-set-position). 
Then 27.231 + * check bit to right of it (one-less-significant) 27.232 + *If it's 0 then use the even index: msbsP * 2, which is msbsP << 1. 27.233 + *If it's 1, then use the odd-index, which is msbsP << 1 + 1 27.234 + * 27.235 + *To find msbsP, use GCC builtin: "int __builtin_clzll (unsigned long long)" 27.236 + * which returns the number of zeros above (left of) msb set. Note, dies if 27.237 + * give it zero, but the compare used to choose between arrays makes sure 27.238 + * requested size given to it is not zero. 27.239 + * 27.240 + *This scheme keeps wastage small, while finding free element is O(1), and a 27.241 + * fast constant. 27.242 + *For large chunk sizes, if don't shave excess, then it ensures worst-case 27.243 + * wastage due to mis-match in size of chunk vs requested size is 33% 27.244 + * (invariant: take any even list.. it starts at a power of 2, and next list 27.245 + * up starts at 50% larger, so biggest chunk is 1.5 x smallest request, that's 27.246 + * 33% of total memory wasted. Then, for the odd index above, smallest chunk 27.247 + * is 2x for smallest request of 1.5x, for 25% total wasted memory) 27.248 + *For smallest size chunks, the pre-amble wastes quite a bit, but above that, 27.249 + * sizing in increments of 16 keeps wastage small. And, if always shave, then 27.250 + * wastage due to size mis-match is maximum 16 bytes for the large chunks. 
27.251 + * 27.252 + */ 27.253 +void * 27.254 +PR_int__malloc( size_t sizeRequested ) 27.255 + { 27.256 + MEAS__Capture_Pre_Malloc_Point 27.257 + 27.258 + MallocArrays* freeLists = _PRMasterEnv->freeLists; 27.259 + MallocProlog* foundChunk; 27.260 + 27.261 + //Return a small chunk if the requested size is smaller than 128B 27.262 + if(sizeRequested <= LOWER_BOUND) 27.263 + { 27.264 + uint32 freeListIdx = (sizeRequested-1)/SMALL_CHUNK_SIZE; 27.265 + if(freeLists->smallChunks[freeListIdx] == NULL) 27.266 + foundChunk = searchChunk(freeLists, SMALL_CHUNK_SIZE*(freeListIdx+1), 0); 27.267 + else 27.268 + foundChunk = removeSmallChunk(freeLists, freeListIdx); 27.269 + 27.270 + //Mark as allocated 27.271 + foundChunk->prevChunkInFreeList = NULL; 27.272 + return foundChunk + 1; 27.273 + } 27.274 + 27.275 + //Calculate the expected container. Start one higher to have a Chunk that's 27.276 + //always big enough. 27.277 + uint32 containerIdx = getContainer(sizeRequested); 27.278 + 27.279 + if(freeLists->bigChunks[containerIdx] == NULL) 27.280 + foundChunk = searchChunk(freeLists, sizeRequested, containerIdx); 27.281 + else 27.282 + foundChunk = removeChunk(freeLists, containerIdx); 27.283 + 27.284 + //Mark as allocated 27.285 + foundChunk->prevChunkInFreeList = NULL; 27.286 + 27.287 + MEAS__Capture_Post_Malloc_Point 27.288 + 27.289 + //skip over the prolog by adding its size to the pointer return 27.290 + return foundChunk + 1; 27.291 + } 27.292 + 27.293 +void * 27.294 +PR_WL__malloc( int32 sizeRequested ) 27.295 + { void *ret; 27.296 + 27.297 + PR_int__get_master_lock(); 27.298 + ret = PR_int__malloc( sizeRequested ); 27.299 + PR_int__release_master_lock(); 27.300 + return ret; 27.301 + } 27.302 + 27.303 + 27.304 +/* 27.305 + * This is sequential code, meant to only be called from the Master, not from 27.306 + * any slave Slvs. 
27.307 + */ 27.308 +void 27.309 +PR_int__free( void *ptrToFree ) 27.310 + { 27.311 + 27.312 + MEAS__Capture_Pre_Free_Point; 27.313 + 27.314 + MallocArrays* freeLists = _PRMasterEnv->freeLists; 27.315 + MallocProlog *chunkToFree = (MallocProlog*)ptrToFree - 1; 27.316 + uint32 containerIdx; 27.317 + 27.318 + //Check for free neighbors 27.319 + if(chunkToFree->nextLowerInMem) 27.320 + { 27.321 + if(chunkToFree->nextLowerInMem->prevChunkInFreeList != NULL) 27.322 + {//Chunk is not allocated 27.323 + extractChunk(chunkToFree->nextLowerInMem, freeLists); 27.324 + chunkToFree = mergeChunks(chunkToFree->nextLowerInMem, chunkToFree); 27.325 + } 27.326 + } 27.327 + if(chunkToFree->nextHigherInMem) 27.328 + { 27.329 + if(chunkToFree->nextHigherInMem->prevChunkInFreeList != NULL) 27.330 + {//Chunk is not allocated 27.331 + extractChunk(chunkToFree->nextHigherInMem, freeLists); 27.332 + chunkToFree = mergeChunks(chunkToFree, chunkToFree->nextHigherInMem); 27.333 + } 27.334 + } 27.335 + 27.336 + size_t chunkSize = getChunkSize(chunkToFree); 27.337 + if(chunkSize < BIG_LOWER_BOUND) 27.338 + { 27.339 + containerIdx = (chunkSize/SMALL_CHUNK_SIZE)-1; 27.340 + if(containerIdx > SMALL_CHUNK_COUNT-1) 27.341 + containerIdx = SMALL_CHUNK_COUNT-1; 27.342 + insertChunk(chunkToFree, &freeLists->smallChunks[containerIdx]); 27.343 + } 27.344 + else 27.345 + { 27.346 + containerIdx = getContainer(getChunkSize(chunkToFree)) - 1; 27.347 + insertChunk(chunkToFree, &freeLists->bigChunks[containerIdx]); 27.348 + if(containerIdx < 64) 27.349 + freeLists->bigChunksSearchVector[0] |= (uint64)1 << containerIdx; 27.350 + else 27.351 + freeLists->bigChunksSearchVector[1] |= (uint64)1 << (containerIdx-64); 27.352 + } 27.353 + 27.354 + MEAS__Capture_Post_Free_Point; 27.355 + } 27.356 + 27.357 +void 27.358 +PR_WL__free( void *ptrToFree ) 27.359 + { 27.360 + PR_int__get_master_lock(); 27.361 + PR_int__free( ptrToFree ); 27.362 + PR_int__release_master_lock(); 27.363 + } 27.364 + 27.365 +/* 27.366 + * 
Designed to be called from the main thread outside of PR, during init 27.367 + */ 27.368 +MallocArrays * 27.369 +PR_ext__create_free_list() 27.370 +{ 27.371 + //Initialize containers for small chunks and fill with zeros 27.372 + _PRMasterEnv->freeLists = (MallocArrays*)malloc( sizeof(MallocArrays) ); 27.373 + MallocArrays *freeLists = _PRMasterEnv->freeLists; 27.374 + 27.375 + freeLists->smallChunks = 27.376 + (MallocProlog**)malloc(SMALL_CHUNK_COUNT*sizeof(MallocProlog*)); 27.377 + memset((void*)freeLists->smallChunks, 27.378 + 0,SMALL_CHUNK_COUNT*sizeof(MallocProlog*)); 27.379 + 27.380 + //Calculate number of containers for big chunks 27.381 + uint32 container = getContainer(MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE)+1; 27.382 + freeLists->bigChunks = (MallocProlog**)malloc(container*sizeof(MallocProlog*)); 27.383 + memset((void*)freeLists->bigChunks,0,container*sizeof(MallocProlog*)); 27.384 + freeLists->containerCount = container; 27.385 + 27.386 + //Create first element in lastContainer 27.387 + MallocProlog *firstChunk = malloc( MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE ); 27.388 + if( firstChunk == NULL ) {printf("Can't allocate initial memory\n"); exit(1);} 27.389 + freeLists->memSpace = firstChunk; 27.390 + 27.391 + //Touch memory to avoid page faults 27.392 + void *ptr,*endPtr; 27.393 + endPtr = (void*)firstChunk+MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE; 27.394 + for(ptr = firstChunk; ptr < endPtr; ptr+=PAGE_SIZE) 27.395 + { 27.396 + *(char*)ptr = 0; 27.397 + } 27.398 + 27.399 + firstChunk->nextLowerInMem = NULL; 27.400 + firstChunk->nextHigherInMem = (MallocProlog*)((uintptr_t)firstChunk + 27.401 + MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE - sizeof(MallocProlog)); 27.402 + firstChunk->nextChunkInFreeList = NULL; 27.403 + //previous element in the queue is the container 27.404 + firstChunk->prevChunkInFreeList = &freeLists->bigChunks[container-2]; 27.405 + 27.406 + freeLists->bigChunks[container-2] = firstChunk; 27.407 + //Insert into bit search list 27.408 + if(container <= 65) 
27.409 + { 27.410 + freeLists->bigChunksSearchVector[0] = ((uint64)1 << (container-2)); 27.411 + freeLists->bigChunksSearchVector[1] = 0; 27.412 + } 27.413 + else 27.414 + { 27.415 + freeLists->bigChunksSearchVector[0] = 0; 27.416 + freeLists->bigChunksSearchVector[1] = ((uint64)1 << (container-66)); 27.417 + } 27.418 + 27.419 + //Create dummy chunk to mark the top of stack this is of course 27.420 + //never freed 27.421 + MallocProlog *dummyChunk = firstChunk->nextHigherInMem; 27.422 + dummyChunk->nextHigherInMem = dummyChunk+1; 27.423 + dummyChunk->nextLowerInMem = NULL; 27.424 + dummyChunk->nextChunkInFreeList = NULL; 27.425 + dummyChunk->prevChunkInFreeList = NULL; 27.426 + 27.427 + return freeLists; 27.428 + } 27.429 + 27.430 + 27.431 +/*Designed to be called from the main thread outside of PR, during cleanup 27.432 + */ 27.433 +void 27.434 +PR_ext__free_free_list( MallocArrays *freeLists ) 27.435 + { 27.436 + free(freeLists->memSpace); 27.437 + free(freeLists->bigChunks); 27.438 + free(freeLists->smallChunks); 27.439 + 27.440 + } 27.441 +
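The ToDo comment above `PR_int__malloc` describes a free-list index computed from the leading-1 position of the requested size, and `searchChunk` scans a two-word bit vector for the first occupied list. The sketch below illustrates both ideas under stated assumptions; `free_list_index` and `first_occupied_list` are hypothetical helper names that do not exist in vmalloc.c, and the code relies on the GCC/Clang builtins `__builtin_clzll` and `__builtin_ffsll`.

```c
#include <assert.h>
#include <stdint.h>

/* Even/odd index scheme from the ToDo comment (hypothetical helper).
 * msbsP is the position of the most significant set bit; the bit just
 * below it selects the even list (starts at 2^msbsP) or the odd list
 * (starts at 1.5 * 2^msbsP). Requires size >= 2 so that bit exists. */
static inline uint32_t free_list_index(uint64_t size)
{
    assert(size >= 2);
    uint32_t msbsP   = 63u - (uint32_t)__builtin_clzll(size);
    uint32_t nextBit = (uint32_t)((size >> (msbsP - 1)) & 1u);
    /* Parentheses matter: "msbsP << 1 + 1" would parse as msbsP << 2. */
    return (msbsP << 1) + nextBit;
}

/* Finds the first non-empty list at or above fromIdx in a 128-bit
 * occupancy vector, or returns -1 if none is set (hypothetical helper).
 * Only the word containing fromIdx is masked, so the shift count always
 * stays below 64. */
static inline int first_occupied_list(const uint64_t vec[2], uint32_t fromIdx)
{
    for (uint32_t word = fromIdx / 64; word < 2; word++) {
        uint64_t bits = vec[word];
        if (word == fromIdx / 64)
            bits &= ~UINT64_C(0) << (fromIdx % 64); /* drop lists below fromIdx */
        if (bits != 0)
            return (int)(word * 64) + __builtin_ffsll((long long)bits) - 1;
    }
    return -1;
}
```

Note that `free_list_index` returns the list whose starting size is at or below the request; an allocator that wants every chunk in the chosen list to be large enough would begin its search one list higher, which is what the comment in `PR_int__malloc` ("Start one higher") alludes to.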
28.1 --- /dev/null Thu Jan 01 00:00:00 1970 +0000 28.2 +++ b/Services_Offered_by_PR/Memory_Handling/vmalloc.h Wed Sep 19 23:12:44 2012 -0700 28.3 @@ -0,0 +1,94 @@ 28.4 +/* 28.5 + * Copyright 2009 OpenSourceCodeStewardshipFoundation.org 28.6 + * Licensed under GNU General Public License version 2 28.7 + * 28.8 + * Author: seanhalle@yahoo.com 28.9 + * 28.10 + * Created on November 14, 2009, 9:07 PM 28.11 + */ 28.12 + 28.13 +#ifndef _VMALLOC_H 28.14 +#define _VMALLOC_H 28.15 + 28.16 +#include <malloc.h> 28.17 +#include <inttypes.h> 28.18 +#include "PR_impl/PR_primitive_data_types.h" 28.19 + 28.20 +#define SMALL_CHUNK_SIZE 32 28.21 +#define SMALL_CHUNK_COUNT 4 28.22 +#define LOWER_BOUND 128 //Biggest chunk size that is created for the small chunks 28.23 +#define BIG_LOWER_BOUND 160 //Smallest chunk size that is created for the big chunks 28.24 + 28.25 +#define LOG54 0.3219280948873623 28.26 +#define LOG128 7 28.27 + 28.28 +typedef struct _MallocProlog MallocProlog; 28.29 + 28.30 +struct _MallocProlog 28.31 + { 28.32 + MallocProlog *nextChunkInFreeList; 28.33 + MallocProlog *prevChunkInFreeList; 28.34 + MallocProlog *nextHigherInMem; 28.35 + MallocProlog *nextLowerInMem; 28.36 + }; 28.37 +//MallocProlog 28.38 + 28.39 + typedef struct MallocArrays MallocArrays; 28.40 + 28.41 + struct MallocArrays 28.42 + { 28.43 + MallocProlog **smallChunks; 28.44 + MallocProlog **bigChunks; 28.45 + uint64 bigChunksSearchVector[2]; 28.46 + void *memSpace; 28.47 + uint32 containerCount; 28.48 + }; 28.49 + //MallocArrays 28.50 + 28.51 +typedef struct 28.52 + { 28.53 + MallocProlog *firstChunkInFreeList; 28.54 + int32 numInList; //TODO not used 28.55 + } 28.56 +FreeListHead; 28.57 + 28.58 +void * 28.59 +PR_int__malloc( size_t sizeRequested ); 28.60 +#define PR_PI__malloc PR_int__malloc 28.61 + 28.62 +void * 28.63 +PR_WL__malloc( int32 sizeRequested ); /*BUG: -- get master lock */ 28.64 +#define PR_App__malloc PR_WL__malloc 28.65 + 28.66 +void * 28.67 +PR_int__malloc_aligned( size_t 
sizeRequested ); 28.68 +#define PR_PI__malloc_aligned PR_int__malloc_aligned 28.69 + 28.70 +void 28.71 +PR_int__free( void *ptrToFree ); 28.72 +#define PR_PI__free PR_int__free 28.73 + 28.74 +void 28.75 +PR_WL__free( void *ptrToFree ); 28.76 +#define PR_App__free PR_WL__free 28.77 + 28.78 + 28.79 + 28.80 +/*Allocates memory from the external system -- higher overhead 28.81 + */ 28.82 +void * 28.83 +PR_ext__malloc_in_ext( size_t sizeRequested ); 28.84 + 28.85 +/*Frees memory that was allocated in the external system -- higher overhead 28.86 + */ 28.87 +void 28.88 +PR_ext__free_in_ext( void *ptrToFree ); 28.89 + 28.90 + 28.91 +MallocArrays * 28.92 +PR_ext__create_free_list(); 28.93 + 28.94 +void 28.95 +PR_ext__free_free_list(MallocArrays *freeLists ); 28.96 + 28.97 +#endif 28.98 \ No newline at end of file
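The allocator hands out `foundChunk + 1` and recovers the header in `PR_int__free` with `(MallocProlog*)ptrToFree - 1`, while `getChunkSize` derives the payload size from the distance to the next prolog. A minimal sketch of that pointer arithmetic follows, assuming a stand-in struct `DemoProlog` with the same four fields as `MallocProlog`; the helper names are hypothetical and not part of vmalloc.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for MallocProlog: same four pointer fields, hypothetical name. */
typedef struct DemoProlog DemoProlog;
struct DemoProlog {
    DemoProlog *nextChunkInFreeList;
    DemoProlog *prevChunkInFreeList; /* NULL marks the chunk as allocated */
    DemoProlog *nextHigherInMem;
    DemoProlog *nextLowerInMem;
};

/* The payload starts immediately after the prolog. */
static inline void *payload_of(DemoProlog *chunk)
{
    return chunk + 1;
}

/* Inverse of payload_of: step back one prolog from the user pointer. */
static inline DemoProlog *prolog_of(void *payload)
{
    return (DemoProlog *)payload - 1;
}

/* Payload bytes = distance to the next prolog minus this prolog's size. */
static inline size_t payload_size(const DemoProlog *chunk)
{
    return (uintptr_t)chunk->nextHigherInMem
         - (uintptr_t)chunk - sizeof(DemoProlog);
}
```

Keeping the size implicit in `nextHigherInMem` means a merge only has to relink two pointers, as `mergeChunks` does, at the cost of a four-pointer header on every chunk.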
29.1 --- a/Services_Offered_by_VMS/Debugging/DEBUG__macros.h Mon Sep 03 03:34:54 2012 -0700 29.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 29.3 @@ -1,65 +0,0 @@ 29.4 -/* 29.5 - * Copyright 2009 OpenSourceStewardshipFoundation.org 29.6 - * Licensed under GNU General Public License version 2 29.7 - * 29.8 - * Author: seanhalle@yahoo.com 29.9 - * 29.10 - */ 29.11 - 29.12 -#ifndef _VMS_DEFS_DEBUG_H 29.13 -#define _VMS_DEFS_DEBUG_H 29.14 -#define _GNU_SOURCE 29.15 - 29.16 -/* 29.17 - */ 29.18 -#ifdef DEBUG__TURN_ON_DEBUG_PRINT 29.19 - #define DEBUG__printf( bool, msg) \ 29.20 - do{\ 29.21 - if(bool)\ 29.22 - { printf(msg);\ 29.23 - printf(" | function: %s\n", __FUNCTION__);\ 29.24 - fflush(stdin);\ 29.25 - }\ 29.26 - }while(0);/*macro magic to isolate var-names*/ 29.27 - 29.28 - #define DEBUG__printf1( bool, msg, param) \ 29.29 - do{\ 29.30 - if(bool)\ 29.31 - { printf(msg, param);\ 29.32 - printf(" | function: %s\n", __FUNCTION__);\ 29.33 - fflush(stdin);\ 29.34 - }\ 29.35 - }while(0);/*macro magic to isolate var-names*/ 29.36 - 29.37 - #define DEBUG__printf2( bool, msg, p1, p2) \ 29.38 - do{\ 29.39 - if(bool)\ 29.40 - { printf(msg, p1, p2); \ 29.41 - printf(" | function: %s\n", __FUNCTION__);\ 29.42 - fflush(stdin);\ 29.43 - }\ 29.44 - }while(0);/*macro magic to isolate var-names*/ 29.45 - 29.46 - #define DEBUG__printf3( bool, msg, p1, p2, p3) \ 29.47 - do{\ 29.48 - if(bool)\ 29.49 - { printf(msg, p1, p2, p3); \ 29.50 - printf(" | function: %s\n", __FUNCTION__);\ 29.51 - fflush(stdin);\ 29.52 - }\ 29.53 - }while(0);/*macro magic to isolate var-names*/ 29.54 - 29.55 -#else 29.56 - #define DEBUG__printf( bool, msg) 29.57 - #define DEBUG__printf1( bool, msg, param) 29.58 - #define DEBUG__printf2( bool, msg, p1, p2) 29.59 -#endif 29.60 - 29.61 -//============================= ERROR MSGs ============================ 29.62 -#define ERROR(msg) printf(msg); 29.63 -#define ERROR1(msg, param) printf(msg, param); 29.64 -#define ERROR2(msg, p1, p2) printf(msg, p1, p2); 29.65 - 
29.66 -//=========================================================================== 29.67 -#endif /* _VMS_DEFS_H */ 29.68 -
30.1 --- a/Services_Offered_by_VMS/Lang_Constructs/VMS_Lang.h Mon Sep 03 03:34:54 2012 -0700 30.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 30.3 @@ -1,44 +0,0 @@ 30.4 -/* 30.5 - * Copyright 2009 OpenSourceStewardshipFoundation.org 30.6 - * Licensed under GNU General Public License version 2 30.7 - * 30.8 - * Author: seanhalle@yahoo.com 30.9 - * 30.10 - */ 30.11 - 30.12 -#ifndef _VMS_LANG_CONSTRUCTS_H 30.13 -#define _VMS_LANG_CONSTRUCTS_H 30.14 - 30.15 -#include "VMS_impl/VMS_primitive_data_types.h" 30.16 - 30.17 -/*This header defines everything specific to the VMS provided language 30.18 - * constructs. 30.19 - *Such constructs are used in application code, mixed-in with calls to 30.20 - * constructs of the VMS-based language. 30.21 - */ 30.22 -inline void 30.23 -handleMalloc( SSRSemReq *semReq, SlaveVP *requestingSlv, SSRSemEnv *semEnv); 30.24 -inline void 30.25 -handleFree( SSRSemReq *semReq, SlaveVP *requestingSlv, SSRSemEnv *semEnv ); 30.26 -inline void 30.27 -handleTransEnd(SSRSemReq *semReq, SlaveVP *requestingSlv, SSRSemEnv*semEnv); 30.28 -inline void 30.29 -handleTransStart( SSRSemReq *semReq, SlaveVP *requestingSlv, 30.30 - SSRSemEnv *semEnv ); 30.31 -inline void 30.32 -handleAtomic( SSRSemReq *semReq, SlaveVP *requestingSlv, SSRSemEnv *semEnv); 30.33 -inline void 30.34 -handleStartFnSingleton( SSRSemReq *semReq, SlaveVP *reqstingSlv, 30.35 - SSRSemEnv *semEnv ); 30.36 -inline void 30.37 -handleEndFnSingleton( SSRSemReq *semReq, SlaveVP *requestingSlv, 30.38 - SSRSemEnv *semEnv ); 30.39 -inline void 30.40 -handleStartDataSingleton( SSRSemReq *semReq, SlaveVP *reqstingSlv, 30.41 - SSRSemEnv *semEnv ); 30.42 -inline void 30.43 -handleEndDataSingleton( SSRSemReq *semReq, SlaveVP *requestingSlv, 30.44 - SSRSemEnv *semEnv ); 30.45 - 30.46 -#endif /* _VMS_LANG_CONSTRUCTS_H */ 30.47 -
31.1 --- a/Services_Offered_by_VMS/Measurement_and_Stats/MEAS__macros.h Mon Sep 03 03:34:54 2012 -0700 31.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 31.3 @@ -1,514 +0,0 @@ 31.4 -/* 31.5 - * Copyright 2009 OpenSourceStewardshipFoundation.org 31.6 - * Licensed under GNU General Public License version 2 31.7 - * 31.8 - * Author: seanhalle@yahoo.com 31.9 - * 31.10 - */ 31.11 - 31.12 -#ifndef _VMS_MEAS_MACROS_H 31.13 -#define _VMS_MEAS_MACROS_H 31.14 -#define _GNU_SOURCE 31.15 - 31.16 -//================== Macros define types of meas want ===================== 31.17 -// 31.18 -/*Generic measurement macro -- has name-space collision potential, which 31.19 - * compiler will catch.. so only use one pair inside a given set of 31.20 - * curly braces. 31.21 - */ 31.22 -//TODO: finish generic capture interval in hist 31.23 -enum histograms 31.24 - { generic1 31.25 - }; 31.26 - #define MEAS__Capture_Pre_Point \ 31.27 - int32 startStamp, endStamp; \ 31.28 - saveLowTimeStampCountInto( startStamp ); 31.29 - 31.30 - #define MEAS__Capture_Post_Point( histName ) \ 31.31 - saveLowTimeStampCountInto( endStamp ); \ 31.32 - addIntervalToHist( startStamp, endStamp, _VMSMasterEnv->histName ); 31.33 - 31.34 - 31.35 - 31.36 - 31.37 -//================== Macros define types of meas want ===================== 31.38 - 31.39 -#ifdef MEAS__TURN_ON_SUSP_MEAS 31.40 - #define MEAS__Insert_Susp_Meas_Fields_into_Slave \ 31.41 - uint32 preSuspTSCLow; \ 31.42 - uint32 postSuspTSCLow; 31.43 - 31.44 - #define MEAS__Insert_Susp_Meas_Fields_into_MasterEnv \ 31.45 - Histogram *suspLowTimeHist; \ 31.46 - Histogram *suspHighTimeHist; 31.47 - 31.48 - #define MEAS__Make_Meas_Hists_for_Susp_Meas \ 31.49 - _VMSMasterEnv->suspLowTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 31.50 - "master_low_time_hist");\ 31.51 - _VMSMasterEnv->suspHighTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 31.52 - "master_high_time_hist"); 31.53 - 31.54 - //record time stamp: compare to time-stamp recorded below 31.55 - #define 
MEAS__Capture_Pre_Susp_Point \ 31.56 - saveLowTimeStampCountInto( animatingSlv->preSuspTSCLow ); 31.57 - 31.58 - //NOTE: only take low part of count -- do sanity check when take diff 31.59 - #define MEAS__Capture_Post_Susp_Point \ 31.60 - saveLowTimeStampCountInto( animatingSlv->postSuspTSCLow );\ 31.61 - addIntervalToHist( preSuspTSCLow, postSuspTSCLow,\ 31.62 - _VMSMasterEnv->suspLowTimeHist ); \ 31.63 - addIntervalToHist( preSuspTSCLow, postSuspTSCLow,\ 31.64 - _VMSMasterEnv->suspHighTimeHist ); 31.65 - 31.66 - #define MEAS__Print_Hists_for_Susp_Meas \ 31.67 - printHist( _VMSMasterEnv->pluginTimeHist ); 31.68 - 31.69 -#else 31.70 - #define MEAS__Insert_Susp_Meas_Fields_into_Slave 31.71 - #define MEAS__Insert_Susp_Meas_Fields_into_MasterEnv 31.72 - #define MEAS__Make_Meas_Hists_for_Susp_Meas 31.73 - #define MEAS__Capture_Pre_Susp_Point 31.74 - #define MEAS__Capture_Post_Susp_Point 31.75 - #define MEAS__Print_Hists_for_Susp_Meas 31.76 -#endif 31.77 - 31.78 -#ifdef MEAS__TURN_ON_MASTER_MEAS 31.79 - #define MEAS__Insert_Master_Meas_Fields_into_Slave \ 31.80 - uint32 startMasterTSCLow; \ 31.81 - uint32 endMasterTSCLow; 31.82 - 31.83 - #define MEAS__Insert_Master_Meas_Fields_into_MasterEnv \ 31.84 - Histogram *masterLowTimeHist; \ 31.85 - Histogram *masterHighTimeHist; 31.86 - 31.87 - #define MEAS__Make_Meas_Hists_for_Master_Meas \ 31.88 - _VMSMasterEnv->masterLowTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 31.89 - "master_low_time_hist");\ 31.90 - _VMSMasterEnv->masterHighTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 31.91 - "master_high_time_hist"); 31.92 - 31.93 - //Total Master time includes one coreloop time -- just assume the core 31.94 - // loop time is same for Master as for AppSlvs, even though it may be 31.95 - // smaller due to higher predictability of the fixed jmp. 
31.96 - #define MEAS__Capture_Pre_Master_Point\ 31.97 - saveLowTimeStampCountInto( masterVP->startMasterTSCLow ); 31.98 - 31.99 - #define MEAS__Capture_Post_Master_Point \ 31.100 - saveLowTimeStampCountInto( masterVP->endMasterTSCLow );\ 31.101 - addIntervalToHist( startMasterTSCLow, endMasterTSCLow,\ 31.102 - _VMSMasterEnv->masterLowTimeHist ); \ 31.103 - addIntervalToHist( startMasterTSCLow, endMasterTSCLow,\ 31.104 - _VMSMasterEnv->masterHighTimeHist ); 31.105 - 31.106 - #define MEAS__Print_Hists_for_Master_Meas \ 31.107 - printHist( _VMSMasterEnv->pluginTimeHist ); 31.108 - 31.109 -#else 31.110 - #define MEAS__Insert_Master_Meas_Fields_into_Slave 31.111 - #define MEAS__Insert_Master_Meas_Fields_into_MasterEnv 31.112 - #define MEAS__Make_Meas_Hists_for_Master_Meas 31.113 - #define MEAS__Capture_Pre_Master_Point 31.114 - #define MEAS__Capture_Post_Master_Point 31.115 - #define MEAS__Print_Hists_for_Master_Meas 31.116 -#endif 31.117 - 31.118 - 31.119 -#ifdef MEAS__TURN_ON_MASTER_LOCK_MEAS 31.120 - #define MEAS__Insert_Master_Lock_Meas_Fields_into_MasterEnv \ 31.121 - Histogram *masterLockLowTimeHist; \ 31.122 - Histogram *masterLockHighTimeHist; 31.123 - 31.124 - #define MEAS__Make_Meas_Hists_for_Master_Lock_Meas \ 31.125 - _VMSMasterEnv->masterLockLowTimeHist = makeFixedBinHist( 50, 0, 2, \ 31.126 - "master lock low time hist");\ 31.127 - _VMSMasterEnv->masterLockHighTimeHist = makeFixedBinHist( 50, 0, 100,\ 31.128 - "master lock high time hist"); 31.129 - 31.130 - #define MEAS__Capture_Pre_Master_Lock_Point \ 31.131 - int32 startStamp, endStamp; \ 31.132 - saveLowTimeStampCountInto( startStamp ); 31.133 - 31.134 - #define MEAS__Capture_Post_Master_Lock_Point \ 31.135 - saveLowTimeStampCountInto( endStamp ); \ 31.136 - addIntervalToHist( startStamp, endStamp,\ 31.137 - _VMSMasterEnv->masterLockLowTimeHist ); \ 31.138 - addIntervalToHist( startStamp, endStamp,\ 31.139 - _VMSMasterEnv->masterLockHighTimeHist ); 31.140 - 31.141 - #define 
MEAS__Print_Hists_for_Master_Lock_Meas \ 31.142 - printHist( _VMSMasterEnv->masterLockLowTimeHist ); \ 31.143 - printHist( _VMSMasterEnv->masterLockHighTimeHist ); 31.144 - 31.145 -#else 31.146 - #define MEAS__Insert_Master_Lock_Meas_Fields_into_MasterEnv 31.147 - #define MEAS__Make_Meas_Hists_for_Master_Lock_Meas 31.148 - #define MEAS__Capture_Pre_Master_Lock_Point 31.149 - #define MEAS__Capture_Post_Master_Lock_Point 31.150 - #define MEAS__Print_Hists_for_Master_Lock_Meas 31.151 -#endif 31.152 - 31.153 - 31.154 -#ifdef MEAS__TURN_ON_MALLOC_MEAS 31.155 - #define MEAS__Insert_Malloc_Meas_Fields_into_MasterEnv\ 31.156 - Histogram *mallocTimeHist; \ 31.157 - Histogram *freeTimeHist; 31.158 - 31.159 - #define MEAS__Make_Meas_Hists_for_Malloc_Meas \ 31.160 - _VMSMasterEnv->mallocTimeHist = makeFixedBinHistExt( 100, 0, 30,\ 31.161 - "malloc_time_hist");\ 31.162 - _VMSMasterEnv->freeTimeHist = makeFixedBinHistExt( 100, 0, 30,\ 31.163 - "free_time_hist"); 31.164 - 31.165 - #define MEAS__Capture_Pre_Malloc_Point \ 31.166 - int32 startStamp, endStamp; \ 31.167 - saveLowTimeStampCountInto( startStamp ); 31.168 - 31.169 - #define MEAS__Capture_Post_Malloc_Point \ 31.170 - saveLowTimeStampCountInto( endStamp ); \ 31.171 - addIntervalToHist( startStamp, endStamp,\ 31.172 - _VMSMasterEnv->mallocTimeHist ); 31.173 - 31.174 - #define MEAS__Capture_Pre_Free_Point \ 31.175 - int32 startStamp, endStamp; \ 31.176 - saveLowTimeStampCountInto( startStamp ); 31.177 - 31.178 - #define MEAS__Capture_Post_Free_Point \ 31.179 - saveLowTimeStampCountInto( endStamp ); \ 31.180 - addIntervalToHist( startStamp, endStamp,\ 31.181 - _VMSMasterEnv->freeTimeHist ); 31.182 - 31.183 - #define MEAS__Print_Hists_for_Malloc_Meas \ 31.184 - printHist( _VMSMasterEnv->mallocTimeHist ); \ 31.185 - saveHistToFile( _VMSMasterEnv->mallocTimeHist ); \ 31.186 - printHist( _VMSMasterEnv->freeTimeHist ); \ 31.187 - saveHistToFile( _VMSMasterEnv->freeTimeHist ); \ 31.188 - freeHistExt( _VMSMasterEnv->mallocTimeHist 
); \ 31.189 - freeHistExt( _VMSMasterEnv->freeTimeHist ); 31.190 - 31.191 -#else 31.192 - #define MEAS__Insert_Malloc_Meas_Fields_into_MasterEnv 31.193 - #define MEAS__Make_Meas_Hists_for_Malloc_Meas 31.194 - #define MEAS__Capture_Pre_Malloc_Point 31.195 - #define MEAS__Capture_Post_Malloc_Point 31.196 - #define MEAS__Capture_Pre_Free_Point 31.197 - #define MEAS__Capture_Post_Free_Point 31.198 - #define MEAS__Print_Hists_for_Malloc_Meas 31.199 -#endif 31.200 - 31.201 - 31.202 - 31.203 -#ifdef MEAS__TURN_ON_PLUGIN_MEAS 31.204 - #define MEAS__Insert_Plugin_Meas_Fields_into_MasterEnv \ 31.205 - Histogram *reqHdlrLowTimeHist; \ 31.206 - Histogram *reqHdlrHighTimeHist; 31.207 - 31.208 - #define MEAS__Make_Meas_Hists_for_Plugin_Meas \ 31.209 - _VMSMasterEnv->reqHdlrLowTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 31.210 - "plugin_low_time_hist");\ 31.211 - _VMSMasterEnv->reqHdlrHighTimeHist = makeFixedBinHistExt( 100, 0, 200,\ 31.212 - "plugin_high_time_hist"); 31.213 - 31.214 - #define MEAS__startReqHdlr \ 31.215 - int32 startStamp1, endStamp1; \ 31.216 - saveLowTimeStampCountInto( startStamp1 ); 31.217 - 31.218 - #define MEAS__endReqHdlr \ 31.219 - saveLowTimeStampCountInto( endStamp1 ); \ 31.220 - addIntervalToHist( startStamp1, endStamp1, \ 31.221 - _VMSMasterEnv->reqHdlrLowTimeHist ); \ 31.222 - addIntervalToHist( startStamp1, endStamp1, \ 31.223 - _VMSMasterEnv->reqHdlrHighTimeHist ); 31.224 - 31.225 - #define MEAS__Print_Hists_for_Plugin_Meas \ 31.226 - printHist( _VMSMasterEnv->reqHdlrLowTimeHist ); \ 31.227 - saveHistToFile( _VMSMasterEnv->reqHdlrLowTimeHist ); \ 31.228 - printHist( _VMSMasterEnv->reqHdlrHighTimeHist ); \ 31.229 - saveHistToFile( _VMSMasterEnv->reqHdlrHighTimeHist ); \ 31.230 - freeHistExt( _VMSMasterEnv->reqHdlrLowTimeHist ); \ 31.231 - freeHistExt( _VMSMasterEnv->reqHdlrHighTimeHist ); 31.232 -#else 31.233 - #define MEAS__Insert_Plugin_Meas_Fields_into_MasterEnv 31.234 - #define MEAS__Make_Meas_Hists_for_Plugin_Meas 31.235 - #define 
MEAS__startReqHdlr 31.236 - #define MEAS__endReqHdlr 31.237 - #define MEAS__Print_Hists_for_Plugin_Meas 31.238 - 31.239 -#endif 31.240 - 31.241 - 31.242 -#ifdef MEAS__TURN_ON_SYSTEM_MEAS 31.243 - #define MEAS__Insert_System_Meas_Fields_into_Slave \ 31.244 - TSCountLowHigh startSusp; \ 31.245 - uint64 totalSuspCycles; \ 31.246 - uint32 numGoodSusp; 31.247 - 31.248 - #define MEAS__Insert_System_Meas_Fields_into_MasterEnv \ 31.249 - TSCountLowHigh startMaster; \ 31.250 - uint64 totalMasterCycles; \ 31.251 - uint32 numMasterAnimations; \ 31.252 - TSCountLowHigh startReqHdlr; \ 31.253 - uint64 totalPluginCycles; \ 31.254 - uint32 numPluginAnimations; \ 31.255 - uint64 cyclesTillStartAnimationMaster; \ 31.256 - TSCountLowHigh endAnimationMaster; 31.257 - 31.258 - #define MEAS__startAnimationMaster_forSys \ 31.259 - TSCountLowHigh startStamp1, endStamp1; \ 31.260 - saveTSCLowHigh( endStamp1 ); \ 31.261 - _VMSMasterEnv->cyclesTillStartAnimationMaster = \ 31.262 - endStamp1.longVal - masterVP->startSusp.longVal; 31.263 - 31.264 - #define Meas_startReqHdlr_forSys \ 31.265 - saveTSCLowHigh( startStamp1 ); \ 31.266 - _VMSMasterEnv->startReqHdlr.longVal = startStamp1.longVal; 31.267 - 31.268 - #define MEAS__endAnimationMaster_forSys \ 31.269 - saveTSCLowHigh( startStamp1 ); \ 31.270 - _VMSMasterEnv->endAnimationMaster.longVal = startStamp1.longVal; 31.271 - 31.272 - /*A TSC is stored in VP first thing inside wrapper-lib 31.273 - * Now, measures cycles from there to here 31.274 - * Master and Plugin will add this value to other trace-seg measures 31.275 - */ 31.276 - #define MEAS__Capture_End_Susp_in_CoreCtlr_ForSys\ 31.277 - saveTSCLowHigh(endSusp); \ 31.278 - numCycles = endSusp.longVal - currVP->startSusp.longVal; \ 31.279 - /*sanity check (400K is about 20K iters)*/ \ 31.280 - if( numCycles < 400000 ) \ 31.281 - { currVP->totalSuspCycles += numCycles; \ 31.282 - currVP->numGoodSusp++; \ 31.283 - } \ 31.284 - /*recorded every time, but only read if currVP == MasterVP*/ \ 
31.285 - _VMSMasterEnv->startMaster.longVal = endSusp.longVal; 31.286 - 31.287 -#else 31.288 - #define MEAS__Insert_System_Meas_Fields_into_Slave 31.289 - #define MEAS__Insert_System_Meas_Fields_into_MasterEnv 31.290 - #define MEAS__Make_Meas_Hists_for_System_Meas 31.291 - #define MEAS__startAnimationMaster_forSys 31.292 - #define MEAS__startReqHdlr_forSys 31.293 - #define MEAS__endAnimationMaster_forSys 31.294 - #define MEAS__Capture_End_Susp_in_CoreCtlr_ForSys 31.295 - #define MEAS__Print_Hists_for_System_Meas 31.296 -#endif 31.297 - 31.298 -#ifdef HOLISTIC__TURN_ON_PERF_COUNTERS 31.299 - 31.300 - #define MEAS__Insert_Counter_Handler \ 31.301 - typedef void (*CounterHandler) (int,int,int,SlaveVP*,uint64,uint64,uint64); 31.302 - 31.303 - enum eventType { 31.304 - DebugEvt = 0, 31.305 - AppResponderInvocation_start, 31.306 - AppResponder_start, 31.307 - AppResponder_end, 31.308 - AssignerInvocation_start, 31.309 - NextAssigner_start, 31.310 - Assigner_start, 31.311 - Assigner_end, 31.312 - Work_start, 31.313 - Work_end, 31.314 - HwResponderInvocation_start, 31.315 - Timestamp_start, 31.316 - Timestamp_end 31.317 - }; 31.318 - 31.319 - #define saveCyclesAndInstrs(core,cycles,instrs,cachem) do{ \ 31.320 - int cycles_fd = _VMSMasterEnv->cycles_counter_fd[core]; \ 31.321 - int instrs_fd = _VMSMasterEnv->instrs_counter_fd[core]; \ 31.322 - int cachem_fd = _VMSMasterEnv->cachem_counter_fd[core]; \ 31.323 - int nread; \ 31.324 - \ 31.325 - nread = read(cycles_fd,&(cycles),sizeof(cycles)); \ 31.326 - if(nread<0){ \ 31.327 - perror("Error reading cycles counter"); \ 31.328 - cycles = 0; \ 31.329 - } \ 31.330 - \ 31.331 - nread = read(instrs_fd,&(instrs),sizeof(instrs)); \ 31.332 - if(nread<0){ \ 31.333 - perror("Error reading cycles counter"); \ 31.334 - instrs = 0; \ 31.335 - } \ 31.336 - nread = read(cachem_fd,&(cachem),sizeof(cachem)); \ 31.337 - if(nread<0){ \ 31.338 - perror("Error reading last level cache miss counter"); \ 31.339 - cachem = 0; \ 31.340 - } \ 31.341 - 
} while (0) 31.342 - 31.343 - #define MEAS__Insert_Counter_Meas_Fields_into_MasterEnv \ 31.344 - int cycles_counter_fd[NUM_CORES]; \ 31.345 - int instrs_counter_fd[NUM_CORES]; \ 31.346 - int cachem_counter_fd[NUM_CORES]; \ 31.347 - uint64 start_master_lock[NUM_CORES][3]; \ 31.348 - CounterHandler counterHandler; 31.349 - 31.350 - #define HOLISTIC__Setup_Perf_Counters setup_perf_counters(); 31.351 - 31.352 - 31.353 - #define HOLISTIC__CoreCtrl_Setup \ 31.354 - CounterHandler counterHandler = _VMSMasterEnv->counterHandler; \ 31.355 - SlaveVP *lastVPBeforeMaster = NULL; \ 31.356 - /*if(thisCoresThdParams->coreNum == 0){ \ 31.357 - uint64 initval = tsc_offset_send(thisCoresThdParams,0); \ 31.358 - while(!coreCtlrThdParams[NUM_CORES - 2]->ret_tsc); \ 31.359 - } \ 31.360 - if(0 < (thisCoresThdParams->coreNum) && (thisCoresThdParams->coreNum) < (NUM_CORES - 1)){ \ 31.361 - ThdParams* sendCoresThdParams = coreCtlrThdParams[thisCoresThdParams->coreNum - 1]; \ 31.362 - int sndctr = tsc_offset_resp(sendCoresThdParams, 0); \ 31.363 - uint64 initval = tsc_offset_send(thisCoresThdParams,0); \ 31.364 - while(!coreCtlrThdParams[NUM_CORES - 2]->ret_tsc); \ 31.365 - } \ 31.366 - if(thisCoresThdParams->coreNum == (NUM_CORES - 1)){ \ 31.367 - ThdParams* sendCoresThdParams = coreCtlrThdParams[thisCoresThdParams->coreNum - 1]; \ 31.368 - int sndctr = tsc_offset_resp(sendCoresThdParams,0); \ 31.369 - }*/ 31.370 - 31.371 - 31.372 - #define HOLISTIC__Insert_Master_Global_Vars \ 31.373 - int vpid,task; \ 31.374 - CounterHandler counterHandler = masterEnv->counterHandler; 31.375 - 31.376 - #define HOLISTIC__Record_last_work lastVPBeforeMaster = currVP; 31.377 - 31.378 - #define HOLISTIC__Record_AppResponderInvocation_start \ 31.379 - uint64 cycles,instrs,cachem; \ 31.380 - saveCyclesAndInstrs(thisCoresIdx,cycles, instrs,cachem); \ 31.381 - if(lastVPBeforeMaster){ \ 31.382 - 
(*counterHandler)(AppResponderInvocation_start,lastVPBeforeMaster->slaveID,lastVPBeforeMaster->assignCount,lastVPBeforeMaster,cycles,instrs,cachem); \ 31.383 - lastVPBeforeMaster = NULL; \ 31.384 - } else { \ 31.385 - _VMSMasterEnv->start_master_lock[thisCoresIdx][0] = cycles; \ 31.386 - _VMSMasterEnv->start_master_lock[thisCoresIdx][1] = instrs; \ 31.387 - _VMSMasterEnv->start_master_lock[thisCoresIdx][2] = cachem; \ 31.388 - } 31.389 - 31.390 - /* Request Handler may call resume() on the VP, but we want to 31.391 - * account the whole interval to the same task. Therefore, need 31.392 - * to save task ID at the beginning. 31.393 - * 31.394 - * Using this value as "end of AppResponder Invocation Time" 31.395 - * is possible if there is only one SchedSlot per core - 31.396 - * invoking processor is last to be treated here! If more than 31.397 - * one slot, MasterLoop processing time for all but the last VP 31.398 - * would be erroneously counted as invocation time. 31.399 - */ 31.400 - #define HOLISTIC__Record_AppResponder_start \ 31.401 - vpid = currSlot->slaveAssignedToSlot->slaveID; \ 31.402 - task = currSlot->slaveAssignedToSlot->assignCount; \ 31.403 - uint64 cycles, instrs, cachem; \ 31.404 - saveCyclesAndInstrs(thisCoresIdx,cycles, instrs,cachem); \ 31.405 - (*counterHandler)(AppResponder_start,vpid,task,currSlot->slaveAssignedToSlot,cycles,instrs,cachem); 31.406 - 31.407 - #define HOLISTIC__Record_AppResponder_end \ 31.408 - uint64 cycles2,instrs2,cachem2; \ 31.409 - saveCyclesAndInstrs(thisCoresIdx,cycles2, instrs2,cachem2); \ 31.410 - (*counterHandler)(AppResponder_end,vpid,task,currSlot->slaveAssignedToSlot,cycles2,instrs2,cachem2); \ 31.411 - (*counterHandler)(Timestamp_end,vpid,task,currSlot->slaveAssignedToSlot,rdtsc(),0,0); 31.412 - 31.413 - 31.414 - /* Don't know who to account time to yet - goes to assigned VP 31.415 - * after the call. 
31.416 - */ 31.417 - #define HOLISTIC__Record_Assigner_start \ 31.418 - int empty = FALSE; \ 31.419 - if(currSlot->slaveAssignedToSlot == NULL){ \ 31.420 - empty= TRUE; \ 31.421 - } \ 31.422 - uint64 tmp_cycles, tmp_instrs, tmp_cachem; \ 31.423 - saveCyclesAndInstrs(thisCoresIdx,tmp_cycles,tmp_instrs,tmp_cachem); \ 31.424 - uint64 tsc = rdtsc(); \ 31.425 - if(vpid > 0) { \ 31.426 - (*counterHandler)(NextAssigner_start,vpid,task,currSlot->slaveAssignedToSlot,tmp_cycles,tmp_instrs,tmp_cachem); \ 31.427 - vpid = 0; \ 31.428 - task = 0; \ 31.429 - } 31.430 - 31.431 - #define HOLISTIC__Record_Assigner_end \ 31.432 - uint64 cycles,instrs,cachem; \ 31.433 - saveCyclesAndInstrs(thisCoresIdx,cycles,instrs,cachem); \ 31.434 - if(empty){ \ 31.435 - (*counterHandler)(AssignerInvocation_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,masterEnv->start_master_lock[thisCoresIdx][0],masterEnv->start_master_lock[thisCoresIdx][1],masterEnv->start_master_lock[thisCoresIdx][2]); \ 31.436 - } \ 31.437 - (*counterHandler)(Timestamp_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,tsc,0,0); \ 31.438 - (*counterHandler)(Assigner_start,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,tmp_cycles,tmp_instrs,tmp_cachem); \ 31.439 - (*counterHandler)(Assigner_end,assignedSlaveVP->slaveID,assignedSlaveVP->assignCount,assignedSlaveVP,cycles,instrs,tmp_cachem); 31.440 - 31.441 - #define HOLISTIC__Record_Work_start \ 31.442 - if(currVP){ \ 31.443 - uint64 cycles,instrs,cachem; \ 31.444 - saveCyclesAndInstrs(thisCoresIdx,cycles, instrs,cachem); \ 31.445 - (*counterHandler)(Work_start,currVP->slaveID,currVP->assignCount,currVP,cycles,instrs,cachem); \ 31.446 - } 31.447 - 31.448 - #define HOLISTIC__Record_Work_end \ 31.449 - if(currVP){ \ 31.450 - uint64 cycles,instrs,cachem; \ 31.451 - saveCyclesAndInstrs(thisCoresIdx,cycles, instrs,cachem); \ 31.452 - 
(*counterHandler)(Work_end,currVP->slaveID,currVP->assignCount,currVP,cycles,instrs,cachem); \ 31.453 - } 31.454 - 31.455 - #define HOLISTIC__Record_HwResponderInvocation_start \ 31.456 - uint64 cycles,instrs,cachem; \ 31.457 - saveCyclesAndInstrs(animatingSlv->coreAnimatedBy,cycles, instrs,cachem); \ 31.458 - (*(_VMSMasterEnv->counterHandler))(HwResponderInvocation_start,animatingSlv->slaveID,animatingSlv->assignCount,animatingSlv,cycles,instrs,cachem); 31.459 - 31.460 - 31.461 - #define getReturnAddressBeforeLibraryCall(vp_ptr, res_ptr) do{ \ 31.462 -void* frame_ptr0 = vp_ptr->framePtr; \ 31.463 -void* frame_ptr1 = *((void**)frame_ptr0); \ 31.464 -void* frame_ptr2 = *((void**)frame_ptr1); \ 31.465 -void* frame_ptr3 = *((void**)frame_ptr2); \ 31.466 -void* ret_addr = *((void**)frame_ptr3 + 1); \ 31.467 -*res_ptr = ret_addr; \ 31.468 -} while (0) 31.469 - 31.470 -#else 31.471 - #define MEAS__Insert_Counter_Handler 31.472 - #define MEAS__Insert_Counter_Meas_Fields_into_MasterEnv 31.473 - #define HOLISTIC__Setup_Perf_Counters 31.474 - #define HOLISTIC__CoreCtrl_Setup 31.475 - #define HOLISTIC__Insert_Master_Global_Vars 31.476 - #define HOLISTIC__Record_last_work 31.477 - #define HOLISTIC__Record_AppResponderInvocation_start 31.478 - #define HOLISTIC__Record_AppResponder_start 31.479 - #define HOLISTIC__Record_AppResponder_end 31.480 - #define HOLISTIC__Record_Assigner_start 31.481 - #define HOLISTIC__Record_Assigner_end 31.482 - #define HOLISTIC__Record_Work_start 31.483 - #define HOLISTIC__Record_Work_end 31.484 - #define HOLISTIC__Record_HwResponderInvocation_start 31.485 - #define getReturnAddressBeforeLibraryCall(vp_ptr, res_ptr) 31.486 -#endif 31.487 - 31.488 -//Experiment in two-step macros -- if doesn't work, insert each separately 31.489 -#define MEAS__Insert_Meas_Fields_into_Slave \ 31.490 - MEAS__Insert_Susp_Meas_Fields_into_Slave \ 31.491 - MEAS__Insert_Master_Meas_Fields_into_Slave \ 31.492 - MEAS__Insert_System_Meas_Fields_into_Slave 31.493 - 31.494 - 
31.495 -//====================== Histogram Macros -- Create ======================== 31.496 -// 31.497 -// 31.498 - 31.499 -//The language implementation should include a definition of this macro, 31.500 -// which creates all the histograms the language uses to collect measurements 31.501 -// of plugin operation -- so, if the language didn't define it, must 31.502 -// define it here (as empty), to avoid compile error 31.503 -#ifndef MEAS__Make_Meas_Hists_for_Language 31.504 -#define MEAS__Make_Meas_Hists_for_Language 31.505 -#endif 31.506 - 31.507 -#define makeAMeasHist( idx, name, numBins, startVal, binWidth ) \ 31.508 - makeHighestDynArrayIndexBeAtLeast( _VMSMasterEnv->measHistsInfo, idx ); \ 31.509 - _VMSMasterEnv->measHists[idx] = \ 31.510 - makeFixedBinHist( numBins, startVal, binWidth, name ); 31.511 - 31.512 -//============================== Probes =================================== 31.513 - 31.514 - 31.515 -//=========================================================================== 31.516 -#endif /* _VMS_DEFS_MEAS_H */ 31.517 -
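Note on the removed MEAS__macros.h: every capture macro has an empty twin in the matching #else branch, so call sites compile to nothing when a measurement switch is off. The "pre" macros also declare local variables, which is why the header's own comment says to use only one pair per set of curly braces. The sketch below illustrates that pattern in isolation. It is not the VMS/PR code: SketchHist, sketch_read_stamp, sketch_add_interval and the SKETCH_ macros are hypothetical names, and a fake counter stands in for the time-stamp counter so the result is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MEAS_ON 1   /* set to 0 to compile every probe away */

/* Hypothetical fixed-bin histogram, standing in for the real Histogram type. */
typedef struct {
    uint32_t startVal;     /* lowest value that lands in bins[0] */
    uint32_t binWidth;     /* width of each bin                  */
    uint32_t bins[4];
    uint32_t outOfRange;   /* samples that fit in no bin         */
} SketchHist;

/* Fake counter used instead of a hardware time stamp (assumption). */
static uint32_t sketch_clock = 0;
static uint32_t sketch_read_stamp(void) { return sketch_clock; }

static void sketch_add_interval(uint32_t start, uint32_t end, SketchHist *h)
{
    assert(h != NULL && h->binWidth > 0);
    /* Unsigned subtraction stays correct across one wrap of the low 32 bits. */
    uint32_t interval = end - start;
    if (interval < h->startVal) { h->outOfRange++; return; }
    uint32_t idx = (interval - h->startVal) / h->binWidth;
    if (idx >= 4) h->outOfRange++;
    else          h->bins[idx]++;
}

#if SKETCH_MEAS_ON
  /* Declares a local, so use at most one pair per brace scope. */
  #define SKETCH_CAPTURE_PRE \
      uint32_t sketchStart = sketch_read_stamp();
  #define SKETCH_CAPTURE_POST(hist) \
      sketch_add_interval(sketchStart, sketch_read_stamp(), (hist));
#else
  #define SKETCH_CAPTURE_PRE
  #define SKETCH_CAPTURE_POST(hist)
#endif

/* Measures one fake 25-tick region and returns the resulting histogram. */
static SketchHist sketch_demo(void)
{
    SketchHist h = { 0, 10, {0, 0, 0, 0}, 0 };
    {
        SKETCH_CAPTURE_PRE
        sketch_clock += 25;          /* stands in for the measured work */
        SKETCH_CAPTURE_POST(&h)
    }
    return h;
}
```

With the switch on, the 25-tick interval is counted in the third bin (values 20 to 29). With the switch set to 0, the two macro lines vanish and the histogram stays empty.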
32.1 --- a/Services_Offered_by_VMS/Measurement_and_Stats/probes.c Mon Sep 03 03:34:54 2012 -0700 32.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 32.3 @@ -1,304 +0,0 @@ 32.4 -/* 32.5 - * Copyright 2010 OpenSourceStewardshipFoundation 32.6 - * 32.7 - * Licensed under BSD 32.8 - */ 32.9 - 32.10 -#include <stdio.h> 32.11 -#include <malloc.h> 32.12 -#include <sys/time.h> 32.13 - 32.14 -#include "VMS_impl/VMS.h" 32.15 - 32.16 - 32.17 - 32.18 -//==================== Probes ================= 32.19 -/* 32.20 - * In practice, probe operations are called from the app, from inside slaves 32.21 - * -- so have to be sure each probe is single-Slv owned, and be sure that 32.22 - * any place common structures are modified it's done inside the master. 32.23 - * So -- the only place common structures are modified is during creation. 32.24 - * after that, all mods are to individual instances. 32.25 - * 32.26 - * Thniking perhaps should change the semantics to be that probes are 32.27 - * attached to the virtual processor -- and then everything is guaranteed 32.28 - * to be isolated -- except then can't take any intervals that span Slvs, 32.29 - * and would have to transfer the probes to Master env when Slv dissipates.. 32.30 - * gets messy.. 
32.31 - * 32.32 - * For now, just making so that probe creation causes a suspend, so that 32.33 - * the dynamic array in the master env is only modified from the master 32.34 - * 32.35 - */ 32.36 - 32.37 -//============================ Helpers =========================== 32.38 -inline void 32.39 -doNothing() 32.40 - { 32.41 - } 32.42 - 32.43 -float64 inline 32.44 -giveInterval( struct timeval _start, struct timeval _end ) 32.45 - { float64 start, end; 32.46 - start = _start.tv_sec + _start.tv_usec / 1000000.0; 32.47 - end = _end.tv_sec + _end.tv_usec / 1000000.0; 32.48 - return end - start; 32.49 - } 32.50 - 32.51 -//================================================================= 32.52 -IntervalProbe * 32.53 -create_generic_probe( char *nameStr, SlaveVP *animSlv ) 32.54 - { 32.55 - VMSSemReq reqData; 32.56 - 32.57 - reqData.reqType = make_probe; 32.58 - reqData.nameStr = nameStr; 32.59 - 32.60 - VMS_WL__send_VMSSem_request( &reqData, animSlv ); 32.61 - 32.62 - return animSlv->dataRetFromReq; 32.63 - } 32.64 - 32.65 -/*Use this version from outside VMS -- it uses external malloc, and modifies 32.66 - * dynamic array, so can't be animated in a slave Slv 32.67 - */ 32.68 -IntervalProbe * 32.69 -ext__create_generic_probe( char *nameStr ) 32.70 - { IntervalProbe *newProbe; 32.71 - int32 nameLen; 32.72 - 32.73 - newProbe = malloc( sizeof(IntervalProbe) ); 32.74 - nameLen = strlen( nameStr ); 32.75 - newProbe->nameStr = malloc( nameLen ); 32.76 - memcpy( newProbe->nameStr, nameStr, nameLen ); 32.77 - newProbe->hist = NULL; 32.78 - newProbe->schedChoiceWasRecorded = FALSE; 32.79 - newProbe->probeID = 32.80 - addToDynArray( newProbe, _VMSMasterEnv->dynIntervalProbesInfo ); 32.81 - 32.82 - return newProbe; 32.83 - } 32.84 - 32.85 -//============================ Fns def in header ======================= 32.86 - 32.87 -int32 32.88 -VMS_impl__create_single_interval_probe( char *nameStr, SlaveVP *animSlv ) 32.89 - { IntervalProbe *newProbe; 32.90 - 32.91 - newProbe = 
create_generic_probe( nameStr, animSlv ); 32.92 - 32.93 - return newProbe->probeID; 32.94 - } 32.95 - 32.96 -int32 32.97 -VMS_impl__create_histogram_probe( int32 numBins, float64 startValue, 32.98 - float64 binWidth, char *nameStr, SlaveVP *animSlv ) 32.99 - { IntervalProbe *newProbe; 32.100 - 32.101 - newProbe = create_generic_probe( nameStr, animSlv ); 32.102 - 32.103 -#ifdef PROBES__USE_TIME_OF_DAY_PROBES 32.104 - DblHist *hist; 32.105 - hist = makeDblHistogram( numBins, startValue, binWidth ); 32.106 -#else 32.107 - Histogram *hist; 32.108 - hist = makeHistogram( numBins, startValue, binWidth ); 32.109 -#endif 32.110 - newProbe->hist = hist; 32.111 - return newProbe->probeID; 32.112 - } 32.113 - 32.114 - 32.115 -int32 32.116 -VMS_impl__record_time_point_into_new_probe( char *nameStr, SlaveVP *animSlv) 32.117 - { IntervalProbe *newProbe; 32.118 - struct timeval *startStamp; 32.119 - float64 startSecs; 32.120 - 32.121 - newProbe = create_generic_probe( nameStr, animSlv ); 32.122 - newProbe->endSecs = 0; 32.123 - 32.124 - 32.125 - gettimeofday( &(newProbe->startStamp), NULL); 32.126 - 32.127 - //turn into a double 32.128 - startStamp = &(newProbe->startStamp); 32.129 - startSecs = startStamp->tv_sec + ( startStamp->tv_usec / 1000000.0 ); 32.130 - newProbe->startSecs = startSecs; 32.131 - 32.132 - return newProbe->probeID; 32.133 - } 32.134 - 32.135 -int32 32.136 -VMS_ext_impl__record_time_point_into_new_probe( char *nameStr ) 32.137 - { IntervalProbe *newProbe; 32.138 - struct timeval *startStamp; 32.139 - float64 startSecs; 32.140 - 32.141 - newProbe = ext__create_generic_probe( nameStr ); 32.142 - newProbe->endSecs = 0; 32.143 - 32.144 - gettimeofday( &(newProbe->startStamp), NULL); 32.145 - 32.146 - //turn into a double 32.147 - startStamp = &(newProbe->startStamp); 32.148 - startSecs = startStamp->tv_sec + ( startStamp->tv_usec / 1000000.0 ); 32.149 - newProbe->startSecs = startSecs; 32.150 - 32.151 - return newProbe->probeID; 32.152 - } 32.153 - 32.154 - 
32.155 -/*Only call from inside master or main startup/shutdown thread 32.156 - */ 32.157 -void 32.158 -VMS_impl__free_probe( IntervalProbe *probe ) 32.159 - { if( probe->hist != NULL ) freeDblHist( probe->hist ); 32.160 - if( probe->nameStr != NULL) VMS_int__free( probe->nameStr ); 32.161 - VMS_int__free( probe ); 32.162 - } 32.163 - 32.164 - 32.165 -void 32.166 -VMS_impl__index_probe_by_its_name( int32 probeID, SlaveVP *animSlv ) 32.167 - { IntervalProbe *probe; 32.168 - 32.169 - VMS_int__get_master_lock(); 32.170 - probe = _VMSMasterEnv->intervalProbes[ probeID ]; 32.171 - 32.172 - addValueIntoTable(probe->nameStr, probe, _VMSMasterEnv->probeNameHashTbl); 32.173 - VMS_int__release_master_lock(); 32.174 - } 32.175 - 32.176 - 32.177 -IntervalProbe * 32.178 -VMS_impl__get_probe_by_name( char *probeName, SlaveVP *animSlv ) 32.179 - { 32.180 - //TODO: fix this To be in Master -- race condition 32.181 - return getValueFromTable( probeName, _VMSMasterEnv->probeNameHashTbl ); 32.182 - } 32.183 - 32.184 - 32.185 -/*Everything is local to the animating slaveVP, so no need for request, do 32.186 - * work locally, in the anim Slv 32.187 - */ 32.188 -void 32.189 -VMS_impl__record_sched_choice_into_probe( int32 probeID, SlaveVP *animatingSlv ) 32.190 - { IntervalProbe *probe; 32.191 - 32.192 - probe = _VMSMasterEnv->intervalProbes[ probeID ]; 32.193 - probe->schedChoiceWasRecorded = TRUE; 32.194 - probe->coreNum = animatingSlv->coreAnimatedBy; 32.195 - probe->slaveID = animatingSlv->slaveID; 32.196 - probe->slaveCreateSecs = animatingSlv->createPtInSecs; 32.197 - } 32.198 - 32.199 -/*Everything is local to the animating slaveVP, so no need for request, do 32.200 - * work locally, in the anim Slv 32.201 - */ 32.202 -void 32.203 -VMS_impl__record_interval_start_in_probe( int32 probeID ) 32.204 - { IntervalProbe *probe; 32.205 - 32.206 - DEBUG__printf( dbgProbes, "record start of interval" ) 32.207 - probe = _VMSMasterEnv->intervalProbes[ probeID ]; 32.208 - 32.209 - //record 
*start* point as last thing, after lookup 32.210 -#ifdef PROBES__USE_TIME_OF_DAY_PROBES 32.211 - gettimeofday( &(probe->startStamp), NULL); 32.212 -#endif 32.213 -#ifdef PROBES__USE_TSC_PROBES 32.214 - probe->startStamp = getTSCount(); 32.215 -#endif 32.216 - } 32.217 - 32.218 - 32.219 -/*Everything is local to the animating slaveVP, except the histogram, so do 32.220 - * work locally, in the anim Slv -- may lose a few histogram counts 32.221 - * 32.222 - *This should be safe to run inside SlaveVP 32.223 - */ 32.224 -void 32.225 -VMS_impl__record_interval_end_in_probe( int32 probeID ) 32.226 - { IntervalProbe *probe; 32.227 - 32.228 - //Record first thing -- before looking up the probe to store it into 32.229 -#ifdef PROBES__USE_TIME_OF_DAY_PROBES 32.230 - struct timeval endStamp; 32.231 - gettimeofday( &(endStamp), NULL); 32.232 -#endif 32.233 -#ifdef PROBES__USE_TSC_PROBES 32.234 - TSCount endStamp, interval; 32.235 - endStamp = getTSCount(); 32.236 -#endif 32.237 -#ifdef PROBES__USE_PERF_CTR_PROBES 32.238 - 32.239 -#endif 32.240 - 32.241 - probe = _VMSMasterEnv->intervalProbes[ probeID ]; 32.242 - 32.243 -#ifdef PROBES__USE_TIME_OF_DAY_PROBES 32.244 - if( probe->hist != NULL ) 32.245 - { addToDblHist( giveInterval( probe->startStamp, endStamp), probe->hist ); 32.246 - } 32.247 -#endif 32.248 -#ifdef PROBES__USE_TSC_PROBES 32.249 - if( probe->hist != NULL ) 32.250 - { interval = probe->endStamp - probe->startStamp; 32.251 - //Sanity check for TSC counter overflow: if sane, add to histogram 32.252 - if( interval < probe->hist->endOfRange * 10 ) 32.253 - addToHist( interval, probe->hist ); 32.254 - } 32.255 -#endif 32.256 -#ifdef PROBES__USE_PERF_CTR_PROBES 32.257 - 32.258 -#endif 32.259 - 32.260 - DEBUG__printf( dbgProbes, "record end of interval" ) 32.261 - } 32.262 - 32.263 - 32.264 -void 32.265 -print_probe_helper( IntervalProbe *probe ) 32.266 - { 32.267 - printf( "\nprobe: %s, ", probe->nameStr ); 32.268 - 32.269 - 32.270 - if( probe->schedChoiceWasRecorded ) 
32.271 - { printf( "coreNum: %d, slaveID: %d, slaveVPCreated: %0.6f | ", 32.272 - probe->coreNum, probe->slaveID, probe->slaveCreateSecs ); 32.273 - } 32.274 - 32.275 - if( probe->endSecs == 0 ) //just a single point in time 32.276 - { 32.277 - printf( " time point: %.6f\n", 32.278 - probe->startSecs - _VMSMasterEnv->createPtInSecs ); 32.279 - } 32.280 - else if( probe->hist == NULL ) //just an interval 32.281 - { 32.282 - printf( " startSecs: %.6f interval: %.6f\n", 32.283 - (probe->startSecs - _VMSMasterEnv->createPtInSecs), probe->interval); 32.284 - } 32.285 - else //a full histogram of intervals 32.286 - { 32.287 - printDblHist( probe->hist ); 32.288 - } 32.289 - } 32.290 - 32.291 -void 32.292 -VMS_impl__print_stats_of_probe( IntervalProbe *probe ) 32.293 - { 32.294 - 32.295 -// probe = _VMSMasterEnv->intervalProbes[ probeID ]; 32.296 - 32.297 - print_probe_helper( probe ); 32.298 - } 32.299 - 32.300 - 32.301 -void 32.302 -VMS_impl__print_stats_of_all_probes() 32.303 - { 32.304 - forAllInDynArrayDo( _VMSMasterEnv->dynIntervalProbesInfo, 32.305 - (DynArrayFnPtr) &VMS_impl__print_stats_of_probe ); 32.306 - fflush( stdout ); 32.307 - }
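Note on the removed probes.c: ext__create_generic_probe allocates strlen( nameStr ) bytes and copies the same number, so the stored name has no terminating '\0', yet print_probe_helper later prints it with %s. The sketch below shows one way the copy could be made safe, next to a stand-alone version of the timeval-to-seconds interval helper. Both function names are hypothetical, plain malloc is assumed instead of the VMS allocator, and nothing here is taken from the replacement PR sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical helper: duplicate a probe name, including the trailing '\0'.
 * Returns NULL if allocation fails; the caller frees the result. */
static char *sketch_copy_name(const char *nameStr)
{
    assert(nameStr != NULL);
    size_t len = strlen(nameStr) + 1;   /* +1 keeps the terminator */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, nameStr, len);
    return copy;
}

/* Same idea as giveInterval in the diff: convert both stamps to seconds
 * as doubles, then subtract. */
static double sketch_give_interval(struct timeval start, struct timeval end)
{
    double s = (double)start.tv_sec + (double)start.tv_usec / 1000000.0;
    double e = (double)end.tv_sec   + (double)end.tv_usec   / 1000000.0;
    return e - s;
}
```

Copying strlen + 1 bytes is the smallest change that makes the stored name printable; strdup would do the same job where it is available.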
33.1 --- a/Services_Offered_by_VMS/Measurement_and_Stats/probes.h Mon Sep 03 03:34:54 2012 -0700 33.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 33.3 @@ -1,192 +0,0 @@ 33.4 -/* 33.5 - * Copyright 2009 OpenSourceStewardshipFoundation.org 33.6 - * Licensed under GNU General Public License version 2 33.7 - * 33.8 - * Author: seanhalle@yahoo.com 33.9 - * 33.10 - */ 33.11 - 33.12 -#ifndef _PROBES_H 33.13 -#define _PROBES_H 33.14 -#define _GNU_SOURCE 33.15 - 33.16 -#include "VMS_impl/VMS_primitive_data_types.h" 33.17 - 33.18 -#include <sys/time.h> 33.19 - 33.20 -/*Note on order of include files: 33.21 - * This file relies on #defines that appear in other files, which must come 33.22 - * first in the #include sequence.. 33.23 - */ 33.24 - 33.25 -/*Use these aliases in application code*/ 33.26 -#define VMS_App__record_time_point_into_new_probe VMS_WL__record_time_point_into_new_probe 33.27 -#define VMS_App__create_single_interval_probe VMS_WL__create_single_interval_probe 33.28 -#define VMS_App__create_histogram_probe VMS_WL__create_histogram_probe 33.29 -#define VMS_App__index_probe_by_its_name VMS_WL__index_probe_by_its_name 33.30 -#define VMS_App__get_probe_by_name VMS_WL__get_probe_by_name 33.31 -#define VMS_App__record_sched_choice_into_probe VMS_WL__record_sched_choice_into_probe 33.32 -#define VMS_App__record_interval_start_in_probe VMS_WL__record_interval_start_in_probe 33.33 -#define VMS_App__record_interval_end_in_probe VMS_WL__record_interval_end_in_probe 33.34 -#define VMS_App__print_stats_of_probe VMS_WL__print_stats_of_probe 33.35 -#define VMS_App__print_stats_of_all_probes VMS_WL__print_stats_of_all_probes 33.36 - 33.37 - 33.38 -//========================== 33.39 -#ifdef PROBES__USE_TSC_PROBES 33.40 - #define PROBES__Insert_timestamps_and_intervals_into_probe_struct \ 33.41 - TSCount startStamp; \ 33.42 - TSCount endStamp; \ 33.43 - TSCount interval; \ 33.44 - Histogram *hist; /*if left NULL, then is single interval probe*/ 33.45 -#endif 33.46 -#ifdef 
PROBES__USE_TIME_OF_DAY_PROBES 33.47 - #define PROBES__Insert_timestamps_and_intervals_into_probe_struct \ 33.48 - struct timeval startStamp; \ 33.49 - struct timeval endStamp; \ 33.50 - float64 startSecs; \ 33.51 - float64 endSecs; \ 33.52 - float64 interval; \ 33.53 - DblHist *hist; /*if NULL, then is single interval probe*/ 33.54 -#endif 33.55 -#ifdef PROBES__USE_PERF_CTR_PROBES 33.56 - #define PROBES__Insert_timestamps_and_intervals_into_probe_struct \ 33.57 - int64 startStamp; \ 33.58 - int64 endStamp; \ 33.59 - int64 interval; \ 33.60 - Histogram *hist; /*if left NULL, then is single interval probe*/ 33.61 -#endif 33.62 - 33.63 -//typedef struct _IntervalProbe IntervalProbe; -- is in VMS.h 33.64 -struct _IntervalProbe 33.65 - { 33.66 - char *nameStr; 33.67 - int32 probeID; 33.68 - 33.69 - int32 schedChoiceWasRecorded; 33.70 - int32 coreNum; 33.71 - int32 slaveID; 33.72 - float64 slaveCreateSecs; 33.73 - PROBES__Insert_timestamps_and_intervals_into_probe_struct; 33.74 - }; 33.75 - 33.76 -//=========================== NEVER USE THESE ========================== 33.77 -/*NEVER use these in any code!! These are here only for use in the macros 33.78 - * defined in this file!! 
33.79 - */ 33.80 -int32 33.81 -VMS_impl__create_single_interval_probe( char *nameStr, SlaveVP *animSlv ); 33.82 - 33.83 -int32 33.84 -VMS_impl__create_histogram_probe( int32 numBins, float64 startValue, 33.85 - float64 binWidth, char *nameStr, SlaveVP *animSlv ); 33.86 - 33.87 -int32 33.88 -VMS_impl__record_time_point_into_new_probe( char *nameStr, SlaveVP *animSlv); 33.89 - 33.90 -int32 33.91 -VMS_ext_impl__record_time_point_into_new_probe( char *nameStr ); 33.92 - 33.93 -void 33.94 -VMS_impl__free_probe( IntervalProbe *probe ); 33.95 - 33.96 -void 33.97 -VMS_impl__index_probe_by_its_name( int32 probeID, SlaveVP *animSlv ); 33.98 - 33.99 -IntervalProbe * 33.100 -VMS_impl__get_probe_by_name( char *probeName, SlaveVP *animSlv ); 33.101 - 33.102 -void 33.103 -VMS_impl__record_sched_choice_into_probe( int32 probeID, SlaveVP *animSlv ); 33.104 - 33.105 -void 33.106 -VMS_impl__record_interval_start_in_probe( int32 probeID ); 33.107 - 33.108 -void 33.109 -VMS_impl__record_interval_end_in_probe( int32 probeID ); 33.110 - 33.111 -void 33.112 -VMS_impl__print_stats_of_probe( IntervalProbe *probe ); 33.113 - 33.114 -void 33.115 -VMS_impl__print_stats_of_all_probes(); 33.116 - 33.117 - 33.118 -//======================== Probes ============================= 33.119 -// 33.120 -// Use macros to allow turning probes off with a #define switch 33.121 -// This means probes have zero impact on performance when off 33.122 -//============================================================= 33.123 - 33.124 -#ifdef PROBES__TURN_ON_STATS_PROBES 33.125 - 33.126 - #define PROBES__Create_Probe_Bookkeeping_Vars \ 33.127 - _VMSMasterEnv->dynIntervalProbesInfo = \ 33.128 - makePrivDynArrayOfSize( (void***)&(_VMSMasterEnv->intervalProbes), 200); \ 33.129 - \ 33.130 - _VMSMasterEnv->probeNameHashTbl = makeHashTable( 1000, &VMS_int__free ); \ 33.131 - \ 33.132 - /*put creation time directly into master env, for fast retrieval*/ \ 33.133 - struct timeval timeStamp; \ 33.134 - gettimeofday( 
&(timeStamp), NULL); \ 33.135 - _VMSMasterEnv->createPtInSecs = \ 33.136 - timeStamp.tv_sec +(timeStamp.tv_usec/1000000.0); 33.137 - 33.138 - #define VMS_WL__record_time_point_into_new_probe( nameStr, animSlv ) \ 33.139 - VMS_impl__record_time_point_in_new_probe( nameStr, animSlv ) 33.140 - 33.141 - #define VMS_ext__record_time_point_into_new_probe( nameStr ) \ 33.142 - VMS_ext_impl__record_time_point_into_new_probe( nameStr ) 33.143 - 33.144 - #define VMS_WL__create_single_interval_probe( nameStr, animSlv ) \ 33.145 - VMS_impl__create_single_interval_probe( nameStr, animSlv ) 33.146 - 33.147 - #define VMS_WL__create_histogram_probe( numBins, startValue, \ 33.148 - binWidth, nameStr, animSlv ) \ 33.149 - VMS_impl__create_histogram_probe( numBins, startValue, \ 33.150 - binWidth, nameStr, animSlv ) 33.151 - #define VMS_int__free_probe( probe ) \ 33.152 - VMS_impl__free_probe( probe ) 33.153 - 33.154 - #define VMS_WL__index_probe_by_its_name( probeID, animSlv ) \ 33.155 - VMS_impl__index_probe_by_its_name( probeID, animSlv ) 33.156 - 33.157 - #define VMS_WL__get_probe_by_name( probeID, animSlv ) \ 33.158 - VMS_impl__get_probe_by_name( probeName, animSlv ) 33.159 - 33.160 - #define VMS_WL__record_sched_choice_into_probe( probeID, animSlv ) \ 33.161 - VMS_impl__record_sched_choice_into_probe( probeID, animSlv ) 33.162 - 33.163 - #define VMS_WL__record_interval_start_in_probe( probeID ) \ 33.164 - VMS_impl__record_interval_start_in_probe( probeID ) 33.165 - 33.166 - #define VMS_WL__record_interval_end_in_probe( probeID ) \ 33.167 - VMS_impl__record_interval_end_in_probe( probeID ) 33.168 - 33.169 - #define VMS_WL__print_stats_of_probe( probeID ) \ 33.170 - VMS_impl__print_stats_of_probe( probeID ) 33.171 - 33.172 - #define VMS_WL__print_stats_of_all_probes() \ 33.173 - VMS_impl__print_stats_of_all_probes() 33.174 - 33.175 - 33.176 -#else 33.177 - #define PROBES__Create_Probe_Bookkeeping_Vars 33.178 - #define VMS_WL__record_time_point_into_new_probe( nameStr, animSlv ) 0 
/* do nothing */ 33.179 - #define VMS_ext__record_time_point_into_new_probe( nameStr ) 0 /* do nothing */ 33.180 - #define VMS_WL__create_single_interval_probe( nameStr, animSlv ) 0 /* do nothing */ 33.181 - #define VMS_WL__create_histogram_probe( numBins, startValue, \ 33.182 - binWidth, nameStr, animSlv ) \ 33.183 - 0 /* do nothing */ 33.184 - #define VMS_WL__index_probe_by_its_name( probeID, animSlv ) /* do nothing */ 33.185 - #define VMS_WL__get_probe_by_name( probeID, animSlv ) NULL /* do nothing */ 33.186 - #define VMS_WL__record_sched_choice_into_probe( probeID, animSlv ) /* do nothing */ 33.187 - #define VMS_WL__record_interval_start_in_probe( probeID ) /* do nothing */ 33.188 - #define VMS_WL__record_interval_end_in_probe( probeID ) /* do nothing */ 33.189 - #define VMS_WL__print_stats_of_probe( probeID ) ; /* do nothing */ 33.190 - #define VMS_WL__print_stats_of_all_probes() ;/* do nothing */ 33.191 - 33.192 -#endif /* defined PROBES__TURN_ON_STATS_PROBES */ 33.193 - 33.194 -#endif /* _PROBES_H */ 33.195 -
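The removed probes.h switches every probe call on or off at compile time. With PROBES__TURN_ON_STATS_PROBES defined, each VMS_WL__ macro forwards to its VMS_impl__ function; otherwise it expands to nothing, `0`, or `NULL`, so a disabled probe costs nothing at run time. A minimal sketch of that pattern is given here. Every DEMO__/demo_ name is a hypothetical stand-in and is not part of VMS or PR.

```c
#include <assert.h>

/* Hypothetical switch, in the role of PROBES__TURN_ON_STATS_PROBES. */
#define DEMO__TURN_ON_PROBES

/* Counts how many probe calls actually reached the implementation. */
static int demo_probe_hits = 0;

/* Hypothetical implementation function, in the role of a VMS_impl__ call. */
static void demo_impl__record_start( int probeID )
{
    assert( probeID >= 0 );
    demo_probe_hits += 1;
}

#ifdef DEMO__TURN_ON_PROBES
  /* Probes on: the macro forwards its argument to the implementation. */
  #define DEMO__record_start( probeID )  demo_impl__record_start( probeID )
#else
  /* Probes off: the call site compiles to an empty expression. */
  #define DEMO__record_start( probeID )  ((void)0)
#endif

/* Records one probe start per interval and reports how many were recorded. */
static int demo_run( int numIntervals )
{
    demo_probe_hits = 0;
    for( int i = 0; i < numIntervals; i++ )
        DEMO__record_start( i );
    return demo_probe_hits;
}
```

The disabled form here expands to `((void)0)` so the macro stays usable wherever an expression statement is expected; that is a choice of this sketch, not of the header. A forwarding macro also has to use its own parameter names in the expansion: in the removed header, VMS_WL__get_probe_by_name declares `probeID` but expands `probeName`, which compiles only if a `probeName` variable happens to be in scope at the call site.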
34.1 --- a/Services_Offered_by_VMS/Memory_Handling/vmalloc.c Mon Sep 03 03:34:54 2012 -0700 34.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 34.3 @@ -1,438 +0,0 @@ 34.4 -/* 34.5 - * Copyright 2009 OpenSourceCodeStewardshipFoundation.org 34.6 - * Licensed under GNU General Public License version 2 34.7 - * 34.8 - * Author: seanhalle@yahoo.com 34.9 - * 34.10 - * Created on November 14, 2009, 9:07 PM 34.11 - */ 34.12 - 34.13 -#include <malloc.h> 34.14 -#include <inttypes.h> 34.15 -#include <stdlib.h> 34.16 -#include <stdio.h> 34.17 -#include <string.h> 34.18 -#include <math.h> 34.19 - 34.20 -#include "VMS_impl/VMS.h" 34.21 -#include "Histogram/Histogram.h" 34.22 - 34.23 -#define MAX_UINT64 0xFFFFFFFFFFFFFFFF 34.24 - 34.25 -//A MallocProlog is a head element if the HigherInMem variable is NULL 34.26 -//A Chunk is free if the prevChunkInFreeList variable is NULL 34.27 - 34.28 -/* 34.29 - * This calculates the container which fits the given size. 34.30 - */ 34.31 -inline 34.32 -uint32 getContainer(size_t size) 34.33 -{ 34.34 - return (log2(size)-LOG128)/LOG54; 34.35 -} 34.36 - 34.37 -/* 34.38 - * Removes the first chunk of a freeList 34.39 - * The chunk is removed but not set as free. There is no check if 34.40 - * the free list is empty, so make sure this is not the case. 
34.41 - */ 34.42 -inline 34.43 -MallocProlog *removeChunk(MallocArrays* freeLists, uint32 containerIdx) 34.44 -{ 34.45 - MallocProlog** container = &freeLists->bigChunks[containerIdx]; 34.46 - MallocProlog* removedChunk = *container; 34.47 - *container = removedChunk->nextChunkInFreeList; 34.48 - 34.49 - if(removedChunk->nextChunkInFreeList) 34.50 - removedChunk->nextChunkInFreeList->prevChunkInFreeList = 34.51 - (MallocProlog*)container; 34.52 - 34.53 - if(*container == NULL) 34.54 - { 34.55 - if(containerIdx < 64) 34.56 - freeLists->bigChunksSearchVector[0] &= ~((uint64)1 << containerIdx); 34.57 - else 34.58 - freeLists->bigChunksSearchVector[1] &= ~((uint64)1 << (containerIdx-64)); 34.59 - } 34.60 - 34.61 - return removedChunk; 34.62 -} 34.63 - 34.64 -/* 34.65 - * Removes the first chunk of a freeList 34.66 - * The chunk is removed but not set as free. There is no check if 34.67 - * the free list is empty, so make sure this is not the case. 34.68 - */ 34.69 -inline 34.70 -MallocProlog *removeSmallChunk(MallocArrays* freeLists, uint32 containerIdx) 34.71 -{ 34.72 - MallocProlog** container = &freeLists->smallChunks[containerIdx]; 34.73 - MallocProlog* removedChunk = *container; 34.74 - *container = removedChunk->nextChunkInFreeList; 34.75 - 34.76 - if(removedChunk->nextChunkInFreeList) 34.77 - removedChunk->nextChunkInFreeList->prevChunkInFreeList = 34.78 - (MallocProlog*)container; 34.79 - 34.80 - return removedChunk; 34.81 -} 34.82 - 34.83 -inline 34.84 -size_t getChunkSize(MallocProlog* chunk) 34.85 -{ 34.86 - return (uintptr_t)chunk->nextHigherInMem - 34.87 - (uintptr_t)chunk - sizeof(MallocProlog); 34.88 -} 34.89 - 34.90 -/* 34.91 - * Removes a chunk from a free list. 
34.92 - */ 34.93 -inline 34.94 -void extractChunk(MallocProlog* chunk, MallocArrays *freeLists) 34.95 -{ 34.96 - chunk->prevChunkInFreeList->nextChunkInFreeList = chunk->nextChunkInFreeList; 34.97 - if(chunk->nextChunkInFreeList) 34.98 - chunk->nextChunkInFreeList->prevChunkInFreeList = chunk->prevChunkInFreeList; 34.99 - 34.100 - //The last element in the list points to the container. If the container points 34.101 - //to NULL the container is empty 34.102 - if(*((void**)(chunk->prevChunkInFreeList)) == NULL && getChunkSize(chunk) >= BIG_LOWER_BOUND) 34.103 - { 34.104 - //Find the approppiate container because we do not know it 34.105 - uint64 containerIdx = ((uintptr_t)chunk->prevChunkInFreeList - (uintptr_t)freeLists->bigChunks) >> 3; 34.106 - if(containerIdx < (uint32)64) 34.107 - freeLists->bigChunksSearchVector[0] &= ~((uint64)1 << containerIdx); 34.108 - if(containerIdx < 128 && containerIdx >=64) 34.109 - freeLists->bigChunksSearchVector[1] &= ~((uint64)1 << (containerIdx-64)); 34.110 - 34.111 - } 34.112 -} 34.113 - 34.114 -/* 34.115 - * Merges two chunks. 34.116 - * Chunk A has to be before chunk B in memory. Both have to be removed from 34.117 - * a free list 34.118 - */ 34.119 -inline 34.120 -MallocProlog *mergeChunks(MallocProlog* chunkA, MallocProlog* chunkB) 34.121 -{ 34.122 - chunkA->nextHigherInMem = chunkB->nextHigherInMem; 34.123 - chunkB->nextHigherInMem->nextLowerInMem = chunkA; 34.124 - return chunkA; 34.125 -} 34.126 -/* 34.127 - * Inserts a chunk into a free list. 34.128 - */ 34.129 -inline 34.130 -void insertChunk(MallocProlog* chunk, MallocProlog** container) 34.131 -{ 34.132 - chunk->nextChunkInFreeList = *container; 34.133 - chunk->prevChunkInFreeList = (MallocProlog*)container; 34.134 - if(*container) 34.135 - (*container)->prevChunkInFreeList = chunk; 34.136 - *container = chunk; 34.137 -} 34.138 - 34.139 -/* 34.140 - * Divides the chunk that a new chunk of newSize is created. 
34.141 - * There is no size check, so make sure the size value is valid. 34.142 - */ 34.143 -inline 34.144 -MallocProlog *divideChunk(MallocProlog* chunk, size_t newSize) 34.145 -{ 34.146 - MallocProlog* newChunk = (MallocProlog*)((uintptr_t)chunk->nextHigherInMem - 34.147 - newSize - sizeof(MallocProlog)); 34.148 - 34.149 - newChunk->nextLowerInMem = chunk; 34.150 - newChunk->nextHigherInMem = chunk->nextHigherInMem; 34.151 - 34.152 - chunk->nextHigherInMem->nextLowerInMem = newChunk; 34.153 - chunk->nextHigherInMem = newChunk; 34.154 - 34.155 - return newChunk; 34.156 -} 34.157 - 34.158 -/* 34.159 - * Search for chunk in the list of big chunks. Split the block if it's too big 34.160 - */ 34.161 -inline 34.162 -MallocProlog *searchChunk(MallocArrays *freeLists, size_t sizeRequested, uint32 containerIdx) 34.163 -{ 34.164 - MallocProlog* foundChunk; 34.165 - 34.166 - uint64 searchVector = freeLists->bigChunksSearchVector[0]; 34.167 - //set small chunk bits to zero 34.168 - searchVector &= MAX_UINT64 << containerIdx; 34.169 - containerIdx = __builtin_ffsl(searchVector); //least significant 1 bit 34.170 - 34.171 - if(containerIdx == 0) 34.172 - { 34.173 - searchVector = freeLists->bigChunksSearchVector[1]; 34.174 - containerIdx = __builtin_ffsl(searchVector); 34.175 - if(containerIdx == 0) 34.176 - { 34.177 - //TODO: get additional mem and insert into free list 34.178 - //malloc( MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE ); 34.179 - printf("VMS malloc failed: low memory"); 34.180 - exit(1); 34.181 - } 34.182 - containerIdx += 64; 34.183 - } 34.184 - containerIdx--; 34.185 - 34.186 - 34.187 - foundChunk = removeChunk(freeLists, containerIdx); 34.188 - size_t chunkSize = getChunkSize(foundChunk); 34.189 - 34.190 - //If the new chunk is larger than the requested size: split 34.191 - if(chunkSize > sizeRequested + 2 * sizeof(MallocProlog) + BIG_LOWER_BOUND) 34.192 - { 34.193 - MallocProlog *newChunk = divideChunk(foundChunk,sizeRequested); 34.194 - containerIdx = 
getContainer(getChunkSize(foundChunk)) - 1; 34.195 - insertChunk(foundChunk,&freeLists->bigChunks[containerIdx]); 34.196 - if(containerIdx < 64) 34.197 - freeLists->bigChunksSearchVector[0] |= ((uint64)1 << containerIdx); 34.198 - else 34.199 - freeLists->bigChunksSearchVector[1] |= ((uint64)1 << (containerIdx-64)); 34.200 - foundChunk = newChunk; 34.201 - } 34.202 - 34.203 - return foundChunk; 34.204 -} 34.205 - 34.206 - 34.207 -/* 34.208 - * This is sequential code, meant to only be called from the Master, not from 34.209 - * any slave Slvs. 34.210 - * 34.211 - *May 2012 34.212 - *ToDo: Improve speed, by using built-in leading 1 detector to calc free-list 34.213 - * index. 34.214 - *Change to two separate arrays, one for free-lists of small fixed-size chunks 34.215 - * other for free lists of exponentially growing chunk sizes 34.216 - *Do simple compare to decide which array of lists to use 34.217 - *For small chunks, size the lists in increments of 16, up to, say, 128 (1024 34.218 - * is max if want less than 64 lists, which allows searching for first 34.219 - * occupied free-list using leading-1 detector on a bit-vector) 34.220 - *To find index, right-shift by 4 bits, and that's the index! (works because 34.221 - * compare says no 1's above 128 position ((bit 7)), and sizes are every 16, 34.222 - * so dividing by 16 equals exactly the position) 34.223 - *For large chunks, have 63 free lists, but split into even and odd indexes. 34.224 - *For even indexes, each list starts with chunks twice the size of previous 34.225 - * even index. 34.226 - *For odd indexes, each list starts with chunks of size half-way between those 34.227 - * of the even indexes on either side. 34.228 - * 34.229 - *To calc the free-list position of a requested size, get pos of leading 1 34.230 - * of the size, call this msbsP (most-significant-bit-set-position). 
Then 34.231 - * check bit to right of it (one-less-significant) 34.232 - *If it's 0 then use the even index: msbsP * 2, which is msbsP << 1. 34.233 - *If it's 1, then use the odd-index, which is msbsP << 1 + 1 34.234 - * 34.235 - *To find msbsP, use GCC builtin: "int __builtin_clzll (unsigned long long)" 34.236 - * which returns the number of zeros above (left of) msb set. Note, dies if 34.237 - * give it zero, but the compare used to choose between arrays makes sure 34.238 - * requested size given to it is not zero. 34.239 - * 34.240 - *This scheme keeps wastage small, while finding free element is O(1), and a 34.241 - * fast constant. 34.242 - *For large chunk sizes, if don't shave excess, then it ensures worst-case 34.243 - * wastage due to mis-match in size of chunk vs requested size is 33% 34.244 - * (invariant: take any even list.. it starts at a power of 2, and next list 34.245 - * up starts at 50% larger, so biggest chunk is 1.5 x smallest request, that's 34.246 - * 33% of total memory wasted. Then, for the odd index above, smallest chunk 34.247 - * is 2x for smallest request of 1.5x, for 25% total wasted memory) 34.248 - *For smallest size chunks, the pre-amble wastes quite a bit, but above that, 34.249 - * sizing in increments of 16 keeps wastage small. And, if always shave, then 34.250 - * wastage due to size mis-match is maximum 16 bytes for the large chunks. 
34.251 - * 34.252 - */ 34.253 -void * 34.254 -VMS_int__malloc( size_t sizeRequested ) 34.255 - { 34.256 - MEAS__Capture_Pre_Malloc_Point 34.257 - 34.258 - MallocArrays* freeLists = _VMSMasterEnv->freeLists; 34.259 - MallocProlog* foundChunk; 34.260 - 34.261 - //Return a small chunk if the requested size is smaller than 128B 34.262 - if(sizeRequested <= LOWER_BOUND) 34.263 - { 34.264 - uint32 freeListIdx = (sizeRequested-1)/SMALL_CHUNK_SIZE; 34.265 - if(freeLists->smallChunks[freeListIdx] == NULL) 34.266 - foundChunk = searchChunk(freeLists, SMALL_CHUNK_SIZE*(freeListIdx+1), 0); 34.267 - else 34.268 - foundChunk = removeSmallChunk(freeLists, freeListIdx); 34.269 - 34.270 - //Mark as allocated 34.271 - foundChunk->prevChunkInFreeList = NULL; 34.272 - return foundChunk + 1; 34.273 - } 34.274 - 34.275 - //Calculate the expected container. Start one higher to have a Chunk that's 34.276 - //always big enough. 34.277 - uint32 containerIdx = getContainer(sizeRequested); 34.278 - 34.279 - if(freeLists->bigChunks[containerIdx] == NULL) 34.280 - foundChunk = searchChunk(freeLists, sizeRequested, containerIdx); 34.281 - else 34.282 - foundChunk = removeChunk(freeLists, containerIdx); 34.283 - 34.284 - //Mark as allocated 34.285 - foundChunk->prevChunkInFreeList = NULL; 34.286 - 34.287 - MEAS__Capture_Post_Malloc_Point 34.288 - 34.289 - //skip over the prolog by adding its size to the pointer return 34.290 - return foundChunk + 1; 34.291 - } 34.292 - 34.293 -void * 34.294 -VMS_WL__malloc( int32 sizeRequested ) 34.295 - { void *ret; 34.296 - 34.297 - VMS_int__get_master_lock(); 34.298 - ret = VMS_int__malloc( sizeRequested ); 34.299 - VMS_int__release_master_lock(); 34.300 - return ret; 34.301 - } 34.302 - 34.303 - 34.304 -/* 34.305 - * This is sequential code, meant to only be called from the Master, not from 34.306 - * any slave Slvs. 
34.307 - */ 34.308 -void 34.309 -VMS_int__free( void *ptrToFree ) 34.310 - { 34.311 - 34.312 - MEAS__Capture_Pre_Free_Point; 34.313 - 34.314 - MallocArrays* freeLists = _VMSMasterEnv->freeLists; 34.315 - MallocProlog *chunkToFree = (MallocProlog*)ptrToFree - 1; 34.316 - uint32 containerIdx; 34.317 - 34.318 - //Check for free neighbors 34.319 - if(chunkToFree->nextLowerInMem) 34.320 - { 34.321 - if(chunkToFree->nextLowerInMem->prevChunkInFreeList != NULL) 34.322 - {//Chunk is not allocated 34.323 - extractChunk(chunkToFree->nextLowerInMem, freeLists); 34.324 - chunkToFree = mergeChunks(chunkToFree->nextLowerInMem, chunkToFree); 34.325 - } 34.326 - } 34.327 - if(chunkToFree->nextHigherInMem) 34.328 - { 34.329 - if(chunkToFree->nextHigherInMem->prevChunkInFreeList != NULL) 34.330 - {//Chunk is not allocated 34.331 - extractChunk(chunkToFree->nextHigherInMem, freeLists); 34.332 - chunkToFree = mergeChunks(chunkToFree, chunkToFree->nextHigherInMem); 34.333 - } 34.334 - } 34.335 - 34.336 - size_t chunkSize = getChunkSize(chunkToFree); 34.337 - if(chunkSize < BIG_LOWER_BOUND) 34.338 - { 34.339 - containerIdx = (chunkSize/SMALL_CHUNK_SIZE)-1; 34.340 - if(containerIdx > SMALL_CHUNK_COUNT-1) 34.341 - containerIdx = SMALL_CHUNK_COUNT-1; 34.342 - insertChunk(chunkToFree, &freeLists->smallChunks[containerIdx]); 34.343 - } 34.344 - else 34.345 - { 34.346 - containerIdx = getContainer(getChunkSize(chunkToFree)) - 1; 34.347 - insertChunk(chunkToFree, &freeLists->bigChunks[containerIdx]); 34.348 - if(containerIdx < 64) 34.349 - freeLists->bigChunksSearchVector[0] |= (uint64)1 << containerIdx; 34.350 - else 34.351 - freeLists->bigChunksSearchVector[1] |= (uint64)1 << (containerIdx-64); 34.352 - } 34.353 - 34.354 - MEAS__Capture_Post_Free_Point; 34.355 - } 34.356 - 34.357 -void 34.358 -VMS_WL__free( void *ptrToFree ) 34.359 - { 34.360 - VMS_int__get_master_lock(); 34.361 - VMS_int__free( ptrToFree ); 34.362 - VMS_int__release_master_lock(); 34.363 - } 34.364 - 34.365 -/* 34.366 - * 
Designed to be called from the main thread outside of VMS, during init 34.367 - */ 34.368 -MallocArrays * 34.369 -VMS_ext__create_free_list() 34.370 -{ 34.371 - //Initialize containers for small chunks and fill with zeros 34.372 - _VMSMasterEnv->freeLists = (MallocArrays*)malloc( sizeof(MallocArrays) ); 34.373 - MallocArrays *freeLists = _VMSMasterEnv->freeLists; 34.374 - 34.375 - freeLists->smallChunks = 34.376 - (MallocProlog**)malloc(SMALL_CHUNK_COUNT*sizeof(MallocProlog*)); 34.377 - memset((void*)freeLists->smallChunks, 34.378 - 0,SMALL_CHUNK_COUNT*sizeof(MallocProlog*)); 34.379 - 34.380 - //Calculate number of containers for big chunks 34.381 - uint32 container = getContainer(MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE)+1; 34.382 - freeLists->bigChunks = (MallocProlog**)malloc(container*sizeof(MallocProlog*)); 34.383 - memset((void*)freeLists->bigChunks,0,container*sizeof(MallocProlog*)); 34.384 - freeLists->containerCount = container; 34.385 - 34.386 - //Create first element in lastContainer 34.387 - MallocProlog *firstChunk = malloc( MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE ); 34.388 - if( firstChunk == NULL ) {printf("Can't allocate initial memory\n"); exit(1);} 34.389 - freeLists->memSpace = firstChunk; 34.390 - 34.391 - //Touch memory to avoid page faults 34.392 - void *ptr,*endPtr; 34.393 - endPtr = (void*)firstChunk+MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE; 34.394 - for(ptr = firstChunk; ptr < endPtr; ptr+=PAGE_SIZE) 34.395 - { 34.396 - *(char*)ptr = 0; 34.397 - } 34.398 - 34.399 - firstChunk->nextLowerInMem = NULL; 34.400 - firstChunk->nextHigherInMem = (MallocProlog*)((uintptr_t)firstChunk + 34.401 - MALLOC_ADDITIONAL_MEM_FROM_OS_SIZE - sizeof(MallocProlog)); 34.402 - firstChunk->nextChunkInFreeList = NULL; 34.403 - //previous element in the queue is the container 34.404 - firstChunk->prevChunkInFreeList = &freeLists->bigChunks[container-2]; 34.405 - 34.406 - freeLists->bigChunks[container-2] = firstChunk; 34.407 - //Insert into bit search list 34.408 - if(container <= 
65) 34.409 - { 34.410 - freeLists->bigChunksSearchVector[0] = ((uint64)1 << (container-2)); 34.411 - freeLists->bigChunksSearchVector[1] = 0; 34.412 - } 34.413 - else 34.414 - { 34.415 - freeLists->bigChunksSearchVector[0] = 0; 34.416 - freeLists->bigChunksSearchVector[1] = ((uint64)1 << (container-66)); 34.417 - } 34.418 - 34.419 - //Create dummy chunk to mark the top of stack this is of course 34.420 - //never freed 34.421 - MallocProlog *dummyChunk = firstChunk->nextHigherInMem; 34.422 - dummyChunk->nextHigherInMem = dummyChunk+1; 34.423 - dummyChunk->nextLowerInMem = NULL; 34.424 - dummyChunk->nextChunkInFreeList = NULL; 34.425 - dummyChunk->prevChunkInFreeList = NULL; 34.426 - 34.427 - return freeLists; 34.428 - } 34.429 - 34.430 - 34.431 -/*Designed to be called from the main thread outside of VMS, during cleanup 34.432 - */ 34.433 -void 34.434 -VMS_ext__free_free_list( MallocArrays *freeLists ) 34.435 - { 34.436 - free(freeLists->memSpace); 34.437 - free(freeLists->bigChunks); 34.438 - free(freeLists->smallChunks); 34.439 - 34.440 - } 34.441 -
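Two ideas in the removed vmalloc.c lend themselves to a small sketch: the ToDo comment's scheme for computing a large-chunk free-list index from the leading 1 bit of the requested size, and the two-word bit vector that searchChunk scans to find the first non-empty free list. The code below is an illustration under stated assumptions, not the VMS implementation. All demo_ names are hypothetical; `__builtin_clzll` and `__builtin_ffsll` are GCC/Clang builtins (the original calls `__builtin_ffsl`, which is 64 bits wide only where `long` is).

```c
#include <assert.h>
#include <stdint.h>

/* Free-list index for a large request, following the ToDo comment:
 * even index 2*k starts at size 2^k, odd index 2*k+1 starts at 1.5 * 2^k. */
static unsigned demo_list_idx_for_size( uint64_t size )
{
    assert( size >= 2 );  /* clz is undefined for 0; size 1 has no lower bit */
    unsigned msbsP   = 63u - (unsigned)__builtin_clzll( size );
    unsigned nextBit = (unsigned)((size >> (msbsP - 1)) & 1u);
    /* Parenthesized on purpose: "msbsP << 1 + 1" would parse as msbsP << 2. */
    return (msbsP << 1) + nextBit;
}

/* Sets or clears the "list idx is non-empty" bit in the 128-bit vector. */
static void demo_mark( uint64_t vec[2], unsigned idx, int nonEmpty )
{
    assert( idx < 128 );
    uint64_t bit = (uint64_t)1 << (idx % 64);
    if( nonEmpty ) vec[idx / 64] |= bit;
    else           vec[idx / 64] &= ~bit;
}

/* Index of the first non-empty list at or above startIdx, or -1 if none. */
static int demo_find_first_nonempty( const uint64_t vec[2], unsigned startIdx )
{
    assert( startIdx < 128 );
    uint64_t word;
    if( startIdx < 64 )
    {
        /* Mask off lists below startIdx, then look for the lowest set bit. */
        word = vec[0] & (UINT64_MAX << startIdx);
        if( word ) return __builtin_ffsll( (long long)word ) - 1;
        word = vec[1];
    }
    else
    {
        word = vec[1] & (UINT64_MAX << (startIdx - 64));
    }
    if( word ) return 64 + __builtin_ffsll( (long long)word ) - 1;
    return -1;
}
```

Unlike searchChunk, the sketch treats a start index of 64 or more as its own case, because shifting a 64-bit value by 64 or more bits is undefined in C, and it returns -1 instead of exiting when no list has a chunk.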
35.1 --- a/Services_Offered_by_VMS/Memory_Handling/vmalloc.h Mon Sep 03 03:34:54 2012 -0700 35.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 35.3 @@ -1,94 +0,0 @@ 35.4 -/* 35.5 - * Copyright 2009 OpenSourceCodeStewardshipFoundation.org 35.6 - * Licensed under GNU General Public License version 2 35.7 - * 35.8 - * Author: seanhalle@yahoo.com 35.9 - * 35.10 - * Created on November 14, 2009, 9:07 PM 35.11 - */ 35.12 - 35.13 -#ifndef _VMALLOC_H 35.14 -#define _VMALLOC_H 35.15 - 35.16 -#include <malloc.h> 35.17 -#include <inttypes.h> 35.18 -#include "VMS_impl/VMS_primitive_data_types.h" 35.19 - 35.20 -#define SMALL_CHUNK_SIZE 32 35.21 -#define SMALL_CHUNK_COUNT 4 35.22 -#define LOWER_BOUND 128 //Biggest chunk size that is created for the small chunks 35.23 -#define BIG_LOWER_BOUND 160 //Smallest chunk size that is created for the big chunks 35.24 - 35.25 -#define LOG54 0.3219280948873623 35.26 -#define LOG128 7 35.27 - 35.28 -typedef struct _MallocProlog MallocProlog; 35.29 - 35.30 -struct _MallocProlog 35.31 - { 35.32 - MallocProlog *nextChunkInFreeList; 35.33 - MallocProlog *prevChunkInFreeList; 35.34 - MallocProlog *nextHigherInMem; 35.35 - MallocProlog *nextLowerInMem; 35.36 - }; 35.37 -//MallocProlog 35.38 - 35.39 - typedef struct MallocArrays MallocArrays; 35.40 - 35.41 - struct MallocArrays 35.42 - { 35.43 - MallocProlog **smallChunks; 35.44 - MallocProlog **bigChunks; 35.45 - uint64 bigChunksSearchVector[2]; 35.46 - void *memSpace; 35.47 - uint32 containerCount; 35.48 - }; 35.49 - //MallocArrays 35.50 - 35.51 -typedef struct 35.52 - { 35.53 - MallocProlog *firstChunkInFreeList; 35.54 - int32 numInList; //TODO not used 35.55 - } 35.56 -FreeListHead; 35.57 - 35.58 -void * 35.59 -VMS_int__malloc( size_t sizeRequested ); 35.60 -#define VMS_PI__malloc VMS_int__malloc 35.61 - 35.62 -void * 35.63 -VMS_WL__malloc( int32 sizeRequested ); /*BUG: -- get master lock */ 35.64 -#define VMS_App__malloc VMS_WL__malloc 35.65 - 35.66 -void * 35.67 -VMS_int__malloc_aligned( 
size_t sizeRequested ); 35.68 -#define VMS_PI__malloc_aligned VMS_int__malloc_aligned 35.69 - 35.70 -void 35.71 -VMS_int__free( void *ptrToFree ); 35.72 -#define VMS_PI__free VMS_int__free 35.73 - 35.74 -void 35.75 -VMS_WL__free( void *ptrToFree ); 35.76 -#define VMS_App__free VMS_WL__free 35.77 - 35.78 - 35.79 - 35.80 -/*Allocates memory from the external system -- higher overhead 35.81 - */ 35.82 -void * 35.83 -VMS_ext__malloc_in_ext( size_t sizeRequested ); 35.84 - 35.85 -/*Frees memory that was allocated in the external system -- higher overhead 35.86 - */ 35.87 -void 35.88 -VMS_ext__free_in_ext( void *ptrToFree ); 35.89 - 35.90 - 35.91 -MallocArrays * 35.92 -VMS_ext__create_free_list(); 35.93 - 35.94 -void 35.95 -VMS_ext__free_free_list(MallocArrays *freeLists ); 35.96 - 35.97 -#endif 35.98 \ No newline at end of file
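The constants in the removed vmalloc.h describe four small-chunk free lists in steps of SMALL_CHUNK_SIZE (32) bytes up to LOWER_BOUND (128), and VMS_int__malloc selects one with `(sizeRequested-1)/SMALL_CHUNK_SIZE`. The sketch below restates that mapping, together with the payload-size calculation that getChunkSize performs on the chunk prolog. Names prefixed DEMO_, Demo, or demo_ are hypothetical; the struct mirrors MallocProlog only in its four pointer fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SMALL_CHUNK_SIZE   32   /* step between small free lists      */
#define DEMO_SMALL_CHUNK_COUNT   4   /* number of small free lists         */
#define DEMO_LOWER_BOUND       128   /* largest size served by small lists */

typedef struct DemoProlog DemoProlog;
struct DemoProlog
{
    DemoProlog *nextChunkInFreeList;
    DemoProlog *prevChunkInFreeList;  /* NULL marks the chunk as allocated */
    DemoProlog *nextHigherInMem;
    DemoProlog *nextLowerInMem;
};

/* Small free-list index for a request of 1..DEMO_LOWER_BOUND bytes. */
static unsigned demo_small_list_idx( size_t sizeRequested )
{
    /* Size 0 would wrap around in the unsigned subtraction below. */
    assert( sizeRequested >= 1 && sizeRequested <= DEMO_LOWER_BOUND );
    return (unsigned)((sizeRequested - 1) / DEMO_SMALL_CHUNK_SIZE);
}

/* Payload size, in bytes, that list idx is meant to satisfy. */
static size_t demo_small_list_capacity( unsigned idx )
{
    assert( idx < DEMO_SMALL_CHUNK_COUNT );
    return (size_t)(idx + 1) * DEMO_SMALL_CHUNK_SIZE;
}

/* Usable bytes in a chunk: distance to the next prolog, minus this prolog. */
static size_t demo_payload_size( const DemoProlog *chunk )
{
    assert( chunk != NULL && chunk->nextHigherInMem != NULL );
    return (size_t)((uintptr_t)chunk->nextHigherInMem - (uintptr_t)chunk)
           - sizeof(DemoProlog);
}
```

Requests of 1 to 32 bytes map to list 0, 33 to 64 bytes to list 1, and so on, so every chunk on list `idx` can hold any request that maps to it. The assertion on the lower bound is an addition of the sketch.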
36.1 --- a/VMS.h Mon Sep 03 03:34:54 2012 -0700 36.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 36.3 @@ -1,390 +0,0 @@ 36.4 -/* 36.5 - * Copyright 2009 OpenSourceStewardshipFoundation.org 36.6 - * Licensed under GNU General Public License version 2 36.7 - * 36.8 - * Author: seanhalle@yahoo.com 36.9 - * 36.10 - */ 36.11 - 36.12 -#ifndef _VMS_H 36.13 -#define _VMS_H 36.14 -#define _GNU_SOURCE 36.15 - 36.16 -#include "DynArray/DynArray.h" 36.17 -#include "Hash_impl/PrivateHash.h" 36.18 -#include "Histogram/Histogram.h" 36.19 -#include "Queue_impl/PrivateQueue.h" 36.20 - 36.21 -#include "VMS_primitive_data_types.h" 36.22 -#include "Services_Offered_by_VMS/Memory_Handling/vmalloc.h" 36.23 - 36.24 -#include <pthread.h> 36.25 -#include <sys/time.h> 36.26 - 36.27 -//================= Defines: included from separate files ================= 36.28 -// 36.29 -// Note: ALL defines are in other files, none are in here 36.30 -// 36.31 -#include "Defines/VMS_defs.h" 36.32 - 36.33 - 36.34 -//================================ Typedefs ================================= 36.35 -// 36.36 -typedef unsigned long long TSCount; 36.37 - 36.38 -typedef struct _AnimSlot AnimSlot; 36.39 -typedef struct _VMSReqst VMSReqst; 36.40 -typedef struct _SlaveVP SlaveVP; 36.41 -typedef struct _MasterVP MasterVP; 36.42 -typedef struct _IntervalProbe IntervalProbe; 36.43 - 36.44 - 36.45 -typedef SlaveVP *(*SlaveAssigner) ( void *, AnimSlot*); //semEnv, slot for HW info 36.46 -typedef void (*RequestHandler) ( SlaveVP *, void * ); //prWReqst, semEnv 36.47 -typedef void (*TopLevelFnPtr) ( void *, SlaveVP * ); //initData, animSlv 36.48 -typedef void TopLevelFn ( void *, SlaveVP * ); //initData, animSlv 36.49 -typedef void (*ResumeSlvFnPtr) ( SlaveVP *, void * ); 36.50 - //=========== MEASUREMENT STUFF ========== 36.51 - MEAS__Insert_Counter_Handler 36.52 - //======================================== 36.53 - 36.54 -//============================ HW Dependent Fns ================================ 36.55 - 36.56 
-#include "HW_Dependent_Primitives/VMS__HW_measurement.h" 36.57 -#include "HW_Dependent_Primitives/VMS__primitives.h" 36.58 - 36.59 - 36.60 -//============= Request Related =========== 36.61 -// 36.62 - 36.63 -enum VMSReqstType //avoid starting enums at 0, for debug reasons 36.64 - { 36.65 - semantic = 1, 36.66 - createReq, 36.67 - dissipate, 36.68 - VMSSemantic //goes with VMSSemReqst below 36.69 - }; 36.70 - 36.71 -struct _VMSReqst 36.72 - { 36.73 - enum VMSReqstType reqType;//used for dissipate and in future for IO requests 36.74 - void *semReqData; 36.75 - 36.76 - VMSReqst *nextReqst; 36.77 - }; 36.78 -//VMSReqst 36.79 - 36.80 -enum VMSSemReqstType //These are equivalent to semantic requests, but for 36.81 - { // VMS's services available directly to app, like OS 36.82 - make_probe = 1, // and probe services -- like a VMS-wide built-in lang 36.83 - throw_excp, 36.84 - openFile, 36.85 - otherIO 36.86 - }; 36.87 - 36.88 -typedef struct 36.89 - { enum VMSSemReqstType reqType; 36.90 - SlaveVP *requestingSlv; 36.91 - char *nameStr; //for create probe 36.92 - char *msgStr; //for exception 36.93 - void *exceptionData; 36.94 - } 36.95 - VMSSemReq; 36.96 - 36.97 - 36.98 -//==================== Core data structures =================== 36.99 - 36.100 -typedef struct 36.101 - { 36.102 - //for future expansion 36.103 - } 36.104 -SlotPerfInfo; 36.105 - 36.106 -struct _AnimSlot 36.107 - { 36.108 - int workIsDone; 36.109 - int needsSlaveAssigned; 36.110 - SlaveVP *slaveAssignedToSlot; 36.111 - 36.112 - int slotIdx; //needed by Holistic Model's data gathering 36.113 - int coreSlotIsOn; 36.114 - SlotPerfInfo *perfInfo; //used by assigner to pick best slave for core 36.115 - }; 36.116 -//AnimSlot 36.117 - 36.118 - enum VPtype { 36.119 - Slave = 1, //default 36.120 - Master, 36.121 - Shutdown, 36.122 - Idle 36.123 - }; 36.124 - 36.125 -/*This structure embodies the state of a slaveVP. It is reused for masterVP 36.126 - * and shutdownVPs. 
36.127 - */ 36.128 -struct _SlaveVP 36.129 - { //The offsets of these fields are hard-coded into assembly 36.130 - void *stackPtr; //save the core's stack ptr when suspend 36.131 - void *framePtr; //save core's frame ptr when suspend 36.132 - void *resumeInstrPtr; //save core's program-counter when suspend 36.133 - void *coreCtlrFramePtr; //restore before jmp back to core controller 36.134 - void *coreCtlrStackPtr; //restore before jmp back to core controller 36.135 - 36.136 - //============ below this, no fields are used in asm ============= 36.137 - 36.138 - int slaveID; //each slave given a globally unique ID 36.139 - int coreAnimatedBy; 36.140 - void *startOfStack; //used to free, and to point slave to Fn 36.141 - enum VPtype typeOfVP; //Slave vs Master vs Shutdown.. 36.142 - int assignCount; //Each assign is for one work-unit, so IDs it 36.143 - //note, a scheduling decision is uniquely identified by the triple: 36.144 - // <slaveID, coreAnimatedBy, assignCount> -- used in record & replay 36.145 - 36.146 - //for comm -- between master and coreCtlr & btwn wrapper lib and plugin 36.147 - AnimSlot *animSlotAssignedTo; 36.148 - VMSReqst *requests; //wrapper lib puts in requests, plugin takes out 36.149 - void *dataRetFromReq;//Return vals from plugin to Wrapper Lib 36.150 - 36.151 - //For using Slave as carrier for data 36.152 - void *semanticData; //Lang saves lang-specific things in slave here 36.153 - 36.154 - //=========== MEASUREMENT STUFF ========== 36.155 - MEAS__Insert_Meas_Fields_into_Slave; 36.156 - float64 createPtInSecs; //time VP created, in seconds 36.157 - //======================================== 36.158 - }; 36.159 -//SlaveVP 36.160 - 36.161 - 36.162 -/* The one and only global variable, holds many odds and ends 36.163 - */ 36.164 -typedef struct 36.165 - { //The offsets of these fields are hard-coded into assembly 36.166 - void *coreCtlrReturnPt; //offset to this field used in asm 36.167 - int8 falseSharePad1[256 - sizeof(void*)]; 36.168 - int32 
masterLock; //offset to this field used in asm 36.169 - int8 falseSharePad2[256 - sizeof(int32)]; 36.170 - //============ below this, no fields are used in asm ============= 36.171 - 36.172 - //Basic VMS infrastructure 36.173 - SlaveVP **masterVPs; 36.174 - AnimSlot ***allAnimSlots; 36.175 - 36.176 - //plugin related 36.177 - SlaveAssigner slaveAssigner; 36.178 - RequestHandler requestHandler; 36.179 - void *semanticEnv; 36.180 - 36.181 - //Slave creation 36.182 - int32 numSlavesCreated; //gives ordering to processor creation 36.183 - int32 numSlavesAlive; //used to detect fail-safe shutdown 36.184 - 36.185 - //Initialization related 36.186 - int32 setupComplete; //use while starting up coreCtlr 36.187 - 36.188 - //Memory management related 36.189 - MallocArrays *freeLists; 36.190 - int32 amtOfOutstandingMem;//total currently allocated 36.191 - 36.192 - //Random number seeds -- random nums used in various places 36.193 - uint32_t seed1; 36.194 - uint32_t seed2; 36.195 - 36.196 - //=========== MEASUREMENT STUFF ============= 36.197 - IntervalProbe **intervalProbes; 36.198 - PrivDynArrayInfo *dynIntervalProbesInfo; 36.199 - HashTable *probeNameHashTbl; 36.200 - int32 masterCreateProbeID; 36.201 - float64 createPtInSecs; //real-clock time VMS initialized 36.202 - Histogram **measHists; 36.203 - PrivDynArrayInfo *measHistsInfo; 36.204 - MEAS__Insert_Susp_Meas_Fields_into_MasterEnv; 36.205 - MEAS__Insert_Master_Meas_Fields_into_MasterEnv; 36.206 - MEAS__Insert_Master_Lock_Meas_Fields_into_MasterEnv; 36.207 - MEAS__Insert_Malloc_Meas_Fields_into_MasterEnv; 36.208 - MEAS__Insert_Plugin_Meas_Fields_into_MasterEnv; 36.209 - MEAS__Insert_System_Meas_Fields_into_MasterEnv; 36.210 - MEAS__Insert_Counter_Meas_Fields_into_MasterEnv; 36.211 - //========================================== 36.212 - } 36.213 -MasterEnv; 36.214 - 36.215 -//========================= Extra Stuff Data Strucs ======================= 36.216 -typedef struct 36.217 - { 36.218 - 36.219 - } 36.220 -VMSExcp; 
36.221 - 36.222 -//======================= OS Thread related =============================== 36.223 - 36.224 -void * coreController( void *paramsIn ); //standard PThreads fn prototype 36.225 -void * coreCtlr_Seq( void *paramsIn ); //standard PThreads fn prototype 36.226 -void animationMaster( void *initData, SlaveVP *masterVP ); 36.227 - 36.228 - 36.229 -typedef struct 36.230 - { 36.231 - void *endThdPt; 36.232 - unsigned int coreNum; 36.233 - } 36.234 -ThdParams; 36.235 - 36.236 -//============================= Global Vars ================================ 36.237 - 36.238 -volatile MasterEnv *_VMSMasterEnv __align_to_cacheline__; 36.239 - 36.240 - //these are global, but only used for startup and shutdown 36.241 -pthread_t coreCtlrThdHandles[ NUM_CORES ]; //pthread's virt-procr state 36.242 -ThdParams *coreCtlrThdParams [ NUM_CORES ]; 36.243 - 36.244 -pthread_mutex_t suspendLock; 36.245 -pthread_cond_t suspendCond; 36.246 - 36.247 -//========================= Function Prototypes =========================== 36.248 -/* MEANING OF WL PI SS int VMSOS 36.249 - * These indicate which places the function is safe to use. They stand for: 36.250 - * 36.251 - * WL Wrapper Library -- wrapper lib code should only use these 36.252 - * PI Plugin -- plugin code should only use these 36.253 - * SS Startup and Shutdown -- designates these relate to startup & shutdown 36.254 - * int internal to VMS -- should not be used in wrapper lib or plugin 36.255 - * VMSOS means "OS functions for applications to use" 36.256 - * 36.257 - * VMS_int__ functions touch internal VMS data structs and are only safe 36.258 - * to be used inside the master lock. However, occasionally, they appear 36.259 - * in wrapper-lib or plugin code. In those cases, very careful analysis 36.260 - * has been done to be sure no concurrency issues could arise. 36.261 - * 36.262 - * VMS_WL__ functions are all safe for use outside the master lock. 
36.263 - * 36.264 - * VMSOS are only safe for applications to use -- they're like a second 36.265 - * language mixed in -- but they can't be used inside plugin code, and 36.266 - * aren't meant for use in wrapper libraries, because they are themselves 36.267 - * wrapper-library calls! 36.268 - */ 36.269 -//========== Startup and shutdown ========== 36.270 -void 36.271 -VMS_SS__init(); 36.272 - 36.273 -void 36.274 -VMS_SS__start_the_work_then_wait_until_done(); 36.275 - 36.276 -SlaveVP* 36.277 -VMS_SS__create_shutdown_slave(); 36.278 - 36.279 -void 36.280 -VMS_SS__shutdown(); 36.281 - 36.282 -void 36.283 -VMS_SS__cleanup_at_end_of_shutdown(); 36.284 - 36.285 - 36.286 -//============== =============== 36.287 - 36.288 -inline SlaveVP * 36.289 -VMS_int__create_slaveVP( TopLevelFnPtr fnPtr, void *dataParam ); 36.290 -#define VMS_PI__create_slaveVP VMS_int__create_slaveVP 36.291 -#define VMS_WL__create_slaveVP VMS_int__create_slaveVP 36.292 - 36.293 - //Use this to create processor inside entry point & other places outside 36.294 - // the VMS system boundary (IE, don't animate with a SlaveVP or MasterVP) 36.295 -SlaveVP * 36.296 -VMS_ext__create_slaveVP( TopLevelFnPtr fnPtr, void *dataParam ); 36.297 - 36.298 -inline SlaveVP * 36.299 -VMS_int__create_slaveVP_helper( SlaveVP *newSlv, TopLevelFnPtr fnPtr, 36.300 - void *dataParam, void *stackLocs ); 36.301 - 36.302 -inline void 36.303 -VMS_int__reset_slaveVP_to_TopLvlFn( SlaveVP *slaveVP, TopLevelFnPtr fnPtr, 36.304 - void *dataParam); 36.305 - 36.306 -inline void 36.307 -VMS_int__point_slaveVP_to_OneParamFn( SlaveVP *slaveVP, void *fnPtr, 36.308 - void *param); 36.309 - 36.310 -inline void 36.311 -VMS_int__point_slaveVP_to_TwoParamFn( SlaveVP *slaveVP, void *fnPtr, 36.312 - void *param1, void *param2); 36.313 - 36.314 -void 36.315 -VMS_int__dissipate_slaveVP( SlaveVP *slaveToDissipate ); 36.316 -#define VMS_PI__dissipate_slaveVP VMS_int__dissipate_slaveVP 36.317 -//WL: dissipate a SlaveVP by sending a request 36.318 - 
36.319 -void 36.320 -VMS_ext__dissipate_slaveVP( SlaveVP *slaveToDissipate ); 36.321 - 36.322 -void 36.323 -VMS_int__throw_exception( char *msgStr, SlaveVP *reqstSlv, VMSExcp *excpData ); 36.324 -#define VMS_PI__throw_exception VMS_int__throw_exception 36.325 -void 36.326 -VMS_WL__throw_exception( char *msgStr, SlaveVP *reqstSlv, VMSExcp *excpData ); 36.327 -#define VMS_App__throw_exception VMS_WL__throw_exception 36.328 - 36.329 -void * 36.330 -VMS_int__give_sem_env_for( SlaveVP *animSlv ); 36.331 -#define VMS_PI__give_sem_env_for VMS_int__give_sem_env_for 36.332 -#define VMS_SS__give_sem_env_for VMS_int__give_sem_env_for 36.333 -//No WL version -- not safe! if use in WL, be sure data rd & wr is stable 36.334 - 36.335 - 36.336 -inline void 36.337 -VMS_int__get_master_lock(); 36.338 - 36.339 -#define VMS_int__release_master_lock() _VMSMasterEnv->masterLock = UNLOCKED 36.340 - 36.341 -inline uint32_t 36.342 -VMS_int__randomNumber(); 36.343 - 36.344 -//============== Request Related =============== 36.345 - 36.346 -void 36.347 -VMS_int__suspend_slaveVP_and_send_req( SlaveVP *callingSlv ); 36.348 - 36.349 -inline void 36.350 -VMS_WL__add_sem_request_in_mallocd_VMSReqst( void *semReqData, SlaveVP *callingSlv ); 36.351 - 36.352 -inline void 36.353 -VMS_WL__send_sem_request( void *semReqData, SlaveVP *callingSlv ); 36.354 - 36.355 -void 36.356 -VMS_WL__send_create_slaveVP_req( void *semReqData, SlaveVP *reqstingSlv ); 36.357 - 36.358 -void inline 36.359 -VMS_WL__send_dissipate_req( SlaveVP *prToDissipate ); 36.360 - 36.361 -inline void 36.362 -VMS_WL__send_VMSSem_request( void *semReqData, SlaveVP *callingSlv ); 36.363 - 36.364 -VMSReqst * 36.365 -VMS_PI__take_next_request_out_of( SlaveVP *slaveWithReq ); 36.366 -//#define VMS_PI__take_next_request_out_of( slave ) slave->requests 36.367 - 36.368 -//inline void * 36.369 -//VMS_PI__take_sem_reqst_from( VMSReqst *req ); 36.370 -#define VMS_PI__take_sem_reqst_from( req ) req->semReqData 36.371 - 36.372 -void inline 36.373 
-VMS_PI__handle_VMSSemReq( VMSReqst *req, SlaveVP *requestingSlv, void *semEnv, 36.374 - ResumeSlvFnPtr resumeSlvFnPtr ); 36.375 - 36.376 -//======================== MEASUREMENT ====================== 36.377 -uint64 36.378 -VMS_WL__give_num_plugin_cycles(); 36.379 -uint32 36.380 -VMS_WL__give_num_plugin_animations(); 36.381 - 36.382 - 36.383 -//========================= Utilities ======================= 36.384 -inline char * 36.385 -VMS_int__strDup( char *str ); 36.386 - 36.387 - 36.388 -//========================= Probes ======================= 36.389 -#include "Services_Offered_by_VMS/Measurement_and_Stats/probes.h" 36.390 - 36.391 -//================================================ 36.392 -#endif /* _VMS_H */ 36.393 -
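Note on the removed VMS.h: the MasterEnv struct keeps the two fields read from assembly (coreCtlrReturnPt and masterLock) at fixed offsets and separates them with 256-byte pads, so the heavily contended lock does not share a cache line with its neighbours. The sketch below illustrates that layout idea only. DemoEnv, PAD_BYTES and the asserted offsets are hypothetical stand-ins, not the actual VMS definitions, and the compile-time checks are an addition that the removed header does not contain.

```c
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* int8_t, int32_t */

/* Assumed pad size, mirroring the 256 used in the removed MasterEnv. */
#define PAD_BYTES 256

/* Hypothetical stand-in for the first few fields of MasterEnv. */
typedef struct
 { void    *coreCtlrReturnPt;                           /* offset used in asm */
   int8_t   falseSharePad1[PAD_BYTES - sizeof(void *)]; /* pushes lock to 256 */
   int32_t  masterLock;                                 /* offset used in asm */
   int8_t   falseSharePad2[PAD_BYTES - sizeof(int32_t)];/* pushes rest to 512 */
   int32_t  numSlavesCreated;                           /* not touched by asm */
 }
DemoEnv;

/* If someone reorders the fields, the build fails here instead of the
 * assembly silently reading the wrong memory. */
_Static_assert( offsetof(DemoEnv, coreCtlrReturnPt) == 0,
                "asm expects coreCtlrReturnPt at offset 0" );
_Static_assert( offsetof(DemoEnv, masterLock) == PAD_BYTES,
                "asm expects masterLock one pad after the start" );
_Static_assert( offsetof(DemoEnv, numSlavesCreated) == 2 * PAD_BYTES,
                "non-asm fields start after the second pad" );
```

The removed VMS_SS__init prints offsetof(MasterEnv, masterLock) at startup for the same reason: the value has to match the constant written into the assembly primitives.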
37.1 --- a/VMS__PI.c Mon Sep 03 03:34:54 2012 -0700 37.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 37.3 @@ -1,121 +0,0 @@ 37.4 -/* 37.5 - * Copyright 2010 OpenSourceStewardshipFoundation 37.6 - * 37.7 - * Licensed under BSD 37.8 - */ 37.9 - 37.10 -#include <stdio.h> 37.11 -#include <stdlib.h> 37.12 -#include <string.h> 37.13 -#include <malloc.h> 37.14 -#include <inttypes.h> 37.15 -#include <sys/time.h> 37.16 - 37.17 -#include "VMS.h" 37.18 - 37.19 - 37.20 -/* MEANING OF WL PI SS int 37.21 - * These indicate which places the function is safe to use. They stand for: 37.22 - * WL: Wrapper Library 37.23 - * PI: Plugin 37.24 - * SS: Startup and Shutdown 37.25 - * int: internal to the VMS implementation 37.26 - */ 37.27 - 37.28 -//========================= Local Declarations ======================== 37.29 -void inline 37.30 -handleMakeProbe( VMSSemReq *semReq, void *semEnv, ResumeSlvFnPtr resumeFn ); 37.31 - 37.32 -void inline 37.33 -handleThrowException( VMSSemReq *semReq, void *semEnv, ResumeSlvFnPtr resumeFn ); 37.34 -//======================================================================= 37.35 - 37.36 - 37.37 -VMSReqst * 37.38 -VMS_PI__take_next_request_out_of( SlaveVP *slaveWithReq ) 37.39 - { VMSReqst *req; 37.40 - 37.41 - req = slaveWithReq->requests; 37.42 - if( req == NULL ) return NULL; 37.43 - 37.44 - slaveWithReq->requests = slaveWithReq->requests->nextReqst; 37.45 - return req; 37.46 - } 37.47 - 37.48 - 37.49 - 37.50 -/*May 2012 37.51 - *CHANGED IMPL -- now a macro in header file 37.52 - * 37.53 - *Turn function into macro that just accesses the request field 37.54 - * 37.55 -inline void * 37.56 -VMS_PI__take_sem_reqst_from( VMSReqst *req ) 37.57 - { 37.58 - return req->semReqData; 37.59 - } 37.60 -*/ 37.61 - 37.62 - 37.63 -/* This is for OS requests and VMS infrastructure requests, such as to create 37.64 - * a probe -- a probe is inside the heart of VMS-core, it's not part of any 37.65 - * language -- but it's also a semantic thing that's triggered 
from and used 37.66 - * in the application.. so it crosses abstractions.. so, need some special 37.67 - * pattern here for handling such requests. 37.68 - * Doing this just like it were a second language sharing VMS-core. 37.69 - * 37.70 - * This is called from the language's request handler when it sees a request 37.71 - * of type VMSSemReq 37.72 - * 37.73 - * TODO: Later change this, to give probes their own separate plugin & have 37.74 - * VMS-core steer the request to appropriate plugin 37.75 - * Do the same for OS calls -- look later at it.. 37.76 - */ 37.77 -void inline 37.78 -VMS_PI__handle_VMSSemReq( VMSReqst *req, SlaveVP *requestingSlv, void *semEnv, 37.79 - ResumeSlvFnPtr resumeFn ) 37.80 - { VMSSemReq *semReq; 37.81 - 37.82 - semReq = VMS_PI__take_sem_reqst_from(req); 37.83 - if( semReq == NULL ) return; 37.84 - switch( semReq->reqType ) //sem handlers are all in other file 37.85 - { 37.86 - case make_probe: handleMakeProbe( semReq, semEnv, resumeFn); 37.87 - break; 37.88 - case throw_excp: handleThrowException( semReq, semEnv, resumeFn); 37.89 - break; 37.90 - } 37.91 - } 37.92 - 37.93 -/* 37.94 - */ 37.95 -void inline 37.96 -handleMakeProbe( VMSSemReq *semReq, void *semEnv, ResumeSlvFnPtr resumeFn ) 37.97 - { IntervalProbe *newProbe; 37.98 - 37.99 - newProbe = VMS_int__malloc( sizeof(IntervalProbe) ); 37.100 - newProbe->nameStr = VMS_int__strDup( semReq->nameStr ); 37.101 - newProbe->hist = NULL; 37.102 - newProbe->schedChoiceWasRecorded = FALSE; 37.103 - 37.104 - //This runs in masterVP, so no race-condition worries 37.105 - newProbe->probeID = 37.106 - addToDynArray( newProbe, _VMSMasterEnv->dynIntervalProbesInfo ); 37.107 - 37.108 - semReq->requestingSlv->dataRetFromReq = newProbe; 37.109 - 37.110 - //This in inside VMS, while resume_slaveVP fn is inside language, so pass 37.111 - // pointer from lang to here, then call it. 
37.112 - (*resumeFn)( semReq->requestingSlv, semEnv ); 37.113 - } 37.114 - 37.115 -void inline 37.116 -handleThrowException( VMSSemReq *semReq, void *semEnv, ResumeSlvFnPtr resumeFn ) 37.117 - { 37.118 - VMS_int__throw_exception( semReq->msgStr, semReq->requestingSlv, semReq->exceptionData ); 37.119 - 37.120 - (*resumeFn)( semReq->requestingSlv, semEnv ); 37.121 - } 37.122 - 37.123 - 37.124 -
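Note on the removed VMS__PI.c: VMS_PI__handle_VMSSemReq switches on the request type, runs a handler that lives inside VMS, and then calls back into the language through the resumeFn pointer, because only the language knows how to make a slave ready again. A minimal sketch of that callback pattern follows. All Demo names are hypothetical, the environment is reduced to two counters, and the exception case only sets a flag here, whereas the removed code prints the message and exits.

```c
#include <stddef.h>

typedef enum { demo_make_probe, demo_throw_excp } DemoReqType;

/* Hypothetical semantic environment owned by the language plugin. */
typedef struct
 { int probesMade;     /* how many probe requests were serviced       */
   int excpSeen;       /* set when an exception request was serviced  */
   int lastResumedID;  /* ID of the slave most recently resumed       */
 }
DemoSemEnv;

/* Hypothetical request: requesterID stands in for the SlaveVP pointer. */
typedef struct
 { DemoReqType reqType;
   int         requesterID;
 }
DemoSemReq;

/* The language supplies this; the core never resumes a slave itself. */
typedef void (*DemoResumeFn)( int requesterID, DemoSemEnv *semEnv );

/* Example resume function, as a language plugin might provide it. */
void
demo_resume( int requesterID, DemoSemEnv *semEnv )
 { semEnv->lastResumedID = requesterID;
 }

/* Core-side dispatcher: returns 1 if the request was handled, else 0. */
int
demo_handle_sem_req( DemoSemReq *semReq, DemoSemEnv *semEnv,
                     DemoResumeFn resumeFn )
 {
   if( semReq == NULL ) return 0;
   switch( semReq->reqType )
    { case demo_make_probe:
         semEnv->probesMade += 1;            /* core-side work */
         break;
      case demo_throw_excp:
         semEnv->excpSeen = 1;               /* core-side work */
         break;
      default:
         return 0;                           /* unknown type: ignore */
    }
   (*resumeFn)( semReq->requesterID, semEnv ); /* hand back to language */
   return 1;
 }
```

Passing the resume function as a parameter keeps the core free of any compile-time dependency on a particular language plugin.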
38.1 --- a/VMS__WL.c Mon Sep 03 03:34:54 2012 -0700 38.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 38.3 @@ -1,160 +0,0 @@ 38.4 -/* 38.5 - * Copyright 2010 OpenSourceStewardshipFoundation 38.6 - * 38.7 - * Licensed under BSD 38.8 - */ 38.9 - 38.10 -#include <stdio.h> 38.11 -#include <stdlib.h> 38.12 -#include <string.h> 38.13 -#include <malloc.h> 38.14 -#include <inttypes.h> 38.15 -#include <sys/time.h> 38.16 - 38.17 -#include "VMS.h" 38.18 - 38.19 - 38.20 -/* MEANING OF WL PI SS int 38.21 - * These indicate which places the function is safe to use. They stand for: 38.22 - * WL: Wrapper Library 38.23 - * PI: Plugin 38.24 - * SS: Startup and Shutdown 38.25 - * int: internal to the VMS implementation 38.26 - */ 38.27 - 38.28 - 38.29 - 38.30 -/*For this implementation of VMS, it may not make much sense to have the 38.31 - * system of requests for creating a new processor done this way.. but over 38.32 - * the scope of single-master, multi-master, mult-tasking, OS-implementing, 38.33 - * distributed-memory, and so on, this gives VMS implementation a chance to 38.34 - * do stuff before suspend, in the SlaveVP, and in the Master before the plugin 38.35 - * is called, as well as in the lang-lib before this is called, and in the 38.36 - * plugin. So, this gives both VMS and language implementations a chance to 38.37 - * intercept at various points and do order-dependent stuff. 38.38 - *Having a standard VMSNewPrReqData struc allows the language to create and 38.39 - * free the struc, while VMS knows how to get the newSlv if it wants it, and 38.40 - * it lets the lang have lang-specific data related to creation transported 38.41 - * to the plugin. 
38.42 - */ 38.43 -void 38.44 -VMS_WL__send_create_slaveVP_req( void *semReqData, SlaveVP *reqstingSlv ) 38.45 - { VMSReqst req; 38.46 - 38.47 - req.reqType = createReq; 38.48 - req.semReqData = semReqData; 38.49 - req.nextReqst = reqstingSlv->requests; 38.50 - reqstingSlv->requests = &req; 38.51 - 38.52 - VMS_int__suspend_slaveVP_and_send_req( reqstingSlv ); 38.53 - } 38.54 - 38.55 - 38.56 -/* 38.57 - *This adds a request to dissipate, then suspends the processor so that the 38.58 - * request handler will receive the request. The request handler is what 38.59 - * does the work of freeing memory and removing the processor from the 38.60 - * semantic environment's data structures. 38.61 - *The request handler also is what figures out when to shutdown the VMS 38.62 - * system -- which causes all the core controller threads to die, and returns from 38.63 - * the call that started up VMS to perform the work. 38.64 - * 38.65 - *This form is a bit misleading to understand if one is trying to figure out 38.66 - * how VMS works -- it looks like a normal function call, but inside it 38.67 - * sends a request to the request handler and suspends the processor, which 38.68 - * jumps out of the VMS_WL__dissipate_slaveVP function, and out of all nestings 38.69 - * above it, transferring the work of dissipating to the request handler, 38.70 - * which then does the actual work -- causing the processor that animated 38.71 - * the call of this function to disappear and the "hanging" state of this 38.72 - * function to just poof into thin air -- the virtual processor's trace 38.73 - * never returns from this call, but instead the virtual processor's trace 38.74 - * gets suspended in this call and all the virt processor's state disap- 38.75 - * pears -- making that suspend the last thing in the Slv's trace. 
38.76 - */ 38.77 -void 38.78 -VMS_WL__send_dissipate_req( SlaveVP *slaveToDissipate ) 38.79 - { VMSReqst req; 38.80 - 38.81 - req.reqType = dissipate; 38.82 - req.nextReqst = slaveToDissipate->requests; 38.83 - slaveToDissipate->requests = &req; 38.84 - 38.85 - VMS_int__suspend_slaveVP_and_send_req( slaveToDissipate ); 38.86 - } 38.87 - 38.88 - 38.89 - 38.90 -/*This call's name indicates that request is malloc'd -- so req handler 38.91 - * has to free any extra requests tacked on before a send, using this. 38.92 - * 38.93 - * This inserts the semantic-layer's request data into standard VMS carrier 38.94 - * request data-struct that is mallocd. The sem request doesn't need to 38.95 - * be malloc'd if this is called inside the same call chain before the 38.96 - * send of the last request is called. 38.97 - * 38.98 - *The request handler has to call VMS_int__free_VMSReq for any of these 38.99 - */ 38.100 -inline void 38.101 -VMS_WL__add_sem_request_in_mallocd_VMSReqst( void *semReqData, 38.102 - SlaveVP *callingSlv ) 38.103 - { VMSReqst *req; 38.104 - 38.105 - req = VMS_int__malloc( sizeof(VMSReqst) ); 38.106 - req->reqType = semantic; 38.107 - req->semReqData = semReqData; 38.108 - req->nextReqst = callingSlv->requests; 38.109 - callingSlv->requests = req; 38.110 - } 38.111 - 38.112 -/*This inserts the semantic-layer's request data into standard VMS carrier 38.113 - * request data-struct is allocated on stack of this call & ptr to it sent 38.114 - * to plugin 38.115 - *Then it does suspend, to cause request to be sent. 38.116 - */ 38.117 -inline void 38.118 -VMS_WL__send_sem_request( void *semReqData, SlaveVP *callingSlv ) 38.119 - { VMSReqst req; 38.120 - 38.121 - req.reqType = semantic; 38.122 - req.semReqData = semReqData; 38.123 - req.nextReqst = callingSlv->requests; 38.124 - callingSlv->requests = &req; 38.125 - 38.126 - VMS_int__suspend_slaveVP_and_send_req( callingSlv ); 38.127 - } 38.128 - 38.129 - 38.130 -/*May 2012 Not sure what this is.. 
looks like old idea for VMS semantic 38.131 - * request 38.132 - */ 38.133 -inline void 38.134 -VMS_WL__send_VMSSem_request( void *semReqData, SlaveVP *callingSlv ) 38.135 - { VMSReqst req; 38.136 - 38.137 - req.reqType = VMSSemantic; 38.138 - req.semReqData = semReqData; 38.139 - req.nextReqst = callingSlv->requests; //gab any other preceeding 38.140 - callingSlv->requests = &req; 38.141 - 38.142 - VMS_int__suspend_slaveVP_and_send_req( callingSlv ); 38.143 - } 38.144 - 38.145 -/*May 2012 38.146 - *To throw exception from wrapper lib or application, first turn 38.147 - * it into a request, then send the request 38.148 - */ 38.149 -void 38.150 -VMS_WL__throw_exception( char *msgStr, SlaveVP *reqstSlv, VMSExcp *excpData ) 38.151 - { VMSReqst req; 38.152 - VMSSemReq semReq; 38.153 - 38.154 - req.reqType = VMSSemantic; 38.155 - req.semReqData = &semReq; 38.156 - req.nextReqst = reqstSlv->requests; //gab any other preceeding 38.157 - reqstSlv->requests = &req; 38.158 - 38.159 - semReq.msgStr = msgStr; 38.160 - semReq.exceptionData = excpData; 38.161 - 38.162 - VMS_int__suspend_slaveVP_and_send_req( reqstSlv ); 38.163 - }
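Note on the removed VMS__WL.c: the send functions build the VMSReqst on the caller's stack, push it onto the head of the slave's request list, and then suspend. That is safe only because the frame holding the request stays alive until the slave is resumed, by which time the handler has already taken the request off the list. The sketch below imitates this under loud assumptions: all Demo names are hypothetical, the suspend is replaced by a plain function that drains the list synchronously, and the isMallocd flag is an invented way to tell the handler which carriers it must free.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoReqst
 { int               isMallocd;   /* hypothetical: handler frees if set */
   void             *semReqData;
   struct DemoReqst *nextReqst;
 }
DemoReqst;

typedef struct
 { DemoReqst *requests;    /* head of the pending-request list */
   int        numHandled;  /* counts requests seen by the "handler" */
 }
DemoSlave;

/* Pop the head of the list, as VMS_PI__take_next_request_out_of does. */
DemoReqst *
demo_take_next_request( DemoSlave *slv )
 { DemoReqst *req = slv->requests;
   if( req == NULL ) return NULL;
   slv->requests = req->nextReqst;
   return req;
 }

/* Stand-in for suspend-and-send: the "handler" runs before we return. */
static void
demo_suspend_and_send( DemoSlave *slv )
 { DemoReqst *req;
   while( (req = demo_take_next_request( slv )) != NULL )
    { slv->numHandled += 1;
      if( req->isMallocd ) free( req );   /* handler owns heap carriers */
    }
 }

/* Queue an extra request without suspending; carrier must be on the heap
 * because this function's frame is gone before the handler runs. */
int
demo_add_mallocd_request( void *semReqData, DemoSlave *slv )
 { DemoReqst *req = malloc( sizeof(DemoReqst) );
   if( req == NULL ) return 0;
   req->isMallocd  = 1;
   req->semReqData = semReqData;
   req->nextReqst  = slv->requests;
   slv->requests   = req;
   return 1;
 }

/* Final request: carrier lives in this frame, which outlives the handler. */
void
demo_send_sem_request( void *semReqData, DemoSlave *slv )
 { DemoReqst req;
   req.isMallocd  = 0;
   req.semReqData = semReqData;
   req.nextReqst  = slv->requests;   /* keep any earlier requests */
   slv->requests  = &req;
   demo_suspend_and_send( slv );     /* list is empty again on return */
 }
```

Because requests are pushed at the head, the handler sees them in last-in, first-out order: the stack-allocated request sent last is taken out first.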
39.1 --- a/VMS__int.c Mon Sep 03 03:34:54 2012 -0700 39.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 39.3 @@ -1,289 +0,0 @@ 39.4 -/* 39.5 - * Copyright 2010 OpenSourceStewardshipFoundation 39.6 - * 39.7 - * Licensed under BSD 39.8 - */ 39.9 - 39.10 -#include <stdio.h> 39.11 -#include <stdlib.h> 39.12 -#include <string.h> 39.13 -#include <malloc.h> 39.14 -#include <inttypes.h> 39.15 -#include <sys/time.h> 39.16 - 39.17 -#include "VMS.h" 39.18 - 39.19 - 39.20 -/* MEANING OF WL PI SS int 39.21 - * These indicate which places the function is safe to use. They stand for: 39.22 - * WL: Wrapper Library 39.23 - * PI: Plugin 39.24 - * SS: Startup and Shutdown 39.25 - * int: internal to the VMS implementation 39.26 - */ 39.27 - 39.28 - 39.29 -inline SlaveVP * 39.30 -VMS_int__create_slaveVP( TopLevelFnPtr fnPtr, void *dataParam ) 39.31 - { SlaveVP *newSlv; 39.32 - void *stackLocs; 39.33 - 39.34 - newSlv = VMS_int__malloc( sizeof(SlaveVP) ); 39.35 - stackLocs = VMS_int__malloc( VIRT_PROCR_STACK_SIZE ); 39.36 - if( stackLocs == 0 ) 39.37 - { perror("VMS_int__malloc stack"); exit(1); } 39.38 - 39.39 - _VMSMasterEnv->numSlavesAlive += 1; 39.40 - 39.41 - return VMS_int__create_slaveVP_helper( newSlv, fnPtr, dataParam, stackLocs ); 39.42 - } 39.43 - 39.44 -/* "ext" designates that it's for use outside the VMS system -- should only 39.45 - * be called from main thread or other thread -- never from code animated by 39.46 - * a VMS virtual processor. 
39.47 - */ 39.48 -inline SlaveVP * 39.49 -VMS_ext__create_slaveVP( TopLevelFnPtr fnPtr, void *dataParam ) 39.50 - { SlaveVP *newSlv; 39.51 - char *stackLocs; 39.52 - 39.53 - newSlv = malloc( sizeof(SlaveVP) ); 39.54 - stackLocs = malloc( VIRT_PROCR_STACK_SIZE ); 39.55 - if( stackLocs == 0 ) 39.56 - { perror("malloc stack"); exit(1); } 39.57 - 39.58 - _VMSMasterEnv->numSlavesAlive += 1; 39.59 - 39.60 - return VMS_int__create_slaveVP_helper(newSlv, fnPtr, dataParam, stackLocs); 39.61 - } 39.62 - 39.63 - 39.64 -//=========================================================================== 39.65 -/*there is a label inside this function -- save the addr of this label in 39.66 - * the callingSlv struc, as the pick-up point from which to start the next 39.67 - * work-unit for that slave. If turns out have to save registers, then 39.68 - * save them in the slave struc too. Then do assembly jump to the CoreCtlr's 39.69 - * "done with work-unit" label. The slave struc is in the request in the 39.70 - * slave that animated the just-ended work-unit, so all the state is saved 39.71 - * there, and will get passed along, inside the request handler, to the 39.72 - * next work-unit for that slave. 39.73 - */ 39.74 -void 39.75 -VMS_int__suspend_slaveVP_and_send_req( SlaveVP *animatingSlv ) 39.76 - { 39.77 - 39.78 - //This suspended Slv will get assigned by Master again at some 39.79 - // future point 39.80 - 39.81 - //return ownership of the Slv and anim slot to Master virt pr 39.82 - animatingSlv->animSlotAssignedTo->workIsDone = TRUE; 39.83 - 39.84 - HOLISTIC__Record_HwResponderInvocation_start; 39.85 - MEAS__Capture_Pre_Susp_Point; 39.86 - //This assembly function is a VMS primitive that first saves the 39.87 - // stack and frame pointer, plus an addr inside this assembly code. 39.88 - //When core ctlr later gets this slave out of a sched slot, it 39.89 - // restores the stack and frame and then jumps to the addr.. that 39.90 - // jmp causes return from this function. 
39.91 - //So, in effect, this function takes a variable amount of wall-clock 39.92 - // time to complete -- the amount of time is determined by the 39.93 - // Master, which makes sure the memory is in a consistent state first. 39.94 - switchToCoreCtlr(animatingSlv); 39.95 - flushRegisters(); 39.96 - MEAS__Capture_Post_Susp_Point; 39.97 - 39.98 - return; 39.99 - } 39.100 - 39.101 - 39.102 -/* "ext" designates that it's for use outside the VMS system -- should only 39.103 - * be called from main thread or other thread -- never from code animated by 39.104 - * a SlaveVP, nor from a masterVP. 39.105 - * 39.106 - *Use this version to dissipate Slvs created outside the VMS system. 39.107 - */ 39.108 -void 39.109 -VMS_ext__dissipate_slaveVP( SlaveVP *slaveToDissipate ) 39.110 - { 39.111 - _VMSMasterEnv->numSlavesAlive -= 1; 39.112 - if( _VMSMasterEnv->numSlavesAlive == 0 ) 39.113 - { //no more work, so shutdown 39.114 - VMS_SS__shutdown(); //note, creates shut-down slaves on each core 39.115 - } 39.116 - 39.117 - //NOTE: dataParam was given to the processor, so should either have 39.118 - // been alloc'd with VMS_int__malloc, or freed by the level above animSlv. 39.119 - //So, all that's left to free here is the stack and the SlaveVP struc 39.120 - // itself 39.121 - //Note, should not stack-allocate the data param -- no guarantee, in 39.122 - // general that creating processor will outlive ones it creates. 39.123 - free( slaveToDissipate->startOfStack ); 39.124 - free( slaveToDissipate ); 39.125 - } 39.126 - 39.127 - 39.128 - 39.129 -/*This must be called by the request handler plugin -- it cannot be called 39.130 - * from the semantic library "dissipate processor" function -- instead, the 39.131 - * semantic layer has to generate a request, and the plug-in calls this 39.132 - * function. 39.133 - *The reason is that this frees the virtual processor's stack -- which is 39.134 - * still in use inside semantic library calls! 
39.135 - * 39.136 - *This frees or recycles all the state owned by and comprising the VMS 39.137 - * portion of the animating virtual procr. The request handler must first 39.138 - * free any semantic data created for the processor that didn't use the 39.139 - * VMS_malloc mechanism. Then it calls this, which first asks the malloc 39.140 - * system to disown any state that did use VMS_malloc, and then frees the 39.141 - * statck and the processor-struct itself. 39.142 - *If the dissipated processor is the sole (remaining) owner of VMS_int__malloc'd 39.143 - * state, then that state gets freed (or sent to recycling) as a side-effect 39.144 - * of dis-owning it. 39.145 - */ 39.146 -void 39.147 -VMS_int__dissipate_slaveVP( SlaveVP *animatingSlv ) 39.148 - { 39.149 - DEBUG__printf2(dbgRqstHdlr, "VMS int dissipate slaveID: %d, alive: %d",animatingSlv->slaveID, _VMSMasterEnv->numSlavesAlive-1); 39.150 - //dis-own all locations owned by this processor, causing to be freed 39.151 - // any locations that it is (was) sole owner of 39.152 - _VMSMasterEnv->numSlavesAlive -= 1; 39.153 - if( _VMSMasterEnv->numSlavesAlive == 0 ) 39.154 - { //no more work, so shutdown 39.155 - VMS_SS__shutdown(); //note, creates shut-down processor on each core 39.156 - } 39.157 - 39.158 - //NOTE: dataParam was given to the processor, so should either have 39.159 - // been alloc'd with VMS_int__malloc, or freed by the level above animSlv. 39.160 - //So, all that's left to free here is the stack and the SlaveVP struc 39.161 - // itself 39.162 - //Note, should not stack-allocate initial data -- no guarantee, in 39.163 - // general that creating processor will outlive ones it creates. 
39.164 - VMS_int__free( animatingSlv->startOfStack ); 39.165 - VMS_int__free( animatingSlv ); 39.166 - } 39.167 - 39.168 -/*Anticipating multi-tasking 39.169 - */ 39.170 -void * 39.171 -VMS_int__give_sem_env_for( SlaveVP *animSlv ) 39.172 - { 39.173 - return _VMSMasterEnv->semanticEnv; 39.174 - } 39.175 - 39.176 -/* 39.177 - * 39.178 - */ 39.179 -inline SlaveVP * 39.180 -VMS_int__create_slaveVP_helper( SlaveVP *newSlv, TopLevelFnPtr fnPtr, 39.181 - void *dataParam, void *stackLocs ) 39.182 - { 39.183 - newSlv->startOfStack = stackLocs; 39.184 - newSlv->slaveID = _VMSMasterEnv->numSlavesCreated++; 39.185 - newSlv->requests = NULL; 39.186 - newSlv->animSlotAssignedTo = NULL; 39.187 - newSlv->typeOfVP = Slave; 39.188 - newSlv->assignCount = 0; 39.189 - 39.190 - VMS_int__reset_slaveVP_to_TopLvlFn( newSlv, fnPtr, dataParam ); 39.191 - 39.192 - //============================= MEASUREMENT STUFF ======================== 39.193 - #ifdef PROBES__TURN_ON_STATS_PROBES 39.194 - //TODO: make this TSCHiLow or generic equivalent 39.195 - //struct timeval timeStamp; 39.196 - //gettimeofday( &(timeStamp), NULL); 39.197 - //newSlv->createPtInSecs = timeStamp.tv_sec +(timeStamp.tv_usec/1000000.0) - 39.198 - // _VMSMasterEnv->createPtInSecs; 39.199 - #endif 39.200 - //======================================================================== 39.201 - 39.202 - return newSlv; 39.203 - } 39.204 - 39.205 - 39.206 -/*Later, improve this -- for now, just exits the application after printing 39.207 - * the error message. 
39.208 - */ 39.209 -void 39.210 -VMS_int__throw_exception( char *msgStr, SlaveVP *reqstSlv, VMSExcp *excpData ) 39.211 - { 39.212 - printf("%s",msgStr); 39.213 - fflush(stdin); 39.214 - exit(1); 39.215 - } 39.216 - 39.217 - 39.218 -inline char * 39.219 -VMS_int__strDup( char *str ) 39.220 - { char *retStr; 39.221 - 39.222 - if( str == NULL ) return (char *)NULL; 39.223 - retStr = (char *)VMS_int__malloc( strlen(str) + 1 ); 39.224 - strcpy( retStr, str ); 39.225 - 39.226 - return (char *)retStr; 39.227 - } 39.228 - 39.229 - 39.230 -inline void 39.231 -VMS_int__backoff_for_TooLongToGetLock( int32 numTriesToGetLock ); 39.232 - 39.233 -inline void 39.234 -VMS_int__get_master_lock() 39.235 - { int32 *addrOfMasterLock; 39.236 - 39.237 - addrOfMasterLock = &(_VMSMasterEnv->masterLock); 39.238 - 39.239 - int numTriesToGetLock = 0; 39.240 - int gotLock = 0; 39.241 - 39.242 - MEAS__Capture_Pre_Master_Lock_Point; 39.243 - 39.244 - while( !gotLock ) //keep going until get master lock 39.245 - { 39.246 - numTriesToGetLock++; //if too many, means too much contention 39.247 - if( numTriesToGetLock > NUM_TRIES_BEFORE_DO_BACKOFF ) 39.248 - { VMS_int__backoff_for_TooLongToGetLock( numTriesToGetLock ); 39.249 - } 39.250 - if( numTriesToGetLock > MASTERLOCK_RETRIES_BEFORE_YIELD ) 39.251 - { numTriesToGetLock = 0; 39.252 - pthread_yield(); 39.253 - } 39.254 - 39.255 - //try to get the lock 39.256 - gotLock = __sync_bool_compare_and_swap( addrOfMasterLock, 39.257 - UNLOCKED, LOCKED ); 39.258 - } 39.259 - MEAS__Capture_Post_Master_Lock_Point; 39.260 - } 39.261 - 39.262 -/*Used by the backoff to pick a random amount of busy-wait. Can't use the 39.263 - * system rand because it takes much too long. 
39.264 - *Note, are passing pointers to the seeds, which are then modified 39.265 - */ 39.266 -inline uint32_t 39.267 -VMS_int__randomNumber() 39.268 - { 39.269 - _VMSMasterEnv->seed1 = 36969 * (_VMSMasterEnv->seed1 & 65535) + 39.270 - (_VMSMasterEnv->seed1 >> 16); 39.271 - _VMSMasterEnv->seed2 = 18000 * (_VMSMasterEnv->seed2 & 65535) + 39.272 - (_VMSMasterEnv->seed2 >> 16); 39.273 - return (_VMSMasterEnv->seed1 << 16) + _VMSMasterEnv->seed2; 39.274 - } 39.275 - 39.276 - 39.277 -/*Busy-waits for a random number of cycles -- chooses number of cycles 39.278 - * differently than for the no-work backoff 39.279 - */ 39.280 -inline void 39.281 -VMS_int__backoff_for_TooLongToGetLock( int32 numTriesToGetLock ) 39.282 - { int32 i, waitIterations; 39.283 - volatile double fakeWorkVar; //busy-wait fake work 39.284 - 39.285 - waitIterations = 39.286 - VMS_int__randomNumber()% (numTriesToGetLock * GET_LOCK_BACKOFF_WEIGHT); 39.287 - //addToHist( wait_iterations, coreLoopThdParams->wait_iterations_hist ); 39.288 - for( i = 0; i < waitIterations; i++ ) 39.289 - { fakeWorkVar += (fakeWorkVar + 32.0) / 2.0; //busy-wait 39.290 - } 39.291 - } 39.292 -
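Note on the removed VMS__int.c: VMS_int__get_master_lock spins on a compare-and-swap and, once contention is detected, busy-waits for a random number of iterations drawn from a cheap two-seed multiply-with-carry generator. The sketch below shows the same combination in portable form. It is not the VMS code: it uses C11 <stdatomic.h> in place of the GCC __sync_bool_compare_and_swap builtin, all Demo names and the DEMO_BACKOFF_WEIGHT value are assumptions, the pthread_yield step is left out, and the busy-wait variable is initialized, which the removed code does not do.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_UNLOCKED 0
#define DEMO_LOCKED   1
#define DEMO_BACKOFF_WEIGHT 16u   /* assumed; stands in for GET_LOCK_BACKOFF_WEIGHT */

typedef struct { uint32_t seed1, seed2; } DemoSeeds;

/* Same arithmetic as the removed VMS_int__randomNumber, on local seeds. */
uint32_t
demo_random( DemoSeeds *s )
 { s->seed1 = 36969 * (s->seed1 & 65535) + (s->seed1 >> 16);
   s->seed2 = 18000 * (s->seed2 & 65535) + (s->seed2 >> 16);
   return (s->seed1 << 16) + s->seed2;
 }

/* Busy-wait for a random number of iterations that grows with contention. */
static void
demo_backoff( DemoSeeds *seeds, int numTries )
 { uint32_t i, waitIterations;
   volatile double fakeWork = 0.0;   /* volatile so the loop is not removed */

   waitIterations = demo_random( seeds ) %
                    ((uint32_t)numTries * DEMO_BACKOFF_WEIGHT);
   for( i = 0; i < waitIterations; i++ )
      fakeWork += (fakeWork + 32.0) / 2.0;
 }

/* Spin until the lock is taken; returns how many attempts were needed. */
int
demo_get_lock( atomic_int *lock, DemoSeeds *seeds, int triesBeforeBackoff )
 { int numTries = 0;

   for(;;)
    { int expected = DEMO_UNLOCKED;  /* CAS overwrites this on failure */
      numTries++;
      if( atomic_compare_exchange_strong( lock, &expected, DEMO_LOCKED ) )
         return numTries;
      if( numTries > triesBeforeBackoff )
         demo_backoff( seeds, numTries );
    }
 }

void
demo_release_lock( atomic_int *lock )
 { atomic_store( lock, DEMO_UNLOCKED );
 }
```

Randomizing the wait spreads out the retry attempts of different cores, so they are less likely to collide on the lock again at the same moment.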
40.1 --- a/VMS__startup_and_shutdown.c Mon Sep 03 03:34:54 2012 -0700 40.2 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 40.3 @@ -1,598 +0,0 @@ 40.4 -/* 40.5 - * Copyright 2010 OpenSourceStewardshipFoundation 40.6 - * 40.7 - * Licensed under BSD 40.8 - */ 40.9 - 40.10 -#include <stdio.h> 40.11 -#include <stdlib.h> 40.12 -#include <string.h> 40.13 -#include <malloc.h> 40.14 -#include <inttypes.h> 40.15 -#include <sys/time.h> 40.16 -#include <pthread.h> 40.17 - 40.18 -#include "VMS.h" 40.19 - 40.20 - 40.21 -#define thdAttrs NULL 40.22 - 40.23 - 40.24 -/* MEANING OF WL PI SS int 40.25 - * These indicate which places the function is safe to use. They stand for: 40.26 - * WL: Wrapper Library 40.27 - * PI: Plugin 40.28 - * SS: Startup and Shutdown 40.29 - * int: internal to the VMS implementation 40.30 - */ 40.31 - 40.32 - 40.33 -//=========================================================================== 40.34 -AnimSlot ** 40.35 -create_anim_slots( int32 coreSlotsAreOn ); 40.36 - 40.37 -void 40.38 -create_masterEnv(); 40.39 - 40.40 -void 40.41 -create_the_coreCtlr_OS_threads(); 40.42 - 40.43 -MallocProlog * 40.44 -create_free_list(); 40.45 - 40.46 -void 40.47 -endOSThreadFn( void *initData, SlaveVP *animatingSlv ); 40.48 - 40.49 - 40.50 -//=========================================================================== 40.51 - 40.52 -/*Setup has two phases: 40.53 - * 1) Semantic layer first calls init_VMS, which creates masterEnv, and puts 40.54 - * the master Slv into the work-queue, ready for first "call" 40.55 - * 2) Semantic layer then does its own init, which creates the seed virt 40.56 - * slave inside the semantic layer, ready to assign it when 40.57 - * asked by the first run of the animationMaster. 40.58 - * 40.59 - *This part is bit weird because VMS really wants to be "always there", and 40.60 - * have applications attach and detach.. for now, this VMS is part of 40.61 - * the app, so the VMS system starts up as part of running the app. 
40.62 - * 40.63 - *The semantic layer is isolated from the VMS internals by making the 40.64 - * semantic layer do setup to a state that it's ready with its 40.65 - * initial Slvs, ready to assign them to slots when the animationMaster 40.66 - * asks. Without this pattern, the semantic layer's setup would 40.67 - * have to modify slots directly to assign the initial virt-procrs, and put 40.68 - * them into the readyToAnimateQ itself, breaking the isolation completely. 40.69 - * 40.70 - * 40.71 - *The semantic layer creates the initial Slv(s), and adds its 40.72 - * own environment to masterEnv, and fills in the pointers to 40.73 - * the requestHandler and slaveAssigner plug-in functions 40.74 - */ 40.75 - 40.76 -/*This allocates VMS data structures, populates the master VMSProc, 40.77 - * and master environment, and returns the master environment to the semantic 40.78 - * layer. 40.79 - */ 40.80 -void 40.81 -VMS_SS__init() 40.82 - { 40.83 - #ifdef DEBUG__TURN_ON_SEQUENTIAL_MODE 40.84 - create_masterEnv(); 40.85 - printf( "\n\n Running in SEQUENTIAL mode \n\n" ); 40.86 - #else 40.87 - create_masterEnv(); 40.88 - DEBUG__printf1(TRUE,"Offset of lock in masterEnv: %d ", (int32)offsetof(MasterEnv,masterLock) ); 40.89 - create_the_coreCtlr_OS_threads(); 40.90 - #endif 40.91 - } 40.92 - 40.93 - 40.94 -/*TODO: finish implementing 40.95 - *This function returns information about the version of VMS, the language 40.96 - * the program is being run in, its version, and information on the 40.97 - * hardware. 
40.98 - */ 40.99 -/* 40.100 -char * 40.101 -VMS_App__give_environment_string() 40.102 - { 40.103 - //-------------------------- 40.104 - fprintf(output, "#\n# >> Build information <<\n"); 40.105 - fprintf(output, "# GCC VERSION: %d.%d.%d\n",__GNUC__,__GNUC_MINOR__,__GNUC_PATCHLEVEL__); 40.106 - fprintf(output, "# Build Date: %s %s\n", __DATE__, __TIME__); 40.107 - 40.108 - fprintf(output, "#\n# >> Hardware information <<\n"); 40.109 - fprintf(output, "# Hardware Architecture: "); 40.110 - #ifdef __x86_64 40.111 - fprintf(output, "x86_64"); 40.112 - #endif //__x86_64 40.113 - #ifdef __i386 40.114 - fprintf(output, "x86"); 40.115 - #endif //__i386 40.116 - fprintf(output, "\n"); 40.117 - fprintf(output, "# Number of Cores: %d\n", NUM_CORES); 40.118 - //-------------------------- 40.119 - 40.120 - //VMS Plugins 40.121 - fprintf(output, "#\n# >> VMS Plugins <<\n"); 40.122 - fprintf(output, "# Language : "); 40.123 - fprintf(output, _LANG_NAME_); 40.124 - fprintf(output, "\n"); 40.125 - //Meta info gets set by calls from the language during its init, 40.126 - // and info registered by calls from inside the application 40.127 - fprintf(output, "# Assigner: %s\n", _VMSMasterEnv->metaInfo->assignerInfo); 40.128 - 40.129 - //-------------------------- 40.130 - //Application 40.131 - fprintf(output, "#\n# >> Application <<\n"); 40.132 - fprintf(output, "# Name: %s\n", _VMSMasterEnv->metaInfo->appInfo); 40.133 - fprintf(output, "# Data Set:\n%s\n",_VMSMasterEnv->metaInfo->inputSet); 40.134 - 40.135 - //-------------------------- 40.136 - } 40.137 - */ 40.138 - 40.139 -/*This structure holds all the information VMS needs to manage a program. VMS 40.140 - * stores information about what percent of CPU time the program is getting, what 40.141 - * language it uses, the request handlers to call for its slaves, and so on. 
40.142 - */ 40.143 -/* 40.144 -typedef struct 40.145 - { void *semEnv; 40.146 - RequestHdlrFnPtr requestHandler; 40.147 - SlaveAssignerFnPtr slaveAssigner; 40.148 - int32 numSlavesLive; 40.149 - void *resultToReturn; 40.150 - 40.151 - TopLevelFnPtr seedFnPtr; 40.152 - void *dataForSeed; 40.153 - bool32 executionIsComplete; 40.154 - pthread_mutex_t doneLock; 40.155 - pthread_cond_t doneCond; 40.156 - } 40.157 -VMSProcess; 40.158 -*/ 40.159 - 40.160 - 40.161 -/* 40.162 -void 40.163 -VMS_App__start_VMS_running() 40.164 - { 40.165 - create_masterEnv(); 40.166 - 40.167 - #ifdef DEBUG__TURN_ON_SEQUENTIAL_MODE 40.168 - //Nothing else to create for sequential mode 40.169 - #else 40.170 - create_the_coreCtlr_OS_threads(); 40.171 - #endif 40.172 - } 40.173 -*/ 40.174 - 40.175 -/*A pointer to the startup-function for the language is given as the last 40.176 - * argument to the call. Use this to initialize a program in the language. 40.177 - * This creates a data structure that encapsulates the bookkeeping info 40.178 - * VMS uses to track and schedule a program run. 40.179 - */ 40.180 -/* 40.181 -VMSProcess * 40.182 -VMS_App__spawn_program_on_data_in_Lang( TopLevelFnPtr prog_seed_fn, void *data, 40.183 - LangInitFnPtr langInitFnPtr ) 40.184 - { VMSProcess *newProcess; 40.185 - newProcess = malloc( sizeof(VMSProcess) ); 40.186 - newProcess->doneLock = PTHREAD_MUTEX_INITIALIZER; 40.187 - newProcess->doneCond = PTHREAD_COND_INITIALIZER; 40.188 - newProcess->executionIsComplete = FALSE; 40.189 - newProcess->numSlavesLive = 0; 40.190 - 40.191 - newProcess->dataForSeed = data; 40.192 - newProcess->seedFnPtr = prog_seed_fn; 40.193 - 40.194 - //The language's spawn-process function fills in the plugin function-ptrs in 40.195 - // the VMSProcess struct, gives the struct to VMS, which then makes and 40.196 - // queues the seed SlaveVP, which starts processors made from the code being 40.197 - // animated. 
40.198 - 40.199 - (*langInitFnPtr)( newProcess ); 40.200 - 40.201 - return newProcess; 40.202 - } 40.203 -*/ 40.204 - 40.205 -/*When all SlaveVPs owned by the program-run associated to the process have 40.206 - * dissipated, then return from this call. There is no language to cleanup, 40.207 - * and VMS does not shutdown.. but the process bookkeeping structure, 40.208 - * which is used by VMS to track and schedule the program, is freed. 40.209 - *The VMSProcess structure is kept until this call collects the results from it, 40.210 - * then freed. If the process is not done yet when VMS gets this 40.211 - * call, then this call waits.. the challenge here is that this call comes from 40.212 - * a live OS thread that's outside VMS.. so, inside here, it waits on a 40.213 - * condition.. then it's a VMS thread that signals this to wake up.. 40.214 - *First checks whether the process is done, if yes, calls the clean-up fn then 40.215 - * returns the result extracted from the VMSProcess struct. 40.216 - *If process not done yet, then performs a wait (in a loop to be sure the 40.217 - * wakeup is not spurious, which can happen). VMS registers the wait, and upon 40.218 - * the process ending (last SlaveVP owned by it dissipates), then VMS signals 40.219 - * this to wakeup. This then calls the cleanup fn and returns the result. 
40.220 - */ 40.221 -/* 40.222 -void * 40.223 -VMS_App__give_results_when_done_for( VMSProcess *process ) 40.224 - { void *result; 40.225 - 40.226 - pthread_mutex_lock( process->doneLock ); 40.227 - while( !(process->executionIsComplete) ) 40.228 - { 40.229 - pthread_cond_wait( process->doneCond, 40.230 - process->doneLock ); 40.231 - } 40.232 - pthread_mutex_unlock( process->doneLock ); 40.233 - 40.234 - result = process->resultToReturn; 40.235 - 40.236 - VMS_int__cleanup_process_after_done( process ); 40.237 - free( process ); //was malloc'd above, so free it here 40.238 - 40.239 - return result; 40.240 - } 40.241 -*/ 40.242 - 40.243 -/*Turns off the VMS system, and frees all data associated with it. Does this 40.244 - * by creating shutdown SlaveVPs and inserting them into animation slots. 40.245 - * Will probably have to wake up sleeping cores as part of this -- the fn that 40.246 - * inserts the new SlaveVPs should handle the wakeup.. 40.247 - */ 40.248 -/* 40.249 -void 40.250 -VMS_SS__shutdown(); //already defined -- look at it 40.251 - 40.252 -void 40.253 -VMS_App__shutdown() 40.254 - { 40.255 - for( cores ) 40.256 - { slave = VMS_int__create_new_SlaveVP( endOSThreadFn, NULL ); 40.257 - VMS_int__insert_slave_onto_core( SlaveVP *slave, coreNum ); 40.258 - } 40.259 - } 40.260 -*/ 40.261 - 40.262 -/* VMS_App__start_VMS_running(); 40.263 - 40.264 - VMSProcess matrixMultProcess; 40.265 - 40.266 - matrixMultProcess = 40.267 - VMS_App__spawn_program_on_data_in_Lang( &prog_seed_fn, data, Vthread_lang ); 40.268 - 40.269 - resMatrix = VMS_App__give_results_when_done_for( matrixMultProcess ); 40.270 - 40.271 - VMS_App__shutdown(); 40.272 - */ 40.273 - 40.274 -void 40.275 -create_masterEnv() 40.276 - { MasterEnv *masterEnv; 40.277 - VMSQueueStruc **readyToAnimateQs; 40.278 - int coreIdx; 40.279 - SlaveVP **masterVPs; 40.280 - AnimSlot ***allAnimSlots; //ptr to array of ptrs 40.281 - 40.282 - 40.283 - //Make the master env, which holds everything else 40.284 - 
_VMSMasterEnv = malloc( sizeof(MasterEnv) ); 40.285 - 40.286 - //Very first thing put into the master env is the free-list, seeded 40.287 - // with a massive initial chunk of memory. 40.288 - //After this, all other mallocs are VMS__malloc. 40.289 - _VMSMasterEnv->freeLists = VMS_ext__create_free_list(); 40.290 - 40.291 - 40.292 - //===================== Only VMS__malloc after this ==================== 40.293 - masterEnv = (MasterEnv*)_VMSMasterEnv; 40.294 - 40.295 - //Make a readyToAnimateQ for each core controller 40.296 - readyToAnimateQs = VMS_int__malloc( NUM_CORES * sizeof(VMSQueueStruc *) ); 40.297 - masterVPs = VMS_int__malloc( NUM_CORES * sizeof(SlaveVP *) ); 40.298 - 40.299 - //One array for each core, several in array, core's masterVP scheds all 40.300 - allAnimSlots = VMS_int__malloc( NUM_CORES * sizeof(AnimSlot *) ); 40.301 - 40.302 - _VMSMasterEnv->numSlavesAlive = 0; //used to detect shut-down condition 40.303 - 40.304 - _VMSMasterEnv->numSlavesCreated = 0; //used by create slave to set ID 40.305 - for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ ) 40.306 - { 40.307 - readyToAnimateQs[ coreIdx ] = makeVMSQ(); 40.308 - 40.309 - //Q: should give masterVP core-specific info as its init data? 
40.310 - masterVPs[ coreIdx ] = VMS_int__create_slaveVP( (TopLevelFnPtr)&animationMaster, (void*)masterEnv ); 40.311 - masterVPs[ coreIdx ]->coreAnimatedBy = coreIdx; 40.312 - masterVPs[ coreIdx ]->typeOfVP = Master; 40.313 - allAnimSlots[ coreIdx ] = create_anim_slots( coreIdx ); //makes for one core 40.314 - } 40.315 - _VMSMasterEnv->masterVPs = masterVPs; 40.316 - _VMSMasterEnv->masterLock = UNLOCKED; 40.317 - _VMSMasterEnv->seed1 = rand()%1000; // init random number generator 40.318 - _VMSMasterEnv->seed2 = rand()%1000; // init random number generator 40.319 - _VMSMasterEnv->allAnimSlots = allAnimSlots; 40.320 - _VMSMasterEnv->measHistsInfo = NULL; 40.321 - 40.322 - //============================= MEASUREMENT STUFF ======================== 40.323 - 40.324 - MEAS__Make_Meas_Hists_for_Susp_Meas; 40.325 - MEAS__Make_Meas_Hists_for_Master_Meas; 40.326 - MEAS__Make_Meas_Hists_for_Master_Lock_Meas; 40.327 - MEAS__Make_Meas_Hists_for_Malloc_Meas; 40.328 - MEAS__Make_Meas_Hists_for_Plugin_Meas; 40.329 - MEAS__Make_Meas_Hists_for_Language; 40.330 - 40.331 - PROBES__Create_Probe_Bookkeeping_Vars; 40.332 - 40.333 - HOLISTIC__Setup_Perf_Counters; 40.334 - 40.335 - //======================================================================== 40.336 - } 40.337 - 40.338 -AnimSlot ** 40.339 -create_anim_slots( int32 coreSlotsAreOn ) 40.340 - { AnimSlot **animSlots; 40.341 - int i; 40.342 - 40.343 - animSlots = VMS_int__malloc( NUM_ANIM_SLOTS * sizeof(AnimSlot *) ); 40.344 - 40.345 - for( i = 0; i < NUM_ANIM_SLOTS; i++ ) 40.346 - { 40.347 - animSlots[i] = VMS_int__malloc( sizeof(AnimSlot) ); 40.348 - 40.349 - //Set state to mean "handling requests done, slot needs filling" 40.350 - animSlots[i]->workIsDone = FALSE; 40.351 - animSlots[i]->needsSlaveAssigned = TRUE; 40.352 - animSlots[i]->slotIdx = i; //quick retrieval of slot pos 40.353 - animSlots[i]->coreSlotIsOn = coreSlotsAreOn; 40.354 - } 40.355 - return animSlots; 40.356 - } 40.357 - 40.358 - 40.359 -void 40.360 
-freeAnimSlots( AnimSlot **animSlots ) 40.361 - { int i; 40.362 - for( i = 0; i < NUM_ANIM_SLOTS; i++ ) 40.363 - { 40.364 - VMS_int__free( animSlots[i] ); 40.365 - } 40.366 - VMS_int__free( animSlots ); 40.367 - } 40.368 - 40.369 - 40.370 -void 40.371 -create_the_coreCtlr_OS_threads() 40.372 - { 40.373 - //======================================================================== 40.374 - // Create the Threads 40.375 - int coreIdx, retCode; 40.376 - 40.377 - //Need the threads to be created suspended, and wait for a signal 40.378 - // before proceeding -- gives time after creating to initialize other 40.379 - // stuff before the coreCtlrs set off. 40.380 - _VMSMasterEnv->setupComplete = 0; 40.381 - 40.382 - //initialize the cond used to make the new threads wait and sync up 40.383 - //must do this before *creating* the threads.. 40.384 - pthread_mutex_init( &suspendLock, NULL ); 40.385 - pthread_cond_init( &suspendCond, NULL ); 40.386 - 40.387 - //Make the threads that animate the core controllers 40.388 - for( coreIdx=0; coreIdx < NUM_CORES; coreIdx++ ) 40.389 - { coreCtlrThdParams[coreIdx] = VMS_int__malloc( sizeof(ThdParams) ); 40.390 - coreCtlrThdParams[coreIdx]->coreNum = coreIdx; 40.391 - 40.392 - retCode = 40.393 - pthread_create( &(coreCtlrThdHandles[coreIdx]), 40.394 - thdAttrs, 40.395 - &coreController, 40.396 - (void *)(coreCtlrThdParams[coreIdx]) ); 40.397 - if(retCode){printf("ERROR creating thread: %d\n", retCode); exit(1);} 40.398 - } 40.399 - } 40.400 - 40.401 - 40.402 - 40.403 -void 40.404 -VMS_SS__register_request_handler( RequestHandler requestHandler ) 40.405 - { _VMSMasterEnv->requestHandler = requestHandler; 40.406 - } 40.407 - 40.408 - 40.409 -void 40.410 -VMS_SS__register_anim_assigner( SlaveAssigner animAssigner ) 40.411 - { _VMSMasterEnv->slaveAssigner = animAssigner; 40.412 - } 40.413 - 40.414 -VMS_SS__register_semantic_env( void *semanticEnv ) 40.415 - { _VMSMasterEnv->semanticEnv = semanticEnv; 40.416 - } 40.417 - 40.418 - 40.419 -/*This 
is what causes the VMS system to initialize.. then waits for it to 40.420 - * exit. 40.421 - * 40.422 - *Wrapper lib layer calls this when it wants the system to start running.. 40.423 - */ 40.424 -void 40.425 -VMS_SS__start_the_work_then_wait_until_done() 40.426 - { 40.427 -#ifdef DEBUG__TURN_ON_SEQUENTIAL_MODE 40.428 - /*Only difference between version with an OS thread pinned to each core and 40.429 - * the sequential version of VMS is VMS__init_Seq, this, and coreCtlr_Seq. 40.430 - */ 40.431 - //Instead of un-suspending threads, just call the one and only 40.432 - // core ctlr (sequential version), in the main thread. 40.433 - coreCtlr_Seq( NULL ); 40.434 - flushRegisters(); 40.435 -#else 40.436 - int coreIdx; 40.437 - //Start the core controllers running 40.438 - 40.439 - //tell the core controller threads that setup is complete 40.440 - //get lock, to lock out any threads still starting up -- they'll see 40.441 - // that setupComplete is true before entering while loop, and so never 40.442 - // wait on the condition 40.443 - pthread_mutex_lock( &suspendLock ); 40.444 - _VMSMasterEnv->setupComplete = 1; 40.445 - pthread_mutex_unlock( &suspendLock ); 40.446 - pthread_cond_broadcast( &suspendCond ); 40.447 - 40.448 - 40.449 - //wait for all to complete 40.450 - for( coreIdx=0; coreIdx < NUM_CORES; coreIdx++ ) 40.451 - { 40.452 - pthread_join( coreCtlrThdHandles[coreIdx], NULL ); 40.453 - } 40.454 - 40.455 - //NOTE: do not clean up VMS env here -- semantic layer has to have 40.456 - // a chance to clean up its environment first, then do a call to free 40.457 - // the Master env and rest of VMS locations 40.458 -#endif 40.459 - } 40.460 - 40.461 - 40.462 -SlaveVP* VMS_SS__create_shutdown_slave(){ 40.463 - SlaveVP* shutdownVP; 40.464 - 40.465 - shutdownVP = VMS_int__create_slaveVP( &endOSThreadFn, NULL ); 40.466 - shutdownVP->typeOfVP = Shutdown; 40.467 - 40.468 - return shutdownVP; 40.469 -} 40.470 - 40.471 -//TODO: look at architecting cleanest separation between 
request handler 40.472 -// and animation master, for dissipate, create, shutdown, and other non-semantic 40.473 -// requests. Issue is chain: one removes requests from AppSlv, one dispatches 40.474 -// on type of request, and one handles each type.. but some types require 40.475 -// action from both request handler and animation master -- maybe just give the 40.476 -// request handler calls like: VMS__handle_X_request_type 40.477 - 40.478 - 40.479 -/*This is called by the semantic layer's request handler when it decides its 40.480 - * time to shut down the VMS system. Calling this causes the core controller OS 40.481 - * threads to exit, which unblocks the entry-point function that started up 40.482 - * VMS, and allows it to grab the result and return to the original single- 40.483 - * threaded application. 40.484 - * 40.485 - *The _VMSMasterEnv is needed by this shut down function, so the create-seed- 40.486 - * and-wait function has to free a bunch of stuff after it detects the 40.487 - * threads have all died: the masterEnv, the thread-related locations, 40.488 - * masterVP any AppSlvs that might still be allocated and sitting in the 40.489 - * semantic environment, or have been orphaned in the _VMSWorkQ. 40.490 - * 40.491 - *NOTE: the semantic plug-in is expected to use VMS__malloc to get all the 40.492 - * locations it needs, and give ownership to masterVP. Then, they will be 40.493 - * automatically freed. 40.494 - * 40.495 - *In here,create one core-loop shut-down processor for each core controller and put 40.496 - * them all directly into the readyToAnimateQ. 40.497 - *Note, this function can ONLY be called after the semantic environment no 40.498 - * longer cares if AppSlvs get animated after the point this is called. In 40.499 - * other words, this can be used as an abort, or else it should only be 40.500 - * called when all AppSlvs have finished dissipate requests -- only at that 40.501 - * point is it sure that all results have completed. 
40.502 - */ 40.503 -void 40.504 -VMS_SS__shutdown() 40.505 - { int32 coreIdx; 40.506 - SlaveVP *shutDownSlv; 40.507 - AnimSlot **animSlots; 40.508 - //create the shutdown processors, one for each core controller -- put them 40.509 - // directly into the Q -- each core will die when gets one 40.510 - for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ ) 40.511 - { //Note, this is running in the master 40.512 - shutDownSlv = VMS_SS__create_shutdown_slave(); 40.513 - //last slave has dissipated, so no more in slots, so write 40.514 - // shut down slave into first animulng slot. 40.515 - animSlots = _VMSMasterEnv->allAnimSlots[ coreIdx ]; 40.516 - animSlots[0]->slaveAssignedToSlot = shutDownSlv; 40.517 - animSlots[0]->needsSlaveAssigned = FALSE; 40.518 - shutDownSlv->coreAnimatedBy = coreIdx; 40.519 - shutDownSlv->animSlotAssignedTo = animSlots[ 0 ]; 40.520 - } 40.521 - } 40.522 - 40.523 - 40.524 -/*Am trying to be cute, avoiding IF statement in coreCtlr that checks for 40.525 - * a special shutdown slaveVP. Ended up with extra-complex shutdown sequence. 40.526 - *This function has the sole purpose of setting the stack and framePtr 40.527 - * to the coreCtlr's stack and framePtr.. it does that then jumps to the 40.528 - * core ctlr's shutdown point -- might be able to just call Pthread_exit 40.529 - * from here, but am going back to the pthread's stack and setting everything 40.530 - * up just as if it never jumped out, before calling pthread_exit. 40.531 - *The end-point of core ctlr will free the stack and so forth of the 40.532 - * processor that animates this function, (this fn is transfering the 40.533 - * animator of the AppSlv that is in turn animating this function over 40.534 - * to core controller function -- note that this slices out a level of virtual 40.535 - * processors). 
40.536 - */ 40.537 -void 40.538 -endOSThreadFn( void *initData, SlaveVP *animatingSlv ) 40.539 - { 40.540 - #ifdef DEBUG__TURN_ON_SEQUENTIAL_MODE 40.541 - asmTerminateCoreCtlrSeq(animatingSlv); 40.542 - #else 40.543 - asmTerminateCoreCtlr(animatingSlv); 40.544 - #endif 40.545 - } 40.546 - 40.547 - 40.548 -/*This is called from the startup & shutdown 40.549 - */ 40.550 -void 40.551 -VMS_SS__cleanup_at_end_of_shutdown() 40.552 - { 40.553 - //Before getting rid of everything, print out any measurements made 40.554 - if( _VMSMasterEnv->measHistsInfo != NULL ) 40.555 - { forAllInDynArrayDo( _VMSMasterEnv->measHistsInfo, (DynArrayFnPtr)&printHist ); 40.556 - forAllInDynArrayDo( _VMSMasterEnv->measHistsInfo, (DynArrayFnPtr)&saveHistToFile); 40.557 - forAllInDynArrayDo( _VMSMasterEnv->measHistsInfo, (DynArrayFnPtr)&freeHist ); 40.558 - } 40.559 - 40.560 - MEAS__Print_Hists_for_Susp_Meas; 40.561 - MEAS__Print_Hists_for_Master_Meas; 40.562 - MEAS__Print_Hists_for_Master_Lock_Meas; 40.563 - MEAS__Print_Hists_for_Malloc_Meas; 40.564 - MEAS__Print_Hists_for_Plugin_Meas; 40.565 - 40.566 - 40.567 - //All the environment data has been allocated with VMS__malloc, so just 40.568 - // free its internal big-chunk and all inside it disappear. 
40.569 -/* 40.570 - readyToAnimateQs = _VMSMasterEnv->readyToAnimateQs; 40.571 - masterVPs = _VMSMasterEnv->masterVPs; 40.572 - allAnimSlots = _VMSMasterEnv->allAnimSlots; 40.573 - 40.574 - for( coreIdx = 0; coreIdx < NUM_CORES; coreIdx++ ) 40.575 - { 40.576 - freeVMSQ( readyToAnimateQs[ coreIdx ] ); 40.577 - //master Slvs were created external to VMS, so use external free 40.578 - VMS_int__dissipate_slaveVP( masterVPs[ coreIdx ] ); 40.579 - 40.580 - freeAnimSlots( allAnimSlots[ coreIdx ] ); 40.581 - } 40.582 - 40.583 - VMS_int__free( _VMSMasterEnv->readyToAnimateQs ); 40.584 - VMS_int__free( _VMSMasterEnv->masterVPs ); 40.585 - VMS_int__free( _VMSMasterEnv->allAnimSlots ); 40.586 - 40.587 - //============================= MEASUREMENT STUFF ======================== 40.588 - #ifdef PROBES__TURN_ON_STATS_PROBES 40.589 - freeDynArrayDeep( _VMSMasterEnv->dynIntervalProbesInfo, &VMS_WL__free_probe); 40.590 - #endif 40.591 - //======================================================================== 40.592 -*/ 40.593 - //These are the only two that use system free 40.594 - VMS_ext__free_free_list( _VMSMasterEnv->freeLists ); 40.595 - free( (void *)_VMSMasterEnv ); 40.596 - } 40.597 - 40.598 - 40.599 -//================================ 40.600 - 40.601 -
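The commented-out `VMS_App__give_results_when_done_for` in the removed file describes an outside OS thread that waits on a condition variable, in a loop, until the last slave of a process has dissipated. As drafted there, the lock and condition are passed by value rather than by address, and the `PTHREAD_*_INITIALIZER` macros are assigned to fields of a malloc'd struct, although they are intended for use in an initializer. The following is only a minimal sketch of the same wait pattern with those two points adjusted. `DemoProcess` and every `demo_` function are hypothetical names chosen for illustration and are not part of VMS or PR.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the commented-out VMSProcess struct;
 * only the fields needed for the done-wait are kept. */
typedef struct
 { void            *resultToReturn;
   bool             executionIsComplete;
   pthread_mutex_t  doneLock;
   pthread_cond_t   doneCond;
 }
DemoProcess;

/* Dynamic init, because the struct may live in malloc'd memory. */
void
demo_process_init( DemoProcess *process )
 { assert( process != NULL );
   process->resultToReturn      = NULL;
   process->executionIsComplete = false;
   pthread_mutex_init( &process->doneLock, NULL );
   pthread_cond_init(  &process->doneCond, NULL );
 }

/* Assumed to be called from inside the runtime when the last slave
 * owned by the process dissipates. */
void
demo_process_mark_done( DemoProcess *process, void *result )
 { pthread_mutex_lock( &process->doneLock );
   process->resultToReturn      = result;
   process->executionIsComplete = true;
   pthread_cond_broadcast( &process->doneCond );
   pthread_mutex_unlock( &process->doneLock );
 }

/* Called from an OS thread outside the runtime. The while loop
 * re-checks the flag, so a spurious wakeup just waits again. */
void *
demo_process_wait_for_result( DemoProcess *process )
 { void *result;

   pthread_mutex_lock( &process->doneLock );
   while( !process->executionIsComplete )
    { pthread_cond_wait( &process->doneCond, &process->doneLock );
    }
   result = process->resultToReturn;
   pthread_mutex_unlock( &process->doneLock );
   return result;
 }

void
demo_process_destroy( DemoProcess *process )
 { pthread_cond_destroy(  &process->doneCond );
   pthread_mutex_destroy( &process->doneLock );
 }
```

Reading `resultToReturn` while still holding the lock is a design choice of the sketch: it keeps the flag and the result under the same mutex, so the waiter cannot observe one without the other.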
--- a/VMS_primitive_data_types.h Mon Sep 03 03:34:54 2012 -0700
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,42 +0,0 @@
-/*
- * Copyright 2009 OpenSourceStewardshipFoundation.org
- * Licensed under GNU General Public License version 2
- *
- * Author: seanhalle@yahoo.com
- *
-
- */
-
-#ifndef _PRIMITIVE_DATA_TYPES_H
-#define _PRIMITIVE_DATA_TYPES_H
-
-
-/*For portability, need primitive data types that have a well defined
- * size, and well-defined layout into bytes
- *To do this, provide standard aliases for all primitive data types
- *These aliases must be used in all functions instead of the ANSI types
- *
- *When VMS is used together with BLIS, these definitions will be replaced
- * inside each specialization module according to the compiler used in
- * that module and the hardware being specialized to.
- */
-typedef char bool8;
-typedef char int8;
-typedef char uint8;
-typedef short int16;
-typedef unsigned short uint16;
-typedef int int32;
-typedef unsigned int uint32;
-typedef unsigned int bool32;
-typedef long long int64;
-typedef unsigned long long uint64;
-typedef float float32;
-typedef double float64;
-//typedef double double float128; //GCC doesn't like this
-#define float128 double double
-
-#define TRUE 1
-#define FALSE 0
-
-#endif /* _PRIMITIVE_DATA_TYPES_H */
-
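The removed header aims for primitive types with a well-defined size, but it builds them on plain `char`, `short`, `int` and `long long`, whose widths and (for `char`) signedness depend on the compiler and target. One way to state the same intent, shown here purely as a sketch and not as what the replacement `PR_primitive_data_types.h` contains, is to alias the C99 `<stdint.h>` exact-width types and check the remaining assumptions at compile time. The `demo_` prefix is hypothetical and only avoids clashing with the real names.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Exact-width integer aliases; the demo_ names are illustrative only. */
typedef int8_t    demo_int8;
typedef uint8_t   demo_uint8;   /* unsigned even where plain char is signed */
typedef int16_t   demo_int16;
typedef uint16_t  demo_uint16;
typedef int32_t   demo_int32;
typedef uint32_t  demo_uint32;
typedef int64_t   demo_int64;
typedef uint64_t  demo_uint64;
typedef uint8_t   demo_bool8;
typedef uint32_t  demo_bool32;

/* The C standard does not fix the size of float and double, so the
 * 32/64-bit assumption is checked instead of taken for granted. */
typedef float     demo_float32;
typedef double    demo_float64;

_Static_assert( CHAR_BIT == 8,             "8-bit bytes assumed" );
_Static_assert( sizeof(demo_float32) == 4, "float assumed to be 32 bits" );
_Static_assert( sizeof(demo_float64) == 8, "double assumed to be 64 bits" );

/* Small helper so the aliases can be exercised at run time. */
static inline demo_uint32
demo_low_byte( demo_uint32 value )
 { return value & (demo_uint32)0xFF;
 }
```

No 128-bit float alias is given: `double double` is not a C type, and a portable equivalent is not assumed here.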
--- a/__README__Code_Overview.txt Mon Sep 03 03:34:54 2012 -0700
+++ b/__README__Code_Overview.txt Wed Sep 19 23:12:44 2012 -0700
@@ -1,21 +1,21 @@
 
-This file is intended to help those new to VMS to find their way around the code.
+This file is intended to help those new to PR to find their way around the code.
 
 Some observations:
--] VMS.h is the top header file, and is the root of a tree of #includes that pulls in all the other headers
+-] PR.h is the top header file, and is the root of a tree of #includes that pulls in all the other headers
 
 -] Defines directory contains all the header files that hold #define statements
 
--] VMS has several kinds of function, grouped according to what kind of code should call them: VMS_App_.. for applications to call, VMS_WL_.. for wrapper-library code to call, VMS_PI_.. for plugin code to call, and VMS_int_.. for VMS to use internally. Sometimes VMS_int_ functions are called from the wrapper library or plugin, but this should only be done by programmers who have gained an in-depth knowledge of VMS's implementation and understand that VMS_int_ functions are not protected for concurrent use..
+-] PR has several kinds of function, grouped according to what kind of code should call them: PR_App_.. for applications to call, PR_WL_.. for wrapper-library code to call, PR_PI_.. for plugin code to call, and PR_int_.. for PR to use internally. Sometimes PR_int_ functions are called from the wrapper library or plugin, but this should only be done by programmers who have gained an in-depth knowledge of PR's implementation and understand that PR_int_ functions are not protected for concurrent use..
 
--] VMS has its own version of malloc, unfortunately, which is due to the system malloc breaking when the stack-pointer register is manipulated, which VMS must do. The VMS form of malloc must be used in code that runs inside the VMS system, especially all application code that uses a VMS-based language. However, a complication is that the malloc implementation is not protected with a lock. However, mallocs performed in the main thread, outside the VMS-language program, cannot use VMS malloc.. this presents some issues crossing the boundary..
+-] PR has its own version of malloc, unfortunately, which is due to the system malloc breaking when the stack-pointer register is manipulated, which PR must do. The PR form of malloc must be used in code that runs inside the PR system, especially all application code that uses a PR-based language. However, a complication is that the malloc implementation is not protected with a lock. However, mallocs performed in the main thread, outside the PR-language program, cannot use PR malloc.. this presents some issues crossing the boundary..
 
--] Things in the code are turned on and off by using #define in combination with #ifdef. All defines for doing this are found in Defines/VMS_defs__turn_on_and_off.h. The rest of the files in Defines directory contain macro definitions, hardware constants, and any other #define statements.
+-] Things in the code are turned on and off by using #define in combination with #ifdef. All defines for doing this are found in Defines/PR_defs__turn_on_and_off.h. The rest of the files in Defines directory contain macro definitions, hardware constants, and any other #define statements.
 
--] VMS has many macros used in the code.. such as for measurements and debug.. all measurement, debug, and statistics gathering statements can be turned on or off by commenting-out or uncommenting the appropriate #define.
+-] PR has many macros used in the code.. such as for measurements and debug.. all measurement, debug, and statistics gathering statements can be turned on or off by commenting-out or uncommenting the appropriate #define.
 
--] The best way to learn VMS is to uncomment DEBUG__TURN_ON_SEQUENTIAL_MODE, which allows using a normal debugger while sequentially executing through both application code and VMS internals. Setting breakpoints at various spots in the code is a good way to see the VMS system in operation.
+-] The best way to learn PR is to uncomment DEBUG__TURN_ON_SEQUENTIAL_MODE, which allows using a normal debugger while sequentially executing through both application code and PR internals. Setting breakpoints at various spots in the code is a good way to see the PR system in operation.
 
--] VMS has several "VMS primitives" implemented with assembly code. The net effect of these assembly functions is to perform the switching between application code and the VMS system.
+-] PR has several "PR primitives" implemented with assembly code. The net effect of these assembly functions is to perform the switching between application code and the PR system.
 
--] The heart of this multi-core version of VMS is the AnimationMaster and CoreController. Those files have large comments explaining the nature of VMS and this implementation. Those comments are the best place to start reading, to get an understanding of the code before tracing through it.
\ No newline at end of file
+-] The heart of this multi-core version of PR is the AnimationMaster and CoreController. Those files have large comments explaining the nature of PR and this implementation. Those comments are the best place to start reading, to get an understanding of the code before tracing through it.
\ No newline at end of file
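The README says that measurement, debug and statistics statements are switched on and off by commenting or uncommenting a `#define` that is then tested with `#ifdef`. The sketch below shows that general pattern only. `DEMO__TURN_ON_COUNTING`, `DEMO__Count_Event` and `demo_run_loop` are hypothetical names and do not appear in the PR sources, whose real switches are said to live in `Defines/PR_defs__turn_on_and_off.h`.

```c
#include <assert.h>

/* Hypothetical switch: comment this line out to compile the
 * counting statement away entirely. */
#define DEMO__TURN_ON_COUNTING

#ifdef DEMO__TURN_ON_COUNTING
  #define DEMO__Count_Event( counter )  ((counter)++)
#else
  #define DEMO__Count_Event( counter )  ((void)0)
#endif

/* Runs a loop and returns how many events were counted. With the
 * switch on this equals numIterations; with it off it stays 0. */
int
demo_run_loop( int numIterations )
 { int numEvents = 0;
   int i;

   assert( numIterations >= 0 );
   for( i = 0; i < numIterations; i++ )
    { DEMO__Count_Event( numEvents );
    }
   return numEvents;
 }
```

Expanding the disabled form to `((void)0)` rather than to nothing keeps the macro usable wherever a statement is expected, for example as the sole body of an `if`.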
