view AnimationMaster.c @ 267:608833ae2c5d

Checkpoint -- about to clean up AnimationMaster, deleting a bunch of stuff
author Sean Halle <seanhalle@yahoo.com>
date Sun, 04 Nov 2012 18:39:28 -0800
parents a5fa1e087c7e
children e5bd470b562b
1 /*
2 * Copyright 2010 OpenSourceStewardshipFoundation
3 *
4 * Licensed under BSD
5 */
9 #include <stdio.h>
10 #include <stddef.h>
12 #include "PR.h"
13 #include "VSs_impl/VSs.h"
15 inline void
16 replaceWithNewSlotSlv( SlaveVP *slave );
19 /*The animationMaster embodies most of the animator of the language. The
20 * animator is what embodies the behavior of language constructs.
21 * As such, it is the animationMaster, in combination with the plugin
22 * functions, that make the language constructs do their behavior.
23 *
24 *Within the code, this is the top-level-function of the masterVPs, and
25 * runs when the coreController has no more slave VPs. Its job is to
26 * refill the animation slots with slaves that have work.
27 *
28 *There are multiple versions of the master, each tuned to a specific
29 * combination of modes. This keeps the master simple, with reduced overhead,
30 * when the application is not using the extra complexity.
31 *
32 *As of Sept 2012, the versions available will be:
33 * 1) Single language, which only exposes slaves (such as SSR or Vthread)
34 * 2) Single language, which only exposes tasks (such as pure dataflow)
35 * 3) Single language, which exposes both (like Cilk, StarSs, and OpenMP)
36 * 4) Multi-language, which always assumes both tasks and slaves
37 * 5) Multi-language and multi-process, which also assumes both tasks and slaves
38 *
39 *
40 *
41 */
44 //===================== The versions of the Animation Master =================
45 //
46 //==============================================================================
48 /* 1) This version is for an application that uses a single language, where
49 * that language exposes slaves explicitly and has no tasks (such as
50 * Vthread or SSR), as opposed to a task-based language like pure dataflow.
53 *
54 *
55 *It scans the animation slots for just-completed slaves.
56 * Each completed slave has a request in it. So, the master hands each to
57 * the plugin's request handler (there is only one plugin, because only one
58 * lang).
59 *Each request represents a language construct that has been encountered
60 * by the application code in the slave. Passing the request to the
61 * request handler is how that language construct's behavior gets invoked.
62 * The request handler then performs the actions of the construct's
63 * behavior. So, the request handler encodes the behavior of the
64 * language's parallelism constructs, and performs that when the master
65 * hands it a slave containing a request to perform that construct.
66 *
67 *On a shared-memory machine, the behavior of parallelism constructs
68 * amounts to control over the order of execution of code. Hence, the behavior
69 * of the language constructs performed by the request handler is to
70 * choose the order that slaves get animated, and thereby control the
71 * order that application code in the slaves executes.
72 *
73 *To control order of animation of slaves, the request handler has a
74 * semantic environment that holds data structures used to hold slaves
75 * and choose when they're ready to be animated.
76 *
77 *Once a slave is marked as ready to be animated by the request handler,
78 * it is the second plugin function, the Assigner, which chooses the core
79 * the slave gets assigned to for animation. Hence, the Assigner doesn't
80 * perform any of the semantic behavior of language constructs, rather
81 * it gives the language a chance to improve performance. The performance
82 * of application code is strongly related to communication between
83 * cores. On shared-memory machines, communication is caused during
84 * execution of code, by memory accesses, and how much depends on contents
85 * of caches connected to the core executing the code. So, the placement
86 * of slaves determines the communication caused during execution of the
87 * slave's code.
88 *The point of the Assigner, then, is to use application information during
89 * execution of the program, to make choices about slave placement onto
90 * cores, with the aim to put slaves close to caches containing the data
91 * used by the slave's code.
92 *
93 *==========================================================================
94 *In summary, the animationMaster scans the slots, finds just-finished
95 * slaves, which hold requests, and passes those to the request handler,
96 * along with the semantic environment, and the request handler then manages
97 * the structures in the semantic env, which controls the order of
98 * animation of slaves, and so embodies the behavior of the language
99 * constructs.
100 *The animationMaster then rescans the slots, offering each empty one to
101 * the Assigner, along with the semantic environment. The Assigner chooses
102 * among the ready slaves in the semantic Env, finding the one best suited
103 * to be animated by that slot's associated core.
104 *
105 *==========================================================================
106 *Implementation Details:
107 *
108 *There is a separate masterVP for each core, but a single semantic
109 * environment shared by all cores. Each core also has its own scheduling
110 * slots, which are used to communicate slaves between animationMaster and
111 * coreController. There is only one global variable, _PRTopEnv, which
112 * holds the semantic env and other things shared by the different
113 * masterVPs. The request handler and Assigner are registered with
114 * the animationMaster by the language's init function, and a pointer to
115 * each is in the _PRTopEnv. (There are also some pthread related global
116 * vars, but they're only used during init of PR).
117 *PR gains control over the cores by essentially "turning off" the OS's
118 * scheduler, using pthread pin-to-core commands.
119 *
120 *The masterVPs are created during init, with this animationMaster as their
121 * top level function. The masterVPs use the same SlaveVP data structure,
122 * even though they're not slave VPs.
123 *A "seed slave" is also created during init -- this is equivalent to the
124 * "main" function in C, and acts as the entry-point to the PR-language-
125 * based application.
126 *The masterVPs share a single system-wide master-lock, so only one
127 * masterVP may be animated at a time.
128 *The core controllers access _PRTopEnv to get the masterVP, and when
129 * they start, the slots are all empty, so they run their associated core's
130 * masterVP. The first of those to get the master lock sees the seed slave
131 * in the shared semantic environment, so when it runs the Assigner, that
132 * returns the seed slave, which the animationMaster puts into a scheduling
133 * slot then switches to the core controller. That then switches the core
134 * over to the seed slave, which then proceeds to execute language
135 * constructs to create more slaves, and so on. Each of those constructs
136 * causes the seed slave to suspend, switching over to the core controller,
137 * which eventually switches to the masterVP, which executes the
138 * request handler, which uses PR primitives to carry out the creation of
139 * new slave VPs, which are marked as ready for the Assigner, and so on..
140 *
141 *On animation slots, and system behavior:
142 * A request may linger in an animation slot for a long time while
143 * the slaves in the other slots are animated. This only becomes a problem
144 * when such a request is a choke-point in the constraints, and is needed
145 * to free work for *other* cores. To reduce this occurrence, the number
146 * of animation slots should be kept low. On the other hand, having multiple
147 * animation slots amortizes the overhead of switching to the masterVP and
148 * executing the animationMaster code, which argues for more than one. In
149 * practice, the best balance should be discovered by profiling.
150 */
151 void animationMaster( void *initData, SlaveVP *masterVP )
152 {
153 //Used while scanning and filling animation slots
154 int32 slotIdx, numSlotsFilled;
155 AnimSlot *currSlot, **animSlots;
156 SlaveVP *assignedSlaveVP; //the slave chosen by the assigner
158 //Local copies, for performance
159 MasterEnv *masterEnv;
160 SlaveAssigner slaveAssigner;
161 RequestHandler requestHandler;
162 void *semanticEnv;
163 int32 thisCoresIdx;
165 //======================== Initializations ========================
166 masterEnv = (MasterEnv*)_PRTopEnv;
168 thisCoresIdx = masterVP->coreAnimatedBy;
169 animSlots = masterEnv->allAnimSlots[thisCoresIdx];
171 requestHandler = masterEnv->requestHandler;
172 slaveAssigner = masterEnv->slaveAssigner;
173 semanticEnv = masterEnv->semanticEnv;
175 HOLISTIC__Insert_Master_Global_Vars;
177 //======================== animationMaster ========================
178 while(1){
180 MEAS__Capture_Pre_Master_Point
182 //Scan the animation slots
183 numSlotsFilled = 0;
184 for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
185 {
186 currSlot = animSlots[ slotIdx ];
188 //Check if newly-done slave in slot, which will need request handled
189 if( currSlot->workIsDone )
190 {
191 currSlot->workIsDone = FALSE;
192 currSlot->needsWorkAssigned = TRUE;
194 HOLISTIC__Record_AppResponder_start;
195 MEAS__startReqHdlr;
199 SlaveVP *currSlave = currSlot->slaveAssignedToSlot;
201 justAddedReqHdlrChg();
202 //handle the request, either by PR or by the language
203 if( currSlave->requests->reqType != LangReq )
204 { //The request is a standard PR one, not one defined by the
205 // language, so PR handles it, then queues slave to be assigned
206 handleReqInPR( currSlave );
207 writePrivQ( currSlave, PRReadyQ ); //Q slave to be assigned below
208 }
209 else
210 { //Language handles request, which is held inside slave struc
213 (*requestHandler)( currSlave, semanticEnv );
216 }
222 HOLISTIC__Record_AppResponder_end;
223 MEAS__endReqHdlr;
224 }
225 //If slot empty, hand to Assigner to fill with a slave
226 if( currSlot->needsWorkAssigned )
227 { //Call plugin's Assigner to give slot a new slave
228 HOLISTIC__Record_Assigner_start;
229 assignedSlaveVP =
230 (*slaveAssigner)( semanticEnv, currSlot );
232 //put the chosen slave into slot, and adjust flags and state
233 if( assignedSlaveVP != NULL )
234 { currSlot->slaveAssignedToSlot = assignedSlaveVP;
235 assignedSlaveVP->animSlotAssignedTo = currSlot;
236 currSlot->needsWorkAssigned = FALSE;
237 numSlotsFilled += 1;
239 HOLISTIC__Record_Assigner_end;
240 }
241 }
242 }
244 MEAS__Capture_Post_Master_Point;
246 masterSwitchToCoreCtlr( masterVP );
247 flushRegisters();
248 DEBUG__printf(FALSE,"came back after switch to core -- so lock released!");
249 }//while(1)
250 }
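/*A minimal sketch of one scan-then-assign pass as described for version 1,
 * assuming toy stand-in types. Every name starting with "Sketch" or "sketch"
 * is hypothetical and is not part of PR; the real code uses AnimSlot,
 * SlaveVP, and the plugin's request handler and Assigner. The toy request
 * handler simply makes the slave ready again, and the toy assigner is FIFO.
 */
```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_READY 16   /* hypothetical capacity of the toy ready queue */

typedef struct SketchSlave { int id; } SketchSlave;

typedef struct SketchSlot {
    int          workIsDone;         /* slave in slot has a request pending */
    int          needsWorkAssigned;  /* slot is empty and wants a slave */
    SketchSlave *slave;
} SketchSlot;

/* Toy semantic environment: a FIFO of slaves that are ready to be animated. */
typedef struct SketchEnv {
    SketchSlave *ready[SKETCH_MAX_READY];
    int head, count;
} SketchEnv;

/* Toy request handler: every request just makes the slave ready again. */
static void sketchHandleRequest(SketchSlave *slave, SketchEnv *env)
{
    assert(env->count < SKETCH_MAX_READY);
    env->ready[(env->head + env->count) % SKETCH_MAX_READY] = slave;
    env->count += 1;
}

/* Toy assigner: hands out ready slaves in FIFO order, NULL when none. */
static SketchSlave *sketchAssign(SketchEnv *env)
{
    SketchSlave *s;
    if (env->count == 0) return NULL;
    s = env->ready[env->head];
    env->head = (env->head + 1) % SKETCH_MAX_READY;
    env->count -= 1;
    return s;
}

/* One pass of the master over the slots; returns how many slots were filled. */
static int sketchMasterPass(SketchSlot *slots, int numSlots, SketchEnv *env)
{
    int i, numFilled = 0;
    for (i = 0; i < numSlots; i++) {
        SketchSlot *slot = &slots[i];
        if (slot->workIsDone) {            /* handle the finished slave's request */
            slot->workIsDone = 0;
            slot->needsWorkAssigned = 1;
            sketchHandleRequest(slot->slave, env);
        }
        if (slot->needsWorkAssigned) {     /* offer the empty slot to the assigner */
            SketchSlave *next = sketchAssign(env);
            if (next != NULL) {
                slot->slave = next;
                slot->needsWorkAssigned = 0;
                numFilled += 1;
            }
        }
    }
    return numFilled;
}
```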
253 /* 2) This version is for a single language that has only tasks, which
254 * cannot be suspended.
255 */
256 void animationMaster( void *initData, SlaveVP *masterVP )
257 {
258 //Used while scanning and filling animation slots
259 int32 slotIdx, numSlotsFilled;
260 AnimSlot *currSlot, **animSlots;
261 SlaveVP *assignedSlaveVP; //the slave chosen by the assigner
263 //Local copies, for performance
264 MasterEnv *masterEnv;
265 SlaveAssigner slaveAssigner;
266 RequestHandler requestHandler;
267 PRSemEnv *semanticEnv;
268 int32 thisCoresIdx;
270 //#ifdef MODE__MULTI_LANG
271 SlaveVP *slave;
272 PRProcess *process;
273 int32 langMagicNumber;
274 //#endif
276 //======================== Initializations ========================
277 masterEnv = (MasterEnv*)_PRTopEnv;
279 thisCoresIdx = masterVP->coreAnimatedBy;
280 animSlots = masterEnv->allAnimSlots[thisCoresIdx];
282 requestHandler = masterEnv->requestHandler;
283 slaveAssigner = masterEnv->slaveAssigner;
284 semanticEnv = masterEnv->semanticEnv;
286 //initialize, for non-multi-lang, non multi-proc case
287 // default handler gets put into master env by a registration call by lang
288 taskEndHandler = masterEnv->defaultTaskHandler;
290 HOLISTIC__Insert_Master_Global_Vars;
292 //======================== animationMaster ========================
293 //Do loop gets requests handled and work assigned to slots..
294 // work can either be a task or a resumed slave
295 //Having two cases makes this logic complex.. can be finishing either, and
296 // then the next available work may be either.. so really have two distinct
297 // loops that are inter-twined..
298 while(1){
300 MEAS__Capture_Pre_Master_Point
302 //Scan the animation slots
303 numSlotsFilled = 0;
304 for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
305 {
306 currSlot = animSlots[ slotIdx ];
308 //Check if newly-done slave in slot, which will need request handled
309 if( currSlot->workIsDone )
310 { currSlot->workIsDone = FALSE;
312 HOLISTIC__Record_AppResponder_start; //TODO: update to check which process for each slot
313 MEAS__startReqHdlr;
316 //process the request made by the slave (held inside slave struc)
317 slave = currSlot->slaveAssignedToSlot;
319 //check if the completed work was a task..
320 if( slave->metaTask->isATask )
321 {
322 if( slave->request->type == TaskEnd )
323 { //do task end handler, which is registered separately
324 //note, end hdlr may use semantic data from reqst..
325 //#ifdef MODE__MULTI_LANG
326 //get end-task handler
327 //taskEndHandler = lookup( slave->reqst->langMagicNumber, processEnv );
328 taskEndHandler = slave->metaTask->endTaskHandler;
329 //#endif
330 (*taskEndHandler)( slave, semanticEnv );
332 goto AssignWork;
333 }
334 else //is a task, and just suspended
335 { //turn slot slave into free task slave & make replacement
336 if( slave->typeOfVP == SlotTaskSlv ) changeSlvType();
338 //goto normal slave request handling
339 goto SlaveReqHandling;
340 }
341 }
342 else //is a slave that suspended
343 {
344 SlaveReqHandling:
345 (*requestHandler)( slave, semanticEnv ); //(note: indirect Fn call more efficient when use fewer params, instead re-fetch from slave)
347 HOLISTIC__Record_AppResponder_end;
348 MEAS__endReqHdlr;
350 goto AssignWork;
351 }
352 } //if has suspended slave that needs handling
354 //if slot empty, hand to Assigner to fill with a slave
355 if( currSlot->needsWorkAssigned )
356 { //Call plugin's Assigner to give slot a new slave
357 HOLISTIC__Record_Assigner_start;
359 AssignWork:
361 assignedSlaveVP = assignWork( semanticEnv, currSlot );
363 //put the chosen slave into slot, and adjust flags and state
364 if( assignedSlaveVP != NULL )
365 { currSlot->slaveAssignedToSlot = assignedSlaveVP;
366 assignedSlaveVP->animSlotAssignedTo = currSlot;
367 currSlot->needsWorkAssigned = FALSE;
368 numSlotsFilled += 1;
369 }
370 else
371 {
372 currSlot->needsWorkAssigned = TRUE; //local write
373 }
374 HOLISTIC__Record_Assigner_end;
375 }//if slot needs slave assigned
376 }//for( slotIdx..
378 MEAS__Capture_Post_Master_Point;
380 masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
381 flushRegisters();
382 }//while(1)
383 }
386 /*This is the master when just multi-lang, but not multi-process mode is on.
387 * This version has to handle both tasks and slaves, and do extra work of
388 * looking up the semantic env and handlers to use, for each completed bit of
389 * work.
390 *It also has to search through the semantic envs to find one with work,
391 * then ask that env's assigner to return a unit of that work.
392 *
393 *The language is written to startup in the same way as if it were the only
394 * language in the app, and it operates in the same way,
395 * the only difference between single language and multi-lang is here, in the
396 * master.
397 *This invisibility to mode is why the language has to use registration calls
398 * for everything during startup -- those calls do different things depending
399 * on whether it's single-language or multi-language mode.
400 *
401 *In this version of the master, work can either be a task or a resumed slave
402 *Having two cases makes this logic complex.. can be finishing either, and
403 * then the next available work may be either.. so really have two distinct
404 * loops that are inter-twined..
405 *
406 *Some special cases:
407 * A task-end is a special case for a few reasons (below).
408 * A task-end can't block a slave (can't cause it to "logically suspend")
409 * A task available for work can only be assigned to a special slave, which
410 * has been set aside for doing tasks, one such task-slave is always
411 * assigned to each slot. So, when a task ends, a new task is assigned to
412 * that slot's task-slave right away.
413 * But if no tasks are available, then have to switch over to looking at
414 * slaves to find one ready to resume, to find work for the slot.
415 * If a task just suspends, not ends, then its task-slave is no longer
416 * available to take new tasks, so a new task-slave has to be assigned to
417 * that slot. Then the slave of the suspended task is turned into a free
418 * task-slave and request handling is done on it as if it were a slave
419 * that suspended.
420 * After request handling, do the same sequence of looking for a task to be
421 * work, and if none, look for a slave ready to resume, as work for the slot.
422 * If a slave suspends, handle its request, then look for work.. first for a
423 * task to assign, and if none, slaves ready to resume.
424 * Another special case is when task-end is done on a free task-slave.. in
425 * that case, the slave has no more work and no way to get more.. so place
426 * it into a recycle queue.
427 * If no work is found of either type, then do a special thing to prune down
428 * the extra slaves in the recycle queue, just so don't get too many..
429 *
430 *The multi-lang thing complicates matters..
431 *
432 *For request handling, it means have to first fetch the semantic environment
433 * of the language, and then do the request handler pointed to by that
434 * semantic env.
435 *For assigning, things get more complex because of competing goals.. One
436 * goal is for language specific stuff to be used during assignment, so
437 * assigner can make higher quality decisions.. but with multiple languages,
438 * which only get mixed in the application, the assigners can't be written
439 * with knowledge of each other. So, they can only make localized decisions,
440 * and so different language's assigners may interfere with each other..
441 *
442 *So, have some possibilities available:
443 *1) can have a fixed scheduler in the proto-runtime, that all the
444 * languages give their work to.. (but then lose language-specific info,
445 * there is a standard PR format for assignment info, and the language
446 * attaches this to the work-unit when it gives it to PR.. also have issue
447 * with HWSim, which uses a priority Q instead of FIFO, and requests can
448 * "undo" previous work put in, so request handlers need way to manipulate
449 * the work-holding Q..) (this might be fudgeable with
450 * HWSim, if the master did a lang-supplied callback each time it assigns a
451 * unit to a slot.. then HWSim can keep exactly one unit of work in PR's
452 * queue at a time.. but this is quite hack-like.. or perhaps HWSim supplies
453 * a task-end handler that kicks the next unit of work from HWSim internal
454 * priority queue, over to PR readyQ)
455 *2) can have each language have its own semantic env, that holds its own
456 * work, which is assigned by its own assigner.. then the master searches
457 * through all the semantic envs to find one with work and asks it give work..
458 * (this has downside of blinding assigners to each other.. but does work
459 * for HWSim case)
460 *3) could make PR have a different readyQ for each core, and ask the lang
461 * to put work to the core it prefers.. but the work may be moved by PR if
462 * needed, say if one core idles for too long. This is a hybrid approach,
463 * letting the language decide which core, but PR keeps the work and does it
464 * FIFO style.. (this might also be fudgeable with HWSim, in similar fashion,
465 * but it would be complicated by having to track cores separately)
466 *
467 *Choosing 2, to keep compatibility with single-lang mode.. it allows the same
468 * assigner to be used for single-lang as for multi-lang.. the overhead of
469 * the extra master search for work is part of the price of the flexibility,
470 * but should be fairly small.. takes the first env that has work available,
471 * and whatever it returns is assigned to the slot..
472 *
473 *As a hybrid, giving an option for a unified override assigner to be registered
474 * and used.. This allows something like a static analysis to detect
475 * which languages are grouped together, and then analyze the pattern of
476 * construct calls, and generate a custom assigner that uses info from all
477 * the languages in a unified way.. Don't really expect this to happen,
478 * but making it possible.
479 */
480 #ifdef MODE__MULTI_LANG
481 void animationMaster( void *initData, SlaveVP *masterVP )
482 {
483 //Used while scanning and filling animation slots
484 int32 slotIdx, numSlotsFilled;
485 AnimSlot *currSlot, **animSlots;
486 SlaveVP *assignedSlaveVP; //the slave chosen by the assigner
488 //Local copies, for performance
489 MasterEnv *masterEnv;
490 SlaveAssigner slaveAssigner;
491 RequestHandler requestHandler;
492 PRSemEnv *semanticEnv;
493 int32 thisCoresIdx;
495 //#ifdef MODE__MULTI_LANG
496 SlaveVP *slave;
497 PRProcess *process;
498 int32 langMagicNumber;
499 //#endif
501 //======================== Initializations ========================
502 masterEnv = (MasterEnv*)_PRTopEnv;
504 thisCoresIdx = masterVP->coreAnimatedBy;
505 animSlots = masterEnv->allAnimSlots[thisCoresIdx];
507 requestHandler = masterEnv->requestHandler;
508 slaveAssigner = masterEnv->slaveAssigner;
509 semanticEnv = masterEnv->semanticEnv;
511 //initialize, for non-multi-lang, non multi-proc case
512 // default handler gets put into master env by a registration call by lang
513 taskEndHandler = masterEnv->defaultTaskHandler;
515 HOLISTIC__Insert_Master_Global_Vars;
517 //======================== animationMaster ========================
518 //Do loop gets requests handled and work assigned to slots..
519 // work can either be a task or a resumed slave
520 //Having two cases makes this logic complex.. can be finishing either, and
521 // then the next available work may be either.. so really have two distinct
522 // loops that are inter-twined..
523 while(1){
525 MEAS__Capture_Pre_Master_Point
527 //Scan the animation slots
528 numSlotsFilled = 0;
529 for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
530 {
531 currSlot = animSlots[ slotIdx ];
533 //Check if newly-done slave in slot, which will need request handled
534 if( currSlot->workIsDone )
535 { currSlot->workIsDone = FALSE;
537 HOLISTIC__Record_AppResponder_start; //TODO: update to check which process for each slot
538 MEAS__startReqHdlr;
541 //process the request made by the slave (held inside slave struc)
542 slave = currSlot->slaveAssignedToSlot;
544 //check if the completed work was a task..
545 if( slave->taskMetaInfo->isATask )
546 {
547 if( slave->reqst->type == TaskEnd )
548 { //do task end handler, which is registered separately
549 //note, end hdlr may use semantic data from reqst..
550 //#ifdef MODE__MULTI_LANG
551 //get end-task handler
552 //taskEndHandler = lookup( slave->reqst->langMagicNumber, processEnv );
553 taskEndHandler = slave->taskMetaInfo->endTaskHandler;
554 //#endif
555 (*taskEndHandler)( slave, semanticEnv );
557 goto AssignWork;
558 }
559 else //is a task, and just suspended
560 { //turn slot slave into free task slave & make replacement
561 if( slave->typeOfVP == SlotTaskSlv ) changeSlvType();
563 //goto normal slave request handling
564 goto SlaveReqHandling;
565 }
566 }
567 else //is a slave that suspended
568 {
569 SlaveReqHandling:
570 (*requestHandler)( slave, semanticEnv ); //(note: indirect Fn call more efficient when use fewer params, instead re-fetch from slave)
572 HOLISTIC__Record_AppResponder_end;
573 MEAS__endReqHdlr;
575 goto AssignWork;
576 }
577 } //if has suspended slave that needs handling
579 //if slot empty, hand to Assigner to fill with a slave
580 if( currSlot->needsWorkAssigned )
581 { //Call plugin's Assigner to give slot a new slave
582 HOLISTIC__Record_Assigner_start;
584 AssignWork:
586 assignedSlaveVP = assignWork( semanticEnv, currSlot );
588 //put the chosen slave into slot, and adjust flags and state
589 if( assignedSlaveVP != NULL )
590 { currSlot->slaveAssignedToSlot = assignedSlaveVP;
591 assignedSlaveVP->animSlotAssignedTo = currSlot;
592 currSlot->needsWorkAssigned = FALSE;
593 numSlotsFilled += 1;
594 }
595 else
596 {
597 currSlot->needsWorkAssigned = TRUE; //local write
598 }
599 HOLISTIC__Record_Assigner_end;
600 }//if slot needs slave assigned
601 }//for( slotIdx..
603 MEAS__Capture_Post_Master_Point;
605 masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
606 flushRegisters();
607 }//while(1)
608 }
609 #endif //MODE__MULTI_LANG
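/*A minimal sketch of option 2 from the comment above: each language keeps
 * its own environment and assigner, and the master takes work from the first
 * environment that has any. All names here (SketchLangEnv, sketchAssignWork,
 * and so on) are hypothetical illustrations, not PR's actual assignWork or
 * PRSemEnv; the ready-work array and counter are assumptions.
 */
```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchWork { int id; } SketchWork;

typedef struct SketchLangEnv SketchLangEnv;
typedef SketchWork *(*SketchAssigner)(SketchLangEnv *env, int slotIdx);

struct SketchLangEnv {
    int            numReady;   /* units of ready work held by this language */
    SketchWork    *readyWork;  /* array holding the ready units */
    SketchAssigner assigner;   /* the language's own assigner */
};

/* Example per-language assigner: takes the most recently added unit. */
static SketchWork *sketchLifoAssigner(SketchLangEnv *env, int slotIdx)
{
    (void)slotIdx;             /* a real assigner could use the slot's core */
    if (env->numReady == 0) return NULL;
    env->numReady -= 1;
    return &env->readyWork[env->numReady];
}

/* Search the environments in order; ask the first one with work to assign. */
static SketchWork *sketchAssignWork(SketchLangEnv *envs, int numEnvs, int slotIdx)
{
    int i;
    assert(envs != NULL || numEnvs == 0);
    for (i = 0; i < numEnvs; i++) {
        if (envs[i].numReady > 0)
            return (*envs[i].assigner)(&envs[i], slotIdx);
    }
    return NULL;               /* no language has work for this slot */
}
```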
613 //This is the master when both multi-lang and multi-process modes are turned on
614 #ifdef MODE__MULTI_LANG
615 #ifdef MODE__MULTI_PROCESS
616 void animationMaster( void *initData, SlaveVP *masterVP )
617 {
618 int32 slotIdx;
619 AnimSlot *currSlot;
620 //Used while scanning and filling animation slots
621 AnimSlot **animSlots;
623 //Local copies, for performance
624 MasterEnv *masterEnv;
625 int32 thisCoresIdx;
627 //======================== Initializations ========================
628 masterEnv = (MasterEnv*)_PRTopEnv;
630 thisCoresIdx = masterVP->coreAnimatedBy;
631 animSlots = masterEnv->allAnimSlots[thisCoresIdx];
633 HOLISTIC__Insert_Master_Global_Vars;
635 //======================== animationMaster ========================
636 //Do loop gets requests handled and work assigned to slots..
637 // work can either be a task or a resumed slave
638 //Having two cases makes this logic complex.. can be finishing either, and
639 // then the next available work may be either.. so really have two distinct
640 // loops that are inter-twined..
641 while(1)
642 {
643 MEAS__Capture_Pre_Master_Point
645 for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
646 {
647 currSlot = animSlots[ slotIdx ];
649 masterFunction_multiLang( currSlot );
650 }
652 MEAS__Capture_Post_Master_Point;
654 masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
655 flushRegisters();
656 }
657 }
658 #endif //MODE__MULTI_PROCESS
659 #endif //MODE__MULTI_LANG
662 //This version of the master selects one of three loops, depending upon
663 // whether stand-alone single language (just slaves), or standalone with
664 // tasks, or multi-lang (implies multi-process)
665 void animationMaster( void *initData, SlaveVP *masterVP )
666 {
667 int32 slotIdx;
668 AnimSlot *currSlot;
669 //Used while scanning and filling animation slots
670 AnimSlot **animSlots;
672 //Local copies, for performance
673 MasterEnv *masterEnv;
674 int32 thisCoresIdx;
676 //======================== Initializations ========================
677 masterEnv = (MasterEnv*)_PRTopEnv;
679 thisCoresIdx = masterVP->coreAnimatedBy;
680 animSlots = masterEnv->allAnimSlots[thisCoresIdx];
682 HOLISTIC__Insert_Master_Global_Vars;
684 //======================== animationMaster ========================
685 //Have three different modes, and the master behavior is different for
686 // each, so jump to the loop that corresponds to the mode.
687 //
688 switch(mode)
689 { case StandaloneSlavesOnly:
690 while(1)
691 { MEAS__Capture_Pre_Master_Point
692 for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
693 {
694 currSlot = animSlots[ slotIdx ];
696 masterFunction_StandaloneSlavesOnly( currSlot );
697 }
698 MEAS__Capture_Post_Master_Point;
699 masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
700 flushRegisters();
701 }
702 case StandaloneWTasks:
703 while(1)
704 { MEAS__Capture_Pre_Master_Point
705 for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
706 {
707 currSlot = animSlots[ slotIdx ];
709 masterFunction_StandaloneWTasks( currSlot );
710 }
711 MEAS__Capture_Post_Master_Point;
712 masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
713 flushRegisters();
714 }
715 case MultiLang:
716 while(1)
717 { MEAS__Capture_Pre_Master_Point
718 for( slotIdx = 0; slotIdx < NUM_ANIM_SLOTS; slotIdx++)
719 {
720 currSlot = animSlots[ slotIdx ];
722 masterFunction_multiLang( currSlot );
723 }
724 MEAS__Capture_Post_Master_Point;
725 masterSwitchToCoreCtlr( masterVP ); //returns when ctlr switches back to master
726 flushRegisters();
727 }
728 }
729 }
732 inline
733 void
734 masterFunction_multiLang( AnimSlot *currSlot )
735 { //Handles one animation slot: request handling, then work assignment
736 int32 magicNumber;
737 SlaveVP *slave;
738 SlaveVP *assignedSlaveVP;
739 PRSemEnv *semanticEnv;
740 PRReqst *req;
741 RequestHandler requestHandler;
743 //Check if newly-done slave in slot, which will need request handled
744 if( currSlot->workIsDone )
745 { currSlot->workIsDone = FALSE;
746 currSlot->needsWorkAssigned = TRUE;
748 HOLISTIC__Record_AppResponder_start; //TODO: update to check which process for each slot
749 MEAS__startReqHdlr;
752 //process the request made by the slave (held inside slave struc)
753 slave = currSlot->slaveAssignedToSlot;
754 req = slave->request;
756 //If the requesting slave is a slot slave, and request is not
757 // task-end, then turn it into a free task slave.
758 if( slave->typeOfVP == SlotTaskSlv && req->reqType != TaskEnd )
759 replaceWithNewSlotSlv( slave );
761 //Handle task create and end first -- they're special cases..
762 switch( req->reqType )
763 { case TaskEnd:
764 { //do PR handler, which calls lang's hdlr and does recycle of
765 // free task slave if needed -- PR handler checks for free task Slv
766 PRHandle_EndTask( slave ); break;
767 }
768 case TaskCreate:
769 { //Do PR's create-task handler, which calls the lang's hdlr
770 // PR handler checks for free task Slv
771 PRHandle_CreateTask( slave ); break;
772 }
773 case SlvCreate: PRHandle_CreateSlave( slave ); break;
774 case SlvDissipate: PRHandle_Dissipate( slave ); break;
775 case Service: PR_int__handle_PRServiceReq( slave ); break; //resume into PR's own semantic env
776 case Hardware: //for future expansion
777 case IO: //for future expansion
778 case OSCall: //for future expansion
779 PR_int__throw_exception("Not implemented"); break;
780 case Language: //normal sem request
781 magicNumber = req->langMagicNumber;
782 semanticEnv = PR_PI__give_sem_env_for( slave, magicNumber );
783 requestHandler = semanticEnv->requestHdlr;
784 (*requestHandler)( req->semReq, slave, semanticEnv );
785 }
787 HOLISTIC__Record_AppResponder_end;
788 MEAS__endReqHdlr;
789 } //if have request to be handled
791 if( currSlot->needsWorkAssigned )
792 {
793 HOLISTIC__Record_Assigner_start;
795 //Scan sem environs, looking for semEnv with ready work.
796 // call the Assigner for that sem Env, to get a slave for the slot
797 assignedSlaveVP = assignWork( semanticEnv, currSlot );
799 //if work found, put into slot, and adjust flags and state
800 if( assignedSlaveVP != NULL )
801 { currSlot->slaveAssignedToSlot = assignedSlaveVP;
802 assignedSlaveVP->animSlotAssignedTo = currSlot;
803 currSlot->needsWorkAssigned = FALSE;
804 }
805 HOLISTIC__Record_Assigner_end;
806 }//if slot needs slave assigned
807 }
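/*A minimal sketch of the "Language" case above: the request carries a magic
 * number, which selects the semantic environment, which in turn supplies the
 * request handler. The linear table lookup and all "Sketch" names are
 * assumptions made for illustration; how PR_PI__give_sem_env_for actually
 * finds the environment is not shown in this file.
 */
```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchReq { int langMagicNumber; int payload; } SketchReq;

typedef struct SketchSemEnv SketchSemEnv;
typedef void (*SketchReqHandler)(SketchReq *req, SketchSemEnv *env);

struct SketchSemEnv {
    int              magicNumber;   /* identifies the owning language */
    SketchReqHandler requestHdlr;   /* that language's request handler */
    int              handledTotal;  /* toy state updated by the handler */
};

/* Toy handler: accumulates the payload of each request it sees. */
static void sketchSumHandler(SketchReq *req, SketchSemEnv *env)
{
    env->handledTotal += req->payload;
}

/* Find the environment registered under the given magic number. */
static SketchSemEnv *sketchFindSemEnv(SketchSemEnv *envs, int numEnvs, int magic)
{
    int i;
    for (i = 0; i < numEnvs; i++)
        if (envs[i].magicNumber == magic) return &envs[i];
    return NULL;
}

/* Route a request to its language's handler; returns 1 if it was handled. */
static int sketchDispatch(SketchReq *req, SketchSemEnv *envs, int numEnvs)
{
    SketchSemEnv *env;
    assert(req != NULL);
    env = sketchFindSemEnv(envs, numEnvs, req->langMagicNumber);
    if (env == NULL) return 0;      /* no language registered for this number */
    (*env->requestHdlr)(req, env);
    return 1;
}
```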
809 //==========================================================================
810 /*When a task in a slot slave suspends, the slot slave has to be changed to
811 * a free task slave, then the slot slave replaced. The replacement can be
812 * either a recycled free task slave that finished its task and has been
813 * idle in the recycle queue, or else create a new slave to be the slot slave.
814 *The master only calls this with a slot slave that needs to be replaced.
815 */
816 inline void
817 replaceWithNewSlotSlv( SlaveVP *requestingSlv, PRProcess *process )
818 { SlaveVP *newSlotSlv;
820 //TODO: when a slot slave is converted to a free task slave, insert the
821 // process pointer -- slot slaves are not assigned to any process.
822 // Also check what to do about num (live slaves + live tasks) inside VSs's task stub, and properly update the process's count of live free task slaves.
823 //get a new slave to be the slot slave
824 newSlotSlv = readPrivQ( process->freeTaskSlvRecycleQ );
825 if( newSlotSlv == NULL )
826 { newSlotSlv = PR_int__create_slaveVP( &idle_fn, NULL, process, 0);
827 //just made a new free task slave, so count it
828 process->numLiveFreeTaskSlvs += 1;
829 }
831 //set slave values to make it the slot slave
832 newSlotSlv->metaTask = NULL;
833 newSlotSlv->typeOfVP = SlotTaskSlv;
834 // newSlotSlv->needsTaskAssigned = TRUE;
836 //a slot slave is pinned to a particular slot on a particular core
837 //Note, this happens before the request is seen by handler, so nothing
838 // has had a chance to change the coreAnimatedBy or anything else..
839 newSlotSlv->animSlotAssignedTo = requestingSlv->animSlotAssignedTo;
840 newSlotSlv->coreAnimatedBy = requestingSlv->coreAnimatedBy;
842 //put it into the slot slave matrix
843 int32 slotNum = requestingSlv->animSlotAssignedTo->slotIdx;
844 int32 coreNum = requestingSlv->coreAnimatedBy;
845 process->slotTaskSlvs[coreNum][slotNum] = newSlotSlv;
847 //Fix up requester, to be an extra slave now (but not an ended one)
848 // because it's active, doesn't go into freeTaskSlvRecycleQ
849 requestingSlv->typeOfVP = FreeTaskSlv;
850 requestingSlv->metaTask->taskType = FreeTask;
851 }
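/*A minimal, self-contained sketch of the "recycle if possible, else create"
 * pattern that replaceWithNewSlotSlv follows. The names DemoSlave, DemoPool
 * and the demo_* functions are hypothetical stand-ins, not PR types or PR
 * calls, and a plain linked list stands in for the recycle queue.
 */
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for SlaveVP and the per-process bookkeeping. */
typedef struct DemoSlave { int id; struct DemoSlave *next; } DemoSlave;
typedef struct { DemoSlave *recycled; int numLive; int nextID; } DemoPool;

/* Put an idle slave onto the recycle list. */
static void demo_recycle( DemoPool *pool, DemoSlave *slv )
 { assert( pool != NULL && slv != NULL );
   slv->next = pool->recycled;
   pool->recycled = slv;
 }

/* Take an idle slave off the recycle list, or return NULL if it is empty. */
static DemoSlave *demo_pop_recycled( DemoPool *pool )
 { DemoSlave *slv;
   assert( pool != NULL );
   slv = pool->recycled;
   if( slv != NULL )
    { pool->recycled = slv->next;
      slv->next = NULL;
    }
   return slv;
 }

/* Prefer a recycled slave; only create (and count) a new one when the
 * recycle list is empty. */
static DemoSlave *demo_get_slot_slave( DemoPool *pool )
 { DemoSlave *slv = demo_pop_recycled( pool );
   if( slv == NULL )
    { slv = malloc( sizeof *slv );
      if( slv == NULL ) return NULL;      /* allocation failed */
      slv->id = pool->nextID++;
      slv->next = NULL;
      pool->numLive += 1;                 /* just made a new one, so count it */
    }
   return slv;
 }
```
/*In the sketch, the live count only grows on the create path, which mirrors
 * how numLiveFreeTaskSlvs is incremented above only when the recycle queue
 * turns out to be empty.
 */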
855 /*This does:
856 * 1) searches the semantic environments for one with work ready
857 * if finds one, asks its assigner to return work
858 * 2) checks what kind of work: new task, resuming task, resuming slave
859 * if new task, gets the slot slave and assigns task to it and returns slave
860 * else, gets the slave attached to the metaTask and returns that.
861 * 3) if no work found, then prune former task slaves waiting to be recycled.
862 * If no work and no slaves to prune, check for shutdown conditions.
863 *
864 * Semantic env keeps its own work in its own structures, and has its own
865 * assigner. It chooses
866 * However, include a switch that switches-in an override assigner, which
867 * sees all the work in all the semantic env's. This is most likely
868 * generated by static tools and included in the executable. That means it
869 * has to be called via a registered pointer from here. The idea is that
870 * the static tools know which languages are grouped together.. and the
871 * override enables them to generate a custom assigner that uses info from
872 * all the languages in a unified way.. Don't really expect this to happen,
873 * but am making it possible.
874 */
875 inline SlaveVP *
876 assignWork( PRProcess *process, AnimSlot *slot )
877 { SlaveVP *returnSlv;
878 int32 coreNum, slotNum;
879 PRMetaTask *assignedMetaTask;
881 coreNum = slot->coreSlotIsOn;
882 slotNum = slot->slotIdx; //set here because the no-work path also reads it
883 if( process->overrideAssigner != NULL )
884 { assignedMetaTask = (*process->overrideAssigner)( process, slot );
885 if( assignedMetaTask != NULL )
886 {
887 //have work, so reset Done flag (caused by work generated on other core)
888 // if( process->coreIsDone[coreNum] == TRUE ) //reads are higher perf
889 // process->coreIsDone[coreNum] = FALSE; //don't just write always
891 // switch( assignedMetaTask->taskType )
892 // { case GenericSlave: goto AssignSlave;
893 // case FreeTask: goto AssignSlave;
894 // case SlotTask: goto AssignNewTask;
895 // default: PR_int__throw_exception( "unknown task type ret by assigner" );
896 // }
897 //If meta task has a slave attached, then goto assign slave,
898 // else it's a new task, so goto where assign it to a slot slave
899 if( assignedMetaTask->slaveAssignedTo != NULL )
900 goto AssignSlave;
901 else
902 goto AssignNewTask;
903 }
904 else //metaTask is NULL, so no work..
905 goto NoWork;
906 }
908 //If here, then no override assigner, so search semantic envs for work
909 int32 envIdx, numEnvs; PRSemEnv **semEnvs, *semEnv; SlaveAssigner assigner;
910 semEnvs = process->semEnvs;
911 numEnvs = process->numSemEnvs;
912 for( envIdx = 0; envIdx < numEnvs; envIdx++ ) //keep semEnvs in hash & array
913 { semEnv = semEnvs[envIdx];
914 if( semEnv->hasWork )
915 { assigner = semEnv->slaveAssigner;
916 assignedMetaTask = (*assigner)( semEnv, slot );
917 if( assignedMetaTask == NULL ) continue; //assigner gave no work, try next env
918 //have work, so reset Done flag (caused by work generated on other core)
919 // if( process->coreIsDone[coreNum] == TRUE ) //reads are higher perf
920 // process->coreIsDone[coreNum] = FALSE; //don't just write always
922 // switch( assignedMetaTask->taskType )
923 // { case GenericSlave: goto AssignSlave;
924 // case FreeTask: goto AssignSlave;
925 // case SlotTask: goto AssignNewTask;
926 // default: PR_int__throw_exception( "unknown task type ret by assigner" );
927 // }
928 //If meta task has a slave attached, then goto assign slave,
929 // else it's a new task, so goto where assign it to a slot slave
930 if( assignedMetaTask->slaveAssignedTo != NULL )
931 goto AssignSlave;
932 else
933 goto AssignNewTask;
934 }
935 }
936 //If reach here, then have searched all semEnv's & none have work..
938 NoWork:
939 //No work, if reach here.. so return NULL rather than an unset pointer
940 { returnSlv = NULL; goto ReturnTheSlv;
941 }
943 AssignSlave: //Have a metaTask attached to a slave, so get the slave & ret it
944 { returnSlv = assignedMetaTask->slaveAssignedTo;
945 returnSlv->coreAnimatedBy = coreNum;
947 goto ReturnTheSlv;
948 }
950 AssignNewTask: //Have a new metaTask that has no slave yet.. assign to slot slv
951 {
952 //get the slot slave to assign the task to..
953 slotNum = slot->slotIdx;
954 returnSlv = process->slotTaskSlvs[coreNum][slotNum];
956 //point slave to task's function
957 PR_int__reset_slaveVP_to_TopLvlFn( returnSlv,
958 assignedMetaTask->topLevelFn, assignedMetaTask->initData );
959 returnSlv->metaTask = assignedMetaTask;
960 assignedMetaTask->slaveAssignedTo = returnSlv;
961 // returnSlv->needsTaskAssigned = FALSE; //slot slave is a "Task" slave type
963 //have work, so reset Done flag, if was set
964 // if( process->coreIsDone[coreNum] == TRUE ) //reads are higher perf
965 // process->coreIsDone[coreNum] = FALSE; //don't just write always
967 goto ReturnTheSlv;
968 }
971 ReturnTheSlv: //All paths goto here.. to provide single point for holistic..
973 #ifdef HOLISTIC__TURN_ON_OBSERVE_UCC
974 if( returnSlv == NULL )
975 { returnSlv = process->idleSlv[coreNum][slotNum];
977 //things that would normally happen in resume(), but idle VPs
978 // never go there
979 returnSlv->numTimesAssignedToASlot++; //gives each idle unit a unique ID
980 Unit newU;
981 newU.vp = returnSlv->slaveID;
982 newU.task = returnSlv->numTimesAssignedToASlot;
983 addToListOfArrays(Unit,newU,process->unitList);
985 if (returnSlv->numTimesAssignedToASlot > 1) //make a dependency from prev idle unit
986 { Dependency newD; // to this one
987 newD.from_vp = returnSlv->slaveID;
988 newD.from_task = returnSlv->numTimesAssignedToASlot - 1;
989 newD.to_vp = returnSlv->slaveID;
990 newD.to_task = returnSlv->numTimesAssignedToASlot;
991 addToListOfArrays(Dependency, newD ,process->ctlDependenciesList);
992 }
993 }
994 else //have a slave will be assigned to the slot
995 { //assignSlv->numTimesAssigned++;
996 //get previous occupant of the slot
997 Unit prev_in_slot =
998 process->last_in_slot[coreNum * NUM_ANIM_SLOTS + slotNum];
999 if(prev_in_slot.vp != 0) //if not first slave in slot, make dependency
1000 { Dependency newD; // is a hardware dependency
1001 newD.from_vp = prev_in_slot.vp;
1002 newD.from_task = prev_in_slot.task;
1003 newD.to_vp = returnSlv->slaveID;
1004 newD.to_task = returnSlv->numTimesAssignedToASlot;
1005 addToListOfArrays(Dependency,newD,process->hwArcs);
1006 }
1007 prev_in_slot.vp = returnSlv->slaveID; //make new slave the new previous
1008 prev_in_slot.task = returnSlv->numTimesAssignedToASlot;
1009 process->last_in_slot[coreNum * NUM_ANIM_SLOTS + slotNum] =
1010 prev_in_slot;
1011 }
1012 #endif
1014 return( returnSlv );
1015 }
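/*A minimal sketch of the search described in the comment above assignWork:
 * scan the environments, ask the first one that claims work for a task, then
 * classify the result as "resume the attached slave" or "new task for a slot
 * slave". DemoTask, DemoEnv, DemoKind and the demo_* functions are
 * hypothetical stand-ins, not the PR types. The sketch also assumes an
 * assigner may return NULL even when hasWork is set, and skips that env.
 */
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PRMetaTask and PRSemEnv. */
typedef struct DemoTask { int id; void *slaveAttached; } DemoTask;
typedef struct DemoEnv DemoEnv;
typedef DemoTask *(*DemoAssigner)( DemoEnv *env );
struct DemoEnv { int hasWork; DemoAssigner assigner; DemoTask *ready; };

typedef enum { DEMO_NO_WORK, DEMO_RESUME_SLAVE, DEMO_NEW_TASK } DemoKind;

/* Example assigner: hands out the single ready task, if any. */
static DemoTask *demo_take_ready( DemoEnv *env )
 { DemoTask *task = env->ready;
   env->ready = NULL;
   env->hasWork = 0;
   return task;
 }

/* Scan envs in order; stop at the first one that actually yields a task. */
static DemoKind demo_find_work( DemoEnv **envs, int numEnvs, DemoTask **found )
 { int envIdx;
   assert( found != NULL );
   *found = NULL;
   for( envIdx = 0; envIdx < numEnvs; envIdx++ )
    { DemoEnv *env = envs[envIdx];
      if( !env->hasWork ) continue;
      DemoTask *task = (*env->assigner)( env );
      if( task == NULL ) continue;        /* claimed work but gave none */
      *found = task;
      /* a task with a slave attached is resumed; otherwise it is new */
      return task->slaveAttached != NULL ? DEMO_RESUME_SLAVE : DEMO_NEW_TASK;
    }
   return DEMO_NO_WORK;
 }
```
/*Returning a kind plus an out-pointer is one way to get a single exit point
 * without the goto labels; whether that is preferable here is a design
 * choice, not something the original code prescribes.
 */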
1018 /*In creator, only PR related things happen, and things in the langlet whose
1019 * creator construct was used.
1020 *Other langlet still gets a chance to create semData -- but by registering a
1021 * "createSemData" handler in the semEnv. When a construct of the langlet
1022 * calls "PR__give_sem_data()", if there is no semData for that langlet,
1023 * the PR will call the creator in the langlet's semEnv, place whatever it
1024 * makes as the semData in that slave for that langlet, and return that semData
1026 *So, as far as counting things, a langlet is only allowed to count creation
1027 * of slaves it creates itself.. may have to change this later.. add a way for
1028 * langlet to register a trigger Fn called each time a slave gets created..
1029 * need more experience with what langlets will do at create time.. think Cilk
1030 * has interesting create behavior.. not sure how that will differ in light
1031 * of true tasks and langlet approach. Look at it after all done and start
1032 * modifying the langs to be langlets..
1034 *PR itself needs to create the slave, then update numLiveSlaves in process,
1035 * copy processID from requestor to newly created
1036 */
1037 inline void PRHandle_CreateSlave( PRReqst *req, SlaveVP *requestingSlv )
1038 { SlaveVP *newSlv;
1039 PRMetaTask *metaTask; //pointer: assigned from create fn, used via -> below
1040 PRProcess *process;
1041 void *semEnv;
1042 process = requestingSlv->processSlaveIsIn;
1043 newSlv = PR_int__create_slaveVP(); //NOTE: args not filled in yet
1044 newSlv->typeOfVP = GenericSlv;
1045 newSlv->processSlaveIsIn = process;
1046 process->numLiveGenericSlvs += 1;
1047 metaTask = PR_int__create_slave_meta_task();
1048 metaTask->taskID = req->ID;
1049 // metaTask->taskType = GenericSlave;
1050 semEnv = PR_int__give_sem_env_for_slave( requestingSlv, req->langMagicNumber );
1051 (*req->handler)( req->semReq, newSlv, requestingSlv, semEnv );
1052 }
1054 /*The dissipate handler has to update the number of slaves of the type, within
1055 * the process, and call the langlet handler linked into the request,
1056 * and after that returns, then call the PR function that frees the slave state
1057 * (or recycles the slave).
1059 *The PR function that frees the slave state has to also free all of the
1060 * semData in the slave.. or else reset all of the semDatas.. by, say, marking
1061 * them, then in PR__give_semData( magicNum ) call the langlet registered
1062 * "resetSemData" Fn.
1063 */
1064 inline void PRHandle_Dissipate( SlaveVP *slave )
1065 { PRProcess *process;
1066 void *semEnv;
1068 process = slave->processSlaveIsIn;
1070 //do the language's dissipate handler
1071 semEnv = PR_int__give_sem_env_for_slave( slave, slave->request->langMagicNumber );
1072 (*slave->request->handler)( slave->request->semReq, slave, semEnv );
1074 process->numLiveGenericSlvs -= 1;
1075 PR_int__recycle_slave_multilang( slave );
1077 //check End Of Process Condition
1078 if( process->numLiveTasks == 0 &&
1079 process->numLiveGenericSlvs == 0 )
1080 PR_SS__shutdown_process( process );
1081 }
1083 /*Create task is a special form, that has PR behavior in addition to plugin
1084 * behavior. Master calls this first, and then calls the plugin's
1085 * create task handler.
1087 *Note: the requesting slave must be either generic slave or free task slave
1088 */
1089 inline PRMetaTask *
1090 PRHandle_CreateTask( PRReqst *req, SlaveVP *requestingSlv )
1091 { PRMetaTask *metaTask;
1092 PRProcess *process;
1093 PRLangMetaTask *langMetaTask;
1094 PRSemEnv *semanticEnv;
1096 process = requestingSlv->processSlaveIsIn;
1098 metaTask = PR_int__create_meta_task( req );
1099 metaTask->taskID = req->ID; //may be NULL
1100 metaTask->topLevelFn = req->topLevelFn;
1101 metaTask->initData = req->initData;
1103 process->numLiveTasks += 1;
1105 semanticEnv = PR_int__give_sem_env_for_slave( requestingSlv,
1106 req->langMagicNumber );
1108 //Do the langlet's create-task handler, which keeps the task
1109 // inside the langlet's sem env, but returns the langMetaTask
1110 // so PR can hook it to the PRMetaTask.
1111 //(Could also do PRMetaTask as a prolog -- make a Fn that takes the size
1112 // of the lang's metaTask, and alloc's that plus the prolog and returns
1113 // ptr to position just above the prolog)
1114 langMetaTask = (*req->handler)(req->semReq, requestingSlv, semanticEnv);
1115 metaTask->langMetaTask = langMetaTask;
1116 langMetaTask->protoMetaTask = metaTask;
1118 return metaTask;
1119 }
1121 /*When a task ends, there are two scenarios: 1) task ran to completion, or 2) task
1122 * suspended at some point in its code.
1123 *For 1, just decr count of live tasks (and check for end condition) -- the
1124 * master loop will decide what goes into the slot freed up by this task end,
1125 * so, here, don't worry about assigning a new task to the slot slave.
1126 *For 2, the task's slot slave has been converted to a free task slave, which
1127 * now has nothing more to do, so send it to the recycle Q (which includes
1128 * freeing all the semData and meta task structs alloc'd for it). Then
1129 * decrement the live task count and check end condition.
1131 *PR has to update count of live tasks, and check end of process condition.
1132 * The "main" can invoke constructs that wait for a process to end, so when
1133 * end detected, have to resume what's waiting..
1134 *Thing is, that wait involves the main OS thread. That means
1135 * PR internals have to do OS thread signaling. Want to do that in the
1136 * core controller, which has the original stack of an OS thread. So the
1137 * end process handling happens in the core controller.
1139 *So here, when detect process end, signal to the core controller, which will
1140 * then do the condition variable notify to the OS thread that's waiting.
1142 *Note: slave may be either a slot slave or a free task slave.
1143 */
1144 inline void
1145 PRHandle_EndTask( SlaveVP *requestingSlv )
1146 { void *semEnv;
1147 PRReqst *req;
1148 PRLangMetaTask *langMetaTask;
1149 PRProcess *process;
1150 process = requestingSlv->processSlaveIsIn; //needed for the counts below
1151 req = requestingSlv->request;
1152 semEnv = PR_int__give_sem_env_of_req( req, requestingSlv ); //magic num in req
1153 langMetaTask = requestingSlv->metaTask->langMetaTask;
1155 //Do the langlet's request handler
1156 //Want to keep PR structs hidden from plugin, so extract semReq..
1157 (*req->handler)( langMetaTask, req->semReq, semEnv );
1159 //Now that the langlet's done with it, recycle the slave if it's a freeTaskSlv
1160 if( requestingSlv->typeOfVP == FreeTaskSlv )
1161 PR_int__recycle_slave_multilang( requestingSlv );
1163 process->numLiveTasks -= 1;
1165 //check End Of Process Condition
1166 if( process->numLiveTasks == 0 &&
1167 process->numLiveGenericSlvs == 0 )
1168 //Tell the core controller to do wakeup of any waiting OS thread
1169 PR_SS__shutdown_process( process );
1170 }
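/*A minimal sketch of the end-of-process check that PRHandle_EndTask and
 * PRHandle_Dissipate both perform: decrement the relevant live count, then
 * signal shutdown once both counts reach zero. DemoProcess and the demo_*
 * functions are hypothetical stand-ins, and a boolean flag stands in for the
 * signal to the core controller. The OS-thread notify is not modelled.
 */
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the counts kept in PRProcess. */
typedef struct
 { int  numLiveTasks;
   int  numLiveGenericSlvs;
   bool shutdownSignaled;
 } DemoProcess;

/* Returns true exactly once, when both live counts have reached zero. */
static bool demo_check_end_of_process( DemoProcess *process )
 { assert( process != NULL );
   if( process->numLiveTasks == 0 &&
       process->numLiveGenericSlvs == 0 &&
       !process->shutdownSignaled )
    { process->shutdownSignaled = true;   /* stand-in for the shutdown call */
      return true;
    }
   return false;
 }

/* A task ended: one fewer live task, then check the end condition. */
static bool demo_end_task( DemoProcess *process )
 { assert( process != NULL && process->numLiveTasks > 0 );
   process->numLiveTasks -= 1;
   return demo_check_end_of_process( process );
 }

/* A generic slave dissipated: one fewer live slave, then check. */
static bool demo_dissipate_slave( DemoProcess *process )
 { assert( process != NULL && process->numLiveGenericSlvs > 0 );
   process->numLiveGenericSlvs -= 1;
   return demo_check_end_of_process( process );
 }
```
/*Keeping the check in one helper means both handlers test the same
 * condition, which the two hand-written copies above have to keep in sync.
 */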