changeset 4:b7a974ccc6f4
Added old design notes -- probably very out of sync with code
| author | Me |
|---|---|
| date | Wed, 28 Jul 2010 13:16:31 -0700 |
| parents | 9f2e23d38ff2 |
| children | 833c981134dd |
| files | DESIGN_NOTES.txt |
| diffstat | 1 files changed, 212 insertions(+), 0 deletions(-) |
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/DESIGN_NOTES.txt	Wed Jul 28 13:16:31 2010 -0700
@@ -0,0 +1,212 @@

From e-mail to Albert, on the design of the switch from app-virt-procr to
core-loop animation and back.

====================
General warnings about this code:
It only compiles with GCC 4.x (uses label addresses and computed goto)
It has assembly for 32-bit x86


====================
AVProcr data-struc has: stack-ptr, jmp-ptr, data-ptr, slotNum, coreloop-ptr
 and semantic-custom-ptr

The VMS Creator: takes ptr to function and ptr to initial data
-- creates a new AVProcr struc
-- sets the jmp-ptr field to the ptr-to-function passed in
-- sets the data-ptr to ptr to initial data passed in
-- if this is for a suspendable virt processor, then creates a stack and sets
   the stack-ptr

AVProcr *
VMS__create_procr( AVProcrFnPtr fnPtr, void *initialData )
{
   AVProcr *newPr = malloc( sizeof(AVProcr) );   //pointer, not a struct value
   newPr->jmpPtr = fnPtr;
   newPr->coreLoopDonePt = &CoreLoopDonePt;      //label is in coreLoop
   newPr->data = initialData;
   newPr->stackPtr = createNewStack();
   return newPr;
}

The semantic layer can then add its own state in the custom-ptr field

The Scheduler plug-in:
-- Sets slave-ptr in AVProcr, and points the slave to AVProcr
-- if non-suspendable, sets the AVProcr's stack-ptr to the slave's stack-ptr

MasterLoop:
-- puts AVProcr structures onto the workQ

CoreLoop:
-- gets stack-ptr out of AVProcr and sets the core's stack-ptr to that
-- gets data-ptr out of AVProcr and puts it into reg GCC uses for that param
-- puts AVProcr's addr into reg GCC uses for the AVProcr-pointer param
-- jumps to the addr in AVProcr's jmp-ptr field
CoreLoop()
{ while( FOREVER )
   { nextPr = readQ( workQ );          //workQ is static (global) var,
                                       //  declared volatile
     <dataPtr-param-register>    = nextPr->data;
     <AVProcrPtr-param-register> = nextPr;
     <stack-pointer register>    = nextPr->stackPtr;
     jmp nextPr->jmpPtr;
CoreLoopDonePt:   //label's addr is put into each AVProcr when it is created
   }
}
(Note: for suspendable processors coming back from suspension, there is no
 need to fill the parameter registers -- they will be discarded)

Suspend an application-level virtual processor:
void
VMS__AVPSuspend( AVProcr *pr )
{
   pr->jmpPtr = &&ResumePt;            //GCC label addr; label is a few lines below
   pr->slave->doneFlag = TRUE;
   pr->stackPtr = <current SP reg value>;
   jmp pr->coreLoopDonePt;
ResumePt: return;
}

This works because the core loop will have switched back to this stack
 before jumping to ResumePt. Also, the core loop never modifies the
 stack pointer itself; it simply switches to whatever stack pointer is in
 the next AVProcr it gets off the workQ.



=============================================================================
As it is now, there's only one major unknown about GCC (first thing below
 the line), and there are a few restrictions. The most intrusive is
 that the functions the application gives to the semantic layer have a
 pre-defined prototype: return nothing, take a pointer to initial data
 and a pointer to an AVProcr struc, which they're not allowed to modify
 -- only pass it to semantic-lib calls.
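
The handoff between CoreLoop and VMS__AVPSuspend described above can be tried
out with POSIX ucontext, as a stand-in for the hand-written jmp and
stack-pointer switch. This is only a sketch under assumptions, not the VMS
code: it assumes Linux/glibc, and the names create_procr, suspend_procr,
trampoline, count_twice and run_demo, as well as the ctx, coreLoop and done
fields, are made up for illustration. swapcontext saves and restores more
state than the two-register switch these notes are aiming for.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_BYTES (64 * 1024)

typedef struct AVProcr AVProcr;
typedef void (*AVProcrFnPtr)(void *, AVProcr *);

struct AVProcr {
    ucontext_t  ctx;       /* hypothetical: replaces stack-ptr + jmp-ptr    */
    ucontext_t *coreLoop;  /* hypothetical: replaces coreLoopDonePt         */
    AVProcrFnPtr fn;       /* function the virtual processor animates       */
    void *data;            /* initial data handed to fn                     */
    void *stack;           /* private stack of this suspendable processor   */
    int   done;            /* set once fn has returned                      */
};

/* makecontext entry points cannot portably take pointer arguments,
 * so the processor being started is passed through this variable. */
static AVProcr *startingPr;

static void trampoline(void) {
    AVProcr *pr = startingPr;
    pr->fn(pr->data, pr);
    pr->done = 1;  /* on return, uc_link switches back to the core loop */
}

static AVProcr *create_procr(AVProcrFnPtr fn, void *data, ucontext_t *coreLoop) {
    AVProcr *pr = calloc(1, sizeof *pr);
    if (pr == NULL) return NULL;
    pr->stack = malloc(STACK_BYTES);
    if (pr->stack == NULL || getcontext(&pr->ctx) != 0) {
        free(pr->stack);
        free(pr);
        return NULL;
    }
    pr->fn = fn;
    pr->data = data;
    pr->coreLoop = coreLoop;
    pr->ctx.uc_stack.ss_sp = pr->stack;
    pr->ctx.uc_stack.ss_size = STACK_BYTES;
    pr->ctx.uc_link = coreLoop;
    makecontext(&pr->ctx, trampoline, 0);
    return pr;
}

/* Counterpart of VMS__AVPSuspend: save this context, resume the core loop. */
static void suspend_procr(AVProcr *pr) {
    swapcontext(&pr->ctx, pr->coreLoop);
}

/* Application function with the restricted prototype; suspends once. */
static void count_twice(void *data, AVProcr *self) {
    int *counter = data;
    (*counter)++;
    suspend_procr(self);
    (*counter)++;
}

/* Tiny core loop: keeps resuming the processor until its function returns.
 * Returns how many times the processor was switched to, or -1 on failure. */
int run_demo(int *counter) {
    ucontext_t coreLoop;
    AVProcr *pr = create_procr(count_twice, counter, &coreLoop);
    if (pr == NULL) return -1;
    int resumes = 0;
    while (!pr->done) {
        startingPr = pr;
        resumes++;
        swapcontext(&coreLoop, &pr->ctx);  /* returns on suspend or finish */
    }
    free(pr->stack);
    free(pr);
    return resumes;
}
```

Calling run_demo with a counter that starts at 0 switches to the processor
twice (once at birth, once after the suspend) and leaves the counter at 2.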

So, here are the assumptions, restrictions, and so forth:
===========================
Major assumption: that GCC will do the following the same way every time.
 Say the application defines a function that fits this typedef:
typedef void (*AVProcrFnPtr) ( void *, AVProcr * );

and let's say somewhere in the code they do this:
AVProcrFnPtr fnPtr = &someFunc;

then they do this:
(*fnPtr)( dataPtr, animatingVirtProcrPtr );

Can the registers that GCC uses to pass the two pointers be predicted?
 Will they always be the same registers, in every program that has the
 same typedef?
If that typedef fixes, guaranteed, the registers (on x86) that GCC will use
 to send the two pointers, then the rest of this solution works.

Change in model: instead of a virtual processor whose execution trace is
 divided into work-units, the pattern is now that a virtual processor gets
 suspended. That means no more "work unit" data structure
 -- instead, it's now an "Application Virtual Processor" structure
 -- AVProcr -- which is given directly to the application function!

 -- You were right, don't need slaves to be virtual processors, only need
 "scheduling buckets" -- just a way to keep track of things.

Restrictions:
-- the "virtual entities" created by the semantic layer must be virtual
 processors, created with a function-to-execute and initial data -- the
 function is restricted to return nothing and only take a pointer to the
 initial data plus a pointer to an AVProcr structure, which represents
 "self", the virtual processor created. (This is the interface I showed
 you for the "Hello World" semantic layer.)
What this means for synchronous dataflow is that the nodes in the graph
 are virtual processors that in turn spawn a new virtual processor for
 every "firing" of the node. This should be fine because the function
 that the node itself is created with is a "canned" function that is part
 of the semantic layer -- the function that is spawned is the user-provided
 function. The restriction only means that the values from the inputs to
 the node are packaged as the "initial data" given to the spawned virtual
 processor -- so the user-function has to cast a void * to the
 semantic-layer-defined structure by which it gets the inputs to the node.

-- Second restriction is that the semantic layer has to use VMS supplied
 stuff -- for example, the data structure that represents the
 application-level virtual processor is defined in VMS, and the semantic
 layer has to call a VMS function in order to suspend a virtual processor.

-- Third restriction is that the application code never does anything with
 the AVProcr structure except pass it to semantic-layer lib calls.

-- Fourth restriction is that every virtual processor must call a
 "dissipate" function as its last act -- the user-supplied
 virtual-processor function can't just end -- it has to call
 SemLib__dissipate( AVProcr ) before the closing brace. After the
 semantic layer is done cleaning up its own data, it has to in turn call
 VMS__dissipate( AVProcr ).

-- For performance reasons, I think I want to have two different kinds of
 app-virtual processor -- suspendable ones and non-suspendable -- where
 non-suspendable are not allowed to perform any communication with other
 virtual processors, except at birth and death.
 Suspendable ones, of
 course, can perform communications, create other processors, and so forth
 -- all of which cause them to suspend.
The performance difference is that I need a separate stack for each
 suspendable, but non-suspendable can re-use a fixed number of stacks
 (one for each slave).


==================== May 29

Qs:
--1 how to safely jump between virt processor's trace and coreloop
--2 how to set up __cdecl style stack + frame for just-born virtual processor
--3 how to switch stack-pointers + frame-pointers


--1:
Not sure if GCC's computed goto is safe, because the code modifies the stack
pointer without GCC's knowledge -- although the coreloop segment doesn't use
the stack, so, actually, that should be safe!

So, GCC has its own special C extensions, one of which gets address of label:

void *labelAddr;
labelAddr = &&label;
goto *labelAddr;

--2:
In CoreLoop, will check whether VirtProc was just born, or was suspended.
If just born, do a bit of code that sets up the virtual processor's stack
and frame according to the __cdecl convention for the standard virt proc
fn typedef -- save the pointer to data and pointer to virt proc struc into
the correct places in the frame.
 According to:
http://unixwiz.net/techtips/win32-callconv-asm.html
__cdecl says to do this:
Push the parameters onto the stack, rightmost first, working backwards to
 the left.
Then perform the call instr, which pushes the return addr onto the stack.
Then the callee first pushes the frame pointer, %EBP, and then places the
then-current value of the stack pointer into %EBP:
push ebp
mov ebp, esp // ebp <- esp

Once %ebp has been changed, it can now refer directly to the function's
 arguments as 8(%ebp), 12(%ebp).
 Note that 0(%ebp) is the old base pointer
 and 4(%ebp) is the old instruction pointer (the return address).

Then the callee pushes the regs it will use, then subtracts the size of
 its local vars from the stack pointer (the stack grows downward).

Stack in callee looks like this:
16(%ebp) - third function parameter
12(%ebp) - second function parameter
8(%ebp) - first function parameter
4(%ebp) - old %EIP (the function's "return address")
----------^^ State seen at first instr of callee ^^-----------
0(%ebp) - old %EBP (previous function's base pointer)
-4(%ebp) - save of EAX, the only reg used in function
-8(%ebp) - first local variable
-12(%ebp) - second local variable
-16(%ebp) - third local variable


--3:
It might be just as simple as two mov instrs, one for %ESP, one for %EBP
 (the stack and frame pointer regs).
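
Coming back to the label-address extension from --1: GCC's documentation only
covers computed goto when the jump stays inside the function that owns the
label, and says that jumping into a different function gives unpredictable
results. So the cross-function jumps in these notes lean on behaviour GCC
does not promise. Below is a small sketch of the supported, single-function
use, assuming GCC or Clang. The opcodes and run_ops are hypothetical names
for illustration and are not part of VMS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration only. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Runs a tiny opcode program on `value` using label addresses.
 * Assumes every opcode is valid and the program ends with OP_HALT. */
int run_ops(const unsigned char *ops, int value) {
    /* &&label yields a void * that is only meaningful inside this function. */
    static void *const dispatch[] = { &&do_inc, &&do_double, &&do_halt };
    size_t pc = 0;

    goto *dispatch[ops[pc++]];   /* computed goto: jump to the stored address */

do_inc:
    value += 1;
    goto *dispatch[ops[pc++]];

do_double:
    value *= 2;
    goto *dispatch[ops[pc++]];

do_halt:
    return value;
}
```

With the program { OP_INC, OP_DOUBLE, OP_INC, OP_HALT } and a starting value
of 3, run_ops returns (3 + 1) * 2 + 1 = 9.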
