# HG changeset patch # User Me@portablequad # Date 1328651564 28800 # Node ID 6dd906e3c9a48fcc1cb31f2d49ce58afd5474461 # Parent 53825c49db83b992dcfd7fbb12182d4b4bd00366 added .hgeol to handle line-ending issues diff -r 53825c49db83 -r 6dd906e3c9a4 .hgeol --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/.hgeol Tue Feb 07 13:52:44 2012 -0800 @@ -0,0 +1,14 @@ + +[patterns] +**.py = native +**.txt = native +**.c = native +**.h = native +**.cpp = native +**.java = native +**.class = bin +**.jar = bin +**.sh = native +**.pl = native +**.jpg = bin +**.gif = bin diff -r 53825c49db83 -r 6dd906e3c9a4 DESIGN_NOTES.txt --- a/DESIGN_NOTES.txt Tue Jan 31 18:34:07 2012 +0100 +++ b/DESIGN_NOTES.txt Tue Feb 07 13:52:44 2012 -0800 @@ -1,212 +1,212 @@ - -From e-mail to Albert, on design of app-virt-procr to core-loop animation -switch and back. - -==================== -General warnings about this code: -It only compiles in GCC 4.x (label addr and computed goto) -Has assembly for x86 32bit - - -==================== -AVProcr data-struc has: stack-ptr, jump-ptr, data-ptr, slotNum, coreloop-ptr - and semantic-custom-ptr - -The VMS Creator: takes ptr to function and ptr to initial data --- creates a new AVProcr struc --- sets the jmp-ptr field to the ptr-to-function passed in --- sets the data-ptr to ptr to initial data passed in --- if this is for a suspendable virt processor, then create a stack and set - the stack-ptr - -VMS__create_procr( AVProcrFnPtr fnPtr, void *initialData ) -{ -AVProcr newPr = malloc( sizeof(AVProcr) ); -newPr->jmpPtr = fnPtr; -newPr->coreLoopDonePt = &CoreLoopDonePt; //label is in coreLoop -newPr->data = initialData; -newPr->stackPtr = createNewStack(); -return newPr; -} - -The semantic layer can then add its own state in the cusom-ptr field - -The Scheduler plug-in: --- Sets slave-ptr in AVProcr, and points the slave to AVProcr --- if non-suspendable, sets the AVProcr's stack-ptr to the slave's stack-ptr - -MasterLoop: --- puts AVProcr structures onto the 
workQ - -CoreLoop: --- gets stack-ptr out of AVProcr and sets the core's stack-ptr to that --- gets data-ptr out of AVProcr and puts it into reg GCC uses for that param --- puts AVProcr's addr into reg GCC uses for the AVProcr-pointer param --- jumps to the addr in AVProcr's jmp-ptr field -CoreLoop() -{ while( FOREVER ) - { nextPr = readQ( workQ ); //workQ is static (global) var declared volatile - = nextPr->data; - = nextPr; - = nextPr->stackPtr; - jmp nextPr->jmpPtr; -CoreLoopDonePt: //label's addr put into AVProcr when create new one - } -} -(Note, for suspendable processors coming back from suspension, there is no - need to fill the parameter registers -- they will be discarded) - -Suspend an application-level virtual processor: -VMS__AVPSuspend( AVProcr *pr ) -{ -pr->jmpPtr = &ResumePt; //label defined a few lines below -pr->slave->doneFlag = TRUE; -pr->stackPtr = ; -jmp pr->coreLoopDonePt; -ResumePt: return; -} - -This works because the core loop will have switched back to this stack - before jumping to ResumePt.. also, the core loop never modifies the - stack pointer, it simply switches to whatever stack pointer is in the - next AVProcr it gets off the workQ. - - - -============================================================================= -As it is now, there's only one major unknown about GCC (first thing below - the line), and there are a few restrictions, the most intrusive being - that the functions the application gives to the semantic layer have a - pre-defined prototype -- return nothing, take a pointer to initial data - and a pointer to an AVProcr struc, which they're not allowed to modify - -- only pass it to semantic-lib calls. 
- -So, here are the assumptions, restrictions, and so forth: -=========================== -Major assumption: that GCC will do the following the same way every time: - say the application defines a function that fits this typedef: -typedef void (*AVProcrFnPtr) ( void *, AVProcr * ); - -and let's say somewhere in the code they do this: -AVProcrFnPtr fnPtr = &someFunc; - -then they do this: -(*fnPtr)( dataPtr, animatingVirtProcrPtr ); - -Can the registers that GCC uses to pass the two pointers be predicted? - Will they always be the same registers, in every program that has the - same typedef? -If that typedef fixes, guaranteed, the registers (on x86) that GCC will use - to send the two pointers, then the rest of this solution works. - -Change in model: Instead of a virtual processor whose execution trace is - divided into work-units, replacing that with the pattern that a virtual - processor is suspended. Which means, no more "work unit" data structure - -- instead, it's now an "Application Virtual Processor" structure - -- AVProcr -- which is given directly to the application function! - - -- You were right, don't need slaves to be virtual processors, only need - "scheduling buckets" -- just a way to keep track of things.. - -Restrictions: --- the "virtual entities" created by the semantic layer must be virtual - processors, created with a function-to-execute and initial data -- the - function is restricted to return nothing and only take a pointer to the - initial data plus a pointer to an AVProcr structure, which represents - "self", the virtual processor created. (This is the interface I showed - you for "Hello World" semantic layer). -What this means for synchronous dataflow, is that the nodes in the graph - are virtual processors that in turn spawn a new virtual processor for - every "firing" of the node. 
This should be fine because the function - that the node itself is created with is a "canned" function that is part - of the semantic layer -- the function that is spawned is the user-provided - function. The restriction only means that the values from the inputs to - the node are packaged as the "initial data" given to the spawned virtual - processor -- so the user-function has to cast a void * to the - semantic-layer-defined structure by which it gets the inputs to the node. - --- Second restriction is that the semantic layer has to use VMS supplied - stuff -- for example, the data structure that represents the - application-level virtual processor is defined in VMS, and the semantic - layer has to call a VMS function in order to suspend a virtual processor. - --- Third restriction is that the application code never do anything with - the AVProcr structure except pass it to semantic-layer lib calls. - --- Fourth restriction is that every virtual processor must call a - "dissipate" function as its last act -- the user-supplied - virtual-processor function can't just end -- it has to call - SemLib__dissipate( AVProcr ) before the closing brace.. and after the - semantic layer is done cleaning up its own data, it has to in turn call - VMS__disspate( AVProcr ). - --- For performance reasons, I think I want to have two different kinds of - app-virtual processor -- suspendable ones and non-suspendable -- where - non-suspendable are not allowed to perform any communication with other - virtual processors, except at birth and death. Suspendable ones, of - course can perform communications, create other processors, and so forth - -- all of which cause it to suspend. -The performance difference is that I need a separate stack for each - suspendable, but non-suspendable can re-use a fixed number of stacks - (one for each slave). 
- - -==================== May 29 - -Qs: ---1 how to safely jump between virt processor's trace and coreloop ---2 how to set up __cdecl style stack + frame for just-born virtual processor ---3 how to switch stack-pointers + frame-pointers - - ---1: -Not sure if GCC's computed goto is safe, because modify the stack pointer -without GCC's knowledge -- although, don't use the stack in the coreloop -segment, so, actually, that should be safe! - -So, GCC has its own special C extensions, one of which gets address of label: - -void *labelAddr; -labelAddr = &&label; -goto *labelAddr; - ---2 -In CoreLoop, will check whether VirtProc just born, or was suspended. -If just born, do bit of code that sets up the virtual processor's stack -and frame according to the __cdecl convention for the standard virt proc -fn typedef -- save the pointer to data and pointer to virt proc struc into -correct places in the frame - __cdecl says, according to: -http://unixwiz.net/techtips/win32-callconv-asm.html -To do this: -push the parameters onto the stack, right most first, working backwards to - the left. -Then perform call instr, which pushes return addr onto stack. -Then callee first pushes the frame pointer, %EBP followed by placing the -then-current value of stack pointer into %EBP -push ebp -mov ebp, esp // ebp « esp - -Once %ebp has been changed, it can now refer directly to the function's - arguments as 8(%ebp), 12(%ebp). Note that 0(%ebp) is the old base pointer - and 4(%ebp) is the old instruction pointer. - -Then callee pushes regs it will use then adds to stack pointer the size of - its local vars. 
- -Stack in callee looks like this: -16(%ebp) - third function parameter -12(%ebp) - second function parameter -8(%ebp) - first function parameter -4(%ebp) - old %EIP (the function's "return address") -----------^^ State seen at first instr of callee ^^----------- -0(%ebp) - old %EBP (previous function's base pointer) --4(%ebp) - save of EAX, the only reg used in function --8(%ebp) - first local variable --12(%ebp) - second local variable --16(%ebp) - third local variable - - ---3 -It might be just as simple as two mov instrs, one for %ESP, one for %EBP.. - the stack and frame pointer regs + +From e-mail to Albert, on design of app-virt-procr to core-loop animation +switch and back. + +==================== +General warnings about this code: +It only compiles in GCC 4.x (label addr and computed goto) +Has assembly for x86 32bit + + +==================== +AVProcr data-struc has: stack-ptr, jump-ptr, data-ptr, slotNum, coreloop-ptr + and semantic-custom-ptr + +The VMS Creator: takes ptr to function and ptr to initial data +-- creates a new AVProcr struc +-- sets the jmp-ptr field to the ptr-to-function passed in +-- sets the data-ptr to ptr to initial data passed in +-- if this is for a suspendable virt processor, then create a stack and set + the stack-ptr + +VMS__create_procr( AVProcrFnPtr fnPtr, void *initialData ) +{ +AVProcr newPr = malloc( sizeof(AVProcr) ); +newPr->jmpPtr = fnPtr; +newPr->coreLoopDonePt = &CoreLoopDonePt; //label is in coreLoop +newPr->data = initialData; +newPr->stackPtr = createNewStack(); +return newPr; +} + +The semantic layer can then add its own state in the cusom-ptr field + +The Scheduler plug-in: +-- Sets slave-ptr in AVProcr, and points the slave to AVProcr +-- if non-suspendable, sets the AVProcr's stack-ptr to the slave's stack-ptr + +MasterLoop: +-- puts AVProcr structures onto the workQ + +CoreLoop: +-- gets stack-ptr out of AVProcr and sets the core's stack-ptr to that +-- gets data-ptr out of AVProcr and puts it into reg GCC 
uses for that param +-- puts AVProcr's addr into reg GCC uses for the AVProcr-pointer param +-- jumps to the addr in AVProcr's jmp-ptr field +CoreLoop() +{ while( FOREVER ) + { nextPr = readQ( workQ ); //workQ is static (global) var declared volatile + = nextPr->data; + = nextPr; + = nextPr->stackPtr; + jmp nextPr->jmpPtr; +CoreLoopDonePt: //label's addr put into AVProcr when create new one + } +} +(Note, for suspendable processors coming back from suspension, there is no + need to fill the parameter registers -- they will be discarded) + +Suspend an application-level virtual processor: +VMS__AVPSuspend( AVProcr *pr ) +{ +pr->jmpPtr = &ResumePt; //label defined a few lines below +pr->slave->doneFlag = TRUE; +pr->stackPtr = ; +jmp pr->coreLoopDonePt; +ResumePt: return; +} + +This works because the core loop will have switched back to this stack + before jumping to ResumePt.. also, the core loop never modifies the + stack pointer, it simply switches to whatever stack pointer is in the + next AVProcr it gets off the workQ. + + + +============================================================================= +As it is now, there's only one major unknown about GCC (first thing below + the line), and there are a few restrictions, the most intrusive being + that the functions the application gives to the semantic layer have a + pre-defined prototype -- return nothing, take a pointer to initial data + and a pointer to an AVProcr struc, which they're not allowed to modify + -- only pass it to semantic-lib calls. 
+ +So, here are the assumptions, restrictions, and so forth: +=========================== +Major assumption: that GCC will do the following the same way every time: + say the application defines a function that fits this typedef: +typedef void (*AVProcrFnPtr) ( void *, AVProcr * ); + +and let's say somewhere in the code they do this: +AVProcrFnPtr fnPtr = &someFunc; + +then they do this: +(*fnPtr)( dataPtr, animatingVirtProcrPtr ); + +Can the registers that GCC uses to pass the two pointers be predicted? + Will they always be the same registers, in every program that has the + same typedef? +If that typedef fixes, guaranteed, the registers (on x86) that GCC will use + to send the two pointers, then the rest of this solution works. + +Change in model: Instead of a virtual processor whose execution trace is + divided into work-units, replacing that with the pattern that a virtual + processor is suspended. Which means, no more "work unit" data structure + -- instead, it's now an "Application Virtual Processor" structure + -- AVProcr -- which is given directly to the application function! + + -- You were right, don't need slaves to be virtual processors, only need + "scheduling buckets" -- just a way to keep track of things.. + +Restrictions: +-- the "virtual entities" created by the semantic layer must be virtual + processors, created with a function-to-execute and initial data -- the + function is restricted to return nothing and only take a pointer to the + initial data plus a pointer to an AVProcr structure, which represents + "self", the virtual processor created. (This is the interface I showed + you for "Hello World" semantic layer). +What this means for synchronous dataflow, is that the nodes in the graph + are virtual processors that in turn spawn a new virtual processor for + every "firing" of the node. 
This should be fine because the function + that the node itself is created with is a "canned" function that is part + of the semantic layer -- the function that is spawned is the user-provided + function. The restriction only means that the values from the inputs to + the node are packaged as the "initial data" given to the spawned virtual + processor -- so the user-function has to cast a void * to the + semantic-layer-defined structure by which it gets the inputs to the node. + +-- Second restriction is that the semantic layer has to use VMS supplied + stuff -- for example, the data structure that represents the + application-level virtual processor is defined in VMS, and the semantic + layer has to call a VMS function in order to suspend a virtual processor. + +-- Third restriction is that the application code never do anything with + the AVProcr structure except pass it to semantic-layer lib calls. + +-- Fourth restriction is that every virtual processor must call a + "dissipate" function as its last act -- the user-supplied + virtual-processor function can't just end -- it has to call + SemLib__dissipate( AVProcr ) before the closing brace.. and after the + semantic layer is done cleaning up its own data, it has to in turn call + VMS__disspate( AVProcr ). + +-- For performance reasons, I think I want to have two different kinds of + app-virtual processor -- suspendable ones and non-suspendable -- where + non-suspendable are not allowed to perform any communication with other + virtual processors, except at birth and death. Suspendable ones, of + course can perform communications, create other processors, and so forth + -- all of which cause it to suspend. +The performance difference is that I need a separate stack for each + suspendable, but non-suspendable can re-use a fixed number of stacks + (one for each slave). 
+ + +==================== May 29 + +Qs: +--1 how to safely jump between virt processor's trace and coreloop +--2 how to set up __cdecl style stack + frame for just-born virtual processor +--3 how to switch stack-pointers + frame-pointers + + +--1: +Not sure if GCC's computed goto is safe, because modify the stack pointer +without GCC's knowledge -- although, don't use the stack in the coreloop +segment, so, actually, that should be safe! + +So, GCC has its own special C extensions, one of which gets address of label: + +void *labelAddr; +labelAddr = &&label; +goto *labelAddr; + +--2 +In CoreLoop, will check whether VirtProc just born, or was suspended. +If just born, do bit of code that sets up the virtual processor's stack +and frame according to the __cdecl convention for the standard virt proc +fn typedef -- save the pointer to data and pointer to virt proc struc into +correct places in the frame + __cdecl says, according to: +http://unixwiz.net/techtips/win32-callconv-asm.html +To do this: +push the parameters onto the stack, right most first, working backwards to + the left. +Then perform call instr, which pushes return addr onto stack. +Then callee first pushes the frame pointer, %EBP followed by placing the +then-current value of stack pointer into %EBP +push ebp +mov ebp, esp // ebp « esp + +Once %ebp has been changed, it can now refer directly to the function's + arguments as 8(%ebp), 12(%ebp). Note that 0(%ebp) is the old base pointer + and 4(%ebp) is the old instruction pointer. + +Then callee pushes regs it will use then adds to stack pointer the size of + its local vars. 
+ +Stack in callee looks like this: +16(%ebp) - third function parameter +12(%ebp) - second function parameter +8(%ebp) - first function parameter +4(%ebp) - old %EIP (the function's "return address") +----------^^ State seen at first instr of callee ^^----------- +0(%ebp) - old %EBP (previous function's base pointer) +-4(%ebp) - save of EAX, the only reg used in function +-8(%ebp) - first local variable +-12(%ebp) - second local variable +-16(%ebp) - third local variable + + +--3 +It might be just as simple as two mov instrs, one for %ESP, one for %EBP.. + the stack and frame pointer regs
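For question --1 above, the label-address extension can be tried on its own before any stack switching is involved. What follows is only a minimal sketch, assuming GCC (or Clang, which accepts the same extension). DemoProcr and demo_animate are made-up names for illustration, not part of VMS. Unlike the design in these notes, the sketch never touches the stack pointer, so each segment ends with a plain return rather than a jump to a core-loop label. The label address is only ever used inside the function that defines it, which is the case GCC documents as supported.

```c
#include <stddef.h>

/* Hypothetical stand-in for AVProcr: only a jump pointer and a counter. */
typedef struct {
    void *jmpPtr;  /* label address to continue at; NULL means "just born" */
    int   steps;   /* number of trace segments executed so far */
} DemoProcr;

/* Animate one segment of pr's trace.
 * Returns 0 if the processor suspended, 1 if it has finished.
 * Uses the GCC extensions &&label and goto *ptr.
 * noinline keeps a single copy of the function, so a stored label
 * address stays valid from one call to the next. */
__attribute__((noinline))
int demo_animate(DemoProcr *pr)
{
    if (pr->jmpPtr != NULL)
        goto *pr->jmpPtr;       /* resume where the previous call stopped */

    /* first segment of the trace */
    pr->steps += 1;
    pr->jmpPtr = &&ResumePt;    /* remember where to come back to */
    return 0;                   /* "suspend": control goes back to the caller */

ResumePt:
    /* second segment of the trace */
    pr->steps += 1;
    pr->jmpPtr = &&DonePt;
    return 1;

DonePt:
    return 1;                   /* already finished, nothing left to run */
}
```

A caller plays the role of the core loop: it calls demo_animate( &pr ) repeatedly on a zero-initialised DemoProcr until the call returns 1. The first call runs the first segment and suspends, the second call resumes at ResumePt and finishes.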