Table of Contents
Plan transformations: task trees, code replacement, and serialization
Disclaimer: the most up to date overview of plan transformations you can find in README.md of cram_plan_transformation
package (CRAM_REPO/cram_3d_world/cram_plan_transformation/README.md
). The tutorial below is out of date and has not be maintained, unfortunately, but you might still find some useful explanations there. Otherwise, stick to the README.
Description: this tutorial is aimed at CRAM developers, and its purpose is to present an in-development aspect of CRAM. It is less a proper tutorial and more documentation for work in progress. Expect content to change. It contains an intro to task trees, a data structure used by CRAM functions (note: this has existed in CRAM for a while and will not change). It also describes code replacement (also long existing), failure handling through code replacement (new), and serialization of task trees and functions (new).
CRAM task trees
Even before semrec, CRAM offered a logging mechanism to store what functions were called, when, and with what parameters, a mechanism referred to as a task tree. Let's first have a look at a very simple task tree. Run a REPL, and run the following code to give us some cram functions to play with later:
(swank:operate-on-system-for-emacs "cram-language" (quote load-op)) (in-package :cpl-impl) (def-cram-function foo () (format t "Just a cram function.~%")) (def-top-level-cram-function bar () (format t "Just a toplevel cram function.~%") (foo))
Before calling the newly-defined top-level cram function however, try the following:
*top-level-task-trees*
The REPL will return a hash table which you can inspect; since you've just started the REPL and didn't run any CRAM functions, it should be empty.
Next, run the top-level function defined above, then try to get the value for the hash table again:
(bar) *top-level-task-trees*
You'll notice there's an element in the hash table now, corresponding to the top-level function we just ran. There's also a more elegant way to get the task tree:
(get-top-level-task-tree 'bar)
Inspect the result the REPL returns and pay attention to the slots in the objects. Of particular importance will be the slots CODE, CODE-REPLACEMENTS, PATH, and CHILDREN. CODE-REPLACEMENTS is a list of objects of the same type as CODE, and will become important later when we try to replace code in the task tree. CHILDREN is a list of objects of type TASK-TREE-NODE. CODE contains an SEXP, a FUNCTION object, a TASK, and a list of PARAMETERS. PATH is a list of symbols.
Let's look at the structure generated for the very simple example above. It contains:
- A TASK-TREE-NODE with NIL path and NIL code, but one child …
- which is a node with path ( ('TOP-LEVEL 'BAR) ), non-nil code where the function object is the code defined for the bar function, and with one child …
- which is a node with path ( ('FOO) ('TOP-LEVEL 'BAR) ), non-nil code where the function object is the code defined for the foo function, and no children.
Notice that, in order to appear in the task tree, a function must be defined as a def-cram-function or def-top-level-cram-function. format doesn't appear in the task tree, and neither would any lisp function defined without def-[top-level-]cram-function.
Let's now also try the following, to observe another aspect of task tree generation:
(def-top-level-cram-function bar () (format t "Just a toplevel cram function.~%") (foo) (foo)) (bar) (get-top-level-task-tree 'bar)
and inspect the result. Notice that the tree structure has changed: the ( ('TOP-LEVEL 'BAR) ) node now contains two children, one with path ( ('FOO) ('TOP-LEVEL 'BAR) ) and the other with ( ('FOO :CALL 2) ('TOP-LEVEL 'BAR) ).
For some more intuition about task tree construction, here's another example you should try:
(def-cram-function eek () (format t "A very low level function.~%")) (def-cram-function foo () (format t "Just a cram function.~%") (eek)) (bar) (get-top-level-task-tree 'bar)
and if you inspect the result you should notice yet another change in the task tree structure. Each of the top-level node's children now also has a child, and they are, respectively, ( ('EEK) ('FOO) ('TOP-LEVEL 'BAR) ) and ( ('EEK) ('FOO :CALL 2) ('TOP-LEVEL 'BAR) ).
Here are the “laws” of task trees we encountered so far:
- only cram-functions (including the top-level cram function) appear in the task tree. (The format function didn't appear in the task tree)
- paths are lists of symbols which are the function names, the 'TOP-LEVEL symbol, and key-value :CALL if a function was called several times from the same location.
- paths refer to locations during execution history, not source code. (The nodes for EEK are children of the different FOO nodes, rather than EEK appearing as a child in a single FOO node)
Code replacement and plan running
The task tree isn't just a mechanism for logging however. Though the examples above don't show it, the task tree helps define what is actually run when you invoke a cram function, and we will look at this now in the next example:
(defun box-off () (format t "Box is in the off state.~%")) (def-cram-function box-state () (format t "Box is in the on-state, turning off ... ~%") (replace-task-code '(box-off) #'box-off '((BOX-STATE) (TOP-LEVEL SELF-OFF-TURNING-BOX)))) (def-top-level-cram-function self-off-turning-box () (box-state))
Load the above code into REPL and run the top-level function. The response should look something like this:
Box is in the on-state, turning off ... (#S(CODE :SEXP (BOX-OFF) :FUNCTION #<FUNCTION BOX-OFF> :TASK NIL :PARAMETERS NIL))
As you can see, the message in the box-state function is printed out, which isn't surprising, and notice that a CODE object is returned by replace-task-code. Let's first inspect the task tree for self-off-turning-box however. As before, use
(get-top-level-task-tree 'self-off-turning-box)
and inspect the result. Navigate to the ( (BOX-STATE) (TOP-LEVEL SELF-OFF-TURNING-BOX) ) node; you will see there is an element in the CODE-REPLACEMENTS list now, containing a function object that points to box-off. So let's try running the top-level function again and see what happens …
Box is in the off state. NIL
Observe that the call to box-state has now been replaced with a call to box-off. In general, and this is another “law” of task trees, if a task tree node exists and has code replacements, then the car of those code replacements is run when the node is reached during execution history.
The last part of the law is very important, as the next example shows:
(def-cram-function box-state () (format t "Box is in the on-state, turning off ... ~%") (replace-task-code '(box-off) #'box-off '((BOX-OFF) (BOX-STATE) (TOP-LEVEL SELF-OFF-TURNING-BOX))))
First, inspect the variable *top-level-task-trees* and clear the entry associated with self-off-turning-box. Alternatively, run
(setf (gethash 'self-off-turning-box *top-level-task-trees*) nil)
Next, run self-off-turning-box again and inspect its corresponding task tree that you get with get-top-level-task-tree. Notice that a ( (BOX-OFF) (BOX-STATE) (TOP-LEVEL SELF-OFF-TURNING-BOX) ) node has been created, and it has the code replacement specified. However, if you run self-off-turning-box again … nothing different seems to have happened. The CODE-REPLACEMENTS list grows longer (and you will see it as a return value) but the box-off function is never actually called. This is because there is no call to a cram function called box-off inside the cram function box-state, therefore there is no opportunity for the system to ask whether there is code defined for such an eventuality.
This isn't to say that code replacements cannot change the structure of a task tree in ways that will also impact execution. You can add meaningful (executable) nodes to the tree, you just have to make sure that there is a call to the highest node in the hierarchy you want glued to the task tree. Here's an example of that:
(def-cram-function part-a () (format t "You should never see this.~%")) (def-cram-function part-b () (format t "You should never see this either.~%")) (defun part-a-true () (format t "The part A you will see.~%")) (defun part-b-true () (format t "The part B you will see.~%")) (defun double-call () (part-a) (part-b)) (def-cram-function simple-node () (format t "Just a place-holder that you won't see.")) (def-top-level-cram-function anticipator () (replace-task-code '(double-call) #'double-call '((SIMPLE-NODE) (TOP-LEVEL ANTICIPATOR))) (replace-task-code '(part-a-true) #'part-a-true '((PART-A) (SIMPLE-NODE) (TOP-LEVEL ANTICIPATOR))) (replace-task-code '(part-b-true) #'part-b-true '((PART-B) (SIMPLE-NODE) (TOP-LEVEL ANTICIPATOR))) (simple-node))
If you run the anticipator top-level function, you'll notice that the part-a and part-b functions (as well as simple-node) never actually ran. Instead, the code replacements caused certain nodes to be created and were invoked when execution reached certain points.
The above examples were for parameter-less functions. Parameters, however, are not changed via code-replacement. Functions are called with the parameters that actually exist at execution time; the PARAMETERS slot in the task tree node's CODE or CODE-REPLACEMENTS is for logging purposes only.
We can therefore state some additional “laws” of task trees:
- when a top-level cram function is called, it checks whether a top-level task tree associated with it exists. If not, it creates one and sets the current execution path to ( (TOP-LEVEL <function name>) )
- when a cram function is called, it sets the current execution path to (append (list <function name>) <current execution path>) then checks whether a task tree node exists at the current execution path.
- If no node exists, the function creates it, then runs the code defined in the function's source code.
- If a node exists, but it has a valid task object (that might point to an active or finished task), then the cram function modifies the current path by adding a key-value :CALL (+ 1 <max key-value so far at this location, default 1>), creating a node there, and running its source code.
- If a node exists, but it has a nil task object, then the cram function will run either the car of CODE-REPLACEMENTS if it is not nil, or CODE otherwise.
- When a task tree node is run, the value of PARAMETERS in either CODE or (car CODE-REPLACEMENTS) is set to the value of the parameters at the moment of the function call.
In the examples so far we've used replace-task-code, a function defined in cram-language that takes an s-expression, a function, a path (and, optionally, a task tree; default value is the current task tree). All of the above features have existed in CRAM for a long time and are not the subject of recent pull requests.
Failure handling through code replacement: a simple example
Now that we've looked at the basics of task trees and replace-task-code, it's time to see a simple possible application of this capability– We will try to replace code as part of a failure handling routine:
(in-package :cpl-impl) (defvar *stop-infinity* 0) (setf *stop-infinity* 0) (remove-top-level-task-tree 'ptr-fail-handle) (def-cram-function failure-causing () (if (> 1 *stop-infinity*) (progn (setf *stop-infinity* (+ 1 *stop-infinity*)) (format T "FAILURE-CAUSING: Will attempt to send a fail signal from ~A.~%" *current-path*) (fail 'plan-failure))) (format T "FAILURE-CAUSING: Why are we back here?~%")) (def-cram-function some-function () (failure-causing)) (defun plan-repaired () (format T "PLAN-REPAIRED: Hi there. This was inserted via code replacement.~%")) (def-top-level-cram-function ptr-fail-handle () (with-failure-handling ((plan-failure (f) (let ((code-path (plan-failure/get-code-path f))) (format T "Code path of failure: ~A.~% Current path: ~A.~%" code-path *current-path*) (replace-task-code '(plan-repaired) #'plan-repaired code-path) (retry)))) (progn (some-function))))
In other words, we're defining a top-level function that, if a failure happens, will replace one of the cram-functions it calls; we'll see soon enough why for the sake of example we added the check to *stop-infinity*. Let's run the top-level function. The response you see should be something like this:
FAILURE-CAUSING: Will attempt to send a fail signal from ((FAILURE-CAUSING) (SOME-FUNCTION) (TOP-LEVEL PTR-FAIL-HANDLE)). Code path of failure: ((FAILURE-CAUSING) (SOME-FUNCTION) (TOP-LEVEL PTR-FAIL-HANDLE)). Current path: ((SOME-FUNCTION) (TOP-LEVEL PTR-FAIL-HANDLE)). FAILURE-CAUSING: Why are we back here? NIL
As you can see, it's only the FAILURE-CAUSING function that gets called, not PLAN-REPAIRED, which we inserted via code-replacement. What happened? If you inspect the task tree associated with PTR-FAIL-HANDLE, you'll notice that there are 2 children for the (TOP-LEVEL PTR-FAIL-HANDLE) node, ( (SOME-FUNCTION) …) and ( (SOME-FUNCTION : CALL 2) …). This is because the retry does a non-local exit and returns to the beginning of the &BODY in the WITH-FAILURE-HANDLING clause, where it encounters a call to SOME-FUNCTION, and it sees that there is still a task object associated with ( (SOME-FUNCTION) …). Therefore (see the “laws” above) it creates a new sibling node in which the code replacement has no effect, and so it eventually calls FAILURE-CAUSING again (which this time simply terminates, to avoid an infinite loop in the example). However if you run ptr-fail-handle again the answer you see looks like this:
PLAN-REPAIRED: Hi there. This was inserted via code replacement. NIL
(You'll get the same answer if you (setf *stop-infinity* 0) and then run ptr-fail-handle.)
This is because after a top-level function completes, all task objects in the task tree are set to nil. Therefore, when we rerun the plan and reach the call to some-function again, it's ( (SOME-FUNCTION) …) that is the current path and we have the code replacement set up here.
WITH-FAILURE-HANDLING has been intended to preserve and log all attempts in the task tree. However, in case you want your plan to actually change during runtime, rather than having to wait until it ends and runs again, you will need a new CRAM macro, with-transformative-failure-handling. The code looks exactly as before, with just a change to the top-level:
(def-top-level-cram-function ptr-fail-handle () (with-transformative-failure-handling ((plan-failure (f) (let ((code-path (plan-failure/get-code-path f))) (format T "Code path of failure: ~A.~% Current path: ~A.~%" code-path *current-path*) (replace-task-code '(plan-repaired) #'plan-repaired code-path) (retry)))) (progn (some-function))))
Let's run the new code (as well as reset some global variables to get a clean environment):
(remove-top-level-task-tree 'ptr-fail-handle) (setf *stop-infinity* 0) (ptr-fail-handle)
The answer should show that the top-level calls the failure causing function once, then changes itself, then will call the replacement function immediately thereafter:
FAILURE-CAUSING: Will attempt to send a fail signal from ((FAILURE-CAUSING) (SOME-FUNCTION) (TOP-LEVEL PTR-FAIL-HANDLE)). Code path of failure: ((FAILURE-CAUSING) (SOME-FUNCTION) (TOP-LEVEL PTR-FAIL-HANDLE)). Current path: ((SOME-FUNCTION) (TOP-LEVEL PTR-FAIL-HANDLE)). PLAN-REPAIRED: Hi there. This was inserted via code replacement. NIL
If you inspect the task tree for ptr-fail-handle now, you will see there is no :CALL 2 node. Unlike with-failure-handling, with-transformative-failure-handling will only store the last attempt.
Using projection runs to guide plan transformation
Besides reacting to runtime conditions, another possible use of plan transformation is to try out several plan structures “in simulation” (what CRAM refers to as inside a projection environment), select the best plan structure, then rerun it “for real”, outside projection. To assist in writing such a plan, the with-transformative-tryouts macro is defined in and exported by the cram-projection package. Below is an example code snippet showing a very simple use of the macro, which you can run in the roslisp repl.
(swank:operate-on-system-for-emacs "cram-projection" (quote load-op)) (in-package :cpl-impl) (cram-projection:define-projection-environment example-env) (def-top-level-cram-function tryprj () (cram-projection:with-transformative-tryouts example-env (progn (format T "Got result ~A~%Now retrying for real.~%" (cram-projection:projection-ended/get-projection-outcome *projection-signal-data*)) (setf *in-projection-environment* nil) (retry)) (format T "Working in environment ~A~%" (if *in-projection-environment* 'example-env nil)) (format T "Doing stuff.~%"))) (tryprj)
The result from tryprj should look like this:
Working in environment EXAMPLE-ENV Doing stuff. Got result #S(CRAM-PROJECTION::PROJECTION-ENVIRONMENT-RESULT :NAME EXAMPLE-ENV :RESULT (NIL) :ENVIRONMENT ((*EPISODE-KNOWLEDGE* . #<LIVE-EPISODE-KNOWLEDGE {1002B609F3}>))) Now retrying for real. Working in environment NIL Doing stuff. NIL
We'll now look at what happens and explain the code inside tryprj.
The parameters for cram-projection:with-transformative-tryouts are
- a PROJECTION-ENVIRONMENT-NAME (same semantics of the parameter as with the with-projection-environment macro: the parameter will be converted to a symbol so you must not write 'example-env; instead, use example-env).
- a TRANSFORMATION-CLAUSE in which the plan should analyze the results from projection, decide whether/how to transform BODY, decide whether to run in projection or not. Some notes on TRANSFORMATION-CLAUSE:
- will typically be a block of code inside a PROGN.
- will typically end in RETRY. Ending in RETURN is also possible. In that case, TRANSFORMATION-CLAUSE will not call BODY again, but instead the macro will return NIL (or whatever value is in the RETURN S-exp) to its caller.
- TRANSFORMATION-CLAUSE has access to the results of projection by (cram-projection:projection-ended/get-projection-outcome cpl-impl:*projection-signal-data*). In the example above we only print the results.
- the path of BODY inside the task tree is accessible to TRANSFORMATION-CLAUSE via cpl-impl:*retry-path*.
- cpl-impl:*in-projection-environment* is a flag that TRANSFORMATION-CLAUSE can set to control whether the next run of BODY will be inside the projection environment or not. At the start of the macro, cpl-impl:*in-projection-environment* is set to T (BODY will run in projection). In the example, we set it to NIL to also have a run outside the projection environment.
- any special variables acting as parameters for BODY can be changed in TRANSFORMATION-CLAUSE. In the example, we use cpl-impl:*in-projection-environment* as such a parameter.
- if plan transformations should be enabled for BODY, it must contain at least one CRAM-FUNCTION.
- BODY, a list of S-expressions that should be run. The macro return value is the return value of BODY.
Just like with-transformative-failure-handling, with-transformative-tryouts will store only the latest run of BODY in the task tree.
Changing parameters with plan transformation
Remember one of the previous laws of the task tree:
When a task tree node is run, the value of PARAMETERS in either CODE or (car CODE-REPLACEMENTS) is set to the value of the parameters at the moment of the function call.
This means that, typically, you can't change parameters through plan transformation as previously implemented in CRAM. However, it's useful to distinguish two types of parameters for a cram-function:
- “run-time” parameters. These take their values from the actual runtime context that a cram-function is called in. This is the only behavior available in the main CRAM code branch.
- “compile-time” parameters. These are fixed when the code is generated, either at compile time or after plan transformation. You can think of them as parameters to tweak how a function runs, but they can also be adjusted to reflect the runtime context. Their main use is to allow plan transformation to affect them.
To support “compile-time” parameters, a new construct was just added to cpl, def-ptr-cram-function, and its workings are explained below. Let's first load a sample code into a new repl window:
(swank:operate-on-system-for-emacs "cram-language" (quote load-op)) (defparameter par-1 1) ;; just a couple variables to give some 'runtime' context for our cram functions to work in (defparameter par-2 1) (cpl-impl:def-ptr-cram-function example-ptr (P &rest args) (format T "Inside example-ptr.~%") (format T "Received arguments ~a~%" args) (format T "PTR parameter is (~a)~%Will now fail, so as to trigger plan transformation.~%" P) (cpl:fail 'cpl-impl:plan-failure)) (defun compatible-ptr (P &rest args) (format T "Inside compatible-ptr.~%") (format T "Received arguments ~a~%" args) (format T "PTR parameter is (~a)~%Done!~%" P)) (cpl-impl:def-top-level-cram-function try-ptr () (cpl-impl:with-transformative-failure-handling ((cpl-impl:plan-failure (f) (let ((code-path (cpl-impl::plan-failure/get-code-path f))) (format t "Will replace code now (and switch the ptr-parameter) ...~%") (cpl-impl:replace-task-code '(compatible-ptr args) #'compatible-ptr code-path :ptr-parameter "Oranges") (cpl-impl:retry)))) (setf par-1 (+ par-1 par-2)) (setf par-2 (- par-1 par-2)) (example-ptr "Apples" par-1 par-2 0)))
Of these, example-ptr is the “star” of the show. Notice that it's been defined with def-ptr-cram-function, and what that tells CRAM is that the first argument in the supplied lambda list (in this case, P) is supposed to be a “compile-time”, or plan-transformation accessible, parameter.
ptr-cram-functions are a bit peculiar. When you call them the first time they behave like regular cram functions, and take their parameters from the ones you supplied. When you call them again (meaning, when there's a node in the task tree corresponding to the plan location you're calling them from), they ignore the first parameter in the lambda list you supply them with and replace it with the ptr-parameter stored in the task tree node. ptr-cram-functions do all this automatically; you don't need to write anything different when using them. In the example above, the ptr-cram-function uses the P parameter as if it were a usual parameter.
To actually change the compile time parameter in a node, notice the extra &key parameter in replace-task-code:
(cpl-impl:replace-task-code '(compatible-ptr args) #'compatible-ptr code-path :ptr-parameter "Oranges")
By default the :ptr-parameter to cpl-impl:replace-task-code is set to the previously effective value of ptr-parameter that is stored in the task tree node. This either the ptr-parameter in the code slot of the node (if there are no code replacements), or the ptr-parameter in the car object in the code-replacements list of the node. When a task tree node is created and a ptr-parameter is not specified, the default is nil.
In our example here, the compile-time parameter is a string. It can however be anything, including a struct or a list, for when you need to pass on several objects through plan transformation.
Let's run the top-level function and see what happens (trace below shows a copy-paste from the repl window) …
CL-USER> (try-ptr) Inside example-ptr. Received arguments (2 1 0) PTR parameter is (Apples) Will now fail, so as to trigger plan transformation. Will replace code now (and switch the ptr-parameter) ... Inside compatible-ptr. Received arguments (3 2 0) PTR parameter is (Oranges) Done! NIL CL-USER> (try-ptr) Inside compatible-ptr. Received arguments (5 3 0) PTR parameter is (Oranges) Done! NIL CL-USER> (try-ptr) Inside compatible-ptr. Received arguments (8 5 0) PTR parameter is (Oranges) Done! NIL
Notice how the transformed plan persists between plan runs, including the compile-time parameter, even though the run-time parameters change as expected.
Task tree serialization
Task trees are, as we've seen, not just a log of values but effectively a program themselves, because they track whether code replacements exists, and which functions should be used as replacements. It may be useful at times to be able to store task trees persistently, including code replacements, and we will look at de/serialization next. The assumption is you have just run the example above in your REPL, and you therefore have a task tree for the ptr-fail-handle function.
To store the task tree, use the following:
(swank:operate-on-system-for-emacs "cram-execution-trace" (quote load-op)) (store-tree (get-top-level-task-tree 'ptr-fail-handle) "~/example.dat")
It's more interesting if we restore from an empty REPL, so close your REPL and rerun it. You will need to load the cram basic packages cram_language and cram_execution_trace:
(swank:operate-on-system-for-emacs "cram-language" (quote load-op)) (swank:operate-on-system-for-emacs "cram-execution-trace" (quote load-op)) (in-package :cpl-impl)
We'll restore the task tree in a moment, but before we do we need to reload the previous code:
(defvar *stop-infinity* 0) (setf *stop-infinity* 0) (remove-top-level-task-tree 'ptr-fail-handle) (def-cram-function failure-causing () (if (> 1 *stop-infinity*) (progn (setf *stop-infinity* (+ 1 *stop-infinity*)) (format T "FAILURE-CAUSING: Will attempt to send a fail signal from ~A.~%" *current-path*) (fail 'plan-failure))) (format T "FAILURE-CAUSING: Why are we back here?~%")) (def-cram-function some-function () (failure-causing)) (defun plan-repaired () (format T "PLAN-REPAIRED: Hi there. This was inserted via code replacement.~%")) (def-top-level-cram-function ptr-fail-handle () (with-transformative-failure-handling ((plan-failure (f) (let ((code-path (plan-failure/get-code-path f))) (format T "Code path of failure: ~A.~% Current path: ~A.~%" code-path *current-path*) (replace-task-code '(plan-repaired) #'plan-repaired code-path) (retry)))) (progn (some-function))))
This is because we cannot store function executable code with cet:store-tree (for now, at least). Rather, what we can re/store are function names. Typically, you will have your plans in source packages that you load anyway; restoring the task tree is simply there to reload any code replacements that were found useful during execution. Notice that in the code we loaded the top-level function calls some-function, which in turn calls failure-causing.
Let's then restore the task tree and rerun the top-level function.
(setf (gethash 'ptr-fail-handle *top-level-task-trees*) (cet:restore-tree "~/example.dat")) (ptr-fail-handle)
The response shows that the plan-repaired function gets called instead of failure-causing.
The following point is very important. In order for a replacement to be restorable, it must be that the function is named. This is ok:
(replace-code-task '(some sexp) #'some-named-function some-code-path)
This is ok as far as code replacement goes, but is not de/serializable (yet), which means it cannot be restored:
(replace-code-task '(some sexp) (lambda (some-args) (some-named-function some-args)) some-code-path)
If you use the second version, then the restored tree will just use the default function code, rather than any replacement (because the restored replacement is nil).