Monday, August 24, 2009

Abstracting Ambiguity

The Quantum::Superpositions module provides a facility for ambiguous computation in Perl. This idea has been added to Perl 6 under the name "Junctions". This data type lets you treat a list of alternatives for a value as thought that list were a single scalar.

Implementing a backtracking search is more work than a simple result validator. Backtracking complicates control flow and requires you to manage those lists of alternatives[1]. The added effort pays off in efficiency compared to an exhaustive search built on top of a simple validator, which only needs to worry about a single value for each parameter at any given point.

Junctions work by overloading a single datum to behave as if it were multiple values simultaneously. They lets pretend you're checking a single result, when you're actually checking many possible results at the same time.

If you're played a bit with Perl 6 you probably know how much fun it is to use junctions to simplify managing those lists of alternative values.

My friend awwaiid recently introduced me to the amb operator invented by John McCarthy. It reads a little like the any junction constructor:

my $x = amb(1, 2, 3);
my $y = amb(1, 2, 3);

amb_assert($x >= 2);
amb_assert($x + $y == 5);
amb_assert($y <= 2);

say "$x + $y = 5"; # prints 3 + 2 == 5

But this is where the resemblance ends[2]. Instead of overloading the data stored in $x and $y, amb overloads the control flow of the expressions that assign to $x and $y.

When amb is invoked it captures a continuation, and then invokes it with the first value it was given (this causes that value to be returned from amb). Execution then resumes normally; in the first expression $x is assigned the value 1 because that's what the continuation was invoked with.

Eventually the first assertion is evaluated and fails because $x >= 2 is false. At this point we backtrack to the last amb (kind of like using die). Since the continuation that assigns the value to $x has been reified we can use it again with the next value, 2.

What amb is doing is injecting the control flow required for backtracking into the stack, adding backtracking retrying logic. Every amb expression is an implicit backtracking marker.

The above example is taken from the Continuation::Delimited test suite. I think this is a very nice example of how continuations (delimited or otherwise) can be used to create powerful but simple constructs. By treating program flow as first class data we can create abstractions for the patterns we find in it.

Delimited continuations allow you to wrap, discard and replicate pieces of the program's flow; these are all simple manipulations you can do to first class data. While traditional first class functions let you do these things to operations that haven't happened yet, with continuations you can manipulate operations that are happening right now, enabling the expressiveness of higher order functions to be applied very interesting ways.

The real beauty of continuations is that the program's flow is still written in exactly the same way as before. Continuations are captured and reified from normal, unaltered code, which is why they are such a general purpose tool.

[1] It gets even harder if mutable state is involved, since backtracking would require rolling that state back before retrying. Yet another reason to prefer immutability.

[2] This example can't be translated directly to junctions. The predicate checks don't (and can't) modify the junctions, so they do not compose with each other implicitly.

Although junctions only provide a limited aspect of nondeterministic programming that doesn't mean they are less useful than amb, they're just different. When collecting permuted results is what you really want then overloading the data (as opposed to control flow) is a more direct approach.

The expression 1|2|3 + 1|2|3 == 5 evalutes to true because at least one of the permuted sums is equal to 5. Although it can't tell you which of the summands amount to 5, the ability to get a junction for the sums (1|2|3 + 1|2|3) is useful in its own right.

7 comments:

kappa said...

In case you didn't see this: http://search.cpan.org/dist/Amb/

Luke Palmer said...

Or you can overload control flow and data at the same time...

do {
x <- [1,2,3];
y <- [1,2,3];
guard (x > 2);
guard (x + y == 5);
guard (y <= 2);
return (show x ++ " + " ++ show y ++ " = 5")
}

A major contributor to my stifled inspiration for Perl 6 was realizing how much Junctions wanted to be Monad []. But it was no fault of the Junctions: it was their society keeping them from realizing their true potential.

nothingmuch said...

You can do that in Haskell because every expression in a do block is an implicit lambda and you can control the way they are sequenced based on the typeclass of the data involved.

Since mbind is effectively the increment-the-instruction-pointer operation for any data that is relevant or interesting, they are a more generic abstraction than continuations.

Unsurprisingly, continuations in Haskell are implemented in terms of a monad (which also makes the implementation very clean and simple, since mbind is already effectively CPS and there's no mutable data).

In Perl 6 this could theoretically still be done (e.g. by rewriting the AST) but in Perl 5 I am not sure there is a way to do this without suffering brain damage.

B::Generate is not stable enough and there are no APIs for compiling new lexical pad structures and assigning indices, it would be a massive effort (but if done the results would be awesome).

I guess by walking the optree and annotating some sort of ANF representation (op_next is pretty much ANF if you squint right) you could generate a minimal syntactic unit for new symbol introduction and convert all expressions into implicit subroutines, and then reparse that, but I think that'll end up being pretty brittle.

Anyway, if anyone has read this far, real monads are the next step ;-)

SamV said...

You could use this for solving problems like this:

send + more = money

for my $l (split "", "sendmorny") {
$l{$l} = amb(0..9);
$$l = $l{$l};
}
# must be unique
my %vals = map { $_ => 1 } values %l;
amb_assert(keys(%vals) == keys(%l));

# run off the expression
my $cry;
amb_assert(($cry=$d + $e)%10 == $y);
$cry = ($cry > 9 ? 1 : 0);
amb_assert(($cry=$n + $r + $cry)%10 == $e);
$cry = ($cry > 9 ? 1 : 0);
amb_assert(($cry=$e + $o + $cry)%10 == $n);
$cry = ($cry > 9 ? 1 : 0);
amb_assert(($cry=$s + $m + $cry)%10 == $o);
$cry = ($cry > 9 ? 1 : 0);
amb_assert($cry == $m);

$string = "$s$e$n$d + $m$o$r$e == $m$o$n$e$y";
amb_assert(eval $string);
print "$string\n";

see also my experiments with Junctions from 2005

nothingmuch said...

I was thinking of a related acme module, basically an object that overloads all operators to do predicate unification and when it's resolved into an actual value (e.g. '""' or '+0' overloading) that invokes amb as a thunk, and returns some value that satisfies the constraints.

Comparators in void context become assertions:

$x == $y; # this is true

and we get instant prolog.

The problem is that I have no idea how to capture continuations with actual C stack frames (i don't think that's even possible in a portable way), so that kills overloading.

Unknown said...

I can't wait to play with this!

I found a similar backtracking example in scheme a while back and ported it to Ruby (1.8):

http://pastie.org/596143

I wrote a perl version as well using Coro::State, but it's untested because I never got round to configuring support for the continuation "game" (as Marc Lehmann calls it) in Coro (it uses the callcc implementation from the POD):

http://pastie.org/596157

Unknown said...

You might find this old post of mine about doing some continuation based declarative coding in Perl of mild interest

http://www.perlmonks.org/?node_id=193649

it's a nice model for those times I miss Prolog :-)