Dataflow programming in modern programming paradigms: Specification, lack of sleep and crushing my mind with lazy concurrent streams

Completed and handed in the specification document for my project last week. The final document can be found here.

The last week has been extremely busy, and so has the weekend, resulting in tangible work being low, although I have managed to press on with reading the Lucid book as well as spent several hours trying to fathom how to produce an efficient system of lazy concurrent streams.

I've made a few attempts at implementing lazy streams with my Ruby Dataflow Extension, but keep getting stumped at the last minute. In my head it seems easy, but when I sit down at the computer to write code I find myself knee deep in partially applied functions, deadlocks and traps. Extending Ruby to allow currying of functions was a definite help.
Here is the lovely code that lets me hack currying and partial application into Ruby:-

# Code based on "Whys" blog post on functional extensions in Ruby.
class PartialFun < Proc
  def call(*args)by
    if args.size < arity.abs and args.size != -arity - 1
       Fun.new{|*a| call(*(args+a)) }
    else
      super
    end
  end

  alias_method :[], :call
end

# CurryFun constructor wrapper

def CurryFun &block
  PartialFun.new &block
end

# Example function
def curry_foo
 CurryFun{|x, y, z| x+(y*z)}
end

One concept in my head is explainable via the examples now laid out.
Lets take the example:-

x = x fby 1
y = next x

This is all fine and dandy in Lucid, where y will be outputting 1 continually on its stream. Lucid is lazy. In my RDE (Ruby Dataflow Extension), dataflow programming occurs via concurrent nodes on top of threads, not on an eduction model (which I presume Lucid works on), and this concurrent system is eager.
Hence the previous example fails as x is nothing, so 'x fby 1' is left trying to read from x, and 'next x' also sees nothing.

My thought is that if the node 'fby' tries to read from x and does not receive anything, then it outputs on its buffer a closed function that attempts to read from buffer x.
The node then continues as normal and reads 1, and outputs that continually to its output buffer.
The next node reads the "next" item on the output buffer of x, hence skipping over the function on the buffer and reading 1.

However, lets say that we don't have this following program:-

a = some_slow_running_node
x = a fby 1
y = next x
z = x

Taking the model of operation as before, fby may try to read from the output buffer of a, but as a is a slow running node, it doesn't read anything immediately, so flags there is no result yet, or writes out a function. This doesn't matter for y as y only requires next x. However we now have z, which replicates all that is coming from x. If x outputs a function, z could try and apply the function. Lets say that fby didn't read anything from a immediately, so wrote a function onto its output buffer. A function whose operation is to read from a. z would read this function, and could apply it to see if we have a value for a yet. Could this work in a more complex situation? I'm not sure yet.
For example the "howfar" program presented in Section 3 of Wadge's Lucid book:-

howfar
where
howfar = if x eq 0 then 0 else 1 + next howfar fi;
end;

This program calculates how many items there are on the input stream before a 0 is encountered.
E.g. If the input stream to howfar is:-
1, 6, 4, 0, 9, 2, 1, 4, 3, 0, 6...
The output would be:-
3, 2, 1, 0, 5, 4, 3, 2, 1, 0, 1...

With the previous model of passing functions onto streams the expression 1 + next howfar could be rolled into a partially applied function, whose opeartion is 1 + reading the next value of the howfar buffer. More formal reasoning or tests are needed to ascertain if this will be a viable solution to producing concurrent lazy streams.

This does leave me with a nasty taste in my mouth though, it feels like I'm destroying the type system for the streams, as now not only can a stream be of type a, it must also be able to hold functions. Perhaps two streams masquerading as one would be better, one stream presenting the strongly typed item stream, and another subversive counterpart, holding any functions partially applied or not.

I have thought about a request system where results are 'pulled' through the system, with nodes making requests for computation to other nodes, but this destroys the whole joy of having a concurrent data flow network in which some nodes could be operating much faster and could do a lot of useful work while other nodes are taking longer.

The thinking continues, along side getting familiar with Lucid. My objectives for this week are to be researching into GLU and Gypsy, I'm hoping that I can do this at the weekend after much reading of the Lucid book this week.

Dataflow programming in modern programming paradigms

Tuesday, October 24, 2006

Specification, lack of sleep and crushing my mind with lazy concurrent streams

1 Comments:

About Me

Previous Posts