
Blog


    February 14

    The operating system knows best - coarse grain concurrency using pipes

    In my previous post I wrote about how current languages (and runtimes) have a hard time exploiting the benefits that the new multicore processors bring. That post mainly dealt with exploiting concurrency on a fine grain level (e.g. loops).
    I don't know what triggered my realization that there is an old-school mechanism in both Windows and Un*x that enables many scenarios of coarse grain concurrency: anonymous pipes. It was either the announcement of Yahoo! Pipes or my reluctance to rewrite a bunch of scripts that interact heavily via command line IO.
    How do anonymous pipes enable coarse grain concurrency? Say we have a problem that requires some multi-step sequential manipulation/filtering of some kind of data. One way to attack this would be to code this sequence into the main routine of the program in your favorite programming language and pass the data around in programming language structures or references to them. Works fine, except that we'd have a hard time getting any of these steps to run concurrently on a multicore processor with today's languages/runtimes.
    The better way with regard to concurrency is to split the steps into their own little programs, each of which receives its input via STDIN and writes the result of its processing to STDOUT. To execute all steps in sequence you just string them together with anonymous pipes:
    s1|s2|s3|...|sn
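A minimal sketch of such a chain, written in Python and assuming two made-up steps (an uppercasing filter and a line-numbering filter, both hypothetical and not from any real toolset). Each step is started as its own process and the processes are connected with anonymous pipes, which is roughly what the shell does for s1|s2:

```python
import subprocess
import sys

# Hypothetical steps s1 and s2: each is a tiny stand-alone program that
# reads STDIN and writes STDOUT, here passed to a fresh interpreter via -c.
S1_UPPERCASE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
)
S2_NUMBER_LINES = (
    "import sys\n"
    "for n, line in enumerate(sys.stdin, 1):\n"
    "    sys.stdout.write(f'{n}: {line}')\n"
)


def run_pipeline(step_sources, data):
    """Run every step as its own OS process, chained like s1|s2|...|sn."""
    procs = []
    for source in step_sources:
        # The first step reads from a pipe we fill ourselves; later steps
        # read the previous step's STDOUT directly.
        upstream = procs[-1].stdout if procs else subprocess.PIPE
        proc = subprocess.Popen(
            [sys.executable, "-c", source],
            stdin=upstream,
            stdout=subprocess.PIPE,
        )
        if procs:
            # Drop our copy of the pipe so the upstream step is notified
            # if the downstream step exits early.
            procs[-1].stdout.close()
        procs.append(proc)

    # Fine for small inputs; large inputs should be fed from a separate
    # thread so the pipe buffers cannot fill up while nobody reads the output.
    procs[0].stdin.write(data)
    procs[0].stdin.close()
    output = procs[-1].stdout.read()
    for proc in procs:
        proc.wait()
    return output


if __name__ == "__main__":
    result = run_pipeline([S1_UPPERCASE, S2_NUMBER_LINES], b"hello\nworld\n")
    sys.stdout.write(result.decode())
```

Both steps run at the same time: the second one starts numbering lines as soon as the first one has written them, and the operating system is free to place the two processes on different cores.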
    There are many benefits:
    • Each step runs in its own process and is scheduled by the operating system according to its resource needs.
    • The steps can be written in different programming languages.
    • Steps can be easily exchanged without recompiling.
    • Additional filtering steps can be inserted into the chain (provided they obey the data formats).
    Of course there are a lot of things to be cautious about:
    • Do the steps warrant creating individual processes? (nowadays process creation seems to be relatively cheap)
    • Can the data be serialized efficiently into a byte stream? (Note that I don't say character stream here - character encoding in a command shell is a topic for another post - see this older post of mine on this topic).
    • Do the performance benefits of exploiting the multiple cores outweigh the additional overhead of process creation and data serialization?
    • Does the OS do a better job of using the resources than my code can? (Considering that the OS developers have spent years optimizing this, the odds are against you.)
    • Debugging could be harder.
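To make the serialization question a bit more concrete, here is a small Python sketch of one possible byte-stream format. The record layout (a 32-bit id plus a 64-bit float score) is purely an assumption for illustration; a real pipeline would need whatever layout its steps agree on.

```python
import io
import struct

# Hypothetical record layout: a 32-bit unsigned id followed by a 64-bit
# float score, in network byte order so steps on any platform agree.
RECORD = struct.Struct("!Id")


def write_records(stream, records):
    """Serialize (id, score) pairs as fixed-size binary records."""
    for rec_id, score in records:
        stream.write(RECORD.pack(rec_id, score))


def read_records(stream):
    """Yield (id, score) pairs until the byte stream is exhausted."""
    while True:
        chunk = stream.read(RECORD.size)
        if not chunk:
            return
        if len(chunk) < RECORD.size:
            raise ValueError("truncated record")
        yield RECORD.unpack(chunk)


if __name__ == "__main__":
    buffer = io.BytesIO()  # stands in for the pipe between two steps
    write_records(buffer, [(1, 0.5), (2, 2.25)])
    buffer.seek(0)
    for rec_id, score in read_records(buffer):
        print(rec_id, score)
```

In an actual step the stream would be sys.stdout.buffer on the writing side and sys.stdin.buffer on the reading side, which keeps character encoding out of the picture entirely.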

    The Lambda the Ultimate blog had a good discussion on this and the relation to functional programming concepts a little over a year ago.

     

    October 17

    Multicore processors or how to choose a programming language for the next 5 years

    I love the open source scripting languages: Perl, Python, PHP and Ruby -  the P in LAMP (ok, they have to rename the last one). They are easy to pick up, support different programming paradigms, are available for many platforms, have extensive libraries, have great communities driving them forward, are free (as in beer and freedom) ... I could go on and on.
     
    All of them are in different phases of growing up. Perl 6, Python 3000, Ruby 2.0 all promise bigger and better things in terms of language design and functionality. But (judging from my web searches) not many people talk about how the next revolution in programming - multicore computing - affects the language runtimes (I know, big words, but bear with me).
     
    The battle between PC hardware companies is heating up again; ads for multicore processors are shown on TV during primetime. What happens, though, when you run a Perl script containing code like this on one of these shiny new multicore machines?
    # Apply the expensive function to every element, one element at a time.
    for (my $i = 0; $i < 1000000; $i++)
    {
        $array[$i] = some_expensive_function($array[$i]);
    }
    One of the cores is awfully busy while the others sit idle! Assuming the function some_expensive_function has no side effects, the task could easily be split up among the different processor cores.
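For comparison, this is roughly what splitting the loop by hand looks like, sketched in Python rather than Perl. The stand-in for some_expensive_function (math.factorial), the helper name parallel_map and the chunk size are all assumptions made for the example, and the "fork" start method limits the sketch to Unix-like systems.

```python
import math
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

# Hypothetical stand-in for some_expensive_function: CPU-bound and free of
# side effects. A library function is used so worker processes can import it.
some_expensive_function = math.factorial


def parallel_map(values, workers=None):
    """Apply some_expensive_function to every element using several processes."""
    # Assumption: a Unix-like OS where the "fork" start method is available.
    # On Windows, use the default context and call this from under an
    # `if __name__ == "__main__":` guard instead.
    context = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=workers, mp_context=context) as pool:
        # chunksize batches elements so the inter-process overhead is paid
        # per batch instead of per element.
        return list(pool.map(some_expensive_function, values, chunksize=64))


if __name__ == "__main__":
    array = list(range(200, 400))
    array = parallel_map(array)
    print(len(array), "elements processed")
```

The results come back in the original order, so the call can replace the loop directly, but only because the function has no side effects. The explicit pool and the chunking are exactly the kind of extra work the sequential loop does not need.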
     
    I hear you saying: "But yes, of course the script has to be multi-thread enabled to make use of all the cores." However, this requires additional, non-trivial work - as Herb Sutter says in his excellent 2005 article: "The free lunch is over". Herb urges everybody to brush up on their skills in writing multi-threaded applications. He says: "Implicitly parallelizing compilers can help a little, but don’t expect much; they can’t do nearly as good a job of parallelizing your sequential program as you could do by turning it into an explicitly parallel and threaded version."
     
    This is one way to approach the problem - what I would call "handcrafting" your parallelism. I'm sure you can get very well performing applications out of this; applications like the video encoders used to measure multicore performance are already multi-thread enabled today.
     
    That is, if you can get this handcrafting right - the web is full of tales of multi-threaded programming gone bad. Such applications are notoriously hard to debug.
     
    Is there a better solution? If I have to (re-)learn concepts, is there one that deals with this problem a little more elegantly?
     
    Turns out there is: functional programming. From the Wikipedia entry: "Disallowing side effects provides for referential transparency, which makes it easier to verify, optimize, and parallelize programs, and easier to write automated tools to perform those tasks".
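A tiny Python illustration of why the absence of side effects matters here. The function names are made up for the example. A pure function gives the same per-element results no matter in which order the elements are processed, which is what a parallelizing runtime needs; a function with hidden state does not.

```python
def pure_scale(x, factor=3):
    """Pure: the result depends only on the arguments."""
    return x * factor


def make_running_total():
    """Return an impure function that keeps state between calls."""
    total = 0

    def add(x):
        nonlocal total
        total += x
        return total

    return add


def evaluate_in_order(func, values, order):
    """Call func on values[i] in the given index order; results keep their position."""
    results = [None] * len(values)
    for i in order:
        results[i] = func(values[i])
    return results


if __name__ == "__main__":
    values = [1, 2, 3, 4]
    forward, backward = [0, 1, 2, 3], [3, 2, 1, 0]
    # Same results either way: safe to hand out to different cores.
    print(evaluate_in_order(pure_scale, values, forward))
    print(evaluate_in_order(pure_scale, values, backward))
    # Results depend on the evaluation order: not safe to parallelize blindly.
    print(evaluate_in_order(make_running_total(), values, forward))
    print(evaluate_in_order(make_running_total(), values, backward))
```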
     
    Excellent - so I just have to adopt the functional programming constructs available in Perl, Python and Ruby and the things that are parallelizable will be parallelized automatically for me? Wishful thinking for now, unfortunately. I couldn't find any info that any of the present runtimes are thread-aware (not just thread-safe), especially for functional programming constructs.
     
    What to do? Wait for the new Parrot, YARV, CPython runtimes? Possibly contribute there? Judging from the Perl 6 history this could take a while.
     
    Use one of the functional programming languages like Erlang or F#? Certainly attractive from a learning point of view, but I would always have to trade off at least one of the advantages of the P languages mentioned at the start of the post.
     
    Fortunately there seems to be a way out: Python. For Python, unlike for Perl and Ruby, there are multiple runtimes, among them the Java VM and the .NET runtime. There is a strong motivation for Sun and Microsoft to make the bytecode of these runtimes work as fast as possible on multicore machines. All we need now is for IronPython and Jython to analyze the functional constructs and parallelize them automatically if possible.
     
    For the really tricky performance bottlenecks there will be C/C++ extensions using the tried and tested OpenMP.
     
    Natural language processing requires a lot of computing power. Multicore machines promise to make this available.
     
    Update: Just found out about RubyCLR. Along with JRuby, this will make it possible to target .NET and Java with Ruby. Both seem to be less mature than IronPython and Jython, though.