October 29, 2009

Parallelism for the Masses?

I just returned from an interesting full-day seminar on parallelizing applications with Intel's tool suite, carrying the quite ambitious title "How to write bug-free and performance-optimized parallel (threaded) applications (Turning a serial into a parallel application)". The demonstration of Intel Parallel Studio's capabilities was quite impressive.

Clever Tools

Let Parallel Studio analyze your C/C++ code to detect performance hotspots and parallelizable code fragments. Then use OpenMP and add #pragma directives to actually "annotate" your code with parallelization hints for the compiler:

long sum = 0;
#pragma omp parallel for reduction(+: sum)
for (int i = 0; i < 1000000; i++) {
    sum = CalcSum(i, sum);
}

Pretty neat, isn't it? I found it quite surprising how far one can get nowadays just by using clever tools for (semi-)automatic parallelization of applications. Still: can you prove that the above code does not cause semantic errors? Remove the "reduction(+: sum)" clause from the pragma and you will get random results, because all threads then update the shared variable sum without synchronization.

Required: a radically new approach

Introducing technologies like OpenMP (or the new .NET parallel programming libraries, to mention a more recent one) doesn't help much to really improve the situation. It is like fixing your bathroom with duct tape: it works, but it is not a sustainable solution. In my opinion, going parallel - and thus making your code scalable, be it on multi-core hardware or as a service in the cloud - requires more than tools. Everyone who has ever written a piece of multithreaded code has also had to debug a concurrency issue. It doesn't matter how good you are or how much or little experience you have.

Actors enter the future stage

What does it take to write code that parallelizes and scales better? The problem with parallel code is shared state. Message passing avoids this problem, and the Actor Programming Model takes it even further. Systems like Erlang have a long history of successfully applying this totally different programming paradigm (at least totally different from today's mainstream, of course). Multi-core processors and cloud computing have revived this paradigm by increasing the need for parallel programming support. Microsoft's Axum and attempts like the "XCoordination Application Space" try to bring this paradigm to the masses on the .NET platform; Scala will likely be the next success on the Java platform (imho ironically by replacing the Java language on the JVM). I dare to predict that a lot of our programming future will be actor-based. And the platform that makes implementing actors most convenient has a good chance of becoming the next mainstream.

What are your thoughts on the future of programming? How will we handle challenges like debugging or orchestration of actor communication in this new world? Keen to hear your opinions and/or experiences!

October 20, 2009

What Software Architects can learn from History - a conceptual look at SOA and EDA

This blog post was inspired by the article "SOA through the looking glass" by Udi Dahan, recently published in the MSDN Architecture Journal, and a subsequent discussion with colleagues. One thing that quickly popped up was the different understandings of the terms SOA and EDA (we really should care about our own ubiquitous engineering language first ...). Here are my thoughts on services, events and why I think those concepts allow us to build better solutions, thus better meeting business needs - which is, after all, why we build applications in the first place.

How Lou Gerstner got IBM to dance

I do remember times when the big monolithic enterprise called IBM acted like a centralized mainframe in markets all over the world. Due to its organizational structure the company wasn't able to respond to new and changing market challenges within a reasonable timeframe. So why does IBM still exist? Because in the early 90s Lou Gerstner decided to restructure the whole company and split it into lots of small, autonomous profit centers ("How Lou Gerstner got IBM to dance"). Those profit centers act like small companies embedded within a large one, responsible for their own revenue as well as for finding the right partners to connect with, both inside and outside the enterprise, while still being able to leverage synergies from being part of a large corporation. As a matter of fact, companies like Amazon and Google are such effective market players because they are organized exactly this way: service-oriented. Service orientation is a quite natural concept describing a set of autonomous actors collaborating with each other, exchanging information by sending messages and reacting to external stimuli, also known as events. In his article, Udi Dahan correctly points out that service orientation and events are just two sides of the same coin.

The world doesn't block because you have a coffee

In contrast to an all-too-common misconception, I do not see SOA and EDA as technologies. Both describe organizational and structural concepts that apply to business organizations in the first place. Thus I see SOA and EDA as a way to translate those natural concepts into the way we build software. Instead of relying on leaky abstractions like synchronous remote method calls (crossing process boundaries may always fail for hundreds of possible reasons) and distributed transactions (what if something goes wrong during the commit phase?), software engineers need to take a good look at how the world they are trying to project into bits and bytes really works. The real world doesn't use 2-phase-commit; it uses compensation actions (see Gregor Hohpe's award-winning article "Starbucks Does Not Use Two-Phase Commit"). The real world also seldom uses synchronous commands. Instead we send messages and react to events when they occur. You don't get blocked when you receive an email - it is up to you to decide when to come back and read it later, and the sender is not blocked from her work while waiting for your response. The real world sometimes acts sequentially (you receive an email before you can answer it), but seldom synchronously. You might have forgotten to pay a bill? You will receive a friendly reminder letter, and if that crosses the payment you already made in the meantime, the letter will tell you to "safely ignore this letter" in that case. How does the real world stay agile and responsive? It uses small groups of people concentrating on a particular responsibility instead of large monolithic blocks responsible for everything and effectively achieving nothing.

Business Partners do not share database schemata

What is important about the word "autonomous"? I see its importance in the evolvability and flexibility of a system. Companies do not simply change the interfaces they expose to the partners they collaborate with. And two collaborating companies seldom use a shared database; they will even use different fields in their respective "customer" data structures. This is where the "Bounded Contexts" first described by Eric Evans come into play. Instead of falling into the trap of the "shared database" and "shared data structure" anti-patterns, where one single data schema or structure tries to please several different needs, split - for example - editing information like your personal profile on Facebook from searching the whole user database for a particular name. Those are different tasks, each requiring data structures and processes optimized for it. Sharing a database does not work in this case, not only because of the size of Facebook's user database, but also because changes to either of the tasks would become impossible. Instead of achieving simplicity in the database schema, one introduces complexity in the evolvability of the system, because shared structures create strong dependencies between components ("The 'One Truth Above All Else' Anti-Pattern").

Cloud Computing requires us to take a new look

Traditional software development strategies hit the wall even faster when it comes to developing solutions that run in the cloud. As Gianpaolo Carraro describes in his blog post "Head in the Cloud, Feet on the Ground" (and in MSDN Architecture Journal #17), business demands will sooner or later force us to move at least parts of the systems we build into a cloud environment. How do you build a system where service instances may come and go as needed? Where services simply move from one box to another without prior notice, controlled by the cloud operating system? Aside from other important implications, conventional synchronous method calls simply do not work anymore in such an environment. Different strategies are needed, and service orientation, asynchronously sent messages and reacting to events are part of the solution, leading to a system of collaborating services talking to each other in conversations.

It's always about balance

Of course, building software following those principles also means that you pull all those previously hidden complexities to the surface and into your application domain, instead of leaving them to the infrastructure layer, where they stay buried until something goes wrong and they bite you from behind. So you have to decide carefully where it is acceptable to use easy-to-write synchronous remote calls and accept the hidden risks: lack of scalability, and blocked callers eating up web server resources while calls pile up on a broken infrastructure. The alternative is to make this explicit in the organizational structure of the application system, partitioning functionality into several autonomous services that collaborate by sending messages and reacting to events. As usual, it is a matter of compromise and finding the right balance. You just need to be aware of the risks you accept when synchronously calling a web service method using SOAP over HTTP in exchange for writing a simple method call, versus the explicit - and therefore more complex - communication patterns between application components, which provide better flexibility and scalability due to higher decoupling.

We've already been there

Those ideas are not new. IBM's restructuring is just one example taken from reality. Even within the software engineering industry those ideas have been around for quite a while, and most of us have already used them. Probably not everyone is writing programs in Erlang, which has been successfully applying those ideas since the 1980s. But almost everyone has written GUI applications, haven't we? Well, there is this thing called the "Window Message Queue" and a technique called "reacting to user events". It is time to dust off this knowledge. Although IT history is rather short compared to other industries, architects should learn from it.