Multi-core programming may actually require more than one paradigm. Some current contenders are:
- MapReduce. This works well where a problem decomposes easily into independent parallel chunks (see the first sketch after this list).
- Nested Data Parallelism. This is similar to MapReduce, but it supports recursive decomposition of a problem, even when the recursive chunks are of irregular size (see the second sketch below). Look for NDP to be a big win in purely functional languages running on massively parallel but limited hardware (like GPUs).
- Software Transactional Memory. If you need traditional threads, STM makes them bearable (see the third sketch below). You pay a real performance hit in critical sections (figures around 50% are commonly quoted), but you can scale complex locking schemes to hundreds of processors without pain. This will not, however, work for distributed systems.
- Parallel object threads with messaging. This really clever model (essentially the actor model) is used by Erlang. Each “object” becomes a lightweight thread, and objects communicate by asynchronous messages and pattern matching (see the final sketch below). It’s basically true parallel OO. It has succeeded nicely in several real-world applications, and it works great for unreliable distributed systems.
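
To make the contrast concrete, here are minimal sketches of each paradigm, all in Haskell. First, MapReduce: map a function over independent chunks in parallel, then reduce the partial results. This assumes the `parallel` package, and the `mapReduce` helper is my own illustration, not a real library function. Build with `ghc -threaded` and run with `+RTS -N` to use multiple cores.

```haskell
import Control.DeepSeq (NFData)
import Control.Parallel.Strategies (parMap, rdeepseq)

-- "Map" each chunk to a partial result in parallel, then "reduce"
-- the partial results with a sequential fold.
mapReduce :: NFData b => (a -> b) -> (b -> b -> b) -> b -> [a] -> b
mapReduce f combine z chunks = foldr combine z (parMap rdeepseq f chunks)

main :: IO ()
main = do
  let chunks = [[1 .. 1000], [1001 .. 2000], [2001 .. 3000]] :: [[Int]]
  -- map: sum each chunk in parallel; reduce: add the partial sums
  print (mapReduce sum (+) 0 chunks)
```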
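Second, the nested, recursive flavor: summing an irregularly shaped tree, where every subtree becomes a parallel task regardless of its size. Fair warning: this only approximates NDP with GHC sparks; true NDP systems (NESL, Data Parallel Haskell) compile the nesting away into flat data-parallel operations suitable for SIMD hardware.

```haskell
import Control.Parallel (par)

-- An irregularly shaped tree: nodes may have any number of children.
data Tree = Leaf Int | Node [Tree]

-- Recursive parallel sum: each subtree is sparked as a parallel task,
-- even though subtrees vary wildly in size.
psum :: Tree -> Int
psum (Leaf n)  = n
psum (Node ts) = foldr par (sum parts) parts
  where parts = map psum ts

main :: IO ()
main = print (psum (Node [Leaf 1, Node [Leaf 2, Leaf 3], Leaf 4]))
```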
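Third, here is what STM's "bearable threads" look like, using GHC's standard `stm` package. The whole `transfer` runs as one atomic transaction: no lock ordering to get wrong, and it composes with other transactions. The bank-account example is mine, not from any particular library.

```haskell
import Control.Concurrent.STM

-- Move money between two accounts: the block commits atomically, or
-- automatically retries if another thread interfered.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  check (balance >= amount)        -- blocks (retries) until funds suffice
  writeTVar from (balance - amount)
  modifyTVar' to (+ amount)

main :: IO ()
main = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)
  readTVarIO a >>= print           -- 70
  readTVarIO b >>= print           -- 30
```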
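Finally, an Erlang-flavored sketch using plain Haskell channels and lightweight threads. Erlang itself would use `spawn` and a `receive` block; the `Msg` type and `actor` loop here are illustrative stand-ins for that pattern.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)

-- One message type per kind of "object"; the receive loop
-- pattern-matches on it, much like an Erlang receive block.
data Msg = Ping (Chan String) | Stop

-- Each "object" is a lightweight thread draining its own inbox.
actor :: Chan Msg -> IO ()
actor inbox = do
  msg <- readChan inbox
  case msg of
    Ping reply -> writeChan reply "pong" >> actor inbox
    Stop       -> return ()

main :: IO ()
main = do
  inbox <- newChan
  _     <- forkIO (actor inbox)
  reply <- newChan
  writeChan inbox (Ping reply)   -- asynchronous send: does not block
  readChan reply >>= putStrLn    -- prints "pong"
  writeChan inbox Stop
```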
Some of these paradigms give you maximum performance, but only work if the problem decomposes cleanly. Others sacrifice some performance, but allow a wider variety of algorithms. I suspect that some combination of the above will ultimately become a standard toolkit.