Friday, October 26, 2007

One of my favorite sites on the net is KernelTrap. Though KernelTrap describes itself as "... a web community devoted to sharing the latest in kernel development news," all of the heavy lifting is done by one person, Jeremy Andrews. So I would like to take this opportunity to say thank you to Jeremy for his tireless efforts at making KernelTrap a great site and one of my favorite destinations on the net.

The feature I use the most is on the home page: it's basically Jeremy summarizing and distilling the [essence of the] conversations that happen on many of the kernel development mailing lists. Anyone who is or has ever been a member of an extremely voluminous mailing list knows how noisy it can be; the worst case scenario is an abysmally low signal-to-noise ratio. Plus it's no fun exploring the list after the fact, because it becomes very tedious very fast, pointing and clicking your way through messages, trying to find something interesting. KernelTrap eliminates the noise and makes pointing and clicking fun again [or at least more productive]. It does this by organizing the different conversations from the different kernel development mailing lists into atoms.

An atom is simply a title and a summary of what the original thread/conversation was about, including quotes from the source. If the subject matter piques your interest and you are not satisfied by the summary, you can click on the title or the "read more" link to (wait for it ...) read more! Reading more takes you to a single page that contains the individual messages that make up the original conversation; no pointing or clicking required, all you have to do is scroll and enjoy. There is even a comments section at the bottom of each entry. The comments don't actually link back to the original mailing list, so you can't really use them as a mechanism for joining the conversation. The purpose they do serve [to me] is comic relief. Probably 99% of the comments posted are from people who have never written a lick of kernel code in their life and probably wouldn't know a pointer if it jumped up and poked them in the eye. Yet that doesn't stop them from complaining and passing judgment on the people who are actually involved in the conversation. I can't help but laugh.

Jokes aside, the reason I love KernelTrap is that it focuses on kernel development. And though I'm not a kernel developer, nor aspire to be one, the information provided is useful nonetheless. You see, the kernel is the most important piece of software that runs on your computer, because it is responsible for managing the resources that make up the computer (CPU, memory, disk, etc.). So whether your computer is running 1 or 1,000 processes, or your network application is handling 1 or 1,000 connections, it's the kernel that is responsible for keeping things running smoothly, or at least running. The consequence of being responsible for the computer is that the kernel ends up being the most scalable piece of software on the computer. It is this feature of kernels that interests me, because the lessons of scalable design and implementation, inherent in [good] kernels, aren't limited to kernel software. A lot of the lessons can be applied to user-land software (my domain). So though the conversations may not tell you how things are implemented (the exception is the Linux Kernel Mailing List, because patches [code] are included directly in the messages themselves), they can tell you why and who is doing it.

The newest KernelTrap feature is quotes. A quote is another type of atom that is simply a quote lifted from a larger conversation that is either insightful, funny, or both. My favorite for this week comes from Theo de Raadt of OpenBSD fame:

"You are absolutely deluded, if not stupid, if you think that a worldwide collection of software engineers who can't write operating systems or applications without security holes, can then turn around and suddenly write virtualization layers without security holes."
— Theo de Raadt in an October 24th, 2007 message on the OpenBSD -misc mailing list.

So if you have never visited KernelTrap, I highly recommend you take a look. And if you are looking for a more Linux-centric worldview, LWN can't be beat.

Tuesday, October 23, 2007

7 REPSLM-C, Expanded

This post is a follow up to 7 Reasons Every Programmer Should Love Multi-Core and a direct response to this comment.

Maybe I should have put 6 before 4, because 6 makes the point that most of today's programs aren't written to take advantage of multi-core. So what exactly do I mean by "take advantage"? It seems you think I mean simply running today's GUI, client/server, and P2P apps as-is on multi-core machines and expecting magic to happen. But that is not what I'm talking about.


With some existing apps, like Postfix, WebSphere, SJSDS, IntelliJ IDEA 7.0, PVF, and most Java BitTorrent trackers/clients [just to name a few], magic can happen. Others require tuning (e.g. Apache, PostgreSQL, and many more). Most applications, especially desktop GUI apps, will require a major rewrite to take full advantage of multi-core machines.

What I'm talking about is programmers finding opportunities to exploit parallelism at every turn, which is what items 1-4 are about. Let's take something as mundane as sorting (e.g. merge sort, quicksort) as an example. Merge sort and quicksort are excellent use cases for a divide and conquer strategy. They consist of a partitioning step, a sorting step, and a combining step. Once partitioned, the partitions can be distributed across multiple threads [and thus multiple processors/cores/hardware-threads] and sorted in parallel. Some of you may say, "that's only 1 out of 3 steps, big deal." Others may take it even further and say, "1 out of 3. That means two-thirds of the algorithm is sequential. Amdahl's Law at work, buddy!" But what you would be overlooking is that, [in the serial version] for a large enough dataset, the sorting step dominates the runtime. So even though we have managed to parallelize only a single step, we can still realize substantial runtime performance gains. This behavior is expressed quite eloquently by John L. Gustafson in his [short] essay, Reevaluating Amdahl's Law.
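To make the three steps concrete, here is a minimal sketch in Python (my own hypothetical `parallel_sort`, not any app's actual code). A thread pool stands in for whatever worker mechanism your runtime offers; in CPython you would swap in a process pool to get true CPU parallelism past the GIL, but the partition/sort/combine structure is the same:

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def parallel_sort(data, workers=4):
    # Partition step: slice the input into roughly equal chunks.
    size = max(1, (len(data) + workers - 1) // workers)
    parts = [data[i:i + size] for i in range(0, len(data), size)]
    # Sort step: each partition is sorted on its own worker. For large
    # inputs this step dominates the runtime, which is exactly why it's
    # the one worth parallelizing. (Use ProcessPoolExecutor in CPython
    # for real CPU parallelism.)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(sorted, parts))
    # Combine step: merge the sorted runs sequentially.
    return list(heapq.merge(*runs))

print(parallel_sort([9, 1, 8, 2, 7, 3, 6, 4, 5]))
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Only the middle step runs in parallel, yet that's the step that grows with the dataset, which is Gustafson's point.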

So what does all of this have to do with your comment? Let's start with your skepticism about GUI applications taking advantage of multi-core, and I'll use boring old sorting to make my point.

It is sometime in the future and there is a guy named Bob. Bob's current computer just died (CPU burned out) and he goes out and buys a new one. Bob doesn't know or care about multi-core [or whatever the future marketing term is for [S]MP]. He just wants something affordable that will run all his applications. Nevertheless, his new machine is a 128-way box (it is the future after all), with tons of RAM. Bob takes his new machine home and fires it up.

Bob keeps all his digital photographs and video on a 4 terabyte external storage array. He bought the original unit years ago, before 32 terabyte hard drives came standard with PCs. You see, Bob's daughter is pregnant, is in her final trimester, and her birthday is just around the corner. Bob wants to make her a Blue-HDD-DVDDD-X2 disk containing stills and video footage of her life, starting before she was even born and running up to her current pregnancy. It begins with the ultrasound video of her in her mother's womb and ends with the ultrasound of his grandchild in his daughter's womb.

So Bob fires up his [hypothetical] image manager and tells it to create a workspace containing all the images and videos on the storage array, sorted by date. It's almost 30 years worth of data. And though the image manager software is old, some programmer, long ago, wrote a sorting algorithm that would scale with the number of processors available to it. So Bob clicks a button, and in less than 5 minutes 3.5 terabytes of data is sorted and ready to be manipulated.

So what's the point? The point is it doesn't matter that "99%" of the CPU time was spent "waiting for some event", because when it mattered (when Bob clicked the button), all the available resources were employed to solve the user's problem efficiently, resulting in a great user experience. Now I know the example is contrived, but the premise upon which it is based is real.
If you look at most GUI applications of today, very few of them can handle multiple simultaneous events, or even rapid-fire sequential events, in large part because most of the work (the action to be performed) happens on the same thread that is supposed to be listening for new events. That is why the user interface freezes when the action to be performed requires disk or network access, or is CPU bound. The classic example is loading a huge file into RAM from disk. Most GUI apps provide a progress meter and a cancel button, but once the I/O starts, clicking cancel doesn't actually do anything, because the thread that's supposed to be processing mouse events is busy reading the file in from disk. So yes, GUI application programmers should Love Multi-Core!
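The fix is to move the I/O onto a worker thread and have it check a cancel flag between reads, so the event thread stays free to notice the click. A sketch (all names here are mine, no real GUI toolkit involved):

```python
import os
import tempfile
import threading

def load_file(path, cancel, chunk=64 * 1024):
    # Runs on a worker thread: read in chunks and check the cancel
    # flag between reads, so a click on "cancel" actually takes effect.
    buf = bytearray()
    with open(path, "rb") as f:
        while not cancel.is_set():
            block = f.read(chunk)
            if not block:
                break
            buf.extend(block)
    return bytes(buf)

# Stand-in for the GUI thread: spawn the worker and stay free for events.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200000)
cancel = threading.Event()
result = {}
worker = threading.Thread(
    target=lambda: result.update(data=load_file(tmp.name, cancel)))
worker.start()
worker.join()  # a real GUI would poll or use a completion callback instead
os.unlink(tmp.name)
```

Clicking cancel just calls `cancel.set()` from the event thread, and the worker stops at the next chunk boundary instead of being ignored.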

Client/Server and P2P are in the same boat in that they are both network applications. But they, like GUI apps and every other problem domain, can benefit from data-decomposition-driven parallelism (divide and conquer). I'm not going to go into great detail about how network applications benefit from multi-core, because that subject has been beaten to death. I'll just say a couple of things. The consensus is that more processors equal more concurrent connections and/or reduced latency (users aren't waiting around as long for a free thread to become available to process their requests).

Finally, multi-core affects vertical and horizontal scaling. Let's say you work at a web company and the majority of your web traffic is to static content on your web server (minimal contention between requests). Let us also assume that you have unlimited bandwidth. The web server machine is a 2-socket box and quad-core capable, but you only bought one 1P processor. A month passes, you get dugg, and the blogosphere is abuzz about what you are selling. Customers are browsing and signing up in droves. Latency is climbing and connections are timing out. You overnight 2 quad-core CPUs and additional RAM. Latency drops to a respectable level and you just avoided buying, powering, and cooling a brand new machine that would have cost you 3x as much as you just spent on the CPUs and RAM. That is scaling vertically. If you were building a cluster (horizontal scaling), multi-core means you need fewer physical machines for the same amount of processing power. In other words, multi-core reduces the cost of horizontal scaling, both in terms of dollars and latency. Access to RAM will always be faster than the network, so there is a lot less latency in performing the work locally --pushing it across the FSB, HyperTransport, etc., to multiple cores-- than in pushing it out over the network and [eventually] pulling the results back.
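The "more processors, more concurrent connections" point is easy to see with a toy thread pool. This is a hypothetical handler of my own, simulating I/O-bound requests; with 8 workers, 32 requests finish in roughly 4 rounds instead of 32:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(req_id):
    # Hypothetical request handler: simulate I/O-bound work
    # (disk read, backend call) with a short sleep.
    time.sleep(0.01)
    return "response-%d" % req_id

# More workers (cores) -> more requests in flight -> less queueing latency.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    responses = list(pool.map(handle_request, range(32)))
elapsed = time.perf_counter() - start
print(len(responses), "requests in", round(elapsed, 2), "seconds")
```

Run the same 32 requests with `max_workers=1` and the wall-clock time is roughly eight times longer; that's the latency customers feel while waiting for a free thread.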
So yes, if you are coding or deploying network applications, P2P, client/server, or otherwise, you should Love Multi-Core!

Saturday, October 20, 2007

7 Reasons Every Programmer Should Love Multi-Core

  1. The technology is not new, it's old. It's just really cheap SMP, and the SMP domain (shared memory model) is a well understood domain. So there are tons of resources (books, white papers, essays, blogs, etc.) available to get you up to speed.

  2. Shared memory concurrency is challenging. It's guaranteed to grow your brain.

  3. Most programming languages [already] have language and/or API level support (threads) for multi-core, so you can get started right now.

  4. There are a plethora of computing domains that benefit from increased parallelism. The following are just a few off the top of my head: GUI applications, client/server, p2p, games, search, simulations, AI. In other words, there won't be a shortage of interesting work to do in this space.

  5. Most programmers [and their managers] don't have a clue about concurrency so you can easily impress them with your skills/CNM (Concurrency Ninja Moves).

  6. The majority of today's programs aren't written with multi-core in mind so mastering concurrency programming means you won't be out of a job any time soon. Somebody has to write the multi-core aware version of all those apps.

  7. Since most programmers are clueless about concurrency, mastering it means you'll be smarter than millions of people. Being smarter than your [so called] peers is really satisfying.
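On item 3, "get started right now" really is this short in most languages. A minimal sketch using Python's standard library threads (nothing here is from any particular app; each thread writes to its own slot, so there's no shared-state contention to worry about yet):

```python
import threading

results = [0] * 4

def work(i):
    # Each thread computes and stores its own result; distinct
    # indices mean no locking is needed for this toy example.
    results[i] = i * i

threads = [threading.Thread(target=work, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # -> [0, 1, 4, 9]
```

The moment threads share mutable state, you need locks or other coordination, which is where item 2's brain-growing begins.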

Friday, October 19, 2007

Saved By Chappelle

The last three weeks have been hell. If it could go wrong, it went wrong. It started with the death of my workstation, followed quickly by a fender bender. But I'm not going to dwell too much on the fender bender, because I got off easy. The short version is, the moron that ran into me didn't get beaten to a pulp, thus I'm not writing this from a prison cell (thank you, Dave Chappelle). My car [miraculously] only sustained minor scratches to the rear bumper (you should have seen the other guy), which I'm not going to fix, because it means repainting the bumper, and a partial repaint would make my car visually lopsided. That happened with my very first car [89 Honda CRX Civic]. The insurance company wouldn't pay for the whole thing to be painted and I didn't have the money to do it out of pocket. So the passenger side door and front panel were a different shade of yellow than the rest of the car. It drove me crazy. My current car is an '06 Ford Mustang GT, black. I [still] plan on tricking it out, so the rear bumper was going to go anyway. So it's pointless to paint the bumper now and suffer unnecessarily.

What really pissed me off about the accident was, (a) it was completely avoidable (I have no idea why people think a yellow light means "speed up") and (b) the car just celebrated it's first birthday. The thing that kept me from losing it was the, "When Keeping it Real Goes Wrong", skits from the Chappelle Show. First there was contact; then there was blinding rage; then out of nowhere, Dave Chappelle. Weird! Network TV should run them as public service announcements. I think it would benefit Type A personalities.