Tuesday, January 09, 2007

What Really Happened on Mars

This is a story about troubleshooting a system-level computer problem.

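Mike Jones wrote a fascinating article about the system resets that the Mars Pathfinder experienced. The problem was a priority inversion around a mutex: a low-priority task held the mutex while a high-priority task waited on it, and medium-priority work kept the low-priority task from running long enough to release it. The article is a quick read and an interesting story. It also describes a very typical system problem and what it's like to figure it out.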
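
To make the mutex problem concrete, here is a minimal Python sketch of priority inversion, the failure usually cited for the Pathfinder resets. It is a toy tick-based simulation, not the flight software: the three tasks, their priorities and timings, and the simulate function are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int      # larger number = more urgent
    arrival: int       # tick at which the task becomes ready
    work: int          # ticks of CPU time the task needs
    needs_mutex: bool  # True if all of its work is done holding the mutex


def simulate(inherit: bool) -> dict:
    """Toy fixed-priority preemptive scheduler (hypothetical task set).

    Returns the tick at which each task finishes. When `inherit` is True,
    the mutex owner temporarily runs at the priority of the most urgent
    task it is blocking (priority inheritance).
    """
    tasks = [
        Task("low", 1, 0, 3, True),
        Task("medium", 2, 2, 5, False),
        Task("high", 3, 1, 1, True),
    ]
    remaining = {t.name: t.work for t in tasks}
    finish = {}
    owner = None  # task currently holding the mutex, if any
    tick = 0

    while len(finish) < len(tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.name not in finish]
        # A task is blocked if it needs the mutex and another task holds it.
        blocked = [t for t in ready
                   if t.needs_mutex and owner is not None and owner is not t]
        runnable = [t for t in ready if t not in blocked]

        def effective(t):
            # With inheritance, the owner borrows the top blocked priority.
            if inherit and t is owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        if not runnable:
            tick += 1
            continue

        current = max(runnable, key=effective)
        if current.needs_mutex:
            owner = current  # take (or keep) the mutex
        remaining[current.name] -= 1
        tick += 1
        if remaining[current.name] == 0:
            finish[current.name] = tick
            if owner is current:
                owner = None  # release the mutex on completion

    return finish


if __name__ == "__main__":
    print("without inheritance:", simulate(inherit=False))
    print("with inheritance:   ", simulate(inherit=True))
```

With these made-up numbers, the high-priority task finishes at tick 9 without inheritance, after the medium task has already completed, and at tick 4 with it. Enabling priority inheritance on the mutex is the fix the article describes.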

The Lessons Learned section is the best part!