I was reflecting on how much processor speeds, memory, and data transmission rates have increased over the last few decades. And yet the same old tools and techniques are often used to bring up new designs. When do you think we fall off the cliff?
Continue reading "The Coming Crisis in Board Bring-Up" »
The challenge of system debug on Intel (and other) systems can be huge. What new tricks are available for debugging system hangs, crashes, or application errors?
Continue reading "Debugging CATERR and other failures" »
Maybe not yet, but…
I spent last weekend in gadget hell. First it was the garage door opener. Then my broadband home router started acting up. Finally I had to help my son with his laptop ‘blue screening’. I’m fed up and I’m not going to take it anymore…
Continue reading "Is your laptop highly available?" »
ScanWorks Embedded Diagnostics is embedded firmware that uses a CPU’s debug port to access a system’s architecturally visible registers, memory, and I/O. Acting as an “embedded JTAG-based debugger”, it can be operated remotely from anywhere and at any time, and it troubleshoots the most difficult-to-debug hardware and software failures.
Continue reading "Debugging on Steroids" »
The ScanWorks Embedded Diagnostics solution for Intel® x86 provides local and remote debugging capability via the CPU debug port instrument. Such capabilities include dumping forensic data (register, memory, and I/O contents) during a system crash or hang, setting breakpoints, and single-stepping through code. ScanWorks Embedded Diagnostics thus acts as an embedded JTAG debugger, in-situ within an on-board service processor. It is therefore hardware- and distance-independent and can be used anytime, anywhere in the world, on numerous systems in parallel. This provides an extremely effective means of debugging BIOS, device driver, OS kernel, and intermittent or catastrophic hardware and software failures in the lab and in the field.
Continue reading "Intel x86 Design for Debug Guidelines" »
The use of effective embedded diagnostics to identify failures in today’s high-availability systems becomes more critical as semiconductor and board technologies increase in speed and integration. But many OEMs seem to be overlooking this vital aspect of their products’ post-sales operation, possibly due to their drive for shorter lead times and reduced costs. This myopia has led to many newsworthy stories in recent times. But it wasn’t always like this…
Continue reading "Rooting out impossible bugs" »
In my earlier blog on Debugging Watchdog Timeouts I mentioned the dreaded No Trouble Found (NTF) problem. Some have asked why NTF is important. Well, the answer is that NTF is a huge cost to companies, and one that is extremely difficult to quantify and to address. Something as simple as an errant wedding ring can cost companies millions of dollars. Let me explain…
Continue reading "No Trouble Found?" »
Watchdog timeouts occur in crashed or hung systems when the main processor on a printed circuit board no longer sends a heartbeat to an ancillary service processor. The service processor watchdog then, after the timeout period, re-initializes the main system to try to restore it to an operational state. Watchdog timeouts can occur in high- and low-end systems, anywhere from routers to servers to cell phones. Debugging these can be difficult…
Continue reading "Debugging watchdog timeouts" »
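For readers who have not met a watchdog before, the heartbeat-and-timeout mechanism described in that last teaser can be sketched in a few lines of Python. This is only a toy simulation under my own assumptions: the `Watchdog` class, the `simulate` helper, the one-second polling step, and the three-second timeout are all hypothetical names and values chosen for illustration, not anything taken from a real service processor or from ScanWorks.

```python
class Watchdog:
    """Toy model of a service-processor watchdog (hypothetical, for illustration)."""

    def __init__(self, timeout_s, on_timeout):
        self.timeout_s = timeout_s      # how long to wait without a heartbeat
        self.on_timeout = on_timeout    # callback that re-initializes the main system
        self.last_heartbeat = 0.0

    def heartbeat(self, now):
        # Called by the main processor while it is still healthy.
        self.last_heartbeat = now

    def poll(self, now):
        # Called periodically by the service processor.
        if now - self.last_heartbeat >= self.timeout_s:
            self.on_timeout(now)
            self.last_heartbeat = now   # restart the timer after the reset
            return True
        return False


def simulate(hang_at, end, timeout_s=3.0):
    """Step a simulated clock one second at a time; return the reset times."""
    state = {"hung": False}
    resets = []

    def reinitialize(now):
        # Assumption: the reset succeeds and the main processor recovers.
        resets.append(now)
        state["hung"] = False

    wd = Watchdog(timeout_s, reinitialize)
    for t in range(end + 1):
        now = float(t)
        if t == hang_at:
            state["hung"] = True        # main processor stops sending heartbeats
        if not state["hung"]:
            wd.heartbeat(now)
        wd.poll(now)
    return resets


if __name__ == "__main__":
    # Last heartbeat arrives at t=4, the hang starts at t=5.
    print(simulate(hang_at=5, end=12))
```

Note what the sketch leaves out: the reset restores operation but records nothing about why the heartbeat stopped, which is part of why these failures can be so hard to debug.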