Sunday, January 13, 2013

BSOD

Sorry for the lack of posting: the dreaded BSOD problems came back with a vengeance. So this last week has whizzed by as I ran various tests and tried different permutations. Most of the time I like to keep my system up to date with the various software updates.

Now this is more an act of catharsis (and documentation) for me than anything – anyone looking for stuff about cycling will, I’m afraid, be disappointed.

I find most software updates tend to be fixes for security issues – which are well worth doing. In fact I read today that the US government has warned computer users to disable Java or risk being hacked. (Apparently it is being fixed – but here is how to disable it.)

Essentially, after some update, which I think was either McAfee or Acronis, the machine would crash within a few minutes of start-up and me logging on. It got so annoying that I even dreamed about it the other night, and in my dream I had fixed the problem. The dream was wrong: it still crashed. The good news was that I was able to bring up the machine in Safe Mode (with networking), where it was reliable. However at this point I did wonder whether I had a hardware fault and so ran some diagnostics.

The first run was some Microsoft diagnostics found on their Windows 7 repair disk. I am pretty good at following the advice on creating repair disks; the trouble is that so much time goes by before I need them that I tend to forget where they are. I laid my hands on several – the original Windows 7 installation DVD, one made when I installed Windows 7 and one made using the Acronis back up software.

The Microsoft memory tests ran fine, so I decided to give Memtest86 (thanks MikeC) a go as it is more thorough. After a bit of messing around I wrote a disc with the tests on it and ran them. I had to burn the disc on another machine using different CD software, as the machine I would normally use and am familiar with was, of course, the sick one.

While the programs were running I started reviewing my options and found that my machine also had some diagnostic software on it. After Memtest86 passed twice I had a look at the Dell diagnostics. They offered options to do a full test or one based upon the symptoms I was having. I went for the full monty. It started off playing 8 tones and flashing rather crude graphic tests on the screen and then had me tapping every key on the keyboard. I was beginning to wonder whether it was worth it.

It did move onto BIOS and RTC tests and then the CPU and memory – then onto the disks. At this point it had taken an hour or two; the disk checking took way longer. I reckon that the software is probably used when a machine is built to establish that it is ok. My machine now has way more disk storage than when I got it. In fact I think it came with 1TB configured as a RAID 0 disk – built for speed, where the data is striped across both disks. I have since converted it to RAID 1, where data is mirrored across two disks, as well as adding a bunch of USB disks. So it has gone from 1TB to around 8TB, of which 4TB is tied up as a 2TB RAID 1 disk and the rest is back up disk.
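For anyone puzzled by how 4TB of disk ends up as a 2TB drive, here is a rough sketch of the sums in Python. It assumes the mirror is made of two 2TB disks and that the original 1TB stripe was two 500GB disks (my guess, I never checked); the function name is just made up for illustration.

```python
def usable_capacity_tb(level: int, disk_sizes_tb: list[float]) -> float:
    """Usable capacity of a simple RAID 0 or RAID 1 array, in TB."""
    if not disk_sizes_tb:
        raise ValueError("need at least one disk")
    if level == 0:
        # Striping: every disk contributes, limited by the smallest one.
        return min(disk_sizes_tb) * len(disk_sizes_tb)
    if level == 1:
        # Mirroring: every disk holds the same data, so you only get one disk's worth.
        return min(disk_sizes_tb)
    raise ValueError("only RAID 0 and RAID 1 are handled in this sketch")


# Assumed layouts (disk sizes are guesses, see above).
original_stripe = usable_capacity_tb(0, [0.5, 0.5])   # the machine as delivered
current_mirror = usable_capacity_tb(1, [2.0, 2.0])    # the machine now

total_raw_tb = 8.0                # rough total from the post
raw_in_mirror_tb = 2.0 + 2.0      # both halves of the mirror
backup_tb = total_raw_tb - raw_in_mirror_tb

print(f"RAID 0 as delivered: {original_stripe} TB usable")
print(f"RAID 1 now: {current_mirror} TB usable from {raw_in_mirror_tb} TB raw")
print(f"Left over for back ups: {backup_tb} TB")
```

The point is simply that mirroring halves what you can use, which is the price of being able to lose a disk.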

It is said that if you don’t back up your data at least three times then you don’t care about it. Well I do, but as an ex-Unix sys admin person I also came to realise that you need to check that you can restore the data you back up, just in case. So I store my data in different forms and use different programs (SyncToy, Acronis back up and Ghosting).
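The "check you can restore" part can be done by hand, but a small script makes it less of a chore. This is only a sketch of the idea, not what any of the programs above actually do: restore a back up to a scratch folder, then compare SHA-256 checksums of every file against the original. The function names are my own invention.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of one file, read in 1 MiB chunks so big files don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(original: Path, restored: Path) -> dict[str, list[str]]:
    """Report files missing from the restore, extra in it, or with different content."""
    a = tree_digests(original)
    b = tree_digests(restored)
    return {
        "missing": sorted(set(a) - set(b)),
        "extra": sorted(set(b) - set(a)),
        "changed": sorted(k for k in set(a) & set(b) if a[k] != b[k]),
    }
```

Calling `compare_trees(Path("original_folder"), Path("restored_folder"))` (both paths hypothetical) should give three empty lists if the restore is good. It only checks file contents, not timestamps or permissions.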

After the machine had been running the diagnostics for around 48 hours I decided I didn’t really want to run exhaustive tests on my USB back up drives and used the diagnostics interactively instead. Long story short, despite all the diagnostic hammering it still didn’t crash at all.

So whilst all that was happening my plan was to try a System Restore back to the previous known working state, and if that failed I was considering installing Windows 7 again. Apparently there is a trick where you use the upgrade install to repair the system. The upgrade install is used where you have, say, Vista running and want to convert to Windows 7 without having to scrub the disk, do a clean install and then re-install your programs and copy your data back. (I used this when converting this Desktop to Windows 7.)

At this point I reckoned that the problem was either a software problem or some really obscure hardware fault that was only triggered by some strange bit of software. (In fact I hoped that it was a software problem.) I also noticed that between logging on and the machine crashing the Trusted Installer would appear.

So I set off a System Restore from Safe Mode, which takes an hour or so. The machine re-booted, I logged on, and it crashed, although it did report that the System Restore had failed with an unspecified error – 0x80070005. A review on the web suggested that this might be related to Anti Virus software (AV) being over-protective. (Note there is loads of “advice” on the web and I find it takes a fair bit of time to work out the good from the bad.) There was nothing definitive but several places had reported issues with different AV programs.
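For what it is worth, that number is not quite as unspecified as the message makes out. Windows error codes of this sort (HRESULTs) pack a failure bit, a "facility" and an error code into 32 bits, and 0x80070005 unpacks to the standard Win32 "access denied" error. A small Python sketch of the decoding follows; the lookup tables only cover this one case, and the function name is my own.

```python
# Just the two values needed here; the real tables are much longer.
FACILITY_NAMES = {7: "FACILITY_WIN32"}
WIN32_ERROR_NAMES = {5: "ERROR_ACCESS_DENIED"}


def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility and error code."""
    value &= 0xFFFFFFFF
    facility = (value >> 16) & 0x7FF   # bits 16-26
    code = value & 0xFFFF              # bits 0-15
    return {
        "failure": bool(value >> 31),  # top bit set means failure
        "facility": facility,
        "facility_name": FACILITY_NAMES.get(facility, "unknown"),
        "code": code,
        "code_name": WIN32_ERROR_NAMES.get(code, "unknown"),
    }


result = decode_hresult(0x80070005)
print(result)
```

So the restore was refused access to something, which at least fits the over-protective AV theory, though the code alone does not say which file or which program did the refusing.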

So using Safe Mode plus networking I first tried to get a few incremental back ups done, but Safe Mode was not the place for that. I then tried to uninstall McAfee – it failed and suggested I report the problem to McAfee. Note that one of the problems with a lot of advice is that it tends to be along the lines of re-boot or re-install, which all takes for ever and makes you go over old ground, so I didn’t bother. I did try McAfee’s Virtual Technician program (MVT), which also reported problems it couldn’t fix.

The next step was McAfee’s MCPR (McAfee Consumer Products Removal tool). It reported problems as well. Now at this point I was thinking this all points to McAfee as being the root cause: it had got its knickers in a twist and perhaps was the problem all along. When re-booting my computer I can never remember which of the function keys to use to select boot-up modes and options, and I ended up giving Microsoft’s System Repair tool another go. It took so long I decided to cancel – it wouldn’t cancel. So I went to bed, deciding I would decide in the morning whether to pull the plug.

After a night of dreams of fixing the computer I was greeted with a message that the repair had failed but wanted me to send additional stuff to Microsoft. So I did. At this point I was thinking I would have to do the in-place install. I gave the MCPR program another try in Safe Mode – it ran to completion and offered a re-start. I re-booted into Safe Mode and tried some on-line virus scanners (Trend, Bitdefender and McAfee; the first two ran ok, the latter didn’t). So I went for a System Restore – it completed after a few hours, re-booted, and the machine has been up for the last couple of days.

The first thing was to run back ups. I have bought a 3TB NAS disk to make it easier to get at stuff using my Laptop. I have also downloaded the Microsoft development tools for the Windows Debugger (WinDbg) to check out the crash dumps. The directory they were supposed to be in was locked.

I then set a System check point – hopefully I now have a place to go back to.

So what next – well my money is now pretty much on the McAfee AV software – with either an interaction with, say, the Nvidia drivers or that the McAfee stuff had tied itself in knots and has been the issue all along. So do I re-install the AV software? Should I still do an in-situ install to repair stuff?

For now I am going to update most of the programs, stopping along the way to ensure I have a System check point. Then I am going to install the Microsoft patches that have come in. If all goes to plan I will then check Nvidia for an update (I got an email from them; the hesitation is that whilst the drivers do load, some of the other software doesn’t).

Then I will re-install the McAfee software – if it fails then I will try the in-situ Windows 7 fix after removing McAfee – then assuming that works I will give McAfee one last try.

The good news is that I did write this Post on my desktop.
