Low level HD recovery?

ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands


Post by ErikJan »

I own several external (USB-connected, conventional) hard disks which I use to back up my documents etc. Recently, two of them started to show disk errors (bad sectors), so I purchased new ones and transferred the data. Now I’m good again.
Still, this time it made me wonder why these errors suddenly occur. It could be wear and tear, but as backup devices they have been in use maybe once a month for an hour or so. As my backups were ‘covered’ by the new disks, I decided to spend some time finding out what was going on with one of the (now empty) ‘bad’ disks.

Using my Windows 10 machine and several tools that scan the surface (incl. CHKDSK of course), I could confirm the bad sectors with read errors (and SMART data indicating trouble), but I couldn’t really do anything more.
When I Googled a bit more, I found a HD test/recovery tool called Victoria (from Belarus, I believe). It doesn’t require an install and it appeared to work at a lower level than the other tools I tried. While it has options to clear, erase, refresh and re-map sectors, the tool indicated that these weren’t possible under newer versions of Windows.
So I started up a VM (with Windows XP without network functionality… See it as sort of a test sandbox 😉), connected my broken (empty) external USB HD and started Victoria there.
Indeed, this time I could re-map and erase sectors and interestingly enough, after I ‘erased’ bad-sectors for a small part of the disk, the next scan of that same area showed no more errors (and sector-access timing was normal and fast again). My conclusion was that the errors must have been ‘soft-errors’ (vs. e.g. hard errors like physical damage).
Maybe over time there’s been some wear and tear and the alignment of the heads isn’t exactly the same anymore? Or areas have demagnetized a little (we typ. don’t store HD’s @ 0 Kelvin… 😉).
The process is very (very) slow however (probably taking weeks or much more for a 2 TB disk), so I was wondering whether there is a (DOS?) tool that would reset SMART and all re-mapped sectors at a low level and then re-write / re-align every sector. And once that is done, why wouldn’t the re-written disk be 100% OK again?
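For scale, here is a back-of-the-envelope estimate of how long one full pass over the surface takes. The two throughput figures are assumptions chosen for illustration, not measurements from this drive.

```python
def full_pass_hours(capacity_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to read or write every byte once at a constant rate."""
    return capacity_bytes / throughput_bytes_per_s / 3600


TB = 10**12
MB = 10**6

# Assumed rates: a healthy drive streaming sequentially versus a slow
# sector-by-sector repair pass (both numbers are hypothetical).
scenarios = [
    ("sequential, 100 MB/s", 100 * MB),
    ("sector-by-sector, 1 MB/s", 1 * MB),
]

for label, rate in scenarios:
    hours = full_pass_hours(2 * TB, rate)
    print(f"{label}: {hours:.1f} h ({hours / 24:.1f} days)")
```

Under these assumptions a sequential pass over 2 TB takes a few hours, while a repair pass crawling along at 1 MB/s takes roughly three weeks, which is the order of magnitude described above.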
I admit posting something similar in another (local) forum and people suggested I should throw away the disks (but all I’m doing is playing here, my data is safe on the new disk) and moreover, they told me that IF I would fix sectors, they would turn bad quickly and any new data would never be safe again… I don’t understand the latter (as I’m re-writing stuff low-level) and hope someone can explain maybe (or better: point me to that DOS util that re-writes / resets the disk!).
I recall that many, many years ago CPUs weren’t fast enough to read and process the data from a spinning HD sequentially (the next sector had passed the head before the CPU was ready to read it, and therefore the system had to wait until that sector came around again). To optimize that, vendors wrote sectors low level in an interleaved way (maybe 1:3 or 1:2). I recall using a tool (SPINRITE) that would measure the system’s and CPU’s performance and, if it determined that things were fast enough, could change the interleave, e.g. from 1:3 to 1:2 or even to 1:1 (like all spinning disks have nowadays). That worked without impacting data on the disk and took a few hours. The result was up to a 50% increase in HD performance. My point is: that was a tool that acted low-level on how sectors were laid out. I even think that same tool could actually do what I’m looking for now: re-format at the sector level.
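The interleave idea can be sketched in a few lines. This is a simplified model of how logical sectors could be laid out around one track, not a description of what SPINRITE actually did internally; the function name and the track size of 8 sectors are made up for the example.

```python
def interleaved_layout(sectors_per_track, interleave):
    """Return, for each physical slot on a track, the logical sector stored there.

    An interleave of 1 stores sectors consecutively; an interleave of 2
    leaves one slot between consecutive logical sectors, and so on.
    """
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        # If the target slot is already taken, move on to the next free one.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return layout


for factor in (1, 2, 3):
    print(f"1:{factor}", interleaved_layout(8, factor))
```

With an interleave of 2, logical sector 1 sits two slots after sector 0, so a slow controller gets one sector's worth of rotation time to finish processing before the next sector it needs arrives under the head.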
Any suggestions would be welcome. Remember again: my data is safe, the ‘broken’ disk is empty, I’m doing this because I’m interested, for fun and to learn.

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

In my experience (as a hardware engineer for many many years), there are two scenarios that could match this.

1. The disk surface has started to fail. This may appear to be OK for a while after you remap the bad sectors, but you really don't want your data anywhere near this disk

2. There were a few marginal sectors that passed the factory test and were left in use, but these have now deteriorated to the point where they generate errors. Again it is likely that there might be more just waiting to fail, so don't put your data anywhere near this disk.

Bottom line. It belongs in the metal recycling bin.
StuartR


BobH
UraniumLounger
Posts: 9211
Joined: 13 Feb 2010, 01:27
Location: Deep in the Heart of Texas

Re: Low level HD recovery?

Post by BobH »

I agree with Stuart, but before recycling I always remove the rare earth magnets for other uses.
Bob's yer Uncle
(1/2)(1+√5)
Intel Core i5, 3570K, 3.40 GHz, 16 GB RAM, ECS Z77 H2-A3 Mobo, Windows 10 >HPE 64-bit, MS Office 2016

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

I also use a tool to physically score the disk surface to make it almost impossible to recover any data - this is a bit over the top since all my data is encrypted, but it does no harm
StuartR


ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands

Re: Low level HD recovery?

Post by ErikJan »

With apologies but this is going in the same direction as the other forum: "throw away the disk". As I had indicated, I'd like to understand why. In other words: "Show me the data", explain the science.
As I had explained, the bad sectors could be repaired and now behave normally. I'd like to understand the science behind statements that suggest "this will come back" or "this is the end of your disk". Again: when there are hard errors that can't be repaired (and remapping has no good spare sectors left), then yes, I fully agree. But when the errors disappear after a rewrite, I could hypothesize that, e.g. due to age, the head alignment is slightly off and some sectors that were borderline aligned now fail. A rewrite will re-magnetize the low-level sector information and there's no reason (in my mind) why this sector would fail again for many years.
Think of it in a different way: if nothing changes inside the disk, why would the magnetic particles suddenly decide to start flipping at random? The only physical things going on are wear and tear, Brownian diffusion, maybe some chemistry and influences from external magnetic fields. All of these can be fixed by re-writing the low-level sector info (maybe except for extreme wear and tear).
Again: I understand I should wipe or destroy the disk to make sure remaining data is gone. I can even understand I should get a new disk and put my data there (and as said: I did that!). That's all beside the point however...

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

The most common reason for this is that the disk surface gradually comes away from the platter.
StuartR


ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands

Re: Low level HD recovery?

Post by ErikJan »

Thanks Stuart. As you might have discovered by now, I've been in science and technology for the larger part of my life. Statements like "the disk surface comes away from the platter" still don't 'do' anything for me. Are there any articles out there that describe this?
I'm not a conspiracy thinker (in fact I'm very far away from that!), but I'm driven by facts, data and science. If I didn't know better however, I'd almost start to think the whole story of "when the disk gets errors it's time to buy another one" was planted by the manufacturers to maintain their market... ;-) Again, where's the evidence of that?

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

I don't have articles, but I worked as a senior hardware engineer for Digital Equipment, Compaq, and HP for forty years and I saw lots of internal reports into failure modes. These disks spin at very high speeds, and the heads float on the air current that generates, only a tiny distance from the surface. This rips the surface away from the disk over time.
StuartR


ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands

Re: Low level HD recovery?

Post by ErikJan »

Again, I believe you, but how do you know that's the cause? I'd argue that if the surface is damaged, I'll get hard errors which I can't fix. My errors can be fixed...

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

They get fixed by remapping the bad sectors. The disk has a map that directs logical blocks to physical places on the surface. When a block goes bad it can be removed from use and replaced with one that is not currently in use
StuartR
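The remapping described above can be illustrated with a toy model. It assumes a simple lookup table and a pool of spare sectors beyond the user-visible range; real drive firmware is considerably more involved, and the class and method names here are invented for the example.

```python
class ToyDrive:
    """Toy model of logical-to-physical sector remapping with a spare pool."""

    def __init__(self, user_sectors, spare_sectors):
        self.user_sectors = user_sectors
        # Only remapped blocks are stored; any other logical block i
        # is assumed to live at physical sector i.
        self.remap = {}
        # Spare physical sectors sit beyond the user-visible range.
        self.spares = list(range(user_sectors, user_sectors + spare_sectors))

    def physical(self, lba):
        """Physical sector currently backing this logical block."""
        return self.remap.get(lba, lba)

    def reallocate(self, lba):
        """Retire the sector behind lba and back it with a spare instead."""
        if not self.spares:
            raise RuntimeError("spare pool exhausted")
        self.remap[lba] = self.spares.pop(0)
        return self.remap[lba]

    @property
    def reallocated_count(self):
        return len(self.remap)


drive = ToyDrive(user_sectors=1000, spare_sectors=4)
drive.reallocate(42)
print(drive.physical(42), drive.physical(43), drive.reallocated_count)
```

In this model the user-visible capacity stays at 1000 sectors however many blocks are remapped; remapping only consumes the hidden spare pool, and the drive runs into trouble once that pool is empty.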


ChrisGreaves
PlutoniumLounger
Posts: 15498
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: Low level HD recovery?

Post by ChrisGreaves »

StuartR wrote:
03 Oct 2021, 16:43
They get fixed by remapping the bad sectors. The disk has a map that directs logical blocks to physical places on the surface. When a block goes bad it can be removed from use and replaced with one that is not currently in use
Hi Stuart; I sympathize with ErikJan in that while I too want "a solution", I also want to know the reasoning behind it (A futile task with MS software ...).

That said I remember "mapping" from my mainframe days.

If I understand you, when a 1MB disk is sold, it sounds as if it has 1.2MB of surface available, and that 0.2MB in excess is there to be drawn on in remapping what ErikJan calls "hard errors that I can't fix" - which I think of as flakes of metal oxide spinning off the platter and into the walls of the enclosure.
Although ErikJan can't fix them, the hard disk software/drivers can - by remapping.

Before too long there will remain only and exactly 1.0MB of usable surface, and a few days later 0.999MB of usable surface, and so on through 0.80MB and so on and so down.
Of course by then ErikJan would be restoring from backup twice a day ...
And too I hear ErikJan when he says that, to all intents and purposes, he has disposed of the disk.

Although it is still attached to his computer.

It seems to me that ErikJan could run a little study here, by filling his disk with data in (say) date-named folders until it is full, and then on a daily basis:-
(a) record capacity and
(b) replace the oldest date-named folder with a new folder.
With this ErikJan could actually track the degradation of Hard Errors across time.

Me? I would expect this to produce an exponential curve: once the flakes start spinning off the hard disk, I think they would spin off in ever greater numbers from one period to the next.
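If such a daily log existed, the exponential expectation could be checked with a simple log-linear fit. The sketch below uses plain Python and invented numbers; the daily counts are hypothetical and do not come from any real drive.

```python
import math


def fit_exponential(days, counts):
    """Least-squares fit of counts ~ a * exp(b * day).

    Fits a straight line through log(counts), so every count must be > 0.
    """
    logs = [math.log(c) for c in counts]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Hypothetical daily bad-sector counts, invented for illustration only.
days = [0, 1, 2, 3, 4, 5]
counts = [8, 10, 13, 16, 21, 26]

a, b = fit_exponential(days, counts)
print(f"count ~ {a:.1f} * exp({b:.3f} * day)")
print(f"doubling time ~ {math.log(2) / b:.1f} days")
```

A growth rate `b` close to zero would suggest the error count is stable, while a clearly positive `b` would support the "ever greater numbers" picture.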

I well remember the images of heads, smoke particles etc.
In https://duckduckgo.com/?q=photo+of+hard ... ext&ia=web the fourth image in is an example.

Cheers
Chris
An expensive day out: Wallet and Grimace

ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands

Re: Low level HD recovery?

Post by ErikJan »

But remember: mine are software errors. Nothing hardware or hardware related. I can repair/fix them, and that is NOT (I repeat: NOT) done by re-mapping but by rewriting the sector. I know that because I selected that option (remapping was another option, which I did not select).
So my bad sectors are fixed and not remapped.
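One way to cross-check "rewritten, not remapped" is to compare the drive's own SMART counters before and after the repair. The sketch below parses attribute tables in the layout printed by `smartctl -A` from smartmontools and compares attribute 5 (Reallocated_Sector_Ct) and 197 (Current_Pending_Sector). The sample values are made up, and the interpretation is a rule of thumb: drive firmware may decide on its own to reallocate a sector when it is rewritten, and vendors differ in how they report raw values.

```python
# Made-up sample tables in the column layout of `smartctl -A`.
BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       16
"""

AFTER = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def raw_values(smart_table):
    """Map attribute ID to its raw value for every parsable table row."""
    values = {}
    for line in smart_table.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and have at least ten columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[int(fields[0])] = int(fields[9])
    return values


def repair_deltas(before, after):
    """Return (change in reallocated sectors, change in pending sectors)."""
    b, a = raw_values(before), raw_values(after)
    return a.get(5, 0) - b.get(5, 0), a.get(197, 0) - b.get(197, 0)


print(repair_deltas(BEFORE, AFTER))
```

A pending count that drops while the reallocated count stays flat is consistent with sectors being rewritten in place; a rising reallocated count would mean the drive swapped in spares after all. On many USB enclosures smartctl needs an extra device-type option such as `-d sat` before it can read these attributes at all.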

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

I assume that what you call software errors are due to the data on a sector giving a bad checksum, so that it is really a hardware error in that some data has been lost in the drive. This can also be caused by surface deterioration, but rewriting the data may be sufficient to get it to read back reliably for a while, until there is further deterioration.

Modern disk drives store data in EXTREMELY small magnetic domains on the disk, and it is very easy for these to flip to the wrong state if they lose just a tiny amount of the oxide.
StuartR
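The bad-checksum scenario can be mimicked with a toy sector that stores a CRC next to its data. This is a deliberate simplification: real drives use much stronger error-correcting codes that can also repair a limited number of flipped bits, and the class below is invented purely for illustration.

```python
import zlib


class ToySector:
    """Toy sector: payload plus a CRC-32 computed at write time."""

    def __init__(self, payload):
        self.write(payload)

    def write(self, payload):
        # A write stores fresh data and a matching checksum.
        self.data = bytearray(payload)
        self.crc = zlib.crc32(payload)

    def flip_bit(self, byte_index, bit):
        # Simulate a magnetic domain changing state after the write.
        self.data[byte_index] ^= 1 << bit

    def read(self):
        if zlib.crc32(bytes(self.data)) != self.crc:
            raise IOError("checksum mismatch: sector unreadable")
        return bytes(self.data)


sector = ToySector(b"A" * 512)
sector.flip_bit(10, 3)
try:
    sector.read()
except IOError as err:
    print("read failed:", err)

sector.write(b"B" * 512)  # rewriting replaces data and checksum together
print("read ok after rewrite:", sector.read() == b"B" * 512)
```

The model shows why a rewrite clears the read error regardless of the cause: data and checksum agree again. What it cannot show is whether the medium under that sector will hold the new data, which is the open question in this thread.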


ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands

Re: Low level HD recovery?

Post by ErikJan »

FWIW (and with apologies for the delay)

I apologize for being stubborn, but I do have a technical background and I believe that older HDs can suffer from slight wear and tear of the drive and head mechanisms, which could cause track alignment issues. A sector-by-sector / track 're-layout / re-write / re-format' would redefine the track location and (all theoretically) would revive things for years to come.
The fact that a re-write (and again, NOT! a remap) works, gives me the impression that this could be the root cause.
For sure, if other hardware related issues are causing this, then the fix would be impossible or the errors would come back in other locations.
And to repeat my disclaimer: I did buy another HD and this 'broken' one is to 'play and learn'.

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

My experience of disk failures (I have seen many over the years) is that head alignment issues tend to affect ALL the data on the drive, not just one or two sectors.
StuartR


ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands

Re: Low level HD recovery?

Post by ErikJan »

OK, that I can follow. So that leaves particles somehow; can that be? Where would these come from? Or something else? Diffusion or demagnetization (the first can also cause the second) seems feasible, but that could be 'undone' by re-writing (read: re-formatting), right?
Just trying to understand the causes and maybe the science behind this. ;-)

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

With the extremely small magnetic domains it can be a 'random' flip, possibly caused by a stray magnetic field or even a cosmic ray. It is more likely to be caused by a surface flaw.
StuartR


ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands

Re: Low level HD recovery?

Post by ErikJan »

If it's a stray magnetic field or a cosmic ray it could happen any time. This seems age related.
And even then, re-magnetizing (re-formatting) would fix that, right?
Surface flaw? How, what would cause that?

StuartR
Administrator
Posts: 12577
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Low level HD recovery?

Post by StuartR »

The surface of a disk is a very thin layer of magnetic material. The heads fly an incredibly small distance above this. The most common failure mode is that a tiny piece of the surface comes away and gets swept around by the heads.
StuartR


ErikJan
5StarLounger
Posts: 1185
Joined: 03 Feb 2010, 19:59
Location: Terneuzen, the Netherlands

Re: Low level HD recovery?

Post by ErikJan »

OK. That could be an explanation. But the centrifugal forces would ultimately move these particles away from the platters, right? And there's never physical contact of course. Is there a website or forum or blog where these topics are discussed?
I still believe it's too simple to throw away a HD after an error occurs, 'just like that'. Sure, HD manufacturers love this and they would probably like to see errors occur not too soon but also not too late (esp. for consumer drives). Maybe I'm just too paranoid ;-)