Identifying and Troubleshooting Hardware-Related Problems

Product: ccc4

Sometimes hardware components die slow and annoyingly inconsistent deaths. At one moment, it appears that you can copy data to the disk and use it ordinarily. In the next moment, you're getting seemingly random errors, hangs, crashes, the destination volume "disappearing" in the middle of a backup task, Finder lockups and other unruly behavior.

When hardware fails in this way, it's nearly impossible for the OS or CCC to pop up a dialog that says "Hey, it's time to replace XYZ!" Instead, you have to dig a little deeper, rule out components, try replacements, etc. to isolate the faulty component.

When hardware problems occur, CCC will often get meaningful errors from the macOS kernel that plainly indicate a hardware problem, and CCC will report these at the end of the backup task. In some cases, however, macOS or CCC will detect a hung filesystem and you will see one of the following messages from CCC:

"The backup task was aborted because the [source or destination] volume's mountpoint changed."

If you see this message, macOS's kernel recognized that the affected filesystem was not responding and terminated it. While this is obviously an abrupt end to your backup task, it beats the alternative macOS behavior described next.

"The backup task was aborted because the [source or destination] filesystem is not responding."

CCC will present this message when the source or destination volume hasn't accepted read or write activity in at least ten minutes, and a deliberate followup test verifies that a simple read or write request fails. In these cases, macOS's kernel has failed to take action on the misbehaving filesystem and you can expect to see hangs in any application that attempts to read from or write to the affected volume. To break the hang, the affected disk must be forcibly detached from your Mac or you must reboot by holding down the power button if the disk is internal.
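
The followup test described above can be pictured as a simple read request with a time limit. The following is a minimal sketch of that idea in Python, not CCC's actual implementation: the function name `volume_responds`, the directory listing used as the "simple read", and the default timeout are all assumptions made for illustration.

```python
import os
import threading


def volume_responds(mount_point: str, timeout_seconds: float = 10.0) -> bool:
    """Return True if a simple read of mount_point completes within the timeout.

    Hypothetical helper for illustration only; it is not part of CCC.
    """
    result = {"ok": False}

    def probe() -> None:
        try:
            os.listdir(mount_point)  # a trivial read request against the volume
            result["ok"] = True
        except OSError:
            result["ok"] = False  # the read request failed outright

    # Run the probe in a daemon thread: if the filesystem is hung, the call
    # may never return, and the stuck thread must not keep the program alive.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_seconds)

    # A probe that is still running after the timeout counts as "not responding".
    return result["ok"] and not worker.is_alive()
```

Calling `volume_responds("/Volumes/Backup")` (a placeholder path) returns False both when the read fails and when it never completes, which are the two outcomes that matter for a stalled filesystem.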

Troubleshooting steps

When CCC suggests that you might have a hardware problem, here are the steps that we recommend you take to isolate the problem. Repeat the backup task between each step, and stop if something has resolved the problem:

  1. If the affected volume resides on an external hard drive, detach that disk from your Mac, then reattach it. Otherwise, restart your Mac before proceeding. Note that this generally only resolves the acute problem of a filesystem hanging. While the disk may appear to function fine once it is reattached, the problems may well recur.
  2. Run Disk Utility's "Repair disk" tool on the source and destination volumes. Filesystem problems are commonplace, and easy to rule out. If you discover filesystem problems on your startup disk, boot from your CCC backup volume or Apple's Recovery volume to run Disk Utility so you can repair the problems.
  3. If you have any other hardware devices attached to your Mac (e.g. FireWire or USB webcams, printers, iPhones; anything other than a display, keyboard, mouse, and the source/destination disks), detach them. If your source or destination volume is plugged into a USB hub, keyboard, or display, reconnect it to one of your Mac's built-in ports.
  4. Replace the cable that you're using to connect the external hard drive enclosure to your Mac (if applicable).
  5. Try connecting the external hard drive enclosure to your Mac via a different interface (if applicable).
  6. Try the same hard drive in a different external hard drive enclosure.
  7. Reformat the hard drive in Disk Utility. If the affected disk is not an SSD, click the "Security Options" button in the Erase tab and drag the slider to the right to specify the option to write a single pass of zeroes. Writing zeroes to every sector will effectively detect and "spare out" any additional failing sectors that have yet to be discovered.
  8. If none of the previous steps has resolved the problem, then the hard drive is failing or defective. Replace the hard drive.
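
The list above is a process of elimination: change one thing, rerun the backup task, and stop at the first change that makes the problem go away. The sketch below expresses that loop in Python. Every step is still carried out by hand; the `backup_succeeds_after` callback is a hypothetical stand-in for "perform this step, then repeat the backup task", and the step labels are abbreviations of the list above.

```python
from typing import Callable, Optional, Sequence

# Abbreviated labels for steps 1 through 7 above.
STEPS = [
    "reattach disk or restart",
    "repair filesystems in Disk Utility",
    "detach other devices, use a built-in port",
    "replace cable",
    "try a different interface",
    "try a different enclosure",
    "reformat the disk",
]


def isolate_fault(
    steps: Sequence[str],
    backup_succeeds_after: Callable[[str], bool],
) -> Optional[str]:
    """Apply steps in order; return the first step after which the backup succeeds."""
    for step in steps:
        # The callback stands in for doing the step manually and rerunning the task.
        if backup_succeeds_after(step):
            return step
    # Nothing helped, which corresponds to step 8: replace the hard drive.
    return None
```

The value of working in this order is that each successful rerun points at exactly one component, because only one variable changed since the last failure.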

"Why does CCC eject the destination?" or "Why is CCC making my whole computer hang?"

We hear this one a lot, and we generally reply, "don't shoot the messenger." In most cases, CCC is either the only application copying files to the affected volume, or it is at least the application doing most of the access, so it only seems like the problem is specific to CCC. A typical backup task will make millions of filesystem requests, so it comes as no surprise to us when CCC uncovers hardware problems in a disk. CCC is merely copying files from one disk to another, and this is not the kind of task that should cause a system-wide hang. Whenever multiple applications are hanging while trying to access a volume, the fault lies entirely within the macOS kernel, which is mishandling hardware that is either failing or defective. If you're uncertain of this assessment, please send us a report from CCC's Help window. When CCC detects a hang or stalled filesystem, it collects diagnostic information to determine where the hang is occurring. We're happy to review the diagnostics and confirm or deny the presence of a hardware problem.

"But Disk Utility says that there is nothing wrong with the disk…"

Disk Utility is competent at detecting structural problems with the filesystem, but it can't necessarily detect hardware failures that can cause a filesystem to stop responding to read and write requests. Additionally, even if your disk is SMART capable and "Verified", the attributes that SMART status reports on are weighted, and may not yet indicate that the hardware is in a pre-fail condition. Don't take a "Verified" status to indicate that your disk has no hardware problems whatsoever.
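
If you want to read the SMART status outside of Disk Utility, the output of `diskutil info` includes a "SMART Status" line for disks that support it. The Python sketch below only parses that line from text; the function names are hypothetical, and `SAMPLE_OUTPUT` is an abbreviated, illustrative stand-in for real output rather than a capture from an actual disk.

```python
import re
from typing import Optional

# Illustrative, abbreviated sample; real `diskutil info` output has many more lines.
SAMPLE_OUTPUT = """
   Device Identifier:        disk0
   SMART Status:             Verified
"""


def smart_status(diskutil_info_text: str) -> Optional[str]:
    """Return the value of the "SMART Status" line, or None if there is none."""
    match = re.search(r"^\s*SMART Status:\s*(.+?)\s*$", diskutil_info_text, re.MULTILINE)
    return match.group(1) if match else None


def interpret_smart(status: Optional[str]) -> str:
    """Turn the raw status into the cautious reading recommended above."""
    if status is None:
        return "no SMART status reported"
    if status == "Verified":
        # "Verified" only means the drive has not reported a pre-fail condition.
        return "no pre-fail condition reported; hardware problems are still possible"
    return f"SMART reports {status!r}; treat the disk as suspect"
```

On a Mac, the text could be obtained with `subprocess.run(["diskutil", "info", "disk0"], capture_output=True, text=True).stdout`, where `disk0` is a placeholder device identifier.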

"But DiskWarrior/TechTool/[other third-party utility] says the hardware is fine, I'm sure the hardware is fine!"

There are no hardware diagnostic utilities on the market that will inform you of a problem with a cable, port, or enclosure, or report a bug in the firmware of a hard drive or SSD. The tools currently available on the Mac platform will inform you of software-based filesystem problems, media failure, and the results of SMART diagnostics which are specific to the hard drive device inside of an enclosure. While these tools are great at identifying the problems within that scope, the inability to detect problems with a cable, port, or enclosure, or a firmware bug on a hard drive, leaves a gaping hole that can only be filled with old-fashioned troubleshooting — isolate components, rule out variables, run multiple tests.

Other factors that can lead to hangs

Hardware is often the culprit when a backup task hangs, but sometimes other software can interfere with a backup task and even cause the whole system to hang. If you are using an external hard drive enclosure that came with custom software, try disabling or uninstalling that software before trying your next backup task. If a firmware update is available for your enclosure, try applying that as well to see if a problem with the enclosure has been resolved recently through a software update.

Related discussions:
Uninstalling Seagate diagnostic utilities alleviates hangs

Additionally, some hard drive enclosures respond poorly to sleep/wake events. If the problems that you are encountering tend to occur only after your system has slept and woken, you should try a different hard drive enclosure or interface to rule out enclosure-specific sleep problems.

Troubleshooting "Media errors"

Read errors are typically a result of media damage — some of the "sectors" on the hard drive have failed and macOS can no longer read data from them. Read errors can occur on the source or destination volume, and they can affect old disks as well as brand-new disks. When read errors occur, the file or files that are using the bad sector must be deleted. Bad sectors are "spared out" — permanently marked as unusable — only when the files on those sectors are deleted. The steps below indicate how to resolve media errors.

  1. Click on the affected item in the Task History window, then click on the "Reveal in Finder" button.
  2. Move the affected files and/or folders to the Trash.
  3. Empty the Trash.
  4. If you had to delete items from your source volume, locate those items on your backup volume and copy them back to the source (if desired).
  5. If CCC reported problems with more than a few files or folders, we strongly recommend that you reformat the affected disk in Disk Utility. If the affected disk is not an SSD, click the "Security Options" button in the Erase tab and drag the slider to the right to specify the option to write a single pass of zeroes. Writing zeroes to every sector will effectively detect and "spare out" any additional failing sectors that have yet to be discovered. If the affected disk is your startup disk, boot from your CCC bootable backup volume to perform this procedure (after you have allowed CCC to complete a backup).

Once you have deleted the affected files, you should be able to re-run your backup task with success.

Note: If you do not have a backup of the affected files, please scroll to the top of this document and exhaust the hardware-based troubleshooting techniques first. As indicated above, read errors are typically a result of media damage. In some rare cases, though, media errors can be erroneously reported when a hardware-based problem exists (e.g. a bad port, cable, or enclosure). If deleting your only copy of a file is the suggested resolution, then it's prudent to rule out everything else as the cause of an issue before deleting that file.

"But Disk Utility says that there is nothing wrong with the disk"

While it is generally a good practice to run Disk Utility's "Repair volume" utility when you run into problems with your disk, note that Disk Utility does not scan for bad sectors; it only checks the health of the filesystem. Additionally, the SMART status that is reported in Disk Utility will report "Verified" unless the drive is in a pre-fail condition, i.e. failure is imminent. Bad sectors will not be reported by Disk Utility.

Individual sector failures are not uncommon, and not necessarily indicative of imminent drive failure. A full-volume backup of your hard drive is a great method for detecting media problems with sectors that are in use because it requires reading data from each of those sectors. If you only see a handful of affected files, delete those files as described above and continue to use the disk. If you see dozens or hundreds of these errors, however, there could be a more significant problem at play, and it may be time to replace the disk.
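
The reason a full-volume backup surfaces bad sectors is simply that it reads every byte of every file. The Python sketch below does the same thing without copying anything: it reads each file under a folder and records the ones that fail. It is an illustration under stated assumptions, not how CCC performs its copy; the function name and the 1 MiB chunk size are arbitrary choices.

```python
import os
from typing import List, Tuple


def find_unreadable_files(root: str, chunk_size: int = 1024 * 1024) -> List[Tuple[str, str]]:
    """Read every regular file under root; return (path, error) for each failure."""
    failures: List[Tuple[str, str]] = []

    def record_walk_error(error: OSError) -> None:
        # Folders that cannot be listed are reported too, not silently skipped.
        failures.append((str(error.filename), str(error)))

    for folder, _subfolders, filenames in os.walk(root, onerror=record_walk_error):
        for name in filenames:
            path = os.path.join(folder, name)
            # Skip symlinks and special files; only regular file data is read.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    # Read the whole file in chunks and discard the data.
                    while handle.read(chunk_size):
                        pass
            except OSError as error:
                failures.append((path, str(error)))
    return failures
```

A short list of failures matches the "handful of affected files" case above; a long one suggests the more significant problem. Keep in mind that reading an entire volume places the same load on a failing disk as a backup does.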

Errors on read or write that are caused by physical drive malfunction

If your source or destination hard drive is experiencing a significant physical malfunction (errors that go beyond "input/output" read errors described above), you may have a narrow window of opportunity to back up the data from that disk to another hard drive. Time is precious; components could fail at any moment rendering the drive completely unmountable. Read activity is stressful on a dying volume, especially a full-volume backup. We recommend that you immediately back up the files that are most important to you. When you have backed up the most important data, next try to do a full-volume backup. When you have recovered as much data as possible, we recommend that you replace the affected hard drive.
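
The "most important files first" advice can be sketched as a copy loop that works through a list in priority order and keeps going when an individual file fails, so one unreadable file does not cost you the rest of the window. This is a minimal Python illustration, not a substitute for a backup task; the function name, the paths, and the numbered naming scheme for rescued files are assumptions.

```python
import os
import shutil
from typing import List, Sequence, Tuple


def rescue_files(priority_paths: Sequence[str], rescue_dir: str) -> List[Tuple[str, str]]:
    """Copy files in the given priority order; return (path, error) for failures."""
    os.makedirs(rescue_dir, exist_ok=True)
    errors: List[Tuple[str, str]] = []
    for index, source in enumerate(priority_paths):
        # Prefix with the priority index so files with the same name do not collide.
        target = os.path.join(rescue_dir, f"{index:04d}_{os.path.basename(source)}")
        try:
            shutil.copy2(source, target)  # copies file data and metadata
        except OSError as error:
            # Record the failure and move on to the next most important file.
            errors.append((source, str(error)))
    return errors
```

The rescue folder should be on a different, healthy disk. Once the priority list has been copied, the full-volume backup described above is the next step.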

What if the dying drive's volume won't mount?

More often than not, you're completely out of luck. You may be able to revive a hard drive for a short time by letting the drive cool down (somewhere cool and dry, not cold) and then powering it up attached to a service workstation (i.e. don't attempt to boot from it; you may not have enough time).