Managing At-Risk Disks

When a disk is identified as at-risk, take action early to prevent data loss. The appropriate action depends on the severity and type of issue.

Identifying At-Risk Disks

At-risk disks can be found:

  • On the Fleet Health dashboard
  • Via the Disks at Risk page (from Health)
  • Through disk list filtering
  • In alert notifications

Investigating an At-Risk Disk

  1. Navigate to the disk's detail page
  2. Review the health score and risk factors
  3. Check the S.M.A.R.T. data for specific issues
  4. Review recent metrics for trends
  5. Check active alerts for the disk
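
The trend review in step 4 can also be done outside the UI. The following is a minimal sketch, assuming metric samples have been exported as (day, value) pairs. The export format, the function name, and the sample readings are all hypothetical and are not part of the product.

```python
from statistics import mean

def trend_per_day(samples):
    """Return the least-squares slope of value over time, in units per day.

    samples: list of (day_offset, value) pairs (assumed export format).
    """
    if len(samples) < 2:
        return 0.0  # not enough data to estimate a trend
    xs = [float(x) for x, _ in samples]
    ys = [float(y) for _, y in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    denom = sum((x - x_bar) ** 2 for x in xs)
    if denom == 0:
        return 0.0  # all samples share one timestamp
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / denom

# Illustrative reallocated-sector readings over five days (made-up values).
readings = [(0, 4), (1, 4), (2, 6), (3, 9), (4, 12)]
slope = trend_per_day(readings)
```

A positive slope on a counter such as reallocated sectors means the count is still growing, which is usually more telling than any single reading.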

Common Risk Indicators

For HDDs:

  • Increasing reallocated sector counts
  • High temperature readings
  • Growing pending sector counts
  • Spin retry failures

For SSDs:

  • High wear level percentages
  • Decreasing available reserved space
  • Media wearout indicators
  • Elevated error rates
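
As a rough illustration, the indicators above can be checked by comparing two snapshots of a disk's attributes. This is a sketch under stated assumptions: the attribute names are illustrative (real S.M.A.R.T. attribute names vary by vendor and tool), and the 50 °C and 80% thresholds are placeholders, not product defaults.

```python
# Illustrative attribute names; real S.M.A.R.T. names differ by vendor and tool.
HDD_GROWING = ("reallocated_sectors", "pending_sectors", "spin_retry_count")

def hdd_risk_indicators(previous, current, max_temp_c=50):
    """Compare two HDD snapshots (dicts) and list the indicators present.

    max_temp_c is an assumed placeholder threshold.
    """
    found = []
    for name in HDD_GROWING:
        if current.get(name, 0) > previous.get(name, 0):
            found.append(f"{name} increased")
    if current.get("temperature_c", 0) > max_temp_c:
        found.append("high temperature")
    return found

def ssd_risk_indicators(previous, current, max_wear_pct=80):
    """Compare two SSD snapshots (dicts) and list the indicators present.

    max_wear_pct is an assumed placeholder threshold.
    """
    found = []
    if current.get("wear_level_pct", 0) >= max_wear_pct:
        found.append("high wear level")
    if current.get("available_reserve_pct", 100) < previous.get("available_reserve_pct", 100):
        found.append("available reserved space decreased")
    if current.get("error_count", 0) > previous.get("error_count", 0):
        found.append("error count increased")
    return found

# Made-up snapshots for demonstration.
hdd_before = {"reallocated_sectors": 8, "pending_sectors": 0, "temperature_c": 41}
hdd_after = {"reallocated_sectors": 16, "pending_sectors": 2, "temperature_c": 43}
hdd_found = hdd_risk_indicators(hdd_before, hdd_after)

ssd_before = {"wear_level_pct": 78, "available_reserve_pct": 100, "error_count": 0}
ssd_after = {"wear_level_pct": 82, "available_reserve_pct": 97, "error_count": 0}
ssd_found = ssd_risk_indicators(ssd_before, ssd_after)
```

Comparing snapshots rather than single readings matches the wording of the indicators, most of which describe change over time ("increasing", "growing", "decreasing").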

Recommended Actions

The right response depends on the severity and type of issue:

Monitoring Level:

  • Continue observing the disk
  • Review trends regularly
  • Plan for potential replacement

At-Risk Level:

  • Ensure backups are current
  • Order replacement hardware
  • Schedule replacement during a maintenance window

Critical Level:

  • Verify backups immediately
  • Prepare for urgent replacement
  • Consider taking the disk offline if its data is already backed up
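
For teams that script their response, the three levels above can be encoded as a simple lookup. This is a minimal sketch: the level keys and the function name are assumptions made for illustration and do not correspond to a product API.

```python
# Actions per level, taken from the lists above. The keys are assumed labels.
RECOMMENDED_ACTIONS = {
    "monitoring": [
        "Continue observing the disk",
        "Review trends regularly",
        "Plan for potential replacement",
    ],
    "at-risk": [
        "Ensure backups are current",
        "Order replacement hardware",
        "Schedule replacement during a maintenance window",
    ],
    "critical": [
        "Verify backups immediately",
        "Prepare for urgent replacement",
        "Consider taking the disk offline if its data is already backed up",
    ],
}

def actions_for(level):
    """Return the checklist for a level, ignoring case and surrounding spaces."""
    key = level.strip().lower()
    if key not in RECOMMENDED_ACTIONS:
        raise ValueError(f"unknown level: {level!r}")
    return RECOMMENDED_ACTIONS[key]

checklist = actions_for("Critical")
```

Raising an error for an unknown level, rather than returning an empty list, keeps a mistyped level from silently producing no actions.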

Documentation

Keep records of:

  • When issues were first detected
  • Actions taken
  • Replacement dates and details

This information is valuable for warranty claims and future planning.
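
One way to keep these records consistent is a small structured entry per disk, serialized to JSON. The sketch below is illustrative only: the class name, field names, disk ID, and dates are hypothetical and should be adapted to whatever your team already tracks.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import date
from typing import Optional

@dataclass
class DiskIncidentRecord:
    """One record per at-risk disk (hypothetical structure)."""
    disk_id: str
    first_detected: date
    actions_taken: list = field(default_factory=list)
    replaced_on: Optional[date] = None
    replacement_details: str = ""

    def to_json(self):
        """Serialize to a JSON string with ISO 8601 dates."""
        data = asdict(self)
        data["first_detected"] = self.first_detected.isoformat()
        data["replaced_on"] = self.replaced_on.isoformat() if self.replaced_on else None
        return json.dumps(data, sort_keys=True)

# Made-up example values.
record = DiskIncidentRecord(disk_id="disk-0042", first_detected=date(2024, 3, 1))
record.actions_taken.append("Verified backups")
line = record.to_json()
```

Appending one such line per event to a log file gives a dated history that is easy to search when filing a warranty claim.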