SMART error (OfflineUncorrectableSector)

Forum dedicated to the distribution of the same name, which you can download from http://www.contribs.org. The new version of this distribution is called SME Server.

Moderator: modos Ixus

Posted by shwing » 12 Feb 2010 13:06

Hello,

I installed the Monitor Disk Health contrib to check the health of my disks. Sure enough, I received an email telling me this -->

Subject: SMART error (OfflineUncorrectableSector) detected on host: sme
This email was generated by the smartd daemon running on:

host name: sme
DNS domain: tchoupi.no-ip.net
NIS domain: (none)

The following warning/error was logged by the smartd daemon:

Device: /dev/sda, 757 Offline uncorrectable sectors

For details see host's SYSLOG (default: /var/log/messages).

You can also use the smartctl utility for further investigation.
Another email message will be sent in 1 days if the problem persists


This command:
tail -f /var/log/messages | tai64nlocal | grep /dev/sd

tells me:
Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sda, 757 Offline uncorrectable sectors
Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sda, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 250 to 251
Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sda, SMART Usage Attribute: 199 UDMA_CRC_Error_Count changed from 197 to 194
Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 55 to 54
Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sdb, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 55 to 54
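
For anyone who would rather not read these lines by eye, here is a minimal sketch that extracts the attribute changes from smartd log lines like the ones above. The regular expression is written against the five lines quoted in this post only, so treat the exact pattern as an assumption; `parse_changes` and `SAMPLE` are illustrative names, not part of smartmontools. Note that the numbers smartd reports here are normalized values, where a drop generally means the attribute got worse.

```python
import re

# Assumption: the line layout matches the smartd messages quoted in this post.
CHANGE_RE = re.compile(
    r"Device: (?P<device>[^,\s]+), SMART (?P<kind>Prefailure|Usage) Attribute: "
    r"(?P<id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)

def parse_changes(lines):
    """Return one dict per 'attribute changed' line; other lines are skipped."""
    changes = []
    for line in lines:
        m = CHANGE_RE.search(line)
        if m is None:
            continue
        changes.append({
            "device": m.group("device"),
            "kind": m.group("kind"),
            "id": int(m.group("id")),
            "name": m.group("name"),
            # Negative delta: the normalized value dropped.
            "delta": int(m.group("new")) - int(m.group("old")),
        })
    return changes

# The five log lines from this post, copied verbatim.
SAMPLE = [
    "Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sda, 757 Offline uncorrectable sectors",
    "Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sda, SMART Prefailure Attribute: 8 Seek_Time_Performance changed from 250 to 251",
    "Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sda, SMART Usage Attribute: 199 UDMA_CRC_Error_Count changed from 197 to 194",
    "Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 55 to 54",
    "Feb 12 11:19:12 sme smartd[6638]: Device: /dev/sdb, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 55 to 54",
]

if __name__ == "__main__":
    for change in parse_changes(SAMPLE):
        print(change["device"], change["id"], change["name"], change["delta"])
```

The first line (the 757 offline uncorrectable sectors warning) is not an attribute change, so the sketch skips it and returns four entries.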


Without further ado, I ran the command
smartctl -a /dev/sda

which returns this:

smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     Maxtor 6Y160M0
Serial Number:    Y44MLZBE
Firmware Version: YAR51EW0
User Capacity:    163,928,604,672 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Fri Feb 12 11:06:27 2010 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)   Offline data collection activity
               was never started.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)   The previous self-test routine completed
               without error or no self-test has ever
               been run.
Total time to complete Offline
data collection:        ( 302) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               No General Purpose Logging support.
Short self-test routine
recommended polling time:     (   2) minutes.
Extended self-test routine
recommended polling time:     (  72) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   200   199   063    Pre-fail  Always       -       15237
  4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always       -       1619
  5 Reallocated_Sector_Ct   0x0033   251   162   063    Pre-fail  Always       -       24
  6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline      -       0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0027   251   244   187    Pre-fail  Always       -       37038
  9 Power_On_Minutes        0x0032   191   191   000    Old_age   Always       -       1000h+26m
 10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   248   248   000    Old_age   Always       -       2284
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0032   253   253   000    Old_age   Always       -       51
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always       -       7592
196 Reallocated_Event_Count 0x0008   001   001   000    Old_age   Offline      -       757
197 Current_Pending_Sector  0x0008   253   164   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0008   001   001   000    Old_age   Offline      -       757
199 UDMA_CRC_Error_Count    0x0008   198   191   000    Old_age   Offline      -       10
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   253   252   000    Old_age   Always       -       45
202 TA_Increase_Count       0x000a   253   239   000    Old_age   Always       -       0
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       6
204 Shock_Count_Write_Opern 0x000a   253   251   000    Old_age   Always       -       0
205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always       -       0
207 Spin_High_Current       0x002a   253   252   000    Old_age   Always       -       0
208 Spin_Buzz               0x002a   253   252   000    Old_age   Always       -       0
209 Offline_Seek_Performnce 0x0024   192   192   000    Old_age   Offline      -       0
 99 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
100 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
101 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
Warning: ATA error count 8249 inconsistent with error log pointer 5

ATA Error Count: 8249 (device log contains only the most recent five errors)
   CR = Command Register [HEX]
   FR = Features Register [HEX]
   SC = Sector Count Register [HEX]
   SN = Sector Number Register [HEX]
   CL = Cylinder Low Register [HEX]
   CH = Cylinder High Register [HEX]
   DH = Device/Head Register [HEX]
   DC = Device Command Register [HEX]
   ER = Error register [HEX]
   ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 8249 occurred at disk power-on lifetime: 18044 hours (751 days + 20 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 cd cf d2 e0  Error: ICRC, ABRT at LBA = 0x00d2cfcd = 13815757

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 cd cf d2 e0 00  28d+06:06:44.752  READ DMA EXT
  25 00 00 cd cd d2 e0 00  28d+06:06:44.752  READ DMA EXT
  25 00 00 cd c9 d2 e0 00  28d+06:06:44.736  READ DMA EXT
  25 00 00 cd c5 d2 e0 00  28d+06:06:44.736  READ DMA EXT
  25 00 80 4d c4 d2 e0 00  28d+06:06:44.736  READ DMA EXT

Error 8248 occurred at disk power-on lifetime: 12640 hours (526 days + 16 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 4d ae 7a e0  Error: ICRC, ABRT at LBA = 0x007aae4d = 8040013

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 4d ae 7a e0 00      00:13:51.008  READ DMA EXT
  25 00 00 4d aa 7a e0 00      00:13:50.992  READ DMA EXT
  25 00 80 cd a6 7a e0 00      00:13:50.992  READ DMA EXT
  25 00 00 cd a2 7a e0 00      00:13:50.992  READ DMA EXT
  25 00 00 cd 9e 7a e0 00      00:13:50.976  READ DMA EXT

Error 8247 occurred at disk power-on lifetime: 13072 hours (544 days + 16 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 e5 b7 04 e0  Error: ICRC, ABRT at LBA = 0x0004b7e5 = 309221

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 e5 b7 04 e0 00  21d+06:48:36.816  READ DMA
  c8 00 08 2d b4 04 e0 00  21d+06:48:36.800  READ DMA
  c8 00 08 15 f0 fa e0 00  21d+06:48:36.800  READ DMA
  c8 00 08 d5 af 04 e0 00  21d+06:48:36.800  READ DMA
  c8 00 10 ad af 04 e0 00  21d+06:48:36.800  READ DMA

Error 8246 occurred at disk power-on lifetime: 12580 hours (524 days + 4 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 dd 2f e3 ea  Error: UNC 8 sectors at LBA = 0x0ae32fdd = 182661085

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 dd 2f e3 ea 00  37d+19:56:58.976  READ DMA
  ec 03 46 00 00 00 a0 00  37d+19:56:58.960  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00  37d+19:56:58.960  SET FEATURES [Set transfer mode]
  ec 00 00 dd 2f e3 a0 00  37d+19:56:58.960  IDENTIFY DEVICE
  c8 00 08 dd 2f e3 ea 00  37d+19:56:57.952  READ DMA

Error 8245 occurred at disk power-on lifetime: 12580 hours (524 days + 4 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 dd 2f e3 ea  Error: UNC 8 sectors at LBA = 0x0ae32fdd = 182661085

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 dd 2f e3 ea 00  37d+19:56:57.952  READ DMA
  ec 03 46 00 00 00 a0 00  37d+19:56:57.952  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00  37d+19:56:57.952  SET FEATURES [Set transfer mode]
  ec 00 00 dd 2f e3 a0 00  37d+19:56:57.952  IDENTIFY DEVICE
  c8 00 08 dd 2f e3 ea 00  37d+19:56:56.944  READ DMA

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
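
To make the attribute table above easier to digest, here is a rough sketch that parses a few of its rows and flags the ones that look worrying. It rests on two assumptions that are mine and not smartctl's: a row is flagged when its normalized VALUE is at or below THRESH, and the IDs in `WATCH_IDS` (5, 196, 197, 198) are treated as sector-health counters whose raw value should ideally be zero. The function names are illustrative, and the sample rows are copied from the output above.

```python
# Assumption: these IDs are the sector-health counters worth watching.
WATCH_IDS = {5, 196, 197, 198}

def parse_row(line):
    """Split one attribute row into the columns of the smartctl table."""
    f = line.split(None, 9)
    return {
        "id": int(f[0]),
        "name": f[1],
        "value": int(f[3]),
        "worst": int(f[4]),
        "thresh": int(f[5]),
        "type": f[6],
        "raw": f[9].strip(),
    }

def raw_as_int(raw):
    """Return the raw value as an int, or None for values like '1000h+26m'."""
    try:
        return int(raw)
    except ValueError:
        return None

def review(rows):
    """Return (id, name, reason) tuples for rows that deserve a closer look."""
    findings = []
    for r in rows:
        if r["value"] <= r["thresh"]:
            findings.append((r["id"], r["name"], "value at or below threshold"))
        elif r["id"] in WATCH_IDS and (raw_as_int(r["raw"]) or 0) > 0:
            findings.append((r["id"], r["name"], "nonzero raw count " + r["raw"]))
    return findings

# A subset of the rows from the smartctl output in this post.
SAMPLE = """
  5 Reallocated_Sector_Ct   0x0033   251   162   063    Pre-fail  Always       -       24
  9 Power_On_Minutes        0x0032   191   191   000    Old_age   Always       -       1000h+26m
194 Temperature_Celsius     0x0032   253   253   000    Old_age   Always       -       51
196 Reallocated_Event_Count 0x0008   001   001   000    Old_age   Offline      -       757
197 Current_Pending_Sector  0x0008   253   164   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0008   001   001   000    Old_age   Offline      -       757
"""

if __name__ == "__main__":
    rows = [parse_row(l) for l in SAMPLE.splitlines() if l.strip()]
    for finding in review(rows):
        print(finding)
```

On these rows the sketch flags attributes 5, 196 and 198. None of them is at or below its threshold, which is consistent with the overall self-assessment still reading PASSED even though the raw counters are far from zero.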



Obviously I gather that it would be wise to consider replacing the disk, but is the situation really critical, or can I let it ride for a while? The logs show that this problem apparently did not start today (751 days + 20 hours). Is it serious, doctor?
I started reading this, but whoa, it gets too complicated for my very limited Linux knowledge. Any help or advice would be welcome.
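
For readers trying to make sense of the error log entries in the smartctl output, here is a small sketch that decodes the register dump of one entry. It assumes the 28-bit LBA layout (low nibble of DH, then CH, CL, SN), which reproduces the LBA values smartctl prints in this post, and it names only four ATA error register bits: ICRC (interface CRC error), UNC (uncorrectable data), IDNF (ID not found) and ABRT (command aborted). The function names and the `ERRORS` list are illustrative.

```python
# Subset of ATA error register bits, highest bit first.
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}

def decode_error(er):
    """Return the names of the known bits set in the error register."""
    return [name for bit, name in ERROR_BITS.items() if er & bit]

def decode_lba28(sn, cl, ch, dh):
    """Rebuild a 28-bit LBA from the SN, CL, CH and DH registers."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

def lifetime(hours):
    """Convert power-on hours into (days, remaining hours)."""
    return divmod(hours, 24)

# (error number, power-on hours, "ER ST SC SN CL CH DH") from this post.
ERRORS = [
    (8249, 18044, "84 51 00 cd cf d2 e0"),
    (8246, 12580, "40 51 08 dd 2f e3 ea"),
]

if __name__ == "__main__":
    for number, hours, regs in ERRORS:
        er, st, sc, sn, cl, ch, dh = (int(x, 16) for x in regs.split())
        days, rest = lifetime(hours)
        print(number, decode_error(er), decode_lba28(sn, cl, ch, dh),
              "%d days + %d hours" % (days, rest))
```

Error 8249 decodes to ICRC plus ABRT at LBA 13815757 after 751 days + 20 hours of power-on time, and error 8246 decodes to UNC at LBA 182661085 after 524 days + 4 hours. The "751 days" figure is therefore the drive's cumulative power-on time when the most recent logged error occurred, not the age of the problem itself.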
