The user interface to smartmontools is the smartctl application. This application initiates tests on the disk drive and reports the status of the SMART device.
Since the smartctl application accesses the raw disk device, you must have root access to run it.
The -a option reports the full status of a device:
$ smartctl -a /dev/sda
The output will be a header of basic information, a set of raw data values and the test results. The header includes details about the drive being tested and a datestamp for this report:
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-642.11.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: WDC WD10EZEX-00BN5A0
Serial Number: WD-WCC3F1HHJ4T8
LU WWN Device Id: 5 0014ee 20c75fb3b
Firmware Version: 01.01A01
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ACS-2 (unknown minor revision code: 0x001f)
Local Time is: Mon Jan 23 11:26:57 2017 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
...
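A script can test the "SMART support is" lines of the header before relying on the rest of the report. The following is a minimal sketch, not part of smartmontools: smart_is_enabled is a hypothetical helper name, and it assumes the header wording shown in the sample output. The command in the comment uses smartctl's -i option (print the information section only) and -s on option (enable SMART); /dev/sda is just an example device:

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) if the smartctl header
# read from stdin reports that SMART support is enabled.
smart_is_enabled() {
    grep -q '^SMART support is:[[:space:]]*Enabled'
}

# Possible use, as root:
#   smartctl -i /dev/sda | smart_is_enabled || smartctl -s on /dev/sda
```

Reading from stdin keeps the helper independent of any particular device, so it can also be run against a saved report.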
The raw data values include error counts, spin-up time, power-on hours, and more. The last two columns (WHEN_FAILED and RAW_VALUE) are of particular interest. In the following sample, the device has been powered on for 9823 hours. It has been power-cycled 11 times (servers don't get power-cycled often), and the current temperature is 30 °C. When the power-on hours approach the manufacturer's Mean Time Between Failures (MTBF), it's time to consider replacing the drive or moving it to a less critical system. If the Power Cycle count increases between reboots, the drive lost power while the system stayed up, which could indicate a failing power supply or faulty cables. If the temperature gets high, check the drive's enclosure; a fan may have failed or a filter might be clogged:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   087   087   000    Old_age   Always       -       9823
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       11
194 Temperature_Celsius     0x0022   113   109   000    Old_age   Always       -       30
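The checks described above can be scripted by pulling the RAW_VALUE column out of the attribute table. This is a rough sketch under stated assumptions: each attribute occupies a single line with ten whitespace-separated columns (as smartctl prints it on a terminal), smart_raw_value and warn_if_hot are hypothetical helper names, and the 45-degree default limit is an arbitrary placeholder rather than a manufacturer figure. The -A option of smartctl prints only the attribute table:

```shell
#!/bin/bash
# Hypothetical helper: print the RAW_VALUE (10th column) of the attribute
# named in $1, reading a smartctl attribute table from stdin.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Hypothetical helper: print a warning if Temperature_Celsius in the table
# on stdin exceeds the limit in $1 (default 45, an assumed value).
warn_if_hot() {
    local limit=${1:-45} temp
    temp=$(smart_raw_value Temperature_Celsius)
    if [ -n "$temp" ] && [ "$temp" -gt "$limit" ]; then
        echo "WARNING: drive temperature ${temp} C exceeds ${limit} C"
    fi
}

# Possible use, as root (the device name is an example):
#   smartctl -A /dev/sda | smart_raw_value Power_On_Hours
#   smartctl -A /dev/sda | warn_if_hot 40
```

Comparing the Power_Cycle_Count value saved by an earlier run against the current one would be one way to notice the count rising between reboots.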
The last section of the output will be the results of the tests:
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      9825         -
The -t flag tells the SMART device to run a self-test. Self-tests are non-destructive and can be run while the drive is in service. SMART devices support a short test and a long test. A short test takes a few minutes, while a long test can take an hour or more on a large device:
$ smartctl -t [long|short] DEVICE
$ smartctl -t long /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-642.11.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 124 minutes for test to complete.
Test will complete after Mon Jan 23 13:31:23 2017
Use smartctl -X to abort test.
In a bit over two hours, this test will be completed and the results will be viewable with the smartctl -a command.
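Rather than watching the clock, a script can read the estimated duration from the "Please wait" line and check the self-test log once that time has passed. The sketch below assumes the message wording shown in the sample output; selftest_minutes is a hypothetical helper name. The -l selftest option of smartctl prints only the self-test log instead of the full -a report:

```shell
#!/bin/bash
# Hypothetical helper: print the estimated test duration in minutes,
# taken from the "Please wait N minutes" line of `smartctl -t` output on stdin.
selftest_minutes() {
    sed -n 's/.*Please wait \([0-9][0-9]*\) minutes.*/\1/p'
}

# Possible use, as root (the device name is an example):
#   mins=$(smartctl -t long /dev/sda | selftest_minutes)
#   sleep $(( (mins + 1) * 60 ))      # one extra minute of slack
#   smartctl -l selftest /dev/sda     # show only the self-test results
```

If the message is missing, for example because the drive rejected the command, the helper prints nothing, so a real script should check that mins is non-empty before sleeping.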