How to calculate LBA offsets on a 3ware 9650SE-16PML

Post date: Dec 31, 2011 2:00:38 PM

Info

GOAL: To document how to find a bad hard drive that is responsible for controller resets. When a controller reset occurs, the controller may not tell you which disk is bad, so you have to track it down manually.

NOTES: Many of these steps were taken from LSI's (formerly 3ware) excellent technical support in a case that I opened with them.

REQUIREMENTS: A shell, tw_cli.

Specifications

Distribution: Debian Testing

Architecture: 64-bit

Outline

1. How to find the root cause of a controller reset, in this specific case.

Setup

1. 3ware 9650SE-16PML RAID controller

2. 16x1TB disks (15x1TB in a RAID-6, the 16th disk is a hot spare)

The Problem

I saw this in the logs:

Mar 21 05:40:03 p34 kernel: [521953.433965] sd 0:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting card.
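If you are not sure when or how often the resets have happened, the kernel log can be searched for the 3ware timeout message (a quick sketch; log locations vary with your syslog configuration):

# grep 'timed out, resetting card' /var/log/syslog /var/log/kern.log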

Investigation

After a preliminary investigation, I opened a case with 3ware for a root cause analysis. They stated that Drive02 was bad. However, tw_cli was showing all disks as OK, and the SMART data was all good too.

In Google's "Failure Trends in a Large Drive Population" [1], they state "Out of all failed drives, over 56% of them have no count in any of the four strong SMART signals, namely scan errors, reallocation count, offline reallocation, and probational count. In other words, models based only on those signals can never predict more than half of the failed drives."

Check the RAID health: Looks good.

# tw_cli info c0

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy

------------------------------------------------------------------------------

u0 RAID-6 OK - - 64K 12107.1 RiW ON

u1 SPARE OK - - - 931.505 - ON

VPort Status Unit Size Type Phy Encl-Slot Model

------------------------------------------------------------------------------

p0 OK u0 931.51 GB SATA 0 - WDC WD1002FBYS-01A6

p1 OK u0 931.51 GB SATA 1 - WDC WD1002FBYS-01A6

p2 OK u0 931.51 GB SATA 2 - WDC WD1002FBYS-01A6

p3 OK u0 931.51 GB SATA 3 - WDC WD1002FBYS-01A6

p4 OK u0 931.51 GB SATA 4 - WDC WD1002FBYS-01A6

p5 OK u0 931.51 GB SATA 5 - WDC WD1002FBYS-01A6

p6 OK u0 931.51 GB SATA 6 - WDC WD1002FBYS-01A6

p7 OK u0 931.51 GB SATA 7 - WDC WD1002FBYS-01A6

p8 OK u0 931.51 GB SATA 8 - WDC WD1002FBYS-01A6

p9 OK u0 931.51 GB SATA 9 - WDC WD1002FBYS-01A6

p10 OK u0 931.51 GB SATA 10 - WDC WD1002FBYS-01A6

p11 OK u0 931.51 GB SATA 11 - WDC WD1002FBYS-01A6

p12 OK u0 931.51 GB SATA 12 - WDC WD1002FBYS-01A6

p13 OK u0 931.51 GB SATA 13 - WDC WD1002FBYS-01A6

p14 OK u0 931.51 GB SATA 14 - WDC WD1002FBYS-01A6

p15 OK u1 931.51 GB SATA 15 - WDC WD1002FBYS-01A6

Check the disk's SMART data (how the bad disk was identified is covered further below): also looks good.
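SMART data for an individual disk behind the controller can be pulled with smartctl (from smartmontools, not listed in the requirements above); the disks are addressed through the 3ware device rather than /dev/sdX, for example (assuming the card shows up as /dev/twa0 and we want port 2):

# smartctl -a -d 3ware,2 /dev/twa0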

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0

3 Spin_Up_Time 0x0027 253 253 021 Pre-fail Always - 1175

4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 74

5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0

7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0

9 Power_On_Hours 0x0032 100 095 000 Old_age Always - 601

10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0

11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0

12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 74

192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 73

193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 74

194 Temperature_Celsius 0x0022 115 112 000 Old_age Always - 35

196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0

198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0

199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0

200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

SMART Error Log Version: 1

No Errors Logged

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Extended offline Completed without error 00% 598 -

# 2 Conveyance offline Completed without error 00% 595 -

# 3 Short offline Completed without error 00% 583 -

# 4 Extended offline Completed without error 00% 563 -

# 5 Short offline Completed without error 00% 537 -

# 6 Short offline Completed without error 00% 512 -

# 7 Short offline Completed without error 00% 488 -

# 8 Short offline Completed without error 00% 464 -

# 9 Short offline Completed without error 00% 440 -

#10 Short offline Completed without error 00% 416 -

1. Look for the following messages after the controller reset and find the bad LBAs

Update (03/24/2010)

The engineer had overlooked the error= field; if it had read error=0x01, it would have pointed to a drive issue. Currently we are looking at the BBU module on the controller itself. It has been removed and I am re-testing to see if I can re-create the problem. However, I am keeping the rest of this post online because it does show how to find a bad disk in a 3ware RAID array; just make sure the log lines do not say error=0x0 before blaming a drive.
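A quick way to check whether any WriteSegment entries carry a non-zero error code (a sketch, assuming the diagnostic dump has been saved as 3ware/Controller_C0.txt as described below):

# grep 'WriteSegment' 3ware/Controller_C0.txt | grep -v 'error=0x0)'

If nothing comes back, every entry reported error=0x0 and the drives are probably not the culprit.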

Update (03/25/2010)

In this specific instance, the BBU (which attaches to the 3ware controller) failed. After disconnecting the BBU and continuing to pound on the RAID for 24 hours, there have been no further issues. When I asked 3ware if it was common for the BBU to fail, they responded: "BBU is a consumable product that does wear out over time and eventual failure is inevitable."

The command you need to get these logs is shown below:

# tw_cli /c0 show diag

You can also use the lsigetlunix.sh script provided by LSI to gather all of the logs for you.
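If you run the diag command by hand instead of using the lsiget script, redirect the output to a file so it can be grepped later (the path below simply mirrors the layout the lsiget bundle produced in my case):

# mkdir -p 3ware
# tw_cli /c0 show diag > 3ware/Controller_C0.txt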

This is what we are looking for after the controller reset occurred:

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x8, events=30, error=0x0)

# grep -A10000 'Soft Reset Handler Started ...' 3ware/Controller_C0.txt |grep map=

DcbMgr::WriteSegment(map=0x4BCA04, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA04, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA2C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA2C, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA2C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA2C, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA04, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA04, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA2C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA2C, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4C4F4C, segID=0x1, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA2C, segID=0x8, events=30, error=0x0)

DcbMgr::WriteSegment(map=0x4BCA2C, segID=0x1, events=30, error=0x0)

# grep 0x 3ware/Controller_C0.txt |grep map= | awk '{print $1}'|cut -f2- -d'('| cut -f1 -d','|sort|uniq

map=0x4BCA04 (events=30)

map=0x4BCA2C (events=30)

map=0x4C4F4C (events=30)

map=0x4CC32C (events=2)

The LSI engineer stated that one of the drives is having problems with write requests and that these are the bad offsets. They are hexadecimal and must be converted to decimal first.

2. Convert the offsets to decimal

Apple's website [2] shows a quick way to convert from hexadecimal to decimal using printf:

printf "%d\n" 0x4BCA2C

4966956

So let's get them all:

for lba in 0x4BCA04 0x4BCA2C 0x4C4F4C 0x4CC32C; do printf "%d=$lba\n" $lba; done

4966916=0x4BCA04

4966956=0x4BCA2C

5001036=0x4C4F4C

5030700=0x4CC32C

3. Divide by the chunk size of the array

I use 64KiB as my stripe size; the LSI engineer stated that each bad LBA (now in decimal) needs to be divided by 64000.
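For example, with bc (a minimal sketch; the original command was not recorded, and scale=20 simply matches the precision shown below):

for lba in 4966916 4966956 5001036 5030700; do echo "scale=20; $lba/64000" | bc; done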

This results in the following output:

77.60806250000000000000

77.60868750000000000000

78.14118750000000000000

78.60468750000000000000

4. Now you need to loop over the drives and find the affected drive

It took me a little while to understand what the engineer was saying; once I wrote it down, it made sense. What he explained to me was, "we count from 1-15 and subtract the offset(s) later."

Details:

There are 16 drives total, but only 15 in the array; the 16th is a hot spare.

Subtract 1 for the offset: the array starts at 0 and we are counting from 1.

Subtract 1 because of the hot spare.

5. Loop the drives.


Now we will "loop through the drives"

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 ->1  LOOP 1
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 ->2  LOOP 2
31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 ->3  LOOP 3
46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ->4  LOOP 4
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 ->5  LOOP 5
76 77 78 79 80 81 82 83 84 85 86 87 88 89 ..
   77 <- Here is the problem (offsets=77.6/77.60)
   78 <- Here is the problem (offsets=78.14/78.60)

You would continue if the LBA offsets were higher.

6. Subtract the offsets from earlier

So from above we take "LOOP 5" and subtract the offsets:

-1 since we count from 1 and not 0

-1 since we have a hot spare

So: 5-2 = 3

Counting from 0 we get Drive02:

Drive00 <- Drive1 in system

Drive01 <- Drive2 in system

Drive02 <- Drive3 in system (or p2) as shown below:

# tw_cli info c0

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy

------------------------------------------------------------------------------

u0 RAID-6 OK - - 64K 12107.1 RiW ON

u1 SPARE OK - - - 931.505 - ON

VPort Status Unit Size Type Phy Encl-Slot Model

------------------------------------------------------------------------------

p0 OK u0 931.51 GB SATA 0 - WDC WD1002FBYS-01A6

p1 OK u0 931.51 GB SATA 1 - WDC WD1002FBYS-01A6

p2 OK u0 931.51 GB SATA 2 - WDC WD1002FBYS-01A6

p3 OK u0 931.51 GB SATA 3 - WDC WD1002FBYS-01A6

p4 OK u0 931.51 GB SATA 4 - WDC WD1002FBYS-01A6

p5 OK u0 931.51 GB SATA 5 - WDC WD1002FBYS-01A6

p6 OK u0 931.51 GB SATA 6 - WDC WD1002FBYS-01A6

p7 OK u0 931.51 GB SATA 7 - WDC WD1002FBYS-01A6

p8 OK u0 931.51 GB SATA 8 - WDC WD1002FBYS-01A6

p9 OK u0 931.51 GB SATA 9 - WDC WD1002FBYS-01A6

p10 OK u0 931.51 GB SATA 10 - WDC WD1002FBYS-01A6

p11 OK u0 931.51 GB SATA 11 - WDC WD1002FBYS-01A6

p12 OK u0 931.51 GB SATA 12 - WDC WD1002FBYS-01A6

p13 OK u0 931.51 GB SATA 13 - WDC WD1002FBYS-01A6

p14 OK u0 931.51 GB SATA 14 - WDC WD1002FBYS-01A6

p15 OK u1 931.51 GB SATA 15 - WDC WD1002FBYS-01A6

Drive02 is the disk that needs to be replaced.

7. Export the disk and allow the array to rebuild to the spare

# tw_cli /c0/p2 export

Removing /c0/p2 will take the disk offline.

Do you want to continue ? Y|N [N]: Y

Removing port /c0/p2 ... Done.

8. Allow the array to rebuild and then replace the bad disk

# tw_cli info c0

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy

------------------------------------------------------------------------------

u0 RAID-6 REBUILDING 45%(A) - 64K 12107.1 RiW ON

VPort Status Unit Size Type Phy Encl-Slot Model

------------------------------------------------------------------------------

p0 OK u0 931.51 GB SATA 0 - WDC WD1002FBYS-01A6

p1 OK u0 931.51 GB SATA 1 - WDC WD1002FBYS-01A6

p3 OK u0 931.51 GB SATA 3 - WDC WD1002FBYS-01A6

p4 OK u0 931.51 GB SATA 4 - WDC WD1002FBYS-01A6

p5 OK u0 931.51 GB SATA 5 - WDC WD1002FBYS-01A6

p6 OK u0 931.51 GB SATA 6 - WDC WD1002FBYS-01A6

p7 OK u0 931.51 GB SATA 7 - WDC WD1002FBYS-01A6

p8 OK u0 931.51 GB SATA 8 - WDC WD1002FBYS-01A6

p9 OK u0 931.51 GB SATA 9 - WDC WD1002FBYS-01A6

p10 OK u0 931.51 GB SATA 10 - WDC WD1002FBYS-01A6

p11 OK u0 931.51 GB SATA 11 - WDC WD1002FBYS-01A6

p12 OK u0 931.51 GB SATA 12 - WDC WD1002FBYS-01A6

p13 OK u0 931.51 GB SATA 13 - WDC WD1002FBYS-01A6

p14 OK u0 931.51 GB SATA 14 - WDC WD1002FBYS-01A6

p15 DEGRADED u0 931.51 GB SATA 15 - WDC WD1002FBYS-01A6

Name OnlineState BBUReady Status Volt Temp Hours LastCapTest

---------------------------------------------------------------------------

bbu On Yes OK OK OK 255 12-Dec-2009
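To keep an eye on the rebuild without re-running the command by hand, and to make the controller pick up the replacement disk once the bad one has been physically swapped, something like the following should work (from memory of the tw_cli guide; verify against your firmware's documentation):

# watch -n 60 tw_cli /c0/u0 show
# tw_cli /c0 rescan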

URLs

[1] http://labs.google.com/papers/disk_failures.pdf

[2] http://support.apple.com/kb/HT3392?viewlocale=en_US