Replacing a Failed Hard Disk on a Sun SPARC Server

Reader submission 1790 2022-09-20

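A while ago the disk indicator on our Sun server lit up in alarm, and inspection showed that a hard disk had failed. The machine has long been out of warranty and we now maintain it ourselves. I searched online extensively but found very little usable information, so I worked through the replacement myself and am writing it up as a reference for colleagues who run into a similar fault.

I. System Overview

Operating system: Solaris 10

File system: ZFS

Storage pool: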

bash-3.2# zpool status
  pool: rpool
 state: ONLINE
  scan: resilvered 144G in 2h10m with 0 errors on Mon Dec 11 13:56:43 2017
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0

II. Fault Description

One disk in the storage pool rpool has failed and needs to be replaced:

bash-3.2# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: resilvered 197G in 2h24m with 0 errors on Fri May 12 19:15:56 2017
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c0t0d0s0  FAULTED      9   618     0  too many errors
            c0t1d0s0  ONLINE       0     0     0

errors: No known data errors
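On a host with many pools it can help to pull the failed device out of the status text automatically. The following is only a minimal sketch: the function name `failed_devices` is made up for illustration, and it assumes the column layout of `zpool status` shown above and an awk that supports extended regular expressions (on Solaris 10 that probably means `nawk` rather than the default `awk`).

```shell
#!/bin/sh
# failed_devices (hypothetical helper): read `zpool status` text on stdin and
# print "<device> <state>" for every row in the config table whose state is
# FAULTED, OFFLINE, UNAVAIL or REMOVED.
failed_devices() {
    awk '
        # The device table starts after the NAME/STATE header line...
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # ...and ends at the "errors:" line of the same pool.
        $1 == "errors:"               { in_config = 0 }
        in_config && $2 ~ /^(FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ { print $1, $2 }
    '
}

# Intended use on the server (not executed here):
#   zpool status rpool | failed_devices
```

Fed the DEGRADED output above, the function should report only the c0t0d0s0 row, because the pool and mirror rows are DEGRADED rather than in one of the listed states.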

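III. Procedure

1. Identify the disk

If the failed disk cannot be identified by its appearance during the swap, generate a 10 GB file with the following command: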

bash-3.2# mkfile 10G file1

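One of the two disks has already failed, so while the 10 GB file is being written, the activity LED on the healthy disk blinks; the disk whose LED does not blink is the failed one.

2. Identify the boot disk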

bash-3.2# prtconf -vp | grep -i bootpath
        bootpath:  '/pci@0,600000/pci@0/scsi@1/disk@0,0:a'

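The boot disk is disk0. Before swapping the disk, switch booting over to the other disk, disk1: bring the operating system down to the ok prompt and boot from disk1: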

{0} ok boot disk1
(Boot from disk1; replace the failed disk only after the system has come up on disk1.)
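To read the target number out of the bootpath without squinting at it, a small filter like the one below may help. It is a sketch only: `boot_target` is a hypothetical name, and it assumes the `disk@<target>,<lun>:<slice letter>` form seen in the output above (a WWN-style path would simply produce no output). Judging by the device paths in the format listing later in this article, `disk@0,0` corresponds to c0t0d0, and slice letter `a` corresponds to slice 0.

```shell
#!/bin/sh
# boot_target (hypothetical helper): read `prtconf -vp` text on stdin and
# print the target, LUN and slice letter found in the bootpath line.
boot_target() {
    sed -n 's/.*bootpath:.*disk@\([0-9a-f]*\),\([0-9a-f]*\):\([a-h]\).*/target=\1 lun=\2 slice=\3/p'
}

# Intended use on the server (not executed here):
#   prtconf -vp | boot_target
```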

3. Replace the disk

bash-3.2# zpool offline rpool c0t0d0s0
bash-3.2# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 2h7m with 0 errors on Thu Jan 24 20:31:42 2019
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c0t0d0s0  OFFLINE      0     0     0
            c0t1d0s0  ONLINE       0     0     0

errors: No known data errors
bash-3.2# df -h
Filesystem                   size   used  avail capacity  Mounted on
rpool/ROOT/s10s_u11wos_24a   274G   109G   145G    43%    /
/devices                       0K     0K     0K     0%    /devices
ctfs                           0K     0K     0K     0%    /system/contract
proc                           0K     0K     0K     0%    /proc
mnttab                         0K     0K     0K     0%    /etc/mnttab
swap                          92G   448K    92G     1%    /etc/svc/volatile
objfs                          0K     0K     0K     0%    /system/object
sharefs                        0K     0K     0K     0%    /etc/dfs/sharetab
fd                             0K     0K     0K     0%    /dev/fd
swap                          92G    64K    92G     1%    /tmp
swap                          92G    72K    92G     1%    /var/run
rpool/export                 274G    34K   145G     1%    /export
rpool/export/home            274G    13M   145G     1%    /export/home
rpool                        274G   106K   145G     1%    /rpool

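Pull out the old disk and insert the new one.

4. Create the disk partition table

To mirror the system disk, the partition tables of the two disks must be made identical.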

bash-3.2# devfsadm -C
bash-3.2# format
Searching for disks...done

c0t0d0: configured with capacity of 279.38GB

AVAILABLE DISK SELECTIONS:
       0. c0t0d0
          /pci@0,600000/pci@0/scsi@1/sd@0,0
       1. c0t1d0
          /pci@0,600000/pci@0/scsi@1/sd@1,0
       2. c3t60060E8005638900000063890000000Cd0
          /scsi_vhci/ssd@g60060e8005638900000063890000000c
       3. c3t60060E80056389000000638900000000d0
          /scsi_vhci/ssd@g60060e80056389000000638900000000
       4. c3t60060E80056389000000638900000008d0 oracle
          /scsi_vhci/ssd@g60060e80056389000000638900000008
       5. c3t600507640081002FC0000000000000FCd0
          /scsi_vhci/ssd@g600507640081002fc0000000000000fc
Specify disk (enter its number): 0
selecting c0t0d0
[disk formatted]
Disk not labeled.  Label it now? y
# A new disk must be labeled; disks larger than 2 TB can take an EFI label (labeling destroys the data on the disk)

FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> p

PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        !<cmd> - execute <cmd>, then return
        quit
partition> p
Current partition table (original):
Total disk cylinders available: 46873 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -    20      128.17MB    (21/0/0)         262500
  1       swap    wu      21 -    41      128.17MB    (21/0/0)         262500
  2     backup    wu       0 - 46872      279.38GB    (46873/0/0)   585912500
  3 unassigned    wm       0                0         (0/0/0)               0
  4 unassigned    wm       0                0         (0/0/0)               0
  5 unassigned    wm       0                0         (0/0/0)               0
  6        usr    wm      42 - 46872      279.13GB    (46831/0/0)   585387500
  7 unassigned    wm       0                0         (0/0/0)               0

The above is the partition layout of the new disk "c0t0d0". For comparison, this is the partition layout of the healthy disk "c0t1d0":

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 - 46872      279.38GB    (46873/0/0)   585912500
  1 unassigned    wu       0                0         (0/0/0)               0
  2     backup    wm       0 - 46872      279.38GB    (46873/0/0)   585912500
  3 unassigned    wu       0                0         (0/0/0)               0
  4 unassigned    wu       0                0         (0/0/0)               0
  5 unassigned    wu       0                0         (0/0/0)               0
  6 unassigned    wu       0                0         (0/0/0)               0
  7 unassigned    wu       0                0         (0/0/0)               0

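The comparison shows that the partition layout of the new disk "c0t0d0" differs from that of the old disk "c0t1d0". Because the two disks mirror each other, the new disk "c0t0d0" must be partitioned the same way as "c0t1d0" before any data is synchronized. As shown below, the partition table is copied so that the two match exactly. Note that the device names end in slice "s2" (slice 2 represents the entire disk on a disk with a VTOC label).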

bash-3.2# prtvtoc /dev/rdsk/c0t1d0s2 | fmthard -s - /dev/rdsk/c0t0d0s2    # c0t0d0s2 is the new disk
fmthard:  New volume table of contents now in place.

# The command above copies the partition table of "c0t1d0s2" onto "c0t0d0s2"

bash-3.2# format    # the partition table of c0t0d0s2 now reads as follows
Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 - 46872      279.38GB    (46873/0/0)   585912500
  1 unassigned    wu       0                0         (0/0/0)               0
  2     backup    wm       0 - 46872      279.38GB    (46873/0/0)   585912500
  3 unassigned    wu       0                0         (0/0/0)               0
  4 unassigned    wu       0                0         (0/0/0)               0
  5 unassigned    wu       0                0         (0/0/0)               0
  6 unassigned    wu       0                0         (0/0/0)               0
  7 unassigned    wu       0                0         (0/0/0)               0
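Rather than comparing the two tables by eye in format, the copy can also be checked by diffing saved prtvtoc output. The sketch below is one possible way to do that; the helper names `vtoc_slices` and `same_vtoc` and the /tmp file names are invented for illustration, and it relies on prtvtoc marking its header lines with a leading `*`.

```shell
#!/bin/sh
# vtoc_slices (hypothetical helper): print only the slice rows of a saved
# prtvtoc dump, dropping "*" comment lines and squeezing whitespace so that
# two dumps can be compared as plain text.
vtoc_slices() {
    grep -v '^\*' "$1" | awk 'NF { $1 = $1; print }'
}

# same_vtoc (hypothetical helper): succeed when both dumps describe the
# same slices (same tags, flags, start sectors and sizes).
same_vtoc() {
    a=$(vtoc_slices "$1")
    b=$(vtoc_slices "$2")
    [ "$a" = "$b" ]
}

# Intended use on the server (not executed here):
#   prtvtoc /dev/rdsk/c0t1d0s2 > /tmp/good.vtoc
#   prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/new.vtoc
#   same_vtoc /tmp/good.vtoc /tmp/new.vtoc && echo "partition tables match"
```

Comparing only the slice rows avoids false differences from the header comments, which include the device path and therefore always differ between the two disks.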

5. Mirror the disk (data synchronization)

At this point only one disk in the system is working normally:

bash-3.2# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: resilvered 197G in 2h24m with 0 errors on Fri May 12 19:15:56 2017
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c0t0d0s0  FAULTED      9   618     0  too many errors
            c0t1d0s0  ONLINE       0     0     0

errors: No known data errors

Add the newly installed disk to the storage pool to rebuild the mirror; the data copy (resilver) starts immediately:

bash-3.2# zpool replace rpool c0t0d0s0
Make sure to wait until resilver is done before rebooting.
bash-3.2# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Dec 11 11:46:11 2017
    12.4M scanned out of 144G at 906K/s, 46h25m to go
    12.2M resilvered, 0.01% done
config:

        NAME                STATE     READ WRITE CKSUM
        rpool               DEGRADED     0     0     0
          mirror-0          DEGRADED     0     0     0
            replacing-0     DEGRADED     0     0     0
              c0t0d0s0/old  FAULTED      9   618     0  too many errors
              c0t0d0s0      ONLINE       0     0     0  (resilvering)
            c0t1d0s0        ONLINE       0     0     0

errors: No known data errors
bash-3.2# iostat -xn 3
                    extended device statistics
    r/s    w/s     kr/s     kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0    4.8     39.9     79.3  0.0  0.2    0.0   28.1   0   3 c0t0d0
    1.5    6.7     35.0    114.8  0.0  0.2    0.0   23.5   0   3 c0t1d0
    0.0    0.0      0.7      0.0  0.0  0.0    0.0    2.8   0   0 c3t60060E8005638900000063890000000Cd0
    2.0    1.0      1.0      0.5  0.0  0.0    0.0    0.3   0   0 c3t60060E80056389000000638900000008d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.2   0   0 c3t60060E80056389000000638900000000d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t600507640081002FC0000000000000FCd0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 rdms02b:vold(pid523)
                    extended device statistics
    r/s    w/s     kr/s     kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  358.8      0.0  10546.3  0.0  8.4    0.0   23.3   0  98 c0t0d0
  475.3   37.0  10446.3    392.8  0.0  2.2    0.0    4.2   0  56 c0t1d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t60060E8005638900000063890000000Cd0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t60060E80056389000000638900000008d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t60060E80056389000000638900000000d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t600507640081002FC0000000000000FCd0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 rdms02b:vold(pid523)
                    extended device statistics
    r/s    w/s     kr/s     kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  405.2      0.0  11338.2  0.0  9.1    0.0   22.4   0 100 c0t0d0
  647.6    0.0  11317.5      0.0  0.0  1.4    0.0    2.1   0  44 c0t1d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t60060E8005638900000063890000000Cd0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t60060E80056389000000638900000008d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t60060E80056389000000638900000000d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c3t600507640081002FC0000000000000FCd0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 rdms02b:vold(pid523)
bash-3.2# zpool status
  pool: backup
 state: ONLINE
  scan: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        backup                                   ONLINE       0     0     0
          c3t60060E8005638900000063890000000Cd0  ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Dec 11 11:46:11 2017
    107G scanned out of 144G at 17.5M/s, 0h36m to go
    107G resilvered, 73.92% done
config:

        NAME                STATE     READ WRITE CKSUM
        rpool               DEGRADED     0     0     0
          mirror-0          DEGRADED     0     0     0
            replacing-0     DEGRADED     0     0     0
              c0t0d0s0/old  FAULTED      9   618     0  too many errors
              c0t0d0s0      ONLINE       0     0     0  (resilvering)
            c0t1d0s0        ONLINE       0     0     0

errors: No known data errors

Once the data synchronization finishes, the status looks like this and the disk replacement is complete:

bash-3.2# zpool status
  pool: backup
 state: ONLINE
  scan: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        backup                                   ONLINE       0     0     0
          c3t60060E8005638900000063890000000Cd0  ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: resilvered 144G in 2h10m with 0 errors on Mon Dec 11 13:56:43 2017
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0

errors: No known data errors
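Since the resilver here took about two hours and the system must not be rebooted before it finishes, it may be convenient to script the check instead of rerunning zpool status by hand. This is a rough sketch under the assumption that the scan lines are worded as in the outputs above; `resilver_state` is a hypothetical name, and the input is expected to be the status of a single pool.

```shell
#!/bin/sh
# resilver_state (hypothetical helper): read `zpool status <pool>` text on
# stdin and print "running <percent>", "done" or "unknown".
resilver_state() {
    awk '
        /resilver in progress/ { running = 1 }
        # Progress line, e.g. "107G resilvered, 73.92% done"
        /% done/ { for (i = 1; i <= NF; i++) if ($i ~ /%$/) pct = $i }
        # Completion line, e.g. "resilvered 144G in 2h10m with 0 errors on ..."
        /resilvered .* with [0-9]+ errors/ { finished = 1 }
        END {
            if (running)       print "running " (pct == "" ? "?" : pct)
            else if (finished) print "done"
            else               print "unknown"
        }
    '
}

# Intended use on the server (not executed here), polling once a minute:
#   while [ "$(zpool status rpool | resilver_state)" != "done" ]; do sleep 60; done
```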

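6. Test the disk

Reboot the system and, from the XSCF console, boot from the newly replaced disk. If the system boots without problems, the disk replacement succeeded: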

{0} ok boot disk0
Boot device: /pci@0,600000/pci@0/scsi@1/disk@0  File and args:
