After digging endlessly tonight, I finally found the solution. The short answer is that you can use a disk's GUID (which persists even after the drive has been disconnected) with the zpool command.

Long answer: I got the disk's GUID using the zdb command, which gave me the following output:

root@zeus:/dev# zdb
hermes:
    version: 28
    name: 'hermes'
    state: 0
    txg: 162804
    pool_guid: 14829240649900366534
    hostname: 'zeus'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 14829240649900366534
        children[0]:
            type: 'raidz'
            id: 0
            guid: 5355850150368902284
            nparity: 1
            metaslab_array: 31
            metaslab_shift: 32
            ashift: 9
            asize: 791588896768
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 11426107064765252810
                path: '/dev/disk/by-id/ata-ST3300620A_5QF0MJFP-part2'
                phys_path: '/dev/gptid/73b31683-537f-11e2-bad7-50465d4eb8b0'
                whole_disk: 1
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 15935140517898495532
                path: '/dev/disk/by-id/ata-ST3300831A_5NF0552X-part2'
                phys_path: '/dev/gptid/746c949a-537f-11e2-bad7-50465d4eb8b0'
                whole_disk: 1
                create_txg: 4
            children[2]:
                type: 'disk'
                id: 2
                guid: 7183706725091321492
                path: '/dev/disk/by-id/ata-ST3200822A_5LJ1CHMS-part2'
                phys_path: '/dev/gptid/7541115a-537f-11e2-bad7-50465d4eb8b0'
                whole_disk: 1
                create_txg: 4
            children[3]:
                type: 'disk'
                id: 3
                guid: 17196042497722925662
                path: '/dev/disk/by-id/ata-ST3200822A_3LJ0189C-part2'
                phys_path: '/dev/gptid/760a94ee-537f-11e2-bad7-50465d4eb8b0'
                whole_disk: 1
                create_txg: 4
    features_for_read:

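Picking the right GUID out of that wall of output by eye is error-prone. A small awk one-liner can do it instead (a sketch, assuming the failed drive's serial appears in its path line, as it does above):

```shell
# Sketch: print the vdev guid that precedes a given disk's path line
# in zdb's output. The serial below is the failed drive from this pool;
# substitute your own.
zdb | awk -v s='ST3300831A_5NF0552X' '/guid:/ { g = $2 } $0 ~ s { print g; exit }'
```

This works because zdb always prints a child's guid line before its path line.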
The GUID I was looking for is 15935140517898495532, which let me take the disk offline:

root@zeus:/dev# zpool offline hermes 15935140517898495532
root@zeus:/dev# zpool status
  pool: hermes
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 2h4m with 0 errors on Sun Jun  9 00:28:24 2013
config:

        NAME                         STATE     READ WRITE CKSUM
        hermes                       DEGRADED     0     0     0
          raidz1-0                   DEGRADED     0     0     0
            ata-ST3300620A_5QF0MJFP  ONLINE       0     0     0
            ata-ST3300831A_5NF0552X  OFFLINE      0     0     0
            ata-ST3200822A_5LJ1CHMS  ONLINE       0     0     0
            ata-ST3200822A_3LJ0189C  ONLINE       0     0     0

errors: No known data errors

and then replace it with the new drive:

root@zeus:/dev# zpool replace hermes 15935140517898495532 /dev/disk/by-id/ata-ST3500320AS_9QM03ATQ
root@zeus:/dev# zpool status
  pool: hermes
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jun  9 01:44:36 2013
    408M scanned out of 419G at 20,4M/s, 5h50m to go
    101M resilvered, 0,10% done
config:

        NAME                            STATE     READ WRITE CKSUM
        hermes                          DEGRADED     0     0     0
          raidz1-0                      DEGRADED     0     0     0
            ata-ST3300620A_5QF0MJFP     ONLINE       0     0     0
            replacing-1                 OFFLINE      0     0     0
              ata-ST3300831A_5NF0552X   OFFLINE      0     0     0
              ata-ST3500320AS_9QM03ATQ  ONLINE       0     0     0  (resilvering)
            ata-ST3200822A_5LJ1CHMS     ONLINE       0     0     0
            ata-ST3200822A_3LJ0189C     ONLINE       0     0     0

errors: No known data errors

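Rather than re-running zpool status by hand, you can wait for the resilver to finish in a loop (a sketch; newer OpenZFS releases also offer `zpool wait -t resilver <pool>` for exactly this):

```shell
# Sketch: poll 'zpool status' until the resilver line disappears,
# then print the final pool state.
while zpool status hermes | grep -q 'resilver in progress'; do
    sleep 60
done
zpool status hermes
```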
Once the resilver completed, everything worked again. It would have been nice if the zpool manpage mentioned that a disk's GUID, obtained through zdb, can be used with the zpool command.

Edit

As pointed out by durval in the comments, the zdb command may produce no output at all. In that case, try

zdb -l /dev/<name-of-device>

to explicitly list the label information for that device (this works even if the device has already disappeared from the pool configuration).
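The label output has the same shape as zdb's pool listing, so the guid field can be pulled out the same way (a sketch; /dev/sdb is a placeholder for the device in question):

```shell
# Sketch: read the on-disk ZFS label directly and print its guid field.
# /dev/sdb is a placeholder device name, not from the pool above.
zdb -l /dev/sdb | awk '/^[[:space:]]*guid:/ { print $2; exit }'
```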

Answer from Marcus on askubuntu.com