[pve-devel] [PATCH pve-docs] Add section for ZFS Special Device
Fabian Ebner
f.ebner at proxmox.com
Wed Nov 6 09:34:52 CET 2019
Thanks for the review and suggestions! I'll send a v2 later; two replies
are inline.
On 11/5/19 10:14 AM, Aaron Lauterer wrote:
> Nicely written.
>
> I have some suggestions inline:
> * splitting long sentences
> * adding more info as to what is valid for the size in
> special_small_blocks (taken from the zfs man page)
> * rewrote the last paragraph a bit
>
> On 10/22/19 12:33 PM, Fabian Ebner wrote:
> > Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
> > ---
> > local-zfs.adoc | 44 ++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 44 insertions(+)
> >
> > diff --git a/local-zfs.adoc b/local-zfs.adoc
> > index b4fb7db..378cbee 100644
> > --- a/local-zfs.adoc
> > +++ b/local-zfs.adoc
> > @@ -431,3 +431,47 @@ See the `encryptionroot`, `encryption`,
> `keylocation`, `keyformat` and
> > `keystatus` properties, the `zfs load-key`, `zfs unload-key` and `zfs
> > change-key` commands and the `Encryption` section from `man zfs`
> for more
> > details and advanced usage.
> > +
> > +
> > +ZFS Special Device
> > +~~~~~~~~~~~~~~~~~~
> > +
> > +Since version 0.8.0 ZFS allows adding a `special` device to a pool,
> which is
> > +then used to store metadata, deduplication tables and optionally
> small file
> > +blocks.
>
> Since version 0.8, ZFS supports `special` devices. A `special` device in
> a pool is used to store metadata, deduplication tables, and optionally
> small file blocks.
>
> > +
> > +IMPORTANT: The redundancy of the `special` device should match the
> one of the
> > +pool, since the `special` device is a point of failure for the whole
> pool.
> > +
> > +WARNING: Adding a `special` device to a pool cannot be undone!
> > +
> > +.Create a pool with `special` device and RAID-1:
> > +
> > + zpool create -f -o ashift=12 <pool> mirror <device1> <device2>
> special mirror <device3> <device4>
> > +
> > +.Add a `special` device to an existing pool with RAID-1:
> > +
> > + zpool add <pool> special mirror <device1> <device2>
> > +
> > +For ZFS datasets where the `special_small_blocks` property is set to
> a non-zero
> > +value, the `special` device is used to store small file blocks up to
> that size.
> > +Setting the `special_small_blocks` property on the pool will change
> the default
> > +value of that property for all child ZFS datasets (for example all
> containers
> > +in the pool will opt in for small file blocks).
> > +
> > +.Opt in for small file blocks pool-wide:
> > +
> > + zfs set special_small_blocks=<size> <pool>
> > +
> > +.Opt in for small file blocks for a single dataset:
> > +
> > + zfs set special_small_blocks=<size> <pool>/<filesystem>
> > +
> > +.Opt out from small file blocks for a single dataset:
> > +
> > + zfs set special_small_blocks=0 <pool>/<filesystem>
>
> INFO: The value for <size> can be `0` to disable storing small file
> blocks on the special device or a power of two in the range between 512B
> to 128K.
>
Another thing I'll add here is the (non-intuitive) relation with the
recordsize. Setting `special_small_blocks` to a value greater than or
equal to the recordsize of the ZFS file system will cause *all* data to
be written to the special device [0].
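To make the condition concrete, here is a minimal, untested sketch of how
one might check for it. The function name `all_data_on_special` and the
dataset name `tank/data` are placeholders of mine, not part of the patch,
and I'm assuming `zfs get -Hp` prints both properties as plain byte counts.

```shell
#!/bin/sh
# Returns 0 (true) if every file block of a dataset would be written to
# the special device, i.e. special_small_blocks is non-zero and greater
# than or equal to the recordsize. Both arguments are sizes in bytes.
all_data_on_special() {
    recordsize=$1
    small_blocks=$2
    [ "$small_blocks" -ne 0 ] && [ "$small_blocks" -ge "$recordsize" ]
}

# Hypothetical usage against a real dataset (tank/data is a placeholder):
#   rs=$(zfs get -Hp -o value recordsize tank/data)
#   ssb=$(zfs get -Hp -o value special_small_blocks tank/data)
#   all_data_on_special "$rs" "$ssb" && echo "all data goes to special"
```

With the default recordsize of 128K, a `special_small_blocks` value of 128K
would already trigger this, so it is probably worth mentioning in the docs.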
> > +
> > +Using a `special` device makes sense for pools with lots and lots of
> changing
> > +metadata respectively small files. If you also have other, larger
> I/O on the
> > +same pool then the benefit from using a `special` device might be
> even more
> > +noticeable. It is recommended to use SSDs or NVMes for the `special`
> device.
> >
>
> A `special` device can improve the speed of small I/O operations if the
> pool consists of slow spinning hard disks. Enabling
> `special_small_blocks` can further increase the performance if a lot of
> small files are used. Use fast (NVME) SSDs for the `special` device.
>
It's really about metadata and not small I/O operations in general. For
example, I/O operations with a block size of 4K on large files will not
benefit from a special device (even with `special_small_blocks`
enabled).
I also think that the benefit does not depend so much on the speed of
the SSD. It should come from the fact that the I/O on the HDDs is not
disturbed as much by the metadata/small file operations.
What about the following?
A `special` device can improve the speed of a pool consisting of slow
spinning hard disks with a lot of changing metadata, for example if the
pool has many short-lived files. Enabling `special_small_blocks` can
further increase the performance when those files are small. Use SSDs
for the `special` device.
[0]: https://github.com/zfsonlinux/zfs/issues/9131#issuecomment-523680936