[PVE-User] Shared storage recommendations
Ronny Aasen
ronny+pve-user at aasen.cx
Tue Feb 26 18:45:39 CET 2019
On 26.02.2019 10:41, Thomas Lamprecht wrote:
> Hi,
>
> On 2/25/19 6:22 PM, Frederic Van Espen wrote:
>> Hi,
>>
>> We're designing a new datacenter network where we will run Proxmox nodes on
>> about 30 servers. Of course, shared storage is part of the design.
>>
>> What kind of shared storage would anyone recommend based on their
>> experience and what kind of network equipment would be essential in that
>> design? Let's assume for a bit that the budget is not constrained too much. We
>> should be able to afford a vendor-specific iSCSI device, or be able to
>> implement an open-source solution like Ceph.
>>
>> Concerning storage space and IOPS requirements, we're very modest in the
>> current setup (about 13 TB of storage space used; roughly 6500 write IOPS
>> and 4200 read IOPS currently distributed across the whole network according
>> to our Prometheus monitoring).
>>
>> Key in the whole setup is day to day maintainability and scalability.
> I'd use ceph then. Scalability is something ceph is just made for, and
> maintainability is also really not too bad, IMO. You can do CTs and VMs on
> normal block devices (RBD) and also have a file-based shared FS (CephFS),
> both well integrated into the PVE frontend/backend, which other shared
> storage systems aren't.
>
> cheers,
> Thomas
+1 for ceph
For scalability, parallelism, and aggregate performance Ceph is just
awesome. Once you learn the fundamentals it also becomes quite
manageable and "logical".
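To give an idea of how simple the PVE integration Thomas mentions is: an
RBD and a CephFS entry in /etc/pve/storage.cfg look roughly like this
(storage and pool names are just placeholders; an external, non-PVE-managed
cluster would additionally need monhost and username entries):

    rbd: ceph-vm
            pool vm-pool
            content images,rootdir

    cephfs: ceph-fs
            path /mnt/pve/ceph-fs
            content backup,iso,vztmpl

With that in place both show up as normal storages in the web GUI.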
For a single big single-threaded workload you get all the SDS
overhead, so you may need cheats like caching or fancy striping (not
unlike all other storage solutions), but if you want loads and loads of
parallel I/O it is a joy.
If your servers have loads of unused capacity you can consider an HCI
model where storage and compute live on the same hosts, but it is easier
to keep them separate. Of course there is no problem starting with one
model and migrating on the fly later.
Tips:
- You want 5 monitors, so you can survive a dual failure, or one failure
  while another monitor is down for maintenance/reboot (see the monitor
  sketch below).
- Good networking is important; 25 Gbps is better than 40 Gbps due to
  lower latency.
- I recommend using IPv6 from the beginning for future proofing; it is
  still a struggle to migrate Ceph from IPv4 to IPv6, since you must stop
  the cluster. IPv6 + SLAAC is _very_ scalable (see the ceph.conf sketch
  below). IPv4-only loads can work through a gateway host until EOL; I use
  a dual-stacked Samba/NFS jump host for CephFS IPv4 access. RADOS gateways
  are normally dual-stacked anyway, or you can use a reverse proxy like
  nginx for IPv4.
- Many new clusters are flash only, but if you want spinning disks you do
  need flash-based DB partitions for the spinning OSDs (see the OSD sketch
  below).
- It is still a good idea to have some SSD capacity for low-latency loads
  like CephFS metadata, radosgw indexes, and I/O-hungry VMs (see the CRUSH
  rule sketch below).
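A rough sketch of the monitor part, assuming a hyper-converged,
PVE-managed cluster (pveceph createmon is the PVE 5.x command name;
newer releases call it "pveceph mon create"):

    # run this on each of the 5 nodes that should carry a monitor
    pveceph createmon

Three monitors survive one failure, five survive two, which is what you
want once reboots for updates become routine.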
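For the IPv6 point, a minimal sketch of the relevant /etc/pve/ceph.conf
bits, with the documentation prefix 2001:db8:100::/64 standing in for
your real network:

    [global]
        ms bind ipv6 = true
        public network = 2001:db8:100::/64
        cluster network = 2001:db8:100::/64

Set this before the first monitor is created; as said above, switching a
running cluster later means stopping it.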
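For the spinners, you point the bluestore DB at a partition or LV on the
flash device when creating each OSD; a sketch with ceph-volume, where the
device names are only examples:

    # spinning data disk, with its RocksDB/WAL on a flash partition
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1

If I remember right, the pveceph/GUI OSD dialog exposes the same DB-disk
choice.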
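And for keeping the latency-sensitive pools on SSD, device classes make
that straightforward; the rule name is arbitrary and the pool names below
are only the common defaults, check yours with "ceph osd pool ls":

    # replicated rule that only picks OSDs with device class "ssd"
    ceph osd crush rule create-replicated ssd-only default host ssd
    # pin the latency-sensitive pools to it
    ceph osd pool set cephfs_metadata crush_rule ssd-only
    ceph osd pool set default.rgw.buckets.index crush_rule ssd-only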
Good luck
Ronny