I’ve never heard of SeaweedFS, but Ceph cluster storage system has an S3-compati...

ranger_danger · 2026-02-13T14:09:30 1770991770

Ceph is a non-starter for me because you cannot have an existing filesystem on the disk. Previously I used GlusterFS on top of ZFS and made heavy use of gluster's async geo-replication feature to keep two storage arrays in sync that were far apart and connected by a slow link. I did this after getting fed up with rsync being so slow and thrashing the disks because it had to scan many TBs every day.

While there is a geo-replication feature for Ceph, I cannot keep using ZFS at the same time, and gluster is no longer developed, so I'm currently looking for an alternative that would work for my use case if anyone knows of a solution.
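For anyone wondering why a daily scan hurts so much more than change-driven replication, here is a toy Python sketch of the general idea. It is not how gluster or rsync are implemented; the `Volume` class, its `journal` list, and both sync functions are hypothetical names made up for illustration, and deletions are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Toy source volume that records every written path in a journal."""
    files: dict = field(default_factory=dict)
    journal: list = field(default_factory=list)

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data
        self.journal.append(path)  # remember what changed


def full_scan_sync(src: Volume, dst: dict) -> int:
    """Scan-style sync: look at every file. Returns files examined."""
    examined = 0
    for path, data in src.files.items():
        examined += 1
        if dst.get(path) != data:
            dst[path] = data
    return examined


def journal_sync(src: Volume, dst: dict) -> int:
    """Change-driven sync: touch only journaled paths. Returns files examined."""
    changed = set(src.journal)
    for path in changed:
        dst[path] = src.files[path]
    src.journal.clear()
    return len(changed)


if __name__ == "__main__":
    src, replica = Volume(), {}
    for i in range(1000):
        src.write(f"file{i}", b"v1")
    journal_sync(src, replica)          # initial copy
    src.write("file7", b"v2")           # one small change
    print(journal_sync(src, replica))   # work scales with the change
    print(full_scan_sync(src, replica)) # work scales with the whole volume
```

The scan-style function does work proportional to the size of the volume even when almost nothing changed, while the journal-style function does work proportional to the number of changes.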

jodrellblank · 2026-02-13T16:44:27 1771001067

> "Ceph is a non-starter for me because you cannot have an existing filesystem on the disk. Previously I used GlusterFS on top of ZFS"

I became a Ceph admin by accident so I wasn't involved in choosing it and I'm not familiar with other things in that space. It's a much larger project than a clustered filesystem; you give it disks and it distributes storage over them, and on top of that you can layer things like the S3 storage layer, its own filesystem (CephFS) or block devices which can be mounted on a Linux server and formatted with a filesystem (including ZFS I guess, but that sounds like a lot of layers).

> "While there is a geo-replication feature for Ceph"

Several; the data cluster layer can do it in two ways (stretch clusters and stretch pools), the block device layer can do it in two ways (journal based and snapshot based), the CephFS filesystem layer can do it with snapshot mirroring, and the S3 object layer can do it with multi-site sync.

I've not used any of them, they all have their trade-offs, and this is the kind of thing I was thinking of when saying it requires more skills and effort. For simple storage requirements, use a traditional SAN or a server with a bunch of disks, or pay a cheap S3 service to deal with it. Only if you have a strong need for scalable clusters, a team with storage/Linux skills, a pressing need to do it yourself, or to use many of its features, would I go in that direction.

https://docs.ceph.com/en/latest/rados/operations/stretch-mod...

https://docs.ceph.com/en/latest/rbd/rbd-mirroring/

https://docs.ceph.com/en/latest/cephfs/cephfs-mirroring/

https://docs.ceph.com/en/latest/radosgw/multisite/

skrtskrt · 2026-02-13T18:26:05 1771007165

Ceph is a non-starter because you need a team of people managing it constantly

jodrellblank · 2026-02-14T16:06:43 1771085203

I'm not posting to convince people they should use it, just that it's a really cool piece of open source infrastructure that I think is less well known, and I respect it. It is very configurable and tunable, has a lot of features, command lines, and things to learn, and that does need people with skills and time.

That said, it doesn't need constant management; it's excellent at staying up even while damaged. As long as the cluster has enough free space, it will rebuild around any hardware failure without human intervention. It doesn't need hot spares, and if you plan it carefully it has no single point of failure. (The original creator introduces the design choice of 'placement groups' and its tradeoffs in this video [1].)

Most of the management time I've spent has been on ageing hardware flaking out without actually failing: old disks erroring on read, controllers failing and temporarily dropping all the disks (causing tens of seconds of read latency, which had knock-on effects), or the time we filled it too full and it went read-only. Other management work has been learning my way around it, upgrades, changing the way we use it for different projects, and onboarding and offboarding services that use it, all of which will vary with what you actually do with it.
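The "enough free space" condition can be roughed out with simple arithmetic. This is a back-of-envelope Python sketch, not a Ceph tool: the function name, the host figures, and the 0.85 threshold are all assumptions for illustration. It also assumes data spreads evenly over the surviving hosts and ignores CRUSH rules and failure domains, so check your own cluster's ratios and layout.

```python
def utilization_after_host_loss(hosts: dict, threshold: float = 0.85):
    """Estimate raw utilization if each host in turn failed.

    hosts maps a host name to (raw_capacity_tb, raw_used_tb).
    Assumes all raw data, replicas included, must be re-homed onto
    the surviving hosts and lands evenly (a simplification).
    Returns (ok, per_host), where ok means every single-host failure
    stays under the (hypothetical) threshold.
    """
    total_used = sum(used for _, used in hosts.values())
    per_host = {}
    for lost in hosts:
        remaining = sum(cap for name, (cap, _) in hosts.items() if name != lost)
        per_host[lost] = total_used / remaining if remaining else float("inf")
    ok = all(ratio < threshold for ratio in per_host.values())
    return ok, per_host


if __name__ == "__main__":
    # Made-up example: four hosts, 100 TB raw each, 60 TB used each.
    cluster = {name: (100.0, 60.0) for name in ("a", "b", "c", "d")}
    ok, detail = utilization_after_host_loss(cluster)
    print(ok, detail)
```

With these made-up numbers, losing one host pushes the remaining three to 240/300 = 80% full, which stays under the assumed threshold; at 70 TB used per host it would be 280/300, about 93%, and the check fails.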

I've spent less time with VMware VSAN, but VSAN does a lot less, it takes your disks and gives you a VMFS datastore and maybe an iSCSI target. There can't be many alternatives which do what Ceph does, and require less skill and effort, and don't involve paying a vendor to manage it for you and give you a web interface?

[1] https://www.youtube.com/watch?v=PmLPbrf-x9g

xnyan · 2026-02-14T15:01:27 1771081287

That was not my experience. Deploying and configuring ceph was a nightmare due to the mountain of options and considerations, but once it was deployed, ceph was extremely hands-off and resilient.

BlackLotus89 · 2026-02-13T19:54:17 1771012457

Yeah sure. I manage a ceph cluster (4PB) and have a few other responsibilities at the same time.

I can tell you that ceph is something I don't need to touch every month. Other things I have to baby more regularly