I have recently decided to change the architecture of my web services, mostly to offload a single HP MicroServer which over time has accumulated so many roles that it became a true single point of failure. With 16 GB of RAM and 4 TB of disk space in a ZFS pool, but only 2 CPU cores, it was always better suited to be a NAS than an application server. Because I also decided to migrate it not just to #FreeBSD but actually to the #HardenedBSD flavour[^1], it was a long journey, but a very interesting one too…
HardenedBSD is not really suitable for running most production services. That's not a shortcoming of the fork itself, but of the software running on top of it: most code bases are simply not ready for all the mitigations. Most native (!) NFS or NIS daemons will crash due to CFI, some will crash due to random other PaX mitigations, and some will crash due to memory execution protection when they try to use features that rely on writable and executable memory - most notably the JIT in PHP and PostgreSQL. It's not HardenedBSD's fault, but it says a lot about the legacy code base we're all running today on systems with these mitigations disabled. I ended up doing buildworld with CFI disabled for some daemons, but quickly hit a wall with the others, and eventually just reverted to regular FreeBSD.
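For the JIT case specifically, HardenedBSD lets you exempt individual binaries from the PaX MPROTECT restriction rather than rebuilding anything; a minimal sketch (the binary paths are just examples, adjust to wherever your packages install them), keeping in mind that CFI is a compile-time property and cannot be toggled per binary this way:

```sh
# Allow the PHP and PostgreSQL JITs to map writable+executable memory
# by disabling the PaX MPROTECT restriction for those binaries only
# (paths are illustrative; check your own package layout):
hbsdcontrol pax disable mprotect /usr/local/bin/php
hbsdcontrol pax disable mprotect /usr/local/bin/postgres
```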
NFSv4 is a nightmare, and NFSv4 with Kerberos is a nightmare squared. After a few days of trying to get it to work I followed the best advice I found on a FreeBSD forum, in a long thread about getting it to work: "just don't". The NFSv4 architecture is complex and relies on many moving parts, yet the #NFS daemons have close to zero logging, so you quickly end up in a "computer says NO" situation with tcpdump and truss as your only diagnostic tools.
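To give an idea of how many moving parts are involved on the server side alone, here is a rough sketch of the FreeBSD rc.conf knobs and exports entries a Kerberized NFSv4 server needs (the dataset paths, network and security flavours are placeholders, not my actual config):

```sh
# /etc/rc.conf - NFSv4 server with Kerberos (illustrative, not my real setup)
nfs_server_enable="YES"
nfsv4_server_enable="YES"
nfsuserd_enable="YES"      # NFSv4 name <-> uid/gid mapping daemon
gssd_enable="YES"          # RPCSEC_GSS, i.e. the Kerberos part
rpcbind_enable="YES"
mountd_enable="YES"

# /etc/exports - the V4: root plus the actual exported filesystem
V4: /tank -sec=krb5:krb5i:krb5p -network 192.168.1.0/24
/tank/media -sec=krb5 -network 192.168.1.0/24
```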
ZFS is funny in that the pools I created under Ubuntu 20.04 had some modern features such as feature@edonr enabled, which are not supported by stable FreeBSD 13.1. So I tried 14-CURRENT, and it worked, but then I stupidly ran zpool upgrade, which enabled yet more features and prevented me from importing the pool into the FreeBSD 13.2 prerelease, which would likely have imported the original pool, but not the one upgraded under 14-CURRENT. But hey, FreeBSD 14 is due in July, so I can live on the edge until then.
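If I were doing it again I would pin the pool's feature set at creation time instead of trusting zpool upgrade; a sketch along these lines (pool and disk names are made up, and the compatibility property assumes OpenZFS 2.1 or newer):

```sh
# See which feature flags a pool already has enabled/active:
zpool get all tank | grep feature@

# Create a pool whose feature set stays importable across systems,
# instead of enabling everything the running implementation supports:
zpool create -o compatibility=openzfs-2.1-freebsd tank raidz da0 da1 da2 da3

# A bare "zpool upgrade tank" enables every supported feature, which is
# exactly what makes the pool unimportable on older systems.
```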
OK, all that could have been trivially avoided by simply reverting to the simplest working solutions, like unauthenticated NFSv3 and regular FreeBSD with fewer mitigations, which is what I ultimately did. But you don't end up in the #infosec industry by doing things the easy way 😉
One surprise in the #Pleroma migration was a particular index (activities_visibility_index) whose creation during a full SQL dump and restore took literally 15 hours. This cursed index is not mentioned anywhere in the Pleroma migration documentation[^2], but it is covered, with a workaround, in the documentation of its fork Akkoma[^3].
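The Akkoma docs describe the actual workaround, but even without it you can at least confirm the restore is crunching rather than hung; a sketch assuming PostgreSQL 12 or newer, run from a second psql session:

```sql
-- Watch the progress of a long-running CREATE INDEX (such as
-- activities_visibility_index during the restore) from another session:
SELECT a.query,
       p.phase,
       p.blocks_done,
       p.blocks_total,
       p.tuples_done,
       p.tuples_total
FROM pg_stat_progress_create_index AS p
JOIN pg_stat_activity AS a ON a.pid = p.pid;
```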
All in all, it seems to be mostly done: the NAS is happily grunting right now while Pleroma pulls in a few days' worth of outstanding news from the Fediverse…
@kravietz
> Most native (!) NFS or NIS daemons will crash due to CFI, some will crash due to random other PaX mitigations, and some will crash due to memory execution protection when they try to use features that rely on writable and executable memory - most notably the JIT in PHP and PostgreSQL
To provide a more balanced picture, the problem with CFI is that all of the NFS code is written in C, and some CFI checks seem to be extremely sensitive to things like passing control between functions compiled with and without CFI (cfi-icall). The idea behind HardenedBSD is that the whole system is compiled with these mitigations, and that places quite a responsibility on the underlying code. I suppose it would be much easier with apps written in modern languages such as Go, Elixir or Python. Pleroma (Elixir) ran under HardenedBSD without any problems, at least until it found out my database indexes were screwed up and crashed, but that's another problem 😂
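To illustrate the kind of pattern cfi-icall objects to, here is a minimal, contrived C sketch (not taken from any NFS daemon): an indirect call through a function pointer whose type does not match the callee's real prototype, which plain C tolerates but clang's CFI (built with -flto -fsanitize=cfi) aborts at runtime:

```c
#include <stdio.h>

/* The real handler takes void* and returns int. */
static int handler(void *arg) {
    printf("got %p\n", arg);
    return 0;
}

/* Legacy-style callback type with a different signature. */
typedef void (*cb_t)(char *);

int main(void) {
    cb_t cb = (cb_t)handler;  /* the cast hides the mismatch from the compiler */
    cb("hello");              /* indirect call: cfi-icall checks the type here and traps */
    return 0;
}
```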
As for storage, I ran some tests and of course local disk is much faster than any network storage: on a local ZFS raidz I was getting 190 MiB/s writes, while NFS over good cables and a Gigabit switch was getting 8 MiB/s. I can imagine this being a bottleneck on busy instances, but mine is a single-user one and so far it happily runs with uploads mounted over NFS (the database lives on that same server, so it also goes over the network).
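For what it's worth, numbers of this kind are easy to reproduce with a crude sequential-write test; a sketch with placeholder paths (dd is not a real benchmark, and caching can flatter the local figure):

```sh
# Sequential write to the local raidz pool vs. an NFS mount (illustrative paths):
dd if=/dev/zero of=/tank/ddtest bs=1M count=4096
dd if=/dev/zero of=/mnt/nfs/ddtest bs=1M count=4096
```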
I had a look at GlusterFS and SeaweedFS, but decided I want a running instance before I risk another 5-day blackout due to experiments 😉
[^1]: https://hardenedbsd.org/
[^2]: https://docs-develop.pleroma.social/backend/administration/backup/
[^3]: https://docs.akkoma.dev/stable/administration/backup/