I have been lurking on this community for a while now and have really enjoyed the informational and instructional posts, but a topic I don't see come up very often is scaling and hoarding. Currently I have a 20TB server which I am rapidly filling, and most posts about expanding recommend simply buying larger drives and slotting them into a single machine. That is definitely the easiest way to expand, but it seems like it would only get you to about 100TB before you can't reasonably do that anymore. So how do you set up 100TB+ networks with multiple servers?

My main concern is that currently all my services are dockerized on a single machine running Ubuntu, which works extremely well. It is space efficient with hardlinking and I can still seed back everything. From different posts I've read, it seems like as people scale they either give up on hardlinks and eat up a lot of their storage with copied files, or they eventually delete their seeds and just keep the content. Do the Arr suite and qBittorrent allow dynamically selecting servers based on available space? Or are there other ways to solve these issues with additional tools? How do you set up large systems, and what recommendations would you make? Any advice is appreciated, from hardware to software!

Also, huge shout out to Saik0 from this thread: https://lemmy.dbzer0.com/post/24219297 I learned a ton from his post, but it seemed like the tip of the iceberg!

  • MrSulu@lemmy.ml · 1 point · 10 hours ago

    This is next-level hoarding, and reading the suggestions in the responses is still very useful.

  • tenchiken@lemmy.dbzer0.com (mod) · 28 points · 2 days ago

    I personally have dedicated machines per task.

    8x SSD machine: runs services for Arr stack, temporary download and work destination.

    4-5x misc 16-bay boxes: raw storage boxes. NFS shared. ZFS underlying drive config. What's on them changes on a whim, but usually it's 1x for movies, 2x for TV, etc. Categories can be spread across multiple places.

    2-3x 8-bay boxes: critical storage. Different drive geometry, higher resilience. Hypervisors. I run a mix of Xen and Proxmox depending on need.

    All get 10Gb interconnect, with critical stuff (nothing Arr, for sure) like personal vids and photos pushed to small encrypted storage like BackBlaze.

    The NFS-shared stores, once you get everything mapped, allow some smooth automation to migrate things around for maintenance and such.
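
    Roughly what the NFS wiring looks like (hostnames, paths, and the subnet are placeholders; the options are just common defaults):

        # On a storage box, /etc/exports:
        /tank/tv    192.168.1.0/24(rw,no_subtree_check)
        # then reload exports with: exportfs -ra

        # On the Arr/download box, /etc/fstab:
        storage01:/tank/tv    /mnt/tv    nfs    defaults,_netdev    0 0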

    Mostly it's all 10-year-old or older gear. Fiber 10Gb cards can be had off eBay for a few bucks; just watch out for compatibility and the cost of the transceivers.

    8-port SAS controllers can be had the same way, new, off eBay from a few vendors; just explicitly look for "IT mode" so you don't get a RAID controller by accident.

    SuperMicro makes quality gear for this… Used can be affordable, and I've had excellent luck. Most have a great IPMI controller for simple diagnostic needs too. Some of the best SAS backplanes are made by them.

    Check BackBlaze disk stats from their blog for drive suggestions!

    Heat becomes a huge factor, and the drives are particularly sensitive to it… Running hot shortens lifespan. Plan accordingly.

    It’s going to be noisy.

    Filter your air in the room.

    The rsync command is a good friend in a pinch for data evacuation.
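
    For example, a bare-bones evacuation run (paths are placeholders; -a keeps permissions and times, -H keeps hardlinks, --partial lets an interrupted copy resume):

        rsync -aH --partial --info=progress2 /tank/media/ /mnt/spare-box/media/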

    Your servers are cattle, not pets… If one is ill, sometimes it's best to put it down (wipe and reload). If you suspect hardware, get it out of the mix quick; test and/or replace before risking your data again.

    You are always closer to dataloss than you realize. Be paranoid.

    Don’t trust SMART. Learn how to read the full report. Pending-Sectors above 0 is always failure… Remove that disk!
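
    For example, pulling the attributes with smartmontools (the drive path is a placeholder, and attribute names vary by vendor and interface, so eyeball the full report too):

        smartctl -A /dev/sdX | grep -iE "Pending|Reallocated|Offline_Uncorrectable"
        # any non-zero Current_Pending_Sector count means plan to pull that disk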

    Keep 2 thumb drives with your installer handy.

    Keep a repo somewhere with the basics of your network configs… ideally sorted by machine.

    Leave yourself a back-door network… Most machines will have a 1Gb port. It might be handy when you least expect it. Setting up LAGG with those 1Gb ports as fallback for the higher-speed fiber can save headaches later too…
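
    On Ubuntu, one way to get that fallback is a netplan active-backup bond; a rough sketch (interface names and the address are placeholders):

        network:
          version: 2
          ethernets:
            enp3s0f0: {}   # 10Gb fiber
            eno1: {}       # 1Gb onboard back door
          bonds:
            bond0:
              interfaces: [enp3s0f0, eno1]
              parameters:
                mode: active-backup
                primary: enp3s0f0
              addresses: [192.168.1.20/24]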

    • dipper_retreat@lemmy.dbzer0.com (OP) · 4 points · 24 hours ago

      Thanks for this fantastic write up, and your other response! I definitely learned a lot just looking up all the terms. Just a couple of questions if you have time.

      For your 16-bay boxes, are you running old OptiPlex or PowerEdge hardware, or something else? I ask because these seem to be available in large supplies from surplus sites and I'm curious if one is strictly better or easier to work with. Also, I've read that you should loosely match TB of storage to GB of RAM. The PowerEdge hardware has tons of DIMM slots but old PCs don't, so I'm curious if you've had to deal with that, since ZFS seems so well optimized.

      For the split categories, i.e. the 2x for TV you mentioned, do you need to run two instances of Sonarr? Or do you just manually change the path when a single box gets full? Otherwise, how do you keep the two instances in sync?

      Lastly, I've done quite a bit of reading on OMV and Proxmox but I don't actually use them yet. Do you recommend Proxmox with an OMV VM, or just OMV on bare metal?

      Thanks for taking the time!

      • tenchiken@lemmy.dbzer0.com (mod) · 2 points · 23 hours ago

        For my larger boxes, I only use SuperMicro. Most other vendors do weird shit to their backplanes that makes them incompatible, or charge licenses for their IPMI/DRAC/lights-out management. Any reputable reseller of server gear will offer SuperMicro.

        The disk-to-RAM ratio is niche, and I've almost never run into it outside of large data warehouse or database systems (not what we're doing here). Most of my machines run nearly idle even while serving several active streams or 3GB/sec data moves on only 16GB of RAM. I use the CPU being maxed out as a good warning that one of my disks needs checking, since resilvering or running degraded in ZFS chews CPU.
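
        A quick way to confirm that hunch (the pool name is a placeholder):

            zpool status -x        # prints "all pools are healthy" or names the sick pool
            zpool status tank      # shows resilver/scrub progress and any degraded vdevs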

        That said, hypervisors eat RAM. Whatever machine you want to handle torrents, transcoding, etc., give that box RAM and either a well-supported GPU or a recent Intel Quick Sync chip.

        For organizing across the arrays, I use RAIDed SSDs for downloads, with the torrent client moving completed files to the destination host for seeding.

        A single instance each of Radarr and Sonarr; instead, I update the root folder for "new" content any time I need to point at a new machine. I just have to keep the current new-media destination in sync between the Arr and the torrent client for that category.

        The Arr stacks have gotten really good lately with path management; you just need to ensure the mounts available to them are set up correctly.

        In the event I need to move content between two different boxes, I pause the seed, use rsync to duplicate the torrent files, change the path, and recheck the torrent. Once that's good I either nuke and reimport in the Arr, or, since I've lately been using a better naming convention on the hosts, I can do the move with hardlinks preserved. Beware, this is a pretty complex route unless you are very comfortable with Linux and rsync!
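
        Roughly what that move looks like (hostnames, paths, and the release name are placeholders):

            # 1. pause the torrent in the client
            # 2. copy with permissions and hardlinks preserved
            rsync -aH --info=progress2 /mnt/movies-old/Some.Release/ storage02:/tank/movies/Some.Release/
            # 3. point the torrent's save path at the new location and force a recheck
            # 4. update or reimport the path in the Arr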

        I'm using OMV on bare metal personally. My Proxmox box doesn't even have OMV; it's a mini PC for transcoding. I see no problem running OMV inside Proxmox though. My bare-metal boxes are dedicated to just NAS duties.

        For what it's worth, keep tasks as minimal and simple as you can. Complexity where it's not needed can be a pain later. My NAS machines are largely identical in base config, with only the machine name and storage pool name different.

        If you don't need a full hypervisor, I'd skip it. Docker has gotten great in its abilities. The easiest Docker box I have was just Ubuntu with DockGE. It keeps its configs in a reliable path, so it's easy to back up your configs etc.
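
        From memory, the stock compose for DockGE is roughly the below; verify against the upstream README for the current image tag, port, and stacks path before using it:

            services:
              dockge:
                image: louislam/dockge:1
                restart: unless-stopped
                ports:
                  - 5001:5001
                volumes:
                  - /var/run/docker.sock:/var/run/docker.sock
                  - ./data:/app/data
                  - /opt/stacks:/opt/stacks
                environment:
                  - DOCKGE_STACKS_DIR=/opt/stacks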

    • ReallyActuallyFrankenstein@lemmynsfw.com · 5 points · 1 day ago

      Your setup is closer to "statistic" than "anecdote," so I'm curious: how many drive failures have you had?

      What is the primary OS you run to manage all of the components?

      • tenchiken@lemmy.dbzer0.com (mod) · 4 points · 1 day ago

        Most of my drives are in the 3TB/4TB range… Something about that timeframe made for some reliable disks. Newer disks have actually had more issues. A few boxes run some 8TB or 12TB drives, and I keep some external 8TB drives for evacuation purposes, but I don't think I trust most options lately.

        HGST / Toshiba seem to have done well by me overall, but that's certainly subjective.

        I have two Seagates I need to pull from one of the older boxes right now, but they are 2TB and well past due:

        root@Mizuho:~# smartctl -a /dev/sdc | grep -E "Vendor|Product|Capacity|minutes"
        Vendor:               SEAGATE
        Product:              ST2000NM0021
        User Capacity:        2,000,398,934,016 bytes [2.00 TB]
        Accumulated power on time, hours:minutes 41427:43

        root@Mizuho:~# smartctl -a /dev/sdh | grep -E "Vendor|Product|Capacity|minutes"
        Vendor:               SEAGATE
        Product:              ST2000NM0021
        User Capacity:        2,000,398,934,016 bytes [2.00 TB]
        Accumulated power on time, hours:minutes 23477:56

        Typically I’m a Debian/Ubuntu guy. Easiest multi tool for my needs.

        I usually use OpenMediaVault for my simple NAS needs.

        Proxmox and XCP-ng for hypervisors. I was involved in the initial development of OpenStack, and have much love for classic Xen itself (screw Citrix and their mistreatment of XenServer).

        My Docker hosts run either DockGE or the compose plugin under OMV; I'm leaning more toward DockGE lately for simplicity and eye candy.

        Overall, I've had my share of disk failures, usually from being sloppy. I only trust software RAID, as I have a better shot at recovery if I'm stupid enough to store something critical on less than N+2.

        I usually buy drives only one generation behind, and even then only when the price absolutely craters. The former is due to being bitten by new models crapping out early, the latter due to being too poor to support my bad habits.

        Nearly all of my SATA disks came from externals, but that's become tenuous lately… SMR disks are getting stuck into these more and more, and manufacturers are getting sneakier about hiding shit design.

        Used SAS from a place with a solid warranty seems to be most reliable. About half my fleet was bought used, and I've only lost about 1/4 of those with less than 5 years of active run time.

    • mnemonicmonkeys@sh.itjust.works · 2 points · 1 day ago

      On your point about SAS cards, some sellers will post "IT Mode" in their product listing even if the card can never be flashed to IT mode. I know because it happened to me.

      • tenchiken@lemmy.dbzer0.com (mod) · 2 points · 23 hours ago

        Ugh that is extra shitty. Yeah eBay is absurd sometimes with the risks.

        For anyone skimming, my cards are all based around the ancient but great LSI 9211-8i chips.

        I flash my own, so I can disable the BIOS and EFI option ROMs. I suppose by the time someone gets to larger-scale hoarding, they should be comfortable flashing their own cards too.
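
        For reference, on the 9211-8i family that's done with LSI's sas2flash tool. Very roughly (the firmware filename here is just an example, the erase step is destructive, and the exact procedure varies by card, so follow a proper guide for your model):

            sas2flash -listall              # confirm the controller and current firmware
            sas2flash -o -e 6               # erase the existing flash (destructive!)
            sas2flash -o -f 2118IT.bin      # flash IT-mode firmware; skipping the -b <bios.rom>
                                            # step leaves the boot ROM (BIOS/EFI) disabled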

        • mnemonicmonkeys@sh.itjust.works · 2 points · 17 hours ago (edited)

          Just for anyone randomly stumbling upon this thread: make sure to thoroughly check the pictures in a listing. Look for LSI cards specifically. LSI has been bought up by 2-3 other companies over the years, so you absolutely need to do your research to make sure the manufacturer badge on the card is from one of those companies. Don't just trust the title of the listing.

  • SpikesOtherDog@ani.social · 12 points · 2 days ago (edited)

    Buy a used PC that can take a PCI-e card and has room for several drives.

    Buy a used SAS controller recommended by the TrueNAS community.

    Stuff the PC with 10 drives: one SATA SSD for the OS and nine 20TB HDDs in a ZFS configuration for 140TB of storage with two failover drives.

    Install TrueNAS and create a network share.

    Repeat and/or upgrade as needed.
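
    TrueNAS builds that pool through its web UI, but for illustration the equivalent raw zpool layout would be roughly this (device names are placeholders; in practice you'd use /dev/disk/by-id paths):

        zpool create tank raidz2 sdb sdc sdd sde sdf sdg sdh sdi sdj
        # 9 x 20TB in raidz2 = 7 data + 2 parity, roughly 140TB before overhead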

    Edit: I just went web window shopping. Does anyone have $3k I could invest?

  • Krill@feddit.uk · 3 points · 2 days ago

    Alternative option: TrueNAS Scale. Supermicro motherboard. Used AMD Epyc for lots of PCIe lanes. LSI 9300 and an AEC 82885 expander. 16TB+ drives. Rack mount, with SAS backplanes. RAIDZ2 minimum. Special vdev, NVMe drives, and dedicated apps and VM storage; don't be afraid of a converged solution.

    And fans. Lots of fans.

  • Xanza@lemm.ee · 3 points · 2 days ago

    Dockerization isn't a big deal. You can use a smaller SSD/M.2 drive within the same machine to run the Docker (or Portainer) frontend, and create pools of like-sized HDDs for storage, using Docker volumes to keep your data on the HDD array. The most important thing is that the drives are the same size, so if you start out with 20TB drives you have to continue with them.

    Once you create your pool, you can install more drives as needed and simply increase the size of the pool by the number of drives installed. Add your new drives to your machine, update fstab, add the new drive capacity to your array, and then balance your drives.
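
    With a btrfs-backed pool, for example, that grow-and-balance step looks roughly like this (the device and mount point are placeholders):

        btrfs device add /dev/sdX /mnt/pool      # grow the pool with the new drive
        btrfs balance start /mnt/pool            # spread existing data across all members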

  • ShepherdPie@midwest.social · 2 points · 2 days ago

    This is a great question and quite funny, as I'm at 100TB now (including parity drives and non-media storage) and need to figure out a solution fairly soon. Tossing a bunch of working $100-$200 drives in 'the trash' in order to replace them with $300-$400 drives isn't much of a solution in my eyes.

    I suppose the proper solution is to build a server rack and load it with drives but that seems a bit daunting at my current skill level. Anybody have a time machine I can borrow real quick?

    • pedroapero@lemmy.ml · 2 points · 1 day ago

      I primarily buy used drives. Depending on your area, you might find buyers easily for your old 4TB+ ones.