For this new year, I’d like to learn the skills necessary to self host. Specifically, I would like to eventually be able to self host Nextcloud, Jellyfin and possibly my email server too.
I have a basic understanding of Python and Kotlin. Now I’m in the process of learning Linux through a virtual machine because I know Linux is better suited for self hosting.
Should I stick with Python? Or is JavaScript (or maybe Ruby) better suited for that purpose? I’m more than happy to learn a new language, but I’m unsure on which is better suited.
And if you could start again in your self hosting journey, what would you do differently? :)
EDIT: I wasn’t expecting all these wonderful replies. You’re all very kind people to share so much with me :)
The consensus seems to be that hosting your own email server might be a lot, so I might leave that as a future project. But for Nextcloud and Jellyfin I saw a lot of great tips! I forgot to mention that ideally I would like to have Nextcloud available for multiple users (i.e. family members), so indeed learning some basic networking/firewalling seems the bare minimum.
I also promise that I will carefully read the manuals!
Lots of people have been talking about products and tools: Docker, Tailscale, Cloudflare, Proxmox, etc. These are important, but will likely come and go on a long enough timescale.
In terms of actual skills, there are two that will dramatically decrease your headaches: documentation and backup planning. The problem with developing those skills is, to my knowledge, they’ve only ever been obtained through suffering. Trying to remember how to rebuild something when you built it 6 months ago is futile. Trying to recover borked data is brutal. There’s no fail-safe that you haven’t created, and there’s no history that you haven’t written. Fortunately, these are also the most transferable skills.
My advice is, jump in. Don’t hesitate. The chops in Docker/Linux/networking will come with use and familiarity. If it looks cool, do it. Make mistakes. You will rapidly realise what the problems with your setup are. You will gain knowledge in leaps and bounds from breaking a thing vs learning by rote or lesson. Reframe the headaches as a feature, not a bug - they’re highlighting holes in your understanding. They signpost the way to being a better tech, and a more stable production environment.
The greatest bit about self hosting for me is planning the next great leap forward, making it better, cleaner, more robust. Growing the confidence in your abilities to create a system you can trust. Honing your skills and toolset is the entirety of the exercise, so jump in, and don’t focus on any one thing to master or practice beforehand!
Networking is way more important than pretty much anything else. TCP/IP and HTTP are going to stay for quite a while.
Patience, most of all.
Also, backups and notes. The solution you use to host might take care of the backups. For example, I use Unraid, so if any drive fails the system can simulate the data on that drive until I can get it shut down to replace it, and then recreate the data on the new drive.
As for notes, those are important so that you can always know what you’ve done, and what you need to do. That way, if you ever have to do it again, say if you’re setting up another server or replacing one that failed, you know the steps you took to get it set up exactly how you like. It’s also handy because you’ll be doing things like assigning services to ports, and you’ll probably at some point want to know what services are on what ports without going through and checking each one. Things like that are handy things to stick in notes.
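To make the "what services are on what ports" note concrete, here is a minimal Python sketch of a port inventory kept as data. The service names and port numbers are made-up examples (not anyone's actual setup), and the helper names are hypothetical; the only point is that notes written this way can be checked for clashes automatically.

```python
from collections import defaultdict

# Hypothetical notes: service name -> host port. Example values only.
SERVICE_PORTS = {
    "nextcloud": 8080,
    "jellyfin": 8096,
    "pihole": 8080,  # deliberate clash with nextcloud, to show the check
}


def find_port_conflicts(service_ports):
    """Return {port: [services]} for every port claimed by more than one service."""
    by_port = defaultdict(list)
    for service, port in service_ports.items():
        by_port[port].append(service)
    return {port: sorted(names) for port, names in by_port.items() if len(names) > 1}


def format_port_table(service_ports):
    """Render the notes as plain text lines, sorted by port and then by name."""
    rows = sorted(service_ports.items(), key=lambda item: (item[1], item[0]))
    return "\n".join(f"{port:>5}  {service}" for service, port in rows)


if __name__ == "__main__":
    print(format_port_table(SERVICE_PORTS))
    for port, names in find_port_conflicts(SERVICE_PORTS).items():
        print(f"conflict on port {port}: {', '.join(names)}")
```

A plain text file or a wiki page works just as well; keeping the list in one place is what matters.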
Other than that, you don’t need a lot of skills to set something like a home server up. You just need to read the documentation for each service you’re planning to use, and get familiar with how it works.
Unraid is not a backup. It is good, but if your data goes wrong for different reasons or you lose the entire device, you can’t restore it. Dedicated backups are a must for anything serious!
Unraid absolutely is a backup. That’s the whole point of the OS. And furthermore, the backup can be backed up at any time and stored on another device, allowing you to restore the entire OS and its configuration. And by “lose the entire device”, I’m assuming you mean the OS is corrupted. At that point, you simply burn a new USB and reconnect the drives, or move them to any other system running Unraid.
Dude, it sounds like you’re overskilled for the job. You just need to read some guides, and you probably already know how networking works, very basic Linux commands, Linux folder structures, and then the concept of Docker - primarily how it maps networking & folders from your “host machine” to the “Docker container”, and how it loads services using a Docker Compose file. Especially for Nextcloud, domain DNS management, dynamic DNS, etc. would be very helpful knowledge.
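The host-to-container mapping mentioned above is usually written as HOST:CONTAINER pairs. Here is a small Python sketch that just splits such pairs apart so you can see which side is which. It only handles the simplest forms ('8080:80' for ports, 'host_path:container_path' with an optional ':ro' for folders); the example paths are illustrative, and the function names are hypothetical, not part of Docker.

```python
from typing import NamedTuple


class PortMapping(NamedTuple):
    host_port: int
    container_port: int


class VolumeMapping(NamedTuple):
    host_path: str
    container_path: str
    read_only: bool


def parse_port(spec: str) -> PortMapping:
    """Parse the simple 'HOST:CONTAINER' form, e.g. '8080:80'. Other forms are not handled."""
    host, container = spec.split(":")
    return PortMapping(int(host), int(container))


def parse_volume(spec: str) -> VolumeMapping:
    """Parse 'HOST_PATH:CONTAINER_PATH' with an optional ':ro' suffix."""
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"unsupported volume spec: {spec!r}")
    read_only = len(parts) == 3 and parts[2] == "ro"
    return VolumeMapping(parts[0], parts[1], read_only)


if __name__ == "__main__":
    port = parse_port("8080:80")
    print(f"host port {port.host_port} -> container port {port.container_port}")
    vol = parse_volume("./nextcloud-data:/var/www/html")
    print(f"host folder {vol.host_path} -> container folder {vol.container_path}")
```

The left side is always your machine and the right side is inside the container, which is the part most people mix up at first.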
Also, just a suggestion: ChatGPT etc. are super useful. You tell them what you want and they spit out custom instructions for your setup, and you’re able to ask follow-up questions at any point. If they do make mistakes, which they will, it’s a learning opportunity for you to troubleshoot and figure out how everything works. All the best, and if you have a question feel free to message me.
The ability to read, and maybe watch a video. And then persistence for some of the trial and error you will run into. All the skills you need can be picked up with the above.
You don’t need to be a programmer to selfhost.
The most important “skills” to have if you want to selfhost imo are:
- Basic networking knowledge
- Basic Linux knowledge
- Basic Docker/Docker Compose knowledge
But I’d say to not get lost in the papers and just jump right in. Imo, the best way to learn how to selfhost is to just… Do it. Most everything is free and fairly well documented
Perseverance
Totally agree! I’m not a programmer and I have several services running in my home server. I’m just curious and have used Linux for a decade as a normal user. With just these 3 basic knowledge skills you’re good to go.
- Docker, really. If something goes bad, trash the container and start again without losing your actual data.
Mostly Docker.
Portainer and plugging Docker Compose YAML into Portainer stacks makes Docker stupid-simple. (personally speaking as a stupid person that does this)
Cloudflare tunnels for stuff people other than you might want to access.
Tailscale if it’s only you.
Reverse proxy & port forwarding for sharing media over Jellyfin without violating the Cloudflare Tunnel ToS.
Dokploy is a pretty easy web gui and is itself a docker container.
Makes it dead simple to manage multiple containers and domains. (Not for power users that need kubernetes level flexibility)
Determination, patience, a willingness to learn anything you need to.
If you have those, in time, you will be able to get your lab up and running. I started mine with a minimal knowledge of Linux (I could install it from a USB and poke around). Now it’s the center of my family’s digital life.
You’ll get there in time.
Honestly, you just need to pick a video and follow along these days. There’s a load of YouTube videos out there that take you step by step.
Louis Rossmann recently put out a 14-hour mega video of doing everything. While he might have made some controversial choices, the outcome is quite comprehensive.
The patience to read lots of documentation.
And maybe patience to power through a lack of documentation.
This, 1000%. Eventually you’re gonna run into a problem or situation that does not have much documentation. Powering through step by step logically can test the best of us. You can spend 56 hours in a day on one problem, give up, and then figure it out the next morning in 10 minutes. It’s a marathon, not a sprint.
Working hands, ability to type characters into keyboard.
Learn how to properly back up your data in case you nuke something you shouldn’t.
And regularly check them. I just found out the hard way this last week that my backups haven’t been running for a few weeks …
Yep.
I have friends in the SMB space; one thing they do is regular backup verification (quarterly). At that frequency, restoring even a few files (especially to a new VM) is very indicative, especially if it’s a large dataset (e.g. QuickBooks).
In Enterprise, we do all sorts of validation, depending on the system. Some is performed as part of Data Center operations, some is by IT (those are separate things), some by Business Unit management and their IT counterparts.
Unfortunately, that wouldn’t have done anything. Because I did that in December and they stopped running like 2 weeks after my verification. I would have caught it on my next scheduled validation, but that doesn’t help me now 😕
I mean, it still helps right? It limits your losses to X weeks instead of X months or, I hate to say it, X years.
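A cheap complement to the periodic restore tests discussed above is an automated freshness check, which would catch a backup job that silently stopped. Below is a minimal Python sketch, assuming backups land as files in a single directory. The 26-hour threshold is an assumption for a daily job, and the function names are hypothetical. It only checks file timestamps, so it does not replace an actual restore test.

```python
import tempfile
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_hours(backup_dir: str, now: Optional[float] = None) -> Optional[float]:
    """Return the age in hours of the most recently modified file, or None if there are no files."""
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    if not files:
        return None
    newest_mtime = max(p.stat().st_mtime for p in files)
    current = time.time() if now is None else now
    return (current - newest_mtime) / 3600


def backups_are_stale(backup_dir: str, max_age_hours: float = 26.0,
                      now: Optional[float] = None) -> bool:
    """True when no backup exists or the newest one is older than max_age_hours."""
    age = newest_backup_age_hours(backup_dir, now)
    return age is None or age > max_age_hours


if __name__ == "__main__":
    # Demo against a throwaway directory; point it at your real backup folder instead.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "backup.tar").write_text("demo")
        print("stale?", backups_are_stale(tmp))
```

Run something like this from a daily timer and have it notify you when it reports stale, so a dead backup job is noticed in days rather than at the next quarterly check.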
Experimenting with VMs is the way forward.
Basic networking knowledge is vital. And being able to configure your own firewall(s) safely is an important skill. Check out something like Foomuuri or firewalld. Shorewall is brilliant for documentation and description of issues (with diagrams!), but it does not use the newer Linux kernel nftables and is no longer actively developed.
Go for it with Nextcloud.
I would also recommend at least having a shot at setting up an email server, although I would recommend pushing through to a fully working system. It is possible, and is very satisfying to have in place. The process of setting one up touches so many different parts of internet function and culture that it is worth it even if you don’t end up with a production system. The Workaround.org ISPMail stuff is a good starting point, and includes some helpful background information at every stage, enough so you can begin to understand what’s going on in the background and why certain choices are being made - even if you disagree with the decisions.
Python is great for server admin, although most server config and startup/shutdown snippets are written in Bash. You will no doubt have already begun picking that up as you interact with your VMs.
Take the time to properly understand Linux file ownership and permissions. Permissions will be the cause of many issues you will encounter in your self-hosting journey on Linux. Make sure you know the basics of chmod (change permissions) and chown (change ownership), Linux users and groups. This will save you some head-scratching, but don’t worry, you will learn by doing!
Remember that, if you set up everything right, especially with Docker, running as root / with sudo is not required for any of the services you may want to run.
if you could start again in your self hosting journey, what would you do differently? :)
That’s an excellent question.
If I were to start over, the first thing that I would do is start by learning the basics of networking and set up a VPN! IMO exposing services to the public internet should be considered more of an advanced level task. When you don’t know what you don’t know, it’s risky and frankly unnecessary.
The lowest barrier to entry for a personal VPN, by far, is Tailscale. Automatic internal DNS and clients for nearly any device make finding services on a dedicated machine really, really easy. Look into putting a Tailscale client right into the compose file so you automatically get an internal DNS record for a service rather than a whole machine.
From there, play around with more ownership (work) with regard to what can touch your network. Switch from Tailscale’s “trusted” login to hosting your own Headscale instance. Add a PiHole or AdGuard exit node and set up your own internal DNS records.
Maybe even scrap the magic (someone else’s logic that may or may not be doing things you need) and go for a plain-Jane Wireguard setup.
For sure use Tailscale for VPN. They have apps for iPhone, Android, macOS, and Linux, so setting up your own personal network will be easy. Hosting on the real internet is definitely advanced and not always necessary.
If you want to program something, the closest you’re gonna get to programming is Ansible and Bash scripts.
You might want to get self hosting hardware like Synology or the like if you’re not ready to dig.
Otherwise here’s some things you need to know:
- Docker
  - Easy, consistent deployment of services in their own environments. Think a VM but with almost no overhead.
- Docker Compose
  - Run docker containers with consistent configuration in files.
  - Connect various containers to each other on the same or different networks.
  - Get multiple containers to start together and talk to each other.
- Systemd
  - Manage any service on Linux. If anything needs to start on boot, restart when crashed, start on timer, you want Systemd.
  - You can manage your docker compose containers lifecycle via Systemd.
- NGINX/Apache/Caddy
  - A web server for reverse proxy. You’d probably need one at some point, especially if you want HTTPS. Your services get hidden behind it.
- ZFS
  - Reliable redundant storage. You’ll need storage. Use ZFS with 2-disk redundancy.
  - Supports automatic snapshots for recovering from oopsies. E.g. deleted something or some software shat on your data.
  - Can use recertified disks from serverpartsdeals.
  - Can use USB disks or USB box with multiple disks. If you end up going the USB route, ask me for tested hardware.
- Backup system
  - Something to do backup. There are many options.
- Ansible
  - If you want to write code that describes your services and make them happen, you want Ansible. You write code (well YAML) and Ansible installs things, writes config files, sets up Systemd services, restarts things. It can be convenient especially if you have a lot of stuff and you want to be able to see all of your infrastructure in code in one place and be able to version it.
- Prometheus
  - Monitoring your stuff. Is my backup service running? If not send me an email.
Oh and use Debian or Ubuntu LTS.
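To illustrate the "is my service running?" kind of question from the monitoring item above: this is not Prometheus, just a plain TCP connection check in Python, as a rough sketch of what any monitoring tool does at its simplest. The service names, hosts and ports are hypothetical, and the function names are made up for this example. Sending the email is left out.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_down_services(services: dict, timeout: float = 2.0) -> list:
    """Given {name: (host, port)}, return the sorted names that did not accept a connection."""
    return sorted(
        name for name, (host, port) in services.items()
        if not is_port_open(host, port, timeout)
    )


if __name__ == "__main__":
    # Hypothetical services; adjust hosts and ports to your own setup.
    services = {"jellyfin": ("127.0.0.1", 8096), "nextcloud": ("127.0.0.1", 8080)}
    down = find_down_services(services, timeout=0.5)
    print("down:", ", ".join(down) if down else "none")
```

An open port only shows that something is listening, not that the service is healthy, which is one reason dedicated monitoring tools exist.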
Ansible is nice but I’ll repeat (as I said in another thread) it’s kind of advanced and gives a much better return on investment if you manage several hosts, plan to switch hosts regularly, or plan to do regular rebuilds of the environment.
Great summary!
Why Debian or Ubuntu? (I have my own thoughts, but it would be useful to show even high-level reasons why they’re preferred).
Re: Backup - Backblaze has a great writeup on backup approach today. I’m a fan of cloud being part of the mix (I use a combo of local replication and cloud, to mitigate different risks). Getting people to include backup from the start will help them long-term, so great you included it!
Predictable cadence, stable operation, timely updates, huge community and therefore documentation. You can get up to 5 years from an LTS release of Debian or Ubuntu. With Ubuntu LTS and Ubuntu Pro (free) you could theoretically run a machine without upgrading for 10 years. If you run workloads in containers, it doesn’t matter how old the host OS is. As long as it gets security patches, you can keep on trucking.
Damn, 5 years from LTS? That’s impressive
If you end up going the USB route, ask me for tested hardware.
Send these my way chief
As briefly as possible:
- Host side
  - If you use Intel, all is well.
  - If you use AMD…
    - Prior to AM5
      - Use an ASMedia PCIe USB card (StarTech, Sonnet)
      - X570 is especially bad, though I’ve had some success with B350, when using the chipset ports. The CPU ports are all bad. Small form factor PCs often only expose CPU USB ports. They work with a single disk per port, but if you peg a port with a multi-disk box, they crap out regularly.
    - Post AM5
      - Have only tested USB4 on X870 and it’s solid.
- Client side
  - WD Elements / MyBook
    - If you get disconnects under load and you’re not on a shit AMD USB host, the USB-SATA controller is overheating. Open them and adhere a heatsink to it. Drill a hole in the case above it for better ventilation. Disconnections will stop. If you don’t want to deal with any of that, buy the item below.
  - OWC Mercury Elite Pro Quad
    - Well built, solid controllers, no issues over a year of testing. I have 2 hosting an 8-disk RAIDz2 and 2 hosting a 5-disk RAIDz2.
  - Terramaster
    - A friend bought a 6-bay and tore it down for me. It has the same controllers as the OWC in a similar topology. If it’s cheaper it might be OK. I can vouch for the OWC though.
  - Cables
    - Get name brand cables, ideally higher spec than what you’d need! They aren’t important for a single USB disk, but running a 4-disk box can max out the port bandwidth. If the cable can’t handle it… errors. Casually transmitting 10Gbps via easily detachable cables and ports isn’t trivial.
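For anyone sizing pools like the 8-disk and 5-disk RAIDz2 setups mentioned above, the back-of-envelope arithmetic is simply "total disks minus two parity disks". Here is a small Python sketch; the 8 TB disk size is an assumed example, the function names are made up, and the result ignores ZFS metadata, padding and reserved space, so real usable capacity will be somewhat lower.

```python
def raidz_usable_disks(total_disks: int, parity_disks: int = 2) -> int:
    """Number of disks' worth of space left for data in a single RAIDZ vdev."""
    if total_disks <= parity_disks:
        raise ValueError("need more disks than parity disks")
    return total_disks - parity_disks


def raidz_usable_tb(total_disks: int, disk_size_tb: float, parity_disks: int = 2) -> float:
    """Rough usable capacity in TB, ignoring ZFS metadata, padding and reserved space."""
    return raidz_usable_disks(total_disks, parity_disks) * disk_size_tb


if __name__ == "__main__":
    # 8 TB per disk is an assumed size for illustration only.
    for disks in (5, 8):
        usable = raidz_usable_tb(disks, 8.0)
        print(f"{disks}-disk RAIDZ2 with 8 TB disks: about {usable:.0f} TB usable")
```

The same two disks of parity cost 40% of a 5-disk pool but only 25% of an 8-disk pool, which is why wider pools feel cheaper per usable TB.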
Much appreciated 🙏
Gnarly stuff with the WD’s huh? Unfortunately I think that’s what I’ll end up having to put up with since I can’t really find the other options for a decent price around here.
Funny enough I was half-considering just using a bunch of WD Elements. You think the MyBooks might fare any better?
I used a mix of Elements and MyBook for years. Upon opening them to add heatsinks, I didn’t see any significant differences between them. They use ASMedia or JMicron, mostly ASMedia. The overheating issue depends on ambient temp and load. I’ve had one machine in a basement never experience it. Either way the solution is pretty straightforward and cheap. Once heatsinked, I haven’t had a problem.
The cables they come with are good.