
Being able to run a self-contained Postgres with a single command is an easy ergonomic win. You don't necessarily need to be an engineer or write code for containers to be useful.
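
For instance, something along these lines is roughly all it takes (the image tag and password are placeholders, not recommendations):

  # detached Postgres, reachable on localhost:5432
  docker run -d --name pg -e POSTGRES_PASSWORD=secret -p 5432:5432 postgres:16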



How do you inform your backup system where to get backups? How do you set pg_hba and other configs? A few more "how?"s and you're doing what you'd be doing on a VM anyway

> How do you inform your backup system where to get backups? How do you set pg_hba and other configs?

Simple answer: with configuration. In Docker Compose or Kubernetes. Less often in Mesos. Maybe I want to run it on a fleet of VMs, maybe on bare metal.
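
As a rough sketch of what "with configuration" means here (names, paths, and the database are illustrative; the official Postgres image forwards arguments placed after the image name to the server, which is how a custom hba_file can be set):

  # custom pg_hba.conf mounted read-only, data kept in a named volume
  docker run -d --name pg \
    -e POSTGRES_PASSWORD=secret \
    -v "$PWD/pg_hba.conf":/etc/postgresql/pg_hba.conf:ro \
    -v pgdata:/var/lib/postgresql/data \
    postgres:16 -c hba_file=/etc/postgresql/pg_hba.conf

  # a backup job only needs to know the container name
  docker exec pg pg_dump -U postgres mydb > mydb.sql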

> and you're doing what you'd be doing on a VM anyway

Right. But with containers I can have different apps using different dependency versions. Some things use nginx, some use another web server; some run on Node 12, some on Node 16; some use MySQL, others need Postgres. This is so easy with containers.
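
A quick sketch of what that looks like in practice (container names and tags are arbitrary examples):

  # different runtimes and databases, side by side on one host
  docker run --rm node:12-alpine node --version
  docker run --rm node:16-alpine node --version
  docker run -d --name db-a -e MYSQL_ROOT_PASSWORD=secret mysql:8
  docker run -d --name db-b -e POSTGRES_PASSWORD=secret postgres:16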


So my issue is that now I have PostgreSQL running... inside a container? So I need to figure out where the data for it is stored and figure out how to ensure it is being backed up.

And that's just the minimum: I generally care deeply about the filesystem that PostgreSQL is running on, and I will want to ensure the transaction log is on a different disk than the data files... and I now have to configure all of this stuff multiple times.

At some point I am going to have to edit the configuration file for PostgreSQL... is it inside of the container? Did I have to manually map it from the outside of the container?

The way you access PostgreSQL locally (whether you find yourself ready to add a copy of pgbouncer, or just to run the admin tools) is via Unix domain sockets. Are those correctly mapped outside of the container, and where did I put them?

I honestly don't get it for something like PostgreSQL. I even use containers, but I can only see downsides for this particular use case. You know how easy it is to run PostgreSQL in some reasonable default configuration? It is effectively zero commands, as you just install it and your distribution generally already has it running.


> So my issue is that now I have PostgreSQL running... inside a container? So I need to figure out where the data for it is stored and figure out how to ensure it is being backed up.

And how is that different from running directly on the OS?


The OS has a well-understood abstraction behind its filesystem structure, whereas container systems often create an entirely new abstraction that they view as the best way.

This makes sense when you're trying to be universally deployable, but it increases the amount of onboarding someone needs before they're proficient with the container system; onboarding they may not actually have needed to get the software working and well understood, simply "docker overhead".

From personal experience: I'm a long-time Linux person, and the insistence on going "all in" on Docker (or whatever) just to run a Python script that has two or three common shared dependencies gains me nothing but the hassle of having to maintain and understand a container system.

If you're shipping truly fragile software that depends on version 1.29382828 rather than version 1.29382827, then I understand the benefits gained, but containerizing something very simple just to follow industry trends is obnoxious, increasingly common, and has seemingly soured a lot of people on a good idea.

P.S.: I can also understand containerizing very simple things as parts of a larger mechanism; I just don't buy the promise that it'll reduce end-user complexity. It isn't that simple.


About "truly fragile software": it seems to be the norm now, thanks to how easily Docker lets you hide it.

Database in container solves a few problems for me:

1. I can start it and stop it automatically with Compose. That's a big increase in ergonomics.

2. I don't have to write my own scripts to set up and tear down the database in the dev environment. The Postgres image + Compose does all that for me.

3. Contributors who don't know much about Unix or database admin stuff (data analysts learning Python & data scientists contributing to the code) don't have to install or mess with anything. It works on everyone's machine the same way.

Volume and port mapping are basically trivial concepts anyway. There's been zero downside for me in using it, even though I personally have the skills and knowledge to not "need" it. Why would I go without it? It saves me time and effort that could be significantly better used elsewhere.
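
For what it's worth, points 1 and 2 boil down to something like this, assuming a compose file that defines the database service and its volume:

  # bring the dev environment up, database included
  docker compose up -d
  # tear it down and drop the volume so the next run starts clean
  docker compose down --volumes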


I am honestly shocked by the amount of comments here that mention volume mounts as some gotcha. No it isn't. It is as trivial as it gets.

I like your point of view. I started working with software more than 20 years ago and did my fair share of VMs, bare metal, ESXi 3.5, and so on.

However, we live in a world where the choice for new hires is: a) teach them all of those OS fundamentals, or b) give them Docker.

This doesn’t mean we shouldn’t strive to teach said new developers all those underlying concepts, but when we talk about training juniors and having them contribute relatively quickly, it’s a much smaller surface area to bite off.


Docker doesn't remove the need to know OS fundamentals, as both the inside and the outside of the container have an OS, and now you additionally have to know how to communicate between the two.

> This doesn’t mean we shouldn’t strive to teach said new developers all those underlying concepts

That's what I said.

Furthermore, let's be pragmatic. The operations team needs to know, yes. We want new hires to contribute and feel productive; they'll naturally learn while working on the software. A junior person can contribute almost immediately with limited surface knowledge.

Less friction: no need to understand systemd/upstart/rc or what have you, /etc, /opt, /usr, mount, umount, differences between various distros, build-essential, Development Tools, apt, dpkg, rpm, dnf, ssh keys, ...

The right time will come but give them an easy way in. Containers provide exactly that.

All they need to know: it's somewhat isolated, so under normal circumstances whatever you do in the container doesn't affect the host; how to expose ports; the basics of getting dependencies in; volumes; and a basic understanding of container networks (for things to talk to each other, they need to be on the same network). Enough to start.
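
Roughly this much, in other words (the app's image name, port, and config path are made up for the sake of the example):

  # containers on the same user-defined network reach each other by name
  docker network create appnet
  docker run -d --name db --network appnet -e POSTGRES_PASSWORD=secret postgres:16
  # the app can now reach the database at host "db"
  docker run -d --name app --network appnet -p 8080:8080 \
    -v "$PWD/config":/app/config my-app-image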


Every problem you mention here is solved by working primarily from your compose file/deployment manifest and having that be your source of truth.

- Where is the data stored? Your compose file will tell you.

- Is the configuration file in that container or mapped? File tells you. (If you didn't map it, it's container-local)

- How do you get to it with admin tools? If you mapped those sockets outside, file tells you.
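
And if you ever doubt the file, you can ask Docker itself; a couple of commands cover most of it (the container name is illustrative):

  # print the fully resolved compose configuration
  docker compose config
  # or ask a running container where its mounts actually point
  docker inspect -f '{{ json .Mounts }}' pg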


Problem: too many config files.

Solution: config file for the config files.


I literally don't have to do any of that if I just install PostgreSQL the normal way, as the configuration file is in the same place the PostgreSQL documentation says it is, the daemon is probably already running, and all of the tools know how to find the local domain socket. Why am I having to configure all of this stuff manually in some new "compose file"? Oh, right: because I am now using a container.

It's really hard for me to not make some incredibly dismissive "okay, boomer" comment here, but you are not giving me much to work with and this really reads as obstinate resistance to learning new things. Are containers different? Without a doubt. The benefits are, however, impossible to ignore.

The amount of work you are complaining about is, objectively, trivial. It's no harder than learning how to deploy on another distro or operating system, except in this case, your new knowledge is OS-agnostic.


I agree. All he is doing is making people uncomfortable for using containers. The only major downside with containers is that the build process is quite expensive, so you should get familiar with docker run -it --rm <id> bash, or the same thing with docker exec. Oh, and also that your container might not come with diagnostic tools.
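
Something like this is usually enough when you need to poke around (container name and packages are just examples; the official Postgres image happens to be Debian-based):

  # shell into the running container
  docker exec -it pg bash
  # or start a throwaway container from the same image
  docker run -it --rm postgres:16 bash
  # once inside, missing diagnostic tools can be pulled in ad hoc
  apt-get update && apt-get install -y procps iproute2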

All my automated integration tests use a real Postgres database in a known state. I don’t mock or stub or use an in-memory database pretending to be Postgres. It’s nice.

Yup. And you can make sure to run your tests on the exact same Postgres version you run in production. No need to go help your teammate whose machine is for some reason behaving differently because of some weird configuration file somewhere that they forgot they changed six months ago.
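
A sketch of the idea, with placeholder names, password, and port; pin the tag to whatever exact version production runs:

  # throwaway database for the test run
  docker run -d --rm --name test-db -e POSTGRES_PASSWORD=test -p 5433:5432 postgres:16
  # ...run the test suite against localhost:5433...
  docker stop test-db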

> So I need to figure out where the data for it is stored and figure out how to ensure it is being backed up.

As a user, this is why I love Docker. The configuration is explicit and contained, and it's well documented which directories and ports are in play.

I don't need to remember to tweak that one config file in /etc/ whose location I can never remember. Either it's a flag or it's a file in a directory I map explicitly. And where does _this_ program store its data? I don't need to remember; the data dir is mapped explicitly.

That said, I haven't tried to use PostgreSQL this way myself, just tools that use it, like Gitea.
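
With Gitea, for example, that explicitness looks roughly like this (tag, host path, and ports are illustrative; the /data convention comes from the image's documentation):

  # data directory and ports mapped explicitly
  docker run -d --name gitea -p 3000:3000 -p 2222:22 \
    -v /srv/gitea-data:/data gitea/gitea:latest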


Yes, this is the hidden beauty of Docker: "I don't need to remember". Someone can reverse-engineer exactly how I have a goofy custom Postgres setup instantly, just by looking at the original Dockerfile I committed a year ago. No hunting around on the OS!

As someone else said, docker isn't about you, it's about everyone else. The extra complexity up front is so worth it for the rest of the team.


> So my issue is that now I have PostgreSQL running... inside a container? So I need to figure out where the data for it is stored and figure out how to ensure it is being backed up.

You bind-mount the directory where the data lives onto the external system. That way only the important data is exposed. It's very clean, although it requires you to understand Docker a bit to know to do this. But for things like Postgres and similar sprawling software, which basically assume they're a cornerstone of your entire application and spread out as though this were the case, it's actually a very neat way of using them without having them take over your machine.
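
Sketched out, with placeholder paths (the data directory inside the container is the Postgres image's default):

  # only the host directory holding the data is shared with the container
  docker run -d --name pg -e POSTGRES_PASSWORD=secret \
    -v /srv/pg-data:/var/lib/postgresql/data postgres:16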

This is something that can be useful for a developer too. Like my search engine software assumes it owns the hardware it runs on. It assumes you've set up specific hard drives for the index and the crawl data. But sometimes you just wanna fiddle with it in development, and then it can live in a pretend world inside docker where it owns the entire "machine", and in reality it's actually just bound to some dir ~/junk/search-data.

Although I guess an important point is that using Docker requires you to understand the system better than not using Docker, since you need to understand both the software you run inside the container and Docker itself. It's sometimes used as though you don't need to understand what the container does. This is a footgun that would make C++ blush.


> Being able to run a self-contained Postgres with a single command is an easy ergonomic win. You don't necessarily need to be an engineer or write code for containers to be useful.

Postgres is a poor example; it's far, far easier to do `apt-get install postgresql` than to run it from a container.

The latter needs a container set up with correct ports, plus it needs to be pointed at the correct data directory before it is as usable as the former.
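
Side by side, the difference looks something like this (volume name, tag, and password are placeholders):

  # native install: one command, distro defaults, service already managed
  sudo apt-get install postgresql
  # container: port and data directory have to be spelled out
  docker run -d --name pg -e POSTGRES_PASSWORD=secret \
    -p 5432:5432 -v pgdata:/var/lib/postgresql/data postgres:16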


Is this some kind of joke? Those things are absolutely trivial, and once you've written them in your bash script or compose file you can forget about them.
