Unable to connect to PostgreSQL database running in a different container than the application
Steps to reproduce
I’m using the mattermost command.
mattermost user create --email test2@email.com --username testatene --password Password
Observed behavior
{"level":"error","ts":1564644335.8744648,"caller":"sqlstore/supplier.go:236","msg":"Failed to ping DB retrying in 10 seconds err=dial tcp 127.0.0.1:5432: connect: connection refused"}
Is there any way I can change the PostgreSQL host? The container is reachable through the "db" name set in docker-compose, but I haven't found a way to set that host anywhere.
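As an aside for readers hitting the same wall: Mattermost supports overriding any config.json setting via MM_* environment variables, so one hedged option is to set the full datasource on the app service. This is a sketch, not verified against this particular image; the hostname db is the compose service name, and the credentials are assumed to match the db service's environment.

```yaml
app:
  environment:
    # MM_SQLSETTINGS_DATASOURCE overrides SqlSettings.DataSource in config.json
    - MM_SQLSETTINGS_DATASOURCE=postgres://mmuser:mmuser_password@db:5432/mattermost?sslmode=disable&connect_timeout=10
```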
Since you mentioned that the container is reachable through the database name, can you please let us know how you confirmed that? Did you try to ping the machine hosting the database from the container and vice versa, or did you use some other method?
Can you share a copy of the docker-compose.yaml here so we can go through it together?
Hi,
Yes, this is the docker-compose I'm using. It comes directly from the official GitHub repo (https://github.com/mattermost/mattermost-docker). This seems to be default behaviour with Mattermost. I've got two somewhat different installations with the same problem, one of which uses different settings for the reverse proxy:
version: "2"

services:
  db:
    build: db
    read_only: true
    restart: unless-stopped
    volumes:
      - ./volumes/db/var/lib/postgresql/data:/var/lib/postgresql/data
      - /etc/localtime:/etc/localtime:ro
    environment:
      - POSTGRES_USER=mmuser
      - POSTGRES_PASSWORD=mmuser_password
      - POSTGRES_DB=mattermost
      # uncomment the following to enable backup
      # - AWS_ACCESS_KEY_ID=XXXX
      # - AWS_SECRET_ACCESS_KEY=XXXX
      # - WALE_S3_PREFIX=s3://BUCKET_NAME/PATH
      # - AWS_REGION=us-east-1

  app:
    build:
      context: app
      # uncomment following lines for team edition or change UID/GID
      args:
        - edition=team
        - PUID=5000
        - PGID=5000
    restart: unless-stopped
    volumes:
      - config:/mattermost/config:rw
      - data:/mattermost/data:rw
      - logs:/mattermost/logs:rw
      - plugins:/mattermost/plugins:rw
      - client-plugins:/mattermost/client/plugins:rw
      - /etc/localtime:/etc/localtime:ro
    environment:
      # set same as db credentials and dbname
      - MM_USERNAME=mmuser
      - MM_PASSWORD=mmuser_password
      - MM_DBNAME=mattermost
      # in case your config is not in default location
      # - MM_CONFIG=/mattermost/config/config.json
    extra_hosts:
      - "mm.vinci.de:192.168.99.12"
      - "git.vinci.de:192.168.99.11"

  web:
    build: web
    ports:
      - "80:80"
      - "443:443"
    read_only: true
    restart: unless-stopped
    volumes:
      # This directory must have cert files if you want to enable SSL
      - ./volumes/web/cert:/cert:ro
      - /etc/localtime:/etc/localtime:ro
    # Uncomment for SSL
    # environment:
    #   - MATTERMOST_ENABLE_SSL=true
    extra_hosts:
      - "mm.vinci.de:192.168.99.12"
      - "git.vinci.de:192.168.99.11"

volumes:
  config:
  data:
  logs:
  plugins:
  client-plugins:
So as you see, the images were built directly from the files in the official Mattermost GitHub repo.
I can't ping into the other container; I'm guessing it's because I can't do it as the mattermost user (probably because it isn't allowed to use ping and the like). Ping does work from the database container to the app, though.
How would the application know to reach the 'db' name otherwise? It can't work like that unless it's set somewhere in the configuration files. I could have named the container something else, and I'd still expect to be able to point the application at the database host.
Later I saw that there's actually a DB_HOST=${DB_HOST:-db} in the Dockerfile, which doesn't actually appear among the environment variables in the app container.
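For anyone unfamiliar with that syntax: ${DB_HOST:-db} only supplies a fallback at the moment the shell expands it, which is why nothing ends up persisted in the container's environment. A quick sketch of how the expansion behaves:

```shell
# ${VAR:-default} expands to $VAR when it is set and non-empty,
# otherwise to the literal default -- so DB_HOST falls back to "db"
unset DB_HOST
echo "${DB_HOST:-db}"        # prints: db

DB_HOST=192.168.99.50        # hypothetical override
echo "${DB_HOST:-db}"        # prints: 192.168.99.50
```

So whatever script performs that expansion gets the value, but a process started later (such as the CLI via docker exec) sees no DB_HOST at all.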
Hey @vinci,
what is the exact command or commands you use to run this?
I’ve found that running the command from inside the machine did not work.
However using docker exec worked for me because then all the environment vars are set correctly.
So for example to get the server version I’d use docker exec -it mattermost_docker_app_1 /mattermost/bin/mattermost version
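If the root cause is that those variables never reach the CLI process, another hedged workaround is to assemble the datasource yourself and pass it as an explicit override. The container name and the MM_SQLSETTINGS_DATASOURCE override are assumptions on my part, not something verified against this image; the credential values are taken from the compose file above.

```shell
# Build the Postgres datasource the entrypoint would normally derive;
# values are assumed to match the compose file's db service.
DB_HOST=db
DB_PORT=5432
MM_USERNAME=mmuser
MM_PASSWORD=mmuser_password
MM_DBNAME=mattermost
DATASOURCE="postgres://${MM_USERNAME}:${MM_PASSWORD}@${DB_HOST}:${DB_PORT}/${MM_DBNAME}?sslmode=disable&connect_timeout=10"
echo "$DATASOURCE"

# Then hand it to the CLI when exec'ing into the (hypothetically named) container:
# docker exec -it -e MM_SQLSETTINGS_DATASOURCE="$DATASOURCE" \
#   mattermost_docker_app_1 /mattermost/bin/mattermost version
```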
Thank you for the tip. Unfortunately it still doesn't work. To be honest, I would have been pretty surprised if it worked via the docker command but not from inside the container. The variables are there and the user should have access to them. This is what I've tried:
root@mattermost:/home/atkadmin# docker exec -it 9c12 sh
~ $ mattermost version
{"level":"error","ts":1567092520.6169271,"caller":"sqlstore/supplier.go:236","msg":"Failed to ping DB retrying in 10 seconds err=dial tcp 127.0.0.1:5432: connect: connection refused"}
^C
~ $ root@mattermost:/home/atkadmin# docker exec -it 9c12 mattermost version
{"level":"error","ts":1567092530.4008448,"caller":"sqlstore/supplier.go:236","msg":"Failed to ping DB retrying in 10 seconds err=dial tcp 127.0.0.1:5432: connect: connection refused"}
I can confirm that a fresh install as of a few days ago exhibits the same behaviour. This happens both with docker-compose exec app mattermost and with docker-compose exec app sh followed by /mattermost/bin/mattermost.
Running docker-compose exec app sh and then env results in:
Thank you for confirming the issue.
For me it would be really important to be able to use the command line, as there are several things you can do only there, such as completely removing users.
I’ve gone one step further and installed from Docker and then removed all references to external volumes so that it’s a completely isolated installation. And the same issue persists.
I'm hoping to add 300+ user accounts, which is another thing only the command line enables. I'm really curious what the underlying issue is.
I can't see it as anything other than negligence. It's obvious that this works if you install it conventionally outside Docker, since everything would connect to localhost, which inside a single VM is common to all services, so there wouldn't be a problem in the first place.
[later edit:]
So this is what I’ve done to try accessing the database.
First of all I deleted the USER mattermost directive from the Dockerfile of the application container, so that I could start the container with root privileges. Then I installed socat and ran it to forward the local port (172.19.0.2 being the database container).
What do you think I get?
{"level":"error","ts":1568147811.9297912,"caller":"sqlstore/supplier.go:235","msg":"Failed to ping DB retrying in 10 seconds err=pq: SSL is not enabled on the server"}
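That particular error comes from the Go pq driver, whose default sslmode is require; the connection string has to say otherwise explicitly. A datasource along these lines (address assumed from the socat setup above, credentials from the compose file) should avoid it:

```
postgres://mmuser:mmuser_password@172.19.0.2:5432/mattermost?sslmode=disable
```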
It's as if it never even crossed their minds that the CLI would need to establish the connection between the containers. What I still don't understand is why the application itself is able to connect to the database, since Mattermost works fine otherwise.
[even later on:]
I've set up SSL on the database, just for the sake of it, then switched to the mattermost user using su -, but for some reason the mattermost binary wasn't in the PATH. So I ran:
/mattermost/bin/mattermost version
{"level":"error","ts":1568149541.235386,"caller":"sqlstore/supplier.go:235","msg":"Failed to ping DB retrying in 10 seconds err=pq: password authentication failed for user \"mattermost\""}
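The user "mattermost" in that error suggests the CLI fell back to a config.json whose DataSource was never rewritten by the entrypoint. For completeness, this is roughly what the SqlSettings block would need to contain for the CLI to reach the db container directly; the field names are Mattermost's standard config keys, and the credential values are assumed from the compose file.

```json
"SqlSettings": {
    "DriverName": "postgres",
    "DataSource": "postgres://mmuser:mmuser_password@db:5432/mattermost?sslmode=disable&connect_timeout=10"
}
```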
Any update on this?
I am struggling with the same problem! The Mattermost instance is up and running with Docker, no problem at all. But I can't get access to the command line interface; I'm getting the same Connection refused error:
andreas@filesafe:/filesafe/prolikematters$ docker exec -it prolikematters_app_1 /mattermost/bin/mattermost version
{"level":"error","ts":1568287324.1087985,"caller":"sqlstore/supplier.go:235","msg":"Failed to ping DB retrying in 10 seconds err=dial tcp 127.0.0.1:5432: connect: connection refused"}
{"level":"error","ts":1568287334.110559,"caller":"sqlstore/supplier.go:235","msg":"Failed to ping DB retrying in 10 seconds err=dial tcp 127.0.0.1:5432: connect: connection refused"}
{"level":"error","ts":1568287344.1120512,"caller":"sqlstore/supplier.go:235","msg":"Failed to ping DB retrying in 10 seconds err=dial tcp 127.0.0.1:5432: connect: connection refused"}
I have followed the Mattermost production deployment guideline; this server has nginx running as a reverse proxy and Let's Encrypt for SSL certificates. But I guess those aren't what's causing the problem?
My docker-compose.yml looks like:
version: "2"

services:
  db:
    build: db
    read_only: true
    restart: unless-stopped
    volumes:
      - ./volumes/db/var/lib/postgresql/data:/var/lib/postgresql/data
      - /etc/localtime:/etc/localtime:ro
    environment:
      - POSTGRES_USER=mmuser
      - POSTGRES_PASSWORD=mmuser_password
      - POSTGRES_DB=mattermost
      # uncomment the following to enable backup
      # - AWS_ACCESS_KEY_ID=XXXX
      # - AWS_SECRET_ACCESS_KEY=XXXX
      # - WALE_S3_PREFIX=s3://BUCKET_NAME/PATH
      # - AWS_REGION=us-east-1

  app:
    build:
      context: app
      # uncomment following lines for team edition or change UID/GID
      args:
        - edition=team
        # - PUID=1000
        # - PGID=1000
    restart: unless-stopped
    volumes:
      - ./volumes/app/mattermost/config:/mattermost/config:rw
      - ./volumes/app/mattermost/data:/mattermost/data:rw
      - ./volumes/app/mattermost/logs:/mattermost/logs:rw
      - ./volumes/app/mattermost/plugins:/mattermost/plugins:rw
      - ./volumes/app/mattermost/client-plugins:/mattermost/client/plugins:rw
      - /etc/localtime:/etc/localtime:ro
    environment:
      # set same as db credentials and dbname
      - MM_USERNAME=mmuser
      - MM_PASSWORD=mmuser_password
      - MM_DBNAME=mattermost
      - VIRTUAL_HOST=matters.prolike.io
      - LETSENCRYPT_HOST=matters.prolike.io
      # in case your config is not in default location
      # - MM_CONFIG=/mattermost/config/config.json

networks:
  default:
    external:
      name: webproxy
Thanks to the magic of git checkout and way too much time spent, I can confirm that the breakage took place between mattermost-docker 5.11.1 and mattermost-docker 5.12.0
docker exec mattermostdocker_app_1 sh -c '/mattermost/bin/mattermost team create --name test --display_name "test"'
… works fine on 5.11.1, whereas the exact same docker-compose.yml (and associated filesystem mount points) results in the database error on 5.12.0.
There were 260 commits between those tags, and I’m hoping that someone with more knowledge than I of Mattermost’s internals can quickly spot what caused the breakage.
I opened this thread two months ago, the problem started on 15 June, and it's still being ignored…
They are very careful not to support direct LDAP authentication themselves, so that you buy the commercial version (I'm OK with that in itself), but they've had a huge problem for a long time that they're not addressing. Not really nice. How can this be considered enterprise-ready?