[Solved] Gateway Timeout when running Mattermost behind Traefik Proxy

Summary

I have a docker-compose setup with Mattermost, PostgreSQL and Traefik as reverse proxy that works in principle. However, most of the time (but not always) I get a “Gateway Timeout” error when I try to access Mattermost.

I’m not 100% sure which component causes the error, or whether it has anything to do with Mattermost at all. But since the config is mostly taken from https://github.com/mattermost/mattermost-docker, I’m posting it here.

Steps to Reproduce

Start the services defined by the config files below.

Expected and Observed Behavior

The desired behaviour is that the Mattermost service is reachable at the mattermost subdomain. In fact, this works only sometimes, i.e. after every couple of restarts of the setup. Otherwise, a Gateway Timeout error is returned after a while. As I’m writing this, the Mattermost service became reachable after about 4 hours without my touching any of the services.

Versions tested

  • mattermost 5.30.1/5.29.1
  • postgres 13.1/9.4
  • traefik 2.3

What I’ve checked so far

  • Bypassing Traefik by exposing Mattermost’s port to the outside world works as intended every time (without HTTPS, obviously).
  • Curling the Mattermost service from within the service-public network (by attaching another container) works every time.
  • Traefik recognizes the Mattermost service, including all routers and middlewares, at all times (checked via the dashboard).
  • All services are healthy according to the logs.
  • https://mattermost.example.com is set as the site URL (with my domain, obviously).
  • Restarting the services in every possible order works only sometimes.
  • Another service (nginx with some static files) launched with the same Traefik configuration can be reached immediately.
  • docker network inspect ... shows that all services appear in the intended networks.
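
The second check in the list (requesting Mattermost from inside the service-public network) could be reproduced with a throwaway container along the following lines. This is only a sketch: the file name, the probe service name and the curlimages/curl image are assumptions for illustration and not part of the original setup. The container name mattermost-app and port 8000 are taken from the Mattermost config further down.

```yaml
# docker-compose.debug.yml (hypothetical file name)
version: "3"

networks:
  service-public:
    external: true

services:
  # Hypothetical one-shot container attached to the same network Traefik uses
  probe:
    image: curlimages/curl:latest
    # Request the Mattermost container directly, bypassing Traefik,
    # and print only the HTTP status code
    command: ["-sS", "-o", "/dev/null", "-w", '%{http_code}\n', "http://mattermost-app:8000"]
    networks:
      - service-public
```

A successful status code here while Traefik still returns a Gateway Timeout would suggest that the container itself is reachable on service-public and that the problem lies in how the proxy addresses it.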

My Setup

Reverse Proxy

docker-compose.yml

version: "3"

networks:
  service-socket-proxy:
    external: false
  service-public:
    external: true

services:

  socket-proxy:
    image: tecnativa/docker-socket-proxy:latest
    container_name: "socket-proxy"
    restart: unless-stopped
    privileged: yes
    environment:
      CONTAINERS: 1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - service-socket-proxy

  traefik:
    image: "traefik:v2.3"
    container_name: "traefik"
    depends_on:
      - socket-proxy
    restart: unless-stopped
    privileged: no
    volumes:
      - "./conf:/etc/traefik:ro"
      - "./letsencrypt:/letsencrypt"
    ports:
      - "80:80"
      - "443:443"
    labels:
      - "traefik.enable=true"
      # secure router
      - "traefik.http.routers.traefik.rule=Host(`traefik.example.com`, `www.traefik.example.com`)"
      - "traefik.http.routers.traefik.entrypoints=websecure"
      - "traefik.http.routers.traefik.tls.certresolver=LetsEncrypt"
      - "traefik.http.routers.traefik.service=api@internal"
      # middlewares
      - "traefik.http.routers.traefik.middlewares=traefik-auth"
      - "traefik.http.middlewares.traefik-auth.basicauth.users=user:password."
    networks:
      - service-socket-proxy
      - service-public

conf/traefik.yml

# Uncomment for Development
log:
  level: DEBUG
  
api:
  dashboard: true
  
providers:
  # Pseudo provider that holds some middlewares that cannot be configured statically
  file:
    filename: "/etc/traefik/dyn.yml"
    watch: true
  # Default docker provider behind socket proxy
  docker:
    network: "service-socket-proxy"
    endpoint: "tcp://socket-proxy:2375"
    exposedByDefault: false
    
entryPoints:
  # HTTP entry point - does nothing but redirect to HTTPS
  web:
    address: ":80"
    http:
      middlewares:
        - http-redirect@file
  # HTTPS entry point
  websecure:
    address: ":443"
    http:
      middlewares:
        - www-redirect@file

certificatesResolvers:
  LetsEncrypt:
    acme:
      email: "admin@example.com"
      storage: "/letsencrypt/acme.json"
      tlschallenge: {}

conf/dyn.yml

http:
  middlewares:
    # Prune all "www" Prefixes
    www-redirect:
      redirectRegex:
        regex: "^https?://www\\.(.*)"
        replacement: "https://${1}"
        permanent: true
    # Enforce HTTPS
    http-redirect:
      redirectScheme:
        port: "443"
        scheme: https
        permanent: true

Mattermost

docker-compose.yml

version: "3"

networks:
  service-public:
    external: true
  service-mattermost:
    external: false

services:

  db:
    build: db
    read_only: true
    container_name: "mattermost-db"
    restart: unless-stopped
    volumes:
      - ./volumes/db/var/lib/postgresql/data:/var/lib/postgresql/data
      - /etc/localtime:/etc/localtime:ro
    environment:
      - POSTGRES_USER=mmuser
      - POSTGRES_PASSWORD=mmuser_password
      - POSTGRES_DB=mattermost
    networks:
      - service-mattermost

  app:
    build:
      context: app
      args:
        - edition=team
        - PUID=1000
        - PGID=1000
    container_name: "mattermost-app"
    restart: unless-stopped
    depends_on:
      - db
    # bypassing traefik by exposing mm directly works just fine
    # ports:
    #  - "8080:8000"
    volumes:
      - ./volumes/app/mattermost/config:/mattermost/config:rw
      - ./volumes/app/mattermost/data:/mattermost/data:rw
      - ./volumes/app/mattermost/logs:/mattermost/logs:rw
      - ./volumes/app/mattermost/plugins:/mattermost/plugins:rw
      - ./volumes/app/mattermost/client-plugins:/mattermost/client/plugins:rw
      - /etc/localtime:/etc/localtime:ro
    environment:
      - MM_USERNAME=mmuser
      - MM_PASSWORD=mmuser_password
      - MM_DBNAME=mattermost
      - MM_SQLSETTINGS_DATASOURCE=postgres://mmuser:mmuser_password@db:5432/mattermost?sslmode=disable&connect_timeout=10
    labels:
      - "traefik.enable=true"
      # insecure router
      - "traefik.http.routers.mm-router.rule=Host(`mattermost.example.com`, `www.mattermost.example.com`)"
      - "traefik.http.routers.mm-router.entrypoints=web"
      # secure router
      - "traefik.http.routers.mm-router-sec.rule=Host(`mattermost.example.com`, `www.mattermost.example.com`)"
      - "traefik.http.routers.mm-router-sec.entrypoints=websecure"
      - "traefik.http.routers.mm-router-sec.tls.certresolver=LetsEncrypt"
    networks:
      - service-mattermost
      - service-public

app/Dockerfile

FROM alpine:3.10

# Some ENV variables
ENV PATH="/mattermost/bin:${PATH}"
ENV MM_VERSION=5.30.1

# Build argument to set Mattermost edition
ARG edition=enterprise
ARG PUID=2000
ARG PGID=2000
ARG MM_BINARY=


# Install some needed packages
RUN apk add --no-cache \
	ca-certificates \
	curl \
	jq \
	libc6-compat \
	libffi-dev \
	libcap \
	linux-headers \
	mailcap \
	netcat-openbsd \
	xmlsec-dev \
	tzdata \
	&& rm -rf /tmp/*

# Get Mattermost
RUN mkdir -p /mattermost/data /mattermost/plugins /mattermost/client/plugins \
    && if [ ! -z "$MM_BINARY" ]; then curl $MM_BINARY | tar -xvz ; \
      elif [ "$edition" = "team" ] ; then curl https://releases.mattermost.com/$MM_VERSION/mattermost-team-$MM_VERSION-linux-amd64.tar.gz?src=docker-app | tar -xvz ; \
      else curl https://releases.mattermost.com/$MM_VERSION/mattermost-$MM_VERSION-linux-amd64.tar.gz?src=docker-app | tar -xvz ; fi \
    && cp /mattermost/config/config.json /config.json.save \
    && rm -rf /mattermost/config/config.json \
    && addgroup -g ${PGID} mattermost \
    && adduser -D -u ${PUID} -G mattermost -h /mattermost mattermost \
    && chown -R mattermost:mattermost /mattermost /config.json.save /mattermost/plugins /mattermost/client/plugins \
    && setcap cap_net_bind_service=+ep /mattermost/bin/mattermost

USER mattermost

# Healthcheck to make sure container is ready
HEALTHCHECK CMD curl --fail http://localhost:8000 || exit 1

# Configure entrypoint and command
COPY entrypoint.sh /
ENTRYPOINT ["/entrypoint.sh"]
WORKDIR /mattermost
CMD ["mattermost"]

# Expose port 8000 of the container
EXPOSE 8000

# Declare volumes for mount point directories
VOLUME ["/mattermost/data", "/mattermost/logs", "/mattermost/config", "/mattermost/plugins", "/mattermost/client/plugins"]

app/entrypoint.sh

#!/bin/sh

# Function to generate a random salt
generate_salt() {
  tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 48 | head -n 1
}

# Read environment variables or set default values
DB_HOST=${DB_HOST:-db}
DB_PORT_NUMBER=${DB_PORT_NUMBER:-5432}
MM_DBNAME=${MM_DBNAME:-mattermost}
MM_CONFIG=${MM_CONFIG:-/mattermost/config/config.json}

if [ "${1:0:1}" = '-' ]; then
    set -- mattermost "$@"
fi

if [ "$1" = 'mattermost' ]; then
  # Check CLI args for a -config option
  for ARG in $@;
  do
      case "$ARG" in
          -config=*)
              MM_CONFIG=${ARG#*=};;
      esac
  done

  if [ ! -f $MM_CONFIG ]
  then
    # If there is no configuration file, create it with some default values
    echo "No configuration file" $MM_CONFIG
    echo "Creating a new one"
    # Copy default configuration file
    cp /config.json.save $MM_CONFIG
    # Substitute some parameters with jq
    jq '.ServiceSettings.ListenAddress = ":8000"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.LogSettings.EnableConsole = true' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.LogSettings.ConsoleLevel = "ERROR"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.FileSettings.Directory = "/mattermost/data/"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.FileSettings.EnablePublicLink = true' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.FileSettings.PublicLinkSalt = "'$(generate_salt)'"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.EmailSettings.SendEmailNotifications = false' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.EmailSettings.FeedbackEmail = ""' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.EmailSettings.SMTPServer = ""' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.EmailSettings.SMTPPort = ""' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.EmailSettings.InviteSalt = "'$(generate_salt)'"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.EmailSettings.PasswordResetSalt = "'$(generate_salt)'"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.RateLimitSettings.Enable = true' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.SqlSettings.DriverName = "postgres"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.SqlSettings.AtRestEncryptKey = "'$(generate_salt)'"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
    jq '.PluginSettings.Directory = "/mattermost/plugins/"' $MM_CONFIG > $MM_CONFIG.tmp && mv $MM_CONFIG.tmp $MM_CONFIG
  else
    echo "Using existing config file" $MM_CONFIG
  fi

  # Configure database access
  if [[ -z "$MM_SQLSETTINGS_DATASOURCE" && ! -z "$MM_USERNAME" && ! -z "$MM_PASSWORD" ]]
  then
    echo -ne "Configure database connection..."
    # URLEncode the password, allowing for special characters
    ENCODED_PASSWORD=$(printf %s $MM_PASSWORD | jq -s -R -r @uri)
    export MM_SQLSETTINGS_DATASOURCE="postgres://$MM_USERNAME:$ENCODED_PASSWORD@$DB_HOST:$DB_PORT_NUMBER/$MM_DBNAME?sslmode=disable&connect_timeout=10"
    echo OK
  else
    echo "Using existing database connection"
  fi

  # Wait another second for the database to be properly started.
  # Necessary to avoid "panic: Failed to open sql connection pq: the database system is starting up"
  sleep 1

  echo "Starting mattermost"
fi

exec "$@"

db/Dockerfile

FROM postgres:13.1-alpine

ENV DEFAULT_TIMEZONE=UTC

# update packages
RUN apk upgrade --no-cache && rm -rf /var/cache/apk/* /tmp/* /var/tmp/*

# Healthcheck to make sure container is ready
HEALTHCHECK CMD pg_isready -U $POSTGRES_USER -d $POSTGRES_DB || exit 1

# Add and configure entrypoint and command
COPY entrypoint.sh /
ENTRYPOINT ["/entrypoint.sh"]
CMD ["postgres"]

VOLUME ["/var/run/postgresql", "/usr/share/postgresql/", "/var/lib/postgresql/data", "/tmp"]

db/entrypoint.sh

#!/bin/bash

function update_conf () {
  # PGDATA is defined in upstream postgres dockerfile
  config_file=$PGDATA/postgresql.conf

  # Check if configuration file exists. If not, it probably means that database is not initialized yet
  if [ ! -f $config_file ]; then
    return
  fi

  # Reinitialize config
  sed -i "s/log_timezone =.*$//g" $config_file
  sed -i "s/timezone =.*$//g" $config_file

  echo "log_timezone = $DEFAULT_TIMEZONE" >> $config_file
  echo "timezone = $DEFAULT_TIMEZONE" >> $config_file
}

if [ "${1:0:1}" = '-' ]; then
  set -- postgres "$@"
fi

if [ "$1" = 'postgres' ]; then
  # Update postgresql configuration
  update_conf

  # Run the postgresql entrypoint; exec replaces this shell so signals reach
  # postgres, and "$@" forwards any extra arguments
  exec docker-entrypoint.sh "$@"
fi

EDIT1: added version numbers


It turns out I was wrong: the issue is that Traefik does not discover the services properly. I’m marking the problem as solved here and moving the discussion to the Traefik forum. You’ll find the respective post at: https://community.traefik.io/t/service-discovery-with-multiple-networks-involved/9088
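
For readers who land here with the same symptom, one plausible explanation (not confirmed in this thread) involves the multiple networks. The Mattermost container is attached to both service-mattermost and service-public, while the Docker provider in conf/traefik.yml names service-socket-proxy as its default network. Mattermost is not on that network, so Traefik may pick the container's address on a network it cannot reach. Traefik's Docker provider supports a per-container traefik.docker.network label for this case. The fragment below is a sketch under that assumption and shows only the changed part of the Mattermost docker-compose.yml:

```yaml
services:
  app:
    labels:
      - "traefik.enable=true"
      # Assumption: tell Traefik to use the container's address on the
      # network it actually shares with Traefik
      - "traefik.docker.network=service-public"
    networks:
      - service-mattermost
      - service-public
```

Because service-public is declared as external, its name is used as-is. A network created by Compose itself would normally carry the project name as a prefix. Alternatively, setting the network option of the Docker provider in conf/traefik.yml to service-public would change the default for all containers.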
