Redirect loop, "sorry, we could not find the page"

Summary

Visiting the main site URL results in a redirect loop with URLs of the following format: https://chat.slhck.info/error?message=Sorry%2C+we+could+not+find+the+page.&s=…

Steps to reproduce

I am using the Docker version a272b0d of github.com/mattermost/mattermost-docker.git and the following modified docker-compose.yml:

version: "2"

services:

  db:
    build: db
    read_only: true
    restart: unless-stopped
    volumes:
      - ./volumes/db/var/lib/postgresql/data:/var/lib/postgresql/data
      - /etc/localtime:/etc/localtime:ro
    environment:
      - POSTGRES_USER=mmuser
      - POSTGRES_PASSWORD=mmuser_password
      - POSTGRES_DB=mattermost
    # uncomment the following to enable backup
    #  - AWS_ACCESS_KEY_ID=XXXX
    #  - AWS_SECRET_ACCESS_KEY=XXXX
    #  - WALE_S3_PREFIX=s3://BUCKET_NAME/PATH
    #  - AWS_REGION=us-east-1

  app:
    build:
      context: app
      # comment out following lines for team edition or change UID/GID
      # args:
      #   - edition=team
      #   - PUID=1000
      #   - PGID=1000
    restart: unless-stopped
    volumes:
      - ./volumes/app/mattermost/config:/mattermost/config:rw
      - ./volumes/app/mattermost/data:/mattermost/data:rw
      - ./volumes/app/mattermost/logs:/mattermost/logs:rw
      - /etc/localtime:/etc/localtime:ro
    environment:
      # set same as db credentials and dbname
      - MM_USERNAME=mmuser
      - MM_PASSWORD=mmuser_password
      - MM_DBNAME=mattermost
      # in case your config is not in default location
      #- MM_CONFIG=/mattermost/config/config.json

  web:
    build: web
    ports:
      - "127.0.0.1:8080:80"
      # - "443:443"
    read_only: true
    restart: unless-stopped
    volumes:
      # This directory must have cert files if you want to enable SSL
      - ./volumes/web/cert:/cert:ro
      - /etc/localtime:/etc/localtime:ro
    # Uncomment for SSL
    # environment:
    #  - MATTERMOST_ENABLE_SSL=true

I am publishing the web container on local port 8080, and I proxy to it from the outside with Nginx using the following config:

upstream backend {
   server 127.0.0.1:8080;
}

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=mattermost_cache:10m max_size=3g inactive=120m use_temp_path=off;

server {
   listen 80 default_server;
   server_name   chat.slhck.info ;
   return 301 https://$server_name$request_uri;
}

server {
   listen 443 ssl http2;
   server_name    chat.slhck.info;

    ssl on;
    ssl_certificate /etc/letsencrypt/live/chat.slhck.info/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/chat.slhck.info/privkey.pem;
    ssl_session_timeout 5m;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers 'EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH';
    ssl_prefer_server_ciphers on;
    ssl_session_cache shared:SSL:10m;

   location ~ /api/v[0-9]+/(users/)?websocket$ {
       proxy_set_header Upgrade $http_upgrade;
       proxy_set_header Connection "upgrade";
       client_max_body_size 50M;
       proxy_set_header Host $http_host;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       proxy_set_header X-Forwarded-Proto $scheme;
       proxy_set_header X-Frame-Options SAMEORIGIN;
       proxy_buffers 256 16k;
       proxy_buffer_size 16k;
       proxy_read_timeout 600s;
       proxy_pass http://backend;
   }

  location / {
        proxy_http_version 1.1;

       client_max_body_size 50M;
       proxy_set_header Connection "";
       proxy_set_header Host $http_host;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       proxy_set_header X-Forwarded-Proto $scheme;
       proxy_set_header X-Frame-Options SAMEORIGIN;
       proxy_buffers 256 16k;
       proxy_buffer_size 16k;
       proxy_read_timeout 600s;
       proxy_cache mattermost_cache;
       proxy_cache_revalidate on;
       proxy_cache_min_uses 2;
       proxy_cache_use_stale timeout;
       proxy_cache_lock on;
       proxy_pass http://backend;
   }
}

I’ve set up a LetsEncrypt certificate for the HTTPS version, which used to work fine. But apparently I did something that made it get stuck in this redirect loop.

I haven’t set a Site URL, but even when I set it to https://chat.slhck.info/, the error is the same. My config is here: https://pastebin.com/raw/QFDNghJn

The error is the same no matter if I am using the main browser (Chrome) in which I was supposed to be logged in, or another browser with a fresh profile (i.e. cookies deleted etc.).

The Docker logs start as follows:

app_1  | Using existing config file /mattermost/config/config.json
app_1  | Configure database connection...OK
app_1  | Wait until database db:5432 is ready...
db_1   | AWS_ACCESS_KEY_ID is required for Wal-E but not set. Skipping Wal-E setup.
db_1   | AWS_SECRET_ACCESS_KEY is required for Wal-E but not set. Skipping Wal-E setup.
db_1   | WALE_S3_PREFIX is required for Wal-E but not set. Skipping Wal-E setup.
web_1  | linking plain config
db_1   | AWS_REGION is required for Wal-E but not set. Skipping Wal-E setup.
web_1  | ln: /etc/nginx/conf.d/mattermost.conf: File exists
db_1   | LOG:  database system was shut down at 2018-05-29 12:06:35 UTC
db_1   | LOG:  MultiXact member wraparound protections are now enabled
db_1   | LOG:  database system is ready to accept connections
db_1   | LOG:  autovacuum launcher started
db_1   | LOG:  incomplete startup packet
app_1  | Starting platform
db_1   | ERROR:  relation "idx_teams_description" does not exist
db_1   | STATEMENT:  SELECT $1::regclass
web_1  | 127.0.0.1 - - [29/May/2018:14:10:10 +0200] "GET / HTTP/1.1" 404 659 "-" "curl/7.60.0" "-"
web_1  | 127.0.0.1 - - [29/May/2018:14:10:40 +0200] "GET / HTTP/1.1" 404 659 "-" "curl/7.60.0" "-"
web_1  | 172.19.0.1 - - [29/May/2018:14:10:57 +0200] "GET / HTTP/1.1" 404 332 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36" "95.210.191.88"
web_1  | 172.19.0.1 - - [29/May/2018:14:10:58 +0200] "GET /error?message=Sorry%2C+we+could+not+find+the+page.&s=MEUCIQC7p1Ewtoyx_FgboLTZYhhz1uZ_U2mOG6rzmk_cnbWAOAIgMRZEpxu_NVRV-jDItHq4vwiIJxGn8wLoVnE0bX8zugo= HTTP/1.1" 404 331 "https://chat.slhck.info/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36" "95.210.191.88"
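
To make the pattern in these logs easier to see, here is a minimal Python sketch that pulls the request path, status code, and decoded error message out of the web_1 lines. The regular expression and the helper names (parse_access_line, status_counts) are my own assumptions for illustration and are not part of Mattermost or Nginx; the sample lines are taken from the log above.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Matches the request, status and size fields of an Nginx access-log line.
# Assumption: the web container logs in the usual "combined"-style layout.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3}) (?P<size>\d+)'
)


def parse_access_line(line):
    """Return (path, status, error_message) for an access-log line, else None."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None  # not an access-log line (for example db_1 or app_1 output)
    target = urlsplit(match.group("target"))
    # The error page carries its text in the "message" query parameter.
    message = parse_qs(target.query).get("message", [None])[0]
    return target.path, int(match.group("status")), message


def status_counts(lines):
    """Count HTTP status codes over all access-log lines in `lines`."""
    parsed = (parse_access_line(line) for line in lines)
    return Counter(entry[1] for entry in parsed if entry is not None)


SAMPLE = [
    'db_1   | ERROR:  relation "idx_teams_description" does not exist',
    'web_1  | 127.0.0.1 - - [29/May/2018:14:10:10 +0200] "GET / HTTP/1.1" 404 659 "-" "curl/7.60.0" "-"',
]
print(status_counts(SAMPLE))
```

On the lines above, every request, including the plain GET /, comes back as 404 from the web container, so the error page itself is not found either.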

Expected behavior

I expected to land on the home page or the default group’s default chat (if I remember correctly).

Observed behavior

Whenever I visit the main site URL, I get redirected in a loop to an error page that never loads, and continues to forward to another error page.

@pichouk This is docker related, so I just wanted to check in first whether this issue is something you are familiar with?

I’m not sure what your trouble is, but I am surprised that you use the web docker image. Can you try removing the web container and reaching your application container directly from your Nginx? It is possible that the double reverse proxy is causing problems ^^

How would I directly reach the app container from Nginx?

I have used this kind of double-proxying from Nginx to a Docker web server in quite a few projects. The funny thing is that it worked fine up until some point where I don’t know what changed…

Could it be some form of caching?

You can directly expose port 8000 of your application container to your host port 8000 (like you did with your web container).

Another thing, to be sure that it is not an issue in the Docker image, you can try to access your Mattermost server directly through the web container (and not your Nginx proxy). If it works, then the issue is in the Nginx configuration and/or due to the double proxy.
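
The layer-by-layer check described above can be sketched in Python as follows. This is only an illustration under some assumptions: the app container's port 8000 is published on the host, the web container is on 127.0.0.1:8080 as in the compose file, and a healthy server answers GET / with 200. The helper names (fetch_status, probe_layers, first_failing_layer) are hypothetical.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send one GET request and return the status code, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


def probe_layers(layers, fetch=fetch_status):
    """Probe each layer in order; map its name to a status code or an error string."""
    results = {}
    for name, url in layers.items():
        try:
            results[name] = fetch(url)
        except (OSError, http.client.HTTPException) as exc:
            results[name] = f"unreachable: {exc}"
    return results


def first_failing_layer(results):
    """Return the first layer (innermost first) that did not answer 200, or None."""
    for name, outcome in results.items():
        if outcome != 200:
            return name
    return None


# Innermost layer first. The app URL assumes port 8000 is published on the host.
LAYERS = {
    "app container": "http://127.0.0.1:8000/",
    "web container": "http://127.0.0.1:8080/",
    "host Nginx": "https://chat.slhck.info/",
}

# Usage on the Docker host:
#   results = probe_layers(LAYERS)
#   print(results, first_failing_layer(results))
```

If the app container is already the first failing layer, the proxies in front of it are unlikely to be the cause.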

Thanks for your input. I’ve tried both — exposing the app container directly, and also calling it directly, without Nginx.

The error is the same, so it’s not a double-proxy issue.

Hmm OK, so if you expose the application container port and curl it directly, the issue is still there? In that case I would suspect an issue inside Mattermost itself, but I don’t understand why.
@amy.blais Is there someone on the Mattermost staff, familiar with the web server part, who might have an idea about this issue? Even a clue about the error in the log would help to fix the trouble.

Hi @slhck! I’ll ask one of our engineers to take a look at this.

First, can you help confirm a few more details:

  1. What is your Mattermost server version?
  2. Can you take a look at our Important Upgrade notes to see if any apply to you: https://docs.mattermost.com/administration/important-upgrade-notes.html.

Thanks Amy for your response.

The server version is indicated by the Docker Git repo version a272b0d, so that should be 4.10.0.
I have not performed any update or upgrade, and none of the upgrade instructions apply to my case.

I wonder if this is related to caching and whether I just need to clean up something, and it’ll work again? Like I said, I do not remember changing anything in between, when suddenly this error showed up.

So this error:

ERROR:  relation "idx_teams_description" does not exist

is saying that one of the indexes doesn’t exist. That particular index is supposed to be removed per https://github.com/mattermost/mattermost-server/blame/release-4.10/store/sqlstore/team_store.go#L48. That’s not a new change, however (it was made more than a year ago), so I’m not sure what’s causing the problem now, especially since we would be removing the index if it did exist.
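
For anyone who wants to confirm whether the index is present, here is a minimal Python sketch that looks the name up in the PostgreSQL pg_indexes view instead of casting it to regclass. It assumes a DB-API cursor that uses the %s placeholder style (as psycopg2 does); the helper name index_exists is hypothetical, and the connection details in the comment are taken from the compose file above and assume the database port is reachable from where the script runs.

```python
def index_exists(cursor, index_name):
    """Return True if the pg_indexes view lists an index with this name."""
    # pg_indexes is a standard PostgreSQL system view with an "indexname" column.
    cursor.execute(
        "SELECT 1 FROM pg_indexes WHERE indexname = %s",
        (index_name,),
    )
    return cursor.fetchone() is not None


# Usage sketch (assumes psycopg2 is installed and the db port is reachable):
#   import psycopg2
#   conn = psycopg2.connect(dbname="mattermost", user="mmuser",
#                           password="mmuser_password", host="db")
#   with conn.cursor() as cur:
#       print(index_exists(cur, "idx_teams_description"))
```

Unlike the regclass cast, this query returns an empty result for a missing index rather than raising an error in the database log.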

Hmm, yeah, I only installed this a few days ago. Perhaps I will just delete the database and try again; there’s no critical data on this installation yet.

For what it’s worth, I deleted the entire database and started from scratch. Now it’s working again.