[SOLVED] Can't login after upgrading from 4.8 to 4.9

After upgrading from 4.8 to 4.9 in docker (and fixing configuration for the breaking changes e.g. changing ownership on the app volumes now that the app container no longer runs as root), mattermost seems to be up and running happily, but even after restoring a db dump, I cannot log in with pre-existing users through the UI.

In the mattermost app logs:

[2018/04/17 04:12:13 UTC] [EROR] /api/v4/users/login:SqlUserStore.GetForLogin code=400 
rid=5ysecdtgr3fmix4ujpfk41hawr uid= ip= We couldn't find an existing account matching your 
credentials. This team may require an invite from the team owner to join. [details: sql: Scan error on column 
index 22: unsupported Scan, storing driver.Value type <nil> into type *string]

but I am using a user which exists in the users table, e.g. cdm-ium:

mattermost=# select * from users where username = 'cdm-ium';
-[ RECORD 1 ]------+------------------------------------------------------------------------------------------------------------------- 
id                 | ehnbhiy5jjghtbcmyd3fnanptc
createat           | 1499634908645
updateat           | 1502321045886
deleteat           | 0
username           | cdm-ium
password           | <redacted>
authdata           | 
authservice        | 
email              | <redacted>
emailverified      | t
nickname           | 
firstname          | 
lastname           | 
position           | 
roles              | system_admin system_user
allowmarketing     | t
props              | {}
lastpasswordupdate | 1502321045886
lastpictureupdate  | 1499700640363
failedattempts     | 0
locale             | en
timezone           | 
mfaactive          | f
mfasecret          | 

So any tips for looking into this further//figuring out where the linkage is breaking for my login attempts?

I restored my dump (from 4.8 mattermost) with psql mattermost -U mmuser < db_backup.sql into the 4.9 db. Should I have completely dropped the new db from 4.9 and loaded in the 4.8 db from a completely blank state? Any other things people can suggest to try?

Another weird thing is that I only see the above error when attempting to log in. I do not see the above error (or any error or email) when trying to initiate the forgot password flow for the same user.

EDIT: resolved, see last post.

One update:

It seems like there’s a deeper issue. When I try to sign up a new user from scratch I get this error:

This server does not allow open signups. Please speak with your Administrator to receive an invitation.

even thought signups seem to be enabled in the config:

(darkslack) ✔ 10:13 ~/personal/notes/darkslack [master|⚑ 1] $ cat config.json | grep -i email
    "EmailSettings": {
        "EnableSignUpWithEmail": true,

I wonder if there is some way to check that it’s even going to the right db. The config looks right:

(darkslack) ✔ 08:58 ~/personal/notes/darkslack [master|⚑ 1] $ cat config.json | grep -i db
        "DataSource": "postgres://XXX:yyy@db:5432/mattermost?sslmode=disable\u0026connect_timeout=10",

But I don’t see any logs in the db container for the queries that should be happening. This could be a verbosity thing or maybe it’s somehow failing before the DB?

Looks like you DB didn’t migrate properly. You should have a value in the timezone field. Did you see the DB upgrade message in the logs on server start?

For the “This server does not allow open signups.” message that’s the wrong setting. The open server setting is “TeamSettings” -> “EnableOpenServer”

OK I changed the setting you mentioned and now it’s getting a bit interesting. Now trying to sign up with an exist user fails with:

An account with that username already exists.

So clearly we’re talking to the database somehow. But logging in with that user says “We couldn’t find an account matching your login credentials.”

I suspected that maybe there was something up with the passwords so I tried the forgot password flow, but that generates no forgot password email, even though I see a successful post to the endpoint in the logs:

[2018/04/18 19:51:49 UTC] [DEBG] POST - /api/v4/users/password/reset/send

I have the same email config as I did from before the upgrade so I assume the emails should be working and don’t see any errors relating to try to send the email.

Just looking at the code for getting a user by email, it looks like a simple select from the users table by email which I can run manually on the db and I get reasonable results.

I’m going to see if I can enable more verbose logs for the postgres container to see what query is actually being run when the get users command returns no results.

Turning on the verbose sql logging I see:

[19139] [5ad79df9.4ac3] [2018-04-18 20:06:19.611 UTC] [0]: LOG:  execute <unnamed>: SELECT
					($1 AND Username = $2)
					OR ($3 AND Email = $4)
					OR ($5 AND AuthService = 'ldap' AND AuthData = $6)
[19139] [5ad79df9.4ac3] [2018-04-18 20:06:19.611 UTC] [0]: DETAIL:  parameters: $1 = 't', $2 = 'xxx', $3 = 't', $4 = 'xxxx', $5 = 'f', $6 = 'xxx'

And running the command manually returns the user I want, so I really can’t figure out why login is failing here. Is there some additional check (e.g. against the users timezone) which could cause the code to fail in some weird way because that isn’t set?

I think the problem is in fact the missing timezones. I see: https://github.com/mattermost/mattermost-server/pull/8185 is a fairly recent feature and somehow I missed an upgrade to add the timezone. My dump is against a version of mattermost without timezones for users, and the code in the user store assumes timezones now. I’m going to try fixing my db and retry.

Sounds good. Let me know if that fixes it for you.

So it looks like I’m finally up. There are 4 sticking points with this upgrade:

  • the EB config for running mattermost in aws actually just pulls the latest image: https://github.com/mattermost/mattermost-docker/blob/master/contrib/aws/Dockerrun.aws.json so I accidentally upgraded to version 4.9. There was probably always a path forward updating the Dockerrun json to pull a specific tag for the image. Before anything else I had to realize the the version of mattermost had changed out from under me and the main issue was mattermost, not some change in beanstalk.
  • Version 4.9 has same breaking changes in how it builds its docker images, notably the app container now runs as a custom user and not root. This means changing the ownership of the volumes from root to the custom user as explained in the 4.9 release notes. I still want to see if there is a way to do this automatically in the beanstalk config rather than manually doing it each time beanstalk brings up a new instance.
  • The schema had a few changes between 4.8 and 4.9 which were not handled very gracefully. Notably the timezone feature mentioned above and a new column on the teams table (lasticonupdate). The thing that is a little weird here is both columns appear as nullable, but the ORM doesn’t actually handle null values in those columns properly. Manually setting defaults made the database work properly.
  • Finally, I’m not sure if 4.9 is more resource intensive or the default instance for a beanstalk cluster got smaller, but once I started to get an environment up, the app container kept crashing with memory issues. This was fairly easy to fix by a) launching the cluster with a bigger instance and b) increasing the memory limit for the app container in the beanstalk config.

I’ll probably follow up over github to see if I can push to make some of these things a bit more user-friendly. Thanks!