Lots of Warnings since upgrade to v7.7.1

Since the upgrade from Mattermost v7.5.2 Professional to v7.7.1 the logfile is spammed with warnings about websocket.slow. How can this be fixed or at least how can the spam be reduced?

There are up to 60 entries a minute like these:

warn [2023-01-25 08:19:53.021 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.021 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.021 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.044 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.044 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.044 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.176 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.176 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.176 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.260 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.260 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.260 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.371 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.371 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.371 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.453 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.453 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.453 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.454 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.454 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.454 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.479 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.479 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.479 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.639 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.639 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.639 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.788 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=typing
warn [2023-01-25 08:19:53.788 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=typing
warn [2023-01-25 08:19:53.788 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=typing
warn [2023-01-25 08:19:53.800 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.800 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.800 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.975 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed
warn [2023-01-25 08:19:53.975 +01:00] websocket.slow: dropping message caller=“platform/web_conn.go:714” user_id=wdenti53df8wdynobjjaj71fya type=channel_viewed

Hi @TI-Admin ,

there are ongoing discussions about this log message currently. My logs are also full of them and it usually is caused by a few users with slow/laggy internet connection and in my opinion should not be logged as warning on the server end, but at the moment there’s unfortunately nothing you can do about it.

Well, Mattermost logging was and still is a mess.
It is extremely hard to identify real problems and why is the focalboard_errors.log not located in the logs folder?

Hi @TI-Admin - there has been no changes to the code from 7.5.2 to 7.7.1 that can cause this.

As Alexander pointed above, this is usually caused by slow/laggy clients not being able to consume websocket events fast enough. When the server generates events faster than the clients can consume, we log this as warning, because the clients are actually missing websocket events.

From your log, it appears that this is only from a single user_id. Are you seeing multiple user_ids here? If it’s just that user, maybe you can reach out to them and see what’s happening?

I might have some ideas to suppress the log messages from spamming continously. I’ll look into this.

Your feedback about focalboard_errors.log is a packaging issue. @amy.blais - this is probably something that the release team/boards team should look into.

Thanks for jumping in here, Agniva.
I’ve bumped the focalboard_errors.log issue again, I’ve brought it up once already - let’s see if I can get some drive on it now :slight_smile:

1 Like

Thanks for looking into it. We have 5000+ websocket.slow messages in a short period of time from different users. I think there is no need for 60+ messages in second for a single user.

Yep, I have some ideas that can alleviate this.

I am also seeing this log spam. My current log file is 395600 lines, and 344009 of them (86%) are websocket.slow.

Is there any way to filter these out of the logs?

It looks like maybe [MM-51700] - Mattermost is designed to help with this - but did not make it into 7.10. Is that right?

With this change, all the warnings are dropped unless it has been 1 minute since the last one (for any user). While this may make the log files a bit smaller, I am not all that keen on it because

  1. there will still be this warning in the log once a minute, which is quite a lot, and
  2. most of the information is discarded; I just get a random sampling once a minute

Since these warnings would still be busy and now even less actionable, I would prefer that you drop this logging to Debug.

If we do actually want log warnings, maybe they should gather counters on how many times we had to do drops and how many users were affected, and print that once an hour? If this is something that we think admins of actual instances can investigate, then can we expose more useful info via Prometheus metrics possibly?

Opened a PR to revert the logging filtering and drop the logging level to Debug.