Many errors in logs: "websocket.NextReader: closing websocket"

We are seeing many entries like this in our logs:
{"timestamp":"2022-10-26 14:04:59.515 +05:00","level":"error","msg":"websocket.NextReader: closing websocket","caller":"app/web_conn.go:830","user_id":"grfze6dkeb8bige7737ibojecr","error":"read tcp 127.0.0.1:8065->127.0.0.2:33418: i/o timeout"}
Otherwise, Mattermost works normally. Does anyone know what errors or bugs can trigger such entries in the system log?

Hi. I reported this problem to support a long time ago. The feedback was unsatisfactory: it is not really a bug as long as everything is working fine. In my opinion it should then be flagged as INFO and not as ERROR.

Hey all, from 7.4 onwards the log level has been changed back to DEBUG: Revert "Use error log level for websocket timeouts (#19609)" by agnivade · Pull Request #20929 · mattermost/mattermost-server · GitHub

Please ignore the error for the time being. This can normally happen in a healthy system.

Awesome, that’s great to hear, thanks @agnivade for this info :slight_smile:

Hi!
After updating to 7.4 that error disappeared, but a new one has appeared, and there are a lot of them:

{"timestamp":"2022-12-21 09:06:46.778 +05:00","level":"warn","msg":"websocket.slow: dropping message","caller":"app/web_conn.go:714","user_id":"8pbho6btcf8w5ydy56tq1yhuoy","type":"channel_viewed"}
{"timestamp":"2022-12-21 09:06:46.931 +05:00","level":"warn","msg":"websocket.slow: dropping message","caller":"app/web_conn.go:714","user_id":"8pbho6btcf8w5ydy56tq1yhuoy","type":"typing"}

This can happen when your server is generating a high amount of traffic. If the websocket queue for a connection fills up, the server starts dropping low-priority messages. This behavior has been there since the beginning and hasn't changed.
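To illustrate the pattern, here is a minimal sketch with made-up names (not the actual app/web_conn.go code): each connection has a bounded send queue, and when a slow client lets it fill up, the event is dropped rather than blocking the rest of the server.

```go
package main

import "fmt"

// webConn is a stand-in for a per-connection websocket writer with a
// bounded send queue. All names here are illustrative only.
type webConn struct {
	send chan string
}

// enqueue tries to queue an event without blocking. If the client reads
// too slowly and the queue is full, the event is dropped and a warning
// is logged, which is what produces "websocket.slow: dropping message".
func (wc *webConn) enqueue(eventType string) {
	select {
	case wc.send <- eventType:
		// queued; a writer goroutine would deliver it to the client
	default:
		fmt.Printf("websocket.slow: dropping message type=%s\n", eventType)
	}
}

func main() {
	// Tiny buffer so the drop path is easy to trigger in this demo.
	wc := &webConn{send: make(chan string, 2)}
	for _, ev := range []string{"posted", "typing", "channel_viewed"} {
		wc.enqueue(ev)
	}
}
```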

If you don’t want typing and channel-viewed events at all, you can disable ServiceSettings.EnableUserTypingMessages and ServiceSettings.EnableChannelViewedMessages.
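If you edit config.json directly (you can also change these from the System Console), that would look roughly like this:

```json
{
  "ServiceSettings": {
    "EnableUserTypingMessages": false,
    "EnableChannelViewedMessages": false
  }
}
```

Depending on your setup you may need to restart the server for the change to take effect.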

Thank you. Is it possible to somehow increase the throughput of web sockets?

The throughput is dictated by server resources and client consumption speed. If the client is slow in consuming the events, the server cannot do much.

One idea could be to increase the number of nodes in your cluster. This will shard the traffic across nodes and give your servers more breathing room. Alternatively, you can scale up your existing server vertically.

But even then, a slow client will clog the event queue for its own connection, and the server will drop messages there, but only for that client and not for others.