Performance issues around email sends

Summary

Posting a message that triggers email notifications causes a noticeable delay before the post is confirmed. I suspect these delays are even leading to missing emails.

Steps to reproduce

Mattermost 3.5 with email sending configured against an SMTP server, no authentication. Approximately 250 users in the team.

Compare:

  • post message to an online user
  • post message to an offline user
  • post @all message in Town Square

Expected behavior

I would expect the interface to respond immediately in all three cases, showing the message as posted. I would also expect all the emails to be sent, of course.

Observed behavior

Posts to online users appear immediately.
Posts to offline users show a noticeable delay, during which the text is grey and there is a wait icon displayed.
Posts that will result in a lot of emails, like @all in Town Square, can take tens of seconds to appear confirmed.

This effect has become much more noticeable since we moved our Mattermost server to a different host, which I suspect is because communication between the new host and the SMTP server is slower.

I suspect (unconfirmed) that we are also losing emails when the posting user does not wait for the post to complete. If they instead go to another channel, close their session, etc., then I think the emails are not sent. We have had various reports of users failing to get email notifications in situations where they would expect them, and the evidence so far is consistent with this theory, although it’s difficult to know. There is one known example of an @all post being “abandoned” in this way while still showing the wait icon, and it seems that some users were notified by email while others were not. There are no errors in the log related to email sends.

I am not sure if this only affects users with their email send set to Immediate, but almost all of our users do have that set.

If I’m right about the underlying cause of the slow post and missing emails, I’d suggest that the email sending needs to be pushed to a background thread to resolve these issues.
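To illustrate what I mean by a background thread, here is a minimal Python sketch. Mattermost itself is written in Go, so this only shows the idea and is not its code. `BackgroundMailer` and `send_func` are hypothetical names; `send_func` stands in for whatever actually talks to the SMTP server.

```python
import queue
import threading


class BackgroundMailer:
    """Hypothetical helper: queues notifications and sends them off the posting thread."""

    def __init__(self, send_func):
        # send_func(recipient, body) is an assumed callable that does the real SMTP send.
        self._send = send_func
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, recipient, body):
        # Returns immediately, so the post can be confirmed without waiting on SMTP.
        self._queue.put((recipient, body))

    def _run(self):
        # Single worker loop: takes one queued email at a time and sends it.
        while True:
            recipient, body = self._queue.get()
            try:
                self._send(recipient, body)
            except Exception as exc:
                # A failed send is reported but does not stop the worker.
                print(f"send to {recipient} failed: {exc}")
            finally:
                self._queue.task_done()

    def wait_until_drained(self):
        # Blocks until every queued email has been processed (useful on shutdown).
        self._queue.join()
```

With something like this, the emails would still go out even if the posting user switched channels or closed their session, because the send no longer depends on the request that created the post.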

Any help in further diagnosis would be great - as would any views on whether I am on the right track here!

Hi @gubbins,

Are you running 3.5.0? If so please try upgrading to 3.5.1, which can be downloaded here. That release includes a fix for performance issues related to sending email notifications. You can see the changelog for 3.5.1 here.

Oh yes, we are - that’s embarrassing, I didn’t even check for a new version! I guess I have been relying on email alerts to tell me there is one out.

I will upgrade and see how we go. Thanks for the reply

Upgraded to 3.5.1 and posted an @all message to announce it. On the plus side, the message post completed quickly.

Unfortunately the log filled up with 183 of these errors (we have 260 active users):

[2016/12/22 09:17:24 AEDT] [EROR] Failed to send mention email successfully email=redacted@domain err=SendMail: utils.mail.connect_smtp.open_tls.app_error, 421 4.3.2 The maximum number of concurrent connections has exceeded a limit, closing transmission channel

Our mail relay servers only allow 20 incoming connections from one domain (the default for MS Exchange), and it looks like Mattermost 3.5.1 tries to open a lot more than that when there are a lot of notifications to send. Does that sound plausible?

I am now wondering if I can run some kind of simple MTA in between to buffer them up.

I set up slimta as a mail relay. I confirmed that Mattermost opens a concurrent connection for each message (over 200 connections for “all” notifications). Configuring slimta to queue them up and relay them over a maximum of 5 connections to the actual SMTP server seems to work very well.
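The general technique, capping how many sends run at once, can be sketched roughly as follows. This is not slimta's configuration or API; `relay_all`, `send_func` and `MAX_CONNECTIONS` are hypothetical names, and `send_func` is assumed to open one SMTP connection per call.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed cap, matching the 5 connections used in the relay setup described above.
MAX_CONNECTIONS = 5


def relay_all(messages, send_func, max_connections=MAX_CONNECTIONS):
    """Send every message, with at most max_connections sends running in parallel."""
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # pool.map queues all messages but only max_connections workers run at once;
        # list() waits for completion and re-raises the first failure, if any.
        return list(pool.map(send_func, messages))
```

The remaining messages simply wait in the pool's internal queue, so 200 notifications never translate into 200 simultaneous connections.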

An added bonus is that slimta will retry failed emails (occasionally there’s a timeout or “Insufficient system resources” error on the SMTP server).
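For anyone curious what retrying looks like in principle, here is a small sketch of retry with exponential backoff. It is not how slimta implements it; `send_with_retry`, `TransientMailError` and the delay values are assumptions made up for illustration.

```python
import time


class TransientMailError(Exception):
    """Hypothetical error type standing in for a temporary SMTP failure."""


def send_with_retry(send_func, message, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send_func(message), retrying temporary failures with a growing delay."""
    for attempt in range(attempts):
        try:
            return send_func(message)
        except TransientMailError:
            if attempt == attempts - 1:
                # Out of attempts: let the caller see the failure.
                raise
            # Wait base_delay, then 2x, then 4x, ... before trying again.
            sleep(base_delay * 2 ** attempt)
```

A real relay would normally also persist the queue to disk so that retries survive a restart, which this sketch does not do.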

I hope this is a useful tip for anyone else who gets the “maximum number of concurrent connections” error.

@jwilander I guess it would be nice if Mattermost supported a max SMTP connection count and retries itself?

Apologies for the late reply.

Hmmm, that does sound like an issue. I’ve opened a ticket here for us to take a look at it. We’ll be discussing it during our next developer meeting after the holidays.

If you’re interested in having input, feel free to join our daily-built Mattermost instance and the Developers channel.

OK thanks for the update.

The slimta solution was so easy that I wonder if it’s worth suggesting in your install guides anyway. Rather than re-implementing a subset of the functionality (retries, max connection count), you can just let slimta handle it. It also brings lots of other functionality, like configurable email logging, filters, non-SMTP endpoints, …