Dial tcp: lookup push-test.mattermost.com: i/o timeout

Summary
No notifications are working, and getting the following error in notifications.log: dial tcp: lookup push-test.mattermost.com: i/o timeout

Steps to reproduce
Using Team Edition 7.8 on AlmaLinux 8.7 with Docker and Nginx over SSL.

Expected behavior
Push notifications have not worked since I deployed the Docker container. TPNS is enabled in console with proper push notification settings set correctly.

Observed behavior
Below is a copy/paste of one of the error messages in the notifications.log. Unfortunately, no other errors can be found in any other logs. I’ve scoured these forums, Google and Github, and even tried GPT and Sydney, but none of the suggestions have worked.

{“timestamp”:“2023-02-26 22:16:41.603 -05:00”,“level”:“error”,“msg”:“Notification error”,“caller”:“app/notification_push.go:110”,“logSource”:“notifications”,“ackId”:“o9ygsjw5u7bruqwfwh6n4iuxwo”,“type”:“message”,“userId”:“68p3x4pnfpfatys8netawodgeo”,“postId”:“ur4878upkfd4ur9m8qjtx438we”,“channelId”:“ix7oz4i9dirtfbauoukpbrjfky”,“deviceId”:“73757f67955fcb2fddb4f979d745a28e45b248606c315512c751e6b08932dafe”,“status”:“Post "https://push-test.mattermost.com/api/v1/send_push\”: dial tcp: lookup push-test.mattermost.comR: i/o timeout"}

Please help! :slight_smile:

Hi @mikebowden ,

this would indicate that your Mattermost server is unable to connect ot the push proxy on a network level. Do you have outbound ACLs active which would disallow this connection?

Running the following command on your Mattermot server should work and return a simple HTML telling you that this is the pushproxy service:

# curl https://push-test.mattermost.com/
<html><body>Mattermost Push Proxy</body></html>

BTW: This is only the test server, it should not be used for production services. The production service is available at push.mattermost.com or hpns-de.mattermost.com if you want to use one hosted in Germany.

Sorry, I forgot to mention that curling the TPNS server works just fine. I get the text back just as you pasted it.

That’s interesting… So when you’re in the docker container (docker exec ...) you can curl the URL without problems, but as soon as the Mattermost application tries to send a notification, it times out? Does this happen for all three (test, prod-us, prod-de) push service servers?

I haven’t tried anything other than the TPNS. I need a subscription to use the others, don’t I?

Here is the output of the command, works fine.

[root@adsl-70-237-101-177 ~]# docker exec 3f17543172e1 curl https://push-test.mattermost.com/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    47  100    47    0     0     11      0  0:00:04  0:00:04 --:--:--    11
<html><body>Mattermost Push Proxy</body></html>[root@adsl-70-237-101-177 ~]# 

@agriesser I swapped the server out for push.matter and I get the same error.

{"timestamp":"2023-02-27 10:16:29.317 -05:00","level":"error","msg":"Notification error","caller":"app/notification_push.go:110","logSource":"notifications","ackId":"rngajksznp8ump7q3ha4yr5sbe","type":"message","userId":"68p3x4pnfpfatys8netawodgeo","postId":"7nn5g4rjp3fj5eumwif3kwkiyw","channelId":"ix7oz4i9dirtfbauoukpbrjfky","deviceId":"73757f67955fcb2fddb4f979d745a28e45b248606c315512c751e6b08932dafe","status":"Post \"https://push.mattermost.com/api/v1/send_push\": dial tcp: lookup push.mattermost.com: i/o timeout"}

This appears to be related.

error [2023-02-27 19:03:58.670 -05:00] Failed to install marketplace plugin. caller="web/context.go:117" path=/api/v4/plugins/marketplace request_id=ppznh9ugi7rp9j6rja1z4ht84r ip_addr=172.20.0.4 user_id=68p3x4pnfpfatys8netawodgeo method=POST err_where=InstallMarketplacePlugin http_code=500 error="InstallMarketplacePlugin: Failed to install marketplace plugin., download failed after multiple retries.: failed to fetch from https://plugins-store.test.mattermost.com/release/mattermost-plugin-playbooks-v1.28.2-linux-amd64.tar.gz: Get "https://plugins-store.test.mattermost.com/release/mattermost-plugin-playbooks-v1.28.2-linux-amd64.tar.gz": dial tcp: lookup plugins-store.test.mattermost.com: i/o timeout"

Can’t install boards or playbooks. I get the same error on both.

There definitely is a problem with the outbound connection of your Mattermost application container, or the internet is so slow on your end, that it times out.

How long does downloading this file from within your container take?

docker exec curl --output /tmp/temp.tar.gz https://plugins-store.test.mattermost.com/release/mattermost-plugin-playbooks-v1.28.2-linux-amd64.tar.gz

Yeah, not an Internet issue. Dedicated 1GB fiber line.

That download in 4s.

[root@mm ~]# docker exec 86aa48dc9650 curl --output /tmp/temp.tar.gz https://plugins-store.test.mattermost.com/release/mattermost-plugin-playbooks-v1.28.2-linux-amd64.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 18.7M  100 18.7M    0     0  3857k      0  0:00:04  0:00:04 --:--:-- 4662k

I was able to get the playbook and boards working again. Not sure what happened, but I renamed the folders in data, rebooted, and they re-downloaded and started working again.

Push notifications started working earlier today for about 10 mins, then stopped and haven’t worked again since. I’m unsure what happened and why they kicked in and stopped.

I’m a bit puzzled by the messages I can see in your logs, honestly.
The message clearly states that there’s an i/o timeout trying to connect to the external resources, but whenever you run the commands manually, they work.

The internal http method of the Mattermost server seems to have some fixed timeouts set for DNS resolution and content reception after the connection has been established, maybe your DNS resolution is slow inside the container or something like this and therefore you exceed the configured timeouts?

You pegged it.

I realized we had an old search domain configured in resolv.conf that is no longer in use. It had to timeout, then process the next in line. Which, of course, caused too much of a delay in DNS resolution, and it timed out.

Thanks for the help in tracking this down!

I guess I should have taken the error message at face value, it did tell me what was wrong. haha

Woohoo :slight_smile: This was a tough one, thanks for confirming that it works now :slight_smile: