How can I cancel/clean up import jobs? Is it normal that the Mattermost server restarts after starting a bulk import?

While working on Import from JSONL ignores attachments, despite no error messages and "success" at the end I noticed two things:

  • The Mattermost server seems to restart (or crash and get restarted by systemd) after a successful mmctl import process … where the upload contains attachments. The mmctl import process … command actually returns “Status: pending”, but for about a dozen seconds, all mmctl import job show … calls return “Connection refused” on port 8065 or (IIRC) “502 Gateway Timeout” via Nginx.

    In /var/log/mattermost/mattermost.log I also see the typical startup messages of the Mattermost server, but before that there is nothing that looks like a shutdown. So I suspect it’s a crash.

    Here’s an excerpt from /var/log/mattermost/mattermost.log from when it restarted after it had started processing a bulk import (the first line, from 09:36, is the last line from the previous restart):

    {"timestamp":"2022-10-14 09:36:51.817 Z","level":"info","msg":"Post.Message has size restrictions","caller":"sqlstore/post_store.go:2442","max_characters":16383,"max_bytes":65535}
    {"timestamp":"2022-10-14 10:05:24.442 Z","level":"info","msg":"Server is initializing...","caller":"app/server.go:269","go_version":"go1.18.1"}
    {"timestamp":"2022-10-14 10:05:24.442 Z","level":"info","msg":"Pinging SQL","caller":"sqlstore/store.go:228","database":"master"}
    {"timestamp":"2022-10-14 10:05:24.463 Z","level":"error","msg":"License key from https://mattermost.com required to unlock enterprise features.","caller":"app/license.go:152","error":"resource: License id: "}
    {"timestamp":"2022-10-14 10:05:24.463 Z","level":"info","msg":"Starting websocket hubs","caller":"app/web_hub.go:93","number_of_hubs":4}
    {"timestamp":"2022-10-14 10:05:24.464 Z","level":"info","msg":"Loaded system translations","caller":"i18n/i18n.go:93","for locale":"en","from locale":"/opt/mattermost/i18n/en.json"}
    {"timestamp":"2022-10-14 10:05:24.491 Z","level":"info","msg":"Current version is 7.3.0 (7.3.0/Wed Sep 14 12:12:23 UTC 2022/5d1fe271e68cc3baecaa115732c7fb7bc5073ec7/3579daeda6a3b53c41eadda7391e14ffef009f6b)","caller":"app/server.go:580","current_version":"7.3.0","build_number":"7.3.0","build_date":"Wed Sep 14 12:12:23 UTC 2022","build_hash":"5d1fe271e68cc3baecaa115732c7fb7bc5073ec7","build_hash_enterprise":"3579daeda6a3b53c41eadda7391e14ffef009f6b"}
    {"timestamp":"2022-10-14 10:05:24.492 Z","level":"info","msg":"Enterprise Build","caller":"app/server.go:589","enterprise_build":true}
    {"timestamp":"2022-10-14 10:05:24.492 Z","level":"info","msg":"Printing current working","caller":"app/server.go:595","directory":"/opt/mattermost"}
    {"timestamp":"2022-10-14 10:05:24.492 Z","level":"info","msg":"Loaded config","caller":"app/server.go:596","source":"postgres://mmuser:@localhost:5432/mattermost?sslmode=disable&connect_timeout=10"}
    {"timestamp":"2022-10-14 10:05:24.506 Z","level":"info","msg":"Starting workers","caller":"jobs/workers.go:48"}
    {"timestamp":"2022-10-14 10:05:24.507 Z","level":"info","msg":"Starting schedulers.","caller":"jobs/schedulers.go:47"}
    {"timestamp":"2022-10-14 10:05:24.535 Z","level":"error","msg":"License key from https://mattermost.com required to unlock enterprise features.","caller":"app/license.go:152","error":"resource: License id: "}
    {"timestamp":"2022-10-14 10:05:24.569 Z","level":"info","msg":"Starting up plugins","caller":"app/plugin.go:212"}
    {"timestamp":"2022-10-14 10:05:24.569 Z","level":"info","msg":"Syncing plugins from the file store","caller":"app/plugin.go:286"}
    {"timestamp":"2022-10-14 10:05:26.858 Z","level":"info","msg":"activated","caller":"app/plugin_api.go:971","plugin_id":"com.mattermost.apps"}
    {"timestamp":"2022-10-14 10:05:26.928 Z","level":"info","msg":"got public IP address","caller":"app/plugin_api.go:971","plugin_id":"com.mattermost.calls","origin":"main.(*logger).Info log.go:84","addr":"129.132.126.4"}
    {"timestamp":"2022-10-14 10:05:26.930 Z","level":"info","msg":"rtc: server is listening on udp 8443","caller":"app/plugin_api.go:971","plugin_id":"com.mattermost.calls","origin":"main.(*logger).Info log.go:84"}
    {"timestamp":"2022-10-14 10:05:26.933 Z","level":"info","msg":"rtc: server is listening on udp 8443","caller":"app/plugin_api.go:971","plugin_id":"com.mattermost.calls","origin":"main.(*logger).Info log.go:84"}
    {"timestamp":"2022-10-14 10:05:27.008 Z","level":"info","msg":"Starting Server...","caller":"app/server.go:1128"}
    {"timestamp":"2022-10-14 10:05:27.009 Z","level":"info","msg":"Server is listening on [::]:8065","caller":"app/server.go:1201","address":"[::]:8065"}
    {"timestamp":"2022-10-14 10:05:27.009 Z","level":"info","msg":"Sending systemd READY notification.","caller":"commands/server.go:131"}
    {"timestamp":"2022-10-14 10:05:31.878 Z","level":"info","msg":"Post.Message has size restrictions","caller":"sqlstore/post_store.go:2442","max_characters":16383,"max_bytes":65535}
    
  • I just noticed I have about 20 import jobs running with “Status: in_progress”, mostly because my workflow initially exited on “Connection refused”. The oldest of these “in progress” import jobs date from 8 days (!) ago. It also seems as if those are all the import jobs where the Mattermost server restarted after it started processing the bulk import.

    Sample mmctl import job show … output:

    ID: d7nzbmw4c3ratg1nxo77gicjmc
    Status: in_progress
    Created: 2022-10-06 13:14:00 +0000 UTC
    Started: 2022-10-06 13:14:02 +0000 UTC
    Data: map[import_file:kp7bq63pgfds3y71tdb1mbohyr_telegram2mm_y_l_cFCLf.zip]
    

    (Note that this command was run on 2022-10-14, i.e. the day of this posting.)
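
    For reference, this is roughly the mmctl sequence behind the jobs above (file and job names are placeholders, the real runs used the actual upload names and job IDs, and the exact flags depend on your setup):

    # Upload the bulk import ZIP, then start and inspect the import job.
    mmctl import upload ./telegram2mm-export.zip   # placeholder file name
    mmctl import list available                    # shows the name of the uploaded file
    mmctl import process <uploaded-file-name>      # starts the job, returns "Status: pending"
    mmctl import job list                          # lists import jobs and their statuses
    mmctl import job show <job-id>                 # per-job details as quoted above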

Basically, my main question about both of these points is: Is this normal? :wink:

Or to be more precise:

  • Is a server restart to be expected when starting to process a bulk import?

  • Is it normal that importing a chat history with nearly 100’000 messages including attachments (up to 710 MB) takes over a week? (Without attachments and without the restart/crash it IIRC took just around 10 minutes.)

    And is it possible that a huge import blocks later, smaller imports? Some of these imports are just debug imports with a single line of chat, and they have also been “in progress” for a week or so.

And as a follow-up question (assuming that this isn’t normal :slight_smile:):

  • How can I cancel or clean up bulk import jobs, especially those still “in progress”? I haven’t found anything pointing in that direction in the mmctl documentation.

This happened with Mattermost v7.3.0 installed via the omnibus .deb on Debian 11. (v7.4.0 does not seem to be available there yet, despite already being tagged in Git.)

cc @noxer



Nope :slight_smile:

No.

No, but while the import was running, did you see any specific resource constraints on your system, i.e. high CPU and/or disk usage?

Usually, only one import can be running at a time, but to be honest, I never tried to run concurrent imports. The in_progress status might be a red herring; I also experienced that when the import job crashed with an unhandled exception. There’s no health check for the running job, and I think not even a restart of the server fixes this state.

There’s no way to do that currently, unfortunately, besides fixing the job’s state in the database directly.
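
For anyone who needs to do that: a minimal sketch of such a database fix, assuming PostgreSQL, the standard Mattermost jobs table, and the import_process job type (verify against your version’s schema, adjust the connection string to your config, and back up the database first):

    # Inspect the stuck import jobs first (connection string and password are placeholders):
    psql "postgres://mmuser:PASSWORD@localhost:5432/mattermost?sslmode=disable" \
      -c "SELECT id, status, to_timestamp(createat / 1000) AS created
          FROM jobs
          WHERE type = 'import_process' AND status = 'in_progress';"

    # Then flip their status so they are no longer listed as running:
    psql "postgres://mmuser:PASSWORD@localhost:5432/mattermost?sslmode=disable" \
      -c "UPDATE jobs
          SET status = 'canceled'
          WHERE type = 'import_process' AND status = 'in_progress';"

After that, mmctl import job show should report those jobs as canceled instead of in_progress.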


Hi @xtaran
The “in_progress” jobs have crashed. When a job panics, it crashes the whole server and the status is never set to failed. There should be a stack trace printed on the console when this happens.
The server never cleans up those statuses, so the jobs are listed as in_progress indefinitely.
As @agriesser mentioned, currently the only way is to change the status in the database directly.
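
Since the stack trace is printed to the console (stderr) rather than written to mattermost.log, on a systemd-based install it should end up in the journal. A quick way to look for it (assuming the service unit is named mattermost; adjust the unit name and the time range to match the restart):

    # Search the systemd journal for the panic around the time of the restart.
    sudo journalctl -u mattermost --since "2022-10-14 09:30" --until "2022-10-14 10:10" \
      | grep -i -B 2 -A 40 "panic"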


JFTR: The crashes I experienced seem to have been fixed, at the latest in 7.5.1.

That’s great news; it’s probably the nil pointer exception that was fixed in version 7.4.0. Thank you for the feedback!


I assume you are referring to Initialize the logger for imports by noxer · Pull Request #20978 · mattermost/mattermost · GitHub. If so: yes, I also think that this is what actually fixed my import crashes.


I have the same issue: the import has finished (judging by CPU activity), but the status still shows in_progress for a while.

MM 9.2.2
All tools are at their latest versions at the time of posting.

Clean Docker setup, --local connection via the socket.

The import ran without any interruptions.

The dump is more than 30 GB, but I fixed all warnings related to the limits.