Large File import from Slack fails with i/o timeout

Thanks, I upped the PostgreSQL connection limit to see whether it lets the import continue, and so far that seems to be working. I also saw the reference to the experimental Bleve indexing feature I had enabled; I have disabled it and the import now appears to be processing.
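
For anyone following along, the two settings in question look roughly like this; the values below are illustrative, not a recommendation, so adjust them for your own setup.

In postgresql.conf:

max_connections = 200

And the Bleve block in Mattermost's config.json (the same options are also exposed in the System Console):

"BleveSettings": {
  "IndexDir": "",
  "EnableIndexing": false,
  "EnableSearching": false,
  "EnableAutocomplete": false
}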

This should take a while.

So the import failed this time because it referenced a user that was never created. I'm recreating the jsonl to add that user back to the import and will try again.
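
For reference, a user entry in the bulk-import jsonl looks roughly like this, as far as I understand the format; the username, email, team and channel names below are placeholders, and the membership part is optional:

{"type":"user","user":{"username":"first.last","email":"first.last@example.com","teams":[{"name":"company","channels":[{"name":"town-square"}]}]}}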

Hooray! That actually worked!

So disabling the Bleve indexing and email notifications and breaking the jobs down into smaller parts seems to be the key. The server also seemed to require at minimum 16-32GB of RAM, and turning off the oom-killer at the OS level might have helped as well, though I'm not sure about that one. Still have to try the last and largest import. Will let it run overnight.

Awesome, yes, we still have some work to do to reduce the memory usage.

Edit: Thought I had posted this last night…
The import failed almost immediately with this error:

[
  {
    "id": "pgw9itp5kfykudrb87ynfkqj3e",
    "type": "import_process",
    "priority": 0,
    "create_at": 1680065039664,
    "start_at": 1680065045592,
    "last_activity_at": 1680065047022,
    "status": "error",
    "progress": -1,
    "data": {
      "error": "Error during job execution. — BulkImport: Channel type is invalid.",
      "import_file": "bulk_import_01022022_03222023.zip",
      "line_number": "1412"
    }
  }
]

So this line in the jsonl is referencing a user to import, not a channel. This doesn’t seem to make sense.

Any ideas?

OK, so I was apparently not looking at the correct line earlier; the line it points to does in fact contain a channel definition:

{"type":"channel","channel":{"team":"company","name":"c047u3dh05p","display_name":"c047u3dh05p","type":"G","header":"","purpose":"Group messaging with: @first1.last1 @first2.last2 @first3.last3 @first4.last4 @first5.last5 @first6.last6 @first7.last7 @first8.last8 @first9.last9"}}

Not sure where channel definitions are documented at this level of detail, but could this be an unsupported type? It would be useful to know all supported channel types; maybe I can just assign it a supported one.

Only open (O) and private (P) channels are supported. So DM and GM channels are not supported. You’d need to exclude them from the export.

Going to just change the channel type to (P), since all of the others are of a similar type, and re-zip the jsonl. Hopefully that's the last error; I didn't find any additional matching patterns in the file.
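
In case it's useful to anyone else, a small throwaway script is safer than a blanket search-and-replace here, since "type" also appears on other kinds of lines. A rough sketch in Python (filenames are placeholders, not exactly what I ran):

import json

# Rewrite group-message channels ("type": "G") to private channels ("type": "P")
# in a Mattermost bulk-import jsonl file; all other lines pass through unchanged.
with open("bulk_import.jsonl", encoding="utf-8") as src, \
     open("bulk_import_fixed.jsonl", "w", encoding="utf-8") as dst:
    for line in src:
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry.get("type") == "channel" and entry.get("channel", {}).get("type") == "G":
            entry["channel"]["type"] = "P"
            line = json.dumps(entry, ensure_ascii=False) + "\n"
        dst.write(line)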

The bulk import was recreated successfully and the import appears to be processing; I will hopefully be marking this as solved today.

So it ran for a while and then gave me this error:

[
  {
    "id": "gs5gzzdby3bairigwi95w4674w",
    "type": "import_process",
    "priority": 0,
    "create_at": 1680112696068,
    "start_at": 1680112704816,
    "last_activity_at": 1680113869661,
    "status": "error",
    "progress": -1,
    "data": {
      "error": "Error during job execution. — BulkImport: Missing required direct post property: channel_members",
      "import_file": "bulk_import_01022022_03222023.zip",
      "line_number": "474853"
    }
  }
]

Oddly enough, the “channel_members” field is present, but its value is null:

{"type":"direct_post","direct_post":{"channel_members":null,"user":"first.last","type":null,"message":"This is a dummy message.","props":null,"create_at":1672794825652,"edit_at":null,"flagged_by":null,"reactions":null,"replies":[],"attachments":[]}}

The previous two “direct_post” entries have the identical null value for “channel_members”, so I'm confused. It also seems odd that there would be a direct post to a channel with no members. Would it make sense to exclude this data, since it doesn't seem anyone should have access to it?

How would Mattermost handle this type of import data?

Edit: I just decided to drop those posts since the count was low (153). Now on to dropping a message that included someone pinging Google. Re-zipping the file each time, only for the import process to unzip it again, is starting to feel redundant.
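
For anyone doing the same cleanup, one way to do it is a small filter like this; it drops only the direct_post entries whose channel_members is null and copies everything else through verbatim (filenames are placeholders, and this is a sketch of the approach rather than exactly what I ran):

import json

# Drop direct_post entries with "channel_members": null from a Mattermost
# bulk-import jsonl file; all other lines are written out unchanged.
kept, dropped = 0, 0
with open("bulk_import.jsonl", encoding="utf-8") as src, \
     open("bulk_import_fixed.jsonl", "w", encoding="utf-8") as dst:
    for line in src:
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry.get("type") == "direct_post" and entry.get("direct_post", {}).get("channel_members") is None:
            dropped += 1
            continue
        dst.write(line)
        kept += 1
print(f"kept {kept} lines, dropped {dropped} direct_post entries")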

These could be posts to a channel which no longer exists and has been deleted in the source, or a channel which could not be converted properly and is therefore missing from the import. Can you find the message in the source files, and if so, in what context?

On a side note, the message length errors are suddenly gone now? Or did you reduce the length of those messages?

I suspected the same; it seemed to be an impromptu channel created amongst a group of people who subsequently dropped out, which justified excluding the associated data. I wasn't able to identify the original data in Slack, since accessing it would have required more advanced knowledge of the platform than I currently possess.

I dropped the single message that triggered the message length error (the one containing the ping/response) and re-ran the import job successfully! So I'm guessing the characters used within that message were somehow consuming more of the predefined limit mentioned earlier than their count suggests (since the message was only 9432 characters long), as I did not encounter any additional errors after that point.
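
One way to sanity-check that theory, if the limit is enforced on the encoded byte length rather than on the character count (I'm not certain which applies here): a message full of multi-byte characters can blow past a byte limit well before the character count looks suspicious. A quick illustration in Python:

msg = "é" * 9432                   # 9432 characters, the same length as the problem message
print(len(msg))                    # 9432 characters
print(len(msg.encode("utf-8")))    # 18864 bytes once UTF-8 encoded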

Thank you for your invaluable support and communication in getting this finished! I'm not really sure which post to mark as the solution since it was a team effort.

I was thinking more of a zgrep -ir <pattern> <slackfile.zip> approach :slight_smile: But I think it's really just an export inconsistency of some sort, which is perfectly fine to handle by removing these messages.

Before picking a solution here let’s make sure your system is really running after the import - no need to rush on that :slight_smile:

Solution to the problem and burden of a large file import:

  • You can manually copy the large import zip file directly to “/<install_path_of_mattermost>/data/import” without having to muck about with importing it via the command line or web interface. It will then show up when running “mmctl import show available”

  • Ensure your server has at least 16GB to 32GB of RAM (that is the range mine actually used for an import of this size, even though I allocated far more than that), as the import process isn't as efficient as it could be yet.

  • Break the import down into smaller chunks that are easier to manipulate. I started with a 1.1TB file and ended up with smaller files of roughly 10GB to 600GB each (see the sketch after this list for one way to split the jsonl).

  • Disable Bleve indexing if enabled (I had it enabled)

  • Disable email notifications (not sure if this was needed, but if you're not running 7.9.1 it might affect the import according to the release notes)

  • Identify any “direct_post”:{} entries that might contain “channel_members”:null and consider dropping them or adding channel members so that data can be imported as needed (I chose to drop them).

  • Identify any channels that are of “type”:“G” and change them to “type”:“P” or drop them if not needed.
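
For the "smaller chunks" point above, I don't know of an official splitting tool, so here is a rough sketch of one way to do it: keep the definition lines (version, team, channel, user and so on) in every chunk and split only the post lines across files. Everything here (the line types treated as definitions, the filenames, the chunk size) is an assumption rather than something verified against the importer, and the sketch naively holds the whole file in memory; for a really large file you would stream instead.

import json

# Split a Mattermost bulk-import jsonl file into smaller chunks. Each chunk
# repeats the "definition" lines so that posts in any chunk can resolve the
# teams, channels and users they reference.
DEFINITION_TYPES = {"version", "team", "channel", "user", "direct_channel", "emoji"}
POSTS_PER_CHUNK = 500_000

definitions, posts = [], []
with open("bulk_import.jsonl", encoding="utf-8") as src:
    for line in src:
        if not line.strip():
            continue
        entry_type = json.loads(line).get("type")
        (definitions if entry_type in DEFINITION_TYPES else posts).append(line)

for i in range(0, max(len(posts), 1), POSTS_PER_CHUNK):
    chunk_name = f"bulk_import_part{i // POSTS_PER_CHUNK + 1:03d}.jsonl"
    with open(chunk_name, "w", encoding="utf-8") as dst:
        dst.writelines(definitions)
        dst.writelines(posts[i:i + POSTS_PER_CHUNK])

Each chunk then still needs to be zipped up, together with whatever attachment files it references, before it can be imported.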

I also disabled the Linux OOM killer and increased max_connections in postgresql.conf above the default of 100, but I can't confirm or deny whether that actually contributed to solving the initial problem, since it seems Bleve indexing was the cause of the excessive connections to Postgres.

@agriesser and @agnivade deserve the credit for this solution; this post is just marked as the solution because it was a progressive effort rather than a single fix. Hopefully this TL;DR will help someone else in the future.

Thanks again for all your help!
