Running `mattermost extract-documents-content` after going to 5.35 gave errors

Hi all,

I updated from 5.33.2 to 5.35.1 and that went fine. Then I ran mattermost extract-documents-content as suggested, and at the end it spewed the output below. It looks like a crash with a backtrace, I guess?

Anyone else see this? What does it mean exactly?

{"level":"warn","ts":1621736812.116505,"caller":"docextractor/combine.go:35","msg":"unable to extract file content","error":"exec: \"pdftotext\": executable file not found in $PATH"}
{"level":"info","ts":1621736812.1348782,"caller":"app/server.go:911","msg":"Stopping Server..."}
{"level":"info","ts":1621736812.1354005,"caller":"app/web_hub.go:103","msg":"stopping websocket hub connections"}
{"level":"info","ts":1621736812.1388233,"caller":"app/server.go:997","msg":"Server stopped"}
panic: loading {11 0}: found {10 0}

goroutine 1 [running]:
*Reader).resolve(0xc00689db00, 0x0, 0x2822080, 0xc000e75a90, 0xc006a254d8, 0x0, 0x1000, 0xc001877680) +0xdd6
, 0x0, 0x279e280, 0xc005ae12c0, 0x2b09a7b, 0x4, 0x0, 0x0, 0xc006a25440, 0x26758c0) +0x9e
*Reader).NumPage(0xc00689db00, 0x1946c0f) +0x69
*Reader).GetPlainText(0xc00689db00, 0xc005d12048, 0x13a7110, 0x0, 0xc00689db00) +0x45
*pdfExtractor).Extract(0x4393048, 0xc004685360, 0x15, 0x2f9bde0, 0xc005d12038, 0x0, 0x0, 0x0, 0x0) +0x35f
*combineExtractor).Extract(0xc001eec680, 0xc004685360, 0x15, 0x2f9bde0, 0xc005d12038, 0xc006a61d80, 0x2, 0x4, 0x2f9bde0) +0x142
, 0x15, 0x2f9bde0, 0xc005d12038, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc001877b60, ...) +0x2d4
*App).ExtractContentFromFileInfo(0xc0000b0a20, 0xc004698500, 0x0, 0x0) +0x1f0
, 0x4393048, 0x0, 0x0, 0x0, 0x0) +0x3c5
*Command).execute(0x43182a0, 0x4393048, 0x0, 0x0, 0x43182a0, 0x4393048) +0x47c
*Command).ExecuteC(0x431e6a0, 0x0, 0xffffffff, 0xc000102058) +0x375
*Command).Execute(...)
main.main() +0x86
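For what it's worth, the warn line just above the panic names the root cause visible in the log: the extractor shells out to pdftotext, which isn't on $PATH. A quick way to check for it (the package name below assumes a Debian/Ubuntu host; adjust for your distro):

```shell
# Check for the external binary the docextractor warn line complains about.
if command -v pdftotext >/dev/null 2>&1; then
  echo "pdftotext: found at $(command -v pdftotext)"
else
  # On Debian/Ubuntu, pdftotext ships in the poppler-utils package.
  echo "pdftotext: missing -- try: sudo apt-get install poppler-utils"
fi
```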

We have a ticket open and are working on a fix: [MM-35990] Mattermost crashes when content extractor dependencies not present - Mattermost.
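For anyone curious what that class of fix tends to look like: the usual Go pattern is to wrap the third-party extractor call in a recover, so a panic on one bad file becomes an ordinary logged error instead of taking down the whole server. A minimal sketch only, not the actual MM-35990 patch (safeExtract is a made-up name):

```go
package main

import "fmt"

// safeExtract converts a panic inside a third-party parser into an
// ordinary error, so one malformed file cannot crash the process.
// Illustrative sketch, not Mattermost's real code.
func safeExtract(extract func() (string, error)) (text string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("extractor panicked: %v", r)
		}
	}()
	return extract()
}

func main() {
	// A well-behaved extractor.
	text, err := safeExtract(func() (string, error) { return "hello", nil })
	fmt.Println(text, err) // hello <nil>

	// An extractor that panics, as the PDF parser did here.
	_, err = safeExtract(func() (string, error) {
		panic("loading {11 0}: found {10 0}")
	})
	fmt.Println(err) // extractor panicked: loading {11 0}: found {10 0}
}
```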


The 5.35.2 dot release is now available with this fix.

Yup, seems fixed, though now I hit a different crash:

fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x2b55bb5, 0x16)
	runtime/panic.go:1116 +0x72
runtime.sysMap(0xc024000000, 0x8000000, 0x4397378)
	runtime/mem_linux.go:169 +0xc6
runtime.(*mheap).sysAlloc(0x4379a20, 0x7c00000, 0x42df97, 0x4379a28)
	runtime/malloc.go:727 +0x1e5
runtime.(*mheap).grow(0x4379a20, 0x3c63, 0x0)
	runtime/mheap.go:1344 +0x85
runtime.(*mheap).allocSpan(0x4379a20, 0x3c63, 0x100, 0x4397388, 0xfffffffffffffade)
	runtime/mheap.go:1160 +0x6b6
	runtime/mheap.go:907 +0x65
runtime.(*mheap).alloc(0x4379a20, 0x3c63, 0x7f09ef460001, 0x42ba4a)
	runtime/mheap.go:901 +0x85
runtime.largeAlloc(0x78c501a, 0x460100, 0xc018ac0000)
	runtime/malloc.go:1177 +0x92
	runtime/malloc.go:1071 +0x46
	runtime/asm_amd64.s:370 +0x66

This was when it was processing a 150 MB file. My server has only 1.25 GiB of memory (it’s a VM, I’ll try increasing it), but it has plenty of swap space, so I’m a bit surprised it didn’t succeed anyway.

@jespino Can you help take a look?

Hi @seanm,

This looks like it's related to how the Linux kernel manages memory. Maybe this link has an explanation: linux - Out of memory, but swap available - Unix & Linux Stack Exchange
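The gist of that link, as I understand it: the kernel can refuse a single large mmap (because of the overcommit policy or a per-process address-space ulimit) even when swap is free, and the Go runtime treats that refusal as a fatal "out of memory". The two knobs worth inspecting (the procfs path is Linux-only):

```shell
# 0 = heuristic overcommit (default), 1 = always allow, 2 = strict accounting
if [ -r /proc/sys/vm/overcommit_memory ]; then
  cat /proc/sys/vm/overcommit_memory
fi
# Per-process virtual address-space cap; "unlimited" means no cap
ulimit -v
```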

In any case, there is a bad combination here of huge extractable files and low memory availability. We could probably add a configuration setting to limit the extracted content to a certain size. What do you think @eric?
