

I finally got a backtrace with debug symbols


hey, if you get a chance to write up a blog post about how you walked through this process to get to this point it would be amazing. I really need to learn how to get to this point.

I understand the terminology and what the tools are doing, but have no clue how to do it.
I already need to give a TED Talk on how I made Pleroma go from compiling 172 files to 14. You’ll see that MR soon.
Yep exactly. And I think it’s actually this line: https://github.com/erlang/otp/blob/OTP-23.3.4/erts/emulator/beam/erl_proc_sig_queue.c#L855

I don’t know much C, but I guess that pointer &proc could be the problem.
What's the context here? At face value there's nothing wrong with that line.
It’s causing gleasonator to segfault like 10 times per day.
Are you positive it’s that line, and is it always that line?

The only pointer that’s being dereferenced on that line is proc, which is also dereferenced shortly before, so I don’t see how that line could normally cause a segfault.

The unary & operator has lower precedence than the -> and . operators, so it applies to the whole expression proc->sig_inq.first. In other words, the == is checking whether proc->sig_inq.nmsigs.next points to the memory location of proc->sig_inq.first. (The former is an ErtsMessage ** and the latter an ErtsMessage *, so taking the address of the latter makes the types match.)
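To make the precedence point concrete, here is a minimal sketch. The struct definitions are simplified, hypothetical stand-ins and not the real ERTS types, which have many more fields; only the field names mentioned above are mirrored. The two functions should behave identically, since the second just writes out the grouping the compiler already applies to the first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the ERTS structures.
 * These are assumptions for illustration, not the real definitions. */
typedef struct Message {
    struct Message *next;
} Message;

typedef struct {
    Message *first;         /* head of the queue: a Message *        */
    struct {
        Message **next;     /* points at some Message * slot         */
    } nmsigs;
} SigQueue;

typedef struct {
    SigQueue sig_inq;
} Proc;

/* Same shape as the comparison discussed above: & applies to the
 * whole expression proc->sig_inq.first, not just to proc. */
bool next_points_at_first(Proc *proc)
{
    return proc->sig_inq.nmsigs.next == &proc->sig_inq.first;
}

/* Identical meaning, with the implicit grouping spelled out. */
bool next_points_at_first_explicit(Proc *proc)
{
    return proc->sig_inq.nmsigs.next == &((proc->sig_inq).first);
}
```

Note that the only pointer either function dereferences is proc itself. The & only computes an address inside *proc and never reads through first, so a crash on an expression of this shape would point at proc being invalid rather than at anything stored in the queue.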

My C isn’t all that great either, but I think the most likely cause is concurrent access to the data: proc could be a valid pointer while one line executes and suddenly not anymore while the next one does, because a thread running in parallel did something. That’s assuming the segfault really happened on that line.
A kind denizen of #c on Libera.Chat says that my analysis is sound. I had to ask to make sure I'm not spouting BS. 😛

Another possible cause they named is stack corruption. To quote: "thread bug or stack corruption or some other external effect, the code as written seems fine."
oooh ooooh

Alex, if you're using Docker, what's the base here? Perhaps this could be a glibc vs musl issue?
It’s not Docker. I can reproduce it on Spinster and Gleasonator. Both are VMs running Ubuntu 20.04, and I’m running Pleroma with systemd.