[erlang-questions] Why isn't 'EXIT' message being received/processed?

Youngkin, Rich richard.youngkin@REDACTED
Tue Mar 3 15:36:06 CET 2015


Hi Robert,

Thanks for confirming that I did indeed understand what was going on.

The code does end up calling a gen_server, albeit indirectly. I call a
function in the RabbitMQ amqp client library (amqp_channel:call/2), and
that module makes the gen_server call.
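
For anyone who wants to reproduce the ordering, here is a minimal sketch.
It makes some assumptions: the module name exit_race_demo and the helper
safe_call/2 are made up for illustration, the child is a bare process
rather than a real channel, and a direct gen_server:call/2 stands in for
amqp_channel:call/2. It queues an ordinary message ahead of the 'EXIT',
shows that a plain send to the dead child does not fail, and shows the
gen_server call exiting with noproc before the 'EXIT' is read.

```erlang
-module(exit_race_demo).
-export([run/0]).

%% Hypothetical demo; names here are illustrative, not from the amqp client.
run() ->
    Old = process_flag(trap_exit, true),
    %% Child that exits with reason 'closed' when told to stop.
    Child = spawn_link(fun() -> receive stop -> exit(closed) end end),
    Ref = erlang:monitor(process, Child),
    %% An ordinary message is queued BEFORE the child dies, so it sits
    %% ahead of the 'EXIT' message in this process's mailbox.
    self() ! {ack_delivery, Child},
    Child ! stop,
    %% Selective receive: wait until the child is definitely gone,
    %% leaving ack_delivery and 'EXIT' untouched in the mailbox.
    receive {'DOWN', Ref, process, Child, _} -> ok end,
    %% A plain send to a dead pid succeeds silently.
    hello = (Child ! hello),
    %% The earlier message is handled first, and the call inside it fails.
    First = receive
                {ack_delivery, Pid} -> safe_call(Pid, ack);
                {'EXIT', _From, _Why} -> exit_seen_first
            end,
    %% Only now is the 'EXIT' message reached.
    Second = receive
                 {'EXIT', Child, Reason} -> {exit_seen, Reason}
             after 1000 -> no_exit
             end,
    process_flag(trap_exit, Old),
    {First, Second}.

%% Wrap the call so a dead server becomes a return value, not a crash.
safe_call(Pid, Request) ->
    try gen_server:call(Pid, Request) of
        Reply -> {ok, Reply}
    catch
        exit:{noproc, _} -> {error, noproc}
    end.
```

Calling exit_race_demo:run() should return a pair whose first element is
the noproc error from the call and whose second is the 'EXIT' that was
waiting behind it. The try/catch only covers the noproc case; a real
channel can fail in other ways that would need their own clauses.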

Thanks again for confirming.

Cheers,
Rich

On Mon, Mar 2, 2015 at 8:02 AM, Robert Virding <rvirding@REDACTED> wrote:

> Yes, your explanation is correct. When the signal is converted to a
> message it behaves just like any other message and is added to the end of
> the queue.
>
> How are you sending to the child process? If you just do Pid ! Msg then
> there will be no error if the process has died.
>
> Robert
>
>
> On 26 February 2015 at 01:36, Youngkin, Rich <richard.youngkin@REDACTED>
> wrote:
>
>> I have an explanation for why it appears as if an 'EXIT' message isn't
>> received and processed. It seems kind of obvious looking back, but also
>> subtle enough that I don't feel like a complete idiot :)  Hopefully this
>> may help someone else who trips over the problem. I'm also interested in
>> feedback, especially concerning some alternatives I describe below for
>> addressing it.
>>
>> So to recap, I have a process that creates a process using spawn_link.
>> This parent process has a receive loop with a clause that matches 'EXIT'.
>> The problem I encountered was that the child's 'EXIT' messages weren't
>> always being received by the parent process before the parent process tried
>> to send a message to the linked child process. In this case the parent
>> process would fail with a "noproc" error (i.e., the child process no longer
>> exists).
>>
>> The cause appears to be the order of messages received in the parent's
>> mailbox. The 'EXIT' message is handled just like any other message in that
>> a process's mailbox behaves like a queue (FIFO). So the 'EXIT' message is
>> placed in the mailbox after any other unprocessed messages. Those messages
>> will be handled before the 'EXIT' message. In my case, one of the prior
>> messages results in sending a message (PID ! Message) to the failed child
>> process before the 'EXIT' message is seen, a simple race condition.  I have
>> seen test results that support this theory.  Am I correctly characterizing
>> the handling of 'EXIT' messages WRT general mailbox behavior?
>>
>> Since this behavior is non-deterministic my simple approach of handling
>> 'EXIT' messages won't work. Alternatives I'm considering include:
>> 1. Selective receives, but I don't really see this as a robust approach
>> to handling this given the description of this approach in LYSE. LYSE also
>> describes a better approach using min_heap, but this seems like overkill in
>> my case.
>> 2. Handling 'EXIT' messages and backing that up with a try/catch for the
>> race condition.
>> 3. Forget linking altogether and just handle the problem with a try/catch
>> block.
>>
>> I'm leaning towards option 3 and I'm interested in other opinions/options.
>>
>> Finally, this seems like a fairly common use case, especially in RabbitMQ
>> applications (where the recommendation is to monitor/link amqp_channel
>> processes). But maybe I'm missing something, or misusing 'EXIT'?  Any
>> comments?
>>
>> Thanks,
>> Rich
>>
>> On Tue, Feb 17, 2015 at 3:25 PM, Youngkin, Rich <
>> richard.youngkin@REDACTED> wrote:
>>
>>> Hi,
>>>
>>> I've got an app that spawn_links processes with trap_exit. I'm killing
>>> the linked processes but the monitoring process isn't always receiving the
>>> 'EXIT' message.  Here are some code snippets:
>>>
>>> ...
>>>   process_flag(trap_exit, true),
>>>   link(Connection),
>>>   link(Channel),
>>> ...
>>>
>>> loop(State) ->
>>>   ...
>>>   {'EXIT', What, Reason} ->
>>>     do_something_smart();
>>>
>>>   ...
>>>
>>> Connection and Channel are a RabbitMQ connection and channel (although
>>> that's not necessarily important to know). I'm manually running "force
>>> close" on the connection via the RabbitMQ admin interface to trigger the
>>> 'EXIT'.  In one case the 'EXIT' message is received and in the other case
>>> it isn't. Here are more code snippets to illustrate this (same loop/1
>>> function as above):
>>>
>>> loop(#state{channel=Channel, delay_ack= DelayAck} = State) ->
>>>   ...
>>>
>>>   {#'basic.deliver'{delivery_tag=DeliveryTag}, Content} ->
>>>     ... do something with the content
>>>     case DelayAck of
>>>       true ->
>>>         %% allow time for 'EXIT' to arrive in the mailbox before the
>>>         %% "ack_delivery" message
>>>         timer:sleep(500),
>>>         self() ! {ack_delivery, Channel, DeliveryTag},
>>>         loop(State);
>>>       _ ->
>>>         amqp_channel:call(Channel,
>>>                           #'basic.ack'{delivery_tag=DeliveryTag}),
>>>         loop(State)
>>>     end;
>>>
>>>   {ack_delivery, Channel, DeliveryTag} ->
>>>     timer:sleep(50), %% ack delay
>>>     amqp_channel:call(Channel, #'basic.ack'{delivery_tag=DeliveryTag}),
>>>     loop(State);
>>>
>>>   ...
>>>
>>> In the above snippet DelayAck specifies whether the actual ack happens
>>> immediately or as a result of sending another message through loop/1.
>>> When DelayAck is false the 'EXIT' message is received as expected. When
>>> DelayAck is true there is a sleep of 500ms in order to allow the 'EXIT' to
>>> arrive in the mailbox before the {ack_delivery, Channel, DeliveryTag}
>>> message. But in this case the 'EXIT' message isn't received. The process
>>> instead fails with a "noproc" when invoking amqp_channel:call/2 in
>>> {ack_delivery...}. This makes sense since the Channel is now invalid, but I
>>> did expect 'EXIT' to be received first thereby avoiding this failure.
>>> Increasing the sleep before sending the {ack_delivery...} message doesn't
>>> make any difference (except to delay the "noproc" failure). The behavior
>>> described in this paragraph is consistent across several test runs.
>>>
>>> What would explain why the 'EXIT' message isn't received (ahead of the
>>> ack_delivery message, or even at all) in the DelayAck case?
>>>
>>> Thanks,
>>> Rich