Subject: Re: [PATCH] ipmi: use rcu lock around call to intf->handlers->sender()
From: Tony Camuso <tcamuso@redhat.com>
To: minyard@acm.org, openipmi-developer@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
Date: Fri, 16 Jun 2017 09:11:20 -0400
In-Reply-To: <613dc158-e3de-f713-019e-cf8c190e72be@acm.org>
References: <1497365651-7413-1-git-send-email-tcamuso@redhat.com>
 <1a359a04-f1da-0e2e-8663-6527e84dd5a3@acm.org>
 <613dc158-e3de-f713-019e-cf8c190e72be@acm.org>

On 06/16/2017 08:15 AM, Corey Minyard wrote:
> On 06/15/2017 10:54 AM, Corey Minyard wrote:
>> On 06/13/2017 09:54 AM, Tony Camuso wrote:
>>> A vendor with a system having more than 128 CPUs occasionally
>>> encounters a crash during shutdown. This is not an easily
>>> reproducible event, but the vendor was able to provide the
>>> following analysis of the crash, which exhibits the same
>>> footprint each time.
>>>
>>> crash> bt
>>> PID: 0      TASK: ffff88017c70ce70  CPU: 5   COMMAND: "swapper/5"
>>>  #0 [ffff88085c143ac8] machine_kexec at ffffffff81059c8b
>>>  #1 [ffff88085c143b28] __crash_kexec at ffffffff811052e2
>>>  #2 [ffff88085c143bf8] crash_kexec at ffffffff811053d0
>>>  #3 [ffff88085c143c10] oops_end at ffffffff8168ef88
>>>  #4 [ffff88085c143c38] no_context at ffffffff8167ebb3
>>>  #5 [ffff88085c143c88] __bad_area_nosemaphore at ffffffff8167ec49
>>>  #6 [ffff88085c143cd0] bad_area_nosemaphore at ffffffff8167edb3
>>>  #7 [ffff88085c143ce0] __do_page_fault at ffffffff81691d1e
>>>  #8 [ffff88085c143d40] do_page_fault at ffffffff81691ec5
>>>  #9 [ffff88085c143d70] page_fault at ffffffff8168e188
>>>     [exception RIP: unknown or invalid address]
>>>     RIP: ffffffffa053c800  RSP: ffff88085c143e28  RFLAGS: 00010206
>>>     RAX: ffff88017c72bfd8  RBX: ffff88017a8dc000  RCX: ffff8810588b5ac8
>>>     RDX: ffff8810588b5a00  RSI: ffffffffa053c800  RDI: ffff8810588b5a00
>>>     RBP: ffff88085c143e58   R8: ffff88017c70d408   R9: ffff88017a8dc000
>>>     R10: 0000000000000002  R11: ffff88085c143da0  R12: ffff8810588b5ac8
>>>     R13: 0000000000000100  R14: ffffffffa053c800  R15: ffff8810588b5a00
>>>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>> --- <IRQ stack> ---
>>>     [exception RIP: cpuidle_enter_state+82]
>>>     RIP: ffffffff81514192  RSP: ffff88017c72be50  RFLAGS: 00000202
>>>     RAX: 0000001e4c3c6f16  RBX: 000000000000f8a0  RCX: 0000000000000018
>>>     RDX: 0000000225c17d03  RSI: ffff88017c72bfd8  RDI: 0000001e4c3c6f16
>>>     RBP: ffff88017c72be78   R8: 000000000000237e   R9: 0000000000000018
>>>     R10: 0000000000002494  R11: 0000000000000001  R12: ffff88017c72be20
>>>     R13: ffff88085c14f8e0  R14: 0000000000000082  R15: 0000001e4c3bb400
>>>     ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
>>>
>>> This is the corresponding stack trace.
>>>
>>> It crashed because the code pointed to by the RIP extracted from the
>>> timer element had already been removed during the shutdown process.
>>>
>>> The function is smi_timeout().
>>>
>>> And we think ffff8810588b5a00 in RDX is a pointer to the
>>> struct smi_info parameter:
>>>
>>> crash> rd ffff8810588b5a00 20
>>> ffff8810588b5a00: ffff8810588b6000 0000000000000000 .`.X............
>>> ffff8810588b5a10: ffff880853264400 ffffffffa05417e0 .D&S......T.....
>>> ffff8810588b5a20: 24a024a000000000 0000000000000000 .....$.$........
>>> ffff8810588b5a30: 0000000000000000 0000000000000000 ................
>>> ffff8810588b5a40: ffffffffa053a040 ffffffffa053a060 @.S.....`.S.....
>>> ffff8810588b5a50: 0000000000000000 0000000100000001 ................
>>> ffff8810588b5a60: 0000000000000000 0000000000000e00 ................
>>> ffff8810588b5a70: ffffffffa053a580 ffffffffa053a6e0 ..S.......S.....
>>> ffff8810588b5a80: ffffffffa053a4a0 ffffffffa053a250 ..S.....P.S.....
>>> ffff8810588b5a90: 0000000500000002 0000000000000000 ................
>>>
>>> Unfortunately, the top of this area has already been destroyed by
>>> someone, but we think this is a struct smi_info for two reasons:
>>>
>>> 1) The addresses between ffff8810588b5a70 and ffff8810588b5a80 are
>>>    inside ipmi_si_intf.c; see crash> module ffff88085779d2c0
>>>
>>> 2) We've found the area which points to this. It is at offset 0x68
>>>    of ffff880859df4000:
>>>
>>> crash> rd ffff880859df4000 100
>>> ffff880859df4000: 0000000000000000 0000000000000001 ................
>>> ffff880859df4010: ffffffffa0535290 dead000000000200 .RS.............
>>> ffff880859df4020: ffff880859df4020 ffff880859df4020  @.Y.... @.Y....
>>> ffff880859df4030: 0000000000000002 0000000000100010 ................
>>> ffff880859df4040: ffff880859df4040 ffff880859df4040 @@.Y....@@.Y....
>>> ffff880859df4050: 0000000000000000 0000000000000000 ................
>>> ffff880859df4060: 0000000000000000 ffff8810588b5a00 .........Z.X....
>>> ffff880859df4070: 0000000000000001 ffff880859df4078 ........x@.Y....
>>>
>>> If we regard it as a struct ipmi_smi in the shutdown process, it
>>> looks consistent.
>>>
>>> The remedy for this apparent race is affixed below.
>>
>> I think you are right about this problem, but in_shutdown is checked
>> already a bit before when newmsg is extracted from the list. Wouldn't
>> it be better to add the rcu_read_lock() region starting right before
>> the previous in_shutdown check to after the send? That would avoid a
>> leak in this case.
>
> While lying awake unable to sleep, I realized that you can't call the
> sender function while holding rcu_read_lock(). That will break RT,
> because you can't claim a mutex while holding rcu_read_lock(),
> and the sender function will claim normal spinlocks.
>
> So I need to think about this a bit.
>
> -corey
>
>>
>> Thanks,
>>
>> -corey
>>
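To make the RT constraint concrete for anyone following along, the
pattern at issue reduces to something like the sketch below. This is
illustrative only, not driver code; the mutex is a made-up stand-in
for whatever lock a sender() implementation takes:

#include <linux/mutex.h>
#include <linux/rcupdate.h>

static DEFINE_MUTEX(example_lock);	/* hypothetical stand-in lock */

static void example_broken_pattern(void)
{
	rcu_read_lock();
	/*
	 * mutex_lock() may sleep, and sleeping is not allowed inside an
	 * RCU read-side critical section. On an RT kernel an ordinary
	 * spinlock_t becomes a sleeping lock as well, which is why a
	 * sender() that takes spinlocks can't be called under
	 * rcu_read_lock().
	 */
	mutex_lock(&example_lock);
	mutex_unlock(&example_lock);
	rcu_read_unlock();
}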
Would this be adequate to prevent the race? Is the sender's
mutex/spinlock sufficient to limit access?

diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index 9f69995..ebce6b3 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -3880,6 +3880,7 @@ static void smi_recv_tasklet(unsigned long val)
 	 */
 	if (!run_to_completion)
 		spin_lock_irqsave(&intf->xmit_msgs_lock, flags);
+
 	if (intf->curr_msg == NULL && !intf->in_shutdown) {
 		struct list_head *entry = NULL;
 
@@ -3894,11 +3895,13 @@ static void smi_recv_tasklet(unsigned long val)
 			newmsg = list_entry(entry, struct ipmi_smi_msg, link);
 			intf->curr_msg = newmsg;
 		}
+
+		if (newmsg)
+			intf->handlers->sender(intf->send_info, newmsg);
 	}
+
 	if (!run_to_completion)
 		spin_unlock_irqrestore(&intf->xmit_msgs_lock, flags);
-	if (newmsg)
-		intf->handlers->sender(intf->send_info, newmsg);
 
 	handle_new_recv_msgs(intf);
 }

>>> Signed-off-by: Tony Camuso <tcamuso@redhat.com>
>>> ---
>>>  drivers/char/ipmi/ipmi_msghandler.c | 9 +++++++--
>>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
>>> index 9f69995..577509f 100644
>>> --- a/drivers/char/ipmi/ipmi_msghandler.c
>>> +++ b/drivers/char/ipmi/ipmi_msghandler.c
>>> @@ -3897,8 +3897,13 @@ static void smi_recv_tasklet(unsigned long val)
>>>  	}
>>>  	if (!run_to_completion)
>>>  		spin_unlock_irqrestore(&intf->xmit_msgs_lock, flags);
>>> -	if (newmsg)
>>> -		intf->handlers->sender(intf->send_info, newmsg);
>>> +
>>> +	if (newmsg) {
>>> +		rcu_read_lock();
>>> +		if (!intf->in_shutdown)
>>> +			intf->handlers->sender(intf->send_info, newmsg);
>>> +		rcu_read_unlock();
>>> +	}
>>> 
>>>  	handle_new_recv_msgs(intf);
>>>  }
>>
>>
>
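For clarity, with the unquoted diff above applied, the flow of
smi_recv_tasklet() would be roughly the following. This is a condensed
sketch reconstructed from that diff, with the queue-scanning details
elided, not a verbatim copy of the patched function:

	if (!run_to_completion)
		spin_lock_irqsave(&intf->xmit_msgs_lock, flags);

	if (intf->curr_msg == NULL && !intf->in_shutdown) {
		/* ... pick newmsg off the transmit list, set intf->curr_msg ... */

		if (newmsg)
			/*
			 * The send now happens while xmit_msgs_lock is
			 * still held, so the intent is that the
			 * in_shutdown test and the call to sender() are
			 * atomic with respect to shutdown.
			 */
			intf->handlers->sender(intf->send_info, newmsg);
	}

	if (!run_to_completion)
		spin_unlock_irqrestore(&intf->xmit_msgs_lock, flags);

	handle_new_recv_msgs(intf);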