From: 河合英宏 / KAWAI，HIDEHIRO
To: "'minyard@acm.org'"
CC: "openipmi-developer@lists.sourceforge.net", "linux-kernel@vger.kernel.org"
Subject: RE: [PATCH 7/7] ipmi/kcs: Don't run the KCS state machine when it is KCS_IDLE
Date: Tue, 25 Aug 2015 03:53:22 +0000
Message-ID: <04EAB7311EE43145B2D3536183D1A8445495009D@GSjpTKYDCembx31.service.hitachi.net>
In-Reply-To: <55DB3F8E.5030100@acm.org>
References: <20150727055516.4759.34462.stgit@softrs>
 <20150727055516.4759.65106.stgit@softrs>
 <55CAC849.70902@acm.org>
 <04EAB7311EE43145B2D3536183D1A84454938089@GSjpTKYDCembx31.service.hitachi.net>
 <55D8B543.4000805@acm.org>
 <04EAB7311EE43145B2D3536183D1A8445493D025@GSjpTKYDCembx31.service.hitachi.net>
 <55DB3F8E.5030100@acm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

> From: Corey Minyard [mailto:tcminyard@gmail.com] On Behalf Of Corey Minyard
>
> On 08/23/2015 08:52 PM, 河合英宏 / KAWAI,HIDEHIRO wrote:
> >> From: Corey Minyard [mailto:tcminyard@gmail.com] On Behalf Of Corey Minyard
> >>
> >> On 08/17/2015 09:54 PM, 河合英宏 / KAWAI,HIDEHIRO wrote:
> >>>> From: Corey Minyard [mailto:tcminyard@gmail.com] On Behalf Of Corey Minyard
> >>>>
> >>>> This patch will break ATN handling on the interfaces.  So we can't do this.
> >>> I understand.  So how about doing like this:
> >>>
> >>>         /* All states wait for ibf, so just do it here. */
> >>> -       if (!check_ibf(kcs, status, time))
> >>> +       if (kcs->state != KCS_IDLE && !check_ibf(kcs, status, time))
> >>>                 return SI_SM_CALL_WITH_DELAY;
> >>>
> >>> I think it is not necessary to wait IBF when the state is IDLE.
> >>> In this way, we can also handle the ATN case.
> >> I think it would be more reliable to go up a level and add a timeout.
> > It may be so, but we should address this issue separately (at least
> > I think above solution reasonably solves the issue).
> >
> > This issue happens after all queued messages are processed or dropped
> > by timeout.  There is no current message.  So what should we set
> > a timeout against?  We can add a timeout into my new flush_messages(),
> > but that is meaningful only in panic context.  That doesn't help
> > in normal context; we would perform a busy loop of smi_event_handler()
> > and schedule() in ipmi_thread().
>
> I'm a little confused here.  Is the problem that the ATN bit is stuck
> high?  If so, it's going to be really hard to work around this without
> breaking ATN handling.

Sorry for my insufficient explanation.  I am assuming the case where
the IBF bit is always 1.  I don't know what happens when a BMC hangs up,
but I guess IBF stays at 1, because my server's BMC behaves that way
while rebooting.

Regards,

Hidehiro Kawai

> >> One should
> >> be there, anyway.  I thought they were all covered, but I may have missed
> >> something.
> >>
> >> -corey
> >>
> >>> Regards,
> >>>
> >>> Hidehiro Kawai
> >>> Hitachi, Ltd. Research & Development Group
> >>>
> >>>> It's going to be extremely hard to recover if the BMC is not working
> >>>> correctly when a panic happens.  I'm not sure what can be done, but if
> >>>> you can fix it another way it would be good.
> >>>>
> >>>> -corey
> >>>>
> >>>> On 07/27/2015 12:55 AM, Hidehiro Kawai wrote:
> >>>>> If a BMC is unresponsive for some reason, it ends up completing
> >>>>> the requested message as an error, then kcs_event() is called once
> >>>>> to advance the state machine.  However, since the BMC is
> >>>>> unresponsive now, the status of the KCS interface may not be
> >>>>> idle.  As the result, the state machine can continue to run and
> >>>>> consume CPU time indefinitely even if there is no more request
> >>>>> message.  Moreover, if this happens in run-to-completion mode
> >>>>> (i.e. context of panic_event()), the kernel hangs up.
> >>>>>
> >>>>> To fix this problem, this patch ignores kcs_event() call if there
> >>>>> is no request message to be processed.
> >>>>>
> >>>>> Signed-off-by: Hidehiro Kawai
> >>>>> ---
> >>>>>  drivers/char/ipmi/ipmi_kcs_sm.c | 4 ++++
> >>>>>  1 file changed, 4 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
> >>>>> index 8c25f59..0e187fb 100644
> >>>>> --- a/drivers/char/ipmi/ipmi_kcs_sm.c
> >>>>> +++ b/drivers/char/ipmi/ipmi_kcs_sm.c
> >>>>> @@ -353,6 +353,10 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
> >>>>>          if (kcs_debug & KCS_DEBUG_STATES)
> >>>>>                  printk(KERN_DEBUG "KCS: State = %d, %x\n", kcs->state, status);
> >>>>>
> >>>>> +        /* We don't want to run the state machine when the state is IDLE */
> >>>>> +        if (kcs->state == KCS_IDLE)
> >>>>> +                return SI_SM_IDLE;
> >>>>> +
> >>>>>          /* All states wait for ibf, so just do it here. */
> >>>>>          if (!check_ibf(kcs, status, time))
> >>>>>                  return SI_SM_CALL_WITH_DELAY;
> >>>>>
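
To illustrate the stuck-IBF case discussed in this thread, here is a minimal user-space sketch. It is not the driver code: the enum values, the bit masks (IBF as 0x02 and ATN as 0x04), the one-argument check_ibf(), and the names kcs_event_model() and kcs_model_demo() are simplified, hypothetical stand-ins, and every state other than IDLE is elided. It only compares the existing IBF check with the variant that skips the check in KCS_IDLE.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the driver's types (not kernel code). */
enum kcs_state { KCS_IDLE, KCS_WAIT_WRITE };
enum si_sm_result { SI_SM_IDLE, SI_SM_CALL_WITH_DELAY, SI_SM_ATTN };

/* Assumed status-register bit masks for this model. */
#define STATUS_IBF 0x02 /* input buffer full */
#define STATUS_ATN 0x04 /* BMC requests attention */

/* Returns 1 when IBF is clear, 0 while the BMC still holds it set. */
static int check_ibf(unsigned char status)
{
	return !(status & STATUS_IBF);
}

/*
 * One step of the modelled state machine.
 * skip_ibf_when_idle == 0: current behaviour, every state waits for IBF.
 * skip_ibf_when_idle == 1: variant from this thread, IDLE does not wait.
 */
enum si_sm_result kcs_event_model(enum kcs_state state, unsigned char status,
				  int skip_ibf_when_idle)
{
	if (!(skip_ibf_when_idle && state == KCS_IDLE) && !check_ibf(status))
		return SI_SM_CALL_WITH_DELAY;

	if (state == KCS_IDLE)
		return (status & STATUS_ATN) ? SI_SM_ATTN : SI_SM_IDLE;

	/* Handling of the non-idle states is elided in this sketch. */
	return SI_SM_CALL_WITH_DELAY;
}

/* Walks through the cases discussed in the thread. */
void kcs_model_demo(void)
{
	/* IBF stuck at 1 while idle: the current check asks to be called again. */
	assert(kcs_event_model(KCS_IDLE, STATUS_IBF, 0) == SI_SM_CALL_WITH_DELAY);
	/* With the variant, the idle state machine reports idle instead. */
	assert(kcs_event_model(KCS_IDLE, STATUS_IBF, 1) == SI_SM_IDLE);
	/* ATN is still seen while idle, even with IBF stuck. */
	assert(kcs_event_model(KCS_IDLE, STATUS_IBF | STATUS_ATN, 1) == SI_SM_ATTN);
	/* Non-idle states still wait for IBF to clear. */
	assert(kcs_event_model(KCS_WAIT_WRITE, STATUS_IBF, 1) == SI_SM_CALL_WITH_DELAY);
}
```

In this model, the first case corresponds to the repeated SI_SM_CALL_WITH_DELAY result that keeps the caller looping, and the third case is why the variant keeps ATN handling while an unconditional early return in KCS_IDLE would not. Whether the real hardware behaves like this when a BMC hangs is an assumption, as noted above.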