Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp9447133imu; Sat, 29 Dec 2018 20:02:34 -0800 (PST) X-Google-Smtp-Source: ALg8bN5LIlhXJHlbmiW0tjXjd/g5AI2gwrW0DYLXmki8zOl8cYOQbglSBUlQxMIpB9FCCEcGdXOg X-Received: by 2002:a65:5a4c:: with SMTP id z12mr3516678pgs.188.1546142554472; Sat, 29 Dec 2018 20:02:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1546142554; cv=none; d=google.com; s=arc-20160816; b=LC7iuUCTHbp/s3oQ//rjgD3Bu/EgYxIFccq85YQeS6/3AfNxHkNLzq0RXAmHDdJYhS rsP6KybA8mmbdalVTXLanLwD2LkjPTiVRdr54n1r9izmBquAlQmxgTY9wzFWPPkj7ZSk Afj6vf7XBzsxcZoHCqJXtY9VNG5bDUm01KbPjfIzlYSfe09Ld2B5AxfuBLApvLkkvjpI iC7TINKnB+/aut1ScHtZ6ZZIwm01Mwvaw7R0nWOwceW5hLqqCfjG189QRZNGYiE7cPJ3 kvgXiFso7gvWd5oKDFcsaZcQ6rJzqFQxqP7iYvb1/nAMzMwk55g/IZ3ndJm3YW71J4xk Uscw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:organization:references:to:from:subject; bh=TZd00JitEDgP+yZ6uzrTWfC5DifsRTVEi0xDqkGcK6Y=; b=Ky6Van6+4G5kjgbU1a3mdtzLXmHtpGnPaHfhShT7SEKo3PGZqvhUfqu9IiXVlr74ex ONzLKVbrp/Z/5MSoeGeBKxs8vq1Myw7sNrnlkzL5v3p3OYTP1Yp2FkvoqrvnDQ4CaFAM ggEooDfS9P7NW5jYhIJFz0OLfBfy+daM22CHoUwSr2NnQw58ZAjmSMlVKSdO+ScymggR UZbcKCxi8uWa9zPNbSd5RgUAwT7WsNg84sezerW/k5gVtv9mzKu5PPxe0aZc80Qhu6Qv jFna6UJZzspXsxbZlvNSG4oV/A45dd/VpNU2U565RGs3inL574wIWQGv1XqMBC+/KI+s P9eA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x8si39536751pll.187.2018.12.29.20.02.03; Sat, 29 Dec 2018 20:02:34 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725950AbeL3EAl (ORCPT + 99 others); Sat, 29 Dec 2018 23:00:41 -0500 Received: from mail.nwra.com ([72.52.192.72]:52746 "EHLO mail.nwra.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725892AbeL3EAl (ORCPT ); Sat, 29 Dec 2018 23:00:41 -0500 Received: from pacas.cora.nwra.com (c-67-166-25-97.hsd1.co.comcast.net [67.166.25.97]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mail.nwra.com (Postfix) with ESMTPSA id 643B33404AA; Sat, 29 Dec 2018 20:00:38 -0800 (PST) Subject: Re: [PATCH] fanotify: allow freeze on suspend when waiting for response from userspace From: Orion Poplawski To: linux-kernel@vger.kernel.org, Vivek Trivedi , Jan Kara References: Organization: NorthWest Research Associates Message-ID: <41fc7839-f2f1-22d8-b038-5f5ed577ac3b@nwra.com> Date: Sat, 29 Dec 2018 21:00:28 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 12/29/18 3:34 PM, Orion Poplawski wrote: > On 12/29/18 3:04 PM, Orion Poplawski wrote: >>> On Thu 22-02-18 15:14:54, Kunal Shubham wrote: >>>> >> On Fri 16-02-18 15:14:40, t.vivek@samsung.com wrote: >>>> >> From: Vivek Trivedi >>>> >> >> If fanotify userspace response server thread is frozen first, >>>> >> it may fail to send response from userspace to kernel space >>>> listener. >>>> >> In this scenario, fanotify response listener will never get response >>>> >> from userepace and fail to suspend. >>>> >> >> Use freeze-friendly wait API to handle this issue. >>>> >> >> Same problem was reported here: >>>> >> https://bbs.archlinux.org/viewtopic.php?id=232270 >>>> >> >> Freezing of tasks failed after 20.005 seconds >>>> >> (1 tasks refusing to freeze, wq_busy=0) >>>> >> >> Backtrace: >>>> >> [] (__schedule) from [] (schedule+0x4c/0xa4) >>>> >> [] (schedule) from [] >>>> (fanotify_handle_event+0x1c8/0x218) >>>> >> [] (fanotify_handle_event) from [] >>>> (fsnotify+0x17c/0x38c) >>>> >> [] (fsnotify) from [] >>>> (security_file_open+0x88/0x8c) >>>> >> [] (security_file_open) from [] >>>> (do_dentry_open+0xc0/0x338) >>>> >> [] (do_dentry_open) from [] (vfs_open+0x54/0x58) >>>> >> [] (vfs_open) from [] >>>> (do_last.isra.10+0x45c/0xcf8) >>>> >> [] (do_last.isra.10) from [] >>>> (path_openat+0x424/0x600) >>>> >> [] (path_openat) from [] >>>> (do_filp_open+0x3c/0x98) >>>> >> [] (do_filp_open) from [] >>>> (do_sys_open+0x120/0x1e4) >>>> >> [] (do_sys_open) from [] (SyS_open+0x28/0x2c) >>>> >> [] (SyS_open) from [] >>>> (__sys_trace_return+0x0/0x20) >>>> > >>>> > Yeah, good catch. >>>> > >>>> >> @@ -63,7 +64,9 @@ static int fanotify_get_response(struct >>>> fsnotify_group *group, >>>> >> >>      pr_debug("%s: group=%p event=%p\n", __func__, group, event); >>>> >> >> -    wait_event(group->fanotify_data.access_waitq, >>>> event->response); >>>> >> +    while (!event->response) >>>> >> +        wait_event_freezable(group->fanotify_data.access_waitq, >>>> >> +                     event->response); >>>> > >>>> > But if the process gets a signal while waiting, we will just >>>> livelock the >>>> > kernel in this loop as wait_event_freezable() will keep returning >>>> > ERESTARTSYS. So you need to be a bit more clever here... >>>> >>>> Hi Jack, >>>> Thanks for the quick review. >>>> To avoid livelock issue, is it fine to use below change? If agree, I >>>> will send v2 patch. >>>> >>>> @@ -63,7 +64,11 @@ static int fanotify_get_response(struct >>>> fsnotify_group *group, >>>> >>>>         pr_debug("%s: group=%p event=%p\n", __func__, group, event); >>>> >>>> -       wait_event(group->fanotify_data.access_waitq, event->response); >>>> +       while (!event->response) { >>>> +               if >>>> (wait_event_freezable(group->fanotify_data.access_waitq, >>>> +                                       event->response)) >>>> +                       flush_signals(current); >>>> +       } >>> >>> Hum, I don't think this is correct either as this way if any signal was >>> delivered while waiting for fanotify response, we'd just lose it while >>> previously it has been properly handled. So what I think needs to be >>> done >>> is that we just use wait_event_freezable() and propagate non-zero return >>> value (-ERESTARTSYS) up to the caller to handle the signal and >>> restart the >>> syscall as necessary. >>> >>>                                 Honza >>> -- >>> Jan Kara >>> SUSE Labs, CR >> >> Is there any progress here?  This has become a real pain for us while >> running BitDefender on EL7 laptops.  I tried applying the following to >> the EL7 kernel: >> >> diff -up >> linux-3.10.0-957.1.3.el7.x86_64/fs/notify/fanotify/fanotify.c.orig >> kernel-3.10.0-957.1.3.el7/linux-3.10.0-957.1.3.el7.x86_64/fs/notify/fanotify/fanotify.c >> >> --- linux-3.10.0-957.1.3.el7.x86_64/fs/notify/fanotify/fanotify.c.orig >> 2018-11-15 10:07:13.000000000 -0700 >> +++ linux-3.10.0-957.1.3.el7.x86_64/fs/notify/fanotify/fanotify.c >> 2018-12-28 15:44:26.452895337 -0700 >> @@ -9,6 +9,7 @@ >>   #include >>   #include >>   #include >> +#include >> >>   #include "fanotify.h" >> >> @@ -64,7 +65,12 @@ static int fanotify_get_response(struct >> >>          pr_debug("%s: group=%p event=%p\n", __func__, group, event); >> >> -       wait_event(group->fanotify_data.access_waitq, event->response); >> +       while (!event->response) { >> +               ret = >> wait_event_freezable(group->fanotify_data.access_waitq, >> +                                          event->response); >> +               if (ret < 0) >> +                       return ret; >> +       } >> >>          /* userspace responded, convert to something usable */ >>          switch (event->response & ~FAN_AUDIT) { >> >> but I get a kernel panic shortly after logging in to the system. >> I tried a slightly different patch to see if setting event->response = 0 helps and to confirm the return value of wait_event_freezable: --- linux-3.10.0-957.1.3.el7/fs/notify/fanotify/fanotify.c 2018-11-15 10:07:13.000000000 -0700 +++ linux-3.10.0-957.1.3.el7.fanotify.x86_64/fs/notify/fanotify/fanotify.c 2018-12-29 16:05:53.451125868 -0700 @@ -9,6 +9,7 @@ #include #include #include +#include #include "fanotify.h" @@ -64,7 +65,15 @@ pr_debug("%s: group=%p event=%p\n", __func__, group, event); - wait_event(group->fanotify_data.access_waitq, event->response); + while (!event->response) { + ret = wait_event_freezable(group->fanotify_data.access_waitq, + event->response); + if (ret < 0) { + pr_debug("%s: group=%p event=%p about to return ret=%d\n", __func__, + group, event, ret); + goto finish; + } + } /* userspace responded, convert to something usable */ switch (event->response & ~FAN_AUDIT) { @@ -75,7 +84,7 @@ default: ret = -EPERM; } - +finish: /* Check if the response should be audited */ if (event->response & FAN_AUDIT) audit_fanotify(event->response & ~FAN_AUDIT); and I enabled the pr_debug. This does indeed trigger the panic: [ 4181.113781] fanotify_get_response: group=ffff9e3af9952b00 event=ffff9e3aea426c80 about to return ret=-512 [ 4181.113788] ------------[ cut here ]------------ [ 4181.113804] WARNING: CPU: 0 PID: 24290 at fs/notify/notification.c:84 fsnotify_destroy_event+0x6b/0x70 So it appears that the notify system cannot handle simply passing -ERESTARTSYS back up the stack here. -- Orion Poplawski Manager of NWRA Technical Systems 720-772-5637 NWRA, Boulder/CoRA Office FAX: 303-415-9702 3380 Mitchell Lane orion@nwra.com Boulder, CO 80301 https://www.nwra.com/