Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp3330664ybt; Mon, 29 Jun 2020 23:19:07 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx5qtuyuMCDhc5Pc4RLdMIBVecCGeUA5qEZTgxyuCsg9HGaOxUOh/FgO3sMdEoxu/AvIF+f X-Received: by 2002:aa7:d989:: with SMTP id u9mr3592963eds.85.1593497947044; Mon, 29 Jun 2020 23:19:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593497947; cv=none; d=google.com; s=arc-20160816; b=uGiDdWTk+AEWHXvciFkvGN6/mZDT3vbSNkb33zLyG7KBiEAcwbEhroLq4ZfEcKeTOa tNuhkZqxJU28tE+jtqBncYfnIXaB1R3Xvpri8wKcH8a+EmE0loBaJoPt6e3PaeKd00/J bMcLcprV7zl4v/H99EqBqbUnayfxwHry0I4lcmnvOoEPH+XLjP91RmEFZCilKQZYISoH C38HFm51MXiKJU3bsLIJuPewWnhOx3+nZC7SNuBUaay6DBurKOQzKvqmW23bUyULZV6O GVVtnc9xUN9qIGfP+SUmvS+tML6cz9C8bgK9NBNxzTxCTWNkExEmbDwcgcfENoco6Gk7 S1vg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=muRk0viGagP/ZRN+GQvLEdey6PLP16ppsmDc55IhS0E=; b=sV7SbaxliJj7iELHMAqNFNplHgZHOJHkVHmARYpgyHSVGInVbujDfGuOz6766HXacZ 79MpOHS9NMKw5/2eepBEseiJr1HRG+kRqLnaY6Rd5QZNxKP8fPXYosPHZKkCLaIjXORn utCvcgucsEW9qIErhDVsCzrLeDXbx8BxtviWAHgEkTWOJGPlbQpwvfPN6mga+bSsHkCV 8mp9+r9UggMJC79AIvznWm5WPI6rWsqvEIy8rGhlOG3FJGmWux1pFXpSnqc5Hn2Q4gfl DfBB0eoCQtNkM1v9v9xBcQlGLoDgPy3B3MGp0lxRMaghWNvHkUrlYeT/xVPIvIAqDo4A Tp0A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id o17si1130736edz.304.2020.06.29.23.18.44; Mon, 29 Jun 2020 23:19:07 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730011AbgF3GSG (ORCPT + 99 others); Tue, 30 Jun 2020 02:18:06 -0400 Received: from www262.sakura.ne.jp ([202.181.97.72]:61336 "EHLO www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725845AbgF3GSF (ORCPT ); Tue, 30 Jun 2020 02:18:05 -0400 Received: from fsav405.sakura.ne.jp (fsav405.sakura.ne.jp [133.242.250.104]) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id 05U6H3xt072997; Tue, 30 Jun 2020 15:17:03 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Received: from www262.sakura.ne.jp (202.181.97.72) by fsav405.sakura.ne.jp (F-Secure/fsigk_smtp/550/fsav405.sakura.ne.jp); Tue, 30 Jun 2020 15:17:03 +0900 (JST) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/fsav405.sakura.ne.jp) Received: from [192.168.1.9] (M106072142033.v4.enabler.ne.jp [106.72.142.33]) (authenticated bits=0) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTPSA id 05U6H2bZ072931 (version=TLSv1.2 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 30 Jun 2020 15:17:02 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Subject: Re: [PATCH v2 00/15] Make the user mode driver code a better citizen To: "Eric W. Biederman" , Alexei Starovoitov Cc: linux-kernel@vger.kernel.org, David Miller , Greg Kroah-Hartman , Kees Cook , Andrew Morton , Alexei Starovoitov , Al Viro , bpf , linux-fsdevel , Daniel Borkmann , Jakub Kicinski , Masahiro Yamada , Gary Lin , Bruno Meneguele , LSM List , Casey Schaufler , Luis Chamberlain , Linus Torvalds References: <20200625095725.GA3303921@kroah.com> <778297d2-512a-8361-cf05-42d9379e6977@i-love.sakura.ne.jp> <20200625120725.GA3493334@kroah.com> <20200625.123437.2219826613137938086.davem@davemloft.net> <87pn9mgfc2.fsf_-_@x220.int.ebiederm.org> <87y2oac50p.fsf@x220.int.ebiederm.org> <87bll17ili.fsf_-_@x220.int.ebiederm.org> <20200629221231.jjc2czk3ul2roxkw@ast-mbp.dhcp.thefacebook.com> <87y2o51hkk.fsf@x220.int.ebiederm.org> From: Tetsuo Handa Message-ID: <8137153b-6163-b52c-a1f8-f3949dc321bd@i-love.sakura.ne.jp> Date: Tue, 30 Jun 2020 15:16:57 +0900 User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0 MIME-Version: 1.0 In-Reply-To: <87y2o51hkk.fsf@x220.int.ebiederm.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2020/06/30 10:13, Eric W. Biederman wrote: > Alexei Starovoitov writes: > >> On Mon, Jun 29, 2020 at 02:55:05PM -0500, Eric W. Biederman wrote: >>> >>> I have tested thes changes by booting with the code compiled in and >>> by killing "bpfilter_umh" and running iptables -vnL to restart >>> the userspace driver. >>> >>> I have compiled tested each change with and without CONFIG_BPFILTER >>> enabled. >> >> With >> CONFIG_BPFILTER=y >> CONFIG_BPFILTER_UMH=m >> it doesn't build: >> >> ERROR: modpost: "kill_pid_info" [net/bpfilter/bpfilter.ko] undefined! >> >> I've added: >> +EXPORT_SYMBOL(kill_pid_info); >> to continue testing... kill_pid() is already exported. >> >> And then did: >> while true; do iptables -L;rmmod bpfilter; done >> >> Unfortunately sometimes 'rmmod bpfilter' hangs in wait_event(). >> >> I suspect patch 13 is somehow responsible: >> + if (tgid) { >> + kill_pid_info(SIGKILL, SEND_SIG_PRIV, tgid); >> + wait_event(tgid->wait_pidfd, !pid_task(tgid, PIDTYPE_TGID)); >> + bpfilter_umh_cleanup(info); >> + } >> >> I cannot figure out why it hangs. Some sort of race ? >> Since adding short delay between kill and wait makes it work. Because there is a race window that detach_pid() from __unhash_process() from __exit_signal() from release_task() from exit_notify() from do_exit() is called some time after wake_up_all(&pid->wait_pidfd) from do_notify_pidfd() from do_notify_parent() from exit_notify() from do_exit() was called (in other words, we can't use pid->wait_pidfd when pid_task() is used at wait_event()) ? Below are changes I suggest. diff --git a/kernel/umd.c b/kernel/umd.c index ff79fb16d738..f688813b8830 100644 --- a/kernel/umd.c +++ b/kernel/umd.c @@ -26,7 +26,7 @@ static struct vfsmount *blob_to_mnt(const void *data, size_t len, const char *na if (IS_ERR(mnt)) return mnt; - file = file_open_root(mnt->mnt_root, mnt, name, O_CREAT | O_WRONLY, 0700); + file = file_open_root(mnt->mnt_root, mnt, name, O_CREAT | O_WRONLY | O_EXCL, 0700); if (IS_ERR(file)) { mntput(mnt); return ERR_CAST(file); @@ -52,16 +52,18 @@ static struct vfsmount *blob_to_mnt(const void *data, size_t len, const char *na /** * umd_load_blob - Remember a blob of bytes for fork_usermode_driver - * @info: information about usermode driver - * @data: a blob of bytes that can be executed as a file - * @len: The lentgh of the blob + * @info: information about usermode driver (shouldn't be NULL) + * @data: a blob of bytes that can be executed as a file (shouldn't be NULL) + * @len: The lentgh of the blob (shouldn't be 0) * */ int umd_load_blob(struct umd_info *info, const void *data, size_t len) { struct vfsmount *mnt; - if (WARN_ON_ONCE(info->wd.dentry || info->wd.mnt)) + if (!info || !info->driver_name || !data || !len) + return -EINVAL; + if (info->wd.dentry || info->wd.mnt) return -EBUSY; mnt = blob_to_mnt(data, len, info->driver_name); @@ -76,15 +78,14 @@ EXPORT_SYMBOL_GPL(umd_load_blob); /** * umd_unload_blob - Disassociate @info from a previously loaded blob - * @info: information about usermode driver + * @info: information about usermode driver (shouldn't be NULL) * */ int umd_unload_blob(struct umd_info *info) { - if (WARN_ON_ONCE(!info->wd.mnt || - !info->wd.dentry || - info->wd.mnt->mnt_root != info->wd.dentry)) + if (!info || !info->driver_name || !info->wd.dentry || !info->wd.mnt) return -EINVAL; + BUG_ON(info->wd.mnt->mnt_root != info->wd.dentry); kern_unmount(info->wd.mnt); info->wd.mnt = NULL; @@ -138,7 +139,7 @@ static void umd_cleanup(struct subprocess_info *info) { struct umd_info *umd_info = info->data; - /* cleanup if umh_setup() was successful but exec failed */ + /* cleanup if umd_setup() was successful but exec failed */ if (info->retval) { fput(umd_info->pipe_to_umh); fput(umd_info->pipe_from_umh); @@ -163,7 +164,10 @@ int fork_usermode_driver(struct umd_info *info) const char *argv[] = { info->driver_name, NULL }; int err; - if (WARN_ON_ONCE(info->tgid)) + if (!info || !info->driver_name || !info->wd.dentry || !info->wd.mnt) + return -EINVAL; + BUG_ON(info->wd.mnt->mnt_root != info->wd.dentry); + if (info->tgid) return -EBUSY; err = -ENOMEM; diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c index 91474884ddb7..9dd70aacb81a 100644 --- a/net/bpfilter/bpfilter_kern.c +++ b/net/bpfilter/bpfilter_kern.c @@ -19,8 +19,13 @@ static void shutdown_umh(void) struct pid *tgid = info->tgid; if (tgid) { - kill_pid_info(SIGKILL, SEND_SIG_PRIV, tgid); - wait_event(tgid->wait_pidfd, !pid_task(tgid, PIDTYPE_TGID)); + kill_pid(tgid, SIGKILL, 1); + while (({ bool done; + rcu_read_lock(); + done = !pid_task(tgid, PIDTYPE_TGID); + rcu_read_unlock(); + done; })) + schedule_timeout_uninterruptible(1); bpfilter_umh_cleanup(info); } }