Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp4002131ybl; Tue, 21 Jan 2020 11:01:01 -0800 (PST) X-Google-Smtp-Source: APXvYqx8Qy4QQEp5dU+gygyVsmeD34RNBVw6fMPLXyr4UIrjy8qhUL68YIHH+mLY2B4c/gXfypYn X-Received: by 2002:aca:4a87:: with SMTP id x129mr4030686oia.165.1579633260941; Tue, 21 Jan 2020 11:01:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1579633260; cv=none; d=google.com; s=arc-20160816; b=wnHdv7navQ4uJVbgZiNTg7RwEUlk/+OPWLhb1qMzBnBWkpx+eZ1RH2EgTGk3tDo6Cl e473MXMymNHs4JDyY7gfjCEgwnb4+jsAkOsxKoy3ssYwl6sE2SL9iWHiRpmzmQlhSf5r tctTh7IXovn69ewQm5MDvvCyizaxMr5+o+yLuDorxwTX46u+vMpKH82+S+zJYSW3TfOg WgimgkYS069Yk9auTPvgrWb1P7kPXZpkbHSj/HK1wav8XjPMBfkiBMQ1/ml9NlOeIMFr nrKk/Es06yEak28gOIOdkPpdJHpMfenrIUYY1bbP7MwKQNfgkk7Dy2xEAx1zcRPLgdwq h3Hw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=AUfBB+a1tjStZZGj1K4kk1azlBm3XMuCDaW2D25A41s=; b=ZGNCv+PZ2KF27cbEvngcWoVPnsTALBco4+HtHnHPYO4vAjrTFMCjC0TUcw+6o0Jgz6 OMJSuH3eSomRiyQyp7R30QZyifxceY8BmoVwWUqrKbGZ8r60uTeOZIktEm9LpB7ijrSl DLZg7CCRqT0PWLiZ0VzOvC/fgWLL1k4un7RYE/Htcudc+Pw9dGQ0NyS6M5v2/RuL9ulk Rcy4RfDe/uG0tnn81skBLZAi1LBIpxCYPtpwKzzgMylUmQcsoxhz/0vNWf3yrws4tbz1 SYcYq6BhSr4qIaeXwTjOI01538FSysLcnQXQztqyJ5hiYGehGgw8L/HZvw9VzvA1ca9S HbaQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id s1si20855860oic.234.2020.01.21.11.00.47; Tue, 21 Jan 2020 11:01:00 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729224AbgAUS7r (ORCPT + 99 others); Tue, 21 Jan 2020 13:59:47 -0500 Received: from mx2.suse.de ([195.135.220.15]:60228 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728829AbgAUS7r (ORCPT ); Tue, 21 Jan 2020 13:59:47 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 514D1AE44; Tue, 21 Jan 2020 18:59:45 +0000 (UTC) Date: Tue, 21 Jan 2020 10:52:50 -0800 From: Davidlohr Bueso To: Alexey Mateosyan Cc: Andrew Morton , Al Viro , Markus Elfring , Li Rongqing , Arnd Bergmann , Kees Cook , David Howells , linux-kernel@vger.kernel.org Subject: Re: mq_notify loose notification request on fork() Message-ID: <20200121185250.fk6nkcw5nqbkab5o@linux-p48b> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20180716 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 14 Jan 2020, Alexey Mateosyan wrote: >Reported bug on bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=206207 >Duplicating here: > >Bug is related to implementation of POSIX mqueues, specifically >mq_notify syscall.\n > >The problem is: if we do subscription on notifications with help of >mq_notify() syscall, and then do fork() and this fork() fail for some >reason (for example there is pending signals for the parent process), >then we loose the subscription on the notification. > >The root cause is: during copy_process() we duplicate opened file >descriptors, and then if something goes wrong inside copy_process() >(for example there is pending signals for the parent process) we do >cleanup for the duplicated file descriptors via exit_files() which is >finally calling filp_close() and flush(). From the other side, current >implementation of ipc/mqueue.c implements file_operations.flush so >that it removes notification for the current process, which is >certainly not designed(desired) behavior. > >There is another case how to reproduce this behavior. Do fd = >mq_open("my_queue"); mq_notify(fd, ...); and never do close(fd) to >preserve notifications. Then if we open() and close() the queue file >from the same process - we loose the notification request >(subscription) for the still opened fd. I.e. current behavior is the >first close() removes notification. Is this a real issue? Do you have an actual reproducer? See below. > >If we do remove_notification() in file_ops.release instead of >file_ops.flush in mqueue.c then this change fixes the described above >fork() related problem. But this will change the behavior so that the >last close() will remove the notifications. Right, we don't want to change any user-visible behavior; and only address the failed fork case. I guess mqueue_flush_file() really wants to be using the result of copy_process() task_struct pointer for its cleanup, instead of the parent 'current'. The following does this but adds a way of getting the pointer back from files_struct, perhaps you could test this? Alternatively we could 'hijack' the fl_owner_t id before performing f_op->flush() callback in filp_close() -- afaict no users actually use the fl_owner_t field during flush callback, then we could just restore it back to 'files' before dnotify_flush() etc paths, but I'm not certain and is rather hacky to begin with. Thanks, Davidlohr --------8<-------------------------------------------------------- diff --git a/fs/file.c b/fs/file.c index 2f4fcf985079..70e5fe1c70d5 100644 --- a/fs/file.c +++ b/fs/file.c @@ -440,6 +440,7 @@ void exit_files(struct task_struct *tsk) if (files) { task_lock(tsk); + files->tsk = NULL; tsk->files = NULL; task_unlock(tsk); put_files_struct(files); diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index f07c55ea0c22..29f906157b73 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -55,6 +55,7 @@ struct files_struct { struct fdtable __rcu *fdt; struct fdtable fdtab; + struct task_struct *tsk; /* * written part on a separate cache line in SMP */ diff --git a/ipc/mqueue.c b/ipc/mqueue.c index 3d920ff15c80..22300f0fb35a 100644 --- a/ipc/mqueue.c +++ b/ipc/mqueue.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include @@ -589,10 +590,12 @@ static ssize_t mqueue_read_file(struct file *filp, char __user *u_data, static int mqueue_flush_file(struct file *filp, fl_owner_t id) { struct mqueue_inode_info *info = MQUEUE_I(file_inode(filp)); + struct files_struct *files = id; spin_lock(&info->lock); - if (task_tgid(current) == info->notify_owner) + if (task_tgid(files->tsk) == info->notify_owner) { remove_notification(info); + } spin_unlock(&info->lock); return 0; diff --git a/kernel/fork.c b/kernel/fork.c index 2508a4f238a3..8a85858db9f5 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1467,6 +1467,7 @@ static int copy_files(unsigned long clone_flags, struct task_struct *tsk) if (!newf) goto out; + newf->tsk = tsk; tsk->files = newf; error = 0; out: