Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp4011269ybc; Tue, 26 Nov 2019 02:17:22 -0800 (PST) X-Google-Smtp-Source: APXvYqyEW/hBOAdlO0ObQlJXpaaaal1yV0qQO76cK62oInipUMPC9j3up+BS/dVdAZlqxrO/ZVUS X-Received: by 2002:a50:e002:: with SMTP id e2mr24555243edl.278.1574763442054; Tue, 26 Nov 2019 02:17:22 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1574763442; cv=none; d=google.com; s=arc-20160816; b=rDmeUQNcHkNdgxszPS3ImqH94ZGbsNAL5XKAs8YHLzJxlLCaoWmnCNrowDuJmp0Gaz 4B6GFe4/DcuPgnEAZH7C0srq8wPrtkidT3e8+7Gn0S9SACdDObg5pg4czV1Xrs2F8DjH Lb6nHAZMx3/v8H/oNU7wg9ixPGBjh8ogBf7hoQH/d77DMYp9p6VxdXVjMXcrvqHUTTzB uckXEdgUZY2GPX/8/iL3BKUYAkopgG2S6VgAE3gTIJQzPOh8e+Lx+AMIy986IhwdHrLP WHHt7p7PBJd53Gamw1w7IfKE+yrYo2KOORGcTM8YV4Mspf8xoPbSvm2SLxCVk9Zt0vY5 6L9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :date:in-reply-to:references:message-id:cc:to:subject:from :dkim-signature:dkim-signature; bh=mqzlyKykmpv5Iq5zfQK/GYgxax6QSycqOmjhHdZR9Rg=; b=MvKuh3CgwUkbUktchj37I8BlSjSMZEpWzsVHX4u/162vVlzuC+uWVYyuWAnwRu5/ix 2Tu9Ii3Z5sPMZWH7MTduNVL69XEDKm6olaswKoEw2z3UjxNsR5uV0HUYKWNQXv8FPf9P GnGezpgyGOA+LSFPYTVCAUUns2NQeKfVK7c2Th/ZRaz1jgEoYVDSvHr90i2KJpCLsyE0 VDRHF70J40bEmiJRpl73sGzZV9uhhTbTG7J2ryGejAGHZYhoJJgxCJyDlXQBtjtkTjZA Mq0mfURJ91BfhEcXbYDadQc11Po/jBdqY5TdTzOrzkbBewx4rdYDNvHhipWZbXbtj9+g NnJA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@nexedi.com header.s=mandrill header.b=YgVNqyN4; dkim=pass header.i=@mandrillapp.com header.s=mandrill header.b=DWFA5Ia9; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id bs15si7090600edb.408.2019.11.26.02.16.57; Tue, 26 Nov 2019 02:17:22 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@nexedi.com header.s=mandrill header.b=YgVNqyN4; dkim=pass header.i=@mandrillapp.com header.s=mandrill header.b=DWFA5Ia9; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727731AbfKZKPW (ORCPT + 99 others); Tue, 26 Nov 2019 05:15:22 -0500 Received: from mail187-16.suw11.mandrillapp.com ([198.2.187.16]:42022 "EHLO mail187-16.suw11.mandrillapp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727388AbfKZKPW (ORCPT ); Tue, 26 Nov 2019 05:15:22 -0500 X-Greylist: delayed 900 seconds by postgrey-1.27 at vger.kernel.org; Tue, 26 Nov 2019 05:15:21 EST DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; s=mandrill; d=nexedi.com; h=From:Subject:To:Cc:Message-Id:References:In-Reply-To:Date:MIME-Version:Content-Type:Content-Transfer-Encoding; i=kirr@nexedi.com; bh=mqzlyKykmpv5Iq5zfQK/GYgxax6QSycqOmjhHdZR9Rg=; b=YgVNqyN4m6yRCFOuhC3OQCjK+9Sqd3Ni4orjJofIqNBgPXQFvO9L51m3pmgTfm0yaQ7FozUlN/ZF fbzLDamZfqN2aRCbT21DTwfQHTIomDzH5qhlOuooT+vZqqQ+u+AfNJUDhdyyPMXCOstaXtKJcaC7 7goOjFFYEswGOwi6kaY= Received: from pmta01.mandrill.prod.suw01.rsglab.com (127.0.0.1) by mail187-16.suw11.mandrillapp.com id hrjtji174i4h for ; Tue, 26 Nov 2019 10:00:20 +0000 (envelope-from ) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mandrillapp.com; i=@mandrillapp.com; q=dns/txt; s=mandrill; t=1574762420; h=From : Subject : To : Cc : Message-Id : References : In-Reply-To : Date : MIME-Version : Content-Type : Content-Transfer-Encoding : From : Subject : Date : X-Mandrill-User : List-Unsubscribe; bh=mqzlyKykmpv5Iq5zfQK/GYgxax6QSycqOmjhHdZR9Rg=; b=DWFA5Ia94m/kn/9k/ddFNhjQuSTnTFeHMPhozcbn5izKrImfFMBqscQLJVztox2sCmSx+b lHfdliLn5jE7gXlIhxwpz0XHSOqRGNbeseaN07VbFLpdbB1a1OvtGduKdehJtXmdp1W1kCCT L3QbY2OaAD8qjlnWZtp0cARmvy6wo= From: Kirill Smelkov Subject: Re: Commit 0be0ee71 ("fs: properly and reliably lock f_pos in fdget_pos()") breaking userspace Received: from [87.98.221.171] by mandrillapp.com id ea7ed32e943f45b58e80e7c553ee0419; Tue, 26 Nov 2019 10:00:20 +0000 To: Linus Torvalds Cc: "Kenneth R. Crudup" , Linux Kernel Mailing List Message-Id: <20191126100010.GA15941@deco.navytux.spb.ru> References: In-Reply-To: X-Report-Abuse: Please forward a copy of this message, including all headers, to abuse@mandrill.com X-Report-Abuse: You can also report abuse here: http://mandrillapp.com/contact/abuse?id=31050260.ea7ed32e943f45b58e80e7c553ee0419 X-Mandrill-User: md_31050260 Date: Tue, 26 Nov 2019 10:00:20 +0000 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 25, 2019 at 07:39:34PM -0800, Linus Torvalds wrote: > On Mon, Nov 25, 2019 at 7:21 PM Linus Torvalds > wrote: > > > > Of course, this may fix the f_pos locking issue, but replace it with a > > "oops, the character device driver tried to look at *pos anyway", and > > that will give you a nice OOPS instead. > > Confirmed. At least the x86 firmware update code uses > "simple_read_from_buffer()", which does use the file position, but > doesn't actually allow llseek(). > > So no, "it's a character device no llseek" does not mean that it acts > as a pure streaming device with no file position, and we'd actually > have to mark individual drivers (either by adding 'stream_open()' in > their open routines, or adding the extra field to 'struct > file_operations' that I mentioned). > > I think I'll have to revert that trial commit. I'll give it another > day in case somebody has a better idea, but it looks like it's too > early to do that nice cleanup as things are now. > > Linus Linus, thanks for heads up and for pushing FMODE_STREAM conversion forward. I still believe the most practical approach to do it is via automation - via extending and teaching stream_open.cocci to find other places that actually don't use ppos / ki->ki_pos at all and at least warn that stream_open() should be used for that file. Else it will be a lot of pain. Of course we can manually convert the core cases like pipes and sockets. But those things are easy to do. On the other hand by manually converting them, we lesser the extent to where the automation is applied and tested, and that leaves us with just many small corner cases that will be left in potentially racy/deadlocked state until someone actually hit that problem and cares to debug it to understand what is going on. Those will be many different places and so likely many different people, and since debugging concurrency issues is not easy, it will likely last for many years. Of course things like KCSAN helps a lot, but, if I understand correctly, they do not provide full coverage for the whole kernel, especially they are likely not providing coverage for leaf kernel bits in drivers. Of course, I might be missing something, and there is a way to e.g. combine automation+pain approaches. Then that would be good to make combined progress. I still keep in mind extending stream_open.cocci myself, but for now I cannot delve into it before finishing my transactional filesystem... Kirill P.S. even though I was put into cc there, somehow I did not received any notification email for commits d8e464ecc17b (vfs: mark pipes and sockets as stream-like file descriptors) and 0be0ee71816b (vfs: properly and reliably lock f_pos in fdget_pos()). P.P.S completely untested, but looks sane: ---- 8< ---- From 634e1af8775aa27799b4879fdb527e4a3b8b31ef Mon Sep 17 00:00:00 2001 From: Kirill Smelkov Date: Tue, 26 Nov 2019 12:40:56 +0300 Subject: [PATCH] socket: No need to check for ki_pos != 0 anymore Commit d8e464ecc17b (vfs: mark pipes and sockets as stream-like file descriptors) converted sockets to use stream_open() which forbids llseek and sets FMODE_STREAM for which VFS ksys_read/ksys_write routines make sure not to change file->f_pos at all and always pass either ppos=NULL, or ki->ki_pos=0 to fops->read/write, because there is no file position for stream-like files. Drop unnecessary checks. Signed-off-by: Kirill Smelkov --- net/socket.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/net/socket.c b/net/socket.c index 17bc1eee198a..d8c583c755ae 100644 --- a/net/socket.c +++ b/net/socket.c @@ -959,9 +959,6 @@ static ssize_t sock_read_iter(struct kiocb *iocb, struct iov_iter *to) if (file->f_flags & O_NONBLOCK) msg.msg_flags = MSG_DONTWAIT; - if (iocb->ki_pos != 0) - return -ESPIPE; - if (!iov_iter_count(to)) /* Match SYS5 behaviour */ return 0; @@ -978,9 +975,6 @@ static ssize_t sock_write_iter(struct kiocb *iocb, struct iov_iter *from) .msg_iocb = iocb}; ssize_t res; - if (iocb->ki_pos != 0) - return -ESPIPE; - if (file->f_flags & O_NONBLOCK) msg.msg_flags = MSG_DONTWAIT; -- 2.20.1