Date: Fri, 22 Jun 2018 11:56:08 +0200
From: Christoph Hellwig
To: Linus Torvalds
Cc: kernel test robot, Al Viro, Christoph Hellwig, Greg Kroah-Hartman, "Darrick J. Wong", Linux Kernel Mailing List, LKP
Subject: Re: [lkp-robot] [fs] 3deb642f0d: will-it-scale.per_process_ops -8.8% regression
Message-ID: <20180622095608.GA12263@lst.de>
References: <20180622082752.GX11011@yexl-desktop>

On Fri, Jun 22, 2018 at 06:25:45PM +0900, Linus Torvalds wrote:
> What was the alleged advantage of the new poll methods again? Because
> it sure isn't obvious - not from the numbers, and not from the commit
> messages.

The primary goal is that we can implement a race-free aio poll; the
primary benefit is that we can get rid of the currently racy and
bug-prone way we do in-kernel poll-like calls for things like eventfd.
The first is clearly in 4.18-rc and provides massive performance
advantages if used; the second is not there yet, more on that below.

> I was assuming there was a good reason for it, but looking closer I
> see absolutely nothing but negatives. The argument that keyed wake-ups
> somehow make multiple wake-queues irrelevant doesn't hold water when
> the code is more complex and apparently slower. It's not like anybody
> ever *had* to use multiple wait-queues, but the old code was both
> simpler and cleaner and *allowed* you to use multiple queues if you
> wanted to.

It wasn't cleaner at all if you aren't poll or select, and even for
those it isn't exactly clean; see the whole mess around ->qproc.

> The disadvantages are obvious: every poll event now causes *two*
> indirect branches to the low-level filesystem or driver - one to get
> the poll head, and one to get the mask. Add to that all the new "do we
> have the new-style or old sane poll interface" tests, and poll is
> obviously more complicated.

It already caused two, and now we have three thanks to ->qproc. One of
the advantages of the new code is that we can eventually get rid of
->qproc once all users of a non-default qproc are switched away from
vfs_poll. That requires a little more work, but I have patches for it
that I will post soon.

> If we could get the poll head by just having a direct pointer in the
> 'struct file', maybe that would be one thing. As it is, this all
> literally just adds overhead for no obvious reason. It replaced one
> simple direct call with two dependent but separate ones.

People are doing weird things with their poll heads, so we can't do
that unconditionally. We could however offer a waitqueue pointer in
struct file and most users would be very happy with that.

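To make the struct file idea a bit more concrete, here is a rough
userspace sketch of how such a pointer could short-circuit the
->get_poll_head call. The f_poll_head field and the file_poll_head()
helper are names made up for illustration, the types are stand-ins
rather than the kernel definitions, and the odd_* driver only exists to
show the fallback for files that pick their queue at poll time:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins, not the real kernel definitions. */
struct wait_queue_head { int unused; };
struct file;

struct file_operations {
	/* Only needed by files that do unusual things with their poll heads. */
	struct wait_queue_head *(*get_poll_head)(struct file *file,
						 unsigned int events);
};

struct file {
	const struct file_operations *f_op;
	/* HYPOTHETICAL field: direct pointer for the common one-queue case. */
	struct wait_queue_head *f_poll_head;
};

/* HYPOTHETICAL helper: use the direct pointer if set, else ask the driver. */
static struct wait_queue_head *file_poll_head(struct file *file,
					      unsigned int events)
{
	assert(file && file->f_op);
	if (file->f_poll_head)
		return file->f_poll_head;	/* no indirect call needed */
	if (file->f_op->get_poll_head)
		return file->f_op->get_poll_head(file, events);
	return NULL;				/* nothing to wait on */
}

/* Example driver that chooses its queue at poll time. */
static struct wait_queue_head odd_queue;
static int odd_calls;

static struct wait_queue_head *odd_get_poll_head(struct file *file,
						 unsigned int events)
{
	(void)file;
	(void)events;
	odd_calls++;
	return &odd_queue;
}
```

A file that sets f_poll_head never reaches the indirect call, while a
file that leaves it NULL behaves as it does today.
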
In the meantime below is an ugly patch that removes the _qproc indirect
call for ->poll only (a similar patch is possible for select, assuming
the code uses select). And for the next merge window I plan to kill it
off entirely.

How can we get this thrown into the will-it-scale run?

---
From 50ca47fdcfec0a1af56aac6db8a168bb678308a5 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig
Date: Fri, 22 Jun 2018 11:36:26 +0200
Subject: fs: optimize away ->_qproc indirection for poll_mask based polling

Signed-off-by: Christoph Hellwig
---
 fs/select.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/fs/select.c b/fs/select.c
index bc3cc0f98896..54406e0ad23e 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -845,7 +845,25 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
 			/* userland u16 ->events contains POLL... bitmap */
 			filter = demangle_poll(pollfd->events) | EPOLLERR | EPOLLHUP;
 			pwait->_key = filter | busy_flag;
-			mask = vfs_poll(f.file, pwait);
+			if (f.file->f_op->poll) {
+				mask = f.file->f_op->poll(f.file, pwait);
+			} else if (file_has_poll_mask(f.file)) {
+				struct wait_queue_head *head;
+
+				head = f.file->f_op->get_poll_head(f.file, pwait->_key);
+				if (!head) {
+					mask = DEFAULT_POLLMASK;
+				} else if (IS_ERR(head)) {
+					mask = EPOLLERR;
+				} else {
+					if (pwait->_qproc)
+						__pollwait(f.file, head, pwait);
+					mask = f.file->f_op->poll_mask(f.file, pwait->_key);
+				}
+			} else {
+				mask = DEFAULT_POLLMASK;
+			}
+
 			if (mask & busy_flag)
 				*can_busy_poll = true;
 			mask &= filter;		/* Mask out unneeded events. */
-- 
2.17.1
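
For anyone who wants to poke at the control flow outside the kernel,
the dispatch order in the hunk above can be modeled in plain userspace
C. This is only a sketch under stated assumptions: the types, the
constant values and every model_*/demo_* name are stand-ins invented
here, the wants_wait flag models the pwait->_qproc test, and the check
for both methods being present is assumed to correspond to
file_has_poll_mask():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values and types for a userspace model; not kernel headers. */
#define EPOLLIN          0x001u
#define EPOLLERR         0x008u
#define DEFAULT_POLLMASK 0x145u

struct wait_queue_head { int unused; };
struct file;

struct poll_table {
	int wants_wait;		/* models pwait->_qproc != NULL */
	unsigned int key;	/* models pwait->_key */
};

struct file_operations {
	unsigned int (*poll)(struct file *file, struct poll_table *pt);
	struct wait_queue_head *(*get_poll_head)(struct file *file,
						 unsigned int events);
	unsigned int (*poll_mask)(struct file *file, unsigned int events);
};

struct file { const struct file_operations *f_op; };

/* Models IS_ERR(): error codes occupy the top 4095 pointer values. */
static int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-4095;
}

static int pollwait_calls;	/* counts modeled __pollwait() calls */

static void model_pollwait(struct file *file, struct wait_queue_head *head,
			   struct poll_table *pt)
{
	(void)file; (void)head; (void)pt;
	pollwait_calls++;
}

/* Same branch order as the patched do_pollfd() hunk. */
static unsigned int model_do_poll(struct file *file, struct poll_table *pt)
{
	const struct file_operations *op;
	struct wait_queue_head *head;

	assert(file && file->f_op && pt);
	op = file->f_op;
	if (op->poll)				/* old-style method wins */
		return op->poll(file, pt);
	if (!op->get_poll_head || !op->poll_mask)
		return DEFAULT_POLLMASK;	/* no poll support at all */
	head = op->get_poll_head(file, pt->key);
	if (!head)
		return DEFAULT_POLLMASK;
	if (model_is_err(head))
		return EPOLLERR;
	if (pt->wants_wait)			/* direct call, no ->_qproc hop */
		model_pollwait(file, head, pt);
	return op->poll_mask(file, pt->key);
}

/* Demo drivers used to exercise the branches. */
static struct wait_queue_head demo_queue;

static struct wait_queue_head *demo_head(struct file *f, unsigned int ev)
{
	(void)f; (void)ev;
	return &demo_queue;
}

static struct wait_queue_head *demo_head_err(struct file *f, unsigned int ev)
{
	(void)f; (void)ev;
	return (struct wait_queue_head *)(intptr_t)-5;	/* arbitrary errno */
}

static unsigned int demo_mask(struct file *f, unsigned int ev)
{
	(void)f; (void)ev;
	return EPOLLIN;
}
```

The point of the model is just that the wait-queue registration becomes
a direct call once the head is known, so a new-style file pays for
->get_poll_head and ->poll_mask but not for a third hop through
->_qproc.
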