Date: Thu, 13 Jul 2023 10:56:19 +0200
From: Christian Brauner
To: wenyang.linux@foxmail.com
Cc: Alexander Viro, Jens Axboe, Christoph Hellwig, Dylan Yudaken,
	David Woodhouse, Matthew Wilcox, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] eventfd: avoid unnecessary wakeups in eventfd_write()
Message-ID: <20230713-wellen-heftig-b950ad3e64d2@brauner>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 13, 2023 at 12:42:32AM +0800, wenyang.linux@foxmail.com wrote:
> From: Wen Yang
>
> In eventfd_write(), when ucnt is 0 and ctx->count is also 0,
> current->in_eventfd will be set to 1, which may affect eventfd_signal(),
> and unnecessary wakeups will also be performed.
>
> Fix this issue by ensuring that ctx->count is not zero.
>
> Signed-off-by: Wen Yang
> Cc: Alexander Viro
> Cc: Jens Axboe
> Cc: Christian Brauner
> Cc: Christoph Hellwig
> Cc: Dylan Yudaken
> Cc: David Woodhouse
> Cc: Matthew Wilcox
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  fs/eventfd.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index 33a918f9566c..254b18ff0e00 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -281,10 +281,12 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
>  	}
>  	if (likely(res > 0)) {
>  		ctx->count += ucnt;
> -		current->in_eventfd = 1;
> -		if (waitqueue_active(&ctx->wqh))
> -			wake_up_locked_poll(&ctx->wqh, EPOLLIN);
> -		current->in_eventfd = 0;
> +		if (ctx->count) {
> +			current->in_eventfd = 1;
> +			if (waitqueue_active(&ctx->wqh))
> +				wake_up_locked_poll(&ctx->wqh, EPOLLIN);
> +			current->in_eventfd = 0;
> +		}
>  	}
>  	spin_unlock_irq(&ctx->wqh.lock);

I don't think we can do this. Consider the following:

	__u64 w = 0;
	struct pollfd pfd = {
		.events = POLLIN | POLLOUT,
	};

	int fd = eventfd(0, 0);
	if (fd < 0)
		return -1;
	pfd.fd = fd;

	write(fd, &w, sizeof(__u64));

	poll(&pfd, 1, -1);

	printf("%d\n", pfd.revents & POLLOUT);

Currently, eventfd_poll() checks:

	ULLONG_MAX - 1 > ctx->count

informing pollers with POLLOUT that the eventfd is writable, iow, that
the count has not overflowed. After your change such POLLOUT waiters
will hang forever even though the eventfd is writable.

So currently, a zero write on an eventfd can be used to inform another
process that it can write. Your change breaks this completely.

Callers that don't want to be woken up on zero writes should just not
set POLLOUT:

	__u64 w = 0;
	struct pollfd pfd = {
		.events = POLLIN,
	};

	int fd = eventfd(0, 0);
	if (fd < 0)
		return -1;
	pfd.fd = fd;

	write(fd, &w, sizeof(__u64));

	poll(&pfd, 1, -1);

This will wait until someone actually writes something.
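
For anyone who wants to try the two cases above from userspace, here is a minimal sketch. The helper name poll_after_zero_write() and the finite timeout are my own assumptions for illustration, not something from the patch or from fs/eventfd.c. It only shows which readiness bits poll() reports after a zero write; it does not exercise the wakeup of an already-sleeping waiter.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): create an eventfd with an
 * initial count of 0, perform a zero-valued write, then poll it for
 * `events` with a timeout in milliseconds.
 * Returns the revents mask, 0 on timeout, or -1 on error.
 */
static int poll_after_zero_write(short events, int timeout_ms)
{
	uint64_t w = 0;
	struct pollfd pfd = { .events = events };
	int ret;

	pfd.fd = eventfd(0, 0);
	if (pfd.fd < 0)
		return -1;

	/* A zero write succeeds but leaves the counter at 0. */
	if (write(pfd.fd, &w, sizeof(w)) != (ssize_t)sizeof(w)) {
		close(pfd.fd);
		return -1;
	}

	ret = poll(&pfd, 1, timeout_ms);
	close(pfd.fd);
	if (ret < 0)
		return -1;

	/* poll() returning 0 means the timeout expired with no events. */
	return ret == 0 ? 0 : pfd.revents;
}
```

Called with POLLIN | POLLOUT, the helper should report POLLOUT because the counter is below the maximum; called with POLLIN only, it should time out because the counter is still 0.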