Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp78919ybh; Fri, 13 Mar 2020 17:25:18 -0700 (PDT) X-Google-Smtp-Source: ADFU+vsFSd/auknJNenxupvnYg83kExzZgbQX49AmPsYf3WQr25NgT8Yb0ZAtCocLCKm5G8NSTHA X-Received: by 2002:a9d:7857:: with SMTP id c23mr7438374otm.288.1584145518532; Fri, 13 Mar 2020 17:25:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584145518; cv=none; d=google.com; s=arc-20160816; b=0HDWdZGMRokkVsn90b5KspK3WcYkx4thp6eVjUY3eQGYE0Vf3qhmiY+gZlqVHUV5eK rd3WsRi+aAlQ/PLTWUK61DWX1CM5GW78gyjsy1Yv8sIueIvaRoUqG2jYa3gWuc3r/SW+ Ya5BxGTmyQ3k9sGhHoLejgVZKkNXaAj/QJRUvBgIBFCHbuFCY4rAb/qrldfREHldi5Lz JXPSiDcdr0O3e3WNmKIiTjvQuq0XfYMKYaLksIGRNzzIrldIM22VuP1E3Gv0TovhhZmd rPFTsocWSRSLi5sTRtX5tZVUiOOuLRfyhMJ8dfgZUfExBdxADNT2nEWS1rbYd6ZZ5lTo 4UTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from; bh=PF4B1N2cbvUgUxFTe74Ngj645C1HctMa5nRbIwJBiM0=; b=utRPM4WszVd7gA1+XfPIpUW0QKa0ioGw2KItPZEFK8HnY9o1QSGCv4tOaY7NfTdMPg rtkTUL0r1nQ0uGH415tzZ7JTZrfsIAmCEyABiqwZqG7AR99KpMRK3WgG5f3M0tmmgOEV nRAaFjA+0gVpLEeT5YbSD2KsqEX74+Uc+yTpsYgU1U6MxPxjgMHxgYOIwFzwXC5anNJw XVeotPYKbp0C4443rlIK8fFaPxr3nftj8LlCWz86SvYGxrtYH/xL2UCnV6pNt0P3RsC7 KdG+mS4qDuvcKgQwtrVIS6Wlw7uVI+vj9QZ6knOf1veYpeP3e1Ke4NTo2PAycIXzGB1y pi4A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k3si5517391oib.82.2020.03.13.17.24.52; Fri, 13 Mar 2020 17:25:18 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727636AbgCNAYY (ORCPT + 99 others); Fri, 13 Mar 2020 20:24:24 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:48291 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726853AbgCNAYY (ORCPT ); Fri, 13 Mar 2020 20:24:24 -0400 Received: from p5de0bf0b.dip0.t-ipconnect.de ([93.224.191.11] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80) (envelope-from ) id 1jCuah-0005UY-Sv; Sat, 14 Mar 2020 01:23:48 +0100 Received: by nanos.tec.linutronix.de (Postfix, from userid 1000) id 16D97101115; Sat, 14 Mar 2020 01:23:47 +0100 (CET) From: Thomas Gleixner To: Logan Gunthorpe , Sebastian Andrzej Siewior , linux-kernel@vger.kernel.org Cc: Peter Zijlstra , Ingo Molnar , Will Deacon , "Paul E . McKenney" , Joel Fernandes , Steven Rostedt , Linus Torvalds , Kurt Schwemmer , Bjorn Helgaas , linux-pci@vger.kernel.org Subject: Re: [PATCH 3/9] pci/switchtec: Don't abuse completion wait queue for poll In-Reply-To: <4d3a997d-ced4-3dbe-d766-0b1e9fc35b29@deltatee.com> References: <20200313174701.148376-1-bigeasy@linutronix.de> <20200313174701.148376-4-bigeasy@linutronix.de> <4d3a997d-ced4-3dbe-d766-0b1e9fc35b29@deltatee.com> Date: Sat, 14 Mar 2020 01:23:47 +0100 Message-ID: <87sgibeqcs.fsf@nanos.tec.linutronix.de> MIME-Version: 1.0 Content-Type: text/plain X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Logan Gunthorpe writes: > On 2020-03-13 11:46 a.m., Sebastian Andrzej Siewior wrote: >> The poll callback is abusing the completion wait queue and sticks it into >> poll_wait() to wake up pollers after a command has completed. >> >> First of all it's a layering violation as it imposes restrictions on the >> inner workings of completions. Just because C allows to do so does not >> justify that in any way. The proper way to do such things is to post >> patches which extend the core infrastructure and not by silently abusing >> it. > > As I've said previously, I disagree with this approach. Feel free to do s. > Open coding standard primitives sweeps issues under the rug and is a > step backwards for code quality. Calling it a layering violation is > just one opinion and if it is, the better solution would be to create > an interface you find appropriate so that it isn't one. There is no standard primitive which allows to poll on a completion. You decided that this is smart to do and just because C does not allow to hide implementation details this is not a justification for this at all. Due to the limitations of C, the kernel has to rely on the assumption that developers know and respect the difference between API and implementation. Relying on implementation details of an interface and then arguing that this is a standard primitive for the chosen purpose is just backwards. What's even more hillarious is that you now request that we give you a replacement interface for something which was not an interface to use in the first place. >> 1) It cannot work with EPOLLEXCLUSIVE > > Why? You don't explain this. And I don't see how this patch would change > anything to do with the call to poll_wait(). All you've done is > open-code the completion. > > Not that it matters that much, having multiple waiters poll on this > interface can pretty much never happen. It only makes sense for the > process who submitted the write to poll on the interface. It does not matter whether your envisioned usage implies that it cannot happen. Fact is that there is no restriction. That means using this with the well documented semantics of poll(2) will result in failure. >> 2) It's racy: >> >> poll() write() >> switchtec_dev_poll() switchtec_dev_write() >> poll_wait(&s->comp.wait); mrpc_queue_cmd() >> init_completion(&s->comp) >> init_waitqueue_head(&s->comp.wait) > > That's a nice catch! But wouldn't an easier solution be to change the > code to use reinit_completion() instead of using the bug to justify a > different change? Sure taht can be cured by changing it to reinit, but that does not cure the abuse of implementation details. As Peter, who maintains the completion code says: Relying on implementation details of locking primitives like that is yuck. Thanks, tglx