Received: by 2002:ac0:a581:0:0:0:0:0 with SMTP id m1-v6csp1258469imm; Fri, 22 Jun 2018 13:12:40 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLKsbrD+X/xUz/2yUtz/47DyhhibTEeH3XB4VWxXRwK4JP3GsopFX6GGnGx6zJEybuRNeGS X-Received: by 2002:a63:9902:: with SMTP id d2-v6mr2706137pge.166.1529698360191; Fri, 22 Jun 2018 13:12:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529698360; cv=none; d=google.com; s=arc-20160816; b=fjuS5hK3TlNPFTsLSFtXCI4oRqL9mevOZeQZtFwprbX0aYnuCapHz02yUVQQP+MjbC Q5EZeWiHtVjn4tee6MaHpeHdInYf0imhOkrE+gsTMNVDRdPdSZYaTm/2xx1TP+FooY09 8yt1HI3hYYD05ohTMEVscwr/b9K4zUgbH+RMhb5rDoGORb3XCTFTTlo+oOeHmh1F57nk TqqOl72j843C6Phk29cFYk37DBXQvwe7HgWrQmMq4RD+ZIjE7XGNWrp9DIIThPEqBAyq Ndo+UVC1fuvongy+7acUoGEuq7M67RCRgu9jgTqsGTGEqumrMHj21SK4d03OMIHhsPJZ 5tUA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date :arc-authentication-results; bh=Go1Ljh3AzyeZW6LQV8kXaEyWEdbjTGlMygSDYulXy98=; b=ESi42NqmO/Yga2+mTNfhMPa6mlC3MNpi6cv8yEvxae+OzSUiSDJXc86ZYaqohH2PNT m67sgTAgjss0cedBsG2k5HdQ3zV1nMureOeO3ZW9s23zPS3KIVj6VPEZG0MZz0Blm+oK Bgo2yR8C+b7W/lJuiScOaYtF7QypTtcZT30KAtPFsouij7OLF6zC4xrCftQ/TqZ/1d0u 8GK496L/5luml6U8SAmJKYymJYF6KD8Xglf/wz/bZjb+s3dmlD01+OdgQDdntfU7JJFF kV8GaicQ6iqvFuz9oFZFjkf77i6SQBFdoBGsEi/YYbU5utqrrchXET8XwPrsuaTjOkRw E6JA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q71-v6si8061790pfd.153.2018.06.22.13.12.25; Fri, 22 Jun 2018 13:12:40 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754332AbeFVULc (ORCPT + 99 others); Fri, 22 Jun 2018 16:11:32 -0400 Received: from Galois.linutronix.de ([146.0.238.70]:42745 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751411AbeFVULb (ORCPT ); Fri, 22 Jun 2018 16:11:31 -0400 Received: from p4fea482e.dip0.t-ipconnect.de ([79.234.72.46] helo=nanos.glx-home) by Galois.linutronix.de with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80) (envelope-from ) id 1fWSP0-00038n-Es; Fri, 22 Jun 2018 22:11:26 +0200 Date: Fri, 22 Jun 2018 22:11:26 +0200 (CEST) From: Thomas Gleixner To: Lukas Wunner cc: Bjorn Helgaas , Mika Westerberg , Sebastian Andrzej Siewior , linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org Subject: Re: [PATCH] genirq: Synchronize only with single thread on free_irq() In-Reply-To: Message-ID: References: <8f770886632640321592873e4c902218d42c436b.1527194314.git.lukas@wunner.de> <20180622080125.GA13709@wunner.de> User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 22 Jun 2018, Thomas Gleixner wrote: > On Fri, 22 Jun 2018, Lukas Wunner wrote: > > On Thu, Jun 21, 2018 at 10:21:44PM +0200, Thomas Gleixner wrote: > > > On Thu, 24 May 2018, Lukas Wunner wrote: > > > > static int irq_wait_for_interrupt(struct irqaction *action) > > > > { > > > > - set_current_state(TASK_INTERRUPTIBLE); > > > > + for (;;) { > > > > + set_current_state(TASK_INTERRUPTIBLE); > > > > > > > > - while (!kthread_should_stop()) { > > > > + if (kthread_should_stop()) { > > > > + /* may need to run one last time. */ > > > > + if (test_and_clear_bit(IRQTF_RUNTHREAD, > > > > + &action->thread_flags)) { > > > > + __set_current_state(TASK_RUNNING); > > > > + return 0; > > > > + } > > > > + __set_current_state(TASK_RUNNING); > > > > + return -1; > > > > + } > > > > > > > > if (test_and_clear_bit(IRQTF_RUNTHREAD, > > > > &action->thread_flags)) { > > > > @@ -766,10 +776,7 @@ static int irq_wait_for_interrupt(struct irqaction *action) > > > > return 0; > > > > } > > > > schedule(); > > > > - set_current_state(TASK_INTERRUPTIBLE); > > > > } > > > > - __set_current_state(TASK_RUNNING); > > > > - return -1; > > > > } > > > > > > > > /* > > > > @@ -990,7 +997,7 @@ static int irq_thread(void *data) > > > > /* > > > > * This is the regular exit path. __free_irq() is stopping the > > > > * thread via kthread_stop() after calling > > > > - * synchronize_irq(). So neither IRQTF_RUNTHREAD nor the > > > > + * synchronize_hardirq(). So neither IRQTF_RUNTHREAD nor the > > > > * oneshot mask bit can be set. We cannot verify that as we > > > > * cannot touch the oneshot mask at this point anymore as > > > > * __setup_irq() might have given out currents thread_mask > > > > @@ -1595,7 +1602,7 @@ static struct irqaction *__free_irq(struct irq_desc *desc, void *dev_id) > > > > unregister_handler_proc(irq, action); > > > > > > > > /* Make sure it's not being used on another CPU: */ > > > > - synchronize_irq(irq); > > > > + synchronize_hardirq(irq); > > > > > > So after that, action can be freed and when the thread above tries to > > > access it. Nice Use After Free. That needs more thought. > > > > No, after that, kthread_stop() is called which blocks until the IRQ > > thread has completed. Only then is the action freed. > > Missed that. Fair enough. I just had to go back and figure out why I missed it: kthread_stop() is only half of the story. Just look at the comment above: * oneshot mask bit can be set. We cannot verify that as we * cannot touch the oneshot mask at this point anymore as * __setup_irq() might have given out currents thread_mask But you got lucky. That comment is not longer accurate because at the time when it was written desc->request_mutex did not exist. It's there now and prevents a concurrent request_irq() coming in after dropping desc->lock and handing out the oneshot mask bit. It that wouldn't be the case, then your scheme would be very subtly busted. This really needs to be documented in the code, the comment needs to be fixed and the changelog needs a proper explanation of all that. Thanks, tglx