Received: by 2002:ac0:a582:0:0:0:0:0 with SMTP id m2-v6csp678436imm; Wed, 10 Oct 2018 02:30:14 -0700 (PDT) X-Google-Smtp-Source: ACcGV620Bm9EVYyaeQA/ZbfowopuPCkP0P6TqZKX3P/0DVOBlL132MkcQ5PI6DJUD5GwrN2djyWQ X-Received: by 2002:a63:5922:: with SMTP id n34-v6mr29404958pgb.134.1539163814651; Wed, 10 Oct 2018 02:30:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1539163814; cv=none; d=google.com; s=arc-20160816; b=N3URVI1RtV82gDGFVZ81NMvjxHKkD41AtocJ9LBY8JdTHj9Wkc4oUEkWDmwevONSlZ wGfohWmbS5oijsPMQptVhcJBMi+x6hfhKXcmgr4KCyAcSTnu8cwCgSFNaB65JqndbSMi gGnoAbwBjR0C186wzLcFvYc5yPzQv6LM5DtIjxV4sLo3OltIOPoqHUTgY12bOAKMrOiu kGE4pB/utZZY1Mcqb5PxSag+xNmpkIVXFeXf3ZW0SARPkFEhgoU4Eh4NQCO/ZGIMPeRf 96HGjJtbB15rqxamD7sEQGhKBPuEpRAlq7QTtUI5bPlaw3yggwG/6GN6LomXpT6N1Bbg mMzw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-transfer-encoding:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=zdE1ZiUc9JIr675gL7m2qkt2kxqdvSQArzHjLnAeQGw=; b=Nk6GoMCCR9HzqutO+g7jYTkoVL9lXMj43QJUOJcH+4meKzSTtcxJnhO5JYfY7Db6dj CbxuHuA6NU5Gg6yJnlkDjJhB+v6aFNeOG/8M+mGUutXiVM8CD+TG8o3TQWgnfN+y+rlo tpizWtuDj+hYUYqEqk1loWAieNJp22NZpkPdLrOKZ085fTt3YkIGsr379IT7eC+JYXhz cnkvXm8rhsQgPX5iJfd8+3Ef5oGmpsfAS0gXKpfVGevh7D4OernT8vNvRzr5Rh6mab+e OIRCosDLDiCTsHmBcTJ/Hd/mYsasbOwylbqJQV64Jd8pGBxuE0PgOFFY5A5W2A8jqQ+B 7VTg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t11-v6si20591365pgm.572.2018.10.10.02.29.58; Wed, 10 Oct 2018 02:30:14 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726771AbeJJQuy convert rfc822-to-8bit (ORCPT + 99 others); Wed, 10 Oct 2018 12:50:54 -0400 Received: from Galois.linutronix.de ([146.0.238.70]:45028 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726579AbeJJQuy (ORCPT ); Wed, 10 Oct 2018 12:50:54 -0400 Received: from bigeasy by Galois.linutronix.de with local (Exim 4.80) (envelope-from ) id 1gAAo6-0004Tx-7Z; Wed, 10 Oct 2018 11:29:30 +0200 Date: Wed, 10 Oct 2018 11:29:29 +0200 From: Sebastian Andrzej Siewior To: Dmitry Vyukov Cc: Clark Williams , Alexander Potapenko , kasan-dev , Linux-MM , LKML , linux-rt-users@vger.kernel.org, Peter Zijlstra , Thomas Gleixner Subject: Re: [PATCH] kasan: convert kasan/quarantine_lock to raw_spinlock Message-ID: <20181010092929.a5gd3fkkw6swco4c@linutronix.de> References: <20180918152931.17322-1-williams@redhat.com> <20181005163018.icbknlzymwjhdehi@linutronix.de> <20181005163320.zkacovxvlih6blpp@linutronix.de> <20181009142742.ikh7xv2dn5skjjbe@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8BIT In-Reply-To: User-Agent: NeoMutt/20180716 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2018-10-10 10:25:42 [+0200], Dmitry Vyukov wrote: > > That loop should behave like your on_each_cpu() except it does not > > involve the remote CPU. > > > The problem is that it can squeeze in between: > > + spin_unlock(&q->lock); > > spin_lock(&quarantine_lock); > > as far as I see. And then some objects can be left in the quarantine. Okay. But then once you are at CPU10 (in the on_each_cpu() loop) there can be objects which are added to CPU0, right? So based on that, I assumed that this would be okay to drop the lock here. > > But this is debug code anyway, right? And it is highly complex imho. > > Well, maybe only for me after I looked at it for the first timeā€¦ > > It is debug code - yes. > Nothing about its performance matters - no. > > That's the way to produce unusable debug tools. > With too much overhead timeouts start to fire and code behaves not the > way it behaves in production. > The tool is used in continuous integration and developers wait for > test results before merging code. > The tool is used on canary devices and directly contributes to usage experience. Completely understood. What I meant is that debug code in general (from RT perspective) increases latency to a level where the device can not operate. Take lockdep for instance. Debug facility which is required for RT as it spots locking problems early. It increases the latency (depending on the workload) by 50ms+ and can't be used in production. Same goes for SLUB debug and most other. > We of course don't want to trade a page of assembly code for cutting > few cycles here (something that could make sense for some networking > code maybe). But otherwise let's not introduce spinlocks on fast paths > just for refactoring reasons. Sure. As I said. I'm fine with patch Clark initially proposed. I assumed the refactoring would make things simpler and avoiding the cross-CPU call a good thing. > > Can you take it as-is or should I repost it with an acked-by? > > Perhaps it's the problem with the way RT kernel changes things then? > This is not specific to quarantine, right? We got rid of _a lot_ of local_irq_disable/save() + spin_lock() combos which were there for reasons which are no longer true or due to lack of the API. And this kasan thing is just something Clark stumbled upon recently. And I try to negotiate something where everyone can agree on. > Should that mutex detect > that IRQs are disabled and not try to schedule? If this would happen > in some networking code, what would we do? It is not only that it is supposed not to schedule. Assuming the "mutex" is not owned you could acquire it right away. No scheduling. However, you would record current() as the owner of the lock which is wrong and you get into other trouble later on. The list goes on :) However, networking. If there is something that breaks then it will be addressed. It will be forwarded upstream if this something where it is likely to assume that RT won't change. So networking isn't special. Should I repost Clark's patch? Sebastian