Received: by 2002:ac0:a582:0:0:0:0:0 with SMTP id m2-v6csp692363imm; Wed, 10 Oct 2018 02:46:36 -0700 (PDT) X-Google-Smtp-Source: ACcGV63wkaJ0weVP0ZjyKp7HjD/Bd08AKdyMxxPe7aJcztGJ4qZHIHUegY2ifRP3ADL4HfXwvaZU X-Received: by 2002:a17:902:d01:: with SMTP id 1-v6mr32890841plu.88.1539164796005; Wed, 10 Oct 2018 02:46:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1539164795; cv=none; d=google.com; s=arc-20160816; b=bCs0H+C+QAnCLlPlO1EAU3J54a25t8KRXRBsVn2mpMf+l+Vh3GjgoVw5nTpONdqH6h PRy9oGphsN3d6rFTxGciqYWaeBlOsiPR8kxhqV4Ln900L9uq1aAKcme6zZa36J/IgWSi wz+PK+j2xuL9/8sHiXoBituV5JbUTpsdNNC+kuSWwO6OBgrzhEwP/8jXBOoaUeO6OsQE moLnmIijDjVuUsaArgjhPhmV94sJq7mHTTYP6YQc4I/ZWlLihAMxJvSljmqedXNfuGRK nLwGLRfOuiat7KBpv/xZlx8X6OFu2vAEd46cRZ3ci8b7N7aJi8u9BRFstGO5PAJM/Jc7 zypg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:references:in-reply-to:mime-version :dkim-signature; bh=T/Xtc4JDuPxLBVIUSEO2sIVKU497WYVgDA5LVpanvf4=; b=cDXDbeGojwLwdCQX2Mni31ZE2J4z6hE/a2jh5nFpokh9SQOxk+HLmY7Xs9cYtL0VQI DqRQ4TMHL6SbLjpO7hrw7WXb8Ult2NVzJOqJsqUEGIk50DNrLCaPith0sVisYpglPuuD kh+XPQac7ZflBxGtVRvCbOSbcXGiWkMo+bFPQ378GjhhqfK6aUFR7Go3aNfhnfjij0FA anvq5+CwNf6SOBNKZH18R+k8Istrb9UQ09ZCp8tKRBX+DjncQdnBXk0h7Acuu2Lk9uLS f6r2OcYbd0j0MLSj0BSTvDWY3ea4CjDbHMdusbAne+jYTou1z6i5LOjrv1bUntxNNJg2 /SoA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=PwgTyaoo; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b2-v6si22063182pgw.310.2018.10.10.02.46.18; Wed, 10 Oct 2018 02:46:35 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=PwgTyaoo; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726752AbeJJRHQ (ORCPT + 99 others); Wed, 10 Oct 2018 13:07:16 -0400 Received: from mail-it1-f196.google.com ([209.85.166.196]:40370 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726515AbeJJRHP (ORCPT ); Wed, 10 Oct 2018 13:07:15 -0400 Received: by mail-it1-f196.google.com with SMTP id i191-v6so7037515iti.5 for ; Wed, 10 Oct 2018 02:45:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=T/Xtc4JDuPxLBVIUSEO2sIVKU497WYVgDA5LVpanvf4=; b=PwgTyaooIHS2IjwWMOedtgGe6bNThHX+ncPldx4G0WfvISb6y5ABK/ddIUoShe7ZIV X9FrlRRwZBDN0HOZa2Dul3Q2VHabWJZA5jTfwzoOD69Cifj/Lq2GWMMu3dBtpn8R9Gr6 hRfIB4Hq6TJznQOrijCj5aKlD2fMxTNnai1aWgyzwldwGq0CpvXL2GCysSiE7hO0oisd Znt+7wuskn9QFhcHkdHPKfE9VMv9YoOlt1983fh55KFHPQxX0tMitvoS7VtkQD5GSXLr bpuAvQwILwDlDHK0yO/l5/A2zqP6DBpfPOwZgDr0LNPBV0E3dpWSK0vEcepv2eihhQPk eDXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=T/Xtc4JDuPxLBVIUSEO2sIVKU497WYVgDA5LVpanvf4=; b=ShAjXlvUff3EW/VXGDAjR0yty0pLgp6KCvf9kqsuIePR5Khwm15/HnVMvNSWasJUjd qggLmeduotpFPkDLlbyr9PMUvDFtlohmRKg98DDZq4ygw6vIYO3vLWOsViNbz6P/Fwug 4vSHizY7PybGoXZVpHCLYuPVgscGn6bKkPedTgGMOW6gCryCJBrP9SdDdUYjnVVQc2cB pgu78SLZnGgq5zTSZBxxCVix9gA2yYWt6ZGA+FjVX8e5RQrzBwoT3vUxgjk/RU0Tq5bM z2JNsYoS/VxTEwcC4uNkxqd/NTfvYLzb7Ps5w/5zxpyNxg1DMOe6MPESzRaT3s3gWYUu bwVg== X-Gm-Message-State: ABuFfogeBeLp3FXyIVDbTp0Yvj2WtKCQHbSZ48lOgho+4OGkxgan9zIF BS9on7jxBQdgEFNATkNkU7SNI6Si71znsxtF5ae/Dg== X-Received: by 2002:a24:940f:: with SMTP id j15-v6mr163814ite.12.1539164753019; Wed, 10 Oct 2018 02:45:53 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a02:1003:0:0:0:0:0 with HTTP; Wed, 10 Oct 2018 02:45:32 -0700 (PDT) In-Reply-To: <20181010092929.a5gd3fkkw6swco4c@linutronix.de> References: <20180918152931.17322-1-williams@redhat.com> <20181005163018.icbknlzymwjhdehi@linutronix.de> <20181005163320.zkacovxvlih6blpp@linutronix.de> <20181009142742.ikh7xv2dn5skjjbe@linutronix.de> <20181010092929.a5gd3fkkw6swco4c@linutronix.de> From: Dmitry Vyukov Date: Wed, 10 Oct 2018 11:45:32 +0200 Message-ID: Subject: Re: [PATCH] kasan: convert kasan/quarantine_lock to raw_spinlock To: Sebastian Andrzej Siewior Cc: Clark Williams , Alexander Potapenko , kasan-dev , Linux-MM , LKML , linux-rt-users@vger.kernel.org, Peter Zijlstra , Thomas Gleixner Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 10, 2018 at 11:29 AM, Sebastian Andrzej Siewior wrote: > On 2018-10-10 10:25:42 [+0200], Dmitry Vyukov wrote: >> > That loop should behave like your on_each_cpu() except it does not >> > involve the remote CPU. >> >> >> The problem is that it can squeeze in between: >> >> + spin_unlock(&q->lock); >> >> spin_lock(&quarantine_lock); >> >> as far as I see. And then some objects can be left in the quarantine. > > Okay. But then once you are at CPU10 (in the on_each_cpu() loop) there > can be objects which are added to CPU0, right? So based on that, I > assumed that this would be okay to drop the lock here. What happens here is a bit tricky. When a kmem cache is freed, all objects from the cache must be already free. That is kmem_cache_free has returned for all objects (otherwise it's a user bug), these calls could have have happened on other CPUs. So is we are freeing a cache on CPU10, it is not possible that CPU0 frees an object from this cache right now/concurrently, the objects were already freed but they still can be sitting in per-cpu quarantine caches. Now a free on an unrelated object on CPU0 can decide to drain CPU cache concurrently, it grabs whole CPU cache, unlocks the cache spinlock (with your patch), and now proceeds to pushing the batch to global quarantine list. At this point CPU10 drains quarantine to evict all objects related to the cache, but it misses the batch that CPU0 transfers from cpu cache to global quarantine, because at this point it's referenced from just local stack variable temp in quarantine_put. Wrapping the whole transfer sequence with irq disable/enable, ensures that the transfer happens atomically wrt quarantine_remove_cache. That is, that quarantine_remove_cache will see the object either in cpu cache or in global quarantine, but not somewhere in the flight. >> > But this is debug code anyway, right? And it is highly complex imho. >> > Well, maybe only for me after I looked at it for the first time=E2=80= =A6 >> >> It is debug code - yes. >> Nothing about its performance matters - no. >> >> That's the way to produce unusable debug tools. >> With too much overhead timeouts start to fire and code behaves not the >> way it behaves in production. >> The tool is used in continuous integration and developers wait for >> test results before merging code. >> The tool is used on canary devices and directly contributes to usage exp= erience. > > Completely understood. What I meant is that debug code in general (from > RT perspective) increases latency to a level where the device can not > operate. Take lockdep for instance. Debug facility which is required > for RT as it spots locking problems early. It increases the latency > (depending on the workload) by 50ms+ and can't be used in production. > Same goes for SLUB debug and most other. > >> We of course don't want to trade a page of assembly code for cutting >> few cycles here (something that could make sense for some networking >> code maybe). But otherwise let's not introduce spinlocks on fast paths >> just for refactoring reasons. > > Sure. As I said. I'm fine with patch Clark initially proposed. I assumed > the refactoring would make things simpler and avoiding the cross-CPU > call a good thing. > >> > Can you take it as-is or should I repost it with an acked-by? >> >> Perhaps it's the problem with the way RT kernel changes things then? >> This is not specific to quarantine, right? > > We got rid of _a lot_ of local_irq_disable/save() + spin_lock() combos > which were there for reasons which are no longer true or due to lack of > the API. And this kasan thing is just something Clark stumbled upon > recently. And I try to negotiate something where everyone can agree on. > >> Should that mutex detect >> that IRQs are disabled and not try to schedule? If this would happen >> in some networking code, what would we do? > > It is not only that it is supposed not to schedule. Assuming the "mutex" > is not owned you could acquire it right away. No scheduling. However, > you would record current() as the owner of the lock which is wrong and > you get into other trouble later on. The list goes on :) > However, networking. If there is something that breaks then it will be > addressed. It will be forwarded upstream if this something where it > is likely to assume that RT won't change. So networking isn't special. > > Should I repost Clark's patch? I am much more comfortable with just changing the type of the lock. What are the bad implications of using the raw spinlock? Will it help to do something along the following lines: // Because of ... #if CONFIG_RT #define quarantine_spinlock_t raw_spinlock_t #else #define quarantine_spinlock_t spinlock_t #endif ?