Subject: Re: [PATCH v2] kmemleak: Turn kmemleak_lock to raw spinlock on RT
To: Sebastian Andrzej Siewior
From: He Zhe
Date: Wed, 5 Dec 2018 21:53:37 +0800
In-Reply-To: <20181130181956.eewrlaabtceekzyu@linutronix.de>
References: <1542877459-144382-1-git-send-email-zhe.he@windriver.com>
 <20181123095314.hervxkxtqoixovro@linutronix.de>
 <40a63aa5-edb6-4673-b4cc-1bc10e7b3953@windriver.com>
 <20181130181956.eewrlaabtceekzyu@linutronix.de>
Sender:
 linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018/12/1 02:19, Sebastian Andrzej Siewior wrote:
> On 2018-11-24 22:26:46 [+0800], He Zhe wrote:
>> On latest v4.19.1-rt3, both of the call traces can be reproduced with kmemleak
>> enabled. And none can be reproduced with kmemleak disabled.
> okay. So it needs attention.
>
>> On latest mainline tree, none can be reproduced no matter kmemleak is enabled
>> or disabled.
>>
>> I don't get why kfree from a preempt-disabled section should cause a warning
>> without kmemleak, since kfree can't sleep.
> it might. It will acquire a sleeping lock if it has to go down to the
> memory allocator to actually give memory back.

Got it. Thanks.

>
>> If I understand correctly, the call trace above is caused by trying to schedule
>> after preemption is disabled, which cannot be reached in mainline kernel. So
>> we might need to turn to use raw lock to keep preemption disabled.
> The buddy-allocator runs with spin locks so it is okay on !RT. So you
> can use kfree() with disabled preemption or disabled interrupts.
> I don't think that we want to use raw-locks in the buddy-allocator.

For call trace 1:
I went through the calling paths inside kfree() and found that raw locks
are already used in it, as follows.

1) in the path of kfree() itself
kfree -> slab_free -> do_slab_free -> __slab_free -> raw_spin_lock_irqsave

2) in the path of CONFIG_DEBUG_OBJECTS_FREE
kfree -> slab_free -> slab_free_freelist_hook -> slab_free_hook
-> debug_check_no_obj_freed -> __debug_check_no_obj_freed
-> raw_spin_lock_irqsave

3) in the path of CONFIG_LOCKDEP
kfree -> __free_pages -> free_unref_page -> free_unref_page_prepare
-> free_pcp_prepare -> free_pages_prepare -> debug_check_no_locks_freed
-> raw_local_irq_save

Since kmemleak is most likely used for debugging, in environments where we
don't expect the same performance as without it, and since kfree() already
takes raw locks in its main path and in other debug paths, I suppose it
wouldn't hurt to change to raw locks.

>
>> From what I reached above, this is RT-only and happens on v4.18 and v4.19.
>>
>> The call trace above is caused by grabbing kmemleak_lock and then getting
>> scheduled and then re-grabbing kmemleak_lock. Using raw lock can also solve
>> this problem.
> But this is a reader / writer lock. And if I understand the other part
> of the thread then it needs multiple readers.

For call trace 2:
I don't get what "it needs multiple readers" exactly means here.

In this call trace, kmemleak_lock is grabbed as a write lock, the task is
then scheduled away, and the lock is grabbed again as a write lock from
another path. It's write->write locking, compared to what is discussed in
the other part of the thread.

This is essentially because kmemleak hooks into the very low-level memory
allocation and free operations. After being scheduled away, it can easily
re-enter itself. We need raw locks to prevent this from happening.

> Couldn't we just get rid of that kfree() or move it somewhere else?
> I mean if the free() memory on CPU-down and allocate it again CPU-up
> then we could skip that, right? Just allocate it and don't free it
> because the CPU will likely get up again.

For call trace 1:
I went through the CPU hotplug code and found that the problematic data,
cpuc->shared_regs, is allocated in intel_pmu_cpu_prepare and freed in
intel_pmu_cpu_dying. These are handlers triggered by two different perf
events.

It seems hard to come up with a convincing way to keep the data while the
CPU is off and then use it again. Raw locks would be easy and good enough.

Thanks,
Zhe

>
>> Thanks,
>> Zhe
> Sebastian
>
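
For reference, the alternative suggested in the quoted text above (allocate on
CPU-up, and keep the buffer instead of freeing it on CPU-down) could be
sketched in plain userspace C roughly as below. This is only a model of the
idea, not the perf code: every name here (model_shared_regs, model_regs,
model_cpu_prepare, model_cpu_dying, MODEL_NR_CPUS) is hypothetical, and
calloc() merely stands in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for the model */

/* Hypothetical stand-in for the per-CPU data (cpuc->shared_regs in the thread). */
struct model_shared_regs {
	int in_use;
};

/* One slot per CPU; a slot stays populated across a down/up cycle. */
static struct model_shared_regs *model_regs[MODEL_NR_CPUS];

/*
 * Models the "prepare" step: allocate only when nothing was kept from a
 * previous cycle. Returns 0 on success, -1 on a bad CPU or failed allocation.
 */
static int model_cpu_prepare(int cpu)
{
	if (cpu < 0 || cpu >= MODEL_NR_CPUS)
		return -1;

	if (model_regs[cpu] == NULL) {
		model_regs[cpu] = calloc(1, sizeof(*model_regs[cpu]));
		if (model_regs[cpu] == NULL)
			return -1;
	}

	model_regs[cpu]->in_use = 1;
	return 0;
}

/*
 * Models the "dying" step: nothing is freed here, the buffer is only
 * marked idle so that the next prepare can reuse it.
 */
static void model_cpu_dying(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	if (model_regs[cpu] != NULL)
		model_regs[cpu]->in_use = 0;
}
```

With this shape, a prepare/dying/prepare sequence on the same CPU ends up with
the same pointer in model_regs[cpu], so the dying path never calls free(). The
cost is that the memory stays allocated while the CPU is off, and the model
says nothing about how the two real handlers would share such state.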