Date: Tue, 26 May 2020 14:58:50 -0400
From: Qian Cai
To: Waiman Long
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] locking/lockdep: Increase MAX_LOCKDEP_ENTRIES by half
Message-ID: <20200526185850.GE991@lca.pw>
References: <20200526174349.8312-1-longman@redhat.com>
In-Reply-To: <20200526174349.8312-1-longman@redhat.com>

On Tue, May 26, 2020 at 01:43:49PM -0400, Waiman Long wrote:
> It was found by Qian Cai that a lockdep splat sometimes appears with the
> "BUG: MAX_LOCKDEP_ENTRIES too low" message on linux-next. On a 32-vcpu VM
> guest with a v5.7-rc7 based kernel, I looked at how many of the various
> table entries were being used after bootup and after a parallel kernel
> build (make -j32). The tables below show the usage statistics.

I think this approach could help the situation on this AMD server. However,
I don't understand why this particular system starts to exceed
MAX_LOCKDEP_ENTRIES in the first place. All other Intel, powerpc, amd64 and
s390 systems running the same workload end up with fewer than 20k entries,
but on this AMD server:

After system boot: ~13000 (which is similar to other arches)
After LTP MM: +2000
After LTP Syscalls: +5000
After reading all sysfs [1] (except debugfs): +10000

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/fs/read_all/read_all.c
    # read_all -d /sys -q -r 10

I still don't understand why reading all sysfs files on this system could
increase the count that much, but here is the lockdep output captured after
the sysfs read, in case you can spot anything obvious:

https://cailca.github.io/files/lockdep.txt

lock-classes: 2114 [max: 8192]
direct dependencies: 28568 [max: 32768]
indirect dependencies: 126882
all direct dependencies: 304192
dependency chains: 38269 [max: 65536]
dependency chain hlocks used: 141480 [max: 327680]
dependency chain hlocks lost: 0
in-hardirq chains: 166
in-softirq chains: 818
in-process chains: 37285
stack-trace entries: 175558 [max: 524288]
number of stack traces: 8355
number of stack hash chains: 6545
combined max dependencies: 804750782
hardirq-safe locks: 80
hardirq-unsafe locks: 1048
softirq-safe locks: 169
softirq-unsafe locks: 957
irq-safe locks: 183
irq-unsafe locks: 1048
hardirq-read-safe locks: 4
hardirq-read-unsafe locks: 943
softirq-read-safe locks: 12
softirq-read-unsafe locks: 940
irq-read-safe locks: 12
irq-read-unsafe locks: 943
uncategorized locks: 255
unused locks: 0
max locking depth: 19
max bfs queue depth: 1136
chain lookup misses: 43195
chain lookup hits: 3151662027
cyclic checks: 44009
redundant checks: 0
redundant links: 0
find-mask forwards checks: 7726
find-mask backwards checks: 8095
hardirq on events: 3503795225
hardirq off events: 3503795224
redundant hardirq ons: 453772
redundant hardirq offs: 1856554342
softirq on events: 51104551
softirq off events: 51104551
redundant softirq ons: 0
redundant softirq offs: 0
debug_locks: 1
zapped classes: 1101
zapped lock chains: 1117
large chain blocks: 1

>
> After bootup:
>
> Table            Used     Max     %age
> -----            ----     ---     ----
> lock_classes[]   1834     8192    22.4
> list_entries[]   15646    32768   47.7
> lock_chains[]    20873    65536   31.8
> chain_hlocks[]   83199    327680  25.4
> stack_trace[]    146177   524288  27.9
>
> After parallel kernel build:
>
> Table            Used     Max     %age
> -----            ----     ---     ----
> lock_classes[]   1864     8192    22.8
> list_entries[]   17134    32768   52.3
> lock_chains[]    25196    65536   38.4
> chain_hlocks[]   106321   327680  32.4
> stack_trace[]    158700   524288  30.3
>
> Percentage-wise, it can be seen that the list_entries for direct
> dependencies are used much more than the other tables. So it is also
> the table that is most likely to run out of space when running a
> complex workload.
>
> To reduce the likelihood of running out of table entries, we can increase
> MAX_LOCKDEP_ENTRIES by 50% from 16384/32768 to 24576/49152. On a 64-bit
> architecture, that represents an increase in memory consumption of
> 917504 bytes. With that change, the percentage usage of list_entries[]
> will fall to 31.8% and 34.9% respectively to make them more in line
> with the other tables.
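As a side note, the 917504-byte figure above works out if each list_entries[]
element (struct lock_list) is 56 bytes on a 64-bit build. That size is an
assumption on my side (it depends on kernel version and config), but a quick
stand-alone sanity check of the arithmetic:

/*
 * Stand-alone sanity check of the memory increase quoted above.
 * Assumption: sizeof(struct lock_list) == 56 bytes on 64-bit.
 */
#include <stdio.h>

int main(void)
{
	unsigned long cur_max = 32768;    /* current 64-bit MAX_LOCKDEP_ENTRIES */
	unsigned long new_max = 49152;    /* proposed value, +50% */
	unsigned long entry_size = 56;    /* assumed sizeof(struct lock_list) */

	/* (49152 - 32768) * 56 = 917504 extra bytes of static table space */
	printf("%lu bytes\n", (new_max - cur_max) * entry_size);
	return 0;
}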
>
> Signed-off-by: Waiman Long
> ---
>  kernel/locking/lockdep_internals.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
> index baca699b94e9..6108d2fbe775 100644
> --- a/kernel/locking/lockdep_internals.h
> +++ b/kernel/locking/lockdep_internals.h
> @@ -89,12 +89,12 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
>   * table (if it's not there yet), and we check it for lock order
>   * conflicts and deadlocks.
>   */
> -#define MAX_LOCKDEP_ENTRIES     16384UL
> +#define MAX_LOCKDEP_ENTRIES     24576UL
>  #define MAX_LOCKDEP_CHAINS_BITS 15
>  #define MAX_STACK_TRACE_ENTRIES 262144UL
>  #define STACK_TRACE_HASH_SIZE   8192
>  #else
> -#define MAX_LOCKDEP_ENTRIES     32768UL
> +#define MAX_LOCKDEP_ENTRIES     49152UL
>
>  #define MAX_LOCKDEP_CHAINS_BITS 16
>
> --
> 2.18.1
>
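Just to make the failure mode concrete for anyone skimming the thread: the
splat fires when lockdep's fixed-size direct-dependency table fills up. Below
is a user-space sketch of that behaviour, not the kernel code itself; the real
allocator is alloc_list_entry() in kernel/locking/lockdep.c, which uses a
bitmap allocator and disables lockdep before printing the message.

/*
 * User-space sketch only: how the fixed list_entries[] table (sized by
 * MAX_LOCKDEP_ENTRIES) runs dry and produces the splat discussed above.
 */
#include <stdio.h>

#define MAX_LOCKDEP_ENTRIES 32768UL

struct lock_list {          /* stand-in for the kernel's struct lock_list */
	void *class;        /* lock class this dependency entry belongs to */
	void *links_to;     /* lock class it links to */
};

static struct lock_list list_entries[MAX_LOCKDEP_ENTRIES];
static unsigned long nr_list_entries;

static struct lock_list *alloc_list_entry(void)
{
	if (nr_list_entries >= MAX_LOCKDEP_ENTRIES) {
		printf("BUG: MAX_LOCKDEP_ENTRIES too low!\n");
		return NULL;
	}
	return list_entries + nr_list_entries++;
}

int main(void)
{
	/* each newly observed "lock A -> lock B" dependency consumes one entry */
	for (unsigned long i = 0; i <= MAX_LOCKDEP_ENTRIES; i++)
		alloc_list_entry();
	return 0;
}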