From: Tonghao Zhang
Date: Wed, 30 Nov 2022 10:47:50 +0800
Subject: Re: [net-next] bpf: avoid hashtab deadlock with try_lock
To: Hou Tao
Cc: Hao Luo, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon,
 netdev@vger.kernel.org, Alexei Starovoitov, Daniel Borkmann,
 Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
 John Fastabend, KP Singh, Stanislav Fomichev, Jiri Olsa, bpf,
 "houtao1@huawei.com", LKML, Boqun Feng
References: <41eda0ea-0ed4-1ffb-5520-06fda08e5d38@huawei.com>
 <07a7491e-f391-a9b2-047e-cab5f23decc5@huawei.com>
 <59fc54b7-c276-2918-6741-804634337881@huaweicloud.com>
 <541aa740-dcf3-35f5-9f9b-e411978eaa06@redhat.com>
 <23b5de45-1a11-b5c9-d0d3-4dbca0b7661e@huaweicloud.com>
In-Reply-To: <23b5de45-1a11-b5c9-d0d3-4dbca0b7661e@huaweicloud.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Nov 30, 2022 at 9:50 AM Hou Tao wrote:
>
> Hi Hao,
>
> On 11/30/2022 3:36 AM, Hao Luo wrote:
> > On Tue, Nov 29, 2022 at 9:32 AM Boqun Feng wrote:
> >> Just to be clear, I meant to refactor htab_lock_bucket() into a try
> >> lock pattern. Also after a second thought, the below suggestion doesn't
> >> work. I think the proper way is to make htab_lock_bucket() as a
> >> raw_spin_trylock_irqsave().
> >>
> >> Regards,
> >> Boqun
> >>
> > The potential deadlock happens when the lock is contended from the
> > same cpu. When the lock is contended from a remote cpu, we would like
> > the remote cpu to spin and wait, instead of giving up immediately. As
> > this gives better throughput. So replacing the current
> > raw_spin_lock_irqsave() with trylock sacrifices this performance gain.
> >
> > I suspect the source of the problem is the 'hash' that we used in
> > htab_lock_bucket(). The 'hash' is derived from the 'key', I wonder
> > whether we should use a hash derived from 'bucket' rather than from
> > 'key'. For example, from the memory address of the 'bucket'. Because,
> > different keys may fall into the same bucket, but yield different
> > hashes. If the same bucket can never have two different 'hashes' here,
> > the map_locked check should behave as intended. Also because
> > ->map_locked is per-cpu, execution flows from two different cpus can
> > both pass.
> The warning from lockdep is due to the reason the bucket lock A is used in a
> no-NMI context firstly, then the same bucket lock is used in an NMI context, so

Yes, I tested with lockdep too. If a lock is taken in non-NMI context, we
can't also take it in NMI context with a plain lock (only try_lock works
fine there); otherwise lockdep prints the warning.

* For the deadlock case, we can index map_locked with either:
  1. hash & min(HASHTAB_MAP_LOCK_MASK, htab->n_buckets - 1), or
  2. the hash bucket address.
* For the lockdep warning, we should use an in_nmi() check together with
  map_locked.
BTW, this patch doesn't work, so we can remove the lock_key:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c50eb518e262fa06bd334e6eec172eaf5d7a5bd9

static inline int htab_lock_bucket(const struct bpf_htab *htab,
				   struct bucket *b, u32 hash,
				   unsigned long *pflags)
{
	unsigned long flags;

	/*
	 * Keys in the same bucket must share one map_locked slot, even
	 * when n_buckets is smaller than the map_locked array.
	 * min_t() because the mask is an int constant and n_buckets is u32.
	 */
	hash = hash & min_t(u32, HASHTAB_MAP_LOCK_MASK, htab->n_buckets - 1);

	preempt_disable();
	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
		__this_cpu_dec(*(htab->map_locked[hash]));
		preempt_enable();
		return -EBUSY;
	}

	if (in_nmi()) {
		/* Never spin in NMI context; undo the bookkeeping on failure. */
		if (!raw_spin_trylock_irqsave(&b->raw_lock, flags)) {
			__this_cpu_dec(*(htab->map_locked[hash]));
			preempt_enable();
			return -EBUSY;
		}
	} else {
		raw_spin_lock_irqsave(&b->raw_lock, flags);
	}

	*pflags = flags;
	return 0;
}

> lockdep deduces that may be a dead-lock. I have already tried to use the same
> map_locked for keys with the same bucket, the dead-lock is gone, but still got
> lockdep warning.
> >
> > Hao
> > .
>

--
Best regards, Tonghao