From: Qian Cai
Date: Mon, 18 May 2020 22:37:46 -0400
Subject: Re: [PATCH v2 3/4] mm/slub: Fix another circular locking dependency in slab_attr_store()
To: Waiman Long
Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Johannes Weiner, Michal Hocko, Vladimir Davydov, Linux-MM, Linux Kernel Mailing List, Cgroups, Juri Lelli
References: <20200427235621.7823-4-longman@redhat.com> <638f59c0-60f1-2279-fea6-28b2980720f4@redhat.com>
In-Reply-To: <638f59c0-60f1-2279-fea6-28b2980720f4@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 18, 2020 at 6:05 PM Waiman Long wrote:
>
> On 5/16/20 10:19 PM, Qian Cai wrote:
> >
> >> On Apr 27, 2020, at 7:56 PM, Waiman Long wrote:
> >>
> >> It turns out that switching from slab_mutex to memcg_cache_ids_sem in
> >> slab_attr_store() does not completely eliminate circular locking dependency
> >> as shown by the following lockdep splat when the system is shut down:
> >>
> >> [ 2095.079697] Chain exists of:
> >> [ 2095.079697]   kn->count#278 --> memcg_cache_ids_sem --> slab_mutex
> >> [ 2095.079697]
> >> [ 2095.090278]  Possible unsafe locking scenario:
> >> [ 2095.090278]
> >> [ 2095.096227]        CPU0                    CPU1
> >> [ 2095.100779]        ----                    ----
> >> [ 2095.105331]   lock(slab_mutex);
> >> [ 2095.108486]                                lock(memcg_cache_ids_sem);
> >> [ 2095.114961]                                lock(slab_mutex);
> >> [ 2095.120649]   lock(kn->count#278);
> >> [ 2095.124068]
> >> [ 2095.124068]  *** DEADLOCK ***
> > Can you show the full splat?
> >
> >> To eliminate this possibility, we have to use trylock to acquire
> >> memcg_cache_ids_sem. Unlike slab_mutex, which can be acquired in
> >> many places, the memcg_cache_ids_sem write lock is only acquired
> >> in memcg_alloc_cache_id() to double the size of memcg_nr_cache_ids.
> >> So the chance of successive calls to memcg_alloc_cache_id() within
> >> a short time is pretty low. As a result, we can retry the read lock
> >> acquisition a few times if the first attempt fails.
> >>
> >> Signed-off-by: Waiman Long
> > The code looks a bit hacky and probably not that robust. Since it is
> > the shutdown path, which is not all that important without lockdep,
> > maybe you could drop this single patch for now until there is a
> > better solution?
>
> That is true. Unlike using the slab_mutex, the chance of failing to
> acquire a read lock on memcg_cache_ids_sem is pretty low. Maybe just
> print_once a warning if that happens.

That seems cleaner. If you are going to repost this series, you could
also mention that the series will fix slabinfo triggering a splat as
well.