Received: by 2002:a25:23cc:0:0:0:0:0 with SMTP id j195csp335685ybj; Mon, 4 May 2020 22:50:49 -0700 (PDT) X-Google-Smtp-Source: APiQypI6mSYLOTVv/ASTLUo5kSKAtBcpldtGeHPpdJfpLFbzXTscgHevBluLOjTn6fUopVsXY8jE X-Received: by 2002:a17:906:5e4e:: with SMTP id b14mr1123105eju.285.1588657849408; Mon, 04 May 2020 22:50:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1588657849; cv=none; d=google.com; s=arc-20160816; b=0wfEXx+ApT3uaFkJoLEpYF8w8z9ZBct3rITKRflpQH7AQBKX7w83CVRJ0Qd13LNoep /z+J/vQG7KC+Awmlbca9wG7567eiBDj9ebZ9vgwDYl8+KlG6rp7XRMbq7Aadkq82BNmc xQZZuBxx1pxgCljseyWZqxGXa2sYxJ6iz0HhU+GKmJaoJTGnRg/pL5E8JWfnyC7L9yih k2li63gyG3O+bqAixuLOLyjrCAA+H+n11UVxqPJv2jAK0xwNJkW8xnSKDt+mFH2bF1re nGHYLoI7YT+JNj40MTAdGs3gXq0D//1MX7Xrd2TLNlFxbVJhiyWto9boGc3QfAR7ok1P ncfA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=jvG490rWGc5GqhUhuA0SKysCR0pmXtiZsNTw6qj5OqI=; b=SbZifStVW8000HclR8KrmtbygQCcxHqoxRHqoci+8LsPXseipD9AToqU+XDa2iLTwY AZS6S9KNrwYsaHuLPeHRV/QPXXX8rOEodXkKgiWkgpgA5AB8cTWSA9GInjL8eTI6rlF5 M+eRXHorHWdamZx7ghNfX/4xnjjH0VHBhJkpc/H93da/GWckDIWQFwkXmZmKWDgjheGS 8hDlK5S/N429s5mfxPd+6erOJWBrZUJBIZypq2dRp7LxUtbUyv2paKv4jlKU6wUp86J8 f+59khRvBnQewo6WL+m24kb3/8EvggVzgJooVLQJDo/DJYoiRwjAY4nxUWcKK/AQKDEw yEtw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@yandex-team.ru header.s=default header.b=Zt0TKXNZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=yandex-team.ru Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id w10si464068ejn.246.2020.05.04.22.50.27; Mon, 04 May 2020 22:50:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@yandex-team.ru header.s=default header.b=Zt0TKXNZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=yandex-team.ru Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726635AbgEEFqx (ORCPT + 99 others); Tue, 5 May 2020 01:46:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54116 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725535AbgEEFqx (ORCPT ); Tue, 5 May 2020 01:46:53 -0400 Received: from forwardcorp1p.mail.yandex.net (forwardcorp1p.mail.yandex.net [IPv6:2a02:6b8:0:1472:2741:0:8b6:217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C27B7C061A0F for ; Mon, 4 May 2020 22:46:52 -0700 (PDT) Received: from mxbackcorp1o.mail.yandex.net (mxbackcorp1o.mail.yandex.net [IPv6:2a02:6b8:0:1a2d::301]) by forwardcorp1p.mail.yandex.net (Yandex) with ESMTP id E54E92E152C; Tue, 5 May 2020 08:46:47 +0300 (MSK) Received: from iva8-88b7aa9dc799.qloud-c.yandex.net (iva8-88b7aa9dc799.qloud-c.yandex.net [2a02:6b8:c0c:77a0:0:640:88b7:aa9d]) by mxbackcorp1o.mail.yandex.net (mxbackcorp/Yandex) with ESMTP id J8LnfxHcLB-kkbWnYJJ; Tue, 05 May 2020 08:46:47 +0300 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1588657607; bh=jvG490rWGc5GqhUhuA0SKysCR0pmXtiZsNTw6qj5OqI=; h=In-Reply-To:Message-ID:From:Date:References:To:Subject:Cc; b=Zt0TKXNZozq14mw3t2u5RzNa9xJ/1yboSHfqSOzTozGCAafpIZaPvTjwK4fUHA69c z9VmelD979LLg2waFsb/pZ8YPJpv5+jRfY0B1PoYDa6k5lfAa0j6H9AhZdEdqdpdSH Uyf554/AAUywEOgBa3c3Hpq84NaqWuGPZOIm1Le4= Authentication-Results: mxbackcorp1o.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Received: from dynamic-vpn.dhcp.yndx.net (dynamic-vpn.dhcp.yndx.net [2a02:6b8:b080:7201::1:2]) by iva8-88b7aa9dc799.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id 4YTxxCJg1p-kkWeViGx; Tue, 05 May 2020 08:46:46 +0300 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (Client certificate not present) Subject: Re: [PATCH] slub: limit count of partial slabs scanned to gather statistics To: Andrew Morton Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim References: <158860845968.33385.4165926113074799048.stgit@buzz> <20200504125656.e3d04b350c807aba8a2a7271@linux-foundation.org> From: Konstantin Khlebnikov Message-ID: Date: Tue, 5 May 2020 08:46:41 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 MIME-Version: 1.0 In-Reply-To: <20200504125656.e3d04b350c807aba8a2a7271@linux-foundation.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-CA Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/05/2020 22.56, Andrew Morton wrote: > On Mon, 04 May 2020 19:07:39 +0300 Konstantin Khlebnikov wrote: > >> To get exact count of free and used objects slub have to scan list of >> partial slabs. This may take at long time. Scanning holds spinlock and >> blocks allocations which move partial slabs to per-cpu lists and back. >> >> Example found in the wild: >> >> # cat /sys/kernel/slab/dentry/partial >> 14478538 N0=7329569 N1=7148969 >> # time cat /sys/kernel/slab/dentry/objects >> 286225471 N0=136967768 N1=149257703 >> >> real 0m1.722s >> user 0m0.001s >> sys 0m1.721s > > I assume this could trigger the softlockup detector or even NMI > watchdog in some situations? Yes, irqs are disabled here. But loop itself is pretty fast. It requires terabytes of ram to reach common thresholds for watchdogs. > >> The same problem in slab was addressed in commit f728b0a5d72a ("mm, slab: >> faster active and free stats") by adding more kmem cache statistics. >> For slub same approach requires atomic op on fast path when object frees. >> >> Let's simply limit count of scanned slabs and print warning. >> Limit set in /sys/module/slub/parameters/max_partial_to_count. >> Default is 10000 which should be enough for most sane cases. >> >> Return linear approximation if list of partials is longer than limit. >> Nobody should notice difference. > > That's a pretty sad "solution" :( > > But I guess it's better than nothing at all, unless there are > alternative ideas? Running this loop till the end adds more problems than gives information. Adding new percpu or atomic counters to fast paths seems redundant even for debugging. Actually there is no much sense in accurate statistics for count of objects, when there are millions of them. Memory consumption here is defined by count and size of slabs. > >> --- a/mm/slub.c >> +++ b/mm/slub.c >> @@ -2407,16 +2407,29 @@ static inline unsigned long node_nr_objs(struct kmem_cache_node *n) >> #endif /* CONFIG_SLUB_DEBUG */ >> >> #if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS) >> + >> +static unsigned long max_partial_to_count __read_mostly = 10000; >> +module_param(max_partial_to_count, ulong, 0644); >> + >> static unsigned long count_partial(struct kmem_cache_node *n, >> int (*get_count)(struct page *)) >> { >> + unsigned long counted = 0; >> unsigned long flags; >> unsigned long x = 0; >> struct page *page; >> >> spin_lock_irqsave(&n->list_lock, flags); >> - list_for_each_entry(page, &n->partial, slab_list) >> + list_for_each_entry(page, &n->partial, slab_list) { >> x += get_count(page); >> + >> + if (++counted > max_partial_to_count) { >> + pr_warn_once("SLUB: too much partial slabs to count all objects, increase max_partial_to_count.\n"); >> + /* Approximate total count of objects */ >> + x = mult_frac(x, n->nr_partial, counted); >> + break; >> + } >> + } >> spin_unlock_irqrestore(&n->list_lock, flags); >> return x; >> }