Subject: Re: [RFC PATCH 2/2] mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo
To: Michal Hocko
Cc: Andrew Morton, Mel Gorman, Waiman Long, Johannes Weiner, Roman Gushchin, Konstantin Khlebnikov, Jann Horn, Song Liu, Greg Kroah-Hartman, Rafael Aquini, linux-mm@kvack.org, LKML
References: <20191023095607.GE3016@techsingularity.net> <20191023102737.32274-1-mhocko@kernel.org> <20191023102737.32274-3-mhocko@kernel.org> <30211965-8ad0-416d-0fe1-113270bd1ea8@suse.cz> <20191023133720.GA17610@dhcp22.suse.cz>
From: Vlastimil Babka
Message-ID: <7fb34979-66a4-4a5d-1798-402826e31e72@suse.cz>
Date: Wed, 23 Oct 2019 15:48:36 +0200
In-Reply-To: <20191023133720.GA17610@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/23/19 3:37 PM, Michal Hocko wrote:
> On Wed 23-10-19 15:32:05, Vlastimil Babka wrote:
>> On 10/23/19 12:27 PM, Michal Hocko wrote:
>>> From: Michal Hocko
>>>
>>> pagetypeinfo_showfree_print is called by zone->lock held in irq mode.
>>> This is not really nice because it blocks both any interrupts on that
>>> cpu and the page allocator. On large machines this might even trigger
>>> the hard lockup detector.
>>>
>>> Considering the pagetypeinfo is a debugging tool we do not really need
>>> exact numbers here. The primary reason to look at the output is to see
>>> how pageblocks are spread among different migratetypes therefore putting
>>> a bound on the number of pages on the free_list sounds like a reasonable
>>> tradeoff.
>>>
>>> The new output will simply tell
>>> [...]
>>> Node 6, zone Normal, type Movable >100000 >100000 >100000 >100000 41019 31560 23996 10054 3229 983 648
>>>
>>> instead of
>>> Node 6, zone Normal, type Movable 399568 294127 221558 102119 41019 31560 23996 10054 3229 983 648
>>>
>>> The limit has been chosen arbitrarily and it is a subject of a future
>>> change should there be a need for that.
>>>
>>> Suggested-by: Andrew Morton
>>> Signed-off-by: Michal Hocko
>>
>> Hmm dunno, I would rather e.g. hide the file behind some config or boot
>> option than do this. Or move it to /sys/kernel/debug ?
>
> But those wouldn't really help to prevent from the lockup, right?

No, but it would perhaps help ensure that only people who know what they
are doing (or have been told to by a developer, e.g. on linux-mm) will
try to collect the data, and not some automatic monitoring tools taking
periodic snapshots of stuff in /proc that looks interesting.

> Besides that who would enable that config and how much of a difference
> would root only vs. debugfs make?

I would hope those tools don't scrape debugfs as much as /proc, but I
might be wrong of course :)

> Is the incomplete value a real problem?

Hmm, perhaps not. If the overflow happens only for one migratetype, one
can also use /proc/buddyinfo to get the exact count, as was proposed
in this thread for the Movable migratetype.

>>> ---
>>>  mm/vmstat.c | 19 ++++++++++++++++++-
>>>  1 file changed, 18 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/vmstat.c b/mm/vmstat.c
>>> index 4e885ecd44d1..762034fc3b83 100644
>>> --- a/mm/vmstat.c
>>> +++ b/mm/vmstat.c
>>> @@ -1386,8 +1386,25 @@ static void pagetypeinfo_showfree_print(struct seq_file *m,
>>>
>>>  			area = &(zone->free_area[order]);
>>>
>>> -			list_for_each(curr, &area->free_list[mtype])
>>> +			list_for_each(curr, &area->free_list[mtype]) {
>>>  				freecount++;
>>> +				/*
>>> +				 * Cap the free_list iteration because it might
>>> +				 * be really large and we are under a spinlock
>>> +				 * so a long time spent here could trigger a
>>> +				 * hard lockup detector. Anyway this is a
>>> +				 * debugging tool so knowing there is a handful
>>> +				 * of pages in this order should be more than
>>> +				 * sufficient
>>> +				 */
>>> +				if (freecount > 100000) {
>>> +					seq_printf(m, ">%6lu ", freecount);
>>> +					spin_unlock_irq(&zone->lock);
>>> +					cond_resched();
>>> +					spin_lock_irq(&zone->lock);
>>> +					continue;
>>> +				}
>>> +			}
>>>  			seq_printf(m, "%6lu ", freecount);
>>>  		}
>>>  		seq_putc(m, '\n');
>>>
>
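
To illustrate the /proc/buddyinfo remark above, here is a rough userspace
sketch (not kernel code) of how a capped count could be recovered. It
assumes that the per-order total in /proc/buddyinfo equals the sum of the
per-migratetype counts in /proc/pagetypeinfo for the same zone and order,
and that exactly one migratetype was capped. The function name and its
parameters are hypothetical. The two files are read at different times, so
the result is only approximate on a live system.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * buddy_total: free block count for one zone and order, taken from
 *              /proc/buddyinfo (assumed to cover all migratetypes).
 * per_type:    counts for the same zone and order taken from
 *              /proc/pagetypeinfo, one entry per migratetype. The entry
 *              at index 'capped' is ignored because it was printed as
 *              ">100000" and its real value is unknown.
 * ntypes:      number of entries in per_type.
 * capped:      index of the single capped migratetype.
 *
 * Returns the recovered count, or -1 if the inputs are inconsistent.
 */
long recover_capped_count(long buddy_total, const long *per_type,
			  size_t ntypes, size_t capped)
{
	long others = 0;
	size_t i;

	if (capped >= ntypes)
		return -1;

	/* Sum every migratetype except the capped one. */
	for (i = 0; i < ntypes; i++) {
		if (i != capped)
			others += per_type[i];
	}

	/* The snapshots disagree (e.g. taken too far apart). */
	if (others > buddy_total)
		return -1;

	return buddy_total - others;
}
```

If more than one migratetype is capped for the same order, this
subtraction only yields their combined count, not the individual values.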