Received: by 2002:a25:d7c1:0:0:0:0:0 with SMTP id o184csp6014584ybg; Tue, 22 Oct 2019 11:43:48 -0700 (PDT) X-Google-Smtp-Source: APXvYqy0hpIq84xv0GKV3CWtWUqfr+ssGpoXPCq/QsR1EUj7qTi6ij/hWLSu4Xc0C5vQNiUkL3FA X-Received: by 2002:a50:eb81:: with SMTP id y1mr3181699edr.119.1571769828215; Tue, 22 Oct 2019 11:43:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1571769828; cv=none; d=google.com; s=arc-20160816; b=jx75TMFbFyALyhOfEnHj3zMSQl/g2Ox8uvQ7D3Q5eZlbCNxWMKtL4r056TdlCtXkrF pWPt57D2wlk4fBV/313oDy8sv4pqvuz+03wTeZgEDmigezr4Ex2DtxauXWy28280n1jB 0SVDUJ+huteU0Hl6cf3irkJo4qz5fH95CPqEU0UZ0uCmVVa+cn4OEBKnfPJt+79od6XL zSEx5N8/femqwF9tcUg+eTKRhD890rAD29ZjZishY5EUK2DZY+8RevSxHg/XBhqmDHuX tJAurJQ6hR2amlpHT0uBeLE+MZOqqz24pcLRhMSQ098BwDF7e6hMlaKPi6gN25eP6xp0 FrAw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:organization:references:cc:to:from:subject :dkim-signature; bh=LBqFTHigo1FB9xwoqXhjVaLlLB8M7eFR2otQiW7RMsY=; b=oKTkc4JmqP2SeUFYoI8v+njFUUIN7f3KdTX/hdEXhMybtATyq18ktFoIjkg3hjogny 1+13bvupguT0DMBC38hQ0yK/SIcx9b3from+OcteYSWv4/Cju7l3tYP6JvU/d0jhnqaq MrLqaoAKs6HZWfvN4I/MAuo4W/jHkyR0LcELS690sEl3qHVdU9WCr9i+3Ylg5DvzQEjf pSCQugU8zCUrbXTzaHvhcZa89l98LosDb6IvxEf/35y6H6GJvV+7a+AW581ht6JcLIQ0 P06c2y9acdAx/xz8Je8Mjz9YBwAi03YJRy4cMjJ/BtfLBcjxMO5Kma2iOr/dV1MoaOas kjBw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=ZN72Ey8p; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y15si3611554ejq.159.2019.10.22.11.43.23; Tue, 22 Oct 2019 11:43:48 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=ZN72Ey8p; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387659AbfJVSkS (ORCPT + 99 others); Tue, 22 Oct 2019 14:40:18 -0400 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:20581 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729666AbfJVSkS (ORCPT ); Tue, 22 Oct 2019 14:40:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1571769617; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LBqFTHigo1FB9xwoqXhjVaLlLB8M7eFR2otQiW7RMsY=; b=ZN72Ey8pGADHedkBGrWCOqSjaOY6X9MuYkQngNK9jArnzb6vcoVhC9wewsDD1oW0NY1fHR uEgIgavEBOh4sEyRO2pgDdRpJ8j7amvbcCQNMn6WJCTQgZO/FzvHWeny0ym/qdo38GUFVD 9n8vEg2r4VLldoU/OP9yY9kNwqmhJag= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-353-1-eL6FxLOQO0i-XaU-vsxA-1; Tue, 22 Oct 2019 14:40:13 -0400 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 98567800D49; Tue, 22 Oct 2019 18:40:09 +0000 (UTC) Received: from llong.remote.csb (dhcp-17-59.bos.redhat.com [10.18.17.59]) by smtp.corp.redhat.com (Postfix) with ESMTP id E990260C5D; Tue, 22 Oct 2019 18:40:04 +0000 (UTC) Subject: Re: [PATCH] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo From: Waiman Long To: Michal Hocko Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Johannes Weiner , Roman Gushchin , Vlastimil Babka , Konstantin Khlebnikov , Jann Horn , Song Liu , Greg Kroah-Hartman , Rafael Aquini , Mel Gorman References: <20191022162156.17316-1-longman@redhat.com> <20191022165745.GT9379@dhcp22.suse.cz> <0b206255-5c62-18f5-d751-a5576a6c0e8f@redhat.com> Organization: Red Hat Message-ID: Date: Tue, 22 Oct 2019 14:40:04 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.2 MIME-Version: 1.0 In-Reply-To: <0b206255-5c62-18f5-d751-a5576a6c0e8f@redhat.com> Content-Language: en-US X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-MC-Unique: 1-eL6FxLOQO0i-XaU-vsxA-1 X-Mimecast-Spam-Score: 0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/22/19 2:00 PM, Waiman Long wrote: > On 10/22/19 12:57 PM, Michal Hocko wrote: > >>> and used nr_free to compute the missing count. Since MIGRATE_MOVABLE >>> is usually the largest one on large memory systems, this is the one >>> to be skipped. Since the printing order is migration-type =3D> order, w= e >>> will have to store the counts in an internal 2D array before printing >>> them out. >>> >>> Even by skipping the MIGRATE_MOVABLE pages, we may still be holding the >>> zone lock for too long blocking out other zone lock waiters from being >>> run. This can be problematic for systems with large amount of memory. >>> So a check is added to temporarily release the lock and reschedule if >>> more than 64k of list entries have been iterated for each order. With >>> a MAX_ORDER of 11, the worst case will be iterating about 700k of list >>> entries before releasing the lock. >> But you are still iterating through the whole free_list at once so if it >> gets really large then this is still possible. I think it would be >> preferable to use per migratetype nr_free if it doesn't cause any >> regressions. >> > Yes, it is still theoretically possible. I will take a further look at > having per-migrate type nr_free. BTW, there is one more place where the > free lists are being iterated with zone lock held - mark_free_pages(). Looking deeper into the code, the exact migration type is not stored in the page itself. An initial movable page can be stolen to be put into another migration type. So in a delete or move from free_area, we don't know exactly what migration type the page is coming from. IOW, it is hard to get accurate counts of the number of entries in each lists. I am not saying this is impossible, but doing it may require stealing some bits from the page structure to store this information which is probably not worth the benefit we can get from it. So if you have any good suggestion of how to do it without too much cost, please let me know about it. Otherwise, I will probably stay with the current patch. Cheers, Longman