Subject: Re: [PATCH v2 2/2] mm/pageblock: remove false sharing in pageblock_flags
To: Alexander Duyck
Cc: Anshuman Khandual, Matthew Wilcox, David Hildenbrand, Andrew Morton, Hugh Dickins, Alexander Duyck, LKML, linux-mm
References: <1597816075-61091-1-git-send-email-alex.shi@linux.alibaba.com> <1597816075-61091-2-git-send-email-alex.shi@linux.alibaba.com> <715f1588-9cd5-b845-51a5-ca58549c4d28@arm.com>
From: Alex Shi
Date: Sun, 30 Aug 2020 18:00:20 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/8/20 at 12:50 AM, Alexander Duyck wrote:
> On Wed, Aug 19, 2020 at 1:11 AM Alex Shi wrote:
>>
>>
>>
>> On 2020/8/19 at 3:57 PM, Anshuman Khandual wrote:
>>>
>>>
>>> On 08/19/2020 11:17 AM, Alex Shi wrote:
>>>> Current pageblock_flags is only 4 bits, so it has to share a char size
>>>> in cmpxchg when it gets set, and the false sharing causes a perf drop.
>>>>
>>>> If we increase the bits up to 8, false sharing would be gone in cmpxchg, and
>>>> the only cost is half char per pageblock, which is half char per 128MB
>>>> on x86, 4 chars in 1 GB.
>>>
>>> Agreed that increase in memory utilization is negligible here but does
>>> this really improve performance ?
>>>
>>
>> There's no doubt in theory, and it would have had a bad impact according to
>> commit e380bebe4771548 mm, compaction: keep migration source private to a single
>>
>> but I do have some problems running thpscale/mmtest. I'd like to see if anyone
>> could give it a try.
>>
>> BTW, I naturally hate false sharing even if it's only in theory. Anyone who doesn't? :)
>
> You keep bringing up false sharing but you don't fix the false sharing
> by doing this. You are still allowing the flags for multiple
> pageblocks per cacheline so you still have false sharing even after
> this.

Yes, the cacheline-level false sharing is still there. But as you pointed out,
the cmpxchg-level false sharing could be largely addressed by the patchset.

>
> What I believe you are attempting to address is the fact that multiple
> pageblocks share a single long value and that long is being used with
> a cmpxchg so you end up with multiple threads potentially all banging
> on the same value and watching it change. However the field currently
> consists of only 4 bits, 3 of them for migratetype and 1 for the skip
> bit.
> In the case of the 3 bit portion a cmpxchg makes sense and is
> usually protected by the zone lock so you would only have one thread
> accessing it in most cases with the possible exception of a section
> that spans multiple zones.
>
> For the case such as the skip bit and MIGRATE_UNMOVABLE (0x0) where we
> would be clearing or setting the entire mask maybe it would make more
> sense to simply use an atomic_or or atomic_and depending on if you are
> setting or clearing the flag? It would allow you to avoid the spinning
> or having to read the word before performing the operation since you
> would just be directly applying an AND or OR via a mask value.

Right, that is a different level at which to fix this problem, but narrowing
the cmpxchg comparison is still needed and helpful.

Thanks
Alex
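
For readers following the thread, the three approaches discussed above can be
sketched in userspace C11 as below. This is only an illustration, not the
kernel code or the patch itself: it uses <stdatomic.h> rather than the kernel's
cmpxchg/atomic helpers, and the names set_field_cmpxchg, set_skip, clear_skip,
set_byte_field_cmpxchg, get_field and the SKIP_BIT position are all assumptions
made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>

#define FLAG_BITS 4u     /* bits per pageblock, as described in the thread */
#define FLAG_MASK 0xFul  /* 3 migratetype bits + 1 skip bit */
#define SKIP_BIT  0x8ul  /* assumed position of the skip bit in the field */

/* Read one 4-bit field out of a word shared by several pageblocks. */
static unsigned long get_field(_Atomic unsigned long *word, unsigned int shift)
{
    return (atomic_load(word) >> shift) & FLAG_MASK;
}

/*
 * Shared-word cmpxchg: the compare covers the whole long, so a change to
 * any neighbouring pageblock's bits makes the exchange fail and retry.
 */
static void set_field_cmpxchg(_Atomic unsigned long *word, unsigned int shift,
                              unsigned long value)
{
    assert(shift + FLAG_BITS <= sizeof(unsigned long) * 8);
    unsigned long old = atomic_load(word);
    unsigned long next;
    do {
        next = (old & ~(FLAG_MASK << shift)) | ((value & FLAG_MASK) << shift);
        /* On failure, old is refreshed with the current word; loop again. */
    } while (!atomic_compare_exchange_weak(word, &old, next));
}

/* Single-bit updates need no read-compare-retry loop: one atomic OR/AND. */
static void set_skip(_Atomic unsigned long *word, unsigned int shift)
{
    atomic_fetch_or(word, SKIP_BIT << shift);
}

static void clear_skip(_Atomic unsigned long *word, unsigned int shift)
{
    atomic_fetch_and(word, ~(SKIP_BIT << shift));
}

/*
 * One byte per pageblock: the compare covers only this pageblock's byte,
 * so updates to neighbours no longer cause retries (they can still share
 * a cacheline, which this does not change).
 */
static void set_byte_field_cmpxchg(_Atomic unsigned char *flags,
                                   unsigned char value)
{
    unsigned char old = atomic_load(flags);
    unsigned char next;
    do {
        next = (unsigned char)((old & ~FLAG_MASK) | (value & FLAG_MASK));
    } while (!atomic_compare_exchange_weak(flags, &old, next));
}
```

The atomic OR/AND variants only fit cases where every affected bit moves in the
same direction (such as the skip bit); writing an arbitrary migratetype still
needs a read-modify-write of the field.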