From: "Huang, Ying"
To: Peter Zijlstra
Cc: "Huang, Ying", Eric Dumazet, Ingo Molnar, Michael Ellerman, Borislav Petkov, Thomas Gleixner, Juergen Gross, Aaron Lu
Subject: Re: [PATCH 3/3] IPI: Avoid to use 2 cache lines for one call_single_data
Date: Mon, 28 Aug 2017 13:19:21 +0800
In-Reply-To: <8760dqln47.fsf@yhuang-dev.intel.com> (Ying Huang's message of "Mon, 14 Aug 2017 13:44:24 +0800")
Message-ID: <87a82kp8va.fsf@yhuang-dev.intel.com>

"Huang, Ying" writes:

> Hi, Peter,
>
> "Huang, Ying" writes:
>
>> Peter Zijlstra writes:
>>
>>> On Sat, Aug 05, 2017 at 08:47:02AM +0800, Huang, Ying wrote:
>>>> Yes. That looks good. So you will prepare the final patch? Or you
>>>> hope me to do that?
>>>
>>> I was hoping you'd do it ;-)
>>
>> Thanks! Here is the updated patch
>>
>> Best Regards,
>> Huang, Ying
>>
>> ---------->8----------
>> From 957735e9ff3922368286540dab852986fc7b23b5 Mon Sep 17 00:00:00 2001
>> From: Huang Ying
>> Date: Mon, 7 Aug 2017 16:55:33 +0800
>> Subject: [PATCH -v3] IPI: Avoid to use 2 cache lines for one
>>  call_single_data
>>
>> struct call_single_data is used in IPI to transfer information between
>> CPUs. Its size is bigger than sizeof(unsigned long) and less than
>> cache line size. Now, it is allocated with no explicit alignment
>> requirement. This makes it possible for allocated call_single_data to
>> cross 2 cache lines. So that double the number of the cache lines
>> that need to be transferred among CPUs.
>>
>> This is resolved by requiring call_single_data to be aligned with the
>> size of call_single_data. Now the size of call_single_data is the
>> power of 2. If we add new fields to call_single_data, we may need to
>> add pads to make sure the size of new definition is the power of 2.
>> Fortunately, this is enforced by gcc, which will report error for not
>> power of 2 alignment requirement.
>>
>> To set alignment requirement of call_single_data to the size of
>> call_single_data, a struct definition and a typedef is used.
>>
>> To test the effect of the patch, we use the vm-scalability multiple
>> thread swap test case (swap-w-seq-mt). The test will create multiple
>> threads and each thread will eat memory until all RAM and part of swap
>> is used, so that huge number of IPI will be triggered when unmapping
>> memory. In the test, the throughput of memory writing improves ~5%
>> compared with misaligned call_single_data because of faster IPI.
>
> What do you think about this version?
>

Ping.

Best Regards,
Huang, Ying
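
For anyone following the thread, the "struct definition and a typedef" approach from the quoted changelog can be sketched in user space roughly as below. This is only an illustration under stated assumptions, not the patch itself: struct demo_csd, demo_csd_t, fits_in_one_line() and demo_check() are hypothetical names, the field layout is a guess at something similar in size, and the 64-byte cache line is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for call_single_data; fields are illustrative only. */
struct demo_csd {
	void *next;             /* list linkage */
	void (*func)(void *);   /* function to run on the target CPU */
	void *info;             /* argument passed to func */
	unsigned int flags;
};

/*
 * The typedef carries the alignment requirement: align to the struct's own
 * size. GCC rejects an aligned() value that is not a power of 2, so a
 * future field that breaks the size shows up as a build error.
 */
typedef struct demo_csd demo_csd_t
	__attribute__((aligned(sizeof(struct demo_csd))));

/* Return 1 if [addr, addr + size) stays within one cache line of `line` bytes. */
static int fits_in_one_line(uintptr_t addr, size_t size, size_t line)
{
	return addr / line == (addr + size - 1) / line;
}

/* An object declared through the typedef is aligned to its own size. */
static void demo_check(void)
{
	static demo_csd_t csd;
	uintptr_t addr = (uintptr_t)&csd;

	assert(addr % sizeof(csd) == 0);
	/* 64-byte cache line assumed for this sketch. */
	assert(fits_in_one_line(addr, sizeof(csd), 64));
}
```

Because the size is a power of 2 no larger than the line size, size-aligned objects cannot straddle a line boundary, whereas a 32-byte object placed at offset 48 of a 64-byte line would touch two lines.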