Received: by 2002:ac0:aed5:0:0:0:0:0 with SMTP id t21csp2346870imb; Mon, 4 Mar 2019 02:47:36 -0800 (PST) X-Google-Smtp-Source: APXvYqxcGDYaLC0VCWO0PMSaagyppXCXBLWCtGyIwa6oHyQgfczhSYMO8LtR72tkChYPnoFSjYfQ X-Received: by 2002:a65:5142:: with SMTP id g2mr17978834pgq.149.1551696456012; Mon, 04 Mar 2019 02:47:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1551696456; cv=none; d=google.com; s=arc-20160816; b=lk3r0vJ5ojdbtupRWapz1I0xXiDBP2lp2ebh9X/dJVnD88t7YkbQKEkPnj6CEWA+sy bl24egSzN9tfaRnBYR1WDvV4WG6O7+08pfOk/fbClK6D24bKfttf9dcwILGp25y0rRIG iSHLaVfTlyDYoLfMQRcmn6hJ3B3o8ibGaYDPs6V3md4EZPOrPgsDZEX4/ddlpnqOYW1I nhFTSsyMZiNCnQURYD5XH7viMX1YRV7JGcVvvps3lYjNoxdHfzBq7UGWG7SZazgp43Uw mUlbD0QNC8fv3kuVzCTZxp7BANauQ2O5QX8UobfakXqs6mFDoiYQSN/MlXsqOkU3vzNQ ZU/g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:autocrypt:openpgp:from:references:cc:to:subject; bh=0AWwDcF4/rfquT8fkSN0C5u5XCoSaTIroFuqvtHocPg=; b=ZgjvfYUpvhDNeq7pW1k4pnVB5h8TbqVSYyVuxwlte05ei5nKKlCM1lBWrTIdBdXPtd Dn3JGnt2UaDuyMh9zeh5VYH3jLZhZq/SGk3xrlgOZZVsrz+w34+vI4wW/6325sG+YvFJ SKvhb7UOMIwmZKi+gfZ6Xc0Dmr9stTAcsdYhbFgzbDkvW171QUEMUDlk+z18IxZtuNFS b2kxZagmGR0hvHYrw6UneOHjcIbJopmYTN08eRAIR3JbIzPGe8vTtzxsgxgp+VMqP+d9 uWqgMawgIo++qi1rg28nFr5JV3XvNK5Z1LlvT+nDNQQ3PDeM7iQ/JBIryNcZrNZVnj6g Ld2A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 206si4948194pgb.323.2019.03.04.02.47.20; Mon, 04 Mar 2019 02:47:36 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726234AbfCDKq7 (ORCPT + 99 others); Mon, 4 Mar 2019 05:46:59 -0500 Received: from mx1.redhat.com ([209.132.183.28]:35330 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726061AbfCDKq7 (ORCPT ); Mon, 4 Mar 2019 05:46:59 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7B4D9D6561; Mon, 4 Mar 2019 10:46:58 +0000 (UTC) Received: from [10.36.112.48] (ovpn-112-48.ams2.redhat.com [10.36.112.48]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B276F5D71E; Mon, 4 Mar 2019 10:46:54 +0000 (UTC) Subject: Re: [PATCH v4 03/17] wlcore: Align reg_ch_conf_pending and tmp_ch_bitmap to unsigned long for better performance To: Peter Zijlstra , Fenghua Yu Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Dave Hansen , Ashok Raj , Ravi V Shankar , Xiaoyao Li , linux-kernel , x86 , kvm@vger.kernel.org References: <1551494711-213533-1-git-send-email-fenghua.yu@intel.com> <1551494711-213533-4-git-send-email-fenghua.yu@intel.com> <20190304101141.GB32477@hirez.programming.kicks-ass.net> From: Paolo Bonzini Openpgp: preference=signencrypt Autocrypt: addr=pbonzini@redhat.com; prefer-encrypt=mutual; keydata= mQHhBFRCcBIBDqDGsz4K0zZun3jh+U6Z9wNGLKQ0kSFyjN38gMqU1SfP+TUNQepFHb/Gc0E2 CxXPkIBTvYY+ZPkoTh5xF9oS1jqI8iRLzouzF8yXs3QjQIZ2SfuCxSVwlV65jotcjD2FTN04 hVopm9llFijNZpVIOGUTqzM4U55sdsCcZUluWM6x4HSOdw5F5Utxfp1wOjD/v92Lrax0hjiX DResHSt48q+8FrZzY+AUbkUS+Jm34qjswdrgsC5uxeVcLkBgWLmov2kMaMROT0YmFY6A3m1S P/kXmHDXxhe23gKb3dgwxUTpENDBGcfEzrzilWueOeUWiOcWuFOed/C3SyijBx3Av/lbCsHU Vx6pMycNTdzU1BuAroB+Y3mNEuW56Yd44jlInzG2UOwt9XjjdKkJZ1g0P9dwptwLEgTEd3Fo UdhAQyRXGYO8oROiuh+RZ1lXp6AQ4ZjoyH8WLfTLf5g1EKCTc4C1sy1vQSdzIRu3rBIjAvnC tGZADei1IExLqB3uzXKzZ1BZ+Z8hnt2og9hb7H0y8diYfEk2w3R7wEr+Ehk5NQsT2MPI2QBd wEv1/Aj1DgUHZAHzG1QN9S8wNWQ6K9DqHZTBnI1hUlkp22zCSHK/6FwUCuYp1zcAEQEAAbQj UGFvbG8gQm9uemluaSA8cGJvbnppbmlAcmVkaGF0LmNvbT6JAg0EEwECACMFAlRCcBICGwMH CwkIBwMCAQYVCAIJCgsEFgIDAQIeAQIXgAAKCRB+FRAMzTZpsbceDp9IIN6BIA0Ol7MoB15E 11kRz/ewzryFY54tQlMnd4xxfH8MTQ/mm9I482YoSwPMdcWFAKnUX6Yo30tbLiNB8hzaHeRj jx12K+ptqYbg+cevgOtbLAlL9kNgLLcsGqC2829jBCUTVeMSZDrzS97ole/YEez2qFpPnTV0 VrRWClWVfYh+JfzpXmgyhbkuwUxNFk421s4Ajp3d8nPPFUGgBG5HOxzkAm7xb1cjAuJ+oi/K CHfkuN+fLZl/u3E/fw7vvOESApLU5o0icVXeakfSz0LsygEnekDbxPnE5af/9FEkXJD5EoYG SEahaEtgNrR4qsyxyAGYgZlS70vkSSYJ+iT2rrwEiDlo31MzRo6Ba2FfHBSJ7lcYdPT7bbk9 AO3hlNMhNdUhoQv7M5HsnqZ6unvSHOKmReNaS9egAGdRN0/GPDWr9wroyJ65ZNQsHl9nXBqE AukZNr5oJO5vxrYiAuuTSd6UI/xFkjtkzltG3mw5ao2bBpk/V/YuePrJsnPFHG7NhizrxttB nTuOSCMo45pfHQ+XYd5K1+Cv/NzZFNWscm5htJ0HznY+oOsZvHTyGz3v91pn51dkRYN0otqr bQ4tlFFuVjArBZcapSIe6NV8C4cEiSS5AQ0EVEJxcwEIAK+nUrsUz3aP2aBjIrX3a1+C+39R nctpNIPcJjFJ/8WafRiwcEuLjbvJ/4kyM6K7pWUIQftl1P8Woxwb5nqL7zEFHh5I+hKS3haO 5pgco//V0tWBGMKinjqntpd4U4Dl299dMBZ4rRbPvmI8rr63sCENxTnHhTECyHdGFpqSzWzy 97rH68uqMpxbUeggVwYkYihZNd8xt1+lf7GWYNEO/QV8ar/qbRPG6PEfiPPHQd/sldGYavmd //o6TQLSJsvJyJDt7KxulnNT8Q2X/OdEuVQsRT5glLaSAeVAABcLAEnNgmCIGkX7TnQF8a6w gHGrZIR9ZCoKvDxAr7RP6mPeS9sAEQEAAYkDEgQYAQIACQUCVEJxcwIbAgEpCRB+FRAMzTZp scBdIAQZAQIABgUCVEJxcwAKCRC/+9JfeMeug/SlCACl7QjRnwHo/VzENWD9G2VpUOd9eRnS DZGQmPo6Mp3Wy8vL7snGFBfRseT9BevXBSkxvtOnUUV2YbyLmolAODqUGzUI8ViF339poOYN i6Ffek0E19IMQ5+CilqJJ2d5ZvRfaq70LA/Ly9jmIwwX4auvXrWl99/2wCkqnWZI+PAepkcX JRD4KY2fsvRi64/aoQmcxTiyyR7q3/52Sqd4EdMfj0niYJV0Xb9nt8G57Dp9v3Ox5JeWZKXS krFqy1qyEIypIrqcMbtXM7LSmiQ8aJRM4ZHYbvgjChJKR4PsKNQZQlMWGUJO4nVFSkrixc9R Z49uIqQK3b3ENB1QkcdMg9cxsB0Onih8zR+Wp1uDZXnz1ekto+EivLQLqvTjCCwLxxJafwKI bqhQ+hGR9jF34EFur5eWt9jJGloEPVv0GgQflQaE+rRGe+3f5ZDgRe5Y/EJVNhBhKcafcbP8 MzmLRh3UDnYDwaeguYmxuSlMdjFL96YfhRBXs8tUw6SO9jtCgBvoOIBDCxxAJjShY4KIvEpK b2hSNr8KxzelKKlSXMtB1bbHbQxiQcerAipYiChUHq1raFc3V0eOyCXK205rLtknJHhM5pfG 6taABGAMvJgm/MrVILIxvBuERj1FRgcgoXtiBmLEJSb7akcrRlqe3MoPTntSTNvNzAJmfWhd SvP0G1WDLolqvX0OtKMppI91AWVu72f1kolJg43wbaKpRJg1GMkKEI3H+jrrlTBrNl/8e20m TElPRDKzPiowmXeZqFSS1A6Azv0TJoo9as+lWF+P4zCXt40+Zhh5hdHO38EV7vFAVG3iuay6 7ToF8Uy7tgc3mdH98WQSmHcn/H5PFYk3xTP3KHB7b0FZPdFPQXBZb9+tJeZBi9gMqcjMch+Y R8dmTcQRQX14bm5nXlBF7VpSOPZMR392LY7wzAvRdhz7aeIUkdO7VelaspFk2nT7wOj1Y6uL nRxQlLkBDQRUQnHuAQgAx4dxXO6/Zun0eVYOnr5GRl76+2UrAAemVv9Yfn2PbDIbxXqLff7o yVJIkw4WdhQIIvvtu5zH24iYjmdfbg8iWpP7NqxUQRUZJEWbx2CRwkMHtOmzQiQ2tSLjKh/c HeyFH68xjeLcinR7jXMrHQK+UCEw6jqi1oeZzGvfmxarUmS0uRuffAb589AJW50kkQK9VD/9 QC2FJISSUDnRC0PawGSZDXhmvITJMdD4TjYrePYhSY4uuIV02v028TVAaYbIhxvDY0hUQE4r 8ZbGRLn52bEzaIPgl1p/adKfeOUeMReg/CkyzQpmyB1TSk8lDMxQzCYHXAzwnGi8WU9iuE1P 0wARAQABiQHzBBgBAgAJBQJUQnHuAhsMAAoJEH4VEAzNNmmxp1EOoJy0uZggJm7gZKeJ7iUp eX4eqUtqelUw6gU2daz2hE/jsxsTbC/w5piHmk1H1VWDKEM4bQBTuiJ0bfo55SWsUNN+c9hh IX+Y8LEe22izK3w7mRpvGcg+/ZRG4DEMHLP6JVsv5GMpoYwYOmHnplOzCXHvmdlW0i6SrMsB Dl9rw4AtIa6bRwWLim1lQ6EM3PWifPrWSUPrPcw4OLSwFk0CPqC4HYv/7ZnASVkR5EERFF3+ 6iaaVi5OgBd81F1TCvCX2BEyIDRZLJNvX3TOd5FEN+lIrl26xecz876SvcOb5SL5SKg9/rCB ufdPSjojkGFWGziHiFaYhbuI2E+NfWLJtd+ZvWAAV+O0d8vFFSvriy9enJ8kxJwhC0ECbSKF Y+W1eTIhMD3aeAKY90drozWEyHhENf4l/V+Ja5vOnW+gCDQkGt2Y1lJAPPSIqZKvHzGShdh8 DduC0U3xYkfbGAUvbxeepjgzp0uEnBXfPTy09JGpgWbg0w91GyfT/ujKaGd4vxG2Ei+MMNDm S1SMx7wu0evvQ5kT9NPzyq8R2GIhVSiAd2jioGuTjX6AZCFv3ToO53DliFMkVTecLptsXaes uUHgL9dKIfvpm+rNXRn9wAwGjk0X/A== Message-ID: <44bb6771-7aea-c44d-6605-45e7a1499d1b@redhat.com> Date: Mon, 4 Mar 2019 11:46:52 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: <20190304101141.GB32477@hirez.programming.kicks-ass.net> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Mon, 04 Mar 2019 10:46:58 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/03/19 11:11, Peter Zijlstra wrote: > On Fri, Mar 01, 2019 at 06:44:57PM -0800, Fenghua Yu wrote: >> A bit in reg_ch_conf_pending in wl271 and tmp_ch_bitmap is set >> atomically by set_bit(). set_bit() sets the bit in a single >> unsigned long location. If the variables are not aligned to >> unsigned long, set_bit() accesses two cache lines and thus causes >> slower performance. On x86, this scenario is called split lock and >> can cause overall performance degradation due to locked BTSL >> instruction in set_bit() locks bus. >> >> To avoid performance degradation, the two variables are aligned to >> unsigned long. >> >> Signed-off-by: Fenghua Yu --- >> drivers/net/wireless/ti/wlcore/cmd.c | 3 ++- >> drivers/net/wireless/ti/wlcore/wlcore.h | 6 ++++-- 2 files >> changed, 6 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/net/wireless/ti/wlcore/cmd.c >> b/drivers/net/wireless/ti/wlcore/cmd.c index >> 903968735a74..8d15a6307d44 100644 --- >> a/drivers/net/wireless/ti/wlcore/cmd.c +++ >> b/drivers/net/wireless/ti/wlcore/cmd.c @@ -1707,7 +1707,8 @@ int >> wlcore_cmd_regdomain_config_locked(struct wl1271 *wl) { struct >> wl12xx_cmd_regdomain_dfs_config *cmd = NULL; int ret = 0, i, b, >> ch_bit_idx; - u32 tmp_ch_bitmap[2]; + /* Align to unsigned long >> for better performance in set_bit() */ + u32 tmp_ch_bitmap[2] >> __aligned(sizeof(unsigned long)); This is the only place where an array of u32 is needed, because of cmd->ch_bit_map1 = cpu_to_le32(tmp_ch_bitmap[0]); cmd->ch_bit_map2 = cpu_to_le32(tmp_ch_bitmap[1]); All the others should use DECLARE_BITMAP, including reg_ch_conf_last which was already using __aligned. As Peter mentioned they should also use set_bit_le. Actually they do not need locked access at all because the only code paths to the set_bit take a mutex. There is one other place that is accessing the items of the array, but it is fixed easily. The following patch should do everything: ------------------- 8< -------------------------- From: Paolo Bonzini Subject: [PATCH] wlcore: simplify/fix/optimize reg_ch_conf_pending operations Bitmaps are defined on unsigned longs, so the usage of u32[2] in the wlcore driver is incorrect. As noted by Peter Zijlstra, casting arrays to a bitmap is incorrect for big-endian architectures. When looking at it I observed that: - operations on reg_ch_conf_pending is always under the wl_lock mutex, so set_bit is overkill - the only case where reg_ch_conf_pending is accessed a u32 at a time is unnecessary too. This patch cleans up everything in this area, and changes tmp_ch_bitmap to have the proper alignment. Reported-by: Fenghua Yu Signed-off-by: Paolo Bonzini diff --git a/drivers/net/wireless/ti/wlcore/cmd.c b/drivers/net/wireless/ti/wlcore/cmd.c index 903968735a74..3e093f3a7ec8 100644 --- a/drivers/net/wireless/ti/wlcore/cmd.c +++ b/drivers/net/wireless/ti/wlcore/cmd.c @@ -1700,14 +1700,14 @@ void wlcore_set_pending_regdomain_ch(struct wl1271 *wl, u16 channel, ch_bit_idx = wlcore_get_reg_conf_ch_idx(band, channel); if (ch_bit_idx >= 0 && ch_bit_idx <= WL1271_MAX_CHANNELS) - set_bit(ch_bit_idx, (long *)wl->reg_ch_conf_pending); + __set_bit_le(ch_bit_idx, (long *)wl->reg_ch_conf_pending); } int wlcore_cmd_regdomain_config_locked(struct wl1271 *wl) { struct wl12xx_cmd_regdomain_dfs_config *cmd = NULL; int ret = 0, i, b, ch_bit_idx; - u32 tmp_ch_bitmap[2]; + u32 tmp_ch_bitmap[2] __aligned(sizeof(unsigned long)); struct wiphy *wiphy = wl->hw->wiphy; struct ieee80211_supported_band *band; bool timeout = false; @@ -1717,7 +1717,7 @@ int wlcore_cmd_regdomain_config_locked(struct wl1271 *wl) wl1271_debug(DEBUG_CMD, "cmd reg domain config"); - memset(tmp_ch_bitmap, 0, sizeof(tmp_ch_bitmap)); + memcpy(tmp_ch_bitmap, wl->reg_ch_conf_pending, sizeof(tmp_ch_bitmap)); for (b = NL80211_BAND_2GHZ; b <= NL80211_BAND_5GHZ; b++) { band = wiphy->bands[b]; @@ -1738,13 +1738,10 @@ int wlcore_cmd_regdomain_config_locked(struct wl1271 *wl) if (ch_bit_idx < 0) continue; - set_bit(ch_bit_idx, (long *)tmp_ch_bitmap); + __set_bit_le(ch_bit_idx, (long *)tmp_ch_bitmap); } } - tmp_ch_bitmap[0] |= wl->reg_ch_conf_pending[0]; - tmp_ch_bitmap[1] |= wl->reg_ch_conf_pending[1]; - if (!memcmp(tmp_ch_bitmap, wl->reg_ch_conf_last, sizeof(tmp_ch_bitmap))) goto out; diff --git a/drivers/net/wireless/ti/wlcore/wlcore.h b/drivers/net/wireless/ti/wlcore/wlcore.h index dd14850b0603..870eea3e7a27 100644 --- a/drivers/net/wireless/ti/wlcore/wlcore.h +++ b/drivers/net/wireless/ti/wlcore/wlcore.h @@ -320,9 +320,9 @@ struct wl1271 { bool watchdog_recovery; /* Reg domain last configuration */ - u32 reg_ch_conf_last[2] __aligned(8); + DECLARE_BITMAP(reg_ch_conf_last, 64); /* Reg domain pending configuration */ - u32 reg_ch_conf_pending[2]; + DECLARE_BITMAP(reg_ch_conf_pending, 64); /* Pointer that holds DMA-friendly block for the mailbox */ void *mbox;