From: Jani Nikula
To: Alex Deucher, Bernard Zhao
Cc: Alex Sierra, Oak Zeng, Mailing list - DRI developers, David Airlie, Felix Kuehling, LKML, amd-gfx list, kernel@vivo.com, Huang Rui, Kent Russell, Alex Deucher, Sam Ravnborg, Christian König, Xiaojie Yuan
Subject: Re: [PATCH] Optimized division operation to shift operation
Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
References: <1586864113-30682-1-git-send-email-bernard@vivo.com>
Date: Wed, 15 Apr 2020 10:41:37 +0300
Message-ID: <87lfmx5h72.fsf@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 14 Apr 2020, Alex Deucher wrote:
> On Tue, Apr 14, 2020 at 9:05 AM Bernard Zhao wrote:
>>
>> On some processors, the / operator will call the compiler's division
>> library, which is inefficient. We can replace the / operation with a
>> shift, so that the call into the division library is replaced with
>> one shift assembly instruction.

This was applied already, and it's not in a driver I look after... but
to me this feels like something that really should be justified. Using
a >> shift instead of / for division by powers of 2 mattered 20 years
ago; I'd be surprised if it still did with modern compilers.

BR,
Jani.

>>
>> Signed-off-by: Bernard Zhao
>
> Applied.  thanks.
>
> Alex
>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 4 ++--
>>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 4 ++--
>>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 4 ++--
>>  3 files changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> index b205039..66cd078 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> @@ -175,10 +175,10 @@ static int gmc_v6_0_mc_load_microcode(struct amdgpu_device *adev)
>>         amdgpu_ucode_print_mc_hdr(&hdr->header);
>>
>>         adev->gmc.fw_version = le32_to_cpu(hdr->header.ucode_version);
>> -       regs_size = le32_to_cpu(hdr->io_debug_size_bytes) / (4 * 2);
>> +       regs_size = le32_to_cpu(hdr->io_debug_size_bytes) >> 3;
>>         new_io_mc_regs = (const __le32 *)
>>                 (adev->gmc.fw->data + le32_to_cpu(hdr->io_debug_array_offset_bytes));
>> -       ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) / 4;
>> +       ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) >> 2;
>>         new_fw_data = (const __le32 *)
>>                 (adev->gmc.fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes));
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>> index 9da9596..ca26d63 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>> @@ -193,10 +193,10 @@ static int gmc_v7_0_mc_load_microcode(struct amdgpu_device *adev)
>>         amdgpu_ucode_print_mc_hdr(&hdr->header);
>>
>>         adev->gmc.fw_version = le32_to_cpu(hdr->header.ucode_version);
>> -       regs_size = le32_to_cpu(hdr->io_debug_size_bytes) / (4 * 2);
>> +       regs_size = le32_to_cpu(hdr->io_debug_size_bytes) >> 3;
>>         io_mc_regs = (const __le32 *)
>>                 (adev->gmc.fw->data + le32_to_cpu(hdr->io_debug_array_offset_bytes));
>> -       ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) / 4;
>> +       ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) >> 2;
>>         fw_data = (const __le32 *)
>>                 (adev->gmc.fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes));
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> index 27d83204..295039c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> @@ -318,10 +318,10 @@ static int gmc_v8_0_tonga_mc_load_microcode(struct amdgpu_device *adev)
>>         amdgpu_ucode_print_mc_hdr(&hdr->header);
>>
>>         adev->gmc.fw_version = le32_to_cpu(hdr->header.ucode_version);
>> -       regs_size = le32_to_cpu(hdr->io_debug_size_bytes) / (4 * 2);
>> +       regs_size = le32_to_cpu(hdr->io_debug_size_bytes) >> 3;
>>         io_mc_regs = (const __le32 *)
>>                 (adev->gmc.fw->data + le32_to_cpu(hdr->io_debug_array_offset_bytes));
>> -       ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) / 4;
>> +       ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) >> 2;
>>         fw_data = (const __le32 *)
>>                 (adev->gmc.fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes));
>>
>> --
>> 2.7.4
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Jani Nikula, Intel Open Source Graphics Center