From: Eugeniy Paltsev
To: linux-snps-arc@lists.infradead.org, Vineet Gupta
Cc: linux-kernel@vger.kernel.org, Alexey Brodkin, Eugeniy Paltsev
Subject: [PATCH v2 1/5] ARCv2: lib: memcpy: fix doing prefetchw outside of buffer
Date: Wed, 30 Jan 2019 19:32:40 +0300
Message-Id: <20190130163244.10870-2-Eugeniy.Paltsev@synopsys.com>
X-Mailer: git-send-email 2.14.5
In-Reply-To: <20190130163244.10870-1-Eugeniy.Paltsev@synopsys.com>
References: <20190130163244.10870-1-Eugeniy.Paltsev@synopsys.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

ARCv2 optimized memcpy uses the PREFETCHW instruction to prefetch the
next cache line, but doesn't ensure that the line is not past the end
of the buffer. PREFETCHW changes the line ownership and marks it dirty,
which can cause data corruption if this area is used for DMA I/O.

Fix the issue by avoiding the PREFETCHW. This leads to performance
degradation, but that is OK as we'll introduce a new memcpy
implementation optimized for unaligned memory accesses.

We also cut off all PREFETCH instructions, as they are quite useless
here:
 * we call PREFETCH right before the LOAD instruction.
 * we copy 16 or 32 bytes of data (depending on CONFIG_ARC_HAS_LL64)
   in the main logical loop, so we call PREFETCH 4 times (or 2 times)
   for each L1 cache line (in the case of a 64B L1 cache line, which
   is the default). Obviously this is not optimal.
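The two arguments above can be checked with a little address arithmetic.
The following is only a sketch in C, not code from the patch: the
helper names prefetchw_escapes() and prefetches_per_line() are
hypothetical, the 64-byte line size is the default mentioned above, and
the buffer is assumed to be non-empty.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte L1 cache line, the default case named above. */
#define L1_LINE 64u

/*
 * Hypothetical helper, not part of the kernel: returns 1 if a
 * "prefetchw [cursor, offset]" issued while writing the buffer
 * [dst, dst + len) targets a cache line that holds no byte of that
 * buffer. Assumes len > 0.
 */
static int prefetchw_escapes(uintptr_t dst, size_t len,
                             uintptr_t cursor, unsigned offset)
{
    uintptr_t line_mask   = ~(uintptr_t)(L1_LINE - 1);
    uintptr_t target_line = (cursor + offset) & line_mask;
    uintptr_t last_line   = (dst + len - 1) & line_mask;

    return target_line > last_line;
}

/*
 * Hypothetical helper: how many loop iterations, each issuing one
 * prefetch, fall inside a single cache line.
 */
static unsigned prefetches_per_line(unsigned bytes_per_iteration)
{
    return L1_LINE / bytes_per_iteration;
}
```

With a 64-byte destination at a line-aligned address, say 0x1000, the
first "prefetchw [r3, 64]" already targets line 0x1040, which belongs
to whatever sits after the buffer. prefetches_per_line(16) and
prefetches_per_line(32) give the 4 and 2 quoted above.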
Signed-off-by: Eugeniy Paltsev
---
 arch/arc/lib/memcpy-archs.S | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/arch/arc/lib/memcpy-archs.S b/arch/arc/lib/memcpy-archs.S
index d61044dd8b58..ea14b0bf3116 100644
--- a/arch/arc/lib/memcpy-archs.S
+++ b/arch/arc/lib/memcpy-archs.S
@@ -25,15 +25,11 @@
 #endif
 
 #ifdef CONFIG_ARC_HAS_LL64
-# define PREFETCH_READ(RX)	prefetch    [RX, 56]
-# define PREFETCH_WRITE(RX)	prefetchw   [RX, 64]
 # define LOADX(DST,RX)		ldd.ab	DST, [RX, 8]
 # define STOREX(SRC,RX)		std.ab	SRC, [RX, 8]
 # define ZOLSHFT		5
 # define ZOLAND			0x1F
 #else
-# define PREFETCH_READ(RX)	prefetch    [RX, 28]
-# define PREFETCH_WRITE(RX)	prefetchw   [RX, 32]
 # define LOADX(DST,RX)		ld.ab	DST, [RX, 4]
 # define STOREX(SRC,RX)		st.ab	SRC, [RX, 4]
 # define ZOLSHFT		4
@@ -41,8 +37,6 @@
 #endif
 
 ENTRY_CFI(memcpy)
-	prefetch [r1]		; Prefetch the read location
-	prefetchw [r0]		; Prefetch the write location
 	mov.f	0, r2
 ;;; if size is zero
 	jz.d	[blink]
@@ -72,8 +66,6 @@ ENTRY_CFI(memcpy)
 	lpnz	@.Lcopy32_64bytes
 	;; LOOP START
 	LOADX (r6, r1)
-	PREFETCH_READ (r1)
-	PREFETCH_WRITE (r3)
 	LOADX (r8, r1)
 	LOADX (r10, r1)
 	LOADX (r4, r1)
@@ -117,9 +109,7 @@ ENTRY_CFI(memcpy)
 	lpnz	@.Lcopy8bytes_1
 	;; LOOP START
 	ld.ab	r6, [r1, 4]
-	prefetch [r1, 28]	;Prefetch the next read location
 	ld.ab	r8, [r1,4]
-	prefetchw [r3, 32]	;Prefetch the next write location
 
 	SHIFT_1	(r7, r6, 24)
 	or	r7, r7, r5
@@ -162,9 +152,7 @@ ENTRY_CFI(memcpy)
 	lpnz	@.Lcopy8bytes_2
 	;; LOOP START
 	ld.ab	r6, [r1, 4]
-	prefetch [r1, 28]	;Prefetch the next read location
 	ld.ab	r8, [r1,4]
-	prefetchw [r3, 32]	;Prefetch the next write location
 
 	SHIFT_1	(r7, r6, 16)
 	or	r7, r7, r5
@@ -204,9 +192,7 @@ ENTRY_CFI(memcpy)
 	lpnz	@.Lcopy8bytes_3
 	;; LOOP START
 	ld.ab	r6, [r1, 4]
-	prefetch [r1, 28]	;Prefetch the next read location
 	ld.ab	r8, [r1,4]
-	prefetchw [r3, 32]	;Prefetch the next write location
 
 	SHIFT_1	(r7, r6, 8)
 	or	r7, r7, r5
-- 
2.14.5
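
For readers who do not read ARC assembly, here is a rough C model of
what the aligned main loop looks like once the prefetches are gone, for
the CONFIG_ARC_HAS_LL64 case. It is a sketch under assumptions, not a
translation of memcpy-archs.S: the function name copy_model() is
hypothetical, the use of ZOLSHFT as the loop-count shift and ZOLAND as
the remainder mask is inferred from their values (5 and 0x1F, i.e. 32
bytes per iteration), and the unaligned paths are not modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values from the CONFIG_ARC_HAS_LL64 branch of the patch context. */
#define ZOLSHFT 5     /* assumed: loop count = size >> 5 (32-byte blocks) */
#define ZOLAND  0x1F  /* assumed: leftover bytes = size & 0x1F */

/* Hypothetical C model of the prefetch-free main loop. */
static void *copy_model(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t blocks = n >> ZOLSHFT;
    size_t tail = n & ZOLAND;

    while (blocks--) {
        uint64_t w[4];

        memcpy(w, s, sizeof(w));  /* stands in for four LOADX (8 bytes each) */
        memcpy(d, w, sizeof(w));  /* stands in for four STOREX */
        s += sizeof(w);
        d += sizeof(w);
    }

    /* Leftover bytes, copied one at a time in this model. */
    while (tail--)
        *d++ = *s++;

    return dst;
}
```

The point of the model is that every access stays inside
[dst, dst + n) and [src, src + n): with no prefetchw ahead of the write
pointer, nothing past the end of the destination is touched.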