Cc: akira.tsukamoto@gmail.com, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Linux Kernel Mailing List
Subject: Re: [PATCH v3 1/1] riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
To: Geert Uytterhoeven, Guenter Roeck
References: <3e1dbea4-3b0f-de32-5447-2e23c6d4652a@gmail.com> <60c1f087-1e8b-8f22-7d25-86f5f3dcee3f@gmail.com> <20210710014915.GA149706@roeck-us.net>
From: Akira Tsukamoto
Date: Thu, 15 Jul 2021 15:20:03 +0900
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/14/2021 3:10 AM, Geert Uytterhoeven wrote:
> Hi Günter, Tsukamoto-san,
>
> On Sat, Jul 10, 2021 at 3:50 AM Guenter Roeck wrote:
>> On Wed, Jun 23, 2021 at 09:40:39PM +0900,
>> Akira Tsukamoto wrote:
>>> This patch will reduce cpu usage dramatically in kernel space especially
>>> for application which use sys-call with large buffer size, such as network
>>> applications. The main reason behind this is that every unaligned memory
>>> access will raise exceptions and switch between s-mode and m-mode causing
>>> large overhead.
>>>
>>> First copy in bytes until reaches the first word aligned boundary in
>>> destination memory address. This is the preparation before the bulk
>>> aligned word copy.
>>>
>>> The destination address is aligned now, but oftentimes the source address
>>> is not in an aligned boundary. To reduce the unaligned memory access, it
>>> reads the data from source in aligned boundaries, which will cause the
>>> data to have an offset, and then combines the data in the next iteration
>>> by fixing offset with shifting before writing to destination. The majority
>>> of the improving copy speed comes from this shift copy.
>>>
>>> In the lucky situation that the both source and destination address are on
>>> the aligned boundary, perform load and store with register size to copy the
>>> data. Without the unrolling, it will reduce the speed since the next store
>>> instruction for the same register using from the load will stall the
>>> pipeline.
>>>
>>> At last, copying the remainder in one byte at a time.
>>>
>>> Signed-off-by: Akira Tsukamoto
>>
>> This patch causes all riscv32 qemu emulations to stall during boot.
>> The log suggests that something in kernel/user communication may be wrong.
>>
>> Bad case:
>>
>> Starting syslogd: OK
>> Starting klogd: OK
>> /etc/init.d/S02sysctl: line 68: syntax error: EOF in backquote substitution
>> /etc/init.d/S20urandom: line 1: syntax error: unterminated quoted string
>> Starting network: /bin/sh: syntax error: unterminated quoted string
>
>> # first bad commit: [ca6eaaa210deec0e41cbfc380bf89cf079203569] riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
>
> Same here on vexriscv. Bisected to the same commit.
>
> The actual scripts look fine when using "cat", but contain some garbage
> when executing them using "sh -v".
>
> Tsukamoto-san: glancing at the patch:
>
> + addi a0, a0, 8*SZREG
> + addi a1, a1, 8*SZREG
>
> I think you forgot about rv32, where registers cover only 4
> bytes each?

Thanks Günter and Geert for pointing out the errors.
I will send the fixes, probably this weekend.

Akira
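
For readers following the thread, the copy strategy in the quoted patch
description (byte copy up to destination alignment, then the shift copy,
then a byte-wise tail) can be sketched in portable C roughly as below.
This is only an illustration of the technique, not the RISC-V assembly
from the patch, and it does not include the unrolled loop. The names
shift_copy_sketch, word_t, WORD_SIZE and is_little_endian are made up for
this example. It assumes a little-endian machine and that the aligned
words containing the first and last source bytes are readable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: uintptr_t is register sized (4 bytes on rv32, 8 on rv64). */
typedef uintptr_t word_t;
#define WORD_SIZE sizeof(word_t)

/* Hypothetical helper: the shift directions below are only valid on
 * little-endian machines, so check that at run time. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Illustrative sketch of the copy strategy, not the kernel routine. */
static void shift_copy_sketch(unsigned char *dst, const unsigned char *src,
                              size_t n)
{
    assert(is_little_endian());

    /* Step 1: copy bytes until dst sits on a word boundary. */
    while (n > 0 && ((uintptr_t)dst % WORD_SIZE) != 0) {
        *dst++ = *src++;
        n--;
    }

    /* How far src is from the previous word boundary, in bytes. */
    size_t off = (uintptr_t)src % WORD_SIZE;

    if (off == 0) {
        /* Step 2a: both pointers aligned, plain word copy. */
        while (n >= WORD_SIZE) {
            memcpy(dst, src, WORD_SIZE);
            dst += WORD_SIZE;
            src += WORD_SIZE;
            n -= WORD_SIZE;
        }
    } else if (n >= WORD_SIZE) {
        /* Step 2b: shift copy. Read only aligned source words and
         * stitch two neighbouring words into one destination word. */
        const unsigned char *asrc = src - off;  /* src rounded down */
        unsigned int rshift = (unsigned int)(8 * off);
        unsigned int lshift = (unsigned int)(8 * (WORD_SIZE - off));
        word_t prev, next, out;

        memcpy(&prev, asrc, WORD_SIZE);
        while (n >= WORD_SIZE) {
            asrc += WORD_SIZE;
            memcpy(&next, asrc, WORD_SIZE);
            /* High bytes of prev become the low bytes of out,
             * low bytes of next fill the rest. */
            out = (prev >> rshift) | (next << lshift);
            memcpy(dst, &out, WORD_SIZE);
            prev = next;
            dst += WORD_SIZE;
            src += WORD_SIZE;
            n -= WORD_SIZE;
        }
    }

    /* Step 3: copy the remaining bytes one at a time. */
    while (n > 0) {
        *dst++ = *src++;
        n--;
    }
}
```

Note that the shift amounts are derived from WORD_SIZE rather than a
fixed constant, so the same sketch works for 4-byte and 8-byte words.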