Received: by 2002:a25:d7c1:0:0:0:0:0 with SMTP id o184csp1663518ybg; Sun, 27 Oct 2019 02:25:27 -0700 (PDT) X-Google-Smtp-Source: APXvYqzrz+ZEKzH8dyyKFpHy7O53z3tx/P5HfkF81diOoVGY8blw4gNOS/0cURRd2n/UTd8HuJST X-Received: by 2002:a17:906:2599:: with SMTP id m25mr6896795ejb.227.1572168327310; Sun, 27 Oct 2019 02:25:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1572168327; cv=none; d=google.com; s=arc-20160816; b=BUUjkb4fsG4rjCYfpzt214LDzW6nwXR/0EZ1yBdwUgp6Q0YCzoLaHlb3eByfhJUieq qw+BUt3gIqssy8rsNBpa3WzGpXxrkahnavcFufka/CpfcPov5tDDyZ2JGoC1tLl4pswr WZWpQg0yDQ1gHOtmz866C8Pr76KCywGI8QAjoPe8BWCD78W3V2kna7Ok27Q6Xqw1fHMH 9eRL7mJ6V0ABJq+4N7HI2rqyVIwNv+32N//cOCG+ojtEAji+bkSfxjytCrvwYAGPoc8z nPwicOeNwlej+tWKdu2gHTI1oWY384ysKzmFQkOHGXdWW7kduTiPVNRdmv34fgeJSMM+ uTeA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=5hjHvfXSMkcgXBmkACPGCMxpghIDPNJ8CFc7b3vEb8c=; b=QtaYbihjp+m8CKhUBUuofC2E3xhsO8l74PrguHwHXDufeOGfzm3X/VELf/TpAWNr3t WRtxSbX8Rh7d5BGAnIiFF9dEfFtjLQHRJ8yEgukEB7Fgo8HomRYvpQ50An1SDekIHJek c5L/6h+hFUHAFdm9zRWbBbWUTF0Q20kjk1QNA6el5AKVGH2K78n/Gx5yCdlc6wcH/Ulf +5JyW2AV5btMJgFxJ6yA6YRcaRzo3sQUc+yx/jjFysf720elfvCa8gT8z/vAO76PQ7xN Bz/bgekVoU1ZnAzVdmXlp7ZCiqm9nIN5IJ2khy8YHY6uQr7IpASiwCgF2Zb1Ye7OGGIx tyOw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@c-s.fr header.s=mail header.b=YTLpRmAB; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j11si5114237edj.43.2019.10.27.02.24.53; Sun, 27 Oct 2019 02:25:27 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@c-s.fr header.s=mail header.b=YTLpRmAB; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726622AbfJ0JV3 (ORCPT + 99 others); Sun, 27 Oct 2019 05:21:29 -0400 Received: from pegase1.c-s.fr ([93.17.236.30]:8025 "EHLO pegase1.c-s.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726106AbfJ0JV2 (ORCPT ); Sun, 27 Oct 2019 05:21:28 -0400 Received: from localhost (mailhub1-int [192.168.12.234]) by localhost (Postfix) with ESMTP id 471C5c0KBSz9vC0b; Sun, 27 Oct 2019 10:21:24 +0100 (CET) Authentication-Results: localhost; dkim=pass reason="1024-bit key; insecure key" header.d=c-s.fr header.i=@c-s.fr header.b=YTLpRmAB; dkim-adsp=pass; dkim-atps=neutral X-Virus-Scanned: Debian amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [192.168.12.234]) (amavisd-new, port 10024) with ESMTP id NhDefIqlusly; Sun, 27 Oct 2019 10:21:23 +0100 (CET) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 471C5b6BLqz9vC0Z; Sun, 27 Oct 2019 10:21:23 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=c-s.fr; s=mail; t=1572168083; bh=5hjHvfXSMkcgXBmkACPGCMxpghIDPNJ8CFc7b3vEb8c=; h=Subject:To:Cc:References:From:Date:In-Reply-To:From; b=YTLpRmAByL4MjS715uXHks+1TNjjNQI4ISm64uPIsrvheYndqpxyNo3StEgFCSovP Pk0iFqQrOCEyhZyayeeN3LoJzRrLwSZIDHxrzaKcnxfDEtXWNPytsanojbyrvLOgGv abYiyFb9QH1nrR8N0IQeh5t88CTrMQrO62GgGRV8= Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id C567F8B923; Sun, 27 Oct 2019 10:21:26 +0100 (CET) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id bdT3vZYeqZTv; Sun, 27 Oct 2019 10:21:26 +0100 (CET) Received: from [192.168.4.90] (unknown [192.168.4.90]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 1C7798B922; Sun, 27 Oct 2019 10:21:26 +0100 (CET) Subject: Re: [RFC PATCH] powerpc/32: Switch VDSO to C implementation. To: Segher Boessenkool , Thomas Gleixner Cc: linux-kernel@vger.kernel.org, Paul Mackerras , luto@kernel.org, vincenzo.frascino@arm.com, linuxppc-dev@lists.ozlabs.org References: <8ce3582f7f7da9ff0286ced857e5aa2e5ae6746e.1571662378.git.christophe.leroy@c-s.fr> <95bd2367-8edc-29db-faa3-7729661e05f2@c-s.fr> <439bce37-9c2c-2afe-9c9e-2f500472f9f8@c-s.fr> <20191026230609.GY28442@gate.crashing.org> From: Christophe Leroy Message-ID: <8e4d0b82-a7a1-b7f1-308e-df871b32d317@c-s.fr> Date: Sun, 27 Oct 2019 10:21:25 +0100 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <20191026230609.GY28442@gate.crashing.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: fr Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Le 27/10/2019 à 01:06, Segher Boessenkool a écrit : > On Sat, Oct 26, 2019 at 08:48:27PM +0200, Thomas Gleixner wrote: >> On Sat, 26 Oct 2019, Christophe Leroy wrote: >> Let's look at the code: >> >> __cvdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) >> { >> const struct vdso_data *vd = __arch_get_vdso_data(); >> >> if (likely(tv != NULL)) { >> struct __kernel_timespec ts; >> >> if (do_hres(&vd[CS_HRES_COARSE], CLOCK_REALTIME, &ts)) >> return gettimeofday_fallback(tv, tz); >> >> tv->tv_sec = ts.tv_sec; >> tv->tv_usec = (u32)ts.tv_nsec / NSEC_PER_USEC; >> >> IIRC PPC did some magic math tricks to avoid that. Could you just for the >> fun of it replace this division with >> >> (u32)ts.tv_nsec >> 10; > > On this particular CPU (the 885, right?) a division by 1000 is just 9 > cycles. On other CPUs it can be more, say 19 cycles like on the 750; not > cheap at all, but not hugely expensive either, comparatively. > > (A 64/32->32 division is expensive on all 32-bit PowerPC: there is no > hardware help for it at all, so it's all done in software.) > > Of course the compiler won't do a division by a constant with a division > instruction at all, so it's somewhat cheaper even, 5 or 6 cycles or so. > >> One thing which might be worth to try as well is to mark all functions in >> that file as inline. The speedup by the do_hres() inlining was impressive >> on PPC. > > The hand-optimised asm code will pretty likely win handsomely, whatever > you do. Especially on cores like the 885 (no branch prediction, single > issue, small caches, etc.: every instruction counts). > > Is there any reason to replace this hand-optimised code? It was written > for exacty this reason? These functions are critical and should be as > fast as possible. Well, all this started with COARSE clocks not being supported by PPC32 VDSO. I first submitted a series with a set of optimisations including the implementation of COARSE clocks (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=126779) Then after a comment received on patch 4 of the series from Santosh Sivaraj asking for a common implementation of it for PPC32 and PPC64, I started looking into making the whole VDSO source code common to PPC32 and PPC64. Most functions are similar. Time functions are also rather similar but unfortunately don't use the same registers. They also don't cover all possible clocks. And getres() is also buggy, see series https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=110321 So instead of reworking the existing time functions, I started investigating whether we could plug powerpc to the generic implementation. One drawback of PPC is that we need to setup an ASM trampoline to handle the SO bit as it can't be handled from C directly, can it ? How critical are these functions ? Although we have a slight degration with the C implementation, they are still way faster than the corresponding syscall. Another thing I was wondering, is it worth using the 64 bit timebase on PPC32 ? As far as I understand, the timebase is there to calculate a linear date update since last VDSO datapage update. How often is the VDSO datapage updated ? On the 885 clocked at 132Mhz, the timebase is at 8.25 Mhz, which means it needs more than 8 minutes to loop over 32 bits. Christophe