Hi all,
How desirable/portable is it to use __int128 on non-x86 64-bit
architectures to get a 64*64 -> 128 bit multiply? On x86-64 this works
extremely well, but I'm worried about that needlessly breaking on other
architectures.
In particular, it looks opportune to use a scaling-by-multiply instead
of a multiply-divide on lines 253 and 269 of kernel/time.c:
243 unsigned int jiffies_to_msecs(const unsigned long j)
244 {
245 #if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ)
246 	return (MSEC_PER_SEC / HZ) * j;
247 #elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC)
248 	return (j + (HZ / MSEC_PER_SEC) - 1)/(HZ / MSEC_PER_SEC);
249 #else
250 # if BITS_PER_LONG == 32
251 	return (HZ_TO_MSEC_MUL32 * j) >> HZ_TO_MSEC_SHR32;
252 # else
253 	return (j * HZ_TO_MSEC_NUM) / HZ_TO_MSEC_DEN;
254 # endif
255 #endif
256 }
257 EXPORT_SYMBOL(jiffies_to_msecs);
258
259 unsigned int jiffies_to_usecs(const unsigned long j)
260 {
261 #if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ)
262 	return (USEC_PER_SEC / HZ) * j;
263 #elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC)
264 	return (j + (HZ / USEC_PER_SEC) - 1)/(HZ / USEC_PER_SEC);
265 #else
266 # if BITS_PER_LONG == 32
267 	return (HZ_TO_USEC_MUL32 * j) >> HZ_TO_USEC_SHR32;
268 # else
269 	return (j * HZ_TO_USEC_NUM) / HZ_TO_USEC_DEN;
270 # endif
271 #endif
272 }
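
To make the scaling-by-multiply idea concrete, here is a rough sketch of what
the 64-bit branch could look like with a 128-bit intermediate. Everything
named EX_* and ex_scale() is hypothetical and exists only for this example;
the 10/3 ratio is an arbitrary stand-in (it is what 1000/HZ reduces to for
HZ = 300), and rounding the multiplier up is a choice made here, not
something taken from kernel/time.c.

```c
#include <stdint.h>

/* Hypothetical example constants: scale by EX_NUM / EX_DEN = 10 / 3. */
#define EX_NUM 10ULL
#define EX_DEN 3ULL
#define EX_SHR 60

/* Multiplier = ceil((EX_NUM << EX_SHR) / EX_DEN); fits in 64 bits here. */
#define EX_MUL (((EX_NUM << EX_SHR) + EX_DEN - 1) / EX_DEN)

/*
 * Replace (j * EX_NUM) / EX_DEN with one widening multiply and a shift.
 * The 64x64 product needs more than 64 bits, hence unsigned __int128.
 * With these particular constants the result equals the truncating
 * divide for j below 2^59; larger j can come out one too high.
 */
static inline uint64_t ex_scale(uint64_t j)
{
	unsigned __int128 product = (unsigned __int128)j * EX_MUL;

	return (uint64_t)(product >> EX_SHR);
}
```

Whether this is actually a win depends on the compiler emitting a native
widening multiply for the product rather than a library call.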
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Sun, 17 Mar 2013, H. Peter Anvin wrote:
> How desirable/portable is it to use __int128 on non-x86 64-bit
> architectures to get a 64*64 -> 128 bit multiply? On x86-64 this works
> extremely well, but I'm worried about that needlessly breaking on other
> architectures.
Hmm, nobody has replied, so just FYI such widening multiplication is
available in all 64-bit MIPS hardware and GCC has supported it since 4.4
or mid 2008 (older versions used a libgcc __multi3 helper, not quite so
efficient as you can imagine).
$ cat mulditi3.c
typedef int int128_t __attribute__((mode(TI)));

int128_t mulditi3(long x, long y)
{
	int128_t _x = x, _y = y;
	return _x * _y;
}
$ mips64-linux-gcc -O2 -c mulditi3.c
$ mips64-linux-objdump -d mulditi3.o

mulditi3.o:     file format elf64-tradbigmips

Disassembly of section .text:

0000000000000000 <mulditi3>:
   0:	0085001c 	dmult	a0,a1
   4:	00001812 	mflo	v1
   8:	03e00008 	jr	ra
   c:	00001010 	mfhi	v0
$
(MFLO and MFHI are register moves from the MD accumulator to GPRs).
There's an unsigned instruction variant as well.
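
A sketch of the unsigned counterpart of the test case, in the same style.
The names uint128_ex_t, mulhi_u64() and mullo_u64() are made up for this
example, and it has not been run through a MIPS compiler here, so which
instruction it selects on a given target is an assumption.

```c
#include <stdint.h>

/* Unsigned 128-bit type via the same mode(TI) attribute as above. */
typedef unsigned int uint128_ex_t __attribute__((mode(TI)));

/* High 64 bits of the unsigned 64x64 -> 128 bit product. */
static inline uint64_t mulhi_u64(uint64_t x, uint64_t y)
{
	uint128_ex_t product = (uint128_ex_t)x * y;

	return (uint64_t)(product >> 64);
}

/* Low 64 bits of the same product. */
static inline uint64_t mullo_u64(uint64_t x, uint64_t y)
{
	uint128_ex_t product = (uint128_ex_t)x * y;

	return (uint64_t)product;
}
```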
HTH,
Maciej
On Sun, 21 Apr 2013 05:29:28 +0100 (BST) "Maciej W. Rozycki" <[email protected]> wrote:
>
> Hmm, nobody has replied, so just FYI such widening multiplication is
> available in all 64-bit MIPS hardware and GCC has supported it since 4.4
> or mid 2008 (older versions used a libgcc __multi3 helper, not quite so
> efficient as you can imagine).
$ powerpc64-linux-gcc --version
powerpc64-linux-gcc (GCC) 4.6.3
...
$ powerpc64-linux-gcc -O2 -c mulditi3.c
$ powerpc64-linux-objdump -d -r mulditi3.o

mulditi3.o:     file format elf64-powerpc

Disassembly of section .text:

0000000000000000 <.mulditi3>:
   0:	7c 08 02 a6 	mflr	r0
   4:	7c 86 23 78 	mr	r6,r4
   8:	7c c5 fe 76 	sradi	r5,r6,63
   c:	f8 01 00 10 	std	r0,16(r1)
  10:	7c 64 1b 78 	mr	r4,r3
  14:	f8 21 ff 91 	stdu	r1,-112(r1)
  18:	7c 63 fe 76 	sradi	r3,r3,63
  1c:	48 00 00 01 	bl	1c <.mulditi3+0x1c>
			1c: R_PPC64_REL24	__multi3
  20:	60 00 00 00 	nop
  24:	38 21 00 70 	addi	r1,r1,112
  28:	e8 01 00 10 	ld	r0,16(r1)
  2c:	7c 08 03 a6 	mtlr	r0
  30:	4e 80 00 20 	blr
  34:	00 00 00 00 	.long 0x0
  38:	00 00 00 01 	.long 0x1
  3c:	80 00 00 00 	lwz	r0,0(0)
$
i.e. with gcc 4.6.3, 64-bit powerpc calls out to __multi3.
The same is true for sparc64.
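
Where the compiler falls back to __multi3, one way to avoid depending on
that helper is to build the high half of the product from 32x32 -> 64
partial products. A minimal sketch, assuming only the upper 64 bits of an
unsigned product are needed; mulhi_u64_portable() is a hypothetical name,
and no claim is made that this beats the libgcc routine.

```c
#include <stdint.h>

/*
 * High 64 bits of an unsigned 64x64 -> 128 bit product, computed with
 * 64-bit arithmetic only (schoolbook multiplication on 32-bit halves),
 * so no 128-bit type and no __multi3 call is involved.
 */
static inline uint64_t mulhi_u64_portable(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

	uint64_t p0 = a_lo * b_lo;	/* contributes to bits   0..63  */
	uint64_t p1 = a_lo * b_hi;	/* contributes to bits  32..95  */
	uint64_t p2 = a_hi * b_lo;	/* contributes to bits  32..95  */
	uint64_t p3 = a_hi * b_hi;	/* contributes to bits  64..127 */

	/* Sum the column at bits 32..63; its overflow carries into bit 64. */
	uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

	return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```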
--
Cheers,
Stephen Rothwell [email protected]
On Sun, Apr 21, 2013 at 07:35:31PM +1000, Stephen Rothwell wrote:
> On Sun, 21 Apr 2013 05:29:28 +0100 (BST) "Maciej W. Rozycki" <[email protected]> wrote:
> >
> > Hmm, nobody has replied, so just FYI such widening multiplication is
> > available in all 64-bit MIPS hardware and GCC has supported it since 4.4
> > or mid 2008 (older versions used a libgcc __multi3 helper, not quite so
> > efficient as you can imagine).
>
> i.e. with gcc 4.6.3, 64-bit powerpc calls out to __multi3.
>
> The same is true for sparc64.
Likewise, with gcc-4.6.3, alpha calls out to __multi3.
Cheers
Michael.