2015-11-27 08:53:56

by Rasmus Villemoes

[permalink] [raw]
Subject: no-op delay loops

Hi,

It seems that gcc happily compiles

for (i = 0; i < 1000000000; ++i) ;

into simply

i = 1000000000;

(which is then usually eliminated as a dead store). At least at -O2, and
when i is not declared volatile. So it would seem that the loops at

arch/mips/pci/pci-rt2880.c:235
arch/mips/pmcs-msp71xx/msp_setup.c:80
arch/mips/sni/reset.c:35

actually don't do anything. (In the middle one, i is 'register', but
that doesn't change anything.) Is mips compiled with some special flags
that would make gcc actually emit code for the above?

Rasmus


2015-11-27 09:04:55

by yalin wang

[permalink] [raw]
Subject: Re: no-op delay loops


> On Nov 27, 2015, at 16:53, Rasmus Villemoes <[email protected]> wrote:
>
> Hi,
>
> It seems that gcc happily compiles
>
> for (i = 0; i < 1000000000; ++i) ;
>
> into simply
>
> i = 1000000000;
>
> (which is then usually eliminated as a dead store). At least at -O2, and
> when i is not declared volatile. So it would seem that the loops at
>
> arch/mips/pci/pci-rt2880.c:235
> arch/mips/pmcs-msp71xx/msp_setup.c:80
> arch/mips/sni/reset.c:35
>
> actually don't do anything. (In the middle one, i is 'register', but
> that doesn't change anything.) Is mips compiled with some special flags
> that would make gcc actually emit code for the above?
>
you can try to declare i as volatile int i;
may gcc will not optimize it .

Thanks

2015-11-27 09:25:08

by Andy Shevchenko

[permalink] [raw]
Subject: Re: no-op delay loops

On Fri, Nov 27, 2015 at 11:04 AM, yalin wang <[email protected]> wrote:
>> On Nov 27, 2015, at 16:53, Rasmus Villemoes <[email protected]> wrote:

>> It seems that gcc happily compiles
>>
>> for (i = 0; i < 1000000000; ++i) ;
>>
>> into simply
>>
>> i = 1000000000;
>>
>> (which is then usually eliminated as a dead store). At least at -O2, and
>> when i is not declared volatile. So it would seem that the loops at
>>
>> arch/mips/pci/pci-rt2880.c:235
>> arch/mips/pmcs-msp71xx/msp_setup.c:80
>> arch/mips/sni/reset.c:35
>>
>> actually don't do anything. (In the middle one, i is 'register', but
>> that doesn't change anything.) Is mips compiled with some special flags
>> that would make gcc actually emit code for the above?
>>
> you can try to declare i as volatile int i;
> may gcc will not optimize it .

Might be, but Rasmus as I can see asked about *existing* code.


--
With Best Regards,
Andy Shevchenko

2015-11-27 11:18:01

by Arnd Bergmann

[permalink] [raw]
Subject: Re: no-op delay loops

On Friday 27 November 2015 09:53:50 Rasmus Villemoes wrote:
>
> It seems that gcc happily compiles
>
> for (i = 0; i < 1000000000; ++i) ;
>
> into simply
>
> i = 1000000000;
>
> (which is then usually eliminated as a dead store). At least at -O2, and
> when i is not declared volatile. So it would seem that the loops at
>
> arch/mips/pci/pci-rt2880.c:235
> arch/mips/pmcs-msp71xx/msp_setup.c:80
> arch/mips/sni/reset.c:35
>
> actually don't do anything. (In the middle one, i is 'register', but
> that doesn't change anything.) Is mips compiled with some special flags
> that would make gcc actually emit code for the above?
>

I remember that gcc used to not optimize code that looked like a
delay loop such as the above, and my tests show that this was still
the case in gcc-4.0.3, but starting with gcc-4.1 it opimtized away
that loop.

Arnd

2015-11-27 11:44:15

by Ralf Baechle

[permalink] [raw]
Subject: Re: no-op delay loops

On Fri, Nov 27, 2015 at 09:53:50AM +0100, Rasmus Villemoes wrote:

> It seems that gcc happily compiles
>
> for (i = 0; i < 1000000000; ++i) ;
>
> into simply
>
> i = 1000000000;
>
> (which is then usually eliminated as a dead store). At least at -O2, and
> when i is not declared volatile. So it would seem that the loops at
>
> arch/mips/pci/pci-rt2880.c:235
> arch/mips/pmcs-msp71xx/msp_setup.c:80
> arch/mips/sni/reset.c:35
>
> actually don't do anything. (In the middle one, i is 'register', but
> that doesn't change anything.) Is mips compiled with some special flags
> that would make gcc actually emit code for the above?

Thanks for reporting!

GCC used to intentionally not eleminate empty loops. This has changed a
while ago. Using volatile for the loop variable will result in atrocious
code so should be avoided.

One problem of these open coded loops is that it's not obvious how much
delay was actually intended so when fixing this I will have to do a bit
of precission guessing ;-)

Ralf

2015-11-30 21:29:31

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: no-op delay loops

On Fri, Nov 27 2015, Arnd Bergmann <[email protected]> wrote:

> On Friday 27 November 2015 09:53:50 Rasmus Villemoes wrote:
>>
>> It seems that gcc happily compiles
>>
>> for (i = 0; i < 1000000000; ++i) ;
>>
>> into simply
>>
>> i = 1000000000;
>>
>> (which is then usually eliminated as a dead store). At least at -O2, and
>> when i is not declared volatile. So it would seem that the loops at
>>
>> arch/mips/pci/pci-rt2880.c:235
>> arch/mips/pmcs-msp71xx/msp_setup.c:80
>> arch/mips/sni/reset.c:35
>>
>> actually don't do anything. (In the middle one, i is 'register', but
>> that doesn't change anything.) Is mips compiled with some special flags
>> that would make gcc actually emit code for the above?
>>
>
> I remember that gcc used to not optimize code that looked like a
> delay loop such as the above, and my tests show that this was still
> the case in gcc-4.0.3, but starting with gcc-4.1 it opimtized away
> that loop.

OK, thanks. That's a very very long time ago.

FWIW, the remaining instances that my trivial coccinelle script found
are

./arch/alpha/boot/main.c:187:1-4: no-op delay loop
./arch/m68k/68000/m68VZ328.c:86:10-13: no-op delay loop
./arch/m68k/bvme6000/config.c:338:2-5: no-op delay loop
./arch/m68k/coldfire/m53xx.c:533:1-4: no-op delay loop
./drivers/cpufreq/cris-artpec3-cpufreq.c:85:3-6: no-op delay loop
./drivers/cpufreq/cris-etraxfs-cpufreq.c:85:3-6: no-op delay loop
./drivers/tty/hvc/hvc_opal.c:313:3-6: no-op delay loop
./drivers/tty/hvc/hvc_vio.c:289:3-6: no-op delay loop

(cc += a few people). The tty ones use volatile, so they probably work,
though one might still want to use the *delay API.

Rasmus

2015-11-30 21:45:11

by Arnd Bergmann

[permalink] [raw]
Subject: Re: no-op delay loops

On Monday 30 November 2015 22:29:26 Rasmus Villemoes wrote:
>
> OK, thanks. That's a very very long time ago.
>
> FWIW, the remaining instances that my trivial coccinelle script found
> are
>
> ./arch/alpha/boot/main.c:187:1-4: no-op delay loop
> ./arch/m68k/68000/m68VZ328.c:86:10-13: no-op delay loop
> ./arch/m68k/bvme6000/config.c:338:2-5: no-op delay loop
> ./arch/m68k/coldfire/m53xx.c:533:1-4: no-op delay loop
> ./drivers/cpufreq/cris-artpec3-cpufreq.c:85:3-6: no-op delay loop
> ./drivers/cpufreq/cris-etraxfs-cpufreq.c:85:3-6: no-op delay loop
> ./drivers/tty/hvc/hvc_opal.c:313:3-6: no-op delay loop
> ./drivers/tty/hvc/hvc_vio.c:289:3-6: no-op delay loop
>
> (cc += a few people). The tty ones use volatile, so they probably work,
> though one might still want to use the *delay API.
>
>

I suspect the tty users do it like this to get console output
before calibrate_delay_loop() is called.

Ard

2015-12-01 01:26:00

by Ralf Baechle

[permalink] [raw]
Subject: Re: no-op delay loops

On Mon, Nov 30, 2015 at 10:29:26PM +0100, Rasmus Villemoes wrote:

> OK, thanks. That's a very very long time ago.
>
> FWIW, the remaining instances that my trivial coccinelle script found
> are

After your initial report I also wrote a coccinelle which is looking
also for delay loops implemented in while loops. It found the following
two:

diff -u -p ./drivers/video/uvesafb.c /tmp/nothing/drivers/video/uvesafb.c
--- ./drivers/video/uvesafb.c
+++ /tmp/nothing/drivers/video/uvesafb.c
@@ -1142,7 +1142,6 @@ static int uvesafb_blank(int blank, stru
vga_wseq(NULL, 0x00, seq);

crtc17 |= vga_rcrt(NULL, 0x17) & ~0x80;
- while (loop--);
vga_wcrt(NULL, 0x17, crtc17);
vga_wseq(NULL, 0x00, 0x03);
} else
diff -u -p ./arch/mips/pmc-sierra/yosemite/atmel_read_eeprom.c /tmp/nothing/arch/mips/pmc-sierra/yosemite/atmel_read_eeprom.c
--- ./arch/mips/pmc-sierra/yosemite/atmel_read_eeprom.c
+++ /tmp/nothing/arch/mips/pmc-sierra/yosemite/atmel_read_eeprom.c
@@ -37,7 +37,6 @@

static void delay(int delay)
{
- while (delay--);
}

static void send_bit(unsigned char bit)

The 2nd file falls into my domain so I'm going to fix it. Not sure
how the uvesafb one should be treated.

Ralf

2015-12-01 15:32:01

by Richard Henderson

[permalink] [raw]
Subject: Re: no-op delay loops

On 11/30/2015 01:29 PM, Rasmus Villemoes wrote:
> ./arch/alpha/boot/main.c:187:1-4: no-op delay loop

Not sure why this is there. The runkernel function that preceeds it cannot
return, having performed an indirect branch (not a call) to the kernel start.

I see no reason to actually change anything, unless someone insists that this
be remove to avoid future similar analysis.


r~