2020-09-03 14:51:23

by Christoph Hellwig

[permalink] [raw]
Subject: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

Hi all,

this series removes the last set_fs() used to force a kernel address
space for the uaccess code in the kernel read/write/splice code, and then
stops implementing the address space overrides entirely for x86 and
powerpc.

[Note to Linus: I'd like to get this into linux-next rather earlier
than later. Do you think it is ok to add this tree to linux-next?]

The file system part has been posted a few times, and the read/write side
has been pretty much unchanced. For splice this series drops the
conversion of the seq_file and sysctl code to the iter ops, and thus loses
the splice support for them. The reasons for that is that it caused a lot
of churn for not much use - splice for these small files really isn't much
of a win, even if existing userspace uses it. All callers I found do the
proper fallback, but if this turns out to be an issue the conversion can
be resurrected.

Besides x86 and powerpc I plan to eventually convert all other
architectures, although this will be a slow process, starting with the
easier ones once the infrastructure is merged. The process to convert
architectures is roughtly:

(1) ensure there is no set_fs(KERNEL_DS) left in arch specific code
(2) implement __get_kernel_nofault and __put_kernel_nofault
(3) remove the arch specific address limitation functionality

Changes since v2:
- add back the patch to support splice through read_iter/write iter
on /proc/sys/*
- entirely remove the tests that depend on set_fs. Note that for
lkdtm the maintainer (Kees) disagrees with this request from Linus
- fix a wrong check in the powerpc access_ok, and drop a few spurious
cleanups there

Changes since v1:
- drop the patch to remove the non-iter ops for /dev/zero and
/dev/null as they caused a performance regression
- don't enable user access in __get_kernel on powerpc
- xfail the set_fs() based lkdtm tests

Diffstat:


2020-09-03 15:52:05

by Linus Torvalds

[permalink] [raw]
Subject: Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

On Thu, Sep 3, 2020 at 7:22 AM Christoph Hellwig <[email protected]> wrote:
>
> [Note to Linus: I'd like to get this into linux-next rather earlier
> than later. Do you think it is ok to add this tree to linux-next?]

This whole series looks really good to me now, with each patch looking
like a clear cleanup on its own.

So ack on the whole series, and yes, please get this into linux-next
and ready for 5.10. Whether through Al or something else.

Thanks,

Linus

2020-09-04 06:03:35

by Ingo Molnar

[permalink] [raw]
Subject: Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v3


* Christoph Hellwig <[email protected]> wrote:

> Hi all,
>
> this series removes the last set_fs() used to force a kernel address
> space for the uaccess code in the kernel read/write/splice code, and then
> stops implementing the address space overrides entirely for x86 and
> powerpc.

Cool! For the x86 bits:

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

2020-09-04 17:59:26

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

On Fri, Sep 04, 2020 at 08:00:24AM +0200, Ingo Molnar wrote:
> * Christoph Hellwig <[email protected]> wrote:
> > this series removes the last set_fs() used to force a kernel address
> > space for the uaccess code in the kernel read/write/splice code, and then
> > stops implementing the address space overrides entirely for x86 and
> > powerpc.
>
> Cool! For the x86 bits:
>
> Acked-by: Ingo Molnar <[email protected]>

set_fs() is older than some kernel hackers!

$ cd linux-0.11/
$ find . -type f -name '*.h' | xargs grep -e set_fs -w -n -A3
./include/asm/segment.h:61:extern inline void set_fs(unsigned long val)
./include/asm/segment.h-62-{
./include/asm/segment.h-63- __asm__("mov %0,%%fs"::"a" ((unsigned short) val));
./include/asm/segment.h-64-}

2020-09-04 18:46:37

by Linus Torvalds

[permalink] [raw]
Subject: Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

On Fri, Sep 4, 2020 at 10:58 AM Alexey Dobriyan <[email protected]> wrote:
>
> set_fs() is older than some kernel hackers!
>
> $ cd linux-0.11/
> $ find . -type f -name '*.h' | xargs grep -e set_fs -w -n -A3

Oh, it's older than that. It was there (as set_fs) in 0.10, and may
even predate that. But sadly, I don't have tar-balls for 0.02 and
0.03, so can't check.

The actual use of %fs as the user space segment is already there in
0.01, but there was no 'set_fs()'. That was a simpler and more direct
time, and "get_fs()" looked like this back then:

#define _fs() ({ \
register unsigned short __res; \
__asm__("mov %%fs,%%ax":"=a" (__res):); \
__res;})

and all the setting was basically part of the kernel entry asm and. Lovely.

Linus

2020-09-04 21:02:09

by David Laight

[permalink] [raw]
Subject: RE: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

From: Alexey Dobriyan
> Sent: 04 September 2020 18:58
>
> On Fri, Sep 04, 2020 at 08:00:24AM +0200, Ingo Molnar wrote:
> > * Christoph Hellwig <[email protected]> wrote:
> > > this series removes the last set_fs() used to force a kernel address
> > > space for the uaccess code in the kernel read/write/splice code, and then
> > > stops implementing the address space overrides entirely for x86 and
> > > powerpc.
> >
> > Cool! For the x86 bits:
> >
> > Acked-by: Ingo Molnar <[email protected]>
>
> set_fs() is older than some kernel hackers!
>
> $ cd linux-0.11/
> $ find . -type f -name '*.h' | xargs grep -e set_fs -w -n -A3
> ./include/asm/segment.h:61:extern inline void set_fs(unsigned long val)
> ./include/asm/segment.h-62-{
> ./include/asm/segment.h-63- __asm__("mov %0,%%fs"::"a" ((unsigned short) val));
> ./include/asm/segment.h-64-}

What is this strange %fs register you are talking about.
Figure 2-4 only has CS, DS, SS and ES.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-09-05 07:17:49

by Christophe Leroy

[permalink] [raw]
Subject: Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v3



Le 04/09/2020 à 23:01, David Laight a écrit :
> From: Alexey Dobriyan
>> Sent: 04 September 2020 18:58
>>
>> On Fri, Sep 04, 2020 at 08:00:24AM +0200, Ingo Molnar wrote:
>>> * Christoph Hellwig <[email protected]> wrote:
>>>> this series removes the last set_fs() used to force a kernel address
>>>> space for the uaccess code in the kernel read/write/splice code, and then
>>>> stops implementing the address space overrides entirely for x86 and
>>>> powerpc.
>>>
>>> Cool! For the x86 bits:
>>>
>>> Acked-by: Ingo Molnar <[email protected]>
>>
>> set_fs() is older than some kernel hackers!
>>
>> $ cd linux-0.11/
>> $ find . -type f -name '*.h' | xargs grep -e set_fs -w -n -A3
>> ./include/asm/segment.h:61:extern inline void set_fs(unsigned long val)
>> ./include/asm/segment.h-62-{
>> ./include/asm/segment.h-63- __asm__("mov %0,%%fs"::"a" ((unsigned short) val));
>> ./include/asm/segment.h-64-}
>
> What is this strange %fs register you are talking about.
> Figure 2-4 only has CS, DS, SS and ES.
>

Intel added registers FS and GS in the i386

Christophe

2020-09-05 10:16:28

by David Laight

[permalink] [raw]
Subject: RE: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

From: Christophe Leroy
> Sent: 05 September 2020 08:16
>
> Le 04/09/2020 à 23:01, David Laight a écrit :
> > From: Alexey Dobriyan
> >> Sent: 04 September 2020 18:58
...
> > What is this strange %fs register you are talking about.
> > Figure 2-4 only has CS, DS, SS and ES.
> >
>
> Intel added registers FS and GS in the i386

I know, I've got both the 'iAPX 286 Programmer's Reference Manual'
and the '80386 Programmer's Reference Manual' on my shelf.

I don't have the 8088 book though - which I used in 1982.

The old books are a lot easier to read if, for instance,
you are trying to work out how to back and forth to real mode
to do bios calls.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-09-10 15:35:11

by David Laight

[permalink] [raw]
Subject: RE: remove the last set_fs() in common code, and remove it for x86 and powerpc v3



> -----Original Message-----
> From: Segher Boessenkool <[email protected]>
> Sent: 10 September 2020 16:21
> To: David Laight <[email protected]>
> Cc: 'Christophe Leroy' <[email protected]>; 'Linus Torvalds' <torvalds@linux-
> foundation.org>; linux-arch <[email protected]>; Kees Cook <[email protected]>; the
> arch/x86 maintainers <[email protected]>; Nick Desaulniers <[email protected]>; Linux Kernel
> Mailing List <[email protected]>; Alexey Dobriyan <[email protected]>; Luis Chamberlain
> <[email protected]>; Al Viro <[email protected]>; linux-fsdevel <[email protected]>;
> linuxppc-dev <[email protected]>; Christoph Hellwig <[email protected]>
> Subject: Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v3
>
> On Thu, Sep 10, 2020 at 12:26:53PM +0000, David Laight wrote:
> > Actually this is pretty sound:
> > __label__ label;
> > register int eax asm ("eax");
> > // Ensure eax can't be reloaded from anywhere
> > // In particular it can't be reloaded after the asm goto line
> > asm volatile ("" : "=r" (eax));
>
> This asm is fine. It says it writes the "eax" variable, which lives in
> the eax register *in that asm* (so *not* guaranteed after it!).
>
> > // Provided gcc doesn't save eax here...
> > asm volatile goto ("xxxxx" ::: "eax" : label);
>
> So this is incorrect.

From the other email:

> It is neither input nor output operand here! Only *then* is a local
> register asm guaranteed to be in the given reg: as input or output to an
> inline asm.

Ok, so adding '"r" (eax)' to the input section helps a bit.

> > // ... and reload the saved value here.
> > // The input value here will be that modified by the 'asm goto'.
> > // Since this modifies eax it can't be moved before the 'asm goto'.
> > asm volatile ("" : "+r" (eax));
> > // So here eax must contain the value set by the "xxxxx" instructions.
>
> No, the register eax will contain the value of the eax variable. In the
> asm; it might well be there before or after the asm as well, but none of
> that is guaranteed.

Perhaps not 'guaranteed', but very unlikely to be wrong.
It doesn't give gcc much scope for not generating the desired code.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-09-10 17:30:25

by Segher Boessenkool

[permalink] [raw]
Subject: Re: remove the last set_fs() in common code, and remove it for x86 and powerpc v3

On Thu, Sep 10, 2020 at 03:31:53PM +0000, David Laight wrote:
> > > asm volatile ("" : "+r" (eax));
> > > // So here eax must contain the value set by the "xxxxx" instructions.
> >
> > No, the register eax will contain the value of the eax variable. In the
> > asm; it might well be there before or after the asm as well, but none of
> > that is guaranteed.
>
> Perhaps not 'guaranteed', but very unlikely to be wrong.
> It doesn't give gcc much scope for not generating the desired code.

Wanna bet? :-)

Correct is correct. Anything else is not.


Segher