2002-03-16 00:00:09

by Balbir Singh

Subject: Nice values for kernel modules

In older v2.4 kernels we could directly access current->nice
and set it to any value we wanted. This has now
been replaced by set_user_nice(). The problem
I face is that task_nice() is not exported, so
my kernel module cannot use it to read the current
nice value.

Was there some reason for hiding the nice value from
kernel modules?

I can think of the following solutions:

0. I could use the TASK_NICE() macro, but I would
like to avoid using it.
1. Export task_nice in ksyms.c
2. Use sys_nice() using a user space disguise.
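
[Editor's sketch] For readers unfamiliar with option 0: the macro derives the nice value from the task's static priority. The user-space C sketch below mimics that mapping. The macro bodies and MAX_RT_PRIO = 100 are recalled from the O(1) scheduler's kernel/sched.c and should be treated as assumptions; struct mock_task, mock_set_nice() and mock_task_nice() are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>

/* Assumed values, as recalled from the O(1) scheduler (not verified here). */
#define MAX_RT_PRIO 100
#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
#define TASK_NICE(p)       PRIO_TO_NICE((p)->static_prio)

/* Hypothetical stand-in for the one task_struct field the macro reads. */
struct mock_task {
    int static_prio;
};

/* Hypothetical setter: rejects values outside the usual -20..19 range. */
static int mock_set_nice(struct mock_task *p, int nice)
{
    if (nice < -20 || nice > 19)
        return -1;
    p->static_prio = NICE_TO_PRIO(nice);
    return 0;
}

/* Hypothetical reader: what an exported task_nice() would hand back. */
static int mock_task_nice(const struct mock_task *p)
{
    assert(p != 0);
    return TASK_NICE(p);
}
```

With these assumed constants a nice value of 0 corresponds to a static priority of 120, and nice -5 to 115. A module using the macro directly depends on the scheduler's internal priority layout, which is presumably why the poster would rather avoid it.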

Comments,
Balbir Singh.



2002-03-16 09:54:15

by Tigran Aivazian

Subject: Re: Nice values for kernel modules

On Fri, 15 Mar 2002, Balbir Singh wrote:
> 0. I could use the TASK_NICE() macro, but I would
> like to avoid using it.
> 1. Export task_nice in ksyms.c
> 2. Use sys_nice() using a user space disguise.

jump to sys_nice() indirectly via exported sys_call_table[].

Regards,
Tigran

2002-03-16 10:00:45

by Keith Owens

Subject: Re: Nice values for kernel modules

On Sat, 16 Mar 2002 09:51:16 +0000 (GMT),
<[email protected]> wrote:
>On Fri, 15 Mar 2002, Balbir Singh wrote:
>> 0. I could use the TASK_NICE() macro, but I would
>> like to avoid using it.
>> 1. Export task_nice in ksyms.c
>> 2. Use sys_nice() using a user space disguise.
>
>jump to sys_nice() indirectly via exported sys_call_table[].

Breaks on ia64 and ppc.

2002-03-16 11:08:49

by Paul Mackerras

Subject: Re: Nice values for kernel modules

Keith Owens writes:

> On Sat, 16 Mar 2002 09:51:16 +0000 (GMT),
> <[email protected]> wrote:
> >jump to sys_nice() indirectly via exported sys_call_table[].
>
> Breaks on ia64 and ppc.

Not that I want to encourage this sort of thing, but why would it
break on ppc?

Paul.

2002-03-16 11:23:48

by Tigran Aivazian

Subject: Re: Nice values for kernel modules

On Sat, 16 Mar 2002, Paul Mackerras wrote:

> Keith Owens writes:
>
> > On Sat, 16 Mar 2002 09:51:16 +0000 (GMT),
> > <[email protected]> wrote:
> > >jump to sys_nice() indirectly via exported sys_call_table[].
> >
> > Breaks on ia64 and ppc.
>
> Not that I want to encourage this sort of thing, but why would it
> break on ppc?

and also why would it break on ia64? I can understand __mips but why ia64?

Regards
Tigran

2002-03-16 12:35:11

by Keith Owens

Subject: Re: Nice values for kernel modules

On Sat, 16 Mar 2002 11:27:03 +0000 (GMT),
Tigran Aivazian <[email protected]> wrote:
>On Sat, 16 Mar 2002, Paul Mackerras wrote:
>
>> Keith Owens writes:
>>
>> > On Sat, 16 Mar 2002 09:51:16 +0000 (GMT),
>> > <[email protected]> wrote:
>> > >jump to sys_nice() indirectly via exported sys_call_table[].
>> >
>> > Breaks on ia64 and ppc.
>>
>> Not that I want to encourage this sort of thing, but why would it
>> break on ppc?

Should have been ppc64, not ppc.

>and also why would it break on ia64? I can understand __mips but why ia64?

Address of function text is NOT the same as &function. On many
architectures &function is the same as the first byte of the function
text but not on all architectures. On IA64 &func points to a function
descriptor which contains { void * __gp; void * function_text; }. When
you call a function on ia64, the code is really :-

save current __gp
load address of function_text
load __gp (global data pointer) for new function
call function_text
restore original __gp

Within the kernel, all direct function calls (call function by name)
are assumed to have the same __gp so the code reduces to :-

load address of function_text
call function_text

just like most other architectures. This is why syscalls within the
kernel work.

When the kernel calls a function via a pointer then __gp must be saved,
set and restored. This is especially true when calling from the kernel
to a module or vice versa; I guarantee that the kernel and modules have
different __gp values.
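
[Editor's sketch] The save/set/restore sequence above can be simulated in portable user-space C. This is only an illustration, not ia64 code: current_gp is a plain variable standing in for the gp register, struct func_desc follows the two-field layout described in this message, and KERNEL_GP, MODULE_GP, module_func() and both call helpers are hypothetical names.

```c
#include <assert.h>

/* Plain variable standing in for the global data pointer register. */
static long current_gp;

/* Hypothetical gp values for the kernel image and for one module. */
#define KERNEL_GP 0x1000L
#define MODULE_GP 0x2000L

/* Descriptor layout as described above: the gp, then the code address. */
struct func_desc {
    long gp;
    long (*text)(long);
};

/* A "module" function that only behaves if gp holds the module's value. */
static long module_func(long x)
{
    return current_gp == MODULE_GP ? x + 1 : -1;
}

/* Correct indirect call: save gp, load the callee's gp, call, restore. */
static long call_via_descriptor(const struct func_desc *fd, long arg)
{
    long saved_gp = current_gp;
    long ret;

    assert(fd != 0 && fd->text != 0);
    current_gp = fd->gp;
    ret = fd->text(arg);
    current_gp = saved_gp;
    return ret;
}

/* Broken indirect call: jumps to the code but leaves the caller's gp. */
static long call_text_blindly(const struct func_desc *fd, long arg)
{
    return fd->text(arg);
}
```

With current_gp set to KERNEL_GP and a descriptor { MODULE_GP, module_func }, call_via_descriptor() returns the expected result and leaves gp as it found it, while call_text_blindly() runs the callee with the wrong gp. In the simulation that merely yields -1; on real hardware the callee would address its data through the wrong base.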

Fetching &sys_nice from the syscall table and blindly calling that
address from a module will crash the kernel. PPC64 has a similar
problem: it has a function descriptor that contains 3 fields.

There is no architecture independent method for accessing syscall
entries _as functions_ from a module. Code that works on ix86 will
break on ia64 and ppc64. I can see no good reason why the syscall
table has been exported.

2002-03-16 13:01:05

by Tigran Aivazian

Subject: Re: Nice values for kernel modules

On Sat, 16 Mar 2002, Keith Owens wrote:
> I can see no good reason why the syscall table has been exported.

There are lots of good reasons why it has been exported, e.g. ability to
replace some system calls while leaving overall Linux personality (i.e.
without switching to an ABI emulation).

Ability to bypass the stupid commercial time-locked licences (at some time
wordperfect demo was locked like that and my timetravel module turned a
demo into full product -- users were happy, at least according to emails I
received :)

Also, ability to call those system calls from a module which are not
exported individually. Actually, the list of useful possibilities is
endless. Wasn't it wine or dosemu (or some other similar software) which
was based on being able to access sys_call_table? I can't remember the
name of that software but I am sure that a lot of things will break if
sys_call_table is unexported.

Anyway, yes, I agree that this feature is mainly useful on i386
architecture. I should have explicitly stated this when I recommended it.

(actually, I didn't _recommend_ it but only listed it as a possibility.
From the possibilities that were originally listed I would recommend the
macro)

Regards,
Tigran

2002-03-16 13:53:25

by Christoph Hellwig

Subject: Re: Nice values for kernel modules

On Sat, Mar 16, 2002 at 01:04:16PM +0000, Tigran Aivazian wrote:
> There are lots of good reasons why it has been exported, e.g. ability to
> replace some system calls while leaving overall Linux personality (i.e.
> without switching to an ABI emulation).

No, that never was a good reason and has been removed by Arjan and me
in current Linux-ABI versions.

I'm all for removing it, too.

> Also, ability to call those system calls from a module which are not
> exported individually.

If syscalls are supposed to be used by modules they should be exported
and have proper prototypes.

2002-03-16 15:49:27

by John Levon

Subject: Re: Nice values for kernel modules

On Sat, Mar 16, 2002 at 11:34:35PM +1100, Keith Owens wrote:

> I can see no good reason why the syscall table has been exported.

please don't change this. Just because it breaks on architectures X
and Y doesn't mean it's useless.

System call snooping is an ugly thing but being able to do it without
patching the kernel is incredibly useful. We're not unaware that
it is arch-specific.

regards
john

--
I am a complete moron for forgetting about endianness. May I be
forever marked as such.

2002-03-16 16:03:33

by Andi Kleen

Subject: Re: Nice values for kernel modules

John Levon <[email protected]> writes:

> On Sat, Mar 16, 2002 at 11:34:35PM +1100, Keith Owens wrote:
>
> > I can see no good reason why the syscall table has been exported.
>
> please don't change this. Just because it breaks on architectures X
> and Y doesn't mean it's useless.
>
> System call snooping is an ugly thing but being able to do it without
> patching the kernel is incredibly useful. We're not unaware that
> it is arch-specific.

I can just second that. It would make it impossible to fix pice for 2.5 for
example.

-Andi

2002-03-16 17:27:53

by Alan

Subject: Re: Nice values for kernel modules

> Ability to bypass the stupid commercial time-locked licences (at some time
> wordperfect demo was locked like that and my timetravel module turned a
> demo into full product -- users were happy, at least according to emails I
> received :)

Not any more. Under the DMCA your time travel module probably makes you
a fugitive from US justice 8)

In general though calling into the syscall table by hand is a bad move. If
the function you are calling is generically useful then it's much better to
work out whether the real function should be exported.


2002-03-16 22:04:04

by Andi Kleen

Subject: Re: Nice values for kernel modules

Alan Cox <[email protected]> writes:

> > Ability to bypass the stupid commercial time-locked licences (at some time
> > wordperfect demo was locked like that and my timetravel module turned a
> > demo into full product -- users were happy, at least according to emails I
> > received :)
>
> Not any more. Under the DMCA your time travel module probably makes you
> a fugitive from US justice 8)
>
> In general though calling into the syscall table by hand is a bad move. If
> the function you are calling is generically useful then it's much better to
> work out whether the real function should be exported.

Some programs depend on tapping the system call table. For example private
ice and oprofile do this for execve and other calls to know when a new
process is started. It would be possible to add function pointers to all these
functions, but just tapping the system call table actually looks cleaner
to me.

[yes, the approach has module unload races, but these modules tend to just
make themselves not unloadable]
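
[Editor's sketch] The "tapping" described here amounts to swapping a function pointer in a table and chaining to the saved original. The user-space C simulation below shows only that pattern; demo_call_table, NR_DEMO_EXEC, demo_exec() and the snoop_* helpers are hypothetical names, and nothing here touches the real sys_call_table or the real execve.

```c
#include <assert.h>

typedef long (*syscall_fn)(long);

/* Hypothetical one-entry "syscall table" with a single demo call. */
enum { NR_DEMO_EXEC = 0, NR_DEMO_MAX = 1 };

static long demo_exec(long arg)
{
    return arg * 2;
}

static syscall_fn demo_call_table[NR_DEMO_MAX] = { demo_exec };

/* State kept by the snooping code. */
static syscall_fn saved_exec;
static unsigned long exec_count;

/* Wrapper: note the event, then chain to the original entry. */
static long snoop_exec(long arg)
{
    assert(saved_exec != 0);
    exec_count++;
    return saved_exec(arg);
}

/* Install the hook by replacing the table entry. */
static void snoop_install(void)
{
    saved_exec = demo_call_table[NR_DEMO_EXEC];
    demo_call_table[NR_DEMO_EXEC] = snoop_exec;
}

/* Remove the hook, refusing if someone else has hooked on top of us. */
static int snoop_remove(void)
{
    if (demo_call_table[NR_DEMO_EXEC] != snoop_exec)
        return -1;
    demo_call_table[NR_DEMO_EXEC] = saved_exec;
    return 0;
}
```

The simulation is single-threaded, so it does not show the unload race mentioned above: in a real kernel another CPU could still be executing inside the wrapper when the entry is restored and the module's code is freed.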

-Andi

2002-03-25 07:53:48

by Eric W. Biederman

Subject: Re: Nice values for kernel modules

Andi Kleen <[email protected]> writes:

> Alan Cox <[email protected]> writes:
>
> > > Ability to bypass the stupid commercial time-locked licences (at some time
> > > wordperfect demo was locked like that and my timetravel module turned a
> > > demo into full product -- users were happy, at least according to emails I
> > > received :)
> >
> > Not any more. Under the DMCA your time travel module probably makes you
> > a fugitive from US justice 8)
> >
> > In general though calling into the syscall table by hand is a bad move. If
> > the function you are calling is generically useful then it's much better to
> > work out whether the real function should be exported.
>
> Some programs depend on tapping the system call table. For example private
> ice and oprofile do this for execve and other calls to know when a new
> process is started. It would be possible to add function pointers to all these
> functions, but just tapping the system call table actually looks cleaner
> to me.
>
> [yes, the approach has module unload races, but these modules tend to just
> make themselves not unloadable]

What is wrong with using ptrace? That should already give you a hook into
every syscall.

Eric

2002-03-25 08:21:03

by John Levon

Subject: Re: Nice values for kernel modules

On Mon, Mar 25, 2002 at 12:47:27AM -0700, Eric W. Biederman wrote:

> What is wrong with using ptrace? That should already give you a hook into
> every syscall.

it's too slow, and how do you manage to follow every process?

regards
john

--
"Way back at the beginning of time around 1970 the first man page was
handed down from on high. Every one since is an edited copy."
- John Hasler <[email protected]>