I'm running an ext4 root filesystem and regularly get SELinux denials
like:
Oct 16 08:32:55 localhost kernel: type=1400 audit(1224160369.076:5):
avc: denied { sys_resource } for pid=1624 comm="dbus-daemon"
capability=24 scontext=system_u:system_r:system_dbusd_t:s0
tcontext=system_u:system_r:system_dbusd_t:s0 tclass=capability
https://bugzilla.redhat.com/show_bug.cgi?id=467216
Since this doesn't happen with people who have ext3 filesystems but
everything else the same it led me to look at ext4. I see that
ext?_has_free_blocks() has changed since ext3 and now we always check
for capable(CAP_SYS_RESOURCE). If a process actually has the capability
in pE (as many root processes would) but doesn't have the capability in
SELinux policy we will get a denial.
I can think of a couple ways to fix this:
the first (and one I like) is to change ext4 to stop checking
CAP_SYS_RESOURCE all the time. It's not really 'pretty' but I think you
would actually get a better performing function. Just always calculate
root_blocks and if we don't have enough room then do the whole
check to see if we are root and recalculate without root_blocks. I'd guess
that a great majority of the time operations will succeed even with a
non-zero root_blocks and I would guess that most processes aren't going to
be root processes and so we would be calculating root_blocks anyway.
This would (like ext3) only cause these denials when it was filled up.
We've been living with that forever, so I don't see a problem there...
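A minimal sketch of the deferred check described above, written as userspace C so it can be compiled on its own. The names capable_sys_resource(), task_has_sys_resource and cap_checks are made up for illustration and are not the ext4 code; the real function also weighs other conditions that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for capable(CAP_SYS_RESOURCE).  The counter
 * records how many times the (audited) check was actually made. */
static bool task_has_sys_resource;
static unsigned int cap_checks;

static bool capable_sys_resource(void)
{
    cap_checks++;
    return task_has_sys_resource;
}

/* Return true if nblocks can be allocated.  The capability is consulted
 * only when the request does not fit outside the reserved root blocks. */
static bool has_free_blocks(int64_t free_blocks, int64_t root_blocks,
                            int64_t nblocks)
{
    assert(free_blocks >= 0 && root_blocks >= 0 && nblocks >= 0);

    /* Common case: enough room even with the reservation held back. */
    if (free_blocks - root_blocks >= nblocks)
        return true;

    /* Slow path: only now ask whether the task may use the reserve. */
    if (capable_sys_resource())
        return free_blocks >= nblocks;

    return false;
}
```

With this shape, a call such as has_free_blocks(1000, 50, 10) never reaches the capability check, so nothing could be audited until the filesystem is close to full.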
The second way would be a new lsm hook. Instead of calling capable(),
ext4 could call something like a new capable_noaudit() which would
return the same result but would tell the lsm that this isn't a security
decision and shouldn't be audited. The LSM doesn't currently have any
kind of syntax or representation like this exposed to the main kernel,
so I'm a little skeptical how the LSM community at large would respond
to exposing such a thing...
Another would be a new specific LSM call to just check cap_sys_resource
which also doesn't get audited.
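To make the non-auditing idea concrete, here is a rough userspace model, not kernel code. struct task, security_capable() with an audit flag, and audit_records are assumptions invented for this sketch; only the name capable_noaudit() comes from the proposal above.

```c
#include <assert.h>
#include <stdbool.h>

#define CAP_SYS_RESOURCE 24   /* matches capability=24 in the AVC message */

/* Hypothetical model of a task: capabilities in its effective set and
 * capabilities its security policy allows, both as bitmasks. */
struct task {
    unsigned long effective;
    unsigned long policy_allowed;
};

static unsigned int audit_records;   /* policy denials that were logged */

/* Hypothetical LSM-style check; 'audit' says whether a denial is logged. */
static bool security_capable(const struct task *t, int cap, bool audit)
{
    unsigned long bit;

    assert(cap >= 0 && cap < 32);
    bit = 1UL << cap;

    if (!(t->effective & bit))
        return false;          /* plain capability check already fails */
    if (t->policy_allowed & bit)
        return true;
    if (audit)
        audit_records++;       /* the policy denial is logged */
    return false;
}

/* Audited form: a policy denial leaves a record. */
static bool capable(const struct task *t, int cap)
{
    return security_capable(t, cap, true);
}

/* Same answer, but the caller says this is not a security decision. */
static bool capable_noaudit(const struct task *t, int cap)
{
    return security_capable(t, cap, false);
}
```

For a task like dbus-daemon here (capability in the effective set, not allowed by policy), both calls return false, but only capable() bumps audit_records.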
Do others have thoughts?
-Eric
On Fri, 2008-10-24 at 11:05 -0400, Eric Paris wrote:
> I'm running an ext4 root filesystem and regularly get SELinux denials
> like:
>
> Oct 16 08:32:55 localhost kernel: type=1400 audit(1224160369.076:5):
> avc: denied { sys_resource } for pid=1624 comm="dbus-daemon"
> capability=24 scontext=system_u:system_r:system_dbusd_t:s0
> tcontext=system_u:system_r:system_dbusd_t:s0 tclass=capability
>
> https://bugzilla.redhat.com/show_bug.cgi?id=467216
>
> Since this doesn't happen with people who have ext3 filesystems but
> everything else the same it led me to look at ext4. I see that
> ext?_has_free_blocks() has changed since ext3 and now we always check
> for capable(CAP_SYS_RESOURCE). If a process actually has the capability
> in pE (as many root processes would) but doesn't have the capability in
> SELinux policy we will get a denial.
>
> I can think of a couple ways to fix this:
>
> the first (and one I like) is to change ext4 to stop checking
> CAP_SYS_RESOURCE all the time. It's not really 'pretty' but I think you
> would actually get a better performing function. Just always calculate
> root_blocks and if we don't have enough room then do the whole
> check to see if we are root and recalculate without root_blocks. I'd guess
> that a great majority of the time operations will succeed even with a
> non-zero root_blocks and I would guess that most processes aren't going to
> be root processes and so we would be calculating root_blocks anyway.
> This would (like ext3) only cause these denials when it was filled up.
> We've been living with that forever, so I don't see a problem there...
>
> The second way would be a new lsm hook. Instead of calling capable(),
> ext4 could call something like a new capable_noaudit() which would
> return the same result but would tell the lsm that this isn't a security
> decision and shouldn't be audited. The LSM doesn't currently have any
> kind of syntax or representation like this exposed to the main kernel,
> so I'm a little skeptical how the LSM community at large would respond
> to exposing such a thing...
>
> Another would be a new specific LSM call to just check cap_sys_resource
> which also doesn't get audited.
>
> Do others have thoughts?
Seems similar to the vm_enough_memory() case, where we likewise
introduced a separate security hook that internally checks without
auditing.
The OOM killer likewise ought to be using a non-auditing form of
capability checks.
--
Stephen Smalley
National Security Agency
Eric Paris wrote:
> I'm running an ext4 root filesystem and regularly get SELinux denials
> like:
>
> Oct 16 08:32:55 localhost kernel: type=1400 audit(1224160369.076:5):
> avc: denied { sys_resource } for pid=1624 comm="dbus-daemon"
> capability=24 scontext=system_u:system_r:system_dbusd_t:s0
> tcontext=system_u:system_r:system_dbusd_t:s0 tclass=capability
>
> https://bugzilla.redhat.com/show_bug.cgi?id=467216
>
> Since this doesn't happen with people who have ext3 filesystems but
> everything else the same it led me to look at ext4. I see that
> ext?_has_free_blocks() has changed since ext3 and now we always check
> for capable(CAP_SYS_RESOURCE). If a process actually has the capability
> in pE (as many root processes would) but doesn't have the capability in
> SELinux policy we will get a denial.
>
> I can think of a couple ways to fix this:
>
> the first (and one I like) is to change ext4 to stop checking
> CAP_SYS_RESOURCE all the time. It's not really 'pretty' but I think you
> would actually get a better performing function. Just always calculate
> root_blocks and if we don't have enough room then do the whole
> check to see if we are root and recalculate without root_blocks. I'd guess
> that a great majority of the time operations will succeed even with a
> non-zero root_blocks and I would guess that most processes aren't going to
> be root processes and so we would be calculating root_blocks anyway.
> This would (like ext3) only cause these denials when it was filled up.
> We've been living with that forever, so I don't see a problem there...
Thanks Eric, I'll look into this. It seems that ext4_has_free_blocks is
now overly complex; it used to return how many blocks are available, if
that number is < nblocks, but the single caller now only checks
success/failure for having nblocks free. I'll see if I can simplify it
and delay the cap check as you suggest.
-Eric
On Fri, 2008-10-24 at 11:08 -0400, Stephen Smalley wrote:
> On Fri, 2008-10-24 at 11:05 -0400, Eric Paris wrote:
> > Do others have thoughts?
>
> Seems similar to the vm_enough_memory() case, where we likewise
> introduced a separate security hook that internally checks without
> auditing.
>
> The OOM killer likewise ought to be using a non-auditing form of
> capability checks.
So would you suggest a generic non-auditing capability checking
mechanism or a specific hook for "things to use"?
* capable_noaudit(current, cap)
* security_capable_noaudit(current, cap)
* security_cap_sys_resource(current)
Looks like oom also checks CAP_SYS_ADMIN so maybe a generic cap
interface would be best.
esandeen: I still think it would be a good idea to simplify
ext4_claim_free_blocks() and ext4_has_free_blocks() which seem to have
a lot of code duplication and both have the unconditional capable
calls...
-Eric
On Fri, 2008-10-24 at 13:28 -0400, Eric Paris wrote:
> On Fri, 2008-10-24 at 11:08 -0400, Stephen Smalley wrote:
> > On Fri, 2008-10-24 at 11:05 -0400, Eric Paris wrote:
>
> > > Do others have thoughts?
> >
> > Seems similar to the vm_enough_memory() case, where we likewise
> > introduced a separate security hook that internally checks without
> > auditing.
> >
> > The OOM killer likewise ought to be using a non-auditing form of
> > capability checks.
>
> So would you suggest a generic non-auditing capability checking
> mechanism or a specific hook for "things to use"?
>
> * capable_noaudit(current, cap)
> * security_capable_noaudit(current, cap)
> * security_cap_sys_resource(current)
>
> Looks like oom also checks CAP_SYS_ADMIN so maybe a generic cap
> interface would be best.
In the vm_enough_memory() case, I think it was Alan Cox's idea to take
the entire policy logic into a distinct security hook so that you could
ultimately support policy-based resource constraints, versus only
introducing a non-auditing variant of the capable check. If we wanted
to be consistent, we'd likewise introduce distinct hooks for these cases
and take more of the logic into them, not just the capability check.
I'm open to either approach though.
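As a rough illustration of the "distinct hook" alternative, the sketch below moves the whole decision behind one call, in the spirit of the vm_enough_memory() hook. Every name in it (enough_blocks_hook, security_fs_enough_blocks(), the privileged flag, the 1024-block cap) is hypothetical and exists only for this userspace example; privileged stands in for a capability lookup done inside the hook without auditing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: the security module makes the whole
 * "may this task take nblocks?" decision, not just a capability test. */
typedef bool (*enough_blocks_hook)(int64_t free_blocks, int64_t root_blocks,
                                   int64_t nblocks, bool privileged);

/* Default policy: unprivileged tasks must leave the reserve untouched. */
static bool default_enough_blocks(int64_t free_blocks, int64_t root_blocks,
                                  int64_t nblocks, bool privileged)
{
    int64_t usable = privileged ? free_blocks : free_blocks - root_blocks;

    return usable >= nblocks;
}

/* Example of a policy-based constraint: refuse any single request above
 * an arbitrary limit, otherwise fall back to the default rule. */
static bool capped_enough_blocks(int64_t free_blocks, int64_t root_blocks,
                                 int64_t nblocks, bool privileged)
{
    if (nblocks > 1024)
        return false;
    return default_enough_blocks(free_blocks, root_blocks, nblocks,
                                 privileged);
}

static enough_blocks_hook active_hook = default_enough_blocks;

/* What the filesystem would call; it no longer calls capable() itself. */
static bool security_fs_enough_blocks(int64_t free_blocks,
                                      int64_t root_blocks,
                                      int64_t nblocks, bool privileged)
{
    assert(active_hook != NULL);
    return active_hook(free_blocks, root_blocks, nblocks, privileged);
}
```

The trade-off is that the filesystem hands its accounting inputs to the hook, which is more invasive than a non-auditing capable() variant but leaves room for resource policy later.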
> esandeen: I still think it would be a good idea to simplify
> ext4_claim_free_blocks() and ext4_has_free_blocks() which seem to have
> a lot of code duplication and both have the unconditional capable
> calls...
--
Stephen Smalley
National Security Agency
On Fri, 2008-10-24 at 11:56 -0500, Eric Sandeen wrote:
> Eric Paris wrote:
> > I'm running an ext4 root filesystem and regularly get SELinux denials
> > like:
> >
> > Oct 16 08:32:55 localhost kernel: type=1400 audit(1224160369.076:5):
> > avc: denied { sys_resource } for pid=1624 comm="dbus-daemon"
> > capability=24 scontext=system_u:system_r:system_dbusd_t:s0
> > tcontext=system_u:system_r:system_dbusd_t:s0 tclass=capability
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=467216
> >
> > Since this doesn't happen with people who have ext3 filesystems but
> > everything else the same it led me to look at ext4. I see that
> > ext?_has_free_blocks() has changed since ext3 and now we always check
> > for capable(CAP_SYS_RESOURCE). If a process actually has the capability
> > in pE (as many root processes would) but doesn't have the capability in
> > SELinux policy we will get a denial.
> >
> > I can think of a couple ways to fix this:
> >
> > the first (and one I like) is to change ext4 to stop checking
> > CAP_SYS_RESOURCE all the time. It's not really 'pretty' but I think you
> > would actually get a better performing function. Just always calculate
> > root_blocks and if we don't have enough room then do the whole
> > check to see if we are root and recalculate without root_blocks. I'd guess
> > that a great majority of the time operations will succeed even with a
> > non-zero root_blocks and I would guess that most processes aren't going to
> > be root processes and so we would be calculating root_blocks anyway.
> > This would (like ext3) only cause these denials when it was filled up.
> > We've been living with that forever, so I don't see a problem there...
>
> Thanks Eric, I'll look into this. It seems that ext4_has_free_blocks is
> now overly complex; it used to return how many blocks are available, if
> that number is < nblocks, but the single caller now only checks
> success/failure for having nblocks free. I'll see if I can simplify it
> and delay the cap check as you suggest.
>
Most functionality in ext4_has_free_blocks() is duplicated in
ext4_claim_free_blocks(). I guess ext4_has_free_blocks() could be
simplified a bit, or the two functions merged into one.
Delaying the cap check sounds like the right thing to me.
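One way the duplication could go away, sketched in userspace C under loud assumptions: struct fs_counters, claimed_blocks and the privileged flag are invented here, and the real functions work on the superblock's counters rather than a plain struct. The point is only that the claim path can reuse the yes/no check instead of repeating its arithmetic.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified counters for illustration only. */
struct fs_counters {
    int64_t free_blocks;
    int64_t claimed_blocks;   /* reserved by earlier claims */
    int64_t root_blocks;      /* held back for privileged tasks */
};

/* Single source of truth: can nblocks still be handed out? */
static bool has_free_blocks(const struct fs_counters *c, int64_t nblocks,
                            bool privileged)
{
    int64_t avail = c->free_blocks - c->claimed_blocks;

    if (!privileged)
        avail -= c->root_blocks;
    return avail >= nblocks;
}

/* Claiming is just the check above plus the bookkeeping. */
static int claim_free_blocks(struct fs_counters *c, int64_t nblocks,
                             bool privileged)
{
    assert(nblocks >= 0);

    if (!has_free_blocks(c, nblocks, privileged))
        return -ENOSPC;
    c->claimed_blocks += nblocks;
    return 0;
}
```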
Mingming
> -Eric
Mingming Cao wrote:
> Most functionality in ext4_has_free_blocks() is duplicated in
> ext4_claim_free_blocks(). I guess ext4_has_free_blocks() could be
> simplified a bit, or the two functions merged into one.
>
> Delaying the cap check sounds like the right thing to me.
Good point on the duplication, I'll merge them as part of this.
Thanks Mingming,
-Eric
Eric Paris wrote:
> I'm running an ext4 root filesystem and regularly get SELinux denials
> like:
>
> Oct 16 08:32:55 localhost kernel: type=1400 audit(1224160369.076:5):
> avc: denied { sys_resource } for pid=1624 comm="dbus-daemon"
> capability=24 scontext=system_u:system_r:system_dbusd_t:s0
> tcontext=system_u:system_r:system_dbusd_t:s0 tclass=capability
>
> https://bugzilla.redhat.com/show_bug.cgi?id=467216
For the record, I've put a couple patches into the ext4 patch queue that
should do Eric's first suggestion of deferring the capable() check until
it's really needed. Details are in the bug above.
Thanks,
-Eric