2019-08-15 21:06:42

by Daniel Vetter

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

On Thu, Aug 15, 2019 at 10:27 PM Jason Gunthorpe <[email protected]> wrote:
> On Thu, Aug 15, 2019 at 10:16:43PM +0200, Daniel Vetter wrote:
> > So if someone can explain to me how that works with lockdep I can of
> > course implement it. But afaics that doesn't exist (I tried to explain
> > that somewhere else already), and I'm no really looking forward to
> > hacking also on lockdep for this little series.
>
> Hmm, kind of looks like it is done by calling preempt_disable()

Yup. That was v1, then came the suggestion that disabling preemption
is maybe not the best thing (the oom reaper could still run for a long
time comparatively, if it's cleaning out gigabytes of process memory
or what not, hence this dedicated debug infrastructure).

> Probably the debug option is CONFIG_DEBUG_PREEMPT, not lockdep?

CONFIG_DEBUG_ATOMIC_SLEEP. Like in my patch.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


2019-08-16 01:01:57

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

On Thu, Aug 15, 2019 at 10:49:31PM +0200, Daniel Vetter wrote:
> On Thu, Aug 15, 2019 at 10:27 PM Jason Gunthorpe <[email protected]> wrote:
> > On Thu, Aug 15, 2019 at 10:16:43PM +0200, Daniel Vetter wrote:
> > > So if someone can explain to me how that works with lockdep I can of
> > > course implement it. But afaics that doesn't exist (I tried to explain
> > > that somewhere else already), and I'm no really looking forward to
> > > hacking also on lockdep for this little series.
> >
> > Hmm, kind of looks like it is done by calling preempt_disable()
>
> Yup. That was v1, then came the suggestion that disabling preemption
> is maybe not the best thing (the oom reaper could still run for a long
> time comparatively, if it's cleaning out gigabytes of process memory
> or what not, hence this dedicated debug infrastructure).

Oh, I'm coming in late, sorry

Anyhow, I was thinking since we agreed this can trigger on some
CONFIG_DEBUG flag, something like

/* This is a sleepable region, but use preempt_disable to get debugging
* for calls that are not allowed to block for OOM [.. insert
* Michal's explanation.. ] */
if (IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP) && !mmu_notifier_range_blockable(range))
preempt_disable();
ops->invalidate_range_start();

And I have also been idly mulling doing something like

if (IS_ENABLED(CONFIG_DEBUG_NOTIFIERS) &&
rand &&
mmu_notifier_range_blockable(range)) {
range->flags = 0
if (!ops->invalidate_range_start(range))
continue

// Failed, try again as blockable
range->flags = MMU_NOTIFIER_RANGE_BLOCKABLE
}
ops->invalidate_range_start(range);

Which would give coverage for this corner case without forcing OOM.

Jason

2019-08-16 06:22:03

by Daniel Vetter

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

On Fri, Aug 16, 2019 at 3:00 AM Jason Gunthorpe <[email protected]> wrote:
> On Thu, Aug 15, 2019 at 10:49:31PM +0200, Daniel Vetter wrote:
> > On Thu, Aug 15, 2019 at 10:27 PM Jason Gunthorpe <[email protected]> wrote:
> > > On Thu, Aug 15, 2019 at 10:16:43PM +0200, Daniel Vetter wrote:
> > > > So if someone can explain to me how that works with lockdep I can of
> > > > course implement it. But afaics that doesn't exist (I tried to explain
> > > > that somewhere else already), and I'm no really looking forward to
> > > > hacking also on lockdep for this little series.
> > >
> > > Hmm, kind of looks like it is done by calling preempt_disable()
> >
> > Yup. That was v1, then came the suggestion that disabling preemption
> > is maybe not the best thing (the oom reaper could still run for a long
> > time comparatively, if it's cleaning out gigabytes of process memory
> > or what not, hence this dedicated debug infrastructure).
>
> Oh, I'm coming in late, sorry
>
> Anyhow, I was thinking since we agreed this can trigger on some
> CONFIG_DEBUG flag, something like
>
> /* This is a sleepable region, but use preempt_disable to get debugging
> * for calls that are not allowed to block for OOM [.. insert
> * Michal's explanation.. ] */
> if (IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP) && !mmu_notifier_range_blockable(range))
> preempt_disable();
> ops->invalidate_range_start();

I think we also discussed that, and some expressed concerns it would
change behaviour/timing too much for testing. Since this does does
disable preemption for real, not just for might_sleep.

> And I have also been idly mulling doing something like
>
> if (IS_ENABLED(CONFIG_DEBUG_NOTIFIERS) &&
> rand &&
> mmu_notifier_range_blockable(range)) {
> range->flags = 0
> if (!ops->invalidate_range_start(range))
> continue
>
> // Failed, try again as blockable
> range->flags = MMU_NOTIFIER_RANGE_BLOCKABLE
> }
> ops->invalidate_range_start(range);
>
> Which would give coverage for this corner case without forcing OOM.

Hm, this sounds like a neat idea to slap on top. The rand is going to
be a bit tricky though, but I guess for this we could stuff another
counter into task_struct and just e.g. do this every 1000th or so
invalidate (well need to pick a prime so we cycle through notifiers in
case there's multiple). I like.

Michal, thoughts?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2019-08-16 12:13:40

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

On Fri, Aug 16, 2019 at 08:20:55AM +0200, Daniel Vetter wrote:
> On Fri, Aug 16, 2019 at 3:00 AM Jason Gunthorpe <[email protected]> wrote:
> > On Thu, Aug 15, 2019 at 10:49:31PM +0200, Daniel Vetter wrote:
> > > On Thu, Aug 15, 2019 at 10:27 PM Jason Gunthorpe <[email protected]> wrote:
> > > > On Thu, Aug 15, 2019 at 10:16:43PM +0200, Daniel Vetter wrote:
> > > > > So if someone can explain to me how that works with lockdep I can of
> > > > > course implement it. But afaics that doesn't exist (I tried to explain
> > > > > that somewhere else already), and I'm no really looking forward to
> > > > > hacking also on lockdep for this little series.
> > > >
> > > > Hmm, kind of looks like it is done by calling preempt_disable()
> > >
> > > Yup. That was v1, then came the suggestion that disabling preemption
> > > is maybe not the best thing (the oom reaper could still run for a long
> > > time comparatively, if it's cleaning out gigabytes of process memory
> > > or what not, hence this dedicated debug infrastructure).
> >
> > Oh, I'm coming in late, sorry
> >
> > Anyhow, I was thinking since we agreed this can trigger on some
> > CONFIG_DEBUG flag, something like
> >
> > /* This is a sleepable region, but use preempt_disable to get debugging
> > * for calls that are not allowed to block for OOM [.. insert
> > * Michal's explanation.. ] */
> > if (IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP) && !mmu_notifier_range_blockable(range))
> > preempt_disable();
> > ops->invalidate_range_start();
>
> I think we also discussed that, and some expressed concerns it would
> change behaviour/timing too much for testing. Since this does does
> disable preemption for real, not just for might_sleep.

I don't follow, this is a debug kernel, it will have widly different
timing.

Further the point of this debugging on atomic_sleep is to be as
timing-independent as possible since functions with rare sleeps should
be guarded by might_sleep() in their common paths.

I guess I don't get the push to have some low overhead debugging for
this? Is there something special you are looking for?

Jason

2019-08-16 14:12:56

by Daniel Vetter

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

On Fri, Aug 16, 2019 at 2:12 PM Jason Gunthorpe <[email protected]> wrote:
>
> On Fri, Aug 16, 2019 at 08:20:55AM +0200, Daniel Vetter wrote:
> > On Fri, Aug 16, 2019 at 3:00 AM Jason Gunthorpe <[email protected]> wrote:
> > > On Thu, Aug 15, 2019 at 10:49:31PM +0200, Daniel Vetter wrote:
> > > > On Thu, Aug 15, 2019 at 10:27 PM Jason Gunthorpe <[email protected]> wrote:
> > > > > On Thu, Aug 15, 2019 at 10:16:43PM +0200, Daniel Vetter wrote:
> > > > > > So if someone can explain to me how that works with lockdep I can of
> > > > > > course implement it. But afaics that doesn't exist (I tried to explain
> > > > > > that somewhere else already), and I'm no really looking forward to
> > > > > > hacking also on lockdep for this little series.
> > > > >
> > > > > Hmm, kind of looks like it is done by calling preempt_disable()
> > > >
> > > > Yup. That was v1, then came the suggestion that disabling preemption
> > > > is maybe not the best thing (the oom reaper could still run for a long
> > > > time comparatively, if it's cleaning out gigabytes of process memory
> > > > or what not, hence this dedicated debug infrastructure).
> > >
> > > Oh, I'm coming in late, sorry
> > >
> > > Anyhow, I was thinking since we agreed this can trigger on some
> > > CONFIG_DEBUG flag, something like
> > >
> > > /* This is a sleepable region, but use preempt_disable to get debugging
> > > * for calls that are not allowed to block for OOM [.. insert
> > > * Michal's explanation.. ] */
> > > if (IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP) && !mmu_notifier_range_blockable(range))
> > > preempt_disable();
> > > ops->invalidate_range_start();
> >
> > I think we also discussed that, and some expressed concerns it would
> > change behaviour/timing too much for testing. Since this does does
> > disable preemption for real, not just for might_sleep.
>
> I don't follow, this is a debug kernel, it will have widly different
> timing.
>
> Further the point of this debugging on atomic_sleep is to be as
> timing-independent as possible since functions with rare sleeps should
> be guarded by might_sleep() in their common paths.
>
> I guess I don't get the push to have some low overhead debugging for
> this? Is there something special you are looking for?

Don't ask me, I'm just trying to get _some_ debugging for this in. I
don't care one bit how much overhead it has because in our CI our
debug build has lockdep and everything and it crawls anyway. I started
out with the preempt_disable/enable thing like you suggested, and a
few rounds later we're here. We can go back to v1 on this one, but I'd
prefer to not do the lap too often.

Also, aside from this patch (which is prep for the next) and some
simple reordering conflicts they're all independent. So if there's no
way to paint this bikeshed here (technicolor perhaps?) then I'd like
to get at least the others considered.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2019-08-16 14:39:16

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

On Fri, Aug 16, 2019 at 04:11:34PM +0200, Daniel Vetter wrote:
> Also, aside from this patch (which is prep for the next) and some
> simple reordering conflicts they're all independent. So if there's no
> way to paint this bikeshed here (technicolor perhaps?) then I'd like
> to get at least the others considered.

Sure, I think for conflict avoidance reasons I'm probably taking
mmu_notifier stuff via hmm.git, so:

- Andrew had a minor remark on #1, I am ambivalent and would take it
as-is. Your decision if you want to respin.
- #2/#3 is this issue, I would stand by the preempt_disable/etc path
Our situation matches yours, debug tests run lockdep/etc.
- #4 I like a lot, except the map should enclose range_end too,
this can be done after the mm_has_notifiers inside the
__mmu_notifier function
Can you respin?
I will propose preloading the map in another patch
- #5 is already applied in -rc

Jason

2019-08-16 16:38:09

by Daniel Vetter

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

On Fri, Aug 16, 2019 at 4:38 PM Jason Gunthorpe <[email protected]> wrote:
>
> On Fri, Aug 16, 2019 at 04:11:34PM +0200, Daniel Vetter wrote:
> > Also, aside from this patch (which is prep for the next) and some
> > simple reordering conflicts they're all independent. So if there's no
> > way to paint this bikeshed here (technicolor perhaps?) then I'd like
> > to get at least the others considered.
>
> Sure, I think for conflict avoidance reasons I'm probably taking
> mmu_notifier stuff via hmm.git, so:
>
> - Andrew had a minor remark on #1, I am ambivalent and would take it
> as-is. Your decision if you want to respin.

I like mine better, see also the reply from Ralph Campbell.

> - #2/#3 is this issue, I would stand by the preempt_disable/etc path
> Our situation matches yours, debug tests run lockdep/etc.

Since Michal requested the current flavour I think we need spin a bit
more on these here. I guess I'll just rebase them to the end so
they're not holding up the others.

> - #4 I like a lot, except the map should enclose range_end too,
> this can be done after the mm_has_notifiers inside the
> __mmu_notifier function

To make sure I get this right: The same lockdep context, but also
wrapped around invalidate_range_end? From my understanding of pte
zapping that makes sense, but I'm definitely not well-versed enough
for that.

> Can you respin?

Will do.

> I will propose preloading the map in another patch
> - #5 is already applied in -rc

Yup, I'll drop that one.

Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2019-08-16 16:57:18

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

On Fri, Aug 16, 2019 at 06:36:52PM +0200, Daniel Vetter wrote:
> On Fri, Aug 16, 2019 at 4:38 PM Jason Gunthorpe <[email protected]> wrote:
> >
> > On Fri, Aug 16, 2019 at 04:11:34PM +0200, Daniel Vetter wrote:
> > > Also, aside from this patch (which is prep for the next) and some
> > > simple reordering conflicts they're all independent. So if there's no
> > > way to paint this bikeshed here (technicolor perhaps?) then I'd like
> > > to get at least the others considered.
> >
> > Sure, I think for conflict avoidance reasons I'm probably taking
> > mmu_notifier stuff via hmm.git, so:
> >
> > - Andrew had a minor remark on #1, I am ambivalent and would take it
> > as-is. Your decision if you want to respin.
>
> I like mine better, see also the reply from Ralph Campbell.

Sure

> > - #2/#3 is this issue, I would stand by the preempt_disable/etc path
> > Our situation matches yours, debug tests run lockdep/etc.
>
> Since Michal requested the current flavour I think we need spin a bit
> more on these here. I guess I'll just rebase them to the end so
> they're not holding up the others.
>
> > - #4 I like a lot, except the map should enclose range_end too,
> > this can be done after the mm_has_notifiers inside the
> > __mmu_notifier function
>
> To make sure I get this right: The same lockdep context, but also
> wrapped around invalidate_range_end?

Yes, the locking context of _range_start and _range_end should be
identical, last time I checked callers this was the case.

So, just add it to __mmu_notifier_invalidate_range_end() outside the
SRCU as there is no reason to burden debug kernel callers twice when
mmu notifiers are not enabled

Jason