On Tue, Aug 29, 2023 at 1:12 PM Tejun Heo <[email protected]> wrote:
>
> Hello,
>
> On Tue, Aug 29, 2023 at 12:54:06PM -0700, Yosry Ahmed wrote:
> ...
> > > Maybe leave the global lock as-is and gate the userland flushers with a
> > > mutex so that there's only ever one contenting on the rstat lock from
> > > userland side?
> >
> > Waiman suggested this as well. We can do that for sure, although I
> > think we should wait until we are sure it's needed.
> >
> > One question. If whoever is holding that mutex is either flushing with
> > the spinlock held or spinning (i.e not sleepable or preemptable),
> > wouldn't this be equivalent to just changing the spinlock with a mutex
> > and disable preemption while holding it?
>
> Well, it creates layering so that userspace can't flood the inner lock which
> can cause contention issues for kernel side users. Not sleeping while
> actively flushing is an side-effect too but the code at least doesn't look
> as anti-patterny as disabling preemption right after grabbing a mutex.
I see. At most one kernel side flusher will be spinning for the lock
at any given point anyway, but I guess having that one kernel side
flusher competing against one user side flusher is better competing
with N flushers.
I will add a mutex on the userspace read side then and spin a v3.
Hopefully this addresses Michal's concern as well. The lock dropping
logic will still exist for the inner lock, but when one userspace
reader drops the inner lock other readers won't be able to pick it up.
>
> I don't have a strong preference. As long as we stay away from introducing a
> new user interface construct and can address the noticed scalability issues,
> it should be fine. Note that there are other ways to address priority
> inversions and contentions too - e.g. we can always bounce flushing to a
> [kthread_]kworker and rate limit (or rather latency limit) how often
> different classes of users can trigger flushing. I don't think we have to go
> there yet but if the simpler meaures don't work out, there are still many
> ways to solve the problem within the kernel.
I whole-heartedly agree with the preference to fix the problem within
the kernel with minimal/none user space involvement.
Thanks!
On Tue 29-08-23 13:20:34, Yosry Ahmed wrote:
[...]
> I will add a mutex on the userspace read side then and spin a v3.
> Hopefully this addresses Michal's concern as well. The lock dropping
> logic will still exist for the inner lock, but when one userspace
> reader drops the inner lock other readers won't be able to pick it up.
Yes, that would minimize the risk of the worst case pathological
behavior.
> > I don't have a strong preference. As long as we stay away from introducing a
> > new user interface construct and can address the noticed scalability issues,
> > it should be fine. Note that there are other ways to address priority
> > inversions and contentions too - e.g. we can always bounce flushing to a
> > [kthread_]kworker and rate limit (or rather latency limit) how often
> > different classes of users can trigger flushing. I don't think we have to go
> > there yet but if the simpler meaures don't work out, there are still many
> > ways to solve the problem within the kernel.
>
> I whole-heartedly agree with the preference to fix the problem within
> the kernel with minimal/none user space involvement.
Let's see. While I would love to see a solution that works for everybody
without explicit interface we have hit problems with locks involved in
stat files in the past.
--
Michal Hocko
SUSE Labs