On Sat, 31 Oct 2015, Tetsuo Handa wrote:
> Then, you need to update below description (or drop it) because
> patch 3/3 alone will not guarantee that the counters are up to date.
The vmstat system does not guarantee that the counters are up to date
always. The whole point is the deferral of updates for performance
reasons. They are updated *at some point* within stat_interval. That needs
to happen and that is what this patchset is fixing.
Christoph Lameter wrote:
> On Sat, 31 Oct 2015, Tetsuo Handa wrote:
>
> > Then, you need to update below description (or drop it) because
> > patch 3/3 alone will not guarantee that the counters are up to date.
>
> The vmstat system does not guarantee that the counters are up to date
> always. The whole point is the deferral of updates for performance
> reasons. They are updated *at some point* within stat_interval. That needs
> to happen and that is what this patchset is fixing.
>
I'm still unclear. I think that the result of this patchset is
The counters are never updated even after stat_interval
if some workqueue item is doing a __GFP_WAIT memory allocation.
but the patch description sounds as if
The counters will be updated even if some workqueue item is
doing a __GFP_WAIT memory allocation.
which contradicts the actual result I tested with this patchset applied.
On Tue, 3 Nov 2015, Tetsuo Handa wrote:
> I'm still unclear. I think that the result of this patchset is
>
> The counters are never updated even after stat_interval
> if some workqueue item is doing a __GFP_WAIT memory allocation.
>
> but the patch description sounds as if
>
> The counters will be updated even if some workqueue item is
> doing a __GFP_WAIT memory allocation.
>
> which contradicts the actual result I tested with this patchset applied.
Well, true, that depends on correct workqueue operation. I thought
that was fixed by Tejun?
On Mon, Nov 02, 2015 at 12:10:04PM -0600, Christoph Lameter wrote:
> Well, true, that depends on correct workqueue operation. I thought
> that was fixed by Tejun?
At least for now, we're going with Tetsuo's short sleep patch.
Thanks.
--
tejun
Christoph Lameter wrote:
> On Sat, 31 Oct 2015, Tetsuo Handa wrote:
>
> > Then, you need to update below description (or drop it) because
> > patch 3/3 alone will not guarantee that the counters are up to date.
>
> The vmstat system does not guarantee that the counters are up to date
> always. The whole point is the deferral of updates for performance
> reasons. They are updated *at some point* within stat_interval. That needs
> to happen and that is what this patchset is fixing.
So, if you refer to the blocking of the execution of vmstat updates,
the description for patch 3/3 should be updated to something like below?
----------
Since __GFP_WAIT memory allocations do not call schedule()
when there is nothing to reclaim, and the workqueue does not kick
the remaining workqueue items unless the in-flight workqueue item calls
schedule(), __GFP_WAIT memory allocation requests by workqueue
items can block the vmstat_update work item forever.
Since the zone_reclaimable() decision depends on the vmstat counters
being up to date, a silent lockup occurs because a workqueue
item doing a __GFP_WAIT memory allocation request keeps
using outdated vmstat counters.
In order to fix this problem, we need to allocate a dedicated
workqueue for vmstat. Note that this patch by itself does not fix
the lockup problem. Tejun will develop a patch which detects the
lockup situation and kicks the remaining workqueue items.
----------
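To illustrate the first two paragraphs of the description above, here is
a toy user-space model in Python. It is only a sketch, not kernel code:
run_pool(), simulate(), pcp_diff, global_scanned and limit are names made
up for this example, the comparison against "limit" is merely a stand-in
for the zone_reclaimable() decision, and the pool is assumed to start a
queued item only when the in-flight item sleeps or finishes. The
"sleeps=True" case loosely corresponds to an allocator that sleeps
briefly instead of spinning.

```python
from collections import deque

SPIN, SLEEP = "spin", "sleep"  # what the in-flight item does at each step


def allocator(state, sleeps):
    """Work item retrying an allocation while the zone still looks reclaimable."""
    while state["global_scanned"] < state["limit"]:  # stand-in for zone_reclaimable()
        state["pcp_diff"] += 1                       # progress is recorded per-CPU only
        yield SLEEP if sleeps else SPIN
    state["gave_up"] = True                          # global counter caught up; stop retrying


def vmstat_update(state):
    """Work item folding the per-CPU diff into the global counter."""
    while True:
        state["global_scanned"] += state["pcp_diff"]
        state["pcp_diff"] = 0
        yield SLEEP


def run_pool(items, steps):
    """Hypothetical pool: a queued item is kicked only when the in-flight one sleeps or ends."""
    pending = deque(items)  # queued, not yet kicked
    active = deque()        # started; active[0] is the in-flight item
    for _ in range(steps):
        if not active:
            if not pending:
                return
            active.append(pending.popleft())
        try:
            action = next(active[0])
        except StopIteration:
            active.popleft()  # finished; the next loop iteration may start another item
            continue
        if action == SLEEP:
            active.rotate(-1)  # in-flight item sleeps, so other started items can run
            if pending:
                active.append(pending.popleft())  # and one queued item is kicked
        # SPIN: the item keeps the worker and nothing else is kicked


def simulate(sleeps, steps=100):
    state = {"global_scanned": 0, "pcp_diff": 0, "limit": 5, "gave_up": False}
    run_pool([allocator(state, sleeps), vmstat_update(state)], steps)
    return state
```

With simulate(sleeps=False) the vmstat_update item is never started, so
the global counter stays stale and the allocator keeps retrying for the
whole run. With simulate(sleeps=True) the update item gets kicked, the
counter is folded, and the allocator stops retrying.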
On Fri, 6 Nov 2015, Tetsuo Handa wrote:
> So, if you refer to the blocking of the execution of vmstat updates,
> the description for patch 3/3 should be updated to something like below?
Ok that is much better.
> ----------
> Since __GFP_WAIT memory allocations do not call schedule()
> when there is nothing to reclaim, and the workqueue does not kick
> the remaining workqueue items unless the in-flight workqueue item calls
> schedule(), __GFP_WAIT memory allocation requests by workqueue
> items can block the vmstat_update work item forever.
>
> Since the zone_reclaimable() decision depends on the vmstat counters
> being up to date, a silent lockup occurs because a workqueue
> item doing a __GFP_WAIT memory allocation request keeps
> using outdated vmstat counters.
>
> In order to fix this problem, we need to allocate a dedicated
> workqueue for vmstat. Note that this patch by itself does not fix
> the lockup problem. Tejun will develop a patch which detects the
> lockup situation and kicks the remaining workqueue items.
> ----------
>