Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752071Ab3FZNp5 (ORCPT ); Wed, 26 Jun 2013 09:45:57 -0400 Received: from mx1.redhat.com ([209.132.183.28]:9766 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751902Ab3FZNp4 (ORCPT ); Wed, 26 Jun 2013 09:45:56 -0400 Date: Wed, 26 Jun 2013 09:45:48 -0400 From: Luiz Capitulino To: Michal Hocko Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, minchan@kernel.org, anton@enomsg.org, akpm@linux-foundation.org Subject: Re: [PATCH] vmpressure: implement strict mode Message-ID: <20130626094548.24c68511@redhat.com> In-Reply-To: <20130626080827.GE28748@dhcp22.suse.cz> References: <20130625175129.7c0d79e1@redhat.com> <20130626080827.GE28748@dhcp22.suse.cz> Organization: Red Hat Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5759 Lines: 153 On Wed, 26 Jun 2013 10:08:27 +0200 Michal Hocko wrote: > On Tue 25-06-13 17:51:29, Luiz Capitulino wrote: > > Currently, applications are notified for the level they registered for > > _plus_ higher levels. > > > > This is a problem if the application wants to implement different > > actions for different levels. For example, an application might want > > to release 10% of its cache on level low, 50% on medium and 100% on > > critical. To do this, the application has to register a different fd > > for each event. However, fd low is always going to be notified and > > and all fds are going to be notified on level critical. > > OK, I am not user of this interface but I thought that the application > would take an action of the highest level it gets notification. But yes > this might get clumsy to implement. > > > Strict mode solves this problem by strictly notifiying the event > > an fd has registered for. It's optional. By default we still notify > > on higher levels. > > OK, makes some sense to me and it should work with the proposed edge > trigerring as well. > > > Signed-off-by: Luiz Capitulino > > --- > > > > PS: I'm following the discussion on the event storm problem, but I believe > > strict mode is orthogonal to what has been suggested (although the > > patches conflict) > > > > Documentation/cgroups/memory.txt | 10 ++++++---- > > mm/vmpressure.c | 19 +++++++++++++++++-- > > 2 files changed, 23 insertions(+), 6 deletions(-) > > > > diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt > > index ddf4f93..3c589cf 100644 > > --- a/Documentation/cgroups/memory.txt > > +++ b/Documentation/cgroups/memory.txt > > @@ -807,12 +807,14 @@ register a notification, an application must: > > > > - create an eventfd using eventfd(2); > > - open memory.pressure_level; > > -- write string like " " > > +- write string like " [strict]" > > to cgroup.event_control. > > > > -Application will be notified through eventfd when memory pressure is at > > -the specific level (or higher). Read/write operations to > > -memory.pressure_level are no implemented. > > +Applications will be notified through eventfd when memory pressure is at > > +the specific level or higher. If strict is passed, then applications > > +will only be notified when memory pressure reaches the specified level. > > It would be good to describe when is strick and when the default > appropriate. Yeah, Anton asked for the same thing. Will do it. > > + > > +Read/write operations to memory.pressure_level are no implemented. > > > > Test: > > > > diff --git a/mm/vmpressure.c b/mm/vmpressure.c > > index 736a601..6289ede 100644 > > --- a/mm/vmpressure.c > > +++ b/mm/vmpressure.c > > @@ -137,6 +137,7 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned, > > struct vmpressure_event { > > struct eventfd_ctx *efd; > > enum vmpressure_levels level; > > + bool strict_mode; > > Any reason to not using a flag for this? If there are any other modes to > come them we would end up with zilions of bools which is not very nice. Good point, I'll change it. > > > struct list_head node; > > }; > > > > @@ -153,6 +154,9 @@ static bool vmpressure_event(struct vmpressure *vmpr, > > > > list_for_each_entry(ev, &vmpr->events, node) { > > if (level >= ev->level) { > > + /* strict mode ensures level == ev->level */ > > + if (ev->strict_mode && level != ev->level) > > + continue; > > eventfd_signal(ev->efd, 1); > > signalled = true; > > } > > @@ -292,7 +296,7 @@ void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio) > > * infrastructure, so that the notifications will be delivered to the > > * @eventfd. The @args parameter is a string that denotes pressure level > > * threshold (one of vmpressure_str_levels, i.e. "low", "medium", or > > - * "critical"). > > + * "critical") and optionally a different operating mode (i.e. "strict") > > * > > * This function should not be used directly, just pass it to (struct > > * cftype).register_event, and then cgroup core will handle everything by > > @@ -303,22 +307,33 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft, > > { > > struct vmpressure *vmpr = cg_to_vmpressure(cg); > > struct vmpressure_event *ev; > > + bool smode = false; > > + const char *p; > > int level; > > > > for (level = 0; level < VMPRESSURE_NUM_LEVELS; level++) { > > - if (!strcmp(vmpressure_str_levels[level], args)) > > + p = vmpressure_str_levels[level]; > > + if (!strncmp(p, args, strlen(p))) > > break; > > } > > > > if (level >= VMPRESSURE_NUM_LEVELS) > > return -EINVAL; > > > > + p = strchr(args, ' '); > > + if (p) { > > + if (strncmp(++p, "strict", 6)) > > + return -EINVAL; > > + smode = true; > > + } > > + > > ev = kzalloc(sizeof(*ev), GFP_KERNEL); > > if (!ev) > > return -ENOMEM; > > > > ev->efd = eventfd; > > ev->level = level; > > + ev->strict_mode = smode; > > > > mutex_lock(&vmpr->events_lock); > > list_add(&ev->node, &vmpr->events); > > -- > > 1.8.1.4 > > > > -- > > To unsubscribe, send a message with 'unsubscribe linux-mm' in > > the body to majordomo@kvack.org. For more info on Linux MM, > > see: http://www.linux-mm.org/ . > > Don't email: email@kvack.org > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/