The show method of /sys/kernel/slab/<slab>/shrink sysfs file currently
returns nothing. This is now modified to show the time of the last
cache shrink operation in us.
CONFIG_SLUB_DEBUG depends on CONFIG_SYSFS. So the new shrink_us field
is always available to the shrink methods.
Signed-off-by: Waiman Long <[email protected]>
---
Documentation/ABI/testing/sysfs-kernel-slab | 2 ++
include/linux/slub_def.h | 1 +
mm/slub.c | 12 +++++++++---
3 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
index 94ffd47fc8d7..9869a3f57dc3 100644
--- a/Documentation/ABI/testing/sysfs-kernel-slab
+++ b/Documentation/ABI/testing/sysfs-kernel-slab
@@ -437,6 +437,8 @@ Description:
write for shrinking the cache. Other input values are
considered invalid. If it is a root cache, all the
child memcg caches will also be shrunk, if available.
+ When read, the time in us of the last cache shrink
+ operation is shown.
What: /sys/kernel/slab/cache/slab_size
Date: May 2007
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index d2153789bd9f..055474197e83 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -113,6 +113,7 @@ struct kmem_cache {
/* For propagation, maximum size of a stored attr */
unsigned int max_attr_size;
#ifdef CONFIG_SYSFS
+ unsigned int shrink_us; /* Cache shrink time in us */
struct kset *memcg_kset;
#endif
#endif
diff --git a/mm/slub.c b/mm/slub.c
index 9736eb10dcb8..77d67a55ce43 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -34,6 +34,7 @@
#include <linux/prefetch.h>
#include <linux/memcontrol.h>
#include <linux/random.h>
+#include <linux/sched/clock.h>
#include <trace/events/kmem.h>
@@ -5287,16 +5288,21 @@ SLAB_ATTR(failslab);
static ssize_t shrink_show(struct kmem_cache *s, char *buf)
{
- return 0;
+ return sprintf(buf, "%u\n", s->shrink_us);
}
static ssize_t shrink_store(struct kmem_cache *s,
const char *buf, size_t length)
{
- if (buf[0] == '1')
+ if (buf[0] == '1') {
+ u64 start = sched_clock();
+
kmem_cache_shrink_all(s);
- else
+ s->shrink_us = (unsigned int)div_u64(sched_clock() - start,
+ NSEC_PER_USEC);
+ } else {
return -EINVAL;
+ }
return length;
}
SLAB_ATTR(shrink);
--
2.18.1
On Wed, 17 Jul 2019, Waiman Long wrote:
> The show method of /sys/kernel/slab/<slab>/shrink sysfs file currently
> returns nothing. This is now modified to show the time of the last
> cache shrink operation in us.
What is this useful for? Any use cases?
> CONFIG_SLUB_DEBUG depends on CONFIG_SYSFS. So the new shrink_us field
> is always available to the shrink methods.
Aside from minimal systems without CONFIG_SYSFS... Does this build without
CONFIG_SYSFS?
On 7/18/19 7:39 AM, Christopher Lameter wrote:
> On Wed, 17 Jul 2019, Waiman Long wrote:
>
>> The show method of /sys/kernel/slab/<slab>/shrink sysfs file currently
>> returns nothing. This is now modified to show the time of the last
>> cache shrink operation in us.
> What is this useful for? Any use cases?
I got query about how much time will the slab_mutex be held when
shrinking the cache. I don't have a solid answer as it depends on how
many memcg caches are there. This patch is a partial answer to that as
it give a rough upper bound of the lock hold time.
>> CONFIG_SLUB_DEBUG depends on CONFIG_SYSFS. So the new shrink_us field
>> is always available to the shrink methods.
> Aside from minimal systems without CONFIG_SYSFS... Does this build without
> CONFIG_SYSFS?
The sysfs code in mm/slub.c is guarded by CONFIG_SLUB_DEBUG which, in
turn, depends on CONFIG_SYSFS. So if CONFIG_SYSFS is off, the shrink
sysfs methods will be off as well. I haven't tried doing a minimal
build. I will certainly try that, but I don't expect any problem here.
Cheers,
Longman
On 7/18/19 10:36 AM, Waiman Long wrote:
>>> CONFIG_SLUB_DEBUG depends on CONFIG_SYSFS. So the new shrink_us field
>>> is always available to the shrink methods.
>> Aside from minimal systems without CONFIG_SYSFS... Does this build without
>> CONFIG_SYSFS?
> The sysfs code in mm/slub.c is guarded by CONFIG_SLUB_DEBUG which, in
> turn, depends on CONFIG_SYSFS. So if CONFIG_SYSFS is off, the shrink
> sysfs methods will be off as well. I haven't tried doing a minimal
> build. I will certainly try that, but I don't expect any problem here.
I have tried a tiny config with slub. There was no compilation problem.
-Longman
On Wed 17-07-19 16:24:13, Waiman Long wrote:
> The show method of /sys/kernel/slab/<slab>/shrink sysfs file currently
> returns nothing. This is now modified to show the time of the last
> cache shrink operation in us.
Isn't this something that tracing can be used for without any kernel
modifications?
--
Michal Hocko
SUSE Labs
On 7/19/19 2:14 AM, Michal Hocko wrote:
> On Wed 17-07-19 16:24:13, Waiman Long wrote:
>> The show method of /sys/kernel/slab/<slab>/shrink sysfs file currently
>> returns nothing. This is now modified to show the time of the last
>> cache shrink operation in us.
> Isn't this something that tracing can be used for without any kernel
> modifications?
That is true, but it will be a bit more cumbersome to get the data.
Anyway, this is just a nice to have patch for me. I am perfectly fine
with dropping it if this does not prove to be that useful.
Thanks,
Longman
On Fri 19-07-19 10:07:20, Waiman Long wrote:
> On 7/19/19 2:14 AM, Michal Hocko wrote:
> > On Wed 17-07-19 16:24:13, Waiman Long wrote:
> >> The show method of /sys/kernel/slab/<slab>/shrink sysfs file currently
> >> returns nothing. This is now modified to show the time of the last
> >> cache shrink operation in us.
> > Isn't this something that tracing can be used for without any kernel
> > modifications?
>
> That is true, but it will be a bit more cumbersome to get the data.
I have no say for this code but if there is a way to capture timing data
I prefer to rely on the tracing infrastructure. If the current tooling
makes it cumbersome to get then this is a good reason to ask for a less
cumbersome way. On the other hand, if you somehow hardwire it to a user
visible interface then you just establish ABI which might stand in way
for potential/future development.
So take it as my 2c
--
Michal Hocko
SUSE Labs