2021-07-27 05:37:42

by Vasily Averin

Subject: [PATCH v7 09/10] memcg: enable accounting for tty-related objects

Each login forces the kernel to create a new terminal and allocate
up to ~1 KB of memory for the tty-related structures.

By default, up to 4096 ptys can be created, with 1024 reserved for the
initial mount namespace only; these settings are controlled by the host admin.

However, this default is not enough for hosters with thousands
of containers per node, so the host admin may be forced to raise it,
up to NR_UNIX98_PTY_MAX = 1<<20.

By default a container is restricted by pty mount_opt.max = 1024,
but an admin inside the container can change it via remount. As a result,
one container can consume almost all allowed ptys
and allocate up to 1 GB of unaccounted memory.

This is not enough per se to trigger OOM on the host, but it allows
a container to significantly exceed its assigned memcg limit and causes
trouble on an over-committed node.

It makes sense to account for these allocations to restrict the host's
memory consumption from inside a memcg-limited container.

Signed-off-by: Vasily Averin <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
---
drivers/tty/tty_io.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 26debec..e787f6f 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1493,7 +1493,7 @@ void tty_save_termios(struct tty_struct *tty)
 	/* Stash the termios data */
 	tp = tty->driver->termios[idx];
 	if (tp == NULL) {
-		tp = kmalloc(sizeof(*tp), GFP_KERNEL);
+		tp = kmalloc(sizeof(*tp), GFP_KERNEL_ACCOUNT);
 		if (tp == NULL)
 			return;
 		tty->driver->termios[idx] = tp;
@@ -3119,7 +3119,7 @@ struct tty_struct *alloc_tty_struct(struct tty_driver *driver, int idx)
 {
 	struct tty_struct *tty;

-	tty = kzalloc(sizeof(*tty), GFP_KERNEL);
+	tty = kzalloc(sizeof(*tty), GFP_KERNEL_ACCOUNT);
 	if (!tty)
 		return NULL;

--
1.8.3.1
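
The "up to 1Gb" figure in the description above follows from simple arithmetic: NR_UNIX98_PTY_MAX ptys at roughly 1 KB each. A minimal sketch of that calculation, assuming exactly 1024 bytes per pty (the description only says "up to ~1Kb"; ASSUMED_BYTES_PER_PTY and worst_case_tty_bytes() are illustrative names, not kernel symbols):

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the pty count, as quoted in the patch description. */
#define NR_UNIX98_PTY_MAX (1u << 20)

/* Assumption: the description says "up to ~1Kb" per terminal. */
#define ASSUMED_BYTES_PER_PTY 1024u

/* Worst-case bytes of tty-related memory for nr_ptys terminals. */
static uint64_t worst_case_tty_bytes(uint32_t nr_ptys)
{
	assert(nr_ptys <= NR_UNIX98_PTY_MAX);
	return (uint64_t)nr_ptys * ASSUMED_BYTES_PER_PTY;
}
```

Under this assumption the default container limit of 1024 ptys costs about 1 MiB, while a container that remounts with the maximum can pin about 1 GiB that no memcg is charged for.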


2021-07-27 06:10:38

by Greg Kroah-Hartman

Subject: Re: [PATCH v7 09/10] memcg: enable accounting for tty-related objects

On Tue, Jul 27, 2021 at 08:34:14AM +0300, Vasily Averin wrote:
> At each login the user forces the kernel to create a new terminal and
> allocate up to ~1Kb memory for the tty-related structures.
>
> By default it's allowed to create up to 4096 ptys with 1024 reserve for
> initial mount namespace only and the settings are controlled by host admin.
>
> Though this default is not enough for hosters with thousands
> of containers per node. Host admin can be forced to increase it
> up to NR_UNIX98_PTY_MAX = 1<<20.
>
> By default container is restricted by pty mount_opt.max = 1024,
> but admin inside container can change it via remount. As a result,
> one container can consume almost all allowed ptys
> and allocate up to 1Gb of unaccounted memory.
>
> It is not enough per-se to trigger OOM on host, however anyway, it allows
> to significantly exceed the assigned memcg limit and leads to troubles
> on the over-committed node.
>
> It makes sense to account for them to restrict the host's memory
> consumption from inside the memcg-limited container.
>
> Signed-off-by: Vasily Averin <[email protected]>
> Acked-by: Greg Kroah-Hartman <[email protected]>
> ---
> drivers/tty/tty_io.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

As this is independent of all of the rest, I'll just take this through
my tree now so that you do not have to keep resending it.

thanks,

greg k-h

2021-07-27 06:56:24

by Jiri Slaby

Subject: Re: [PATCH v7 09/10] memcg: enable accounting for tty-related objects

On 27. 07. 21, 7:34, Vasily Averin wrote:
> At each login the user forces the kernel to create a new terminal and
> allocate up to ~1Kb memory for the tty-related structures.
>
> By default it's allowed to create up to 4096 ptys with 1024 reserve for
> initial mount namespace only and the settings are controlled by host admin.
>
> Though this default is not enough for hosters with thousands
> of containers per node. Host admin can be forced to increase it
> up to NR_UNIX98_PTY_MAX = 1<<20.
>
> By default container is restricted by pty mount_opt.max = 1024,
> but admin inside container can change it via remount. As a result,
> one container can consume almost all allowed ptys
> and allocate up to 1Gb of unaccounted memory.
>
> It is not enough per-se to trigger OOM on host, however anyway, it allows
> to significantly exceed the assigned memcg limit and leads to troubles
> on the over-committed node.
>
> It makes sense to account for them to restrict the host's memory
> consumption from inside the memcg-limited container.
>
> Signed-off-by: Vasily Averin <[email protected]>
> Acked-by: Greg Kroah-Hartman <[email protected]>
> ---
> drivers/tty/tty_io.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 26debec..e787f6f 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -1493,7 +1493,7 @@ void tty_save_termios(struct tty_struct *tty)
> /* Stash the termios data */
> tp = tty->driver->termios[idx];
> if (tp == NULL) {
> - tp = kmalloc(sizeof(*tp), GFP_KERNEL);
> + tp = kmalloc(sizeof(*tp), GFP_KERNEL_ACCOUNT);

termios are not saved for PTYs (TTY_DRIVER_RESET_TERMIOS). Am I missing
something?

> if (tp == NULL)
> return;
> tty->driver->termios[idx] = tp;
> @@ -3119,7 +3119,7 @@ struct tty_struct *alloc_tty_struct(struct tty_driver *driver, int idx)
> {
> struct tty_struct *tty;
>
> - tty = kzalloc(sizeof(*tty), GFP_KERNEL);
> + tty = kzalloc(sizeof(*tty), GFP_KERNEL_ACCOUNT);
> if (!tty)
> return NULL;
>
>

thanks,
--
js
suse labs

2021-07-27 08:07:11

by Vasily Averin

Subject: Re: [PATCH v7 09/10] memcg: enable accounting for tty-related objects

On 7/27/21 9:54 AM, Jiri Slaby wrote:
> On 27. 07. 21, 7:34, Vasily Averin wrote:
>> At each login the user forces the kernel to create a new terminal and
>> allocate up to ~1Kb memory for the tty-related structures.
>>
>> By default it's allowed to create up to 4096 ptys with 1024 reserve for
>> initial mount namespace only and the settings are controlled by host admin.
>>
>> Though this default is not enough for hosters with thousands
>> of containers per node. Host admin can be forced to increase it
>> up to NR_UNIX98_PTY_MAX = 1<<20.
>>
>> By default container is restricted by pty mount_opt.max = 1024,
>> but admin inside container can change it via remount. As a result,
>> one container can consume almost all allowed ptys
>> and allocate up to 1Gb of unaccounted memory.
>>
>> It is not enough per-se to trigger OOM on host, however anyway, it allows
>> to significantly exceed the assigned memcg limit and leads to troubles
>> on the over-committed node.
>>
>> It makes sense to account for them to restrict the host's memory
>> consumption from inside the memcg-limited container.
>>
>> Signed-off-by: Vasily Averin <[email protected]>
>> Acked-by: Greg Kroah-Hartman <[email protected]>
>> ---
>>   drivers/tty/tty_io.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
>> index 26debec..e787f6f 100644
>> --- a/drivers/tty/tty_io.c
>> +++ b/drivers/tty/tty_io.c
>> @@ -1493,7 +1493,7 @@ void tty_save_termios(struct tty_struct *tty)
>>       /* Stash the termios data */
>>       tp = tty->driver->termios[idx];
>>       if (tp == NULL) {
>> -        tp = kmalloc(sizeof(*tp), GFP_KERNEL);
>> +        tp = kmalloc(sizeof(*tp), GFP_KERNEL_ACCOUNT);
>
> termios are not saved for PTYs (TTY_DRIVER_RESET_TERMIOS). Am I missing something?

No, you are right, I missed this.
Typical terminals inside containers use the TTY_DRIVER_RESET_TERMIOS flag and therefore do not save termios.
So accounting for it has near-zero impact in real life.
I'll prepare a fixup to drop GFP_KERNEL_ACCOUNT here.

Thank you very much,
Vasily Averin

>>           if (tp == NULL)
>>               return;
>>           tty->driver->termios[idx] = tp;
>> @@ -3119,7 +3119,7 @@ struct tty_struct *alloc_tty_struct(struct tty_driver *driver, int idx)
>>   {
>>       struct tty_struct *tty;
>>   -    tty = kzalloc(sizeof(*tty), GFP_KERNEL);
>> +    tty = kzalloc(sizeof(*tty), GFP_KERNEL_ACCOUNT);
>>       if (!tty)
>>           return NULL;
>>  
>
> thanks,
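
The point raised above can be illustrated with a small userspace model: when a driver is flagged to reset termios, the save path returns before it ever reaches the allocation, so there is nothing to account. This is only a sketch under stated assumptions. Every model_* and MODEL_* name is hypothetical, the flag value is made up, and malloc() stands in for kmalloc(); it is not the kernel's tty_save_termios().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel definitions; values are made up. */
#define MODEL_DRIVER_RESET_TERMIOS 0x1u /* models TTY_DRIVER_RESET_TERMIOS */
#define MODEL_NR_LINES 4

struct model_termios {
	unsigned int c_cflag;
};

struct model_driver {
	unsigned int flags;
	struct model_termios *termios[MODEL_NR_LINES];
};

/* Returns true only if a copy of *cur was stashed for line idx. */
static bool model_save_termios(struct model_driver *drv, int idx,
			       const struct model_termios *cur)
{
	assert(idx >= 0 && idx < MODEL_NR_LINES);

	/* A driver that resets termios on reopen has nothing to save. */
	if (drv->flags & MODEL_DRIVER_RESET_TERMIOS)
		return false;

	/* Allocate the per-line slot lazily, on the first save. */
	if (drv->termios[idx] == NULL) {
		drv->termios[idx] = malloc(sizeof(*cur));
		if (drv->termios[idx] == NULL)
			return false;
	}
	*drv->termios[idx] = *cur;
	return true;
}
```

In this model a pty-like driver never allocates, which is why changing the allocation flags on that path would have almost no effect for container workloads.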


2021-07-27 09:30:57

by Vasily Averin

Subject: [PATCH TTY] memcg: drop GFP_KERNEL_ACCOUNT use in tty_save_termios()

Jiri Slaby pointed out that termios are not saved for PTYs or for other
terminals used inside containers. Therefore accounting for saved
termios has near-zero impact in real-life scenarios.

Cc: Jiri Slaby <[email protected]>
Fixes: 854dd8a572a0 ("memcg: enable accounting for tty-related objects")
Signed-off-by: Vasily Averin <[email protected]>
---
drivers/tty/tty_io.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index e787f6f..a6230b2 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1493,7 +1493,7 @@ void tty_save_termios(struct tty_struct *tty)
 	/* Stash the termios data */
 	tp = tty->driver->termios[idx];
 	if (tp == NULL) {
-		tp = kmalloc(sizeof(*tp), GFP_KERNEL_ACCOUNT);
+		tp = kmalloc(sizeof(*tp), GFP_KERNEL);
 		if (tp == NULL)
 			return;
 		tty->driver->termios[idx] = tp;
--
1.8.3.1


2021-07-27 09:33:52

by Greg Kroah-Hartman

Subject: Re: [PATCH v7 09/10] memcg: enable accounting for tty-related objects

On Tue, Jul 27, 2021 at 11:02:31AM +0300, Vasily Averin wrote:
> On 7/27/21 9:54 AM, Jiri Slaby wrote:
> > On 27. 07. 21, 7:34, Vasily Averin wrote:
> >> At each login the user forces the kernel to create a new terminal and
> >> allocate up to ~1Kb memory for the tty-related structures.
> >>
> >> By default it's allowed to create up to 4096 ptys with 1024 reserve for
> >> initial mount namespace only and the settings are controlled by host admin.
> >>
> >> Though this default is not enough for hosters with thousands
> >> of containers per node. Host admin can be forced to increase it
> >> up to NR_UNIX98_PTY_MAX = 1<<20.
> >>
> >> By default container is restricted by pty mount_opt.max = 1024,
> >> but admin inside container can change it via remount. As a result,
> >> one container can consume almost all allowed ptys
> >> and allocate up to 1Gb of unaccounted memory.
> >>
> >> It is not enough per-se to trigger OOM on host, however anyway, it allows
> >> to significantly exceed the assigned memcg limit and leads to troubles
> >> on the over-committed node.
> >>
> >> It makes sense to account for them to restrict the host's memory
> >> consumption from inside the memcg-limited container.
> >>
> >> Signed-off-by: Vasily Averin <[email protected]>
> >> Acked-by: Greg Kroah-Hartman <[email protected]>
> >> ---
> >>   drivers/tty/tty_io.c | 4 ++--
> >>   1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> >> index 26debec..e787f6f 100644
> >> --- a/drivers/tty/tty_io.c
> >> +++ b/drivers/tty/tty_io.c
> >> @@ -1493,7 +1493,7 @@ void tty_save_termios(struct tty_struct *tty)
> >>       /* Stash the termios data */
> >>       tp = tty->driver->termios[idx];
> >>       if (tp == NULL) {
> >> -        tp = kmalloc(sizeof(*tp), GFP_KERNEL);
> >> +        tp = kmalloc(sizeof(*tp), GFP_KERNEL_ACCOUNT);
> >
> > termios are not saved for PTYs (TTY_DRIVER_RESET_TERMIOS). Am I missing something?
>
> No, you are right, I've missed this.
> Typical terminals inside containers use TTY_DRIVER_RESET_TERMIOS flag and therefore do not save termios.
> So its accounting have near-to-zero impact in real life.
> I'll prepare fixup to drop GFP_KERNEL_ACCOUNT here.

I'll go drop this patch from my tree.

thanks,

greg k-h

2021-07-27 09:34:28

by Greg Kroah-Hartman

Subject: Re: [PATCH TTY] memcg: drop GFP_KERNEL_ACCOUNT use in tty_save_termios()

On Tue, Jul 27, 2021 at 12:26:12PM +0300, Vasily Averin wrote:
> Jiri Slaby pointed that termios are not saved for PTYs and for other
> terminals used inside containers. Therefore accounting for saved
> termios have near to zero impact in real life scenarios.
>
> Cc: Jiri Slaby <[email protected]>
> Fixes: 854dd8a572a0 ("memcg: enable accounting for tty-related objects")
> Signed-off-by: Vasily Averin <[email protected]>
> ---
> drivers/tty/tty_io.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index e787f6f..a6230b2 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -1493,7 +1493,7 @@ void tty_save_termios(struct tty_struct *tty)
> /* Stash the termios data */
> tp = tty->driver->termios[idx];
> if (tp == NULL) {
> - tp = kmalloc(sizeof(*tp), GFP_KERNEL_ACCOUNT);
> + tp = kmalloc(sizeof(*tp), GFP_KERNEL);
> if (tp == NULL)
> return;
> tty->driver->termios[idx] = tp;
> --
> 1.8.3.1
>

I can just drop the original patch from my tree, it has not gone into my
immutable branch yet.

thanks,

greg k-h

2022-02-28 10:03:17

by Vasily Averin

Subject: [PATCH v2] memcg: enable accounting for tty-related objects

Each login forces the kernel to create a new terminal and allocate
up to ~1 KB of memory for the tty-related structures.

By default, up to 4096 ptys can be created, with 1024 reserved for the
initial mount namespace only; these settings are controlled by the host admin.

However, this default is not enough for hosters with thousands
of containers per node, so the host admin may be forced to raise it,
up to NR_UNIX98_PTY_MAX = 1<<20.

By default a container is restricted by pty mount_opt.max = 1024,
but an admin inside the container can change it via remount. As a result,
one container can consume almost all allowed ptys
and allocate up to 1 GB of unaccounted memory.

This is not enough per se to trigger OOM on the host, but it allows
a container to significantly exceed its assigned memcg limit and causes
trouble on an over-committed node.

It makes sense to account for these allocations to restrict the host's
memory consumption from inside a memcg-limited container.

v2: removed the hunk that patched tty_save_termios().
Jiri Slaby pointed out that termios are not saved for PTYs or for other
terminals used inside containers. Therefore accounting for saved
termios has near-zero impact in real-life scenarios.
The v1 patch was dropped because of this issue,
but the hunk that patches alloc_tty_struct() is still relevant.

Signed-off-by: Vasily Averin <[email protected]>
---
drivers/tty/tty_io.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 7e8b3bd59c7b..8fec1d8648f5 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3088,7 +3088,7 @@ struct tty_struct *alloc_tty_struct(struct tty_driver *driver, int idx)
 {
 	struct tty_struct *tty;

-	tty = kzalloc(sizeof(*tty), GFP_KERNEL);
+	tty = kzalloc(sizeof(*tty), GFP_KERNEL_ACCOUNT);
 	if (!tty)
 		return NULL;

--
2.25.1
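
For readers unfamiliar with the flag: in the kernel, GFP_KERNEL_ACCOUNT is GFP_KERNEL with __GFP_ACCOUNT added, which asks the allocator to charge the object to the allocating task's memory cgroup. The effect of the one-line change can be sketched with a toy userspace model. Everything named model_* or MODEL_* below is hypothetical, the flag values are made up, and the model simply fails an over-limit allocation, whereas the real kernel may try to reclaim memory first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits; the real gfp values differ. */
#define MODEL_GFP_KERNEL         0x1u
#define MODEL_GFP_ACCOUNT        0x2u
#define MODEL_GFP_KERNEL_ACCOUNT (MODEL_GFP_KERNEL | MODEL_GFP_ACCOUNT)

/* Toy memory cgroup: a byte limit and a running charge. */
struct model_memcg {
	size_t limit;
	size_t charged;
};

/* Allocate zeroed memory, charging cg only when ACCOUNT is requested. */
static void *model_alloc(struct model_memcg *cg, size_t size,
			 unsigned int flags)
{
	void *p;

	assert(flags & MODEL_GFP_KERNEL);
	assert(cg->charged <= cg->limit);

	/* Accounted allocations must fit in the remaining budget. */
	if ((flags & MODEL_GFP_ACCOUNT) && size > cg->limit - cg->charged)
		return NULL;

	p = calloc(1, size);
	if (p != NULL && (flags & MODEL_GFP_ACCOUNT))
		cg->charged += size;
	return p;
}
```

With the plain flag the container's charge never moves, no matter how many terminals it opens; with the accounting flag each allocation counts against the container's own limit.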