2021-10-15 11:15:07

by Xianting Tian

[permalink] [raw]
Subject: [PATCH v11 0/3] make hvc pass dma capable memory to its backend

Dear all,

This patch series make hvc framework pass DMA capable memory to
put_chars() of hvc backend(eg, virtio-console), and revert commit
c4baad5029 ("virtio-console: avoid DMA from stack”)

V1
virtio-console: avoid DMA from vmalloc area
https://lkml.org/lkml/2021/7/27/494

For v1 patch, Arnd Bergmann suggests to fix the issue in the first
place:
Make hvc pass DMA capable memory to put_chars()
The fix suggestion is included in v2.

V2
[PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/1/8
[PATCH 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/1/9

For v2 patch, Arnd Bergmann suggests to make new buf part of the
hvc_struct structure, and fix the compile issue.
The fix suggestion is included in v3.

V3
[PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/3/1347
[PATCH v3 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/3/1348

For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of
hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to
calculate the size of hvc_struct. The fix suggestion is included in
v4.

V4
[PATCH v4 0/2] make hvc pass dma capable memory to its backend
https://lkml.org/lkml/2021/8/5/1350
[PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/5/1351
[PATCH v4 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/5/1352

For v4 patch, Arnd Bergmann suggests to introduce another
array(cons_outbuf[]) for the buffer pointers next to the cons_ops[]
and vtermnos[] arrays. This fix included in this v5 patch.

V5
Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment,
use 'sizeof(long)' as dma alignment is wrong. fix it in v6.

V6
It contains coding error, fix it in v7 and it worked normally
according to test result.

V7
Greg KH suggests to add test and code review developer,
Jiri Slaby suggests to use lockless buffer and fix dma alignment
in separate patch.
fix above things in v8.

V8
This contains coding error when switch to use new buffer. fix it in v9.

V9
It didn't make things much clearer, it needs add more comments for new added buf.
Add use lock to protect new added buffer. fix in v10.

V10
Remove 'char outchar' and its lock from hvc_struct, adjust hvc_struct and use
pahole to display the struct layout.
fix it in v11.

********TEST STEPS*********
1, config guest console=hvc0
2, start guest
3, login guest
Welcome to Buildroot
buildroot login: root
#
# cat /proc/cmdline
console=hvc0 root=/dev/vda rw init=/sbin/init
#

drivers/tty/hvc/hvc_console.c | 36 ++++++++++++++++++++---------------
drivers/tty/hvc/hvc_console.h | 21 +++++++++++++++++++-
drivers/char/virtio_console.c | 12 ++----------
3 file changed


2021-10-15 11:15:07

by Xianting Tian

[permalink] [raw]
Subject: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()

As well known, hvc backend can register its opertions to hvc backend.
the operations contain put_chars(), get_chars() and so on.

Some hvc backend may do dma in its operations. eg, put_chars() of
virtio-console. But in the code of hvc framework, it may pass DMA
incapable memory to put_chars() under a specific configuration, which
is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
1, c[] is on stack,
hvc_console_print():
char c[N_OUTBUF] __ALIGNED__;
cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
static void hvc_poll_put_char(,,char ch)
{
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;

do {
n = hp->ops->put_chars(hp->vtermno, &ch, 1);
} while (n <= 0);
}

Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
is passed to virtio-console by hvc framework in above code. But I think
the fix is aggressive, it directly uses kmemdup() to alloc new buffer
from kmalloc area and do memcpy no matter the memory is in kmalloc area
or not. But most importantly, it should better be fixed in the hvc
framework, by changing it to never pass stack memory to the put_chars()
function in the first place. Otherwise, we still face the same issue if
a new hvc backend using dma added in the furture.

In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct',
so hp->cons_outbuf is no longer the stack memory, we can use it in above
cases safely. We also add lock to protect cons_outbuf instead of using
the global lock of hvc.

Introduce another array(cons_hvcs[]) for hvc pointers next to the
cons_ops[] and vtermnos[] arrays. With the array, we can easily find
hvc's cons_outbuf and its lock.

With the patch, we can revert the fix c4baad5029.

Signed-off-by: Xianting Tian <[email protected]>
Signed-off-by: Shile Zhang <[email protected]>
---
drivers/tty/hvc/hvc_console.c | 36 ++++++++++++++++++++---------------
drivers/tty/hvc/hvc_console.h | 21 +++++++++++++++++++-
2 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5957ab728..11f2463a1 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -41,16 +41,6 @@
*/
#define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */

-/*
- * These sizes are most efficient for vio, because they are the
- * native transfer size. We could make them selectable in the
- * future to better deal with backends that want other buffer sizes.
- */
-#define N_OUTBUF 16
-#define N_INBUF 16
-
-#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
-
static struct tty_driver *hvc_driver;
static struct task_struct *hvc_task;

@@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp)
static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES];
static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
{[0 ... MAX_NR_HVC_CONSOLES - 1] = -1};
+static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES];

/*
* Console APIs, NOT TTY. These APIs are available immediately when
@@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
static void hvc_console_print(struct console *co, const char *b,
unsigned count)
{
- char c[N_OUTBUF] __ALIGNED__;
+ char *c;
unsigned i = 0, n = 0;
int r, donecr = 0, index = co->index;
+ unsigned long flags;
+ struct hvc_struct *hp;

/* Console access attempt outside of acceptable console range. */
if (index >= MAX_NR_HVC_CONSOLES)
@@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b,
if (vtermnos[index] == -1)
return;

+ hp = cons_hvcs[index];
+ if (!hp)
+ return;
+
+ c = hp->cons_outbuf;
+
+ spin_lock_irqsave(&hp->cons_outbuf_lock, flags);
while (count > 0 || i > 0) {
if (count > 0 && i < sizeof(c)) {
if (b[n] == '\n' && !donecr) {
@@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b,
}
}
}
+ spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags);
hvc_console_flush(cons_ops[index], vtermnos[index]);
}

@@ -878,9 +879,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch)
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;
+ unsigned long flags;

do {
- n = hp->ops->put_chars(hp->vtermno, &ch, 1);
+ spin_lock_irqsave(&hp->cons_outbuf_lock, flags);
+ hp->cons_outbuf[0] = ch;
+ n = hp->ops->put_chars(hp->vtermno, &hp->cons_outbuf[0], 1);
+ spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags);
} while (n <= 0);
}
#endif
@@ -922,8 +927,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
return ERR_PTR(err);
}

- hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size,
- GFP_KERNEL);
+ hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL);
if (!hp)
return ERR_PTR(-ENOMEM);

@@ -931,13 +935,13 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
hp->data = data;
hp->ops = ops;
hp->outbuf_size = outbuf_size;
- hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];

tty_port_init(&hp->port);
hp->port.ops = &hvc_port_ops;

INIT_WORK(&hp->tty_resize, hvc_set_winsz);
spin_lock_init(&hp->lock);
+ spin_lock_init(&hp->cons_outbuf_lock);
mutex_lock(&hvc_structs_mutex);

/*
@@ -964,6 +968,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
if (i < MAX_NR_HVC_CONSOLES) {
cons_ops[i] = ops;
vtermnos[i] = vtermno;
+ cons_hvcs[i] = hp;
}

list_add_tail(&(hp->next), &hvc_structs);
@@ -988,6 +993,7 @@ int hvc_remove(struct hvc_struct *hp)
if (hp->index < MAX_NR_HVC_CONSOLES) {
vtermnos[hp->index] = -1;
cons_ops[hp->index] = NULL;
+ cons_hvcs[hp->index] = NULL;
}

/* Don't whack hp->irq because tty_hangup() will need to free the irq. */
diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
index 18d005814..2c32ab67b 100644
--- a/drivers/tty/hvc/hvc_console.h
+++ b/drivers/tty/hvc/hvc_console.h
@@ -32,12 +32,21 @@
*/
#define HVC_ALLOC_TTY_ADAPTERS 8

+/*
+ * These sizes are most efficient for vio, because they are the
+ * native transfer size. We could make them selectable in the
+ * future to better deal with backends that want other buffer sizes.
+ */
+#define N_OUTBUF 16
+#define N_INBUF 16
+
+#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
+
struct hvc_struct {
struct tty_port port;
spinlock_t lock;
int index;
int do_wakeup;
- char *outbuf;
int outbuf_size;
int n_outbuf;
uint32_t vtermno;
@@ -48,6 +57,16 @@ struct hvc_struct {
struct work_struct tty_resize;
struct list_head next;
unsigned long flags;
+
+ /*
+ * the buf and its lock are used in hvc console api for putting chars,
+ * and also used in hvc_poll_put_char() for putting single char.
+ */
+ spinlock_t cons_outbuf_lock;
+ char cons_outbuf[N_OUTBUF] __ALIGNED__;
+
+ /* the buf is used for putting chars to tty */
+ char outbuf[] __ALIGNED__;
};

/* implemented by a low level driver */
--
2.17.1

2021-10-15 11:15:08

by Xianting Tian

[permalink] [raw]
Subject: [PATCH v11 1/3] tty: hvc: use correct dma alignment size

Use L1_CACHE_BYTES as the dma alignment size, use 'sizeof(long)' as
dma alignment is wrong.

Signed-off-by: Xianting Tian <[email protected]>
Signed-off-by: Shile Zhang <[email protected]>
---
drivers/tty/hvc/hvc_console.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5bb8c4e44..5957ab728 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -49,7 +49,7 @@
#define N_OUTBUF 16
#define N_INBUF 16

-#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
+#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))

static struct tty_driver *hvc_driver;
static struct task_struct *hvc_task;
--
2.17.1

2021-10-15 11:15:07

by Xianting Tian

[permalink] [raw]
Subject: [PATCH v11 3/3] virtio-console: remove unnecessary kmemdup()

This revert commit c4baad5029 ("virtio-console: avoid DMA from stack")

hvc framework will never pass stack memory to the put_chars() function,
So the calling of kmemdup() is unnecessary, we can remove it.

Signed-off-by: Xianting Tian <[email protected]>
Reviewed-by: Shile Zhang <[email protected]>
---
drivers/char/virtio_console.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 7eaf303a7..4ed3ffb1d 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count)
{
struct port *port;
struct scatterlist sg[1];
- void *data;
- int ret;

if (unlikely(early_put_chars))
return early_put_chars(vtermno, buf, count);
@@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count)
if (!port)
return -EPIPE;

- data = kmemdup(buf, count, GFP_ATOMIC);
- if (!data)
- return -ENOMEM;
-
- sg_init_one(sg, data, count);
- ret = __send_to_port(port, sg, 1, count, data, false);
- kfree(data);
- return ret;
+ sg_init_one(sg, buf, count);
+ return __send_to_port(port, sg, 1, count, (void *)buf, false);
}

/*
--
2.17.1

2021-10-16 03:09:20

by Xianting Tian

[permalink] [raw]
Subject: Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()

Hi Greg and experts

Is this version ok for you?  thanks

在 2021/10/15 上午10:46, Xianting Tian 写道:
> As well known, hvc backend can register its opertions to hvc backend.
> the operations contain put_chars(), get_chars() and so on.
>
> Some hvc backend may do dma in its operations. eg, put_chars() of
> virtio-console. But in the code of hvc framework, it may pass DMA
> incapable memory to put_chars() under a specific configuration, which
> is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
> 1, c[] is on stack,
> hvc_console_print():
> char c[N_OUTBUF] __ALIGNED__;
> cons_ops[index]->put_chars(vtermnos[index], c, i);
> 2, ch is on stack,
> static void hvc_poll_put_char(,,char ch)
> {
> struct tty_struct *tty = driver->ttys[0];
> struct hvc_struct *hp = tty->driver_data;
> int n;
>
> do {
> n = hp->ops->put_chars(hp->vtermno, &ch, 1);
> } while (n <= 0);
> }
>
> Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
> is passed to virtio-console by hvc framework in above code. But I think
> the fix is aggressive, it directly uses kmemdup() to alloc new buffer
> from kmalloc area and do memcpy no matter the memory is in kmalloc area
> or not. But most importantly, it should better be fixed in the hvc
> framework, by changing it to never pass stack memory to the put_chars()
> function in the first place. Otherwise, we still face the same issue if
> a new hvc backend using dma added in the furture.
>
> In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct',
> so hp->cons_outbuf is no longer the stack memory, we can use it in above
> cases safely. We also add lock to protect cons_outbuf instead of using
> the global lock of hvc.
>
> Introduce another array(cons_hvcs[]) for hvc pointers next to the
> cons_ops[] and vtermnos[] arrays. With the array, we can easily find
> hvc's cons_outbuf and its lock.
>
> With the patch, we can revert the fix c4baad5029.
>
> Signed-off-by: Xianting Tian <[email protected]>
> Signed-off-by: Shile Zhang <[email protected]>
> ---
> drivers/tty/hvc/hvc_console.c | 36 ++++++++++++++++++++---------------
> drivers/tty/hvc/hvc_console.h | 21 +++++++++++++++++++-
> 2 files changed, 41 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
> index 5957ab728..11f2463a1 100644
> --- a/drivers/tty/hvc/hvc_console.c
> +++ b/drivers/tty/hvc/hvc_console.c
> @@ -41,16 +41,6 @@
> */
> #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */
>
> -/*
> - * These sizes are most efficient for vio, because they are the
> - * native transfer size. We could make them selectable in the
> - * future to better deal with backends that want other buffer sizes.
> - */
> -#define N_OUTBUF 16
> -#define N_INBUF 16
> -
> -#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
> -
> static struct tty_driver *hvc_driver;
> static struct task_struct *hvc_task;
>
> @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp)
> static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES];
> static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
> {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1};
> +static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES];
>
> /*
> * Console APIs, NOT TTY. These APIs are available immediately when
> @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
> static void hvc_console_print(struct console *co, const char *b,
> unsigned count)
> {
> - char c[N_OUTBUF] __ALIGNED__;
> + char *c;
> unsigned i = 0, n = 0;
> int r, donecr = 0, index = co->index;
> + unsigned long flags;
> + struct hvc_struct *hp;
>
> /* Console access attempt outside of acceptable console range. */
> if (index >= MAX_NR_HVC_CONSOLES)
> @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b,
> if (vtermnos[index] == -1)
> return;
>
> + hp = cons_hvcs[index];
> + if (!hp)
> + return;
> +
> + c = hp->cons_outbuf;
> +
> + spin_lock_irqsave(&hp->cons_outbuf_lock, flags);
> while (count > 0 || i > 0) {
> if (count > 0 && i < sizeof(c)) {
> if (b[n] == '\n' && !donecr) {
> @@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b,
> }
> }
> }
> + spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags);
> hvc_console_flush(cons_ops[index], vtermnos[index]);
> }
>
> @@ -878,9 +879,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch)
> struct tty_struct *tty = driver->ttys[0];
> struct hvc_struct *hp = tty->driver_data;
> int n;
> + unsigned long flags;
>
> do {
> - n = hp->ops->put_chars(hp->vtermno, &ch, 1);
> + spin_lock_irqsave(&hp->cons_outbuf_lock, flags);
> + hp->cons_outbuf[0] = ch;
> + n = hp->ops->put_chars(hp->vtermno, &hp->cons_outbuf[0], 1);
> + spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags);
> } while (n <= 0);
> }
> #endif
> @@ -922,8 +927,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
> return ERR_PTR(err);
> }
>
> - hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size,
> - GFP_KERNEL);
> + hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL);
> if (!hp)
> return ERR_PTR(-ENOMEM);
>
> @@ -931,13 +935,13 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
> hp->data = data;
> hp->ops = ops;
> hp->outbuf_size = outbuf_size;
> - hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
>
> tty_port_init(&hp->port);
> hp->port.ops = &hvc_port_ops;
>
> INIT_WORK(&hp->tty_resize, hvc_set_winsz);
> spin_lock_init(&hp->lock);
> + spin_lock_init(&hp->cons_outbuf_lock);
> mutex_lock(&hvc_structs_mutex);
>
> /*
> @@ -964,6 +968,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
> if (i < MAX_NR_HVC_CONSOLES) {
> cons_ops[i] = ops;
> vtermnos[i] = vtermno;
> + cons_hvcs[i] = hp;
> }
>
> list_add_tail(&(hp->next), &hvc_structs);
> @@ -988,6 +993,7 @@ int hvc_remove(struct hvc_struct *hp)
> if (hp->index < MAX_NR_HVC_CONSOLES) {
> vtermnos[hp->index] = -1;
> cons_ops[hp->index] = NULL;
> + cons_hvcs[hp->index] = NULL;
> }
>
> /* Don't whack hp->irq because tty_hangup() will need to free the irq. */
> diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
> index 18d005814..2c32ab67b 100644
> --- a/drivers/tty/hvc/hvc_console.h
> +++ b/drivers/tty/hvc/hvc_console.h
> @@ -32,12 +32,21 @@
> */
> #define HVC_ALLOC_TTY_ADAPTERS 8
>
> +/*
> + * These sizes are most efficient for vio, because they are the
> + * native transfer size. We could make them selectable in the
> + * future to better deal with backends that want other buffer sizes.
> + */
> +#define N_OUTBUF 16
> +#define N_INBUF 16
> +
> +#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
> +
> struct hvc_struct {
> struct tty_port port;
> spinlock_t lock;
> int index;
> int do_wakeup;
> - char *outbuf;
> int outbuf_size;
> int n_outbuf;
> uint32_t vtermno;
> @@ -48,6 +57,16 @@ struct hvc_struct {
> struct work_struct tty_resize;
> struct list_head next;
> unsigned long flags;
> +
> + /*
> + * the buf and its lock are used in hvc console api for putting chars,
> + * and also used in hvc_poll_put_char() for putting single char.
> + */
> + spinlock_t cons_outbuf_lock;
> + char cons_outbuf[N_OUTBUF] __ALIGNED__;
> +
> + /* the buf is used for putting chars to tty */
> + char outbuf[] __ALIGNED__;
> };
>
> /* implemented by a low level driver */

2021-10-20 08:49:46

by Xianting Tian

[permalink] [raw]
Subject: Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()

hi Greg,

Could I get  your comments of this new version patches? thanks

在 2021/10/15 上午10:46, Xianting Tian 写道:
> As well known, hvc backend can register its opertions to hvc backend.
> the operations contain put_chars(), get_chars() and so on.
>
> Some hvc backend may do dma in its operations. eg, put_chars() of
> virtio-console. But in the code of hvc framework, it may pass DMA
> incapable memory to put_chars() under a specific configuration, which
> is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
> 1, c[] is on stack,
> hvc_console_print():
> char c[N_OUTBUF] __ALIGNED__;
> cons_ops[index]->put_chars(vtermnos[index], c, i);
> 2, ch is on stack,
> static void hvc_poll_put_char(,,char ch)
> {
> struct tty_struct *tty = driver->ttys[0];
> struct hvc_struct *hp = tty->driver_data;
> int n;
>
> do {
> n = hp->ops->put_chars(hp->vtermno, &ch, 1);
> } while (n <= 0);
> }
>
> Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
> is passed to virtio-console by hvc framework in above code. But I think
> the fix is aggressive, it directly uses kmemdup() to alloc new buffer
> from kmalloc area and do memcpy no matter the memory is in kmalloc area
> or not. But most importantly, it should better be fixed in the hvc
> framework, by changing it to never pass stack memory to the put_chars()
> function in the first place. Otherwise, we still face the same issue if
> a new hvc backend using dma added in the furture.
>
> In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct',
> so hp->cons_outbuf is no longer the stack memory, we can use it in above
> cases safely. We also add lock to protect cons_outbuf instead of using
> the global lock of hvc.
>
> Introduce another array(cons_hvcs[]) for hvc pointers next to the
> cons_ops[] and vtermnos[] arrays. With the array, we can easily find
> hvc's cons_outbuf and its lock.
>
> With the patch, we can revert the fix c4baad5029.
>
> Signed-off-by: Xianting Tian <[email protected]>
> Signed-off-by: Shile Zhang <[email protected]>
> ---
> drivers/tty/hvc/hvc_console.c | 36 ++++++++++++++++++++---------------
> drivers/tty/hvc/hvc_console.h | 21 +++++++++++++++++++-
> 2 files changed, 41 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
> index 5957ab728..11f2463a1 100644
> --- a/drivers/tty/hvc/hvc_console.c
> +++ b/drivers/tty/hvc/hvc_console.c
> @@ -41,16 +41,6 @@
> */
> #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */
>
> -/*
> - * These sizes are most efficient for vio, because they are the
> - * native transfer size. We could make them selectable in the
> - * future to better deal with backends that want other buffer sizes.
> - */
> -#define N_OUTBUF 16
> -#define N_INBUF 16
> -
> -#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
> -
> static struct tty_driver *hvc_driver;
> static struct task_struct *hvc_task;
>
> @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp)
> static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES];
> static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
> {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1};
> +static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES];
>
> /*
> * Console APIs, NOT TTY. These APIs are available immediately when
> @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
> static void hvc_console_print(struct console *co, const char *b,
> unsigned count)
> {
> - char c[N_OUTBUF] __ALIGNED__;
> + char *c;
> unsigned i = 0, n = 0;
> int r, donecr = 0, index = co->index;
> + unsigned long flags;
> + struct hvc_struct *hp;
>
> /* Console access attempt outside of acceptable console range. */
> if (index >= MAX_NR_HVC_CONSOLES)
> @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b,
> if (vtermnos[index] == -1)
> return;
>
> + hp = cons_hvcs[index];
> + if (!hp)
> + return;
> +
> + c = hp->cons_outbuf;
> +
> + spin_lock_irqsave(&hp->cons_outbuf_lock, flags);
> while (count > 0 || i > 0) {
> if (count > 0 && i < sizeof(c)) {
> if (b[n] == '\n' && !donecr) {
> @@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b,
> }
> }
> }
> + spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags);
> hvc_console_flush(cons_ops[index], vtermnos[index]);
> }
>
> @@ -878,9 +879,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch)
> struct tty_struct *tty = driver->ttys[0];
> struct hvc_struct *hp = tty->driver_data;
> int n;
> + unsigned long flags;
>
> do {
> - n = hp->ops->put_chars(hp->vtermno, &ch, 1);
> + spin_lock_irqsave(&hp->cons_outbuf_lock, flags);
> + hp->cons_outbuf[0] = ch;
> + n = hp->ops->put_chars(hp->vtermno, &hp->cons_outbuf[0], 1);
> + spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags);
> } while (n <= 0);
> }
> #endif
> @@ -922,8 +927,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
> return ERR_PTR(err);
> }
>
> - hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size,
> - GFP_KERNEL);
> + hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL);
> if (!hp)
> return ERR_PTR(-ENOMEM);
>
> @@ -931,13 +935,13 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
> hp->data = data;
> hp->ops = ops;
> hp->outbuf_size = outbuf_size;
> - hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
>
> tty_port_init(&hp->port);
> hp->port.ops = &hvc_port_ops;
>
> INIT_WORK(&hp->tty_resize, hvc_set_winsz);
> spin_lock_init(&hp->lock);
> + spin_lock_init(&hp->cons_outbuf_lock);
> mutex_lock(&hvc_structs_mutex);
>
> /*
> @@ -964,6 +968,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
> if (i < MAX_NR_HVC_CONSOLES) {
> cons_ops[i] = ops;
> vtermnos[i] = vtermno;
> + cons_hvcs[i] = hp;
> }
>
> list_add_tail(&(hp->next), &hvc_structs);
> @@ -988,6 +993,7 @@ int hvc_remove(struct hvc_struct *hp)
> if (hp->index < MAX_NR_HVC_CONSOLES) {
> vtermnos[hp->index] = -1;
> cons_ops[hp->index] = NULL;
> + cons_hvcs[hp->index] = NULL;
> }
>
> /* Don't whack hp->irq because tty_hangup() will need to free the irq. */
> diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
> index 18d005814..2c32ab67b 100644
> --- a/drivers/tty/hvc/hvc_console.h
> +++ b/drivers/tty/hvc/hvc_console.h
> @@ -32,12 +32,21 @@
> */
> #define HVC_ALLOC_TTY_ADAPTERS 8
>
> +/*
> + * These sizes are most efficient for vio, because they are the
> + * native transfer size. We could make them selectable in the
> + * future to better deal with backends that want other buffer sizes.
> + */
> +#define N_OUTBUF 16
> +#define N_INBUF 16
> +
> +#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
> +
> struct hvc_struct {
> struct tty_port port;
> spinlock_t lock;
> int index;
> int do_wakeup;
> - char *outbuf;
> int outbuf_size;
> int n_outbuf;
> uint32_t vtermno;
> @@ -48,6 +57,16 @@ struct hvc_struct {
> struct work_struct tty_resize;
> struct list_head next;
> unsigned long flags;
> +
> + /*
> + * the buf and its lock are used in hvc console api for putting chars,
> + * and also used in hvc_poll_put_char() for putting single char.
> + */
> + spinlock_t cons_outbuf_lock;
> + char cons_outbuf[N_OUTBUF] __ALIGNED__;
> +
> + /* the buf is used for putting chars to tty */
> + char outbuf[] __ALIGNED__;
> };
>
> /* implemented by a low level driver */

2021-10-20 08:58:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()

A: http://en.wikipedia.org/wiki/Top_post
Q: Were do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Wed, Oct 20, 2021 at 04:47:23PM +0800, Xianting Tian wrote:
> hi Greg,
>
> Could I get  your comments of this new version patches? thanks

It has been less than 5 days. Please relax, and only worry after 2
weeks have gone by. We have lots of patches to review. To help
maintainers out, why don't you review other patches on the mailing lists
as well while you wait?

thanks,

greg k-h

2021-10-21 08:37:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v11 0/3] make hvc pass dma capable memory to its backend

On Fri, Oct 15, 2021 at 10:46:55AM +0800, Xianting Tian wrote:
> Dear all,
>
> This patch series make hvc framework pass DMA capable memory to
> put_chars() of hvc backend(eg, virtio-console), and revert commit
> c4baad5029 ("virtio-console: avoid DMA from stack”)

Thanks for sticking with this, looks much better now, all now queued up.

greg k-h

2021-10-21 08:58:44

by Xianting Tian

[permalink] [raw]
Subject: Re: [PATCH v11 0/3] make hvc pass dma capable memory to its backend

I am very glad to get this reply:)

Thank you and other experts' kindly help and guide, which improved me a lot.

在 2021/10/21 下午4:35, Greg KH 写道:
> On Fri, Oct 15, 2021 at 10:46:55AM +0800, Xianting Tian wrote:
>> Dear all,
>>
>> This patch series make hvc framework pass DMA capable memory to
>> put_chars() of hvc backend(eg, virtio-console), and revert commit
>> c4baad5029 ("virtio-console: avoid DMA from stack”)
> Thanks for sticking with this, looks much better now, all now queued up.
>
> greg k-h

2021-10-26 07:17:01

by Xianting Tian

[permalink] [raw]
Subject: Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()

在 2021/10/26 下午1:10, Jiri Slaby 写道:
> On 15. 10. 21, 4:46, Xianting Tian wrote:
>> @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
>>   static void hvc_console_print(struct console *co, const char *b,
>>                     unsigned count)
>>   {
>> -    char c[N_OUTBUF] __ALIGNED__;
>> +    char *c;
>>       unsigned i = 0, n = 0;
>>       int r, donecr = 0, index = co->index;
>> +    unsigned long flags;
>> +    struct hvc_struct *hp;
>>         /* Console access attempt outside of acceptable console
>> range. */
>>       if (index >= MAX_NR_HVC_CONSOLES)
>> @@ -163,6 +156,13 @@ static void hvc_console_print(struct console
>> *co, const char *b,
>>       if (vtermnos[index] == -1)
>>           return;
>>   +    hp = cons_hvcs[index];
>> +    if (!hp)
>> +        return;
>
> You effectively make the console unusable until someone calls
> hvc_alloc() for this device, correct? This doesn't look right. Neither
> you describe this change of behaviour in the commit log.

I mentioned such info in the commit log:
'Introduce another array(cons_hvcs[]) for hvc pointers next to the
cons_ops[] and vtermnos[] arrays. With the array, we can easily find
hvc's cons_outbuf and its lock.'

After you pointed it out, I just found what you said make sense, I checked the code hvc_console_print() can support print before hvc_alloc() is called when someone use hvc_instantiate() for an early console discovery method.
I send a patch to fix the issue? or these serial pathches reverted fisrtly then I resend new version patches? thanks


>
> regards,

2021-10-26 07:18:17

by Xianting Tian

[permalink] [raw]
Subject: Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()


在 2021/10/26 下午2:10, Greg KH 写道:
> On Tue, Oct 26, 2021 at 02:02:21PM +0800, Xianting Tian wrote:
>> 在 2021/10/26 下午1:10, Jiri Slaby 写道:
>>> On 15. 10. 21, 4:46, Xianting Tian wrote:
>>>> @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
>>>>   static void hvc_console_print(struct console *co, const char *b,
>>>>                     unsigned count)
>>>>   {
>>>> -    char c[N_OUTBUF] __ALIGNED__;
>>>> +    char *c;
>>>>       unsigned i = 0, n = 0;
>>>>       int r, donecr = 0, index = co->index;
>>>> +    unsigned long flags;
>>>> +    struct hvc_struct *hp;
>>>>         /* Console access attempt outside of acceptable console
>>>> range. */
>>>>       if (index >= MAX_NR_HVC_CONSOLES)
>>>> @@ -163,6 +156,13 @@ static void hvc_console_print(struct console
>>>> *co, const char *b,
>>>>       if (vtermnos[index] == -1)
>>>>           return;
>>>>   +    hp = cons_hvcs[index];
>>>> +    if (!hp)
>>>> +        return;
>>> You effectively make the console unusable until someone calls
>>> hvc_alloc() for this device, correct? This doesn't look right. Neither
>>> you describe this change of behaviour in the commit log.
>> I mentioned such info in the commit log:
>> 'Introduce another array(cons_hvcs[]) for hvc pointers next to the
>> cons_ops[] and vtermnos[] arrays. With the array, we can easily find
>> hvc's cons_outbuf and its lock.'
>>
>> After you pointed it out, I just found what you said make sense, I checked the code hvc_console_print() can support print before hvc_alloc() is called when someone use hvc_instantiate() for an early console discovery method.
>> I send a patch to fix the issue? or these serial pathches reverted fisrtly then I resend new version patches? thanks
> Let me revert these now and you can send an updated version.
OK, thanks.
>
> thanks,
>
> greg k-h

2021-10-26 08:47:33

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()

On 15. 10. 21, 4:46, Xianting Tian wrote:
> @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
> static void hvc_console_print(struct console *co, const char *b,
> unsigned count)
> {
> - char c[N_OUTBUF] __ALIGNED__;
> + char *c;
> unsigned i = 0, n = 0;
> int r, donecr = 0, index = co->index;
> + unsigned long flags;
> + struct hvc_struct *hp;
>
> /* Console access attempt outside of acceptable console range. */
> if (index >= MAX_NR_HVC_CONSOLES)
> @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b,
> if (vtermnos[index] == -1)
> return;
>
> + hp = cons_hvcs[index];
> + if (!hp)
> + return;

You effectively make the console unusable until someone calls
hvc_alloc() for this device, correct? This doesn't look right. Neither
you describe this change of behaviour in the commit log.

regards,
--
js
suse labs

2021-10-26 08:56:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()

On Tue, Oct 26, 2021 at 02:11:51PM +0800, Xianting Tian wrote:
>
> 在 2021/10/26 下午2:10, Greg KH 写道:
> > On Tue, Oct 26, 2021 at 02:02:21PM +0800, Xianting Tian wrote:
> > > 在 2021/10/26 下午1:10, Jiri Slaby 写道:
> > > > On 15. 10. 21, 4:46, Xianting Tian wrote:
> > > > > @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
> > > > >   static void hvc_console_print(struct console *co, const char *b,
> > > > >                     unsigned count)
> > > > >   {
> > > > > -    char c[N_OUTBUF] __ALIGNED__;
> > > > > +    char *c;
> > > > >       unsigned i = 0, n = 0;
> > > > >       int r, donecr = 0, index = co->index;
> > > > > +    unsigned long flags;
> > > > > +    struct hvc_struct *hp;
> > > > >         /* Console access attempt outside of acceptable console
> > > > > range. */
> > > > >       if (index >= MAX_NR_HVC_CONSOLES)
> > > > > @@ -163,6 +156,13 @@ static void hvc_console_print(struct console
> > > > > *co, const char *b,
> > > > >       if (vtermnos[index] == -1)
> > > > >           return;
> > > > >   +    hp = cons_hvcs[index];
> > > > > +    if (!hp)
> > > > > +        return;
> > > > You effectively make the console unusable until someone calls
> > > > hvc_alloc() for this device, correct? This doesn't look right. Neither
> > > > you describe this change of behaviour in the commit log.
> > > I mentioned such info in the commit log:
> > > 'Introduce another array(cons_hvcs[]) for hvc pointers next to the
> > > cons_ops[] and vtermnos[] arrays. With the array, we can easily find
> > > hvc's cons_outbuf and its lock.'
> > >
> > > After you pointed it out, I just found what you said make sense, I checked the code hvc_console_print() can support print before hvc_alloc() is called when someone use hvc_instantiate() for an early console discovery method.
> > > I send a patch to fix the issue? or these serial pathches reverted fisrtly then I resend new version patches? thanks
> > Let me revert these now and you can send an updated version.
> OK, thanks.

I have now reverted patches 2/3 and 3/3 in this series from my tree.
The first patch was just fine.

thanks,

greg k-h

2021-10-26 09:20:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()

On Tue, Oct 26, 2021 at 02:02:21PM +0800, Xianting Tian wrote:
> 在 2021/10/26 下午1:10, Jiri Slaby 写道:
> > On 15. 10. 21, 4:46, Xianting Tian wrote:
> > > @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
> > >   static void hvc_console_print(struct console *co, const char *b,
> > >                     unsigned count)
> > >   {
> > > -    char c[N_OUTBUF] __ALIGNED__;
> > > +    char *c;
> > >       unsigned i = 0, n = 0;
> > >       int r, donecr = 0, index = co->index;
> > > +    unsigned long flags;
> > > +    struct hvc_struct *hp;
> > >         /* Console access attempt outside of acceptable console
> > > range. */
> > >       if (index >= MAX_NR_HVC_CONSOLES)
> > > @@ -163,6 +156,13 @@ static void hvc_console_print(struct console
> > > *co, const char *b,
> > >       if (vtermnos[index] == -1)
> > >           return;
> > >   +    hp = cons_hvcs[index];
> > > +    if (!hp)
> > > +        return;
> >
> > You effectively make the console unusable until someone calls
> > hvc_alloc() for this device, correct? This doesn't look right. Neither
> > you describe this change of behaviour in the commit log.
>
> I mentioned such info in the commit log:
> 'Introduce another array(cons_hvcs[]) for hvc pointers next to the
> cons_ops[] and vtermnos[] arrays. With the array, we can easily find
> hvc's cons_outbuf and its lock.'
>
> After you pointed it out, I just found what you said make sense, I checked the code hvc_console_print() can support print before hvc_alloc() is called when someone use hvc_instantiate() for an early console discovery method.
> I send a patch to fix the issue? or these serial pathches reverted fisrtly then I resend new version patches? thanks

Let me revert these now and you can send an updated version.

thanks,

greg k-h