Dear all,
This patch series make hvc framework pass DMA capable memory to
put_chars() of hvc backend(eg, virtio-console), and revert commit
c4baad5029 ("virtio-console: avoid DMA from stack”)
V1
virtio-console: avoid DMA from vmalloc area
https://lkml.org/lkml/2021/7/27/494
For v1 patch, Arnd Bergmann suggests to fix the issue in the first
place:
Make hvc pass DMA capable memory to put_chars()
The fix suggestion is included in v2.
V2
[PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/1/8
[PATCH 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/1/9
For v2 patch, Arnd Bergmann suggests to make new buf part of the
hvc_struct structure, and fix the compile issue.
The fix suggestion is included in v3.
V3
[PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/3/1347
[PATCH v3 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/3/1348
For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of
hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to
calculate the size of hvc_struct. The fix suggestion is included in
v4.
V4
[PATCH v4 0/2] make hvc pass dma capable memory to its backend
https://lkml.org/lkml/2021/8/5/1350
[PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/5/1351
[PATCH v4 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/5/1352
For v4 patch, Arnd Bergmann suggests to introduce another
array(cons_outbuf[]) for the buffer pointers next to the cons_ops[]
and vtermnos[] arrays. This fix included in this v5 patch.
V5
Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment,
use 'sizeof(long)' as dma alignment is wrong. fix it in v6.
V6
It contains coding error, fix it in v7 and it worked normally
according to test result.
V7
Greg KH suggests to add test and code review developer,
Jiri Slaby suggests to use lockless buffer and fix dma alignment
in separate patch.
fix above things in v8.
V8
This contains coding error when switch to use new buffer. fix it in v9.
V9
It didn't make things much clearer, it needs add more comments for new added buf.
Add use lock to protect new added buffer. fix in v10.
V10
Remove 'char outchar' and its lock from hvc_struct, adjust hvc_struct and use
pahole to display the struct layout.
fix it in v11.
V11
Fix early console print issue, which is broken by v11, in v12.
********TEST STEPS*********
1, config guest console=hvc0
2, start guest
3, login guest
Welcome to Buildroot
buildroot login: root
#
# cat /proc/cmdline
console=hvc0 root=/dev/vda rw init=/sbin/init
#
drivers/tty/hvc/hvc_console.c | 48 ++++++++++++++++++++++++-----------
drivers/tty/hvc/hvc_console.h | 30 +++++++++++++++++++++-
drivers/char/virtio_console.c | 12 ++----------
3 file changed
This revert commit c4baad5029 ("virtio-console: avoid DMA from stack")
hvc framework will never pass stack memory to the put_chars() function,
So the calling of kmemdup() is unnecessary, we can remove it.
Signed-off-by: Xianting Tian <[email protected]>
Reviewed-by: Shile Zhang <[email protected]>
---
drivers/char/virtio_console.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 7eaf303a7..4ed3ffb1d 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count)
{
struct port *port;
struct scatterlist sg[1];
- void *data;
- int ret;
if (unlikely(early_put_chars))
return early_put_chars(vtermno, buf, count);
@@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count)
if (!port)
return -EPIPE;
- data = kmemdup(buf, count, GFP_ATOMIC);
- if (!data)
- return -ENOMEM;
-
- sg_init_one(sg, data, count);
- ret = __send_to_port(port, sg, 1, count, data, false);
- kfree(data);
- return ret;
+ sg_init_one(sg, buf, count);
+ return __send_to_port(port, sg, 1, count, (void *)buf, false);
}
/*
--
2.17.1
As well known, hvc backend can register its opertions to hvc backend.
the operations contain put_chars(), get_chars() and so on.
Some hvc backend may do dma in its operations. eg, put_chars() of
virtio-console. But in the code of hvc framework, it may pass DMA
incapable memory to put_chars() under a specific configuration, which
is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
1, c[] is on stack,
hvc_console_print():
char c[N_OUTBUF] __ALIGNED__;
cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
static void hvc_poll_put_char(,,char ch)
{
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;
do {
n = hp->ops->put_chars(hp->vtermno, &ch, 1);
} while (n <= 0);
}
Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
is passed to virtio-console by hvc framework in above code. But I think
the fix is aggressive, it directly uses kmemdup() to alloc new buffer
from kmalloc area and do memcpy no matter the memory is in kmalloc area
or not. But most importantly, it should better be fixed in the hvc
framework, by changing it to never pass stack memory to the put_chars()
function in the first place. Otherwise, we still face the same issue if
a new hvc backend using dma added in the furture.
In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct',
so hp->cons_outbuf is no longer the stack memory, we can use it in above
cases safely. We also add lock to protect cons_outbuf instead of using
the global lock of hvc.
Introduce array cons_hvcs[] for hvc pointers next to the cons_ops[] and
vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf
and its lock.
Introduce array cons_early_outbuf[] to ensure the mechanism of early
console still work normally.
With the patch, we can revert the fix c4baad5029.
Signed-off-by: Xianting Tian <[email protected]>
Reviewed-by: Shile Zhang <[email protected]>
---
drivers/tty/hvc/hvc_console.c | 55 +++++++++++++++++++++++++----------
drivers/tty/hvc/hvc_console.h | 30 ++++++++++++++++++-
2 files changed, 69 insertions(+), 16 deletions(-)
diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5957ab728..b9521f0f0 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -41,16 +41,6 @@
*/
#define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */
-/*
- * These sizes are most efficient for vio, because they are the
- * native transfer size. We could make them selectable in the
- * future to better deal with backends that want other buffer sizes.
- */
-#define N_OUTBUF 16
-#define N_INBUF 16
-
-#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
-
static struct tty_driver *hvc_driver;
static struct task_struct *hvc_task;
@@ -142,6 +132,8 @@ static int hvc_flush(struct hvc_struct *hp)
static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES];
static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
{[0 ... MAX_NR_HVC_CONSOLES - 1] = -1};
+static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES];
+static struct hvc_early_outbuf *cons_early_outbufs[MAX_NR_HVC_CONSOLES];
/*
* Console APIs, NOT TTY. These APIs are available immediately when
@@ -151,9 +143,12 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
static void hvc_console_print(struct console *co, const char *b,
unsigned count)
{
- char c[N_OUTBUF] __ALIGNED__;
+ char *c;
unsigned i = 0, n = 0;
int r, donecr = 0, index = co->index;
+ unsigned long flags;
+ struct hvc_struct *hp;
+ spinlock_t *lock;
/* Console access attempt outside of acceptable console range. */
if (index >= MAX_NR_HVC_CONSOLES)
@@ -163,6 +158,27 @@ static void hvc_console_print(struct console *co, const char *b,
if (vtermnos[index] == -1)
return;
+ hp = cons_hvcs[index];
+ if (hp) {
+ c = hp->cons_outbuf;
+ lock = &hp->cons_outbuf_lock;
+
+ /*
+ * To make free early console outbuf locklessly,
+ * free it here when we start to use hp's outbuf.
+ * The actual free action is executed only once.
+ */
+ if (cons_early_outbufs[index]) {
+ kfree(cons_early_outbufs[index]);
+ cons_early_outbufs[index] = NULL;
+ }
+ } else {
+ /* For early console print */
+ c = cons_early_outbufs[index]->outbuf;
+ lock = &cons_early_outbufs[index]->lock;
+ }
+
+ spin_lock_irqsave(lock, flags);
while (count > 0 || i > 0) {
if (count > 0 && i < sizeof(c)) {
if (b[n] == '\n' && !donecr) {
@@ -191,6 +207,7 @@ static void hvc_console_print(struct console *co, const char *b,
}
}
}
+ spin_unlock_irqrestore(lock, flags);
hvc_console_flush(cons_ops[index], vtermnos[index]);
}
@@ -299,6 +316,9 @@ int hvc_instantiate(uint32_t vtermno, int index, const struct hv_ops *ops)
return -1;
}
+ cons_early_outbufs[index] = kzalloc(sizeof(struct hvc_early_outbuf),
+ GFP_KERNEL);
+ spin_lock_init(&cons_early_outbufs[index]->lock);
vtermnos[index] = vtermno;
cons_ops[index] = ops;
@@ -878,9 +898,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch)
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;
+ unsigned long flags;
do {
- n = hp->ops->put_chars(hp->vtermno, &ch, 1);
+ spin_lock_irqsave(&hp->cons_outbuf_lock, flags);
+ hp->cons_outbuf[0] = ch;
+ n = hp->ops->put_chars(hp->vtermno, &hp->cons_outbuf[0], 1);
+ spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags);
} while (n <= 0);
}
#endif
@@ -922,8 +946,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
return ERR_PTR(err);
}
- hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size,
- GFP_KERNEL);
+ hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL);
if (!hp)
return ERR_PTR(-ENOMEM);
@@ -931,13 +954,13 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
hp->data = data;
hp->ops = ops;
hp->outbuf_size = outbuf_size;
- hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
tty_port_init(&hp->port);
hp->port.ops = &hvc_port_ops;
INIT_WORK(&hp->tty_resize, hvc_set_winsz);
spin_lock_init(&hp->lock);
+ spin_lock_init(&hp->cons_outbuf_lock);
mutex_lock(&hvc_structs_mutex);
/*
@@ -964,6 +987,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
if (i < MAX_NR_HVC_CONSOLES) {
cons_ops[i] = ops;
vtermnos[i] = vtermno;
+ cons_hvcs[i] = hp;
}
list_add_tail(&(hp->next), &hvc_structs);
@@ -988,6 +1012,7 @@ int hvc_remove(struct hvc_struct *hp)
if (hp->index < MAX_NR_HVC_CONSOLES) {
vtermnos[hp->index] = -1;
cons_ops[hp->index] = NULL;
+ cons_hvcs[hp->index] = NULL;
}
/* Don't whack hp->irq because tty_hangup() will need to free the irq. */
diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
index 18d005814..3a9fe9275 100644
--- a/drivers/tty/hvc/hvc_console.h
+++ b/drivers/tty/hvc/hvc_console.h
@@ -32,12 +32,30 @@
*/
#define HVC_ALLOC_TTY_ADAPTERS 8
+/*
+ * These sizes are most efficient for vio, because they are the
+ * native transfer size. We could make them selectable in the
+ * future to better deal with backends that want other buffer sizes.
+ */
+#define N_OUTBUF 16
+#define N_INBUF 16
+
+#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
+
+/*
+ * This is for early console print when hvc backend calls
+ * hvc_instantiate() to create an early console.
+ */
+struct hvc_early_outbuf {
+ spinlock_t lock;
+ char outbuf[N_OUTBUF] __ALIGNED__;
+};
+
struct hvc_struct {
struct tty_port port;
spinlock_t lock;
int index;
int do_wakeup;
- char *outbuf;
int outbuf_size;
int n_outbuf;
uint32_t vtermno;
@@ -48,6 +66,16 @@ struct hvc_struct {
struct work_struct tty_resize;
struct list_head next;
unsigned long flags;
+
+ /*
+ * the buf and its lock are used in hvc console api for putting chars,
+ * and also used in hvc_poll_put_char() for putting single char.
+ */
+ spinlock_t cons_outbuf_lock;
+ char cons_outbuf[N_OUTBUF] __ALIGNED__;
+
+ /* the buf is used for putting chars to tty */
+ char outbuf[] __ALIGNED__;
};
/* implemented by a low level driver */
--
2.17.1
On 28. 10. 21, 17:09, Xianting Tian wrote:
> As well known, hvc backend can register its opertions to hvc backend.
> the operations contain put_chars(), get_chars() and so on.
>
> Some hvc backend may do dma in its operations. eg, put_chars() of
> virtio-console. But in the code of hvc framework, it may pass DMA
> incapable memory to put_chars() under a specific configuration, which
> is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
> 1, c[] is on stack,
> hvc_console_print():
> char c[N_OUTBUF] __ALIGNED__;
> cons_ops[index]->put_chars(vtermnos[index], c, i);
> 2, ch is on stack,
> static void hvc_poll_put_char(,,char ch)
> {
> struct tty_struct *tty = driver->ttys[0];
> struct hvc_struct *hp = tty->driver_data;
> int n;
>
> do {
> n = hp->ops->put_chars(hp->vtermno, &ch, 1);
> } while (n <= 0);
> }
>
> Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
> is passed to virtio-console by hvc framework in above code. But I think
> the fix is aggressive, it directly uses kmemdup() to alloc new buffer
> from kmalloc area and do memcpy no matter the memory is in kmalloc area
> or not. But most importantly, it should better be fixed in the hvc
> framework, by changing it to never pass stack memory to the put_chars()
> function in the first place. Otherwise, we still face the same issue if
> a new hvc backend using dma added in the furture.
>
> In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct',
> so hp->cons_outbuf is no longer the stack memory, we can use it in above
> cases safely. We also add lock to protect cons_outbuf instead of using
> the global lock of hvc.
>
> Introduce array cons_hvcs[] for hvc pointers next to the cons_ops[] and
> vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf
> and its lock.
Hi,
this is still overly complicated IMO. As I already noted in:
https://lore.kernel.org/all/[email protected]/
this:
=============
In fact, you need only a single char for the poll case
(hvc_poll_put_char), so hvc_struct needs to contain only c, not an array.
OTOH, you need c[N_OUTBUF] in the console case (hvc_console_print), but
not whole hvc_struct. So cons_hvcs should be an array of structs
composed of only the lock and the buffer.
=============
And I would do it even simpler now. One c[N_OUTBUF] buffer for all
consoles and a single lock.
And "char c" in struct hvc_struct.
No need for the complex logic in hvc_console_print.
> Introduce array cons_early_outbuf[] to ensure the mechanism of early
> console still work normally.
thanks,
--
js
suse labs
在 2021/11/2 下午2:33, Jiri Slaby 写道:
> On 28. 10. 21, 17:09, Xianting Tian wrote:
>> As well known, hvc backend can register its opertions to hvc backend.
>> the operations contain put_chars(), get_chars() and so on.
>>
>> Some hvc backend may do dma in its operations. eg, put_chars() of
>> virtio-console. But in the code of hvc framework, it may pass DMA
>> incapable memory to put_chars() under a specific configuration, which
>> is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
>> 1, c[] is on stack,
>> hvc_console_print():
>> char c[N_OUTBUF] __ALIGNED__;
>> cons_ops[index]->put_chars(vtermnos[index], c, i);
>> 2, ch is on stack,
>> static void hvc_poll_put_char(,,char ch)
>> {
>> struct tty_struct *tty = driver->ttys[0];
>> struct hvc_struct *hp = tty->driver_data;
>> int n;
>>
>> do {
>> n = hp->ops->put_chars(hp->vtermno, &ch, 1);
>> } while (n <= 0);
>> }
>>
>> Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
>> is passed to virtio-console by hvc framework in above code. But I think
>> the fix is aggressive, it directly uses kmemdup() to alloc new buffer
>> from kmalloc area and do memcpy no matter the memory is in kmalloc area
>> or not. But most importantly, it should better be fixed in the hvc
>> framework, by changing it to never pass stack memory to the put_chars()
>> function in the first place. Otherwise, we still face the same issue if
>> a new hvc backend using dma added in the furture.
>>
>> In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct',
>> so hp->cons_outbuf is no longer the stack memory, we can use it in above
>> cases safely. We also add lock to protect cons_outbuf instead of using
>> the global lock of hvc.
>>
>> Introduce array cons_hvcs[] for hvc pointers next to the cons_ops[] and
>> vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf
>> and its lock.
>
> Hi,
>
> this is still overly complicated IMO. As I already noted in:
> https://lore.kernel.org/all/[email protected]/
>
>
> this:
> =============
> In fact, you need only a single char for the poll case
> (hvc_poll_put_char), so hvc_struct needs to contain only c, not an array.
I ever did so in v10, and Greg suggested:
So you have a lock for a character and a different one for a longer
string? Why can they not just use the same lock? Why are 2 needed at
all, can't you just use the first character of cons_outbuf[] instead?
Surely you do not have 2 sends happening at the same time, right?
https://lkml.org/lkml/2021/10/9/214 <https://lkml.org/lkml/2021/10/9/214>
So I change to use one outbuf.
>
> OTOH, you need c[N_OUTBUF] in the console case (hvc_console_print), but
> not whole hvc_struct. So cons_hvcs should be an array of structs
> composed of only the lock and the buffer.
It is ok for me.
> =============
>
> And I would do it even simpler now. One c[N_OUTBUF] buffer for all
> consoles and a single lock.
>
> And "char c" in struct hvc_struct.
>
> No need for the complex logic in hvc_console_print.
So you will implement this? I don't need to send further patches?
>
>> Introduce array cons_early_outbuf[] to ensure the mechanism of early
>> console still work normally.
>
>
> thanks,
On 04. 11. 21, 14:06, Xianting Tian wrote:
>> OTOH, you need c[N_OUTBUF] in the console case (hvc_console_print), but
>> not whole hvc_struct. So cons_hvcs should be an array of structs
>> composed of only the lock and the buffer.
> It is ok for me.
>> =============
>>
>> And I would do it even simpler now. One c[N_OUTBUF] buffer for all
>> consoles and a single lock.
>>
>> And "char c" in struct hvc_struct.
>>
>> No need for the complex logic in hvc_console_print.
>
> So you will implement this?
No, there is a slight difference between "I would" and "I will" :). I
don't have anything to test this hvc change on…
thanks,
--
js
suse labs