2009-01-28 18:04:57

by Ira W. Snyder

[permalink] [raw]
Subject: [PATCH] firmware: speed up request_firmware()

Trying to load a large file with request_firmware() is currently very slow,
due to calling vmalloc() for each write. This patch makes the allocations
double in size each time.

Without this patch, loading a 12MB firmware image (FPGA bitfiles) called
vmalloc() approximately 3000 times. With the patch, vmalloc() is only
called 13 times, however, some memory is wasted until firmware_release() is
called. Usually, this happens very quickly, so it shouldn't be a problem.

Without this patch, it took about 2 1/2 minutes to load the 12MB firmware
image. With the patch, it now takes about 1/2 second.

Signed-off-by: Ira W. Snyder <[email protected]>
---
drivers/base/firmware_class.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index b7e5710..a947a88 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -207,7 +207,12 @@ fw_realloc_buffer(struct firmware_priv *fw_priv, int min_size)
if (min_size <= fw_priv->alloc_size)
return 0;

- new_size = ALIGN(min_size, PAGE_SIZE);
+ /* Allocate double the current size. If that is not enough, allocate
+ * the minimum amount needed instead (aligned to PAGE_SIZE). */
+ new_size = fw_priv->alloc_size * 2;
+ if (new_size < min_size)
+ new_size = ALIGN(min_size, PAGE_SIZE);
+
new_data = vmalloc(new_size);
if (!new_data) {
printk(KERN_ERR "%s: unable to alloc buffer\n", __func__);
--
1.5.4.3


2009-01-28 18:11:18

by Alan

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

> Without this patch, loading a 12MB firmware image (FPGA bitfiles) called
> vmalloc() approximately 3000 times. With the patch, vmalloc() is only
> called 13 times, however, some memory is wasted until firmware_release() is
> called. Usually, this happens very quickly, so it shouldn't be a problem.

Particularly on 32bit x86 vmalloc memory space is very precious so while
the current buffer size default may be too small doubling repeatedly is
not a good model IMHO.

2009-01-28 18:35:51

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

On Wed, 2009-01-28 at 18:11 +0000, Alan Cox wrote:
> > Without this patch, loading a 12MB firmware image (FPGA bitfiles) called
> > vmalloc() approximately 3000 times. With the patch, vmalloc() is only
> > called 13 times, however, some memory is wasted until firmware_release() is
> > called. Usually, this happens very quickly, so it shouldn't be a problem.
>
> Particularly on 32bit x86 vmalloc memory space is very precious so while
> the current buffer size default may be too small doubling repeatedly is
> not a good model IMHO.

How about a small exponent like 1.1?

--
http://selenic.com : development and support for Mercurial and Linux

2009-01-28 19:03:50

by Andrey Borzenkov

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

Ira Snyder wrote:

>
> Without this patch, loading a 12MB firmware image (FPGA bitfiles) called
> vmalloc() approximately 3000 times. With the patch, vmalloc() is only
> called 13 times, however, some memory is wasted until firmware_release()
> is called. Usually, this happens very quickly, so it shouldn't be a
> problem.
>

Some drivers cache firmware in memory. Doubing the amount of needed memory
definitely would not be the best idea. Check drivers/net/wireless for
examples.

2009-01-28 19:45:39

by Alan

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

> Some drivers cache firmware in memory. Doubing the amount of needed memory
> definitely would not be the best idea. Check drivers/net/wireless for
> examples.

A lot of drivers could perfectly happily exist with a simple iterator
helper and being returned sg lists of pages. It seems that for big
firmwares at least there is a root cause which is deeper than how you
grow your vmalloc buffer.

Alan

2009-01-28 20:57:40

by Ira W. Snyder

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

On Wed, Jan 28, 2009 at 07:45:34PM +0000, Alan Cox wrote:
> > Some drivers cache firmware in memory. Doubing the amount of needed memory
> > definitely would not be the best idea. Check drivers/net/wireless for
> > examples.
>
> A lot of drivers could perfectly happily exist with a simple iterator
> helper and being returned sg lists of pages. It seems that for big
> firmwares at least there is a root cause which is deeper than how you
> grow your vmalloc buffer.
>

An sg list of pages would be perfect for my usage. I didn't want to
change an existing kernel interface, so I just made the easiest change
that worked for me.

Another thing that could be done is trimming the vmalloc() down to the
exact size needed after the firmware has finished loading. That would
still waste memory until the copy from userspace is finished, though.

I'd be happy to test patches anyone comes up with.

Thanks,
Ira

2009-04-03 06:46:21

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

Ira Snyder wrote:
> I didn't want to change an existing kernel interface, so I just made
> the easiest change that worked for me.

Well, userspace does know the actual size of the image, so I see no
reason why it shouldn't be able to tell the kernel about it beforehand.

> I'd be happy to test patches anyone comes up with.

=====

This adds a data_size attribute to the firmware loading device so that
userspace can tell us about the firmware image size. This allows us to
preallocate the buffer with the final size, thus avoiding reallocating
the buffer for every page of data as it comes in.

Signed-off-by: Clemens Ladisch <[email protected]>

--- linux-2.6.orig/Documentation/firmware_class/README
+++ linux-2.6/Documentation/firmware_class/README
@@ -21,7 +21,7 @@
kernel(driver): calls request_firmware(&fw_entry, $FIRMWARE, device)

userspace:
- - /sys/class/firmware/xxx/{loading,data} appear.
+ - /sys/class/firmware/xxx/{loading,data,data_size} appear.
- hotplug gets called with a firmware identifier in $FIRMWARE
and the usual hotplug environment.
- hotplug: echo 1 > /sys/class/firmware/xxx/loading
@@ -29,11 +29,13 @@
kernel: Discard any previous partial load.

userspace:
+ - hotplug: echo ... > /sys/class/firmware/xxx/data_size
- hotplug: cat appropriate_firmware_image > \
/sys/class/firmware/xxx/data

- kernel: grows a buffer in PAGE_SIZE increments to hold the image as it
- comes in.
+ kernel: Copies the firmware image into a buffer of the specified size.
+ If the image is larger, the buffer automatically grows in
+ PAGE_SIZE increments.

userspace:
- hotplug: echo 0 > /sys/class/firmware/xxx/loading
@@ -60,6 +62,9 @@

HOTPLUG_FW_DIR=/usr/lib/hotplug/firmware/

+ if [ -e /sys/$DEVPATH/data_size ]; then
+ stat -c %s $HOTPLUG_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data_size
+ fi
echo 1 > /sys/$DEVPATH/loading
cat $HOTPLUG_FW_DIR/$FIRMWARE > /sysfs/$DEVPATH/data
echo 0 > /sys/$DEVPATH/loading
@@ -73,6 +78,9 @@
- firmware_data_read() and firmware_loading_show() are just provided
for testing and completeness, they are not called in normal use.

+ - /sys/class/firmware/xxx/data_size is optional for compatibility with
+ older kernels.
+
- There is also /sys/class/firmware/timeout which holds a timeout in
seconds for the whole load operation.

--- linux-2.6.orig/Documentation/firmware_class/hotplug-script
+++ linux-2.6/Documentation/firmware_class/hotplug-script
@@ -6,6 +6,9 @@

HOTPLUG_FW_DIR=/usr/lib/hotplug/firmware/

+if [ -e /sys/$DEVPATH/data_size ]; then
+ stat -c %s $HOTPLUG_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data_size
+fi
echo 1 > /sys/$DEVPATH/loading
cat $HOTPLUG_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data
echo 0 > /sys/$DEVPATH/loading
--- linux-2.6.orig/drivers/base/firmware_class.c
+++ linux-2.6/drivers/base/firmware_class.c
@@ -46,6 +46,7 @@ struct firmware_priv {
struct firmware *fw;
unsigned long status;
int alloc_size;
+ int size_hint;
struct timer_list timeout;
};

@@ -114,6 +115,32 @@ static struct class firmware_class = {
.dev_release = fw_dev_release,
};

+static ssize_t firmware_data_size_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct firmware_priv *fw_priv = dev_get_drvdata(dev);
+
+ return sprintf(buf, "%d\n", fw_priv->size_hint);
+}
+
+static ssize_t firmware_data_size_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct firmware_priv *fw_priv = dev_get_drvdata(dev);
+ long value;
+ int err;
+
+ err = strict_strtol(buf, 10, &value);
+ if (err)
+ return err;
+ fw_priv->size_hint = value;
+ return count;
+}
+
+static DEVICE_ATTR(data_size, 0644,
+ firmware_data_size_show, firmware_data_size_store);
+
static ssize_t firmware_loading_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -207,6 +234,7 @@ fw_realloc_buffer(struct firmware_priv *
if (min_size <= fw_priv->alloc_size)
return 0;

+ min_size = max(min_size, fw_priv->size_hint);
new_size = ALIGN(min_size, PAGE_SIZE);
new_data = vmalloc(new_size);
if (!new_data) {
@@ -359,6 +387,12 @@ static int fw_setup_device(struct firmwa
goto error_unreg;
}

+ retval = device_create_file(f_dev, &dev_attr_data_size);
+ if (retval) {
+ dev_err(device, "%s: device_create_file failed\n", __func__);
+ goto error_unreg;
+ }
+
retval = device_create_file(f_dev, &dev_attr_loading);
if (retval) {
dev_err(device, "%s: device_create_file failed\n", __func__);

2009-04-03 17:25:48

by Andrey Borzenkov

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

On Friday 03 of April 2009 10:46:11 Clemens Ladisch wrote:
> Ira Snyder wrote:
> > I didn't want to change an existing kernel interface, so I just
> > made the easiest change that worked for me.
>
> Well, userspace does know the actual size of the image, so I see no
> reason why it shouldn't be able to tell the kernel about it
> beforehand.
>

Right; but it will need some time for user space to catch up.

> > I'd be happy to test patches anyone comes up with.
>
> =====
>
> This adds a data_size attribute to the firmware loading device so
> that userspace can tell us about the firmware image size. This
> allows us to preallocate the buffer with the final size, thus
> avoiding reallocating the buffer for every page of data as it comes
> in.
>
> Signed-off-by: Clemens Ladisch <[email protected]>
>
> --- linux-2.6.orig/Documentation/firmware_class/README
> +++ linux-2.6/Documentation/firmware_class/README
> @@ -21,7 +21,7 @@
> kernel(driver): calls request_firmware(&fw_entry, $FIRMWARE,
> device)
>
> userspace:
> - - /sys/class/firmware/xxx/{loading,data} appear.
> + - /sys/class/firmware/xxx/{loading,data,data_size} appear.
> - hotplug gets called with a firmware identifier in $FIRMWARE
> and the usual hotplug environment.
> - hotplug: echo 1 > /sys/class/firmware/xxx/loading
> @@ -29,11 +29,13 @@
> kernel: Discard any previous partial load.
>
> userspace:
> + - hotplug: echo ... > /sys/class/firmware/xxx/data_size
> - hotplug: cat appropriate_firmware_image > \
> /sys/class/firmware/xxx/data
>
> - kernel: grows a buffer in PAGE_SIZE increments to hold the image as
> it - comes in.
> + kernel: Copies the firmware image into a buffer of the specified
> size. + If the image is larger, the buffer automatically grows in +
> PAGE_SIZE increments.
>
> userspace:
> - hotplug: echo 0 > /sys/class/firmware/xxx/loading
> @@ -60,6 +62,9 @@
>
> HOTPLUG_FW_DIR=/usr/lib/hotplug/firmware/
>
> + if [ -e /sys/$DEVPATH/data_size ]; then
> + stat -c %s $HOTPLUG_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data_size
> + fi
> echo 1 > /sys/$DEVPATH/loading
> cat $HOTPLUG_FW_DIR/$FIRMWARE > /sysfs/$DEVPATH/data
> echo 0 > /sys/$DEVPATH/loading
> @@ -73,6 +78,9 @@
> - firmware_data_read() and firmware_loading_show() are just
> provided for testing and completeness, they are not called in normal
> use.
>
> + - /sys/class/firmware/xxx/data_size is optional for compatibility
> with + older kernels.
> +
> - There is also /sys/class/firmware/timeout which holds a timeout
> in seconds for the whole load operation.
>
> --- linux-2.6.orig/Documentation/firmware_class/hotplug-script
> +++ linux-2.6/Documentation/firmware_class/hotplug-script
> @@ -6,6 +6,9 @@
>
> HOTPLUG_FW_DIR=/usr/lib/hotplug/firmware/
>
> +if [ -e /sys/$DEVPATH/data_size ]; then
> + stat -c %s $HOTPLUG_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data_size
> +fi
> echo 1 > /sys/$DEVPATH/loading
> cat $HOTPLUG_FW_DIR/$FIRMWARE > /sys/$DEVPATH/data
> echo 0 > /sys/$DEVPATH/loading
> --- linux-2.6.orig/drivers/base/firmware_class.c
> +++ linux-2.6/drivers/base/firmware_class.c
> @@ -46,6 +46,7 @@ struct firmware_priv {
> struct firmware *fw;
> unsigned long status;
> int alloc_size;
> + int size_hint;

Unsigned?

> struct timer_list timeout;
> };
>
> @@ -114,6 +115,32 @@ static struct class firmware_class = {
> .dev_release = fw_dev_release,
> };
>
> +static ssize_t firmware_data_size_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct firmware_priv *fw_priv = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "%d\n", fw_priv->size_hint);
> +}
> +

Why would you need it? It is no importance once firmware had been
loaded. I'd personally consider it as write-only.

> +static ssize_t firmware_data_size_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct firmware_priv *fw_priv = dev_get_drvdata(dev);
> + long value;
> + int err;
> +
> + err = strict_strtol(buf, 10, &value);
> + if (err)
> + return err;
> + fw_priv->size_hint = value;

Should not there be some protection against using silly large values?

> + return count;
> +}
> +
> +static DEVICE_ATTR(data_size, 0644,
> + firmware_data_size_show, firmware_data_size_store);
> +
> static ssize_t firmware_loading_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> @@ -207,6 +234,7 @@ fw_realloc_buffer(struct firmware_priv *
> if (min_size <= fw_priv->alloc_size)
> return 0;
>
> + min_size = max(min_size, fw_priv->size_hint);
> new_size = ALIGN(min_size, PAGE_SIZE);
> new_data = vmalloc(new_size);
> if (!new_data) {
> @@ -359,6 +387,12 @@ static int fw_setup_device(struct firmwa
> goto error_unreg;
> }
>
> + retval = device_create_file(f_dev, &dev_attr_data_size);
> + if (retval) {
> + dev_err(device, "%s: device_create_file failed\n", __func__);
> + goto error_unreg;
> + }
> +
> retval = device_create_file(f_dev, &dev_attr_loading);
> if (retval) {
> dev_err(device, "%s: device_create_file failed\n", __func__);


Attachments:
(No filename) (4.99 kB)
signature.asc (197.00 B)
This is a digitally signed message part.
Download all attachments

2009-04-06 08:48:47

by Clemens Ladisch

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

Andrey Borzenkov wrote:
> On Friday 03 of April 2009 10:46:11 Clemens Ladisch wrote:
> > Well, userspace does know the actual size of the image, so I see no
> > reason why it shouldn't be able to tell the kernel about it
> > beforehand.
>
> Right; but it will need some time for user space to catch up.

It's only an optimization.

> > @@ -46,6 +46,7 @@ struct firmware_priv {
> > struct firmware *fw;
> > unsigned long status;
> > int alloc_size;
> > + int size_hint;
>
> Unsigned?

It is as signed as alloc_size. ;-)

Yes, all these variables probably should be size_t.

> > +static ssize_t firmware_data_size_show(struct device *dev,
> > + struct device_attribute *attr, char *buf)
> > +{
> > + struct firmware_priv *fw_priv = dev_get_drvdata(dev);
> > +
> > + return sprintf(buf, "%d\n", fw_priv->size_hint);
> > +}
>
> Why would you need it?

Good question. Apparently, for the same reason why we'd need
firmware_data_read ...

> > +static ssize_t firmware_data_size_store(struct device *dev,
> > + struct device_attribute *attr,
> > + const char *buf, size_t count)
> > +{
> > + struct firmware_priv *fw_priv = dev_get_drvdata(dev);
> > + long value;
> > + int err;
> > +
> > + err = strict_strtol(buf, 10, &value);
> > + if (err)
> > + return err;
> > + fw_priv->size_hint = value;
>
> Should not there be some protection against using silly large values?

It is already possible to do "cat /dev/zero > .../data". What we'd need
is a limit not only on this variable but on the size of the firmware
image itself.


Okay, I'll do a bunch of patches to fix these warts in the firmware
loader.


Best regards,
Clemens

2009-04-09 19:09:23

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

On Wed, 2009-01-28 at 12:34 -0800, Ira Snyder wrote:
> On Wed, Jan 28, 2009 at 07:45:34PM +0000, Alan Cox wrote:
> > > Some drivers cache firmware in memory. Doubing the amount of needed memory
> > > definitely would not be the best idea. Check drivers/net/wireless for
> > > examples.
> >
> > A lot of drivers could perfectly happily exist with a simple iterator
> > helper and being returned sg lists of pages. It seems that for big
> > firmwares at least there is a root cause which is deeper than how you
> > grow your vmalloc buffer.
> >
>
> An sg list of pages would be perfect for my usage. I didn't want to
> change an existing kernel interface, so I just made the easiest change
> that worked for me.
>
> Another thing that could be done is trimming the vmalloc() down to the
> exact size needed after the firmware has finished loading. That would
> still waste memory until the copy from userspace is finished, though.

Or you just allocate individual pages one at a time as you're copying
the data in from userspace, then vmap() them once you're _done_.

We could potentially extend this to allow drivers to say "I only need an
array of pages, not contiguous VM space", but that comes later.

This approach has the benefit that we can force the virtual mapping of
the firmware image to be read-only -- one step up from making it 'const'
as we did already.

> I'd be happy to test patches anyone comes up with.

Totally untested here...

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index d3a59c6..bf151cd 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -17,7 +17,7 @@
#include <linux/bitops.h>
#include <linux/mutex.h>
#include <linux/kthread.h>
-
+#include <linux/highmem.h>
#include <linux/firmware.h>
#include "base.h"

@@ -45,7 +45,10 @@ struct firmware_priv {
struct bin_attribute attr_data;
struct firmware *fw;
unsigned long status;
- int alloc_size;
+ struct page **pages;
+ int nr_pages;
+ int page_array_size;
+ const char *vdata;
struct timer_list timeout;
};

@@ -141,6 +144,7 @@ static ssize_t firmware_loading_store(struct device *dev,
{
struct firmware_priv *fw_priv = dev_get_drvdata(dev);
int loading = simple_strtol(buf, NULL, 10);
+ int i;

switch (loading) {
case 1:
@@ -151,13 +155,30 @@ static ssize_t firmware_loading_store(struct device *dev,
}
vfree(fw_priv->fw->data);
fw_priv->fw->data = NULL;
+ for (i = 0; i < fw_priv->nr_pages; i++)
+ __free_page(fw_priv->pages[i]);
+ kfree(fw_priv->pages);
+ fw_priv->pages = NULL;
+ fw_priv->page_array_size = 0;
+ fw_priv->nr_pages = 0;
fw_priv->fw->size = 0;
- fw_priv->alloc_size = 0;
set_bit(FW_STATUS_LOADING, &fw_priv->status);
mutex_unlock(&fw_lock);
break;
case 0:
if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
+ vfree(fw_priv->fw->data);
+ fw_priv->fw->data = vmap(fw_priv->pages,
+ fw_priv->nr_pages,
+ 0, PAGE_KERNEL_RO);
+ if (!fw_priv->fw->data) {
+ dev_err(dev, "%s: vmap() failed\n", __func__);
+ goto err;
+ }
+ /* Pages will be freed by vfree() */
+ fw_priv->pages = NULL;
+ fw_priv->page_array_size = 0;
+ fw_priv->nr_pages = 0;
complete(&fw_priv->completion);
clear_bit(FW_STATUS_LOADING, &fw_priv->status);
break;
@@ -167,6 +188,7 @@ static ssize_t firmware_loading_store(struct device *dev,
dev_err(dev, "%s: unexpected value (%d)\n", __func__, loading);
/* fallthrough */
case -1:
+ err:
fw_load_abort(fw_priv);
break;
}
@@ -191,8 +213,28 @@ firmware_data_read(struct kobject *kobj, struct bin_attribute *bin_attr,
ret_count = -ENODEV;
goto out;
}
- ret_count = memory_read_from_buffer(buffer, count, &offset,
- fw->data, fw->size);
+ if (offset > fw->size)
+ return 0;
+ if (count > fw->size - offset)
+ count = fw->size - offset;
+
+ ret_count = count;
+
+ while (count) {
+ void *page_data;
+ int page_nr = offset >> PAGE_SHIFT;
+ int page_ofs = offset & (PAGE_SIZE-1);
+ int page_cnt = min(PAGE_SIZE - page_ofs, PAGE_SIZE);
+
+ page_data = kmap(fw_priv->pages[page_nr]);
+
+ memcpy(buffer, page_data + page_ofs, page_cnt);
+
+ kunmap(page_data);
+ buffer += page_cnt;
+ offset += page_cnt;
+ count -= page_cnt;
+ }
out:
mutex_unlock(&fw_lock);
return ret_count;
@@ -201,27 +243,39 @@ out:
static int
fw_realloc_buffer(struct firmware_priv *fw_priv, int min_size)
{
- u8 *new_data;
- int new_size = fw_priv->alloc_size;
+ int pages_needed = ALIGN(min_size, PAGE_SIZE) >> PAGE_SHIFT;
+
+ /* If the array of pages is too small, grow it... */
+ if (fw_priv->page_array_size < pages_needed) {
+ int new_array_size = min(pages_needed,
+ fw_priv->page_array_size * 2);
+ struct page **new_pages;
+
+ new_pages = kmalloc(new_array_size * sizeof(void *),
+ GFP_KERNEL);
+ if (!new_pages) {
+ fw_load_abort(fw_priv);
+ return -ENOMEM;
+ }
+ memcpy(new_pages, fw_priv->pages,
+ fw_priv->page_array_size * sizeof(void *));
+ memset(&new_pages[fw_priv->page_array_size], 0, sizeof(void *) *
+ (new_array_size - fw_priv->page_array_size));
+ kfree(fw_priv->pages);
+ fw_priv->pages = new_pages;
+ fw_priv->page_array_size = new_array_size;
+ }

- if (min_size <= fw_priv->alloc_size)
- return 0;
+ while (fw_priv->nr_pages < pages_needed) {
+ fw_priv->pages[fw_priv->nr_pages] =
+ alloc_page(GFP_KERNEL | __GFP_HIGHMEM);

- new_size = ALIGN(min_size, PAGE_SIZE);
- new_data = vmalloc(new_size);
- if (!new_data) {
- printk(KERN_ERR "%s: unable to alloc buffer\n", __func__);
- /* Make sure that we don't keep incomplete data */
- fw_load_abort(fw_priv);
- return -ENOMEM;
- }
- fw_priv->alloc_size = new_size;
- if (fw_priv->fw->data) {
- memcpy(new_data, fw_priv->fw->data, fw_priv->fw->size);
- vfree(fw_priv->fw->data);
+ if (!fw_priv->pages[fw_priv->nr_pages]) {
+ fw_load_abort(fw_priv);
+ return -ENOMEM;
+ }
+ fw_priv->nr_pages++;
}
- fw_priv->fw->data = new_data;
- BUG_ON(min_size > fw_priv->alloc_size);
return 0;
}

@@ -258,10 +312,25 @@ firmware_data_write(struct kobject *kobj, struct bin_attribute *bin_attr,
if (retval)
goto out;

- memcpy((u8 *)fw->data + offset, buffer, count);
-
- fw->size = max_t(size_t, offset + count, fw->size);
retval = count;
+
+ while (count) {
+ void *page_data;
+ int page_nr = offset >> PAGE_SHIFT;
+ int page_ofs = offset & (PAGE_SIZE-1);
+ int page_cnt = min(PAGE_SIZE - page_ofs, PAGE_SIZE);
+
+ page_data = kmap(fw_priv->pages[page_nr]);
+
+ memcpy(page_data + page_ofs, buffer, page_cnt);
+
+ kunmap(page_data);
+ buffer += page_cnt;
+ offset += page_cnt;
+ count -= page_cnt;
+ }
+
+ fw->size = max_t(size_t, offset, fw->size);
out:
mutex_unlock(&fw_lock);
return retval;
@@ -277,7 +346,11 @@ static struct bin_attribute firmware_attr_data_tmpl = {
static void fw_dev_release(struct device *dev)
{
struct firmware_priv *fw_priv = dev_get_drvdata(dev);
+ int i;

+ for (i = 0; i < fw_priv->nr_pages; i++)
+ __free_page(fw_priv->pages[i]);
+ kfree(fw_priv->pages);
kfree(fw_priv);
kfree(dev);


--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2009-04-10 05:04:24

by David Woodhouse

[permalink] [raw]
Subject: Re: [PATCH] firmware: speed up request_firmware()

Rather than calling vmalloc() repeatedly to grow the firmware image as
we receive data from userspace, just allocate and fill individual pages.
Then vmap() the whole lot in one go when we're done.

A quick test with a 337KiB iwlagn firmware shows the time taken for
request_firmware() going from ~32ms to ~5ms after I apply this patch.

Signed-off-by: David Woodhouse <[email protected]>

---
Various failure modes and corner cases, including firmware_data_read(),
are still untested.

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index d3a59c6..90cb7c2 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -17,7 +17,7 @@
#include <linux/bitops.h>
#include <linux/mutex.h>
#include <linux/kthread.h>
-
+#include <linux/highmem.h>
#include <linux/firmware.h>
#include "base.h"

@@ -45,7 +45,10 @@ struct firmware_priv {
struct bin_attribute attr_data;
struct firmware *fw;
unsigned long status;
- int alloc_size;
+ struct page **pages;
+ int nr_pages;
+ int page_array_size;
+ const char *vdata;
struct timer_list timeout;
};

@@ -141,6 +144,7 @@ static ssize_t firmware_loading_store(struct device *dev,
{
struct firmware_priv *fw_priv = dev_get_drvdata(dev);
int loading = simple_strtol(buf, NULL, 10);
+ int i;

switch (loading) {
case 1:
@@ -151,13 +155,30 @@ static ssize_t firmware_loading_store(struct device *dev,
}
vfree(fw_priv->fw->data);
fw_priv->fw->data = NULL;
+ for (i = 0; i < fw_priv->nr_pages; i++)
+ __free_page(fw_priv->pages[i]);
+ kfree(fw_priv->pages);
+ fw_priv->pages = NULL;
+ fw_priv->page_array_size = 0;
+ fw_priv->nr_pages = 0;
fw_priv->fw->size = 0;
- fw_priv->alloc_size = 0;
set_bit(FW_STATUS_LOADING, &fw_priv->status);
mutex_unlock(&fw_lock);
break;
case 0:
if (test_bit(FW_STATUS_LOADING, &fw_priv->status)) {
+ vfree(fw_priv->fw->data);
+ fw_priv->fw->data = vmap(fw_priv->pages,
+ fw_priv->nr_pages,
+ 0, PAGE_KERNEL_RO);
+ if (!fw_priv->fw->data) {
+ dev_err(dev, "%s: vmap() failed\n", __func__);
+ goto err;
+ }
+ /* Pages will be freed by vfree() */
+ fw_priv->pages = NULL;
+ fw_priv->page_array_size = 0;
+ fw_priv->nr_pages = 0;
complete(&fw_priv->completion);
clear_bit(FW_STATUS_LOADING, &fw_priv->status);
break;
@@ -167,6 +188,7 @@ static ssize_t firmware_loading_store(struct device *dev,
dev_err(dev, "%s: unexpected value (%d)\n", __func__, loading);
/* fallthrough */
case -1:
+ err:
fw_load_abort(fw_priv);
break;
}
@@ -191,8 +213,28 @@ firmware_data_read(struct kobject *kobj, struct bin_attribute *bin_attr,
ret_count = -ENODEV;
goto out;
}
- ret_count = memory_read_from_buffer(buffer, count, &offset,
- fw->data, fw->size);
+ if (offset > fw->size)
+ return 0;
+ if (count > fw->size - offset)
+ count = fw->size - offset;
+
+ ret_count = count;
+
+ while (count) {
+ void *page_data;
+ int page_nr = offset >> PAGE_SHIFT;
+ int page_ofs = offset & (PAGE_SIZE-1);
+ int page_cnt = min(PAGE_SIZE - page_ofs, count);
+
+ page_data = kmap(fw_priv->pages[page_nr]);
+
+ memcpy(buffer, page_data + page_ofs, page_cnt);
+
+ kunmap(page_data);
+ buffer += page_cnt;
+ offset += page_cnt;
+ count -= page_cnt;
+ }
out:
mutex_unlock(&fw_lock);
return ret_count;
@@ -201,27 +243,39 @@ out:
static int
fw_realloc_buffer(struct firmware_priv *fw_priv, int min_size)
{
- u8 *new_data;
- int new_size = fw_priv->alloc_size;
+ int pages_needed = ALIGN(min_size, PAGE_SIZE) >> PAGE_SHIFT;
+
+ /* If the array of pages is too small, grow it... */
+ if (fw_priv->page_array_size < pages_needed) {
+ int new_array_size = max(pages_needed,
+ fw_priv->page_array_size * 2);
+ struct page **new_pages;
+
+ new_pages = kmalloc(new_array_size * sizeof(void *),
+ GFP_KERNEL);
+ if (!new_pages) {
+ fw_load_abort(fw_priv);
+ return -ENOMEM;
+ }
+ memcpy(new_pages, fw_priv->pages,
+ fw_priv->page_array_size * sizeof(void *));
+ memset(&new_pages[fw_priv->page_array_size], 0, sizeof(void *) *
+ (new_array_size - fw_priv->page_array_size));
+ kfree(fw_priv->pages);
+ fw_priv->pages = new_pages;
+ fw_priv->page_array_size = new_array_size;
+ }

- if (min_size <= fw_priv->alloc_size)
- return 0;
+ while (fw_priv->nr_pages < pages_needed) {
+ fw_priv->pages[fw_priv->nr_pages] =
+ alloc_page(GFP_KERNEL | __GFP_HIGHMEM);

- new_size = ALIGN(min_size, PAGE_SIZE);
- new_data = vmalloc(new_size);
- if (!new_data) {
- printk(KERN_ERR "%s: unable to alloc buffer\n", __func__);
- /* Make sure that we don't keep incomplete data */
- fw_load_abort(fw_priv);
- return -ENOMEM;
- }
- fw_priv->alloc_size = new_size;
- if (fw_priv->fw->data) {
- memcpy(new_data, fw_priv->fw->data, fw_priv->fw->size);
- vfree(fw_priv->fw->data);
+ if (!fw_priv->pages[fw_priv->nr_pages]) {
+ fw_load_abort(fw_priv);
+ return -ENOMEM;
+ }
+ fw_priv->nr_pages++;
}
- fw_priv->fw->data = new_data;
- BUG_ON(min_size > fw_priv->alloc_size);
return 0;
}

@@ -258,10 +312,25 @@ firmware_data_write(struct kobject *kobj, struct bin_attribute *bin_attr,
if (retval)
goto out;

- memcpy((u8 *)fw->data + offset, buffer, count);
-
- fw->size = max_t(size_t, offset + count, fw->size);
retval = count;
+
+ while (count) {
+ void *page_data;
+ int page_nr = offset >> PAGE_SHIFT;
+ int page_ofs = offset & (PAGE_SIZE - 1);
+ int page_cnt = min(PAGE_SIZE - page_ofs, count);
+
+ page_data = kmap(fw_priv->pages[page_nr]);
+
+ memcpy(page_data + page_ofs, buffer, page_cnt);
+
+ kunmap(page_data);
+ buffer += page_cnt;
+ offset += page_cnt;
+ count -= page_cnt;
+ }
+
+ fw->size = max_t(size_t, offset, fw->size);
out:
mutex_unlock(&fw_lock);
return retval;
@@ -277,7 +346,11 @@ static struct bin_attribute firmware_attr_data_tmpl = {
static void fw_dev_release(struct device *dev)
{
struct firmware_priv *fw_priv = dev_get_drvdata(dev);
+ int i;

+ for (i = 0; i < fw_priv->nr_pages; i++)
+ __free_page(fw_priv->pages[i]);
+ kfree(fw_priv->pages);
kfree(fw_priv);
kfree(dev);




--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation