2013-10-03 04:04:31

by Greg Kroah-Hartman

Subject: [ 00/13] 3.0.99-stable review

This is the start of the stable review cycle for the 3.0.99 release.
There are 13 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sat Oct 5 04:03:47 UTC 2013.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.99-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 3.0.99-rc1

Eric Dumazet <[email protected]>
splice: fix racy pipe->buffers uses

Henrik Rydberg <[email protected]>
hwmon: (applesmc) Silence uninitialized warnings

Khalid Aziz <[email protected]>
mm: fix aio performance regression for database caused by THP

Henrik Rydberg <[email protected]>
hwmon: (applesmc) Check key count before proceeding

Jani Nikula <[email protected]>
drm/i915/dp: increase i2c-over-aux retry interval on AUX DEFER

Mikulas Patocka <[email protected]>
dm-snapshot: fix performance degradation due to small hash size

Mikulas Patocka <[email protected]>
dm snapshot: workaround for a false positive lockdep warning

Kurt Garloff <[email protected]>
usb/core/devio.c: Don't reject control message to endpoint with wrong direction bit

Florian Wolter <[email protected]>
xhci: Fix race between ep halt and URB cancellation

Mathias Nyman <[email protected]>
xhci: Fix oops happening after address device timeout

Malcolm Priestley <[email protected]>
staging: vt6656: [BUG] main_usb.c oops on device_close move flag earlier.

Josh Boyer <[email protected]>
x86, efi: Don't map Boot Services on i386

Masoud Sharbiani <[email protected]>
x86/reboot: Add quirk to make Dell C6100 use reboot=pci automatically


-------------

Diffstat:

Makefile | 4 +--
arch/x86/kernel/reboot.c | 16 ++++++++++
arch/x86/platform/efi/efi.c | 11 ++++---
drivers/gpu/drm/i915/intel_dp.c | 13 +++++++-
drivers/hwmon/applesmc.c | 19 ++++++++++--
drivers/md/dm-snap-persistent.c | 2 +-
drivers/md/dm-snap.c | 5 ++-
drivers/staging/vt6656/main_usb.c | 3 +-
drivers/usb/core/devio.c | 16 ++++++++++
drivers/usb/host/xhci-ring.c | 14 +++++++--
fs/splice.c | 35 ++++++++++++---------
include/linux/splice.h | 8 ++---
kernel/relay.c | 5 +--
kernel/trace/trace.c | 6 ++--
mm/swap.c | 65 ++++++++++++++++++++++++++++-----------
net/core/skbuff.c | 3 +-
16 files changed, 166 insertions(+), 59 deletions(-)


2013-10-03 04:04:40

by Greg Kroah-Hartman

Subject: [ 01/13] x86/reboot: Add quirk to make Dell C6100 use reboot=pci automatically

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Masoud Sharbiani <[email protected]>

commit 4f0acd31c31f03ba42494c8baf6c0465150e2621 upstream.

Dell PowerEdge C6100 machines fail to completely reboot about 20% of the time.

Signed-off-by: Masoud Sharbiani <[email protected]>
Signed-off-by: Vinson Lee <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Russell King <[email protected]>
Cc: Guan Xuetao <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/reboot.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -460,6 +460,22 @@ static struct dmi_system_id __initdata p
DMI_MATCH(DMI_PRODUCT_NAME, "Precision M6600"),
},
},
+ { /* Handle problems with rebooting on the Dell PowerEdge C6100. */
+ .callback = set_pci_reboot,
+ .ident = "Dell PowerEdge C6100",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "C6100"),
+ },
+ },
+ { /* Some C6100 machines were shipped with vendor being 'Dell'. */
+ .callback = set_pci_reboot,
+ .ident = "Dell PowerEdge C6100",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "C6100"),
+ },
+ },
{ }
};
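
For readers unfamiliar with DMI quirk tables, here is a minimal userspace sketch of how the two entries above would select the PCI reboot method. The names `struct quirk`, `pci_reboot_quirks` and `wants_pci_reboot` are hypothetical stand-ins, not kernel API, and the substring comparison via strstr() is an assumption about how DMI_MATCH compares strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct dmi_system_id. */
struct quirk {
    const char *ident;
    const char *vendor;   /* substring expected in the system vendor */
    const char *product;  /* substring expected in the product name */
};

static const struct quirk pci_reboot_quirks[] = {
    { "Dell PowerEdge C6100", "Dell Inc.", "C6100" },
    { "Dell PowerEdge C6100", "Dell",      "C6100" },
    { NULL, NULL, NULL }  /* terminator, like the empty { } entry */
};

/* Returns 1 if any table entry matches, i.e. reboot=pci would be chosen. */
static int wants_pci_reboot(const char *vendor, const char *product)
{
    const struct quirk *q;

    for (q = pci_reboot_quirks; q->ident; q++) {
        if (strstr(vendor, q->vendor) && strstr(product, q->product))
            return 1;
    }
    return 0;
}
```

Under that assumption, both vendor spellings mentioned in the patch comments match, while other vendors or products fall through to the default reboot method.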


2013-10-03 04:04:52

by Greg Kroah-Hartman

Subject: [ 03/13] staging: vt6656: [BUG] main_usb.c oops on device_close move flag earlier.

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Malcolm Priestley <[email protected]>

commit e3eb270fab7734427dd8171a93e4946fe28674bc upstream.

The vt6656 is prone to resetting on the usb bus.

It seems there is a race condition and wpa supplicant is
trying to open the device via iw_handlers before it's actually
closed at a stage that the buffers are being removed.

The device is no longer considered open when the
buffers are being removed. So clear the DEVICE_FLAGS_OPENED
flag before freeing the device buffers.

Signed-off-by: Malcolm Priestley <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/staging/vt6656/main_usb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/staging/vt6656/main_usb.c
+++ b/drivers/staging/vt6656/main_usb.c
@@ -1228,6 +1228,8 @@ device_release_WPADEV(pDevice);
memset(pMgmt->abyCurrBSSID, 0, 6);
pMgmt->eCurrState = WMAC_STATE_IDLE;

+ pDevice->flags &= ~DEVICE_FLAGS_OPENED;
+
device_free_tx_bufs(pDevice);
device_free_rx_bufs(pDevice);
device_free_int_bufs(pDevice);
@@ -1239,7 +1241,6 @@ device_release_WPADEV(pDevice);
usb_free_urb(pDevice->pInterruptURB);

BSSvClearNodeDBTable(pDevice, 0);
- pDevice->flags &=(~DEVICE_FLAGS_OPENED);

DBG_PRT(MSG_LEVEL_DEBUG, KERN_INFO "device_close2 \n");


2013-10-03 04:04:47

by Greg Kroah-Hartman

Subject: [ 02/13] x86, efi: Don't map Boot Services on i386

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Josh Boyer <[email protected]>

commit 700870119f49084da004ab588ea2b799689efaf7 upstream.

Add patch to fix 32bit EFI service mapping (rhbz 726701)

Multiple people are reporting hitting the following WARNING on i386,

WARNING: at arch/x86/mm/ioremap.c:102 __ioremap_caller+0x3d3/0x440()
Modules linked in:
Pid: 0, comm: swapper Not tainted 3.9.0-rc7+ #95
Call Trace:
[<c102b6af>] warn_slowpath_common+0x5f/0x80
[<c1023fb3>] ? __ioremap_caller+0x3d3/0x440
[<c1023fb3>] ? __ioremap_caller+0x3d3/0x440
[<c102b6ed>] warn_slowpath_null+0x1d/0x20
[<c1023fb3>] __ioremap_caller+0x3d3/0x440
[<c106007b>] ? get_usage_chars+0xfb/0x110
[<c102d937>] ? vprintk_emit+0x147/0x480
[<c1418593>] ? efi_enter_virtual_mode+0x1e4/0x3de
[<c102406a>] ioremap_cache+0x1a/0x20
[<c1418593>] ? efi_enter_virtual_mode+0x1e4/0x3de
[<c1418593>] efi_enter_virtual_mode+0x1e4/0x3de
[<c1407984>] start_kernel+0x286/0x2f4
[<c1407535>] ? repair_env_string+0x51/0x51
[<c1407362>] i386_start_kernel+0x12c/0x12f

Due to the workaround described in commit 916f676f8 ("x86, efi: Retain
boot service code until after switching to virtual mode") EFI Boot
Service regions are mapped for a period during boot. Unfortunately, with
the limited size of the i386 direct kernel map it's possible that some
of the Boot Service regions will not be directly accessible, which
causes them to be ioremap()'d, triggering the above warning as the
regions are marked as E820_RAM in the e820 memmap.

There are currently only two situations where we need to map EFI Boot
Service regions,

1. To work around the firmware bug described in 916f676f8
2. To access the ACPI BGRT image

but since we haven't seen an i386 implementation that requires either,
this simple fix should suffice for now.

[ Added to changelog - Matt ]

Reported-by: Bryan O'Donoghue <[email protected]>
Acked-by: Tom Zanussi <[email protected]>
Acked-by: Darren Hart <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Josh Boyer <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/platform/efi/efi.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -588,10 +588,13 @@ void __init efi_enter_virtual_mode(void)

for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
md = p;
- if (!(md->attribute & EFI_MEMORY_RUNTIME) &&
- md->type != EFI_BOOT_SERVICES_CODE &&
- md->type != EFI_BOOT_SERVICES_DATA)
- continue;
+ if (!(md->attribute & EFI_MEMORY_RUNTIME)) {
+#ifdef CONFIG_X86_64
+ if (md->type != EFI_BOOT_SERVICES_CODE &&
+ md->type != EFI_BOOT_SERVICES_DATA)
+#endif
+ continue;
+ }

size = md->num_pages << EFI_PAGE_SHIFT;
end = md->phys_addr + size;
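
The new condition is easier to read as a predicate. The sketch below is a userspace illustration only: `should_map` is a hypothetical name, and the run-time flag `is_x86_64` stands in for the compile-time CONFIG_X86_64 check used in the patch. The constant values follow the UEFI memory type and attribute definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFI_BOOT_SERVICES_CODE 3
#define EFI_BOOT_SERVICES_DATA 4
#define EFI_MEMORY_RUNTIME     ((uint64_t)0x8000000000000000ULL)

/*
 * Returns true if a memory descriptor would be mapped by the loop,
 * false if the loop would "continue" past it.
 */
static bool should_map(uint32_t type, uint64_t attribute, bool is_x86_64)
{
    if (attribute & EFI_MEMORY_RUNTIME)
        return true;            /* runtime regions are always mapped */
    if (!is_x86_64)
        return false;           /* i386: skip Boot Services regions */
    return type == EFI_BOOT_SERVICES_CODE ||
           type == EFI_BOOT_SERVICES_DATA;
}
```

With this reading, a Boot Services region without the runtime attribute is mapped on x86_64 but skipped on i386, which is what avoids the ioremap() warning.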

2013-10-03 04:05:24

by Greg Kroah-Hartman

Subject: [ 06/13] usb/core/devio.c: Don't reject control message to endpoint with wrong direction bit

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kurt Garloff <[email protected]>

commit 831abf76643555a99b80a3b54adfa7e4fa0a3259 upstream.

Trying to read data from the Pegasus Technologies NoteTaker (0e20:0101)
[1] with the Windows App (EasyNote) works natively but fails when
Windows is running under KVM (and the USB device handed to KVM).

The reason is a USB control message
usb 4-2.2: control urb: bRequestType=22 bRequest=09 wValue=0200 wIndex=0001 wLength=0008
This goes to endpoint address 0x01 (wIndex); however, endpoint address
0x01 does not exist. There is an endpoint 0x81 though (same number,
but other direction); the app may have meant that endpoint instead.

The kernel thus rejects the IO and thus we see the failure.

Apparently, Linux is more strict here than Windows ... we can't change
the Win app easily, so that's a problem.

It seems that the Win app/driver is buggy here and the driver does not
behave fully according to the USB HID class spec that it claims to
belong to. The device seems to happily deal with that though (and
seems to not really care about this value much).

So the question is whether the Linux kernel should filter here.
Rejecting has the risk that somewhat non-compliant userspace apps/
drivers (most likely in a virtual machine) are prevented from working.
Not rejecting has the risk of confusing an overly sensitive device with
such a transfer. Given that Windows does not filter, this risk is
rather small though.

The patch makes the kernel more tolerant: If the endpoint address in
wIndex does not exist, but an endpoint with toggled direction bit does,
it will let the transfer through. (It does NOT change the message.)

With the attached patch, the app in Windows in KVM works.
usb 4-2.2: check_ctrlrecip: process 13073 (qemu-kvm) requesting ep 01 but needs 81

I suspect this will mostly affect apps in virtual environments; as on
Linux the apps would have been adapted to the stricter handling of the
kernel. I have done that for mine[2].

[1] http://www.pegatech.com/
[2] https://sourceforge.net/projects/notetakerpen/

Signed-off-by: Kurt Garloff <[email protected]>
Acked-by: Alan Stern <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/core/devio.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -645,6 +645,22 @@ static int check_ctrlrecip(struct dev_st
if ((index & ~USB_DIR_IN) == 0)
return 0;
ret = findintfep(ps->dev, index);
+ if (ret < 0) {
+ /*
+ * Some not fully compliant Win apps seem to get
+ * index wrong and have the endpoint number here
+ * rather than the endpoint address (with the
+ * correct direction). Win does let this through,
+ * so we'll not reject it here but leave it to
+ * the device to not break KVM. But we warn.
+ */
+ ret = findintfep(ps->dev, index ^ 0x80);
+ if (ret >= 0)
+ dev_info(&ps->dev->dev,
+ "%s: process %i (%s) requesting ep %02x but needs %02x\n",
+ __func__, task_pid_nr(current),
+ current->comm, index, index ^ 0x80);
+ }
if (ret >= 0)
ret = checkintf(ps, ret);
break;
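
The fallback lookup can be sketched in plain C as follows. This is an illustration under stated assumptions, not the kernel code: `find_ep` and `find_ep_tolerant` are hypothetical helpers operating on a flat array of endpoint addresses, whereas the kernel's findintfep() walks the device's interface descriptors.

```c
#include <stddef.h>

#define USB_DIR_IN 0x80  /* direction bit of an endpoint address */

/* Hypothetical helper: index of addr in eps[], or -1 if absent. */
static int find_ep(const unsigned char *eps, size_t n, unsigned char addr)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (eps[i] == addr)
            return (int)i;
    return -1;
}

/*
 * Look up addr; if it does not exist, retry with the direction bit
 * flipped. *toggled is set to 1 when the fallback matched, which is
 * the case where the patch logs a message. The request itself is
 * not modified.
 */
static int find_ep_tolerant(const unsigned char *eps, size_t n,
                            unsigned char addr, int *toggled)
{
    int ret = find_ep(eps, n, addr);

    *toggled = 0;
    if (ret < 0) {
        ret = find_ep(eps, n, (unsigned char)(addr ^ USB_DIR_IN));
        if (ret >= 0)
            *toggled = 1;
    }
    return ret;
}
```

For a device exposing endpoints 0x81 and 0x02, a request for 0x01 resolves to the 0x81 entry with the toggled flag set, matching the "requesting ep 01 but needs 81" log line quoted in the commit message.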

2013-10-03 04:05:29

by Greg Kroah-Hartman

Subject: [ 07/13] dm snapshot: workaround for a false positive lockdep warning

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <[email protected]>

commit 5ea330a75bd86b2b2a01d7b85c516983238306fb upstream.

The kernel reports a lockdep warning if a snapshot is invalidated because
it runs out of space.

The lockdep warning was triggered by commit 0976dfc1d0cd80a4e9dfaf87bd87
("workqueue: Catch more locking problems with flush_work()") in v3.5.

The warning is a false positive. The real cause for the warning is that
the lockdep engine treats different instances of md->lock as a single
lock.

This patch is a workaround - we use flush_workqueue instead of flush_work.
This code path is not performance sensitive (it is called only on
initialization or invalidation), thus it doesn't matter that we flush the
whole workqueue.

The real fix for the problem would be to teach the lockdep engine to treat
different instances of md->lock as separate locks.

Signed-off-by: Mikulas Patocka <[email protected]>
Acked-by: Alasdair G Kergon <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/md/dm-snap-persistent.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -251,7 +251,7 @@ static int chunk_io(struct pstore *ps, v
*/
INIT_WORK_ONSTACK(&req.work, do_metadata);
queue_work(ps->metadata_wq, &req.work);
- flush_work(&req.work);
+ flush_workqueue(ps->metadata_wq);

return req.result;
}

2013-10-03 04:05:46

by Greg Kroah-Hartman

Subject: [ 08/13] dm-snapshot: fix performance degradation due to small hash size

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <[email protected]>

commit 60e356f381954d79088d0455e357db48cfdd6857 upstream.

LVM2, since version 2.02.96, creates origin with zero size, then loads
the snapshot driver and then loads the origin. Consequently, the
snapshot driver sees an origin size of zero and sets the hash size to the
lower bound of 64. Such a small hash table causes performance degradation.

This patch changes it so that the hash size is determined by the size of
snapshot volume, not minimum of origin and snapshot size. It doesn't
make sense to set the snapshot size significantly larger than the origin
size, so we do not need to take origin size into account when
calculating the hash size.

Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/md/dm-snap.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -724,17 +724,16 @@ static int calc_max_buckets(void)
*/
static int init_hash_tables(struct dm_snapshot *s)
{
- sector_t hash_size, cow_dev_size, origin_dev_size, max_buckets;
+ sector_t hash_size, cow_dev_size, max_buckets;

/*
* Calculate based on the size of the original volume or
* the COW volume...
*/
cow_dev_size = get_dev_size(s->cow->bdev);
- origin_dev_size = get_dev_size(s->origin->bdev);
max_buckets = calc_max_buckets();

- hash_size = min(origin_dev_size, cow_dev_size) >> s->store->chunk_shift;
+ hash_size = cow_dev_size >> s->store->chunk_shift;
hash_size = min(hash_size, max_buckets);

if (hash_size < 64)
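
To make the effect concrete, here is a small userspace sketch comparing the old and new calculations. The function names are hypothetical, sizes are in 512-byte sectors, and only the steps visible in the hunk and described in the commit message are modeled (shift by the chunk shift, clamp to max_buckets, lower bound of 64); anything the driver does after that is left out.

```c
#include <stdint.h>

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Old behaviour: the smaller of origin and COW size decides. */
static uint64_t hash_size_old(uint64_t origin_sectors, uint64_t cow_sectors,
                              unsigned int chunk_shift, uint64_t max_buckets)
{
    uint64_t h = min_u64(origin_sectors, cow_sectors) >> chunk_shift;

    h = min_u64(h, max_buckets);
    return h < 64 ? 64 : h;
}

/* New behaviour: only the COW (snapshot) device size is considered. */
static uint64_t hash_size_new(uint64_t cow_sectors,
                              unsigned int chunk_shift, uint64_t max_buckets)
{
    uint64_t h = cow_sectors >> chunk_shift;

    h = min_u64(h, max_buckets);
    return h < 64 ? 64 : h;
}
```

With a zero-sized origin, a 1 GiB COW device (2097152 sectors) and a chunk shift of 4, the old calculation collapses to 64 buckets while the new one yields 131072, assuming max_buckets is larger than that.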

2013-10-03 04:05:53

by Greg Kroah-Hartman

Subject: [ 10/13] hwmon: (applesmc) Check key count before proceeding

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Henrik Rydberg <[email protected]>

commit 5f4513864304672e6ea9eac60583eeac32e679f2 upstream.

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied
regardless.

Reported-by: <[email protected]>
Suggested-by: Guenter Roeck <[email protected]>
Signed-off-by: Henrik Rydberg <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hwmon/applesmc.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -485,16 +485,25 @@ static int applesmc_init_smcreg_try(void
{
struct applesmc_registers *s = &smcreg;
bool left_light_sensor, right_light_sensor;
+ unsigned int count;
u8 tmp[1];
int ret;

if (s->init_complete)
return 0;

- ret = read_register_count(&s->key_count);
+ ret = read_register_count(&count);
if (ret)
return ret;

+ if (s->cache && s->key_count != count) {
+ pr_warn("key count changed from %d to %d\n",
+ s->key_count, count);
+ kfree(s->cache);
+ s->cache = NULL;
+ }
+ s->key_count = count;
+
if (!s->cache)
s->cache = kcalloc(s->key_count, sizeof(*s->cache), GFP_KERNEL);
if (!s->cache)

2013-10-03 04:05:48

by Greg Kroah-Hartman

Subject: [ 09/13] drm/i915/dp: increase i2c-over-aux retry interval on AUX DEFER

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jani Nikula <[email protected]>

commit 8d16f258217f2f583af1fd57c5144aa4bbe73e48 upstream.

There are no clear-cut rules or specs for the retry interval, as there
are many factors that affect overall response time. Increase the
interval, and even more so on branch devices which may have limited i2c
bit rates.

Signed-off-by: Jani Nikula <[email protected]>
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=60263
Tested-by: Nicolas Suzor <[email protected]>
Reviewed-by: Todd Previte <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/i915/intel_dp.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -531,7 +531,18 @@ intel_dp_i2c_aux_ch(struct i2c_adapter *
DRM_DEBUG_KMS("aux_ch native nack\n");
return -EREMOTEIO;
case AUX_NATIVE_REPLY_DEFER:
- udelay(100);
+ /*
+ * For now, just give more slack to branch devices. We
+ * could check the DPCD for I2C bit rate capabilities,
+ * and if available, adjust the interval. We could also
+ * be more careful with DP-to-Legacy adapters where a
+ * long legacy cable may force very low I2C bit rates.
+ */
+ if (intel_dp->dpcd[DP_DOWNSTREAMPORT_PRESENT] &
+ DP_DWN_STRM_PORT_PRESENT)
+ usleep_range(500, 600);
+ else
+ usleep_range(300, 400);
continue;
default:
DRM_ERROR("aux_ch invalid native reply 0x%02x\n",

2013-10-03 04:06:09

by Greg Kroah-Hartman

Subject: [ 12/13] hwmon: (applesmc) Silence uninitialized warnings

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Henrik Rydberg <[email protected]>

commit 0fc86eca1b338d06ec500b34ef7def79c32b602b upstream.

Some error paths do not set a result, leading to the (false)
assumption that the value may be used uninitialized. Set results for
those paths as well.

Signed-off-by: Henrik Rydberg <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hwmon/applesmc.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -344,8 +344,10 @@ static int applesmc_get_lower_bound(unsi
while (begin != end) {
int middle = begin + (end - begin) / 2;
entry = applesmc_get_entry_by_index(middle);
- if (IS_ERR(entry))
+ if (IS_ERR(entry)) {
+ *lo = 0;
return PTR_ERR(entry);
+ }
if (strcmp(entry->key, key) < 0)
begin = middle + 1;
else
@@ -364,8 +366,10 @@ static int applesmc_get_upper_bound(unsi
while (begin != end) {
int middle = begin + (end - begin) / 2;
entry = applesmc_get_entry_by_index(middle);
- if (IS_ERR(entry))
+ if (IS_ERR(entry)) {
+ *hi = smcreg.key_count;
return PTR_ERR(entry);
+ }
if (strcmp(key, entry->key) < 0)
end = middle;
else
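
The pattern being applied here, giving the output parameter a defined value on every return path, can be shown with a self-contained lower-bound search. This is a sketch under assumptions: the callback type `get_key_fn`, the demo key table and the NULL-means-error convention are all hypothetical, and the tail of the loop (`end = middle`, then `*lo = begin`) is inferred rather than shown in the hunk.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical accessor: returns the key at idx, or NULL on read error. */
typedef const char *(*get_key_fn)(unsigned int idx);

static const char *const demo_keys[] = { "ALV0", "F0Ac", "TC0P" };

static const char *demo_get(unsigned int idx)
{
    return demo_keys[idx];
}

static const char *failing_get(unsigned int idx)
{
    (void)idx;
    return NULL;  /* simulate a failed read */
}

/* Finds the first index whose key is >= key; *lo is set on all paths. */
static int get_lower_bound(unsigned int *lo, const char *key,
                           unsigned int count, get_key_fn get)
{
    unsigned int begin = 0, end = count;

    while (begin != end) {
        unsigned int middle = begin + (end - begin) / 2;
        const char *entry = get(middle);

        if (!entry) {
            *lo = 0;     /* defined value even though we fail */
            return -1;
        }
        if (strcmp(entry, key) < 0)
            begin = middle + 1;
        else
            end = middle;
    }
    *lo = begin;
    return 0;
}
```

Callers that check the return value never consume `*lo` after a failure, so the extra store only serves to keep the compiler's uninitialized-use analysis quiet, as the commit message describes.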

2013-10-03 04:06:16

by Greg Kroah-Hartman

Subject: [ 13/13] splice: fix racy pipe->buffers uses

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>

commit 047fe3605235888f3ebcda0c728cb31937eadfe6 upstream.

Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()

commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes)
added the capability to adjust pipe->buffers.

The problem is that some paths don't hold the pipe mutex and assume
pipe->buffers doesn't change for their duration.

Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe->buffers where appropriate.

splice_shrink_spd() loses its struct pipe_inode_info argument.

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Tom Herbert <[email protected]>
Cc: stable <[email protected]> # 2.6.35
Tested-by: Dave Jones <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/splice.c | 35 ++++++++++++++++++++---------------
include/linux/splice.h | 8 ++++----
kernel/relay.c | 5 +++--
kernel/trace/trace.c | 6 ++++--
net/core/skbuff.c | 3 ++-
5 files changed, 33 insertions(+), 24 deletions(-)

--- a/fs/splice.c
+++ b/fs/splice.c
@@ -274,13 +274,16 @@ static void spd_release_page(struct spli
* Check if we need to grow the arrays holding pages and partial page
* descriptions.
*/
-int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
+int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
{
- if (pipe->buffers <= PIPE_DEF_BUFFERS)
+ unsigned int buffers = ACCESS_ONCE(pipe->buffers);
+
+ spd->nr_pages_max = buffers;
+ if (buffers <= PIPE_DEF_BUFFERS)
return 0;

- spd->pages = kmalloc(pipe->buffers * sizeof(struct page *), GFP_KERNEL);
- spd->partial = kmalloc(pipe->buffers * sizeof(struct partial_page), GFP_KERNEL);
+ spd->pages = kmalloc(buffers * sizeof(struct page *), GFP_KERNEL);
+ spd->partial = kmalloc(buffers * sizeof(struct partial_page), GFP_KERNEL);

if (spd->pages && spd->partial)
return 0;
@@ -290,10 +293,9 @@ int splice_grow_spd(struct pipe_inode_in
return -ENOMEM;
}

-void splice_shrink_spd(struct pipe_inode_info *pipe,
- struct splice_pipe_desc *spd)
+void splice_shrink_spd(struct splice_pipe_desc *spd)
{
- if (pipe->buffers <= PIPE_DEF_BUFFERS)
+ if (spd->nr_pages_max <= PIPE_DEF_BUFFERS)
return;

kfree(spd->pages);
@@ -316,6 +318,7 @@ __generic_file_splice_read(struct file *
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &page_cache_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -327,7 +330,7 @@ __generic_file_splice_read(struct file *
index = *ppos >> PAGE_CACHE_SHIFT;
loff = *ppos & ~PAGE_CACHE_MASK;
req_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
- nr_pages = min(req_pages, pipe->buffers);
+ nr_pages = min(req_pages, spd.nr_pages_max);

/*
* Lookup the (hopefully) full range of pages we need.
@@ -498,7 +501,7 @@ fill_it:
if (spd.nr_pages)
error = splice_to_pipe(pipe, &spd);

- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return error;
}

@@ -599,6 +602,7 @@ ssize_t default_file_splice_read(struct
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &default_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -609,8 +613,8 @@ ssize_t default_file_splice_read(struct

res = -ENOMEM;
vec = __vec;
- if (pipe->buffers > PIPE_DEF_BUFFERS) {
- vec = kmalloc(pipe->buffers * sizeof(struct iovec), GFP_KERNEL);
+ if (spd.nr_pages_max > PIPE_DEF_BUFFERS) {
+ vec = kmalloc(spd.nr_pages_max * sizeof(struct iovec), GFP_KERNEL);
if (!vec)
goto shrink_ret;
}
@@ -618,7 +622,7 @@ ssize_t default_file_splice_read(struct
offset = *ppos & ~PAGE_CACHE_MASK;
nr_pages = (len + offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;

- for (i = 0; i < nr_pages && i < pipe->buffers && len; i++) {
+ for (i = 0; i < nr_pages && i < spd.nr_pages_max && len; i++) {
struct page *page;

page = alloc_page(GFP_USER);
@@ -666,7 +670,7 @@ ssize_t default_file_splice_read(struct
shrink_ret:
if (vec != __vec)
kfree(vec);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return res;

err:
@@ -1618,6 +1622,7 @@ static long vmsplice_to_pipe(struct file
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &user_page_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -1633,13 +1638,13 @@ static long vmsplice_to_pipe(struct file

spd.nr_pages = get_iovec_page_array(iov, nr_segs, spd.pages,
spd.partial, flags & SPLICE_F_GIFT,
- pipe->buffers);
+ spd.nr_pages_max);
if (spd.nr_pages <= 0)
ret = spd.nr_pages;
else
ret = splice_to_pipe(pipe, &spd);

- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return ret;
}

--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -51,7 +51,8 @@ struct partial_page {
struct splice_pipe_desc {
struct page **pages; /* page map */
struct partial_page *partial; /* pages[] may not be contig */
- int nr_pages; /* number of pages in map */
+ int nr_pages; /* number of populated pages in map */
+ unsigned int nr_pages_max; /* pages[] & partial[] arrays size */
unsigned int flags; /* splice flags */
const struct pipe_buf_operations *ops;/* ops associated with output pipe */
void (*spd_release)(struct splice_pipe_desc *, unsigned int);
@@ -85,8 +86,7 @@ extern ssize_t splice_direct_to_actor(st
/*
* for dynamic pipe sizing
*/
-extern int splice_grow_spd(struct pipe_inode_info *, struct splice_pipe_desc *);
-extern void splice_shrink_spd(struct pipe_inode_info *,
- struct splice_pipe_desc *);
+extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *);
+extern void splice_shrink_spd(struct splice_pipe_desc *);

#endif
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -1235,6 +1235,7 @@ static ssize_t subbuf_splice_actor(struc
struct splice_pipe_desc spd = {
.pages = pages,
.nr_pages = 0,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.partial = partial,
.flags = flags,
.ops = &relay_pipe_buf_ops,
@@ -1302,8 +1303,8 @@ static ssize_t subbuf_splice_actor(struc
ret += padding;

out:
- splice_shrink_spd(pipe, &spd);
- return ret;
+ splice_shrink_spd(&spd);
+ return ret;
}

static ssize_t relay_file_splice_read(struct file *in,
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3364,6 +3364,7 @@ static ssize_t tracing_splice_read_pipe(
.pages = pages_def,
.partial = partial_def,
.nr_pages = 0, /* This gets updated below. */
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &tracing_pipe_buf_ops,
.spd_release = tracing_spd_release_pipe,
@@ -3435,7 +3436,7 @@ static ssize_t tracing_splice_read_pipe(

ret = splice_to_pipe(pipe, &spd);
out:
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return ret;

out_err:
@@ -3848,6 +3849,7 @@ tracing_buffers_splice_read(struct file
struct splice_pipe_desc spd = {
.pages = pages_def,
.partial = partial_def,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &buffer_pipe_buf_ops,
.spd_release = buffer_spd_release,
@@ -3936,7 +3938,7 @@ tracing_buffers_splice_read(struct file
}

ret = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
out:
return ret;
}
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1535,6 +1535,7 @@ int skb_splice_bits(struct sk_buff *skb,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = MAX_SKB_FRAGS,
.flags = flags,
.ops = &sock_pipe_buf_ops,
.spd_release = sock_spd_release,
@@ -1581,7 +1582,7 @@ done:
lock_sock(sk);
}

- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return ret;
}
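
The core idea of the fix, recording the array capacity in the descriptor when it is grown and consulting only that record when it is shrunk, can be sketched in userspace. The types `struct pipe` and `struct desc` and the functions `desc_grow` and `desc_shrink` are hypothetical simplifications; calloc()/free() stand in for kmalloc()/kfree(), and a plain read stands in for ACCESS_ONCE().

```c
#include <stdlib.h>

#define DEF_BUFFERS 16  /* plays the role of PIPE_DEF_BUFFERS */

struct pipe {
    unsigned int buffers;       /* may be resized by another thread */
};

struct desc {
    void **pages;               /* caller's stack array unless grown */
    unsigned int nr_pages_max;  /* capacity of pages[], fixed at grow time */
};

static int desc_grow(const struct pipe *pipe, struct desc *d)
{
    unsigned int buffers = pipe->buffers;  /* read the shared value once */

    d->nr_pages_max = buffers;
    if (buffers <= DEF_BUFFERS)
        return 0;               /* stack array is large enough */

    d->pages = calloc(buffers, sizeof(*d->pages));
    return d->pages ? 0 : -1;
}

static void desc_shrink(struct desc *d)
{
    /* Decide from the recorded capacity, never from the live pipe. */
    if (d->nr_pages_max <= DEF_BUFFERS)
        return;

    free(d->pages);
    d->pages = NULL;
}
```

If the shrink step re-read the pipe size instead, a pipe that grew in between could cause the stack array to be freed, and one that shrank could leak the heap array. The former is presumably the kind of invalid free behind the reported BUG.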


2013-10-03 04:06:03

by Greg Kroah-Hartman

Subject: [ 11/13] mm: fix aio performance regression for database caused by THP

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Khalid Aziz <[email protected]>

commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream.

This patch needed to be backported due to changes to mm/swap.c some time
after 3.6 kernel.

I am working with a tool that simulates Oracle database I/O workload.
This tool (orion to be specific -
<http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>)
allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag. It then
does aio into these pages from flash disks using various common block
sizes used by database. I am looking at performance with two of the most
common block sizes - 1M and 64K. aio performance with these two block
sizes plunged after Transparent HugePages was introduced in the kernel.
Here are performance numbers:

            pre-THP      2.6.39       3.11-rc5
1M read     8384 MB/s    5629 MB/s    6501 MB/s
64K read    7867 MB/s    4576 MB/s    4251 MB/s

I have narrowed the performance impact down to the overheads introduced by
THP in __get_page_tail() and put_compound_page() routines. perf top shows
>40% of cycles being spent in these two routines. Every time direct I/O
to hugetlbfs pages starts, kernel calls get_page() to grab a reference to
the pages and calls put_page() when I/O completes to put the reference
away. THP introduced a significant amount of locking overhead to get_page()
and put_page() when dealing with compound pages because hugepages can be
split underneath get_page() and put_page(). It added this overhead
irrespective of whether it is dealing with hugetlbfs pages or transparent
hugepages. This resulted in a 20%-45% drop in aio performance when using
hugetlbfs pages.

Since hugetlbfs pages can not be split, there is no reason to go through
all the locking overhead for these pages from what I can see. I added
code to __get_page_tail() and put_compound_page() to bypass all the
locking code when working with hugetlbfs pages. This improved performance
significantly. Performance numbers with this patch:

            pre-THP      3.11-rc5     3.11-rc5 + Patch
1M read     8384 MB/s    6501 MB/s    8371 MB/s
64K read    7867 MB/s    4251 MB/s    6510 MB/s

Performance with 64K read is still lower than what it was before THP, but
still a 53% improvement. It does mean there is more work to be done but I
will take a 53% improvement for now.
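
As a quick check of the percentages quoted above, the relative change between two throughput figures can be computed as below. The helper name `pct_change` is hypothetical; the inputs are the MB/s values from the tables in this message.

```c
/* Relative change from "before" to "after", in percent. */
static double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For the 64K case, going from 4251 MB/s to 6510 MB/s is an increase of about 53%. The drops from the pre-THP numbers to 2.6.39 and 3.11-rc5 fall between roughly 22% and 46%, in line with the 20%-45% range stated earlier.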

Please take a look at the following patch and let me know if it looks
reasonable.

[[email protected]: tweak comments]
Signed-off-by: Khalid Aziz <[email protected]>
Cc: Pravin B Shelar <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/swap.c | 65 ++++++++++++++++++++++++++++++++++++++++++++------------------
1 file changed, 47 insertions(+), 18 deletions(-)

--- a/mm/swap.c
+++ b/mm/swap.c
@@ -41,6 +41,8 @@ static DEFINE_PER_CPU(struct pagevec[NR_
 static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
 static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);

+int PageHuge(struct page *page);
+
 /*
  * This path almost never happens for VM activity - pages are normally
  * freed via pagevecs. But it gets used by networking.
@@ -69,13 +71,26 @@ static void __put_compound_page(struct p
 {
 	compound_page_dtor *dtor;

-	__page_cache_release(page);
+	if (!PageHuge(page))
+		__page_cache_release(page);
 	dtor = get_compound_page_dtor(page);
 	(*dtor)(page);
 }

 static void put_compound_page(struct page *page)
 {
+	/*
+	 * hugetlbfs pages can not be split from under us. So if this
+	 * is a hugetlbfs page, check refcount on head page and release
+	 * the page if refcount is zero.
+	 */
+	if (PageHuge(page)) {
+		page = compound_head(page);
+		if (put_page_testzero(page))
+			__put_compound_page(page);
+		return;
+	}
+
 	if (unlikely(PageTail(page))) {
 		/* __split_huge_page_refcount can run under us */
 		struct page *page_head = compound_trans_head(page);
@@ -158,26 +173,40 @@ bool __get_page_tail(struct page *page)
 	 * proper PT lock that already serializes against
 	 * split_huge_page().
 	 */
-	unsigned long flags;
 	bool got = false;
-	struct page *page_head = compound_trans_head(page);

-	if (likely(page != page_head && get_page_unless_zero(page_head))) {
-		/*
-		 * page_head wasn't a dangling pointer but it
-		 * may not be a head page anymore by the time
-		 * we obtain the lock. That is ok as long as it
-		 * can't be freed from under us.
-		 */
-		flags = compound_lock_irqsave(page_head);
-		/* here __split_huge_page_refcount won't run anymore */
-		if (likely(PageTail(page))) {
-			__get_page_tail_foll(page, false);
-			got = true;
+	/*
+	 * If this is a hugetlbfs page, it can not be split under
+	 * us. Simply increment counts for tail page and its head page
+	 */
+	if (PageHuge(page)) {
+		struct page *page_head;
+
+		page_head = compound_head(page);
+		atomic_inc(&page_head->_count);
+		got = true;
+	} else {
+		struct page *page_head = compound_trans_head(page);
+		unsigned long flags;
+
+		if (likely(page != page_head &&
+					get_page_unless_zero(page_head))) {
+			/*
+			 * page_head wasn't a dangling pointer but it
+			 * may not be a head page anymore by the time
+			 * we obtain the lock. That is ok as long as it
+			 * can't be freed from under us.
+			 */
+			flags = compound_lock_irqsave(page_head);
+			/* here __split_huge_page_refcount won't run anymore */
+			if (likely(PageTail(page))) {
+				__get_page_tail_foll(page, false);
+				got = true;
+			}
+			compound_unlock_irqrestore(page_head, flags);
+			if (unlikely(!got))
+				put_page(page_head);
 		}
-		compound_unlock_irqrestore(page_head, flags);
-		if (unlikely(!got))
-			put_page(page_head);
 	}
 	return got;
 }
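
The idea behind the two fast paths in this patch can be modelled outside the kernel. The following is only a single-threaded sketch under stated assumptions: `struct toy_page` and the `toy_*` functions are invented for illustration, the plain `int` counter stands in for the kernel's atomic reference count, and the locked slow path for transparent hugepages is not modelled at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; the fields are illustrative only. */
struct toy_page {
	int count;              /* reference count, kept on the head page */
	struct toy_page *head;  /* NULL on a head page, else points to the head */
	bool hugetlbfs;         /* stands in for the result of PageHuge() */
};

/* Return the head page of a compound page (the page itself if it is the head). */
struct toy_page *toy_compound_head(struct toy_page *page)
{
	return page->head ? page->head : page;
}

/*
 * Modelled on the patched __get_page_tail(): a hugetlbfs page cannot be
 * split, so the head's count is bumped without taking any lock.
 * Returns false when the caller would have to use the locked slow path.
 */
bool toy_get_page_tail(struct toy_page *page)
{
	if (!page->hugetlbfs)
		return false;
	toy_compound_head(page)->count++;
	return true;
}

/*
 * Modelled on the patched put_compound_page(): drop one reference on the
 * head page and report whether it was the last one.
 */
bool toy_put_hugetlbfs_page(struct toy_page *page)
{
	struct toy_page *head = toy_compound_head(page);

	return --head->count == 0;
}
```

In this model, taking a reference through a tail page of a hugetlbfs compound page raises the head's count, and the page is only reported as releasable once the last reference is dropped; a page with `hugetlbfs` unset is handed back to the caller for the slow path.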

2013-10-03 04:06:21

by Greg Kroah-Hartman

Subject: [ 04/13] xhci: Fix oops happening after address device timeout

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mathias Nyman <[email protected]>

commit 284d20552461466b04d6bfeafeb1c47a8891b591 upstream.

When a command times out, the command ring is first aborted,
and then stopped. If the command ring is empty when it is stopped
the stop event will point to next command which is not yet set.
xHCI tries to handle this next event often causing an oops.

Don't handle command completion events on stopped cmd ring if ring is
empty.

This patch should be backported to kernels as old as 3.7, that contain
the commit b92cc66c047ff7cf587b318fe377061a353c120f "xHCI: add aborting
command ring function"

Signed-off-by: Mathias Nyman <[email protected]>
Reported-by: Giovanni <[email protected]>
Signed-off-by: Sarah Sharp <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/host/xhci-ring.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1377,6 +1377,12 @@ static void handle_cmd_completion(struct
 			inc_deq(xhci, xhci->cmd_ring, false);
 			return;
 		}
+		/* There is no command to handle if we get a stop event when the
+		 * command ring is empty, event->cmd_trb points to the next
+		 * unset command
+		 */
+		if (xhci->cmd_ring->dequeue == xhci->cmd_ring->enqueue)
+			return;
 	}

 	switch (le32_to_cpu(xhci->cmd_ring->dequeue->generic.field[3])
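
The check added here is the usual "dequeue equals enqueue means empty" test on a ring. A minimal userspace sketch of that idea follows; `struct toy_ring` and the `toy_*` helpers are hypothetical names for illustration, and wrap-around and ring-full handling are deliberately left out.

```c
#include <stdbool.h>

#define TOY_RING_SIZE 8

/* Toy stand-in for a command TRB. */
struct toy_trb {
	unsigned int field[4];
};

/* Toy command ring; not the layout used by the xHCI driver. */
struct toy_ring {
	struct toy_trb trbs[TOY_RING_SIZE];
	struct toy_trb *enqueue;  /* where the next command will be written */
	struct toy_trb *dequeue;  /* oldest command that has not completed yet */
};

void toy_ring_init(struct toy_ring *ring)
{
	ring->enqueue = ring->trbs;
	ring->dequeue = ring->trbs;
}

/* The ring is empty when nothing has been queued past the dequeue pointer. */
bool toy_ring_empty(const struct toy_ring *ring)
{
	return ring->dequeue == ring->enqueue;
}

/* Queue one command; no wrap-around or full check, to keep the sketch short. */
void toy_ring_queue(struct toy_ring *ring, unsigned int type)
{
	ring->enqueue->field[3] = type;
	ring->enqueue++;
}

/*
 * Handle a "ring stopped" event. Returns false when the event is ignored
 * because the ring is empty, i.e. the slot at dequeue was never filled in.
 */
bool toy_handle_stop_event(struct toy_ring *ring)
{
	if (toy_ring_empty(ring))
		return false;
	ring->dequeue++;
	return true;
}
```

Without the early return, the handler would go on to interpret whatever happens to be in the unset slot at `dequeue`, which is the situation the commit message describes.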

2013-10-03 04:05:16

by Greg Kroah-Hartman

Subject: [ 05/13] xhci: Fix race between ep halt and URB cancellation

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Florian Wolter <[email protected]>

commit 526867c3ca0caa2e3e846cb993b0f961c33c2abb upstream.

The halted state of an endpoint cannot be cleared via CLEAR_HALT from a
user process, because the stopped_td variable was overwritten in the
handle_stopped_endpoint() function. As a result, xhci_endpoint_reset()
refuses the reset and communication with the device cannot proceed over
this endpoint.
https://bugzilla.kernel.org/show_bug.cgi?id=60699

Signed-off-by: Florian Wolter <[email protected]>
Signed-off-by: Sarah Sharp <[email protected]>
Cc: Jonghwan Choi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/host/xhci-ring.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -882,8 +882,12 @@ remove_finished_td:
 		/* Otherwise ring the doorbell(s) to restart queued transfers */
 		ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
 	}
-	ep->stopped_td = NULL;
-	ep->stopped_trb = NULL;
+
+	/* Clear stopped_td and stopped_trb if endpoint is not halted */
+	if (!(ep->ep_state & EP_HALTED)) {
+		ep->stopped_td = NULL;
+		ep->stopped_trb = NULL;
+	}

 	/*
 	 * Drop the lock and complete the URBs in the cancelled TD list.
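
The effect of the change can be shown with a small model: the saved "stopped" pointers survive as long as the endpoint is flagged as halted, so a later reset can still find them. This is only a sketch; `struct toy_ep`, `TOY_EP_HALTED` and `toy_finish_stop_endpoint()` are hypothetical stand-ins, and the flag value is an assumption rather than a quote from the driver.

```c
#include <stddef.h>

/* Hypothetical flag bit for this sketch; not taken from the driver headers. */
#define TOY_EP_HALTED (1u << 1)

/* Toy stand-in for the per-endpoint state touched by the patch. */
struct toy_ep {
	unsigned int ep_state;
	void *stopped_td;
	void *stopped_trb;
};

/*
 * End of the "stop endpoint" handling: forget the stopped TD/TRB only if
 * the endpoint is not halted, so that a later reset of a halted endpoint
 * still has the information it needs.
 */
void toy_finish_stop_endpoint(struct toy_ep *ep)
{
	if (!(ep->ep_state & TOY_EP_HALTED)) {
		ep->stopped_td = NULL;
		ep->stopped_trb = NULL;
	}
}
```

Before the patch, the equivalent of this function cleared both pointers unconditionally, which is what made the subsequent reset refuse to run.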

2013-10-03 05:53:45

by Guenter Roeck

Subject: Re: [ 00/13] 3.0.99-stable review

On 10/02/2013 09:04 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.0.99 release.
> There are 13 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat Oct 5 04:03:47 UTC 2013.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.99-rc1.gz
> and the diffstat can be found below.
>

Heads up: I am getting lots of build failures in 3.0 and 3.4 builds.

mm/built-in.o: In function `__put_compound_page':
slab.c:(.text+0xaa3c): undefined reference to `PageHuge'
mm/built-in.o: In function `put_compound_page':
slab.c:(.text+0xaab0): undefined reference to `PageHuge'
mm/built-in.o: In function `__get_page_tail':
slab.c:(.text+0xb178): undefined reference to `PageHuge'
make: *** [.tmp_vmlinux1] Error 1

More tomorrow.

Guenter

2013-10-03 12:58:18

by Christoph Biedl

Subject: Re: [ 00/13] 3.0.99-stable review

Guenter Roeck wrote...

> On 10/02/2013 09:04 PM, Greg Kroah-Hartman wrote:
> >This is the start of the stable review cycle for the 3.0.99 release.

> Heads up: I am getting lots of build failures in 3.0 and 3.4 builds.
>
> mm/built-in.o: In function `__put_compound_page':
> slab.c:(.text+0xaa3c): undefined reference to `PageHuge'
> mm/built-in.o: In function `put_compound_page':
> slab.c:(.text+0xaab0): undefined reference to `PageHuge'
> mm/built-in.o: In function `__get_page_tail':
> slab.c:(.text+0xb178): undefined reference to `PageHuge'
> make: *** [.tmp_vmlinux1] Error 1

This is obviously due to

| [ 11/13] mm: fix aio performance regression for database caused by THP

and happens if CONFIG_HUGETLB_PAGE is not set.

Looking closer, upstream commit 7cb2ef56 included linux/hugetlb.h
while the backport for 3.0 just defines PageHuge. Reverting that like
in the patch below causes the build to complete, and the resulting
kernel shows no anomalies here.

Whoever did that backport, why was it done that way? Or did I miss an
important point?

Christoph

--- a/mm/swap.c
+++ b/mm/swap.c
@@ -31,6 +31,7 @@
 #include <linux/backing-dev.h>
 #include <linux/memcontrol.h>
 #include <linux/gfp.h>
+#include <linux/hugetlb.h>

 #include "internal.h"

@@ -41,8 +42,6 @@
 static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
 static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);

-int PageHuge(struct page *page);
-
 /*
  * This path almost never happens for VM activity - pages are normally
  * freed via pagevecs. But it gets used by networking.
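
Why the header fixes the link error can be sketched with the common "real declaration or inline stub" pattern: a bare declaration promises a symbol that only exists when the feature is built, while a header can supply a stub for the other case. The names below (`TOY_CONFIG_HUGETLB_PAGE`, `toy_page_huge()`, `struct toy_page`) are made up for this illustration and only approximate what a header such as linux/hugetlb.h does.

```c
/* Toy stand-in for struct page. */
struct toy_page {
	int flags;
};

/*
 * Define TOY_CONFIG_HUGETLB_PAGE to model a build with hugetlb support.
 * It is left undefined here, which models the failing configurations.
 */
#ifdef TOY_CONFIG_HUGETLB_PAGE
/* Real implementation lives in an object file that is only built with the option set. */
int toy_page_huge(const struct toy_page *page);
#else
/* No hugetlb support: an inline stub, so callers still compile and link. */
static inline int toy_page_huge(const struct toy_page *page)
{
	(void)page;
	return 0;
}
#endif

/* A caller in the style of __put_compound_page(): it never needs to know which variant it got. */
int toy_needs_page_cache_release(const struct toy_page *page)
{
	return !toy_page_huge(page);
}
```

Replacing the `#ifdef` block with only the bare declaration reproduces the failure mode reported above: the code compiles, but the final link has no definition to resolve the call against.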

2013-10-03 13:29:07

by Guenter Roeck

Subject: Re: [ 00/13] 3.0.99-stable review

On 10/03/2013 05:47 AM, Christoph Biedl wrote:
> Guenter Roeck wrote...
>
>> On 10/02/2013 09:04 PM, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 3.0.99 release.
>
>> Heads up: I am getting lots of build failures in 3.0 and 3.4 builds.
>>
>> mm/built-in.o: In function `__put_compound_page':
>> slab.c:(.text+0xaa3c): undefined reference to `PageHuge'
>> mm/built-in.o: In function `put_compound_page':
>> slab.c:(.text+0xaab0): undefined reference to `PageHuge'
>> mm/built-in.o: In function `__get_page_tail':
>> slab.c:(.text+0xb178): undefined reference to `PageHuge'
>> make: *** [.tmp_vmlinux1] Error 1
>
> This is obviously due to
>
> | [ 11/13] mm: fix aio performance regression for database caused by THP
>
> and happens if CONFIG_HUGETLB_PAGE is not set.
>

Thanks a lot for tracking this down.

Final build result is
total: 98 pass: 15 skipped: 16 fail: 67

for 3.0, which is obviously less than perfect.

All qemu tests failed as well, or rather the qemu images failed to build.

Guenter

> Looking closer, upstream commit 7cb2ef56 included linux/hugetlb.h
> while the backport for 3.0 just defines PageHuge. Reverting that like
> in the patch below causes the build to complete, and the resulting
> kernel shows no anomalies here.
>
> Whoever did that backport, why was it done that way? Or did I miss an
> important point?
>
> Christoph
>
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -31,6 +31,7 @@
> #include <linux/backing-dev.h>
> #include <linux/memcontrol.h>
> #include <linux/gfp.h>
> +#include <linux/hugetlb.h>
>
> #include "internal.h"
>
> @@ -41,8 +42,6 @@
> static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
> static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
>
> -int PageHuge(struct page *page);
> -
> /*
> * This path almost never happens for VM activity - pages are normally
> * freed via pagevecs. But it gets used by networking.
>
>
>

2013-10-03 13:35:47

by Khalid Aziz

Subject: Re: [ 00/13] 3.0.99-stable review

On 10/03/2013 06:47 AM, Christoph Biedl wrote:
> Guenter Roeck wrote...
>
>> On 10/02/2013 09:04 PM, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 3.0.99 release.
>
>> Heads up: I am getting lots of build failures in 3.0 and 3.4 builds.
>>
>> mm/built-in.o: In function `__put_compound_page':
>> slab.c:(.text+0xaa3c): undefined reference to `PageHuge'
>> mm/built-in.o: In function `put_compound_page':
>> slab.c:(.text+0xaab0): undefined reference to `PageHuge'
>> mm/built-in.o: In function `__get_page_tail':
>> slab.c:(.text+0xb178): undefined reference to `PageHuge'
>> make: *** [.tmp_vmlinux1] Error 1
>
> This is obviously due to
>
> | [ 11/13] mm: fix aio performance regression for database caused by THP
>
> and happens if CONFIG_HUGETLB_PAGE is not set.
>
> Looking closer, upstream commit 7cb2ef56 included linux/hugetlb.h
> while the backport for 3.0 just defines PageHuge. Reverting that like
> in the patch below causes the build to complete, and the resulting
> kernel shows no anomalies here.
>
> Whoever did that backport, why was it done that way? Or did I miss an
> important point?

Thanks for tracking this down. I had not tried a configuration with
CONFIG_HUGETLB_PAGE not set. In my config, I was getting many multiple
definition errors for a bunch of other defines from linux/hugetlb.h. I
will look at my config again but chances are I had something else
screwed up in my build since you did not see those errors. Did you
compile with CONFIG_HUGETLB_PAGE set after including linux/hugetlb.h? If
you did, including linux/hugetlb.h instead of importing just the
definition of PageHuge in mm/swap.c would be the right thing to do.

--
Khalid

>
> Christoph
>
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -31,6 +31,7 @@
> #include <linux/backing-dev.h>
> #include <linux/memcontrol.h>
> #include <linux/gfp.h>
> +#include <linux/hugetlb.h>
>
> #include "internal.h"
>
> @@ -41,8 +42,6 @@
> static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
> static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
>
> -int PageHuge(struct page *page);
> -
> /*
> * This path almost never happens for VM activity - pages are normally
> * freed via pagevecs. But it gets used by networking.
>

2013-10-03 14:41:14

by Christoph Biedl

Subject: Re: [ 00/13] 3.0.99-stable review

Khalid Aziz wrote...

> Thanks for tracking this down. I had not tried a configuration with
> CONFIG_HUGETLB_PAGE not set. In my config, I was getting many
> multiple definition errors for a bunch of other defines from
> linux/hugetlb.h. I will look at my config again but chances are I
> had something else screwed up in my build since you did not see
> those errors. Did you compile with CONFIG_HUGETLB_PAGE set after
> including linux/hugetlb.h? If you did, including linux/hugetlb.h
> instead of importing just the definition of PageHuge in mm/swap.c
> would be the right thing to do.

Yes, one of my configurations has CONFIG_HUGETLB_PAGE, also
CONFIG_NUMA=y, and the kernel built. Could not test it, though.

There still might be other configuration settings that caused the
error messages you've seen. Manually picking both PageHuge definitions
from linux/hugetlb.h should be a safe alternative then, but that's
ugly.

Christoph

2013-10-03 14:57:15

by Khalid Aziz

Subject: Re: [ 00/13] 3.0.99-stable review

On 10/03/2013 08:41 AM, Christoph Biedl wrote:
> Khalid Aziz wrote...
>
>> Thanks for tracking this down. I had not tried a configuration with
>> CONFIG_HUGETLB_PAGE not set. In my config, I was getting many
>> multiple definition errors for a bunch of other defines from
>> linux/hugetlb.h. I will look at my config again but chances are I
>> had something else screwed up in my build since you did not see
>> those errors. Did you compile with CONFIG_HUGETLB_PAGE set after
>> including linux/hugetlb.h? If you did, including linux/hugetlb.h
>> instead of importing just the definition of PageHuge in mm/swap.c
>> would be the right thing to do.
>
> Yes, one of my configurations has CONFIG_HUGETLB_PAGE, also
> CONFIG_NUMA=y, and the kernel built. Could not test it, though.
>
> There still might be other configuration settings that caused the
> error messages you've seen. Manually picking both PageHuge definitions
> from linux/hugetlb.h should be a safe alternative then, but that's
> ugly.
>
> Christoph
>

Including linux/hugetlb.h is the right thing to do here. I cleaned up my
build directories and started from scratch again. I tested with the old
config where I had seen errors and I did not see errors again. I must
have had something messed up in my old build directories.

Greg, please apply the patch Christoph had included in his earlier post.

Ben, this will apply to 3.2 as well.

Thanks,
Khalid

2013-10-03 15:13:03

by Khalid Aziz

Subject: Re: [ 00/13] 3.0.99-stable review

On 10/03/2013 08:56 AM, Khalid Aziz wrote:
>
> Greg, please apply the patch Christoph had included in his earlier post.
>
> Ben, this will apply to 3.2 as well.
>

Better yet, just pull this patch from stable for now. I will redo the
patch and send another one for the next round.

Thanks,
Khalid

2013-10-03 15:56:28

by Guenter Roeck

Subject: Re: [ 00/13] 3.0.99-stable review

On Thu, Oct 03, 2013 at 07:35:35AM -0600, Khalid Aziz wrote:
> On 10/03/2013 06:47 AM, Christoph Biedl wrote:
> >Guenter Roeck wrote...
> >
> >>On 10/02/2013 09:04 PM, Greg Kroah-Hartman wrote:
> >>>This is the start of the stable review cycle for the 3.0.99 release.
> >
> >>Heads up: I am getting lots of build failures in 3.0 and 3.4 builds.
> >>
> >>mm/built-in.o: In function `__put_compound_page':
> >>slab.c:(.text+0xaa3c): undefined reference to `PageHuge'
> >>mm/built-in.o: In function `put_compound_page':
> >>slab.c:(.text+0xaab0): undefined reference to `PageHuge'
> >>mm/built-in.o: In function `__get_page_tail':
> >>slab.c:(.text+0xb178): undefined reference to `PageHuge'
> >>make: *** [.tmp_vmlinux1] Error 1
> >
> >This is obviously due to
> >
> >| [ 11/13] mm: fix aio performance regression for database caused by THP
> >
> >and happens if CONFIG_HUGETLB_PAGE is not set.
> >
> >Looking closer, upstream commit 7cb2ef56 included linux/hugetlb.h
> >while the backport for 3.0 just defines PageHuge. Reverting that like
> >in the patch below causes the build to complete, and the resulting
> >kernel shows no anomalies here.
> >
> >Whoever did that backport, why was it done that way? Or did I miss an
> >important point?
>
> Thanks for tracking this down. I had not tried a configuration with
> CONFIG_HUGETLB_PAGE not set. In my config, I was getting many
> multiple definition errors for a bunch of other defines from
> linux/hugetlb.h. I will look at my config again but chances are I
> had something else screwed up in my build since you did not see
> those errors. Did you compile with CONFIG_HUGETLB_PAGE set after
> including linux/hugetlb.h? If you did, including linux/hugetlb.h
> instead of importing just the definition of PageHuge in mm/swap.c
> would be the right thing to do.
>

For my part, what I do is to compile lots of standard configurations.
I don't look into details. If you are interested, go to
http://server.roeck-us.net:8010/builders, click on any of the many
failed builds for 3.0 or 3.4, then click on "stdio" on the build page.
You'll see the build log, which also lists the names of the failed
configurations.

An easy start might be x86_64:allnoconfig or i386:allnoconfig,
both of which fail.

Thanks,
Guenter

> --
> Khalid
>
> >
> > Christoph
> >
> >--- a/mm/swap.c
> >+++ b/mm/swap.c
> >@@ -31,6 +31,7 @@
> > #include <linux/backing-dev.h>
> > #include <linux/memcontrol.h>
> > #include <linux/gfp.h>
> >+#include <linux/hugetlb.h>
> >
> > #include "internal.h"
> >
> >@@ -41,8 +42,6 @@
> > static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
> > static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
> >
> >-int PageHuge(struct page *page);
> >-
> > /*
> > * This path almost never happens for VM activity - pages are normally
> > * freed via pagevecs. But it gets used by networking.
> >
>
>

2013-10-03 18:34:39

by Greg Kroah-Hartman

Subject: Re: [ 00/13] 3.0.99-stable review

On Thu, Oct 03, 2013 at 09:12:47AM -0600, Khalid Aziz wrote:
> On 10/03/2013 08:56 AM, Khalid Aziz wrote:
> >
> > Greg, please apply the patch Christoph had included in his earlier post.
> >
> > Ben, this will apply to 3.2 as well.
> >
>
> Better yet, just pull this patch from stable for now. I will redo the
> patch and send another one for the next round.

Now removed from 3.0 and I'll drop it from 3.4 as well, thanks.

greg k-h

2013-10-03 18:36:11

by Greg Kroah-Hartman

Subject: Re: [ 00/13] 3.0.99-stable review

On Wed, Oct 02, 2013 at 09:04:25PM -0700, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.0.99 release.
> There are 13 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat Oct 5 04:03:47 UTC 2013.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.99-rc1.gz
> and the diffstat can be found below.

Due to build problems in -rc1, I've dropped one patch and now have
posted a -rc2 patch:

kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.99-rc2.gz

thanks,

greg k-h

2013-10-03 18:40:33

by Greg Kroah-Hartman

Subject: Re: [ 00/13] 3.0.99-stable review

On Wed, Oct 02, 2013 at 10:53:39PM -0700, Guenter Roeck wrote:
> On 10/02/2013 09:04 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.0.99 release.
> > There are 13 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sat Oct 5 04:03:47 UTC 2013.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.99-rc1.gz
> > and the diffstat can be found below.
> >
>
> Heads up: I am getting lots of build failures in 3.0 and 3.4 builds.
>
> mm/built-in.o: In function `__put_compound_page':
> slab.c:(.text+0xaa3c): undefined reference to `PageHuge'
> mm/built-in.o: In function `put_compound_page':
> slab.c:(.text+0xaab0): undefined reference to `PageHuge'
> mm/built-in.o: In function `__get_page_tail':
> slab.c:(.text+0xb178): undefined reference to `PageHuge'
> make: *** [.tmp_vmlinux1] Error 1
>
> More tomorrow.

Should now be fixed, if not, please let me know.

thanks,

greg k-h

2013-10-03 19:15:27

by Christoph Biedl

Subject: Re: [ 00/13] 3.0.99-stable review

Khalid Aziz wrote...

> Better yet, just pull this patch from stable for now. I will redo
> the patch and send another one for the next round.

FYI, after patching mm/swap.c accordingly, all the 3.0 and 3.4
configurations I use do build. Some boot tests will follow, I'll
follow up only if I see unusual behaviour.

Christoph

2013-10-03 20:03:52

by Khalid Aziz

Subject: Re: [ 00/13] 3.0.99-stable review

On 10/03/2013 01:15 PM, Christoph Biedl wrote:
> Khalid Aziz wrote...
>
>> Better yet, just pull this patch from stable for now. I will redo
>> the patch and send another one for the next round.
>
> FYI, after patching mm/swap.c accordingly, all the 3.0 and 3.4
> configurations I use do build. Some boot tests will follow, I'll
> follow up only if I see unusual behaviour.
>
> Christoph
>

Thanks for testing, Christoph. I will create a v2 of this patch with
this change for the next round of stable kernels.

--
Khalid

2013-10-03 21:18:28

by Guenter Roeck

Subject: Re: [ 00/13] 3.0.99-stable review

On Thu, Oct 03, 2013 at 11:40:48AM -0700, Greg Kroah-Hartman wrote:
> On Wed, Oct 02, 2013 at 10:53:39PM -0700, Guenter Roeck wrote:
> > On 10/02/2013 09:04 PM, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 3.0.99 release.
> > > There are 13 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Sat Oct 5 04:03:47 UTC 2013.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > > kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.99-rc1.gz
> > > and the diffstat can be found below.
> > >
> >
> > Heads up: I am getting lots of build failures in 3.0 and 3.4 builds.
> >
> > mm/built-in.o: In function `__put_compound_page':
> > slab.c:(.text+0xaa3c): undefined reference to `PageHuge'
> > mm/built-in.o: In function `put_compound_page':
> > slab.c:(.text+0xaab0): undefined reference to `PageHuge'
> > mm/built-in.o: In function `__get_page_tail':
> > slab.c:(.text+0xb178): undefined reference to `PageHuge'
> > make: *** [.tmp_vmlinux1] Error 1
> >
> > More tomorrow.
>
> Should now be fixed, if not, please let me know.
>
Yes, much better:
total: 98 pass: 71 skipped: 16 fail: 11
This matches the results from the previous release.

qemu tests also pass.

Details are at http://server.roeck-us.net:8010/builders, as usual.

Thanks,
Guenter

2013-10-04 00:16:26

by Shuah Khan

Subject: Re: [ 00/13] 3.0.99-stable review

On 10/02/2013 10:04 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.0.99 release.
> There are 13 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat Oct 5 04:03:47 UTC 2013.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.99-rc1.gz
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Patch testing: 3.0.99-rc1 patch applied with white-space warnings and
3.0.99-rc2 applied cleanly.

Tested 3.0.99-rc1 and 3.0.99-rc2
Compile testing: 3.0.99-rc1 and 3.0.99-rc2 Passed
Boot testing: 3.0.99-rc1 and 3.0.99-rc2 Passed
dmesg regression testing: passed. dmesgs look good. No regressions
compared to the previous dmesgs for this release. dmesg emerg, crit,
alert, err are clean. No regressions in warn.

Test systems

Samsung Series 9 900X4C Intel Core i5 (3.4 and later)
HP ProBook 6475b AMD A10-4600M APU with Radeon(tm) HD Graphics
HP Compaq dc7700 SFF desktop: x86-64 Intel Core-i2 (cross-compile
testing)

Cross-compile test results

alpha       defconfig           Passed
arm         defconfig           Passed
arm64       defconfig           Not applicable
blackfin    defconfig           Passed
c6x         defconfig           Not applicable
mips        defconfig           Passed
mipsel      defconfig           Passed
powerpc     wii_defconfig       Passed
sh          defconfig           Passed
sparc       defconfig           Passed
tile        tilegx_defconfig    Passed

-- Shuah

--
Shuah Khan
Senior Linux Kernel Developer - Open Source Group
Samsung Research America(Silicon Valley)
[email protected] | (970) 672-0658

2013-10-04 02:44:07

by Greg Kroah-Hartman

Subject: Re: [ 00/13] 3.0.99-stable review

On Thu, Oct 03, 2013 at 06:16:20PM -0600, Shuah Khan wrote:
> On 10/02/2013 10:04 PM, Greg Kroah-Hartman wrote:
> >This is the start of the stable review cycle for the 3.0.99 release.
> >There are 13 patches in this series, all will be posted as a response
> >to this one. If anyone has any issues with these being applied, please
> >let me know.
> >
> >Responses should be made by Sat Oct 5 04:03:47 UTC 2013.
> >Anything received after that time might be too late.
> >
> >The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.99-rc1.gz
> >and the diffstat can be found below.
> >
> >thanks,
> >
> >greg k-h
> >
>
> Patch testing: 3.0.99-rc1 patch applied with white-space warnings
> and 3.0.99-rc2 applied cleanly.
>
> Tested 3.0.99-rc1 and 3.0.99-rc2
> Compile testing: 3.0.99-rc1 and 3.0.99-rc2 Passed
> Boot testing: 3.0.99-rc1 and 3.0.99-rc2 Passed
> dmesg regression testing: passed. dmesgs look good. No regressions
> compared to the previous dmesgs for this release. dmesg emerg, crit,
> alert, err are clean. No regressions in warn.

Thanks for testing both of these, sorry for the mess with -rc2.

greg k-h