LinuxLists.cc - [ 00/11] 3.0.74-stable review

2013-04-15 02:17:49

Subject: [ 00/11] 3.0.74-stable review

This is the start of the stable review cycle for the 3.0.74 release.
There are 11 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Apr 17 02:05:34 UTC 2013.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.74-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 3.0.74-rc1

Hayes Wang <[email protected]>
r8169: fix auto speed down issue

Linus Torvalds <[email protected]>
mtdchar: fix offset overflow detection

Boris Ostrovsky <[email protected]>
x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal

Samu Kallio <[email protected]>
x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates

Thomas Gleixner <[email protected]>
sched_clock: Prevent 64bit inatomicity on 32bit systems

Nicholas Bellinger <[email protected]>
target: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBs

Huacai Chen <[email protected]>
PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()

Namhyung Kim <[email protected]>
tracing: Fix double free when function profile init failed

Alban Bedel <[email protected]>
ASoC: wm8903: Fix the bypass to HP/LINEOUT when no DAC or ADC is running

Dave Hansen <[email protected]>
x86-32, mm: Rip out x86_32 NUMA remapping code

Eldad Zack <[email protected]>
ALSA: usb-audio: fix endianness bug in snd_nativeinstruments_*

-------------

Diffstat:

Makefile | 4 +-
arch/x86/include/asm/paravirt.h | 5 +-
arch/x86/include/asm/paravirt_types.h | 2 +
arch/x86/kernel/paravirt.c | 25 +++---
arch/x86/lguest/boot.c | 1 +
arch/x86/mm/fault.c | 6 +-
arch/x86/mm/numa_32.c | 161 ----------------------------------
arch/x86/xen/mmu.c | 1 +
drivers/mtd/mtdchar.c | 48 ++++++++--
drivers/net/r8169.c | 30 ++++++-
drivers/target/target_core_alua.c | 3 +
kernel/sched_clock.c | 26 ++++++
kernel/sys.c | 3 +-
kernel/trace/ftrace.c | 1 -
sound/soc/codecs/wm8903.c | 2 +
sound/usb/mixer_quirks.c | 4 +-
sound/usb/quirks.c | 2 +-
17 files changed, 131 insertions(+), 193 deletions(-)

2013-04-15 02:17:59

by Greg KH

[permalink] [raw]

Subject: [ 03/11] ASoC: wm8903: Fix the bypass to HP/LINEOUT when no DAC or ADC is running

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alban Bedel <[email protected]>

commit f1ca493b0b5e8f42d3b2dc8877860db2983f47b6 upstream.

The Charge Pump needs the DSP clock to work properly, without it the
bypass to HP/LINEOUT is not working properly. This requirement is not
mentioned in the datasheet but has been confirmed by Mark Brown from
Wolfson.

Signed-off-by: Alban Bedel <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/soc/codecs/wm8903.c | 2 ++
1 file changed, 2 insertions(+)

--- a/sound/soc/codecs/wm8903.c
+++ b/sound/soc/codecs/wm8903.c
@@ -1101,6 +1101,8 @@ static const struct snd_soc_dapm_route w
{ "ROP", NULL, "Right Speaker PGA" },
{ "RON", NULL, "Right Speaker PGA" },

+ { "Charge Pump", NULL, "CLK_DSP" },
+
{ "Left Headphone Output PGA", NULL, "Charge Pump" },
{ "Right Headphone Output PGA", NULL, "Charge Pump" },
{ "Left Line Output PGA", NULL, "Charge Pump" },

2013-04-15 02:18:19

by Greg KH

[permalink] [raw]

Subject: [ 02/11] x86-32, mm: Rip out x86_32 NUMA remapping code

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Hansen <[email protected]>

commit f03574f2d5b2d6229dcdf2d322848065f72953c7 upstream.

[was already included in 3.0, but I missed the patch hunk for
arch/x86/mm/numa_32.c - gregkh]

This code was an optimization for 32-bit NUMA systems.

It has probably been the cause of a number of subtle bugs over
the years, although the conditions to excite them would have
been hard to trigger. Essentially, we remap part of the kernel
linear mapping area, and then sometimes part of that area gets
freed back in to the bootmem allocator. If those pages get
used by kernel data structures (say mem_map[] or a dentry),
there's no big deal. But, if anyone ever tried to use the
linear mapping for these pages _and_ cared about their physical
address, bad things happen.

For instance, say you passed __GFP_ZERO to the page allocator
and then happened to get handed one of these pages, it zero the
remapped page, but it would make a pte to the _old_ page.
There are probably a hundred other ways that it could screw
with things.

We don't need to hang on to performance optimizations for
these old boxes any more. All my 32-bit NUMA systems are long
dead and buried, and I probably had access to more than most
people.

This code is causing real things to break today:

https://lkml.org/lkml/2013/1/9/376

I looked in to actually fixing this, but it requires surgery
to way too much brittle code, as well as stuff like
per_cpu_ptr_to_phys().

[ hpa: Cc: this for -stable, since it is a memory corruption issue.
However, an alternative is to simply mark NUMA as depends BROKEN
rather than EXPERIMENTAL in the X86_32 subclause... ]

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Jiri Slaby <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/mm/numa_32.c | 161 --------------------------------------------------
1 file changed, 161 deletions(-)

--- a/arch/x86/mm/numa_32.c
+++ b/arch/x86/mm/numa_32.c
@@ -73,167 +73,6 @@ unsigned long node_memmap_size_bytes(int

extern unsigned long highend_pfn, highstart_pfn;

-#define LARGE_PAGE_BYTES (PTRS_PER_PTE * PAGE_SIZE)
-
-static void *node_remap_start_vaddr[MAX_NUMNODES];
-void set_pmd_pfn(unsigned long vaddr, unsigned long pfn, pgprot_t flags);
-
-/*
- * Remap memory allocator
- */
-static unsigned long node_remap_start_pfn[MAX_NUMNODES];
-static void *node_remap_end_vaddr[MAX_NUMNODES];
-static void *node_remap_alloc_vaddr[MAX_NUMNODES];
-
-/**
- * alloc_remap - Allocate remapped memory
- * @nid: NUMA node to allocate memory from
- * @size: The size of allocation
- *
- * Allocate @size bytes from the remap area of NUMA node @nid. The
- * size of the remap area is predetermined by init_alloc_remap() and
- * only the callers considered there should call this function. For
- * more info, please read the comment on top of init_alloc_remap().
- *
- * The caller must be ready to handle allocation failure from this
- * function and fall back to regular memory allocator in such cases.
- *
- * CONTEXT:
- * Single CPU early boot context.
- *
- * RETURNS:
- * Pointer to the allocated memory on success, %NULL on failure.
- */
-void *alloc_remap(int nid, unsigned long size)
-{
- void *allocation = node_remap_alloc_vaddr[nid];
-
- size = ALIGN(size, L1_CACHE_BYTES);
-
- if (!allocation || (allocation + size) > node_remap_end_vaddr[nid])
- return NULL;
-
- node_remap_alloc_vaddr[nid] += size;
- memset(allocation, 0, size);
-
- return allocation;
-}
-
-#ifdef CONFIG_HIBERNATION
-/**
- * resume_map_numa_kva - add KVA mapping to the temporary page tables created
- * during resume from hibernation
- * @pgd_base - temporary resume page directory
- */
-void resume_map_numa_kva(pgd_t *pgd_base)
-{
- int node;
-
- for_each_online_node(node) {
- unsigned long start_va, start_pfn, nr_pages, pfn;
-
- start_va = (unsigned long)node_remap_start_vaddr[node];
- start_pfn = node_remap_start_pfn[node];
- nr_pages = (node_remap_end_vaddr[node] -
- node_remap_start_vaddr[node]) >> PAGE_SHIFT;
-
- printk(KERN_DEBUG "%s: node %d\n", __func__, node);
-
- for (pfn = 0; pfn < nr_pages; pfn += PTRS_PER_PTE) {
- unsigned long vaddr = start_va + (pfn << PAGE_SHIFT);
- pgd_t *pgd = pgd_base + pgd_index(vaddr);
- pud_t *pud = pud_offset(pgd, vaddr);
- pmd_t *pmd = pmd_offset(pud, vaddr);
-
- set_pmd(pmd, pfn_pmd(start_pfn + pfn,
- PAGE_KERNEL_LARGE_EXEC));
-
- printk(KERN_DEBUG "%s: %08lx -> pfn %08lx\n",
- __func__, vaddr, start_pfn + pfn);
- }
- }
-}
-#endif
-
-/**
- * init_alloc_remap - Initialize remap allocator for a NUMA node
- * @nid: NUMA node to initizlie remap allocator for
- *
- * NUMA nodes may end up without any lowmem. As allocating pgdat and
- * memmap on a different node with lowmem is inefficient, a special
- * remap allocator is implemented which can be used by alloc_remap().
- *
- * For each node, the amount of memory which will be necessary for
- * pgdat and memmap is calculated and two memory areas of the size are
- * allocated - one in the node and the other in lowmem; then, the area
- * in the node is remapped to the lowmem area.
- *
- * As pgdat and memmap must be allocated in lowmem anyway, this
- * doesn't waste lowmem address space; however, the actual lowmem
- * which gets remapped over is wasted. The amount shouldn't be
- * problematic on machines this feature will be used.
- *
- * Initialization failure isn't fatal. alloc_remap() is used
- * opportunistically and the callers will fall back to other memory
- * allocation mechanisms on failure.
- */
-void __init init_alloc_remap(int nid, u64 start, u64 end)
-{
- unsigned long start_pfn = start >> PAGE_SHIFT;
- unsigned long end_pfn = end >> PAGE_SHIFT;
- unsigned long size, pfn;
- u64 node_pa, remap_pa;
- void *remap_va;
-
- /*
- * The acpi/srat node info can show hot-add memroy zones where
- * memory could be added but not currently present.
- */
- printk(KERN_DEBUG "node %d pfn: [%lx - %lx]\n",
- nid, start_pfn, end_pfn);
-
- /* calculate the necessary space aligned to large page size */
- size = node_memmap_size_bytes(nid, start_pfn, end_pfn);
- size += ALIGN(sizeof(pg_data_t), PAGE_SIZE);
- size = ALIGN(size, LARGE_PAGE_BYTES);
-
- /* allocate node memory and the lowmem remap area */
- node_pa = memblock_find_in_range(start, end, size, LARGE_PAGE_BYTES);
- if (node_pa == MEMBLOCK_ERROR) {
- pr_warning("remap_alloc: failed to allocate %lu bytes for node %d\n",
- size, nid);
- return;
- }
- memblock_x86_reserve_range(node_pa, node_pa + size, "KVA RAM");
-
- remap_pa = memblock_find_in_range(min_low_pfn << PAGE_SHIFT,
- max_low_pfn << PAGE_SHIFT,
- size, LARGE_PAGE_BYTES);
- if (remap_pa == MEMBLOCK_ERROR) {
- pr_warning("remap_alloc: failed to allocate %lu bytes remap area for node %d\n",
- size, nid);
- memblock_x86_free_range(node_pa, node_pa + size);
- return;
- }
- memblock_x86_reserve_range(remap_pa, remap_pa + size, "KVA PG");
- remap_va = phys_to_virt(remap_pa);
-
- /* perform actual remap */
- for (pfn = 0; pfn < size >> PAGE_SHIFT; pfn += PTRS_PER_PTE)
- set_pmd_pfn((unsigned long)remap_va + (pfn << PAGE_SHIFT),
- (node_pa >> PAGE_SHIFT) + pfn,
- PAGE_KERNEL_LARGE);
-
- /* initialize remap allocator parameters */
- node_remap_start_pfn[nid] = node_pa >> PAGE_SHIFT;
- node_remap_start_vaddr[nid] = remap_va;
- node_remap_end_vaddr[nid] = remap_va + size;
- node_remap_alloc_vaddr[nid] = remap_va;
-
- printk(KERN_DEBUG "remap_alloc: node %d [%08llx-%08llx) -> [%p-%p)\n",
- nid, node_pa, node_pa + size, remap_va, remap_va + size);
-}
-
void __init initmem_init(void)
{
x86_numa_init();

2013-04-15 02:18:11

by Greg KH

[permalink] [raw]

Subject: [ 08/11] x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Samu Kallio <[email protected]>

commit 1160c2779b826c6f5c08e5cc542de58fd1f667d5 upstream.

In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops
when lazy MMU updates are enabled, because set_pgd effects are being
deferred.

One instance of this problem is during process mm cleanup with memory
cgroups enabled. The chain of events is as follows:

- zap_pte_range enables lazy MMU updates
- zap_pte_range eventually calls mem_cgroup_charge_statistics,
which accesses the vmalloc'd mem_cgroup per-cpu stat area
- vmalloc_fault is triggered which tries to sync the corresponding
PGD entry with set_pgd, but the update is deferred
- vmalloc_fault oopses due to a mismatch in the PUD entries

The OOPs usually looks as so:

------------[ cut here ]------------
kernel BUG at arch/x86/mm/fault.c:396!
invalid opcode: 0000 [#1] SMP
.. snip ..
CPU 1
Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1
RIP: e030:[<ffffffff816271bf>] [<ffffffff816271bf>] vmalloc_fault+0x11f/0x208
.. snip ..
Call Trace:
[<ffffffff81627759>] do_page_fault+0x399/0x4b0
[<ffffffff81004f4c>] ? xen_mc_extend_args+0xec/0x110
[<ffffffff81624065>] page_fault+0x25/0x30
[<ffffffff81184d03>] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50
[<ffffffff81186f78>] __mem_cgroup_uncharge_common+0xd8/0x350
[<ffffffff8118aac7>] mem_cgroup_uncharge_page+0x57/0x60
[<ffffffff8115fbc0>] page_remove_rmap+0xe0/0x150
[<ffffffff8115311a>] ? vm_normal_page+0x1a/0x80
[<ffffffff81153e61>] unmap_single_vma+0x531/0x870
[<ffffffff81154962>] unmap_vmas+0x52/0xa0
[<ffffffff81007442>] ? pte_mfn_to_pfn+0x72/0x100
[<ffffffff8115c8f8>] exit_mmap+0x98/0x170
[<ffffffff810050d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81059ce3>] mmput+0x83/0xf0
[<ffffffff810624c4>] exit_mm+0x104/0x130
[<ffffffff8106264a>] do_exit+0x15a/0x8c0
[<ffffffff810630ff>] do_group_exit+0x3f/0xa0
[<ffffffff81063177>] sys_exit_group+0x17/0x20
[<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b

Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the
changes visible to the consistency checks.

RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737
Tested-by: Josh Boyer <[email protected]>
Reported-and-Tested-by: Krishna Raman <[email protected]>
Signed-off-by: Samu Kallio <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Tested-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/mm/fault.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -376,10 +376,12 @@ static noinline __kprobes int vmalloc_fa
if (pgd_none(*pgd_ref))
return -1;

- if (pgd_none(*pgd))
+ if (pgd_none(*pgd)) {
set_pgd(pgd, *pgd_ref);
- else
+ arch_flush_lazy_mmu_mode();
+ } else {
BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
+ }

/*
* Below here mismatches are bugs because these lower tables

2013-04-15 02:18:15

by Greg KH

[permalink] [raw]

Subject: [ 10/11] mtdchar: fix offset overflow detection

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <[email protected]>

commit 9c603e53d380459fb62fec7cd085acb0b74ac18f upstream.

Sasha Levin has been running trinity in a KVM tools guest, and was able
to trigger the BUG_ON() at arch/x86/mm/pat.c:279 (verifying the range of
the memory type). The call trace showed that it was mtdchar_mmap() that
created an invalid remap_pfn_range().

The problem is that mtdchar_mmap() does various really odd and subtle
things with the vma page offset etc, and uses the wrong types (and the
wrong overflow) detection for it.

For example, the page offset may well be 32-bit on a 32-bit
architecture, but after shifting it up by PAGE_SHIFT, we need to use a
potentially 64-bit resource_size_t to correctly hold the full value.

Also, we need to check that the vma length plus offset doesn't overflow
before we check that it is smaller than the length of the mtdmap region.

This fixes things up and tries to make the code a bit easier to read.

Reported-and-tested-by: Sasha Levin <[email protected]>
Acked-by: Suresh Siddha <[email protected]>
Acked-by: Artem Bityutskiy <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: Brad Spengler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/mtd/mtdchar.c | 48 ++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 42 insertions(+), 6 deletions(-)

--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -1064,6 +1064,33 @@ static unsigned long mtd_get_unmapped_ar
}
#endif

+static inline unsigned long get_vm_size(struct vm_area_struct *vma)
+{
+ return vma->vm_end - vma->vm_start;
+}
+
+static inline resource_size_t get_vm_offset(struct vm_area_struct *vma)
+{
+ return (resource_size_t) vma->vm_pgoff << PAGE_SHIFT;
+}
+
+/*
+ * Set a new vm offset.
+ *
+ * Verify that the incoming offset really works as a page offset,
+ * and that the offset and size fit in a resource_size_t.
+ */
+static inline int set_vm_offset(struct vm_area_struct *vma, resource_size_t off)
+{
+ pgoff_t pgoff = off >> PAGE_SHIFT;
+ if (off != (resource_size_t) pgoff << PAGE_SHIFT)
+ return -EINVAL;
+ if (off + get_vm_size(vma) - 1 < off)
+ return -EINVAL;
+ vma->vm_pgoff = pgoff;
+ return 0;
+}
+
/*
* set up a mapping for shared memory segments
*/
@@ -1073,20 +1100,29 @@ static int mtd_mmap(struct file *file, s
struct mtd_file_info *mfi = file->private_data;
struct mtd_info *mtd = mfi->mtd;
struct map_info *map = mtd->priv;
- unsigned long start;
- unsigned long off;
- u32 len;
+ resource_size_t start, off;
+ unsigned long len, vma_len;

if (mtd->type == MTD_RAM || mtd->type == MTD_ROM) {
- off = vma->vm_pgoff << PAGE_SHIFT;
+ off = get_vm_offset(vma);
start = map->phys;
len = PAGE_ALIGN((start & ~PAGE_MASK) + map->size);
start &= PAGE_MASK;
- if ((vma->vm_end - vma->vm_start + off) > len)
+ vma_len = get_vm_size(vma);
+
+ /* Overflow in off+len? */
+ if (vma_len + off < off)
+ return -EINVAL;
+ /* Does it fit in the mapping? */
+ if (vma_len + off > len)
return -EINVAL;

off += start;
- vma->vm_pgoff = off >> PAGE_SHIFT;
+ /* Did that overflow? */
+ if (off < start)
+ return -EINVAL;
+ if (set_vm_offset(vma, off) < 0)
+ return -EINVAL;
vma->vm_flags |= VM_IO | VM_RESERVED;

#ifdef pgprot_noncached

2013-04-15 02:18:03

by Greg KH

[permalink] [raw]

Subject: [ 05/11] PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Huacai Chen <[email protected]>

commit 6f389a8f1dd22a24f3d9afc2812b30d639e94625 upstream.

As commit 40dc166c (PM / Core: Introduce struct syscore_ops for core
subsystems PM) say, syscore_ops operations should be carried with one
CPU on-line and interrupts disabled. However, after commit f96972f2d
(kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()),
syscore_shutdown() is called before disable_nonboot_cpus(), so break
the rules. We have a MIPS machine with a 8259A PIC, and there is an
external timer (HPET) linked at 8259A. Since 8259A has been shutdown
too early (by syscore_shutdown()), disable_nonboot_cpus() runs without
timer interrupt, so it hangs and reboot fails. This patch call
syscore_shutdown() a little later (after disable_nonboot_cpus()) to
avoid reboot failure, this is the same way as poweroff does.

For consistency, add disable_nonboot_cpus() to kernel_halt().

Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/sys.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -320,7 +320,6 @@ void kernel_restart_prepare(char *cmd)
system_state = SYSTEM_RESTART;
usermodehelper_disable();
device_shutdown();
- syscore_shutdown();
}

/**
@@ -335,6 +334,7 @@ void kernel_restart(char *cmd)
{
kernel_restart_prepare(cmd);
disable_nonboot_cpus();
+ syscore_shutdown();
if (!cmd)
printk(KERN_EMERG "Restarting system.\n");
else
@@ -360,6 +360,7 @@ static void kernel_shutdown_prepare(enum
void kernel_halt(void)
{
kernel_shutdown_prepare(SYSTEM_HALT);
+ disable_nonboot_cpus();
syscore_shutdown();
printk(KERN_EMERG "System halted.\n");
kmsg_dump(KMSG_DUMP_HALT);

2013-04-15 02:18:46

by Greg KH

[permalink] [raw]

Subject: [ 11/11] r8169: fix auto speed down issue

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Hayes Wang <[email protected]>

commit e2409d83434d77874b461b78af6a19cd6e6a1280 upstream.

It would cause no link after suspending or shutdowning when the
nic changes the speed to 10M and connects to a link partner which
forces the speed to 100M.

Check the link partner ability to determine which speed to set.

The link speed down code path is not factored in this kernel version.

Signed-off-by: Hayes Wang <[email protected]>
Acked-by: Francois Romieu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/r8169.c | 30 ++++++++++++++++++++++++++----
1 file changed, 26 insertions(+), 4 deletions(-)

--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -3105,11 +3105,34 @@ static void r810x_phy_power_up(struct rt
rtl_writephy(tp, MII_BMCR, BMCR_ANENABLE);
}

+static void rtl_speed_down(struct rtl8169_private *tp)
+{
+ u32 adv;
+ int lpa;
+
+ rtl_writephy(tp, 0x1f, 0x0000);
+ lpa = rtl_readphy(tp, MII_LPA);
+
+ if (lpa & (LPA_10HALF | LPA_10FULL))
+ adv = ADVERTISED_10baseT_Half | ADVERTISED_10baseT_Full;
+ else if (lpa & (LPA_100HALF | LPA_100FULL))
+ adv = ADVERTISED_10baseT_Half | ADVERTISED_10baseT_Full |
+ ADVERTISED_100baseT_Half | ADVERTISED_100baseT_Full;
+ else
+ adv = ADVERTISED_10baseT_Half | ADVERTISED_10baseT_Full |
+ ADVERTISED_100baseT_Half | ADVERTISED_100baseT_Full |
+ (tp->mii.supports_gmii ?
+ ADVERTISED_1000baseT_Half |
+ ADVERTISED_1000baseT_Full : 0);
+
+ rtl8169_set_speed(tp->dev, AUTONEG_ENABLE, SPEED_1000, DUPLEX_FULL,
+ adv);
+}
+
static void r810x_pll_power_down(struct rtl8169_private *tp)
{
if (__rtl8169_get_wol(tp) & WAKE_ANY) {
- rtl_writephy(tp, 0x1f, 0x0000);
- rtl_writephy(tp, MII_BMCR, 0x0000);
+ rtl_speed_down(tp);
return;
}

@@ -3201,8 +3224,7 @@ static void r8168_pll_power_down(struct
rtl_ephy_write(ioaddr, 0x19, 0xff64);

if (__rtl8169_get_wol(tp) & WAKE_ANY) {
- rtl_writephy(tp, 0x1f, 0x0000);
- rtl_writephy(tp, MII_BMCR, 0x0000);
+ rtl_speed_down(tp);

if (tp->mac_version == RTL_GIGA_MAC_VER_32 ||
tp->mac_version == RTL_GIGA_MAC_VER_33)

2013-04-15 02:18:08

by Greg KH

[permalink] [raw]

Subject: [ 07/11] sched_clock: Prevent 64bit inatomicity on 32bit systems

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <[email protected]>

commit a1cbcaa9ea87b87a96b9fc465951dcf36e459ca2 upstream.

The sched_clock_remote() implementation has the following inatomicity
problem on 32bit systems when accessing the remote scd->clock, which
is a 64bit value.

CPU0 CPU1

sched_clock_local() sched_clock_remote(CPU0)
...
remote_clock = scd[CPU0]->clock
read_low32bit(scd[CPU0]->clock)
cmpxchg64(scd->clock,...)
read_high32bit(scd[CPU0]->clock)

While the update of scd->clock is using an atomic64 mechanism, the
readout on the remote cpu is not, which can cause completely bogus
readouts.

It is a quite rare problem, because it requires the update to hit the
narrow race window between the low/high readout and the update must go
across the 32bit boundary.

The resulting misbehaviour is, that CPU1 will see the sched_clock on
CPU1 ~4 seconds ahead of it's own and update CPU1s sched_clock value
to this bogus timestamp. This stays that way due to the clamping
implementation for about 4 seconds until the synchronization with
CLOCK_MONOTONIC undoes the problem.

The issue is hard to observe, because it might only result in a less
accurate SCHED_OTHER timeslicing behaviour. To create observable
damage on realtime scheduling classes, it is necessary that the bogus
update of CPU1 sched_clock happens in the context of an realtime
thread, which then gets charged 4 seconds of RT runtime, which results
in the RT throttler mechanism to trigger and prevent scheduling of RT
tasks for a little less than 4 seconds. So this is quite unlikely as
well.

The issue was quite hard to decode as the reproduction time is between
2 days and 3 weeks and intrusive tracing makes it less likely, but the
following trace recorded with trace_clock=global, which uses
sched_clock_local(), gave the final hint:

<idle>-0 0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80
<idle>-0 0d..30 400269.477151: hrtimer_start: hrtimer=0xf7061e80 ...
irq/20-S-587 1d..32 400273.772118: sched_wakeup: comm= ... target_cpu=0
<idle>-0 0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80

What happens is that CPU0 goes idle and invokes
sched_clock_idle_sleep_event() which invokes sched_clock_local() and
CPU1 runs a remote wakeup for CPU0 at the same time, which invokes
sched_remote_clock(). The time jump gets propagated to CPU0 via
sched_remote_clock() and stays stale on both cores for ~4 seconds.

There are only two other possibilities, which could cause a stale
sched clock:

1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic
wrong value.

2) sched_clock() which reads the TSC returns a sporadic wrong value.

#1 can be excluded because sched_clock would continue to increase for
one jiffy and then go stale.

#2 can be excluded because it would not make the clock jump
forward. It would just result in a stale sched_clock for one jiffy.

After quite some brain twisting and finding the same pattern on other
traces, sched_clock_remote() remained the only place which could cause
such a problem and as explained above it's indeed racy on 32bit
systems.

So while on 64bit systems the readout is atomic, we need to verify the
remote readout on 32bit machines. We need to protect the local->clock
readout in sched_clock_remote() on 32bit as well because an NMI could
hit between the low and the high readout, call sched_clock_local() and
modify local->clock.

Thanks to Siegfried Wulsch for bearing with my debug requests and
going through the tedious tasks of running a bunch of reproducer
systems to generate the debug information which let me decode the
issue.

Reported-by: Siegfried Wulsch <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304051544160.21884@ionos
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/sched_clock.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

--- a/kernel/sched_clock.c
+++ b/kernel/sched_clock.c
@@ -176,10 +176,36 @@ static u64 sched_clock_remote(struct sch
u64 this_clock, remote_clock;
u64 *ptr, old_val, val;

+#if BITS_PER_LONG != 64
+again:
+ /*
+ * Careful here: The local and the remote clock values need to
+ * be read out atomic as we need to compare the values and
+ * then update either the local or the remote side. So the
+ * cmpxchg64 below only protects one readout.
+ *
+ * We must reread via sched_clock_local() in the retry case on
+ * 32bit as an NMI could use sched_clock_local() via the
+ * tracer and hit between the readout of
+ * the low32bit and the high 32bit portion.
+ */
+ this_clock = sched_clock_local(my_scd);
+ /*
+ * We must enforce atomic readout on 32bit, otherwise the
+ * update on the remote cpu can hit inbetween the readout of
+ * the low32bit and the high 32bit portion.
+ */
+ remote_clock = cmpxchg64(&scd->clock, 0, 0);
+#else
+ /*
+ * On 64bit the read of [my]scd->clock is atomic versus the
+ * update, so we can avoid the above 32bit dance.
+ */
sched_clock_local(my_scd);
again:
this_clock = my_scd->clock;
remote_clock = scd->clock;
+#endif

/*
* Use the opportunity that we have both locks

2013-04-15 02:19:14

by Greg KH

[permalink] [raw]

Subject: [ 09/11] x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Boris Ostrovsky <[email protected]>

commit 511ba86e1d386f671084b5d0e6f110bb30b8eeb2 upstream.

Invoking arch_flush_lazy_mmu_mode() results in calls to
preempt_enable()/disable() which may have performance impact.

Since lazy MMU is not used on bare metal we can patch away
arch_flush_lazy_mmu_mode() so that it is never called in such
environment.

[ hpa: the previous patch "Fix vmalloc_fault oops during lazy MMU
updates" may cause a minor performance regression on
bare metal. This patch resolves that performance regression. It is
somewhat unclear to me if this is a good -stable candidate. ]

Signed-off-by: Boris Ostrovsky <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Tested-by: Josh Boyer <[email protected]>
Tested-by: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/paravirt.h | 5 ++++-
arch/x86/include/asm/paravirt_types.h | 2 ++
arch/x86/kernel/paravirt.c | 25 +++++++++++++------------
arch/x86/lguest/boot.c | 1 +
arch/x86/xen/mmu.c | 1 +
5 files changed, 21 insertions(+), 13 deletions(-)

--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -731,7 +731,10 @@ static inline void arch_leave_lazy_mmu_m
PVOP_VCALL0(pv_mmu_ops.lazy_mode.leave);
}

-void arch_flush_lazy_mmu_mode(void);
+static inline void arch_flush_lazy_mmu_mode(void)
+{
+ PVOP_VCALL0(pv_mmu_ops.lazy_mode.flush);
+}

static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
phys_addr_t phys, pgprot_t flags)
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -85,6 +85,7 @@ struct pv_lazy_ops {
/* Set deferred update mode, used for batching operations. */
void (*enter)(void);
void (*leave)(void);
+ void (*flush)(void);
};

struct pv_time_ops {
@@ -673,6 +674,7 @@ void paravirt_end_context_switch(struct

void paravirt_enter_lazy_mmu(void);
void paravirt_leave_lazy_mmu(void);
+void paravirt_flush_lazy_mmu(void);

void _paravirt_nop(void);
u32 _paravirt_ident_32(u32);
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -253,6 +253,18 @@ void paravirt_leave_lazy_mmu(void)
leave_lazy(PARAVIRT_LAZY_MMU);
}

+void paravirt_flush_lazy_mmu(void)
+{
+ preempt_disable();
+
+ if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU) {
+ arch_leave_lazy_mmu_mode();
+ arch_enter_lazy_mmu_mode();
+ }
+
+ preempt_enable();
+}
+
void paravirt_start_context_switch(struct task_struct *prev)
{
BUG_ON(preemptible());
@@ -282,18 +294,6 @@ enum paravirt_lazy_mode paravirt_get_laz
return percpu_read(paravirt_lazy_mode);
}

-void arch_flush_lazy_mmu_mode(void)
-{
- preempt_disable();
-
- if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_MMU) {
- arch_leave_lazy_mmu_mode();
- arch_enter_lazy_mmu_mode();
- }
-
- preempt_enable();
-}
-
struct pv_info pv_info = {
.name = "bare hardware",
.paravirt_enabled = 0,
@@ -462,6 +462,7 @@ struct pv_mmu_ops pv_mmu_ops = {
.lazy_mode = {
.enter = paravirt_nop,
.leave = paravirt_nop,
+ .flush = paravirt_nop,
},

.set_fixmap = native_set_fixmap,
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -1309,6 +1309,7 @@ __init void lguest_init(void)
pv_mmu_ops.read_cr3 = lguest_read_cr3;
pv_mmu_ops.lazy_mode.enter = paravirt_enter_lazy_mmu;
pv_mmu_ops.lazy_mode.leave = lguest_leave_lazy_mmu_mode;
+ pv_mmu_ops.lazy_mode.flush = paravirt_flush_lazy_mmu;
pv_mmu_ops.pte_update = lguest_pte_update;
pv_mmu_ops.pte_update_defer = lguest_pte_update;

--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2011,6 +2011,7 @@ static const struct pv_mmu_ops xen_mmu_o
.lazy_mode = {
.enter = paravirt_enter_lazy_mmu,
.leave = xen_leave_lazy_mmu,
+ .flush = paravirt_flush_lazy_mmu,
},

.set_fixmap = xen_set_fixmap,

2013-04-15 02:19:51

by Greg KH

[permalink] [raw]

Subject: [ 06/11] target: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBs

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nicholas Bellinger <[email protected]>

commit 30f359a6f9da65a66de8cadf959f0f4a0d498bba upstream.

This patch fixes a bug where a handful of informational / control CDBs
that should be allowed during ALUA access state Standby/Offline/Transition
where incorrectly returning CHECK_CONDITION + ASCQ_04H_ALUA_TG_PT_*.

This includes INQUIRY + REPORT_LUNS, which would end up preventing LUN
registration when LUN scanning occured during these ALUA access states.

Signed-off-by: Nicholas Bellinger <[email protected]>
Cc: Hannes Reinecke <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/target/target_core_alua.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/target/target_core_alua.c
+++ b/drivers/target/target_core_alua.c
@@ -351,6 +351,7 @@ static inline int core_alua_state_standb
case REPORT_LUNS:
case RECEIVE_DIAGNOSTIC:
case SEND_DIAGNOSTIC:
+ return 0;
case MAINTENANCE_IN:
switch (cdb[1]) {
case MI_REPORT_TARGET_PGS:
@@ -393,6 +394,7 @@ static inline int core_alua_state_unavai
switch (cdb[0]) {
case INQUIRY:
case REPORT_LUNS:
+ return 0;
case MAINTENANCE_IN:
switch (cdb[1]) {
case MI_REPORT_TARGET_PGS:
@@ -433,6 +435,7 @@ static inline int core_alua_state_transi
switch (cdb[0]) {
case INQUIRY:
case REPORT_LUNS:
+ return 0;
case MAINTENANCE_IN:
switch (cdb[1]) {
case MI_REPORT_TARGET_PGS:

2013-04-15 02:20:10

by Greg KH

[permalink] [raw]

Subject: [ 04/11] tracing: Fix double free when function profile init failed

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Namhyung Kim <[email protected]>

commit 83e03b3fe4daffdebbb42151d5410d730ae50bd1 upstream.

On the failure path, stat->start and stat->pages will refer same page.
So it'll attempt to free the same page again and get kernel panic.

Link: http://lkml.kernel.org/r/[email protected]

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/trace/ftrace.c | 1 -
1 file changed, 1 deletion(-)

--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -566,7 +566,6 @@ int ftrace_profile_pages_init(struct ftr
free_page(tmp);
}

- free_page((unsigned long)stat->pages);
stat->pages = NULL;
stat->start = NULL;

2013-04-15 02:17:56

by Greg KH

[permalink] [raw]

Subject: [ 01/11] ALSA: usb-audio: fix endianness bug in snd_nativeinstruments_*

3.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eldad Zack <[email protected]>

commit 889d66848b12d891248b03abcb2a42047f8e172a upstream.

The usb_control_msg() function expects __u16 types and performs
the endianness conversions by itself.
However, in three places, a conversion is performed before it is
handed over to usb_control_msg(), which leads to a double conversion
(= no conversion):
* snd_usb_nativeinstruments_boot_quirk()
* snd_nativeinstruments_control_get()
* snd_nativeinstruments_control_put()

Caught by sparse:

sound/usb/mixer_quirks.c:512:38: warning: incorrect type in argument 6 (different base types)
sound/usb/mixer_quirks.c:512:38: expected unsigned short [unsigned] [usertype] index
sound/usb/mixer_quirks.c:512:38: got restricted __le16 [usertype] <noident>
sound/usb/mixer_quirks.c:543:35: warning: incorrect type in argument 5 (different base types)
sound/usb/mixer_quirks.c:543:35: expected unsigned short [unsigned] [usertype] value
sound/usb/mixer_quirks.c:543:35: got restricted __le16 [usertype] <noident>
sound/usb/mixer_quirks.c:543:56: warning: incorrect type in argument 6 (different base types)
sound/usb/mixer_quirks.c:543:56: expected unsigned short [unsigned] [usertype] index
sound/usb/mixer_quirks.c:543:56: got restricted __le16 [usertype] <noident>
sound/usb/quirks.c:502:35: warning: incorrect type in argument 5 (different base types)
sound/usb/quirks.c:502:35: expected unsigned short [unsigned] [usertype] value
sound/usb/quirks.c:502:35: got restricted __le16 [usertype] <noident>

Signed-off-by: Eldad Zack <[email protected]>
Acked-by: Daniel Mack <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/usb/mixer_quirks.c | 4 ++--
sound/usb/quirks.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

--- a/sound/usb/mixer_quirks.c
+++ b/sound/usb/mixer_quirks.c
@@ -396,7 +396,7 @@ static int snd_nativeinstruments_control
else
ret = usb_control_msg(dev, usb_rcvctrlpipe(dev, 0), bRequest,
USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_IN,
- 0, cpu_to_le16(wIndex),
+ 0, wIndex,
&tmp, sizeof(tmp), 1000);
up_read(&mixer->chip->shutdown_rwsem);

@@ -427,7 +427,7 @@ static int snd_nativeinstruments_control
else
ret = usb_control_msg(dev, usb_sndctrlpipe(dev, 0), bRequest,
USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT,
- cpu_to_le16(wValue), cpu_to_le16(wIndex),
+ wValue, wIndex,
NULL, 0, 1000);
up_read(&mixer->chip->shutdown_rwsem);

--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -455,7 +455,7 @@ static int snd_usb_nativeinstruments_boo
{
int ret = usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
0xaf, USB_TYPE_VENDOR | USB_RECIP_DEVICE,
- cpu_to_le16(1), 0, NULL, 0, 1000);
+ 1, 0, NULL, 0, 1000);

if (ret < 0)
return ret;

2013-04-15 14:03:52

by Shuah Khan

[permalink] [raw]

Subject: Re: [ 00/11] 3.0.74-stable review

On Sun, Apr 14, 2013 at 8:17 PM, Greg Kroah-Hartman
<[email protected]> wrote:
> This is the start of the stable review cycle for the 3.0.74 release.
> There are 11 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Apr 17 02:05:34 UTC 2013.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.74-rc1.gz
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Patches applied cleanly to 3.0.73, 3.4.40, and 3.8.7

Compiled and booted on the following systems:

HP ProBook 6475b AMD A10-4600M APU with Radeon(tm) HD Graphics
(on the road, only one system and no cross compile tests this time)

dmesgs for all releases look good. No regressions compared to the previous
dmesgs for each of these releases.

cross compile tests - skipped

-- Shuah

2013-04-15 20:55:35

by David Woodhouse

[permalink] [raw]

Subject: Re: [ 10/11] mtdchar: fix offset overflow detection

On Sun, 2013-04-14 at 19:17 -0700, Greg Kroah-Hartman wrote:
> 3.0-stable review patch. If anyone has any objections, please let me know.

Please use f5cf8f07423b2677cebebcebc863af77223a4972 instead (for 3.4
too).

--
dwmw2

Attachments:

smime.p7s (6.03 kB)

2013-04-15 22:35:44

by Greg KH

[permalink] [raw]

Subject: Re: [ 10/11] mtdchar: fix offset overflow detection

On Mon, Apr 15, 2013 at 09:55:20PM +0100, David Woodhouse wrote:
> On Sun, 2013-04-14 at 19:17 -0700, Greg Kroah-Hartman wrote:
> > 3.0-stable review patch. If anyone has any objections, please let me know.
>
> Please use f5cf8f07423b2677cebebcebc863af77223a4972 instead (for 3.4
> too).

Really? I love the comment in that commit, "disable it for now until it
can be fixed properly in the next merge window." Yet that was back in
October of last year, no one has actually fixed this issue, is that
because it can't be properly fixed, or just because no one got around to
it?

thanks,

greg k-h

2013-04-16 08:41:52

by David Woodhouse

[permalink] [raw]

Subject: Re: [ 10/11] mtdchar: fix offset overflow detection

On Mon, 2013-04-15 at 15:35 -0700, Greg Kroah-Hartman wrote:
> On Mon, Apr 15, 2013 at 09:55:20PM +0100, David Woodhouse wrote:
> > On Sun, 2013-04-14 at 19:17 -0700, Greg Kroah-Hartman wrote:
> > > 3.0-stable review patch. If anyone has any objections, please let
> me know.
> >
> > Please use f5cf8f07423b2677cebebcebc863af77223a4972 instead (for 3.4
> > too).
>
> Really? I love the comment in that commit, "disable it for now until
> it
> can be fixed properly in the next merge window." Yet that was back in
> October of last year, no one has actually fixed this issue, is that
> because it can't be properly fixed, or just because no one got around
> to it?

Because nobody cared. It's only really used on uclinux for XIP mapping
of code from RAM anyway. Support for MMU systems was added as an
afterthought. and it shows.

ISTR I was blaming dhowells for it at the time, and hoping for him to
fix it. Although that remembrance may be incorrect, or I could even have
been incorrect at the time. I'm not going to investigate further right
now; I'm not here. 2 days into paternity leave...

--
dwmw2

Attachments:

smime.p7s (6.03 kB)

2013-04-22 01:22:55

by Ben Hutchings

[permalink] [raw]

Subject: Re: [ 10/11] mtdchar: fix offset overflow detection

On Mon, 2013-04-15 at 21:55 +0100, David Woodhouse wrote:
> On Sun, 2013-04-14 at 19:17 -0700, Greg Kroah-Hartman wrote:
> > 3.0-stable review patch. If anyone has any objections, please let me know.
>
> Please use f5cf8f07423b2677cebebcebc863af77223a4972 instead (for 3.4
> too).

I've queued that up for 3.2, thanks.

Ben.

--
Ben Hutchings
Klipstein's 4th Law of Prototyping and Production:
A fail-safe circuit will destroy others.

Attachments:

signature.asc (828.00 B)
This is a digitally signed message part