2018-01-01 14:32:55

by Greg Kroah-Hartman

Subject: [PATCH 4.9 00/75] 4.9.74-stable review

This is the start of the stable review cycle for the 4.9.74 release.
There are 75 patches in this series, all of which will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.74-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.9.74-rc1

Johan Hovold <[email protected]>
tty: fix tty_ldisc_receive_buf() documentation

Linus Torvalds <[email protected]>
n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)

Thomas Gleixner <[email protected]>
x86/smpboot: Remove stale TLB flush invocations

Thomas Gleixner <[email protected]>
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()

Thomas Gleixner <[email protected]>
timers: Reinitialize per cpu bases on hotplug

Thomas Gleixner <[email protected]>
timers: Invoke timer_start_debug() where it makes sense

Anna-Maria Gleixner <[email protected]>
timers: Use deferrable base independent of base::nohz_active

Daniel Thompson <[email protected]>
usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201

Mathias Nyman <[email protected]>
USB: Fix off by one in type-specific length check of BOS SSP capability

Oliver Neukum <[email protected]>
usb: add RESET_RESUME for ELSA MicroLink 56K

Dmitry Fleytman <[email protected]>
usb: Add device quirk for Logitech HD Pro Webcam C925e

SZ Lin (林上智) <[email protected]>
USB: serial: option: adding support for YUGA CLM920-NC5

Daniele Palmas <[email protected]>
USB: serial: option: add support for Telit ME910 PID 0x1101

Reinhard Speyerer <[email protected]>
USB: serial: qcserial: add Sierra Wireless EM7565

Max Schulze <[email protected]>
USB: serial: ftdi_sio: add id for Airbus DS P8GR

Shuah Khan <[email protected]>
usbip: vhci: stop printing kernel pointer addresses in messages

Shuah Khan <[email protected]>
usbip: stub: stop printing kernel pointer addresses in messages

Shuah Khan <[email protected]>
usbip: prevent leaking socket pointer address in messages

Juan Zea <[email protected]>
usbip: fix usbip bind writing random string after command in match_busid

Julian Wiedmann <[email protected]>
s390/qeth: update takeover IPs after configuration change

Julian Wiedmann <[email protected]>
s390/qeth: lock IP table while applying takeover changes

Julian Wiedmann <[email protected]>
s390/qeth: don't apply takeover changes to RXIP

Julian Wiedmann <[email protected]>
s390/qeth: apply takeover changes when mode is toggled

Moni Shoua <[email protected]>
net/mlx5: Fix error flow in CREATE_QP command

Gal Pressman <[email protected]>
net/mlx5e: Prevent possible races in VXLAN control flow

Gal Pressman <[email protected]>
net/mlx5e: Add refcount to VXLAN structure

Gal Pressman <[email protected]>
net/mlx5e: Fix possible deadlock of VXLAN lock

Gal Pressman <[email protected]>
net/mlx5e: Fix features check of IPv6 traffic

Eran Ben Elisha <[email protected]>
net/mlx5: Fix rate limit packet pacing naming and struct

Yousuk Seung <[email protected]>
tcp: invalidate rate samples during SACK reneging

Willem de Bruijn <[email protected]>
sock: free skb in skb_complete_tx_timestamp on error

Grygorii Strashko <[email protected]>
net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround

Eric W. Biederman <[email protected]>
net: Fix double free and memory corruption in get_net_ns_by_id()

Andrew Lunn <[email protected]>
net: fec: Allow reception of frames bigger than 1522 bytes

Nikolay Aleksandrov <[email protected]>
net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks

Ido Schimmel <[email protected]>
ipv4: Fix use-after-free when flushing FIB tables

Nikita V. Shirokov <[email protected]>
adding missing rcu_read_unlock in ipxip6_rcv

Tonghao Zhang <[email protected]>
sctp: Replace use of sockets_allocated with specified macro.

Tobias Jordan <[email protected]>
net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case

Mohamed Ghannam <[email protected]>
net: ipv4: fix for a race condition in raw_sendmsg

Brian King <[email protected]>
tg3: Fix rx hang on MTU change with 5717/5719

Christoph Paasch <[email protected]>
tcp md5sig: Use skb's saddr when replying to an incoming segment

Neal Cardwell <[email protected]>
tcp_bbr: record "full bw reached" decision in new full_bw_reached bit

Avinash Repaka <[email protected]>
RDS: Check cmsg_len before dereferencing CMSG_DATA

Michael S. Tsirkin <[email protected]>
ptr_ring: add barriers

Shaohua Li <[email protected]>
net: reevalulate autoflowlabel setting after sysctl setting

Sebastian Sjoholm <[email protected]>
net: qmi_wwan: add Sierra EM7565 1199:9091

Kevin Cernekee <[email protected]>
netlink: Add netns check on taps

Kevin Cernekee <[email protected]>
net: igmp: Use correct source address on IGMPv3 reports

Fugang Duan <[email protected]>
net: fec: unmap the xmit buffer that are not transferred by DMA

Eric Dumazet <[email protected]>
ipv6: mcast: better catch silly mtu values

Eric Dumazet <[email protected]>
ipv4: igmp: guard against silly MTU values

Linus Torvalds <[email protected]>
kbuild: add '-fno-stack-check' to kernel build options

Andy Lutomirski <[email protected]>
x86/mm/64: Fix reboot interaction with CR4.PCIDE

Andy Lutomirski <[email protected]>
x86/mm: Enable CR4.PCIDE on supported systems

Andy Lutomirski <[email protected]>
x86/mm: Add the 'nopcid' boot option to turn off PCID

Andy Lutomirski <[email protected]>
x86/mm: Disable PCID on 32-bit kernels

Andy Lutomirski <[email protected]>
x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code

Andy Lutomirski <[email protected]>
x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()

Andy Lutomirski <[email protected]>
x86/mm: Make flush_tlb_mm_range() more predictable

Andy Lutomirski <[email protected]>
x86/mm: Remove flush_tlb() and flush_tlb_current_task()

Andy Lutomirski <[email protected]>
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

Hui Wang <[email protected]>
ALSA: hda - fix headset mic detection issue on a Dell machine

Takashi Iwai <[email protected]>
ALSA: hda: Drop useless WARN_ON()

Andrew F. Davis <[email protected]>
ASoC: tlv320aic31xx: Fix GPIO1 register definition

Johan Hovold <[email protected]>
ASoC: twl4030: fix child-node lookup

Maciej S. Szmigiero <[email protected]>
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure

Johan Hovold <[email protected]>
ASoC: da7218: fix fix child-node lookup

Ben Hutchings <[email protected]>
ASoC: wm_adsp: Fix validation of firmware and coeff lengths

Steve Wise <[email protected]>
iw_cxgb4: Only validate the MSN for successful completions

Steven Rostedt (VMware) <[email protected]>
ring-buffer: Mask out the info bits when returning buffer page length

Jing Xia <[email protected]>
tracing: Fix crash when it fails to alloc ring buffer

Steven Rostedt (VMware) <[email protected]>
tracing: Fix possible double free on failure of allocating trace buffer

Steven Rostedt (VMware) <[email protected]>
tracing: Remove extra zeroing out of the ring buffer page

Greg Kroah-Hartman <[email protected]>
sync objtool's copy of x86-opcode-map.txt


-------------

Diffstat:

Documentation/kernel-parameters.txt | 2 +
Makefile | 7 +-
arch/x86/Kconfig | 2 +-
arch/x86/include/asm/disabled-features.h | 4 +-
arch/x86/include/asm/hardirq.h | 2 +-
arch/x86/include/asm/mmu.h | 6 --
arch/x86/include/asm/mmu_context.h | 2 -
arch/x86/include/asm/tlbflush.h | 99 +++--------------------
arch/x86/kernel/cpu/bugs.c | 8 ++
arch/x86/kernel/cpu/common.c | 40 +++++++++
arch/x86/kernel/reboot.c | 4 +
arch/x86/kernel/smpboot.c | 9 ---
arch/x86/kernel/vm86_32.c | 2 +-
arch/x86/mm/init.c | 2 -
arch/x86/mm/tlb.c | 73 +++--------------
arch/x86/xen/enlighten.c | 6 ++
drivers/infiniband/hw/cxgb4/cq.c | 6 +-
drivers/net/ethernet/broadcom/tg3.c | 4 +-
drivers/net/ethernet/freescale/fec_main.c | 14 +++-
drivers/net/ethernet/marvell/mvmdio.c | 3 +-
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +-
drivers/net/ethernet/mellanox/mlx5/core/qp.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/rl.c | 22 ++---
drivers/net/ethernet/mellanox/mlx5/core/vxlan.c | 64 ++++++++-------
drivers/net/ethernet/mellanox/mlx5/core/vxlan.h | 1 +
drivers/net/phy/micrel.c | 1 +
drivers/net/usb/qmi_wwan.c | 1 +
drivers/s390/net/qeth_core.h | 6 +-
drivers/s390/net/qeth_core_main.c | 6 +-
drivers/s390/net/qeth_l3.h | 2 +-
drivers/s390/net/qeth_l3_main.c | 36 +++++++--
drivers/s390/net/qeth_l3_sys.c | 75 +++++++++--------
drivers/tty/n_tty.c | 4 +-
drivers/tty/tty_buffer.c | 2 +-
drivers/usb/core/config.c | 2 +-
drivers/usb/core/quirks.c | 6 +-
drivers/usb/host/xhci-pci.c | 3 +
drivers/usb/serial/ftdi_sio.c | 1 +
drivers/usb/serial/ftdi_sio_ids.h | 6 ++
drivers/usb/serial/option.c | 17 ++++
drivers/usb/serial/qcserial.c | 3 +
drivers/usb/usbip/stub_dev.c | 3 +-
drivers/usb/usbip/stub_main.c | 5 +-
drivers/usb/usbip/stub_rx.c | 7 +-
drivers/usb/usbip/stub_tx.c | 6 +-
drivers/usb/usbip/usbip_common.c | 14 +---
drivers/usb/usbip/vhci_hcd.c | 12 +--
drivers/usb/usbip/vhci_rx.c | 23 +++---
drivers/usb/usbip/vhci_tx.c | 3 +-
include/linux/cpuhotplug.h | 2 +-
include/linux/ipv6.h | 3 +-
include/linux/mlx5/mlx5_ifc.h | 8 +-
include/linux/ptr_ring.h | 9 +++
include/linux/tcp.h | 3 +-
include/linux/timer.h | 4 +-
include/net/ip.h | 2 +
include/net/tcp.h | 2 +-
kernel/cpu.c | 4 +-
kernel/time/tick-sched.c | 19 ++++-
kernel/time/timer.c | 35 +++++---
kernel/trace/ring_buffer.c | 6 +-
kernel/trace/trace.c | 13 +--
net/bridge/br_netlink.c | 11 +--
net/core/net_namespace.c | 2 +-
net/core/skbuff.c | 6 +-
net/ipv4/devinet.c | 2 +-
net/ipv4/fib_frontend.c | 9 ++-
net/ipv4/igmp.c | 44 +++++++---
net/ipv4/ip_tunnel.c | 4 +-
net/ipv4/raw.c | 15 ++--
net/ipv4/tcp.c | 1 +
net/ipv4/tcp_bbr.c | 7 +-
net/ipv4/tcp_input.c | 10 ++-
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv4/tcp_rate.c | 10 ++-
net/ipv6/af_inet6.c | 1 -
net/ipv6/ip6_output.c | 12 ++-
net/ipv6/ip6_tunnel.c | 2 +-
net/ipv6/ipv6_sockglue.c | 1 +
net/ipv6/mcast.c | 25 +++---
net/ipv6/tcp_ipv6.c | 2 +-
net/netlink/af_netlink.c | 3 +
net/rds/send.c | 3 +
net/sctp/socket.c | 4 +-
sound/hda/hdac_i915.c | 2 +-
sound/pci/hda/patch_realtek.c | 5 ++
sound/soc/codecs/da7218.c | 2 +-
sound/soc/codecs/tlv320aic31xx.h | 2 +-
sound/soc/codecs/twl4030.c | 4 +-
sound/soc/codecs/wm_adsp.c | 12 +--
sound/soc/fsl/fsl_ssi.c | 18 +++--
tools/objtool/arch/x86/insn/x86-opcode-map.txt | 2 +-
tools/usb/usbip/src/utils.c | 9 ++-
94 files changed, 550 insertions(+), 429 deletions(-)



2018-01-01 14:33:02

by Greg Kroah-Hartman

Subject: [PATCH 4.9 01/75] sync objtool's copy of x86-opcode-map.txt

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

When building objtool, we get the warning:
warning: objtool: x86 instruction decoder differs from kernel

That's due to commit 2816c0455cea088f07a210f8a00701a82a78aa9c which was
commit 12a78d43de767eaf8fb272facb7a7b6f2dc6a9df upstream that modified
arch/x86/lib/x86-opcode-map.txt without also updating the objtool copy.
The objtool copy was updated in a much larger patch upstream, but we
don't need all of that here, so just update the single file.

If this gets too annoying, I'll just end up doing what we did for 4.14
and backport the whole series to keep this from happening again, but as
this seems to be rare in the 4.9-stable series, this single patch should
be fine.
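
For readers who do not work with the opcode map often, the one-line change
below can be pictured with a small userspace sketch. It is not objtool's
decoder. It only assumes that the entry number in a GrpTable is selected by
bits 5:3 of the ModRM byte, and that Grp3_1 belongs to opcode 0xF6. The
helper name grp3_1_mnemonic() is made up for illustration, and entries 5-7
come from the x86 manuals rather than from the hunk in this patch.

```c
#include <stdint.h>

/* Group 3 entries for opcode 0xF6, indexed by ModRM.reg.
 * Entries 0-4 follow the hunk in this patch (with entry 1 filled in);
 * entries 5-7 are taken from the x86 manuals, not from the patch context. */
static const char *const grp3_1_names[8] = {
	"TEST", "TEST", "NOT", "NEG", "MUL", "IMUL", "DIV", "IDIV"
};

/* Hypothetical helper: pick the mnemonic for an 0xF6 instruction from its
 * ModRM byte. Bits 5:3 of ModRM select the group entry. */
static const char *grp3_1_mnemonic(uint8_t modrm)
{
	return grp3_1_names[(modrm >> 3) & 7];
}
```

With the old table, a ModRM byte whose reg field is 1 had no entry to map to;
with the updated table it decodes the same way as entry 0.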

Cc: Masami Hiramatsu <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>



---
tools/objtool/arch/x86/insn/x86-opcode-map.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/objtool/arch/x86/insn/x86-opcode-map.txt
+++ b/tools/objtool/arch/x86/insn/x86-opcode-map.txt
@@ -896,7 +896,7 @@ EndTable

GrpTable: Grp3_1
0: TEST Eb,Ib
-1:
+1: TEST Eb,Ib
2: NOT Eb
3: NEG Eb
4: MUL AL,Eb


2018-01-01 14:33:05

by Greg Kroah-Hartman

Subject: [PATCH 4.9 10/75] ASoC: twl4030: fix child-node lookup

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Johan Hovold <[email protected]>

commit 15f8c5f2415bfac73f33a14bcd83422bcbfb5298 upstream.

Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.

To make things worse, the parent codec node was also prematurely freed,
while the child node was leaked.
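
The difference between the two lookups can be shown with a toy tree in
userspace. This is only a sketch: struct toy_node and both helpers are
invented here and are not the kernel's device tree API. The toy search is
also limited to the subtree below the starting node, which is narrower than
the behaviour described above, and it does no reference counting.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a device tree node; not the kernel's struct device_node. */
struct toy_node {
	const char *name;
	struct toy_node **children;
	size_t nr_children;
};

/* Depth-first search below 'from': a match deep inside an unrelated child
 * can be returned before a direct child with the same name. */
static struct toy_node *toy_find_by_name(struct toy_node *from, const char *name)
{
	for (size_t i = 0; i < from->nr_children; i++) {
		struct toy_node *c = from->children[i];
		struct toy_node *hit;

		if (strcmp(c->name, name) == 0)
			return c;
		hit = toy_find_by_name(c, name);
		if (hit)
			return hit;
	}
	return NULL;
}

/* Match direct children only, which is what the probe code wants. */
static struct toy_node *toy_get_child_by_name(struct toy_node *parent,
					      const char *name)
{
	for (size_t i = 0; i < parent->nr_children; i++)
		if (strcmp(parent->children[i]->name, name) == 0)
			return parent->children[i];
	return NULL;
}
```

In the real driver the node returned by of_get_child_by_name() carries a
reference, which is why the diff below also adds of_node_put() calls.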

Fixes: 2d6d649a2e0f ("ASoC: twl4030: Support for DT booted kernel")
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/soc/codecs/twl4030.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/sound/soc/codecs/twl4030.c
+++ b/sound/soc/codecs/twl4030.c
@@ -232,7 +232,7 @@ static struct twl4030_codec_data *twl403
struct twl4030_codec_data *pdata = dev_get_platdata(codec->dev);
struct device_node *twl4030_codec_node = NULL;

- twl4030_codec_node = of_find_node_by_name(codec->dev->parent->of_node,
+ twl4030_codec_node = of_get_child_by_name(codec->dev->parent->of_node,
"codec");

if (!pdata && twl4030_codec_node) {
@@ -241,9 +241,11 @@ static struct twl4030_codec_data *twl403
GFP_KERNEL);
if (!pdata) {
dev_err(codec->dev, "Can not allocate memory\n");
+ of_node_put(twl4030_codec_node);
return NULL;
}
twl4030_setup_pdata_of(pdata, twl4030_codec_node);
+ of_node_put(twl4030_codec_node);
}

return pdata;


2018-01-01 14:33:14

by Greg Kroah-Hartman

Subject: [PATCH 4.9 11/75] ASoC: tlv320aic31xx: Fix GPIO1 register definition

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrew F. Davis <[email protected]>

commit 737e0b7b67bdfe24090fab2852044bb283282fc5 upstream.

GPIO1 control register is number 51, fix this here.
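
As a rough illustration of why an off-by-one here matters, the sketch below
assumes a page/register packing of 128 registers per page. That packing and
the EXAMPLE_* names are assumptions made for this note; the driver's own
AIC31XX_REG() may be defined differently.

```c
/* Hypothetical page/register packing: 128 registers per page.
 * The driver's real AIC31XX_REG() may be defined differently. */
#define EXAMPLE_REG(page, reg)	((page) * 128 + (reg))

#define EXAMPLE_INT2CTRL	EXAMPLE_REG(0, 49)
#define EXAMPLE_GPIO1_OLD	EXAMPLE_REG(0, 50)	/* value before the patch */
#define EXAMPLE_GPIO1_NEW	EXAMPLE_REG(0, 51)	/* value after the patch */
```

Under that assumption, every access through the old definition would have
landed one register below the GPIO1 control register.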

Fixes: bafcbfe429eb ("ASoC: tlv320aic31xx: Make the register values human readable")
Signed-off-by: Andrew F. Davis <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/soc/codecs/tlv320aic31xx.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/soc/codecs/tlv320aic31xx.h
+++ b/sound/soc/codecs/tlv320aic31xx.h
@@ -115,7 +115,7 @@ struct aic31xx_pdata {
/* INT2 interrupt control */
#define AIC31XX_INT2CTRL AIC31XX_REG(0, 49)
/* GPIO1 control */
-#define AIC31XX_GPIO1 AIC31XX_REG(0, 50)
+#define AIC31XX_GPIO1 AIC31XX_REG(0, 51)

#define AIC31XX_DACPRB AIC31XX_REG(0, 60)
/* ADC Instruction Set Register */


2018-01-01 14:33:22

by Greg Kroah-Hartman

Subject: [PATCH 4.9 14/75] x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit 9ccee2373f0658f234727700e619df097ba57023 upstream.

mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.

Compile-tested only because I don't know whether software that uses
this mechanism even exists.
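
For reference, the range handed to flush_tlb_mm_range() in the diff below
works out as follows. This is a sketch that assumes 4 KiB pages; the
EXAMPLE_ and VGA_ names are made up for this note.

```c
/* Assumption for this sketch: 4 KiB pages, as on 32-bit x86. */
#define EXAMPLE_PAGE_SIZE	4096UL

#define VGA_START	0xA0000UL
#define VGA_NR_PAGES	32UL

/* End address (exclusive) of the range passed to flush_tlb_mm_range(). */
static unsigned long vga_flush_end(void)
{
	return VGA_START + VGA_NR_PAGES * EXAMPLE_PAGE_SIZE;
}
```

So only the 128 KiB window starting at 0xA0000 is flushed, instead of
everything for the current mm.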

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/vm86_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -191,7 +191,7 @@ static void mark_screen_rdonly(struct mm
pte_unmap_unlock(pte, ptl);
out:
up_write(&mm->mmap_sem);
- flush_tlb();
+ flush_tlb_mm_range(mm, 0xA0000, 0xA0000 + 32*PAGE_SIZE, 0UL);
}




2018-01-01 14:33:27

by Greg Kroah-Hartman

Subject: [PATCH 4.9 15/75] x86/mm: Remove flush_tlb() and flush_tlb_current_task()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit 29961b59a51f8c6838a26a45e871a7ed6771809b upstream.

I was trying to figure out how flush_tlb_current_task() would
possibly work correctly if current->mm != current->active_mm, but I
realized I could spare myself the effort: it has no callers except
the unused flush_tlb() macro.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/tlbflush.h | 9 ---------
arch/x86/mm/tlb.c | 17 -----------------
2 files changed, 26 deletions(-)

--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -205,7 +205,6 @@ static inline void __flush_tlb_one(unsig
/*
* TLB flushing:
*
- * - flush_tlb() flushes the current mm struct TLBs
* - flush_tlb_all() flushes all processes TLBs
* - flush_tlb_mm(mm) flushes the specified mm context TLB's
* - flush_tlb_page(vma, vmaddr) flushes one page
@@ -237,11 +236,6 @@ static inline void flush_tlb_all(void)
__flush_tlb_all();
}

-static inline void flush_tlb(void)
-{
- __flush_tlb_up();
-}
-
static inline void local_flush_tlb(void)
{
__flush_tlb_up();
@@ -303,14 +297,11 @@ static inline void flush_tlb_kernel_rang
flush_tlb_mm_range(vma->vm_mm, start, end, vma->vm_flags)

extern void flush_tlb_all(void);
-extern void flush_tlb_current_task(void);
extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);

-#define flush_tlb() flush_tlb_current_task()
-
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
unsigned long start, unsigned long end);
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -287,23 +287,6 @@ void native_flush_tlb_others(const struc
smp_call_function_many(cpumask, flush_tlb_func, &info, 1);
}

-void flush_tlb_current_task(void)
-{
- struct mm_struct *mm = current->mm;
-
- preempt_disable();
-
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
-
- /* This is an implicit full barrier that synchronizes with switch_mm. */
- local_flush_tlb();
-
- trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL);
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL);
- preempt_enable();
-}
-
/*
* See Documentation/x86/tlb.txt for details. We choose 33
* because it is large enough to cover the vast majority (at


2018-01-01 14:33:35

by Greg Kroah-Hartman

Subject: [PATCH 4.9 17/75] x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit ca6c99c0794875c6d1db6e22f246699691ab7e6b upstream.

flush_tlb_page() was very similar to flush_tlb_mm_range() except that
it had a couple of issues:

- It was missing an smp_mb() in the case where
current->active_mm != mm. (This is a longstanding bug reported by Nadav Amit)

- It was missing tracepoints and vm counter updates.

The only reason that I can see for keeping it as a separate
function is that it could avoid a few branches that
flush_tlb_mm_range() needs to decide to flush just one page. This
hardly seems worthwhile. If we decide we want to get rid of those
branches again, a better way would be to introduce an
__flush_tlb_mm_range() helper and make both flush_tlb_page() and
flush_tlb_mm_range() use it.
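
The "few branches" mentioned above can be sketched in userspace as a
decision between per-page invalidation and a full flush. This is a
simplified model, not the code in arch/x86/mm/tlb.c: the example_* names are
invented, 4 KiB pages are assumed, and the ceiling of 33 is taken from the
tlb.c comment quoted elsewhere in this series.

```c
#define EXAMPLE_PAGE_SHIFT	12
#define EXAMPLE_PAGE_SIZE	(1UL << EXAMPLE_PAGE_SHIFT)
#define EXAMPLE_FLUSH_ALL	(~0UL)	/* stand-in for TLB_FLUSH_ALL */

/* Ceiling of 33 pages, as mentioned in the tlb.c comment. */
static unsigned long example_flush_ceiling = 33;

/*
 * Return how many single-page invalidations a ranged flush would issue,
 * or EXAMPLE_FLUSH_ALL when the range is too large and a full flush is
 * used instead.
 */
static unsigned long example_pages_to_flush(unsigned long start,
					    unsigned long end)
{
	unsigned long nr;

	if (end == EXAMPLE_FLUSH_ALL)
		return EXAMPLE_FLUSH_ALL;
	nr = (end - start) >> EXAMPLE_PAGE_SHIFT;
	return nr > example_flush_ceiling ? EXAMPLE_FLUSH_ALL : nr;
}

/* flush_tlb_page() after this patch, in this model: a one-page range. */
static unsigned long example_flush_page(unsigned long addr)
{
	return example_pages_to_flush(addr, addr + EXAMPLE_PAGE_SIZE);
}
```

A single-page request always stays on the per-page path in this model, which
is why routing flush_tlb_page() through the ranged function costs little.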

Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/3cc3847cf888d8907577569b8bac3f01992ef8f9.1495492063.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/tlbflush.h | 6 +++++-
arch/x86/mm/tlb.c | 27 ---------------------------
2 files changed, 5 insertions(+), 28 deletions(-)

--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -297,11 +297,15 @@ static inline void flush_tlb_kernel_rang
flush_tlb_mm_range(vma->vm_mm, start, end, vma->vm_flags)

extern void flush_tlb_all(void);
-extern void flush_tlb_page(struct vm_area_struct *, unsigned long);
extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
unsigned long end, unsigned long vmflag);
extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);

+static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
+{
+ flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, VM_NONE);
+}
+
void native_flush_tlb_others(const struct cpumask *cpumask,
struct mm_struct *mm,
unsigned long start, unsigned long end);
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -354,33 +354,6 @@ out:
preempt_enable();
}

-void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
-{
- struct mm_struct *mm = vma->vm_mm;
-
- preempt_disable();
-
- if (current->active_mm == mm) {
- if (current->mm) {
- /*
- * Implicit full barrier (INVLPG) that synchronizes
- * with switch_mm.
- */
- __flush_tlb_one(start);
- } else {
- leave_mm(smp_processor_id());
-
- /* Synchronize with switch_mm. */
- smp_mb();
- }
- }
-
- if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids)
- flush_tlb_others(mm_cpumask(mm), mm, start, start + PAGE_SIZE);
-
- preempt_enable();
-}
-
static void do_flush_tlb_all(void *info)
{
count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);


2018-01-01 14:33:40

by Greg Kroah-Hartman

Subject: [PATCH 4.9 18/75] x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit ce4a4e565f5264909a18c733b864c3f74467f69e upstream.

The UP asm/tlbflush.h generates somewhat nicer code than the SMP version.
Aside from that, it's fallen quite a bit behind the SMP code:

- flush_tlb_mm_range() didn't flush individual pages if the range
was small.

- The lazy TLB code was much weaker. This usually wouldn't matter,
but, if a kernel thread flushed its lazy "active_mm" more than
once (due to reclaim or similar), it wouldn't be unlazied and
would instead pointlessly flush repeatedly.

- Tracepoints were missing.

Aside from that, simply having the UP code around was a maintenance
burden, since it meant that any change to the TLB flush code had to
make sure not to break it.

Simplify everything by deleting the UP code.
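
The lazy TLB point in the list above can be modelled with a small state
machine. This is a loose userspace sketch based only on the description in
this message; the toy_* names are invented and the real logic in
arch/x86/mm/tlb.c is more involved.

```c
enum toy_tlbstate { TOY_TLBSTATE_OK, TOY_TLBSTATE_LAZY };

struct toy_cpu {
	enum toy_tlbstate state;
	int in_mm_cpumask;	/* 1 while this CPU is a flush target for the mm */
	unsigned int flushes;	/* explicit flushes performed so far */
};

/* Formerly-SMP behaviour, simplified: a lazy CPU drops the mm on the first
 * request, so later requests for that mm skip it entirely. */
static void toy_flush_request(struct toy_cpu *cpu)
{
	if (!cpu->in_mm_cpumask)
		return;			/* already unlazied: nothing to do */
	if (cpu->state == TOY_TLBSTATE_OK)
		cpu->flushes++;		/* mm is really in use: flush */
	else
		cpu->in_mm_cpumask = 0;	/* lazy: leave the mm instead */
}

/* Removed UP behaviour, simplified: flush on every request. */
static void toy_flush_request_up(struct toy_cpu *cpu)
{
	cpu->flushes++;
}
```

Feeding three requests to a lazy CPU leaves the counter at zero in the first
variant and at three in the second, which is the "pointlessly flush
repeatedly" case described above.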

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/Kconfig | 2
arch/x86/include/asm/hardirq.h | 2
arch/x86/include/asm/mmu.h | 6 --
arch/x86/include/asm/mmu_context.h | 2
arch/x86/include/asm/tlbflush.h | 78 -------------------------------------
arch/x86/mm/init.c | 2
arch/x86/mm/tlb.c | 17 --------
7 files changed, 5 insertions(+), 104 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -45,7 +45,7 @@ config X86
select ARCH_USE_CMPXCHG_LOCKREF if X86_64
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USE_QUEUED_SPINLOCKS
- select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH if SMP
+ select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_IPC_PARSE_VERSION if X86_32
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -22,8 +22,8 @@ typedef struct {
#ifdef CONFIG_SMP
unsigned int irq_resched_count;
unsigned int irq_call_count;
- unsigned int irq_tlb_count;
#endif
+ unsigned int irq_tlb_count;
#ifdef CONFIG_X86_THERMAL_VECTOR
unsigned int irq_thermal_count;
#endif
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -33,12 +33,6 @@ typedef struct {
#endif
} mm_context_t;

-#ifdef CONFIG_SMP
void leave_mm(int cpu);
-#else
-static inline void leave_mm(int cpu)
-{
-}
-#endif

#endif /* _ASM_X86_MMU_H */
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -99,10 +99,8 @@ static inline void load_mm_ldt(struct mm

static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
{
-#ifdef CONFIG_SMP
if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
this_cpu_write(cpu_tlbstate.state, TLBSTATE_LAZY);
-#endif
}

static inline int init_new_context(struct task_struct *tsk,
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -7,6 +7,7 @@
#include <asm/processor.h>
#include <asm/cpufeature.h>
#include <asm/special_insns.h>
+#include <asm/smp.h>

static inline void __invpcid(unsigned long pcid, unsigned long addr,
unsigned long type)
@@ -65,10 +66,8 @@ static inline void invpcid_flush_all_non
#endif

struct tlb_state {
-#ifdef CONFIG_SMP
struct mm_struct *active_mm;
int state;
-#endif

/*
* Access to this CR4 shadow and to H/W CR4 is protected by
@@ -216,79 +215,6 @@ static inline void __flush_tlb_one(unsig
* and page-granular flushes are available only on i486 and up.
*/

-#ifndef CONFIG_SMP
-
-/* "_up" is for UniProcessor.
- *
- * This is a helper for other header functions. *Not* intended to be called
- * directly. All global TLB flushes need to either call this, or to bump the
- * vm statistics themselves.
- */
-static inline void __flush_tlb_up(void)
-{
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
- __flush_tlb();
-}
-
-static inline void flush_tlb_all(void)
-{
- count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
- __flush_tlb_all();
-}
-
-static inline void local_flush_tlb(void)
-{
- __flush_tlb_up();
-}
-
-static inline void flush_tlb_mm(struct mm_struct *mm)
-{
- if (mm == current->active_mm)
- __flush_tlb_up();
-}
-
-static inline void flush_tlb_page(struct vm_area_struct *vma,
- unsigned long addr)
-{
- if (vma->vm_mm == current->active_mm)
- __flush_tlb_one(addr);
-}
-
-static inline void flush_tlb_range(struct vm_area_struct *vma,
- unsigned long start, unsigned long end)
-{
- if (vma->vm_mm == current->active_mm)
- __flush_tlb_up();
-}
-
-static inline void flush_tlb_mm_range(struct mm_struct *mm,
- unsigned long start, unsigned long end, unsigned long vmflag)
-{
- if (mm == current->active_mm)
- __flush_tlb_up();
-}
-
-static inline void native_flush_tlb_others(const struct cpumask *cpumask,
- struct mm_struct *mm,
- unsigned long start,
- unsigned long end)
-{
-}
-
-static inline void reset_lazy_tlbstate(void)
-{
-}
-
-static inline void flush_tlb_kernel_range(unsigned long start,
- unsigned long end)
-{
- flush_tlb_all();
-}
-
-#else /* SMP */
-
-#include <asm/smp.h>
-
#define local_flush_tlb() __flush_tlb()

#define flush_tlb_mm(mm) flush_tlb_mm_range(mm, 0UL, TLB_FLUSH_ALL, 0UL)
@@ -319,8 +245,6 @@ static inline void reset_lazy_tlbstate(v
this_cpu_write(cpu_tlbstate.active_mm, &init_mm);
}

-#endif /* SMP */
-
#ifndef CONFIG_PARAVIRT
#define flush_tlb_others(mask, mm, start, end) \
native_flush_tlb_others(mask, mm, start, end)
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -764,10 +764,8 @@ void __init zone_sizes_init(void)
}

DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate) = {
-#ifdef CONFIG_SMP
.active_mm = &init_mm,
.state = 0,
-#endif
.cr4 = ~0UL, /* fail hard if we screw up cr4 shadow initialization */
};
EXPORT_SYMBOL_GPL(cpu_tlbstate);
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -15,7 +15,7 @@
#include <linux/debugfs.h>

/*
- * Smarter SMP flushing macros.
+ * TLB flushing, formerly SMP-only
* c/o Linus Torvalds.
*
* These mean you can really definitely utterly forget about
@@ -28,8 +28,6 @@
* Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
*/

-#ifdef CONFIG_SMP
-
struct flush_tlb_info {
struct mm_struct *flush_mm;
unsigned long flush_start;
@@ -59,8 +57,6 @@ void leave_mm(int cpu)
}
EXPORT_SYMBOL_GPL(leave_mm);

-#endif /* CONFIG_SMP */
-
void switch_mm(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk)
{
@@ -91,10 +87,8 @@ void switch_mm_irqs_off(struct mm_struct
set_pgd(pgd, init_mm.pgd[stack_pgd_index]);
}

-#ifdef CONFIG_SMP
this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
this_cpu_write(cpu_tlbstate.active_mm, next);
-#endif

cpumask_set_cpu(cpu, mm_cpumask(next));

@@ -152,9 +146,7 @@ void switch_mm_irqs_off(struct mm_struct
if (unlikely(prev->context.ldt != next->context.ldt))
load_mm_ldt(next);
#endif
- }
-#ifdef CONFIG_SMP
- else {
+ } else {
this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
BUG_ON(this_cpu_read(cpu_tlbstate.active_mm) != next);

@@ -181,11 +173,8 @@ void switch_mm_irqs_off(struct mm_struct
load_mm_ldt(next);
}
}
-#endif
}

-#ifdef CONFIG_SMP
-
/*
* The flush IPI assumes that a thread switch happens in this order:
* [cpu0: the cpu that switches]
@@ -438,5 +427,3 @@ static int __init create_tlb_single_page
return 0;
}
late_initcall(create_tlb_single_page_flush_ceiling);
-
-#endif /* CONFIG_SMP */


2018-01-01 14:33:47

by Greg Kroah-Hartman

Subject: [PATCH 4.9 19/75] x86/mm: Disable PCID on 32-bit kernels

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit cba4671af7550e008f7a7835f06df0763825bf3e upstream.

32-bit kernels on new hardware will see PCID in CPUID, but PCID can
only be used in 64-bit mode. Rather than making all PCID code
conditional, just disable the feature on 32-bit builds.

Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Nadav Amit <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/2e391769192a4d31b808410c383c6bf0734bc6ea.1498751203.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/disabled-features.h | 4 +++-
arch/x86/kernel/cpu/bugs.c | 8 ++++++++
2 files changed, 11 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -21,11 +21,13 @@
# define DISABLE_K6_MTRR (1<<(X86_FEATURE_K6_MTRR & 31))
# define DISABLE_CYRIX_ARR (1<<(X86_FEATURE_CYRIX_ARR & 31))
# define DISABLE_CENTAUR_MCR (1<<(X86_FEATURE_CENTAUR_MCR & 31))
+# define DISABLE_PCID 0
#else
# define DISABLE_VME 0
# define DISABLE_K6_MTRR 0
# define DISABLE_CYRIX_ARR 0
# define DISABLE_CENTAUR_MCR 0
+# define DISABLE_PCID (1<<(X86_FEATURE_PCID & 31))
#endif /* CONFIG_X86_64 */

#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
@@ -43,7 +45,7 @@
#define DISABLED_MASK1 0
#define DISABLED_MASK2 0
#define DISABLED_MASK3 (DISABLE_CYRIX_ARR|DISABLE_CENTAUR_MCR|DISABLE_K6_MTRR)
-#define DISABLED_MASK4 0
+#define DISABLED_MASK4 (DISABLE_PCID)
#define DISABLED_MASK5 0
#define DISABLED_MASK6 0
#define DISABLED_MASK7 0
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -19,6 +19,14 @@

void __init check_bugs(void)
{
+#ifdef CONFIG_X86_32
+ /*
+ * Regardless of whether PCID is enumerated, the SDM says
+ * that it can't be enabled in 32-bit mode.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+#endif
+
identify_boot_cpu();
#ifndef CONFIG_SMP
pr_info("CPU: ");
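
Reviewer note: the `DISABLE_PCID` mask above relies on the cpufeature numbering scheme, where a feature number encodes both the capability word and the bit inside it. A minimal userspace sketch of that arithmetic follows; the helper names `feature_word()` and `feature_mask()` are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Feature numbers encode (word * 32 + bit); PCID is word 4, bit 17. */
#define X86_FEATURE_PCID (4 * 32 + 17)

/* Index of the 32-bit capability word that holds a feature (hypothetical helper). */
static inline unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Single-bit mask of a feature inside its capability word (hypothetical helper). */
static inline uint32_t feature_mask(unsigned int feature)
{
	return UINT32_C(1) << (feature & 31);
}
```

This is why the mask lands in `DISABLED_MASK4`: the word index selects which `DISABLED_MASKn` the bit belongs to, and `& 31` selects the bit within that word.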


2018-01-01 14:33:54

by Greg Kroah-Hartman

Subject: [PATCH 4.9 20/75] x86/mm: Add the nopcid boot option to turn off PCID

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit 0790c9aad84901ca1bdc14746175549c8b5da215 upstream.

The parameter is only present on x86_64 systems to save a few bytes,
as PCID is always disabled on x86_32.

Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Nadav Amit <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/8bbb2e65bcd249a5f18bfb8128b4689f08ac2b60.1498751203.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
Documentation/kernel-parameters.txt | 2 ++
arch/x86/kernel/cpu/common.c | 18 ++++++++++++++++++
2 files changed, 20 insertions(+)

--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2795,6 +2795,8 @@ bytes respectively. Such letter suffixes
nopat [X86] Disable PAT (page attribute table extension of
pagetables) support.

+ nopcid [X86-64] Disable the PCID cpu feature.
+
norandmaps Don't use address space randomization. Equivalent to
echo 0 > /proc/sys/kernel/randomize_va_space

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -163,6 +163,24 @@ static int __init x86_mpx_setup(char *s)
}
__setup("nompx", x86_mpx_setup);

+#ifdef CONFIG_X86_64
+static int __init x86_pcid_setup(char *s)
+{
+ /* require an exact match without trailing characters */
+ if (strlen(s))
+ return 0;
+
+ /* do not emit a message if the feature is not present */
+ if (!boot_cpu_has(X86_FEATURE_PCID))
+ return 1;
+
+ setup_clear_cpu_cap(X86_FEATURE_PCID);
+ pr_info("nopcid: PCID feature disabled\n");
+ return 1;
+}
+__setup("nopcid", x86_pcid_setup);
+#endif
+
static int __init x86_noinvpcid_setup(char *s)
{
/* noinvpcid doesn't accept parameters */
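
Reviewer note: the `strlen(s)` test in `x86_pcid_setup()` rejects anything that follows the option name. The sketch below models that in userspace, assuming the handler is handed the text remaining after the matched `nopcid` prefix; `demo_nopcid_matches()` is a hypothetical name, not a kernel function.

```c
#include <string.h>

/*
 * Return 1 only when arg is exactly "nopcid" with nothing after it.
 * The prefix comparison stands in for the boot-option dispatcher; the
 * final test corresponds to the handler's strlen() check.
 */
static int demo_nopcid_matches(const char *arg)
{
	const char *prefix = "nopcid";
	size_t n = strlen(prefix);

	if (strncmp(arg, prefix, n) != 0)
		return 0;

	/* Require an exact match without trailing characters. */
	return arg[n] == '\0';
}
```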


2018-01-01 14:33:59

by Greg Kroah-Hartman

Subject: [PATCH 4.9 22/75] x86/mm/64: Fix reboot interaction with CR4.PCIDE

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit 924c6b900cfdf376b07bccfd80e62b21914f8a5a upstream.

Trying to reboot via real mode fails with PCID on: long mode cannot
be exited while CR4.PCIDE is set. (No, I have no idea why, but the
SDM and actual CPUs are in agreement here.) The result is a GPF and
a hang instead of a reboot.

I didn't catch this in testing because neither my computer nor my VM
reboots this way. I can trigger it with reboot=bios, though.

Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
Reported-and-tested-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/f1e7d965998018450a7a70c2823873686a8b21c0.1507524746.git.luto@kernel.org
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/reboot.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -106,6 +106,10 @@ void __noreturn machine_real_restart(uns
load_cr3(initial_page_table);
#else
write_cr3(real_mode_header->trampoline_pgd);
+
+ /* Exiting long mode will fail if CR4.PCIDE is set. */
+ if (static_cpu_has(X86_FEATURE_PCID))
+ cr4_clear_bits(X86_CR4_PCIDE);
#endif

/* Jump to the identity-mapped low memory code */


2018-01-01 14:34:14

by Greg Kroah-Hartman

Subject: [PATCH 4.9 04/75] tracing: Fix crash when it fails to alloc ring buffer

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jing Xia <[email protected]>

commit 24f2aaf952ee0b59f31c3a18b8b36c9e3d3c2cf5 upstream.

A double free of the ring buffer happens when allocating the new
ring buffer instance for max_buffer fails and TRACER_MAX_TRACE is
configured. The root cause is that the pointer is not set to NULL after
the buffer is freed in allocate_trace_buffers(), so the ring buffer is
freed again later because the pointer is still non-NULL, as follows:

instance_mkdir()
|-allocate_trace_buffers()
|-allocate_trace_buffer(tr, &tr->trace_buffer...)
|-allocate_trace_buffer(tr, &tr->max_buffer...)

// allocate fail(-ENOMEM),first free
// and the buffer pointer is not set to null
|-ring_buffer_free(tr->trace_buffer.buffer)

// out_free_tr
|-free_trace_buffers()
|-free_trace_buffer(&tr->trace_buffer);

//if trace_buffer is not null, free again
|-ring_buffer_free(buf->buffer)
|-rb_free_cpu_buffer(buffer->buffers[cpu])
// ring_buffer_per_cpu is null, and
// crash in ring_buffer_per_cpu->pages

Link: http://lkml.kernel.org/r/[email protected]

Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code")
Signed-off-by: Jing Xia <[email protected]>
Signed-off-by: Chunyan Zhang <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/trace/trace.c | 2 ++
1 file changed, 2 insertions(+)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6979,7 +6979,9 @@ static int allocate_trace_buffers(struct
allocate_snapshot ? size : 1);
if (WARN_ON(ret)) {
ring_buffer_free(tr->trace_buffer.buffer);
+ tr->trace_buffer.buffer = NULL;
free_percpu(tr->trace_buffer.data);
+ tr->trace_buffer.data = NULL;
return -ENOMEM;
}
tr->allocated_snapshot = allocate_snapshot;
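
Reviewer note: the fix is the usual "clear the pointer after freeing it" pattern, which makes a later cleanup pass harmless. A minimal userspace sketch, where `struct demo_buf` and `demo_buf_release()` are hypothetical stand-ins rather than the tracing structures:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a buffer pair; not the kernel's trace_buffer. */
struct demo_buf {
	void *buffer;
	void *data;
};

/* Release both members and clear them, so a repeated call is harmless. */
static void demo_buf_release(struct demo_buf *b)
{
	free(b->buffer);
	b->buffer = NULL;	/* a second release now sees NULL, not a stale pointer */
	free(b->data);
	b->data = NULL;
}
```

Calling `demo_buf_release()` twice on the same object is safe here because `free(NULL)` is defined to do nothing.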


2018-01-01 14:34:26

by Greg Kroah-Hartman

Subject: [PATCH 4.9 07/75] ASoC: wm_adsp: Fix validation of firmware and coeff lengths

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <[email protected]>

commit 50dd2ea8ef67a1617e0c0658bcbec4b9fb03b936 upstream.

The checks for whether another region/block header could be present
are subtracting the size from the current offset. Obviously we should
instead subtract the offset from the size.

The checks for whether the region/block data fit in the file are
adding the data size to the current offset and header size, without
checking for integer overflow. Rearrange these so that overflow is
impossible.

Signed-off-by: Ben Hutchings <[email protected]>
Acked-by: Charles Keepax <[email protected]>
Tested-by: Charles Keepax <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/soc/codecs/wm_adsp.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

--- a/sound/soc/codecs/wm_adsp.c
+++ b/sound/soc/codecs/wm_adsp.c
@@ -1465,7 +1465,7 @@ static int wm_adsp_load(struct wm_adsp *
le64_to_cpu(footer->timestamp));

while (pos < firmware->size &&
- pos - firmware->size > sizeof(*region)) {
+ sizeof(*region) < firmware->size - pos) {
region = (void *)&(firmware->data[pos]);
region_name = "Unknown";
reg = 0;
@@ -1526,8 +1526,8 @@ static int wm_adsp_load(struct wm_adsp *
regions, le32_to_cpu(region->len), offset,
region_name);

- if ((pos + le32_to_cpu(region->len) + sizeof(*region)) >
- firmware->size) {
+ if (le32_to_cpu(region->len) >
+ firmware->size - pos - sizeof(*region)) {
adsp_err(dsp,
"%s.%d: %s region len %d bytes exceeds file length %zu\n",
file, regions, region_name,
@@ -1992,7 +1992,7 @@ static int wm_adsp_load_coeff(struct wm_

blocks = 0;
while (pos < firmware->size &&
- pos - firmware->size > sizeof(*blk)) {
+ sizeof(*blk) < firmware->size - pos) {
blk = (void *)(&firmware->data[pos]);

type = le16_to_cpu(blk->type);
@@ -2066,8 +2066,8 @@ static int wm_adsp_load_coeff(struct wm_
}

if (reg) {
- if ((pos + le32_to_cpu(blk->len) + sizeof(*blk)) >
- firmware->size) {
+ if (le32_to_cpu(blk->len) >
+ firmware->size - pos - sizeof(*blk)) {
adsp_err(dsp,
"%s.%d: %s region len %d bytes exceeds file length %zu\n",
file, blocks, region_name,
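
Reviewer note: both halves of this fix can be shown with plain `size_t` arithmetic. With unsigned operands, `pos - size` wraps to a huge value whenever `pos < size`, so the old header check was always true; comparing against the remaining bytes avoids both the wrap and the overflow. The helper names below are hypothetical and the sketch is not the driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if another header can be present after pos (same strict '<' as the patch). */
static bool header_fits(size_t pos, size_t file_size, size_t hdr_size)
{
	return pos < file_size && hdr_size < file_size - pos;
}

/*
 * True if len payload bytes fit after the header. The subtraction cannot
 * wrap because header_fits() already guarantees hdr_size < file_size - pos,
 * and nothing is added to len, so a huge len cannot overflow the check.
 */
static bool payload_fits(size_t pos, size_t file_size, size_t hdr_size,
			 uint32_t len)
{
	return header_fits(pos, file_size, hdr_size) &&
	       len <= file_size - pos - hdr_size;
}

/* The old, buggy header test, kept only to show the unsigned wrap. */
static bool old_header_check(size_t pos, size_t file_size, size_t hdr_size)
{
	return pos < file_size && pos - file_size > hdr_size;
}
```

For example, with 5 bytes left in a 100-byte file and a 12-byte header, `old_header_check(95, 100, 12)` still reports that a header fits, while `header_fits(95, 100, 12)` does not.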


2018-01-01 14:34:34

by Greg Kroah-Hartman

Subject: [PATCH 4.9 08/75] ASoC: da7218: fix fix child-node lookup

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Johan Hovold <[email protected]>

commit bc6476d6c1edcb9b97621b5131bd169aa81f27db upstream.

Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.

To make things worse, the parent codec node was also prematurely freed.

Fixes: 4d50934abd22 ("ASoC: da7218: Add da7218 codec driver")
Signed-off-by: Johan Hovold <[email protected]>
Acked-by: Adam Thomson <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/soc/codecs/da7218.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/soc/codecs/da7218.c
+++ b/sound/soc/codecs/da7218.c
@@ -2519,7 +2519,7 @@ static struct da7218_pdata *da7218_of_to
}

if (da7218->dev_id == DA7218_DEV_ID) {
- hpldet_np = of_find_node_by_name(np, "da7218_hpldet");
+ hpldet_np = of_get_child_by_name(np, "da7218_hpldet");
if (!hpldet_np)
return pdata;



2018-01-01 14:34:40

by Greg Kroah-Hartman

Subject: [PATCH 4.9 09/75] ASoC: fsl_ssi: AC97 ops need regmap, clock and cleaning up on failure

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Maciej S. Szmigiero <[email protected]>

commit 695b78b548d8a26288f041e907ff17758df9e1d5 upstream.

AC'97 ops (register read / write) need the SSI regmap and clock, so the
ops have to be set only after the regmap and clock have been set up.

We also need to set these ops back to NULL if we fail the probe.

Signed-off-by: Maciej S. Szmigiero <[email protected]>
Acked-by: Nicolin Chen <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/soc/fsl/fsl_ssi.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)

--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -1467,12 +1467,6 @@ static int fsl_ssi_probe(struct platform
sizeof(fsl_ssi_ac97_dai));

fsl_ac97_data = ssi_private;
-
- ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
- if (ret) {
- dev_err(&pdev->dev, "could not set AC'97 ops\n");
- return ret;
- }
} else {
/* Initialize this copy of the CPU DAI driver structure */
memcpy(&ssi_private->cpu_dai_drv, &fsl_ssi_dai_template,
@@ -1583,6 +1577,14 @@ static int fsl_ssi_probe(struct platform
return ret;
}

+ if (fsl_ssi_is_ac97(ssi_private)) {
+ ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
+ if (ret) {
+ dev_err(&pdev->dev, "could not set AC'97 ops\n");
+ goto error_ac97_ops;
+ }
+ }
+
ret = devm_snd_soc_register_component(&pdev->dev, &fsl_ssi_component,
&ssi_private->cpu_dai_drv, 1);
if (ret) {
@@ -1666,6 +1668,10 @@ error_sound_card:
fsl_ssi_debugfs_remove(&ssi_private->dbg_stats);

error_asoc_register:
+ if (fsl_ssi_is_ac97(ssi_private))
+ snd_soc_set_ac97_ops(NULL);
+
+error_ac97_ops:
if (ssi_private->soc->imx)
fsl_ssi_imx_clean(pdev, ssi_private);



2018-01-01 14:34:45

by Greg Kroah-Hartman

Subject: [PATCH 4.9 26/75] net: fec: unmap the xmit buffer that are not transferred by DMA

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Fugang Duan <[email protected]>


[ Upstream commit 178e5f57a8d8f8fc5799a624b96fc31ef9a29ffa ]

The enet IP only supports 32-bit addressing, so on i.MX platforms it uses
a swiotlb buffer for DMA mapping when the xmit buffer's DMA address is
above 4G. After a suspend/resume stress test, it prints:

log:
[12826.352864] fec 5b040000.ethernet: swiotlb buffer is full (sz: 191 bytes)
[12826.359676] DMA: Out of SW-IOMMU space for 191 bytes at device 5b040000.ethernet
[12826.367110] fec 5b040000.ethernet eth0: Tx DMA memory map failed

The issue is that xmit buffers that are ready and DMA mapped, but that DMA
has not yet copied into the FIFO, are not unmapped when the MAC restarts.
So check for DMA-mapped buffers and unmap them.

Signed-off-by: Fugang Duan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/freescale/fec_main.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -813,6 +813,12 @@ static void fec_enet_bd_init(struct net_
for (i = 0; i < txq->bd.ring_size; i++) {
/* Initialize the BD for every fragment in the page. */
bdp->cbd_sc = cpu_to_fec16(0);
+ if (bdp->cbd_bufaddr &&
+ !IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
+ dma_unmap_single(&fep->pdev->dev,
+ fec32_to_cpu(bdp->cbd_bufaddr),
+ fec16_to_cpu(bdp->cbd_datlen),
+ DMA_TO_DEVICE);
if (txq->tx_skbuff[i]) {
dev_kfree_skb_any(txq->tx_skbuff[i]);
txq->tx_skbuff[i] = NULL;


2018-01-01 14:34:50

by Greg Kroah-Hartman

Subject: [PATCH 4.9 35/75] tg3: Fix rx hang on MTU change with 5717/5719

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Brian King <[email protected]>


[ Upstream commit 748a240c589824e9121befb1cba5341c319885bc ]

This fixes a hang issue seen when changing the MTU size from 1500 MTU
to 9000 MTU on both 5717 and 5719 chips. In discussion with Broadcom,
they've indicated that these chipsets have the same phy as the 57766
chipset, so the same workarounds apply. This has been tested by IBM
on both Power 8 and Power 9 systems as well as by Broadcom on x86
hardware and has been confirmed to resolve the hang issue.

Signed-off-by: Brian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/broadcom/tg3.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -14226,7 +14226,9 @@ static int tg3_change_mtu(struct net_dev
/* Reset PHY, otherwise the read DMA engine will be in a mode that
* breaks all requests to 256 bytes.
*/
- if (tg3_asic_rev(tp) == ASIC_REV_57766)
+ if (tg3_asic_rev(tp) == ASIC_REV_57766 ||
+ tg3_asic_rev(tp) == ASIC_REV_5717 ||
+ tg3_asic_rev(tp) == ASIC_REV_5719)
reset_phy = true;

err = tg3_restart_hw(tp, reset_phy);


2018-01-01 14:35:00

by Greg Kroah-Hartman

Subject: [PATCH 4.9 37/75] net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tobias Jordan <[email protected]>


[ Upstream commit 589bf32f09852041fbd3b7ce1a9e703f95c230ba ]

Add appropriate calls to clk_disable_unprepare() by jumping to out_mdio
in case orion_mdio_probe() returns -EPROBE_DEFER.

Found by Linux Driver Verification project (linuxtesting.org).

Fixes: 3d604da1e954 ("net: mvmdio: get and enable optional clock")
Signed-off-by: Tobias Jordan <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/marvell/mvmdio.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/marvell/mvmdio.c
+++ b/drivers/net/ethernet/marvell/mvmdio.c
@@ -232,7 +232,8 @@ static int orion_mdio_probe(struct platf
dev->regs + MVMDIO_ERR_INT_MASK);

} else if (dev->err_interrupt == -EPROBE_DEFER) {
- return -EPROBE_DEFER;
+ ret = -EPROBE_DEFER;
+ goto out_mdio;
}

mutex_init(&dev->lock);
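
Reviewer note: the change swaps an early `return` for a jump to the shared cleanup label, so that resources acquired earlier in probe are released on the deferral path too. A minimal userspace sketch of that shape follows. Everything named `demo_*` or `DEMO_*` is hypothetical; `DEMO_EPROBE_DEFER` is a local stand-in for the kernel-internal error code.

```c
/* Local stand-in mirroring the kernel-internal EPROBE_DEFER value. */
#define DEMO_EPROBE_DEFER 517

/* Hypothetical flag standing in for a prepared and enabled clock. */
static int demo_clock_enabled;

static int demo_probe(int irq_result)
{
	int ret;

	demo_clock_enabled = 1;		/* resource acquired early in probe */

	if (irq_result == -DEMO_EPROBE_DEFER) {
		/* Do not return directly: the clock would stay enabled. */
		ret = -DEMO_EPROBE_DEFER;
		goto out_disable;
	}

	return 0;

out_disable:
	demo_clock_enabled = 0;		/* undo the early acquisition */
	return ret;
}
```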


2018-01-01 14:35:04

by Greg Kroah-Hartman

Subject: [PATCH 4.9 38/75] sctp: Replace use of sockets_allocated with specified macro.

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tonghao Zhang <[email protected]>


[ Upstream commit 8cb38a602478e9f806571f6920b0a3298aabf042 ]

Commit 180d8cd942ce replaced all uses of the struct sock fields
memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem
with accessor macros, but the sockets_allocated field of the sctp sock
was never converted. Convert it now to unify the code.

Fixes: 180d8cd942ce ("foundations of per-cgroup memory pressure controlling.")
Cc: Glauber Costa <[email protected]>
Signed-off-by: Tonghao Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sctp/socket.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4246,7 +4246,7 @@ static int sctp_init_sock(struct sock *s
SCTP_DBG_OBJCNT_INC(sock);

local_bh_disable();
- percpu_counter_inc(&sctp_sockets_allocated);
+ sk_sockets_allocated_inc(sk);
sock_prot_inuse_add(net, sk->sk_prot, 1);

/* Nothing can fail after this block, otherwise
@@ -4290,7 +4290,7 @@ static void sctp_destroy_sock(struct soc
}
sctp_endpoint_free(sp->ep);
local_bh_disable();
- percpu_counter_dec(&sctp_sockets_allocated);
+ sk_sockets_allocated_dec(sk);
sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
local_bh_enable();
}


2018-01-01 14:35:09

by Greg Kroah-Hartman

Subject: [PATCH 4.9 40/75] ipv4: Fix use-after-free when flushing FIB tables

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ido Schimmel <[email protected]>


[ Upstream commit b4681c2829e24943aadd1a7bb3a30d41d0a20050 ]

Since commit 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse") the
local table uses the same trie allocated for the main table when custom
rules are not in use.

When a net namespace is dismantled, the main table is flushed and freed
(via an RCU callback) before the local table. In case the callback is
invoked before the local table is iterated, a use-after-free can occur.

Fix this by iterating over the FIB tables in reverse order, so that the
main table is always freed after the local table.

v3: Reworded comment according to Alex's suggestion.
v2: Add a comment to make the fix more explicit per Dave's and Alex's
feedback.

Fixes: 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Fengguang Wu <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/fib_frontend.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1253,14 +1253,19 @@ fail:

static void ip_fib_net_exit(struct net *net)
{
- unsigned int i;
+ int i;

rtnl_lock();
#ifdef CONFIG_IP_MULTIPLE_TABLES
RCU_INIT_POINTER(net->ipv4.fib_main, NULL);
RCU_INIT_POINTER(net->ipv4.fib_default, NULL);
#endif
- for (i = 0; i < FIB_TABLE_HASHSZ; i++) {
+ /* Destroy the tables in reverse order to guarantee that the
+ * local table, ID 255, is destroyed before the main table, ID
+ * 254. This is necessary as the local table may contain
+ * references to data contained in the main table.
+ */
+ for (i = FIB_TABLE_HASHSZ - 1; i >= 0; i--) {
struct hlist_head *head = &net->ipv4.fib_table_hash[i];
struct hlist_node *tmp;
struct fib_table *tb;
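
Reviewer note: the switch from `unsigned int i` to `int i` is needed for the reversed loop to terminate, since an unsigned counter always satisfies `i >= 0`. A minimal sketch of the descending loop, where `DEMO_TABLE_HASHSZ` and `visit_reverse()` are hypothetical and the table size is chosen arbitrarily:

```c
#define DEMO_TABLE_HASHSZ 4	/* hypothetical size, for illustration only */

/* Visit slots from last to first, recording the order; returns the count. */
static int visit_reverse(int out[DEMO_TABLE_HASHSZ])
{
	int n = 0;
	int i;	/* must be signed: an unsigned i would satisfy i >= 0 forever */

	for (i = DEMO_TABLE_HASHSZ - 1; i >= 0; i--)
		out[n++] = i;

	return n;
}
```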


2018-01-01 14:35:16

by Greg Kroah-Hartman

Subject: [PATCH 4.9 42/75] net: fec: Allow reception of frames bigger than 1522 bytes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrew Lunn <[email protected]>


[ Upstream commit fbbeefdd21049fcf9437c809da3828b210577f36 ]

The FEC Receive Control Register has a 14 bit field indicating the
longest frame that may be received. It is being set to 1522. Frames
longer than this are discarded, but counted as being in error.

When using DSA, frames from the switch have an additional header,
either 4 or 8 bytes if a Marvell switch is used. Thus a full MTU frame
of 1522 bytes received by the switch on a port becomes 1530 bytes when
passed to the host via the FEC interface.

Change the maximum receive size to 2048 - 64, where 64 is the maximum
rx_alignment applied on the receive buffer for AVB capable FEC
cores. Use this value also for the maximum receive buffer size. The
driver is already allocating a receive SKB of 2048 bytes, so this
change should not have any significant effects.

Tested on imx51, imx6, vf610.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/freescale/fec_main.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -172,10 +172,12 @@ MODULE_PARM_DESC(macaddr, "FEC Ethernet
#endif /* CONFIG_M5272 */

/* The FEC stores dest/src/type/vlan, data, and checksum for receive packets.
+ *
+ * 2048 byte skbufs are allocated. However, alignment requirements
+ * varies between FEC variants. Worst case is 64, so round down by 64.
*/
-#define PKT_MAXBUF_SIZE 1522
+#define PKT_MAXBUF_SIZE (round_down(2048 - 64, 64))
#define PKT_MINBUF_SIZE 64
-#define PKT_MAXBLR_SIZE 1536

/* FEC receive acceleration */
#define FEC_RACC_IPDIS (1 << 1)
@@ -853,7 +855,7 @@ static void fec_enet_enable_ring(struct
for (i = 0; i < fep->num_rx_queues; i++) {
rxq = fep->rx_queue[i];
writel(rxq->bd.dma, fep->hwp + FEC_R_DES_START(i));
- writel(PKT_MAXBLR_SIZE, fep->hwp + FEC_R_BUFF_SIZE(i));
+ writel(PKT_MAXBUF_SIZE, fep->hwp + FEC_R_BUFF_SIZE(i));

/* enable DMA1/2 */
if (i)
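
Reviewer note: the new `PKT_MAXBUF_SIZE` works out to 1984 bytes, which is enough for the 1530-byte DSA frame from the commit message. The sketch below redoes the arithmetic in userspace. `DEMO_ROUND_DOWN` is a simplified power-of-two stand-in for the kernel's `round_down()`, and `frame_accepted()` is a hypothetical helper that assumes the limit is inclusive.

```c
/* Simplified round-down for power-of-two alignments (not the kernel macro). */
#define DEMO_ROUND_DOWN(x, align)	((x) & ~((align) - 1))

#define DEMO_SKB_SIZE		2048	/* receive buffer allocated per frame */
#define DEMO_MAX_ALIGN		64	/* worst-case rx alignment */
#define DEMO_PKT_MAXBUF_SIZE \
	DEMO_ROUND_DOWN(DEMO_SKB_SIZE - DEMO_MAX_ALIGN, DEMO_MAX_ALIGN)

/* Nonzero if a frame of frame_len bytes is within the receive limit. */
static int frame_accepted(unsigned int frame_len)
{
	return frame_len <= DEMO_PKT_MAXBUF_SIZE;
}
```

Since 2048 - 64 = 1984 is already a multiple of 64, the rounding leaves it unchanged, and the value fits in the 14-bit field mentioned above.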


2018-01-01 14:35:19

by Greg Kroah-Hartman

Subject: [PATCH 4.9 43/75] net: Fix double free and memory corruption in get_net_ns_by_id()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>


[ Upstream commit 21b5944350052d2583e82dd59b19a9ba94a007f0 ]

(I can trivially verify that that idr_remove in cleanup_net happens
after the network namespace count has dropped to zero --EWB)

Function get_net_ns_by_id() does not check for net::count
after it has found a peer in netns_ids idr.

It may dereference a peer after its count has already been
finally decremented. This leads to a double free and memory
corruption:

put_net(peer) rtnl_lock()
atomic_dec_and_test(&peer->count) [count=0] ...
__put_net(peer) get_net_ns_by_id(net, id)
spin_lock(&cleanup_list_lock)
list_add(&net->cleanup_list, &cleanup_list)
spin_unlock(&cleanup_list_lock)
queue_work() peer = idr_find(&net->netns_ids, id)
| get_net(peer) [count=1]
| ...
| (use after final put)
v ...
cleanup_net() ...
spin_lock(&cleanup_list_lock) ...
list_replace_init(&cleanup_list, ..) ...
spin_unlock(&cleanup_list_lock) ...
... ...
... put_net(peer)
... atomic_dec_and_test(&peer->count) [count=0]
... spin_lock(&cleanup_list_lock)
... list_add(&net->cleanup_list, &cleanup_list)
... spin_unlock(&cleanup_list_lock)
... queue_work()
... rtnl_unlock()
rtnl_lock() ...
for_each_net(tmp) { ...
id = __peernet2id(tmp, peer) ...
spin_lock_irq(&tmp->nsid_lock) ...
idr_remove(&tmp->netns_ids, id) ...
... ...
net_drop_ns() ...
net_free(peer) ...
} ...
|
v
cleanup_net()
...
(Second free of peer)

Also, put_net() on the right cpu may reorder with the left cpu's
list_replace_init(&cleanup_list, ..), and then cleanup_list
will be corrupted.

Since cleanup_net() is executed in a worker thread, while
put_net(peer) can happen anywhere, there should be
enough time for a concurrent get_net_ns_by_id() to pick
the peer up, so the race does not seem unlikely.
The patch fixes the problem in the standard way.

(Also, there is a possible problem in peernet2id_alloc(), which requires
a check for net::count under nsid_lock and maybe_get_net(peer), but
in the current stable kernel it's used under rtnl_lock() and it has to be
safe. Openswitch has begun to use peernet2id_alloc(), and possibly it should
be fixed too. As this is not in the stable kernel yet, I'll send
a separate message to netdev@ later).

Cc: Nicolas Dichtel <[email protected]>
Signed-off-by: Kirill Tkhai <[email protected]>
Fixes: 0c7aecd4bde4 "netns: add rtnl cmd to add and get peer netns ids"
Reviewed-by: Andrey Ryabinin <[email protected]>
Reviewed-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Eric W. Biederman <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Acked-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/net_namespace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -263,7 +263,7 @@ struct net *get_net_ns_by_id(struct net
spin_lock_irqsave(&net->nsid_lock, flags);
peer = idr_find(&net->netns_ids, id);
if (peer)
- get_net(peer);
+ peer = maybe_get_net(peer);
spin_unlock_irqrestore(&net->nsid_lock, flags);
rcu_read_unlock();
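
Reviewer note: the "standard way" here is to take a reference only if the count has not already reached zero. Below is a minimal sketch of that increment-unless-zero idea using C11 atomics. `struct demo_obj`, `demo_get_unless_zero()` and `demo_maybe_get()` are hypothetical names; this is a userspace illustration, not the kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for a namespace. */
struct demo_obj {
	atomic_int count;
};

/* Take a reference only while the count is still nonzero. */
static bool demo_get_unless_zero(struct demo_obj *obj)
{
	int old = atomic_load(&obj->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&obj->count, &old, old + 1))
			return true;
	}
	return false;	/* final put already happened; do not resurrect */
}

/* Return obj with a new reference, or NULL if it is already dying. */
static struct demo_obj *demo_maybe_get(struct demo_obj *obj)
{
	return demo_get_unless_zero(obj) ? obj : NULL;
}
```

An unconditional increment would take the count from 0 back to 1 on an object already queued for cleanup, which is the situation shown in the race diagram above.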



2018-01-01 14:35:24

by Greg Kroah-Hartman

Subject: [PATCH 4.9 44/75] net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Grygorii Strashko <[email protected]>


[ Upstream commit c1a8d0a3accf64a014d605e6806ce05d1c17adf1 ]

Under some circumstances the driver will perform a PHY reset in
ksz9031_read_status() to fix an autoneg failure case (idle error count =
0xFF). When this happens, the ksz9031 will no longer detect link status
changes when connected to a Netgear 1G switch (the link can sometimes be
recovered by restarting the netdevice with "ifconfig down up"). Reproduced
with a TI am572x board equipped with a ksz9031 PHY while connected to a
Netgear 1G switch.

Fix the issue by reconfiguring autonegotiation after PHY reset in
ksz9031_read_status().

Fixes: d2fd719bcb0e ("net/phy: micrel: Add workaround for bad autoneg")
Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/phy/micrel.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -624,6 +624,7 @@ static int ksz9031_read_status(struct ph
phydev->link = 0;
if (phydev->drv->config_intr && phy_interrupt_is_valid(phydev))
phydev->drv->config_intr(phydev);
+ return genphy_config_aneg(phydev);
}

return 0;


2018-01-01 14:35:29

by Greg Kroah-Hartman

Subject: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kevin Cernekee <[email protected]>


[ Upstream commit a46182b00290839fa3fa159d54fd3237bd8669f0 ]

Closing a multicast socket after the final IPv4 address is deleted
from an interface can generate a membership report that uses the
source IP from a different interface. The following test script, run
from an isolated netns, reproduces the issue:

#!/bin/bash

ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link set dummy0 up
ip link set dummy1 up
ip addr add 10.1.1.1/24 dev dummy0
ip addr add 192.168.99.99/24 dev dummy1

tcpdump -U -i dummy0 &
socat EXEC:"sleep 2" \
UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 &

sleep 1
ip addr del 10.1.1.1/24 dev dummy0
sleep 5
kill %tcpdump

RFC 3376 specifies that the report must be sent with a valid IP source
address from the destination subnet, or from address 0.0.0.0. Add an
extra check to make sure this is the case.

Signed-off-by: Kevin Cernekee <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/igmp.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -89,6 +89,7 @@
#include <linux/rtnetlink.h>
#include <linux/times.h>
#include <linux/pkt_sched.h>
+#include <linux/byteorder/generic.h>

#include <net/net_namespace.h>
#include <net/arp.h>
@@ -321,6 +322,23 @@ igmp_scount(struct ip_mc_list *pmc, int
return scount;
}

+/* source address selection per RFC 3376 section 4.2.13 */
+static __be32 igmpv3_get_srcaddr(struct net_device *dev,
+ const struct flowi4 *fl4)
+{
+ struct in_device *in_dev = __in_dev_get_rcu(dev);
+
+ if (!in_dev)
+ return htonl(INADDR_ANY);
+
+ for_ifa(in_dev) {
+ if (inet_ifa_match(fl4->saddr, ifa))
+ return fl4->saddr;
+ } endfor_ifa(in_dev);
+
+ return htonl(INADDR_ANY);
+}
+
static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu)
{
struct sk_buff *skb;
@@ -368,7 +386,7 @@ static struct sk_buff *igmpv3_newpack(st
pip->frag_off = htons(IP_DF);
pip->ttl = 1;
pip->daddr = fl4.daddr;
- pip->saddr = fl4.saddr;
+ pip->saddr = igmpv3_get_srcaddr(dev, &fl4);
pip->protocol = IPPROTO_IGMP;
pip->tot_len = 0; /* filled in later */
ip_select_ident(net, skb, NULL);


2018-01-01 14:35:36

by Greg Kroah-Hartman

Subject: [PATCH 4.9 47/75] net/mlx5: Fix rate limit packet pacing naming and struct

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eran Ben Elisha <[email protected]>


[ Upstream commit 37e92a9d4fe38dc3e7308913575983a6a088c8d4 ]

In mlx5_ifc, the struct definition was incomplete, so the driver was
sending garbage after the last defined field. Fix it by adding a reserved
field that completes the struct size.

In addition, rename all set_rate_limit to set_pp_rate_limit to be
compliant with the Firmware <-> Driver definition.
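
Why a missing reserved field changes what is sent can be sketched as
below, assuming (as the layout structs suggest) that each u8 element
stands for one bit and that buffers are sized from the struct. The demo_*
names are hypothetical; only the 0xa0 offset and the 0x160 reserved width
come from the patch.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char u8;

/* Layout before the fix: only the first 0xa0 bits are described. */
struct demo_in_bits_before {
	u8 defined_fields[0xa0];
};

/* Layout after the fix: padded with reserved bits up to 0x200. */
struct demo_in_bits_after {
	u8 defined_fields[0xa0];
	u8 reserved_at_a0[0x160];
};

/* One u8 per bit, so 32 array elements make up one 32-bit dword. */
#define DEMO_ST_SZ_DW(typ) (sizeof(struct typ) / 32)

/* 0xa0 bits are 5 dwords; 0xa0 + 0x160 = 0x200 bits are 16 dwords. */
static_assert(DEMO_ST_SZ_DW(demo_in_bits_before) == 5, "short layout");
static_assert(DEMO_ST_SZ_DW(demo_in_bits_after) == 16, "full layout");
```

A buffer declared with the short layout is 20 bytes, so anything read
past it is whatever happens to follow in memory; the padded layout yields
a zero-initialized 64-byte buffer.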

Fixes: 7486216b3a0b ("{net,IB}/mlx5: mlx5_ifc updates")
Fixes: 1466cc5b23d1 ("net/mlx5: Rate limit tables support")
Signed-off-by: Eran Ben Elisha <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/rl.c | 22 +++++++++++-----------
include/linux/mlx5/mlx5_ifc.h | 8 +++++---
3 files changed, 18 insertions(+), 16 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -367,7 +367,7 @@ static int mlx5_internal_err_ret_value(s
case MLX5_CMD_OP_QUERY_VPORT_COUNTER:
case MLX5_CMD_OP_ALLOC_Q_COUNTER:
case MLX5_CMD_OP_QUERY_Q_COUNTER:
- case MLX5_CMD_OP_SET_RATE_LIMIT:
+ case MLX5_CMD_OP_SET_PP_RATE_LIMIT:
case MLX5_CMD_OP_QUERY_RATE_LIMIT:
case MLX5_CMD_OP_ALLOC_PD:
case MLX5_CMD_OP_ALLOC_UAR:
@@ -502,7 +502,7 @@ const char *mlx5_command_str(int command
MLX5_COMMAND_STR_CASE(ALLOC_Q_COUNTER);
MLX5_COMMAND_STR_CASE(DEALLOC_Q_COUNTER);
MLX5_COMMAND_STR_CASE(QUERY_Q_COUNTER);
- MLX5_COMMAND_STR_CASE(SET_RATE_LIMIT);
+ MLX5_COMMAND_STR_CASE(SET_PP_RATE_LIMIT);
MLX5_COMMAND_STR_CASE(QUERY_RATE_LIMIT);
MLX5_COMMAND_STR_CASE(ALLOC_PD);
MLX5_COMMAND_STR_CASE(DEALLOC_PD);
--- a/drivers/net/ethernet/mellanox/mlx5/core/rl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/rl.c
@@ -60,16 +60,16 @@ static struct mlx5_rl_entry *find_rl_ent
return ret_entry;
}

-static int mlx5_set_rate_limit_cmd(struct mlx5_core_dev *dev,
+static int mlx5_set_pp_rate_limit_cmd(struct mlx5_core_dev *dev,
u32 rate, u16 index)
{
- u32 in[MLX5_ST_SZ_DW(set_rate_limit_in)] = {0};
- u32 out[MLX5_ST_SZ_DW(set_rate_limit_out)] = {0};
+ u32 in[MLX5_ST_SZ_DW(set_pp_rate_limit_in)] = {0};
+ u32 out[MLX5_ST_SZ_DW(set_pp_rate_limit_out)] = {0};

- MLX5_SET(set_rate_limit_in, in, opcode,
- MLX5_CMD_OP_SET_RATE_LIMIT);
- MLX5_SET(set_rate_limit_in, in, rate_limit_index, index);
- MLX5_SET(set_rate_limit_in, in, rate_limit, rate);
+ MLX5_SET(set_pp_rate_limit_in, in, opcode,
+ MLX5_CMD_OP_SET_PP_RATE_LIMIT);
+ MLX5_SET(set_pp_rate_limit_in, in, rate_limit_index, index);
+ MLX5_SET(set_pp_rate_limit_in, in, rate_limit, rate);
return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
}

@@ -108,7 +108,7 @@ int mlx5_rl_add_rate(struct mlx5_core_de
entry->refcount++;
} else {
/* new rate limit */
- err = mlx5_set_rate_limit_cmd(dev, rate, entry->index);
+ err = mlx5_set_pp_rate_limit_cmd(dev, rate, entry->index);
if (err) {
mlx5_core_err(dev, "Failed configuring rate: %u (%d)\n",
rate, err);
@@ -144,7 +144,7 @@ void mlx5_rl_remove_rate(struct mlx5_cor
entry->refcount--;
if (!entry->refcount) {
/* need to remove rate */
- mlx5_set_rate_limit_cmd(dev, 0, entry->index);
+ mlx5_set_pp_rate_limit_cmd(dev, 0, entry->index);
entry->rate = 0;
}

@@ -197,8 +197,8 @@ void mlx5_cleanup_rl_table(struct mlx5_c
/* Clear all configured rates */
for (i = 0; i < table->max_size; i++)
if (table->rl_entry[i].rate)
- mlx5_set_rate_limit_cmd(dev, 0,
- table->rl_entry[i].index);
+ mlx5_set_pp_rate_limit_cmd(dev, 0,
+ table->rl_entry[i].index);

kfree(dev->priv.rl_table.rl_entry);
}
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -143,7 +143,7 @@ enum {
MLX5_CMD_OP_ALLOC_Q_COUNTER = 0x771,
MLX5_CMD_OP_DEALLOC_Q_COUNTER = 0x772,
MLX5_CMD_OP_QUERY_Q_COUNTER = 0x773,
- MLX5_CMD_OP_SET_RATE_LIMIT = 0x780,
+ MLX5_CMD_OP_SET_PP_RATE_LIMIT = 0x780,
MLX5_CMD_OP_QUERY_RATE_LIMIT = 0x781,
MLX5_CMD_OP_ALLOC_PD = 0x800,
MLX5_CMD_OP_DEALLOC_PD = 0x801,
@@ -6689,7 +6689,7 @@ struct mlx5_ifc_add_vxlan_udp_dport_in_b
u8 vxlan_udp_port[0x10];
};

-struct mlx5_ifc_set_rate_limit_out_bits {
+struct mlx5_ifc_set_pp_rate_limit_out_bits {
u8 status[0x8];
u8 reserved_at_8[0x18];

@@ -6698,7 +6698,7 @@ struct mlx5_ifc_set_rate_limit_out_bits
u8 reserved_at_40[0x40];
};

-struct mlx5_ifc_set_rate_limit_in_bits {
+struct mlx5_ifc_set_pp_rate_limit_in_bits {
u8 opcode[0x10];
u8 reserved_at_10[0x10];

@@ -6711,6 +6711,8 @@ struct mlx5_ifc_set_rate_limit_in_bits {
u8 reserved_at_60[0x20];

u8 rate_limit[0x20];
+
+ u8 reserved_at_a0[0x160];
};

struct mlx5_ifc_access_register_out_bits {


2018-01-01 14:35:40

by Greg Kroah-Hartman

Subject: [PATCH 4.9 48/75] net/mlx5e: Fix features check of IPv6 traffic

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gal Pressman <[email protected]>


[ Upstream commit 2989ad1ec03021ee6d2193c35414f1d970a243de ]

The assumption that the next header field contains the transport
protocol is wrong for IPv6 packets with extension headers.
Instead, we should look at the innermost next header field in the buffer.
This will fix TSO offload for tunnels over IPv6 with extension headers.
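
Walking the next-header chain can be sketched in userspace C as below.
This is a simplified illustration and not ipv6_find_hdr(): it only
handles the hop-by-hop (0), routing (43), fragment (44) and destination
options (60) headers, ignores fragment offsets, and all demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int demo_is_ext_hdr(uint8_t nh)
{
	return nh == 0 || nh == 43 || nh == 44 || nh == 60;
}

/*
 * first_nexthdr is the next header value from the fixed IPv6 header and
 * payload points just past that 40-byte header. Returns the transport
 * protocol number, or -1 if the chain is truncated.
 */
static int demo_find_transport_proto(uint8_t first_nexthdr,
				     const uint8_t *payload, size_t len)
{
	uint8_t nh = first_nexthdr;
	size_t off = 0;

	assert(payload != NULL || len == 0);

	while (demo_is_ext_hdr(nh)) {
		size_t hdrlen;

		if (len - off < 8)
			return -1;
		/* Fragment header is fixed size; others carry a length
		 * in 8-octet units, not counting the first 8 octets. */
		hdrlen = (nh == 44) ? 8 : ((size_t)payload[off + 1] + 1) * 8;
		if (hdrlen > len - off)
			return -1;
		nh = payload[off];	/* byte 0 is the next header */
		off += hdrlen;
	}

	return nh;
}
```

For a packet whose fixed header says hop-by-hop (0) and whose hop-by-hop
header says GRE (47), reading only the fixed header yields 0, whereas the
walk above yields 47.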

Performance testing: 19.25x improvement, cool!
Measuring bandwidth of 16 threads TCP traffic over IPv6 GRE tap.
CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
TSO: Enabled
Before: 4,926.24 Mbps
Now : 94,827.91 Mbps

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3038,6 +3038,7 @@ static netdev_features_t mlx5e_vxlan_fea
struct sk_buff *skb,
netdev_features_t features)
{
+ unsigned int offset = 0;
struct udphdr *udph;
u16 proto;
u16 port = 0;
@@ -3047,7 +3048,7 @@ static netdev_features_t mlx5e_vxlan_fea
proto = ip_hdr(skb)->protocol;
break;
case htons(ETH_P_IPV6):
- proto = ipv6_hdr(skb)->nexthdr;
+ proto = ipv6_find_hdr(skb, &offset, -1, NULL, NULL);
break;
default:
goto out;


2018-01-01 14:35:46

by Greg Kroah-Hartman

Subject: [PATCH 4.9 49/75] net/mlx5e: Fix possible deadlock of VXLAN lock

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gal Pressman <[email protected]>


[ Upstream commit 6323514116404cc651df1b7fffa1311ddf8ce647 ]

mlx5e_vxlan_lookup_port is called both from mlx5e_add_vxlan_port (user
context) and mlx5e_features_check (softirq), but the lock acquired does
not disable bottom half and might result in deadlock. Fix it by simply
replacing spin_lock() with spin_lock_bh().
While at it, replace all unnecessary spin_lock_irq() with spin_lock_bh().

lockdep's WARNING: inconsistent lock state
[ 654.028136] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 654.028229] swapper/5/0 [HC0[0]:SC1[9]:HE1:SE0] takes:
[ 654.028321] (&(&vxlan_db->lock)->rlock){+.?.}, at: [<ffffffffa06e7f0e>] mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
[ 654.028528] {SOFTIRQ-ON-W} state was registered at:
[ 654.028607] _raw_spin_lock+0x3c/0x70
[ 654.028689] mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
[ 654.028794] mlx5e_vxlan_add_port+0x2e/0x120 [mlx5_core]
[ 654.028878] process_one_work+0x1e9/0x640
[ 654.028942] worker_thread+0x4a/0x3f0
[ 654.029002] kthread+0x141/0x180
[ 654.029056] ret_from_fork+0x24/0x30
[ 654.029114] irq event stamp: 579088
[ 654.029174] hardirqs last enabled at (579088): [<ffffffff818f475a>] ip6_finish_output2+0x49a/0x8c0
[ 654.029309] hardirqs last disabled at (579087): [<ffffffff818f470e>] ip6_finish_output2+0x44e/0x8c0
[ 654.029446] softirqs last enabled at (579030): [<ffffffff810b3b3d>] irq_enter+0x6d/0x80
[ 654.029567] softirqs last disabled at (579031): [<ffffffff810b3c05>] irq_exit+0xb5/0xc0
[ 654.029684] other info that might help us debug this:
[ 654.029781] Possible unsafe locking scenario:

[ 654.029868] CPU0
[ 654.029908] ----
[ 654.029947] lock(&(&vxlan_db->lock)->rlock);
[ 654.030045] <Interrupt>
[ 654.030090] lock(&(&vxlan_db->lock)->rlock);
[ 654.030162]
*** DEADLOCK ***

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/vxlan.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
@@ -71,9 +71,9 @@ struct mlx5e_vxlan *mlx5e_vxlan_lookup_p
struct mlx5e_vxlan_db *vxlan_db = &priv->vxlan;
struct mlx5e_vxlan *vxlan;

- spin_lock(&vxlan_db->lock);
+ spin_lock_bh(&vxlan_db->lock);
vxlan = radix_tree_lookup(&vxlan_db->tree, port);
- spin_unlock(&vxlan_db->lock);
+ spin_unlock_bh(&vxlan_db->lock);

return vxlan;
}
@@ -100,9 +100,9 @@ static void mlx5e_vxlan_add_port(struct

vxlan->udp_port = port;

- spin_lock_irq(&vxlan_db->lock);
+ spin_lock_bh(&vxlan_db->lock);
err = radix_tree_insert(&vxlan_db->tree, vxlan->udp_port, vxlan);
- spin_unlock_irq(&vxlan_db->lock);
+ spin_unlock_bh(&vxlan_db->lock);
if (err)
goto err_free;

@@ -121,9 +121,9 @@ static void __mlx5e_vxlan_core_del_port(
struct mlx5e_vxlan_db *vxlan_db = &priv->vxlan;
struct mlx5e_vxlan *vxlan;

- spin_lock_irq(&vxlan_db->lock);
+ spin_lock_bh(&vxlan_db->lock);
vxlan = radix_tree_delete(&vxlan_db->tree, port);
- spin_unlock_irq(&vxlan_db->lock);
+ spin_unlock_bh(&vxlan_db->lock);

if (!vxlan)
return;
@@ -171,12 +171,12 @@ void mlx5e_vxlan_cleanup(struct mlx5e_pr
struct mlx5e_vxlan *vxlan;
unsigned int port = 0;

- spin_lock_irq(&vxlan_db->lock);
+ spin_lock_bh(&vxlan_db->lock);
while (radix_tree_gang_lookup(&vxlan_db->tree, (void **)&vxlan, port, 1)) {
port = vxlan->udp_port;
- spin_unlock_irq(&vxlan_db->lock);
+ spin_unlock_bh(&vxlan_db->lock);
__mlx5e_vxlan_core_del_port(priv, (u16)port);
- spin_lock_irq(&vxlan_db->lock);
+ spin_lock_bh(&vxlan_db->lock);
}
- spin_unlock_irq(&vxlan_db->lock);
+ spin_unlock_bh(&vxlan_db->lock);
}


2018-01-01 14:35:49

by Greg Kroah-Hartman

Subject: [PATCH 4.9 50/75] net/mlx5e: Add refcount to VXLAN structure

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gal Pressman <[email protected]>


[ Upstream commit 23f4cc2cd9ed92570647220aca60d0197d8c1fa9 ]

A refcount mechanism must be implemented in order to prevent unwanted
scenarios such as:
- Open an IPv4 VXLAN interface
- Open an IPv6 VXLAN interface (different socket)
- Remove one of the interfaces

With the current implementation, the UDP port is removed from our VXLAN
database, which turns off the offloads for the other interface even
though it is still active.
The reference count mechanism will only allow UDP port removals once all
consumers are gone.
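
The add/remove semantics can be sketched in userspace C as below. This is
an illustration only: it uses a small fixed array instead of a radix
tree, C11 atomics instead of atomic_t, and no locking, and all demo_*
names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_PORTS 8

struct demo_vxlan {
	atomic_int refcount;
	uint16_t udp_port;
};

struct demo_vxlan_db {
	struct demo_vxlan *slots[DEMO_MAX_PORTS];
};

static struct demo_vxlan *demo_lookup_port(struct demo_vxlan_db *db,
					   uint16_t port)
{
	assert(db != NULL);

	for (size_t i = 0; i < DEMO_MAX_PORTS; i++)
		if (db->slots[i] && db->slots[i]->udp_port == port)
			return db->slots[i];
	return NULL;
}

/* Returns 0 on success, -1 if no memory or no free slot is left. */
static int demo_add_port(struct demo_vxlan_db *db, uint16_t port)
{
	struct demo_vxlan *v = demo_lookup_port(db, port);

	if (v) {
		/* Port already offloaded: just take another reference. */
		atomic_fetch_add(&v->refcount, 1);
		return 0;
	}

	for (size_t i = 0; i < DEMO_MAX_PORTS; i++) {
		if (db->slots[i])
			continue;
		v = malloc(sizeof(*v));
		if (!v)
			return -1;
		atomic_init(&v->refcount, 1);
		v->udp_port = port;
		db->slots[i] = v;
		return 0;
	}
	return -1;
}

/* Returns 1 if the port was really removed, 0 if other users remain,
 * -1 if the port is unknown. */
static int demo_del_port(struct demo_vxlan_db *db, uint16_t port)
{
	assert(db != NULL);

	for (size_t i = 0; i < DEMO_MAX_PORTS; i++) {
		struct demo_vxlan *v = db->slots[i];

		if (!v || v->udp_port != port)
			continue;
		/* Only the last reference tears the entry down. */
		if (atomic_fetch_sub(&v->refcount, 1) != 1)
			return 0;
		db->slots[i] = NULL;
		free(v);
		return 1;
	}
	return -1;
}
```

Adding port 4789 twice (IPv4 and IPv6 sockets) and deleting it once
leaves the entry in place; only the second delete removes it.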

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/vxlan.c | 50 ++++++++++++------------
drivers/net/ethernet/mellanox/mlx5/core/vxlan.h | 1
2 files changed, 28 insertions(+), 23 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
@@ -88,8 +88,11 @@ static void mlx5e_vxlan_add_port(struct
struct mlx5e_vxlan *vxlan;
int err;

- if (mlx5e_vxlan_lookup_port(priv, port))
+ vxlan = mlx5e_vxlan_lookup_port(priv, port);
+ if (vxlan) {
+ atomic_inc(&vxlan->refcount);
goto free_work;
+ }

if (mlx5e_vxlan_core_add_port_cmd(priv->mdev, port))
goto free_work;
@@ -99,6 +102,7 @@ static void mlx5e_vxlan_add_port(struct
goto err_delete_port;

vxlan->udp_port = port;
+ atomic_set(&vxlan->refcount, 1);

spin_lock_bh(&vxlan_db->lock);
err = radix_tree_insert(&vxlan_db->tree, vxlan->udp_port, vxlan);
@@ -116,32 +120,33 @@ free_work:
kfree(vxlan_work);
}

-static void __mlx5e_vxlan_core_del_port(struct mlx5e_priv *priv, u16 port)
+static void mlx5e_vxlan_del_port(struct work_struct *work)
{
+ struct mlx5e_vxlan_work *vxlan_work =
+ container_of(work, struct mlx5e_vxlan_work, work);
+ struct mlx5e_priv *priv = vxlan_work->priv;
struct mlx5e_vxlan_db *vxlan_db = &priv->vxlan;
+ u16 port = vxlan_work->port;
struct mlx5e_vxlan *vxlan;
+ bool remove = false;

spin_lock_bh(&vxlan_db->lock);
- vxlan = radix_tree_delete(&vxlan_db->tree, port);
- spin_unlock_bh(&vxlan_db->lock);
-
+ vxlan = radix_tree_lookup(&vxlan_db->tree, port);
if (!vxlan)
- return;
-
- mlx5e_vxlan_core_del_port_cmd(priv->mdev, vxlan->udp_port);
-
- kfree(vxlan);
-}
+ goto out_unlock;

-static void mlx5e_vxlan_del_port(struct work_struct *work)
-{
- struct mlx5e_vxlan_work *vxlan_work =
- container_of(work, struct mlx5e_vxlan_work, work);
- struct mlx5e_priv *priv = vxlan_work->priv;
- u16 port = vxlan_work->port;
+ if (atomic_dec_and_test(&vxlan->refcount)) {
+ radix_tree_delete(&vxlan_db->tree, port);
+ remove = true;
+ }

- __mlx5e_vxlan_core_del_port(priv, port);
+out_unlock:
+ spin_unlock_bh(&vxlan_db->lock);

+ if (remove) {
+ mlx5e_vxlan_core_del_port_cmd(priv->mdev, port);
+ kfree(vxlan);
+ }
kfree(vxlan_work);
}

@@ -171,12 +176,11 @@ void mlx5e_vxlan_cleanup(struct mlx5e_pr
struct mlx5e_vxlan *vxlan;
unsigned int port = 0;

- spin_lock_bh(&vxlan_db->lock);
+ /* Lockless since we are the only radix-tree consumers, wq is disabled */
while (radix_tree_gang_lookup(&vxlan_db->tree, (void **)&vxlan, port, 1)) {
port = vxlan->udp_port;
- spin_unlock_bh(&vxlan_db->lock);
- __mlx5e_vxlan_core_del_port(priv, (u16)port);
- spin_lock_bh(&vxlan_db->lock);
+ radix_tree_delete(&vxlan_db->tree, port);
+ mlx5e_vxlan_core_del_port_cmd(priv->mdev, port);
+ kfree(vxlan);
}
- spin_unlock_bh(&vxlan_db->lock);
}
--- a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.h
@@ -36,6 +36,7 @@
#include "en.h"

struct mlx5e_vxlan {
+ atomic_t refcount;
u16 udp_port;
};



2018-01-01 14:35:55

by Greg Kroah-Hartman

Subject: [PATCH 4.9 36/75] net: ipv4: fix for a race condition in raw_sendmsg

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mohamed Ghannam <[email protected]>


[ Upstream commit 8f659a03a0ba9289b9aeb9b4470e6fb263d6f483 ]

inet->hdrincl is racy, and could lead to uninitialized stack pointer
usage, so its value should be read only once.
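
The effect of reading the flag once can be sketched in userspace C as
below. The race is modelled deterministically by a callback that flips
the bit between two uses; the demo_* names are hypothetical and nothing
here is the raw socket code itself.

```c
#include <assert.h>
#include <stddef.h>

struct demo_sock {
	unsigned int hdrincl:1;
};

typedef void (*demo_racer_t)(struct demo_sock *);

/* Stand-in for a concurrent setsockopt() changing the flag. */
static void demo_flip(struct demo_sock *sk)
{
	sk->hdrincl = !sk->hdrincl;
}

/* Re-reads the bit field at each use; returns 1 if both uses agree. */
static int demo_decide_rereading(struct demo_sock *sk, demo_racer_t racer)
{
	int first, second;

	assert(racer != NULL);
	first = sk->hdrincl;
	racer(sk);
	second = sk->hdrincl;
	return first == second;
}

/* Takes one snapshot up front; returns 1 if both uses agree. */
static int demo_decide_snapshot(struct demo_sock *sk, demo_racer_t racer)
{
	int hdrincl, first, second;

	assert(racer != NULL);
	hdrincl = sk->hdrincl;	/* read exactly once */
	first = hdrincl;
	racer(sk);
	second = hdrincl;
	return first == second;
}
```

With the flip in between, the re-reading variant takes two inconsistent
decisions, while the snapshot variant stays consistent for the whole call.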

Fixes: c008ba5bdc9f ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt")
Signed-off-by: Mohamed Ghannam <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/raw.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -502,11 +502,16 @@ static int raw_sendmsg(struct sock *sk,
int err;
struct ip_options_data opt_copy;
struct raw_frag_vec rfv;
+ int hdrincl;

err = -EMSGSIZE;
if (len > 0xFFFF)
goto out;

+ /* hdrincl should be READ_ONCE(inet->hdrincl)
+ * but READ_ONCE() doesn't work with bit fields
+ */
+ hdrincl = inet->hdrincl;
/*
* Check the flags.
*/
@@ -582,7 +587,7 @@ static int raw_sendmsg(struct sock *sk,
/* Linux does not mangle headers on raw sockets,
* so that IP options + IP_HDRINCL is non-sense.
*/
- if (inet->hdrincl)
+ if (hdrincl)
goto done;
if (ipc.opt->opt.srr) {
if (!daddr)
@@ -604,12 +609,12 @@ static int raw_sendmsg(struct sock *sk,

flowi4_init_output(&fl4, ipc.oif, sk->sk_mark, tos,
RT_SCOPE_UNIVERSE,
- inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+ hdrincl ? IPPROTO_RAW : sk->sk_protocol,
inet_sk_flowi_flags(sk) |
- (inet->hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
+ (hdrincl ? FLOWI_FLAG_KNOWN_NH : 0),
daddr, saddr, 0, 0);

- if (!inet->hdrincl) {
+ if (!hdrincl) {
rfv.msg = msg;
rfv.hlen = 0;

@@ -634,7 +639,7 @@ static int raw_sendmsg(struct sock *sk,
goto do_confirm;
back_from_confirm:

- if (inet->hdrincl)
+ if (hdrincl)
err = raw_send_hdrinc(sk, &fl4, msg, len,
&rt, msg->msg_flags, &ipc.sockc);



2018-01-01 14:36:01

by Greg Kroah-Hartman

Subject: [PATCH 4.9 53/75] s390/qeth: apply takeover changes when mode is toggled

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Julian Wiedmann <[email protected]>


[ Upstream commit 7fbd9493f0eeae8cef58300505a9ef5c8fce6313 ]

Just as for an explicit enable/disable, toggling the takeover mode also
requires that the IP addresses get updated. Otherwise all IPs that were
added to the table before the mode toggle get registered with the old
settings.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/s390/net/qeth_core.h | 2 +-
drivers/s390/net/qeth_core_main.c | 2 +-
drivers/s390/net/qeth_l3_sys.c | 35 +++++++++++++++++------------------
3 files changed, 19 insertions(+), 20 deletions(-)

--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -576,7 +576,7 @@ enum qeth_cq {
};

struct qeth_ipato {
- int enabled;
+ bool enabled;
int invert4;
int invert6;
struct list_head entries;
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -1475,7 +1475,7 @@ static int qeth_setup_card(struct qeth_c
qeth_set_intial_options(card);
/* IP address takeover */
INIT_LIST_HEAD(&card->ipato.entries);
- card->ipato.enabled = 0;
+ card->ipato.enabled = false;
card->ipato.invert4 = 0;
card->ipato.invert6 = 0;
/* init QDIO stuff */
--- a/drivers/s390/net/qeth_l3_sys.c
+++ b/drivers/s390/net/qeth_l3_sys.c
@@ -374,6 +374,7 @@ static ssize_t qeth_l3_dev_ipato_enable_
struct qeth_card *card = dev_get_drvdata(dev);
struct qeth_ipaddr *addr;
int i, rc = 0;
+ bool enable;

if (!card)
return -EINVAL;
@@ -386,25 +387,23 @@ static ssize_t qeth_l3_dev_ipato_enable_
}

if (sysfs_streq(buf, "toggle")) {
- card->ipato.enabled = (card->ipato.enabled)? 0 : 1;
- } else if (sysfs_streq(buf, "1")) {
- card->ipato.enabled = 1;
- hash_for_each(card->ip_htable, i, addr, hnode) {
- if ((addr->type == QETH_IP_TYPE_NORMAL) &&
- qeth_l3_is_addr_covered_by_ipato(card, addr))
- addr->set_flags |=
- QETH_IPA_SETIP_TAKEOVER_FLAG;
- }
- } else if (sysfs_streq(buf, "0")) {
- card->ipato.enabled = 0;
- hash_for_each(card->ip_htable, i, addr, hnode) {
- if (addr->set_flags &
- QETH_IPA_SETIP_TAKEOVER_FLAG)
- addr->set_flags &=
- ~QETH_IPA_SETIP_TAKEOVER_FLAG;
- }
- } else
+ enable = !card->ipato.enabled;
+ } else if (kstrtobool(buf, &enable)) {
rc = -EINVAL;
+ goto out;
+ }
+
+ if (card->ipato.enabled == enable)
+ goto out;
+ card->ipato.enabled = enable;
+
+ hash_for_each(card->ip_htable, i, addr, hnode) {
+ if (!enable)
+ addr->set_flags &= ~QETH_IPA_SETIP_TAKEOVER_FLAG;
+ else if (addr->type == QETH_IP_TYPE_NORMAL &&
+ qeth_l3_is_addr_covered_by_ipato(card, addr))
+ addr->set_flags |= QETH_IPA_SETIP_TAKEOVER_FLAG;
+ }
out:
mutex_unlock(&card->conf_mutex);
return rc ? rc : count;


2018-01-01 14:36:08

by Greg Kroah-Hartman

Subject: [PATCH 4.9 54/75] s390/qeth: dont apply takeover changes to RXIP

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Julian Wiedmann <[email protected]>


[ Upstream commit b22d73d6689fd902a66c08ebe71ab2f3b351e22f ]

When takeover is switched off, current code clears the 'TAKEOVER' flag on
all IPs. But the flag is also used for RXIP addresses, and those should
not be affected by the takeover mode.
Fix the behaviour by consistently applying takeover logic to NORMAL
addresses only.

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/s390/net/qeth_l3_main.c | 5 +++--
drivers/s390/net/qeth_l3_sys.c | 5 +++--
2 files changed, 6 insertions(+), 4 deletions(-)

--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -178,6 +178,8 @@ int qeth_l3_is_addr_covered_by_ipato(str

if (!card->ipato.enabled)
return 0;
+ if (addr->type != QETH_IP_TYPE_NORMAL)
+ return 0;

qeth_l3_convert_addr_to_bits((u8 *) &addr->u, addr_bits,
(addr->proto == QETH_PROT_IPV4)? 4:16);
@@ -293,8 +295,7 @@ int qeth_l3_add_ip(struct qeth_card *car
memcpy(addr, tmp_addr, sizeof(struct qeth_ipaddr));
addr->ref_counter = 1;

- if (addr->type == QETH_IP_TYPE_NORMAL &&
- qeth_l3_is_addr_covered_by_ipato(card, addr)) {
+ if (qeth_l3_is_addr_covered_by_ipato(card, addr)) {
QETH_CARD_TEXT(card, 2, "tkovaddr");
addr->set_flags |= QETH_IPA_SETIP_TAKEOVER_FLAG;
}
--- a/drivers/s390/net/qeth_l3_sys.c
+++ b/drivers/s390/net/qeth_l3_sys.c
@@ -398,10 +398,11 @@ static ssize_t qeth_l3_dev_ipato_enable_
card->ipato.enabled = enable;

hash_for_each(card->ip_htable, i, addr, hnode) {
+ if (addr->type != QETH_IP_TYPE_NORMAL)
+ continue;
if (!enable)
addr->set_flags &= ~QETH_IPA_SETIP_TAKEOVER_FLAG;
- else if (addr->type == QETH_IP_TYPE_NORMAL &&
- qeth_l3_is_addr_covered_by_ipato(card, addr))
+ else if (qeth_l3_is_addr_covered_by_ipato(card, addr))
addr->set_flags |= QETH_IPA_SETIP_TAKEOVER_FLAG;
}
out:


2018-01-01 14:36:14

by Greg Kroah-Hartman

Subject: [PATCH 4.9 56/75] s390/qeth: update takeover IPs after configuration change

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Julian Wiedmann <[email protected]>


[ Upstream commit 02f510f326501470348a5df341e8232c3497bbbb ]

Any modification to the takeover IP-ranges requires that we re-evaluate
which IP addresses are takeover-eligible. Otherwise we might do takeover
for some addresses when we no longer should, or vice-versa.
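
The re-evaluation can be sketched in userspace C as below. This is a
simplified IPv4-only model: the demo_* types, the flag value and the
prefix matching are hypothetical stand-ins, and only the rule itself
(recompute the flag for every NORMAL address, leave other types alone) is
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TAKEOVER_FLAG 0x2u	/* illustrative value */

enum demo_ip_type { DEMO_IP_NORMAL, DEMO_IP_RXIP };

struct demo_range {
	uint32_t prefix;	/* host byte order */
	unsigned int mask_bits;
};

struct demo_addr {
	enum demo_ip_type type;
	uint32_t ip;
	unsigned int set_flags;
};

struct demo_ipato {
	bool enabled;
	bool invert4;
	const struct demo_range *ranges;
	size_t n_ranges;
};

static uint32_t demo_mask(unsigned int bits)
{
	assert(bits <= 32);
	return bits ? (uint32_t)(UINT32_MAX << (32 - bits)) : 0;
}

static bool demo_covered(const struct demo_ipato *cfg,
			 const struct demo_addr *addr)
{
	bool hit = false;

	if (!cfg->enabled || addr->type != DEMO_IP_NORMAL)
		return false;

	for (size_t i = 0; i < cfg->n_ranges && !hit; i++) {
		uint32_t mask = demo_mask(cfg->ranges[i].mask_bits);

		hit = ((addr->ip ^ cfg->ranges[i].prefix) & mask) == 0;
	}

	return cfg->invert4 ? !hit : hit;
}

/* Recompute the takeover flag after any configuration change. */
static void demo_update_ipato(const struct demo_ipato *cfg,
			      struct demo_addr *addrs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (addrs[i].type != DEMO_IP_NORMAL)
			continue;
		if (demo_covered(cfg, &addrs[i]))
			addrs[i].set_flags |= DEMO_TAKEOVER_FLAG;
		else
			addrs[i].set_flags &= ~DEMO_TAKEOVER_FLAG;
	}
}
```

Calling demo_update_ipato() after every change to enabled, invert4 or the
range list keeps the flags in step with the configuration, which is the
invariant this patch establishes.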

Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/s390/net/qeth_core.h | 4 +-
drivers/s390/net/qeth_core_main.c | 4 +-
drivers/s390/net/qeth_l3.h | 2 -
drivers/s390/net/qeth_l3_main.c | 31 ++++++++++++++++--
drivers/s390/net/qeth_l3_sys.c | 63 ++++++++++++++++++++------------------
5 files changed, 67 insertions(+), 37 deletions(-)

--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -577,8 +577,8 @@ enum qeth_cq {

struct qeth_ipato {
bool enabled;
- int invert4;
- int invert6;
+ bool invert4;
+ bool invert6;
struct list_head entries;
};

--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -1476,8 +1476,8 @@ static int qeth_setup_card(struct qeth_c
/* IP address takeover */
INIT_LIST_HEAD(&card->ipato.entries);
card->ipato.enabled = false;
- card->ipato.invert4 = 0;
- card->ipato.invert6 = 0;
+ card->ipato.invert4 = false;
+ card->ipato.invert6 = false;
/* init QDIO stuff */
qeth_init_qdio_info(card);
INIT_DELAYED_WORK(&card->buffer_reclaim_work, qeth_buffer_reclaim_work);
--- a/drivers/s390/net/qeth_l3.h
+++ b/drivers/s390/net/qeth_l3.h
@@ -80,7 +80,7 @@ void qeth_l3_del_vipa(struct qeth_card *
int qeth_l3_add_rxip(struct qeth_card *, enum qeth_prot_versions, const u8 *);
void qeth_l3_del_rxip(struct qeth_card *card, enum qeth_prot_versions,
const u8 *);
-int qeth_l3_is_addr_covered_by_ipato(struct qeth_card *, struct qeth_ipaddr *);
+void qeth_l3_update_ipato(struct qeth_card *card);
struct qeth_ipaddr *qeth_l3_get_addr_buffer(enum qeth_prot_versions);
int qeth_l3_add_ip(struct qeth_card *, struct qeth_ipaddr *);
int qeth_l3_delete_ip(struct qeth_card *, struct qeth_ipaddr *);
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -168,8 +168,8 @@ static void qeth_l3_convert_addr_to_bits
}
}

-int qeth_l3_is_addr_covered_by_ipato(struct qeth_card *card,
- struct qeth_ipaddr *addr)
+static bool qeth_l3_is_addr_covered_by_ipato(struct qeth_card *card,
+ struct qeth_ipaddr *addr)
{
struct qeth_ipato_entry *ipatoe;
u8 addr_bits[128] = {0, };
@@ -608,6 +608,27 @@ int qeth_l3_setrouting_v6(struct qeth_ca
/*
* IP address takeover related functions
*/
+
+/**
+ * qeth_l3_update_ipato() - Update 'takeover' property, for all NORMAL IPs.
+ *
+ * Caller must hold ip_lock.
+ */
+void qeth_l3_update_ipato(struct qeth_card *card)
+{
+ struct qeth_ipaddr *addr;
+ unsigned int i;
+
+ hash_for_each(card->ip_htable, i, addr, hnode) {
+ if (addr->type != QETH_IP_TYPE_NORMAL)
+ continue;
+ if (qeth_l3_is_addr_covered_by_ipato(card, addr))
+ addr->set_flags |= QETH_IPA_SETIP_TAKEOVER_FLAG;
+ else
+ addr->set_flags &= ~QETH_IPA_SETIP_TAKEOVER_FLAG;
+ }
+}
+
static void qeth_l3_clear_ipato_list(struct qeth_card *card)
{
struct qeth_ipato_entry *ipatoe, *tmp;
@@ -619,6 +640,7 @@ static void qeth_l3_clear_ipato_list(str
kfree(ipatoe);
}

+ qeth_l3_update_ipato(card);
spin_unlock_bh(&card->ip_lock);
}

@@ -643,8 +665,10 @@ int qeth_l3_add_ipato_entry(struct qeth_
}
}

- if (!rc)
+ if (!rc) {
list_add_tail(&new->entry, &card->ipato.entries);
+ qeth_l3_update_ipato(card);
+ }

spin_unlock_bh(&card->ip_lock);

@@ -667,6 +691,7 @@ void qeth_l3_del_ipato_entry(struct qeth
(proto == QETH_PROT_IPV4)? 4:16) &&
(ipatoe->mask_bits == mask_bits)) {
list_del(&ipatoe->entry);
+ qeth_l3_update_ipato(card);
kfree(ipatoe);
}
}
--- a/drivers/s390/net/qeth_l3_sys.c
+++ b/drivers/s390/net/qeth_l3_sys.c
@@ -372,9 +372,8 @@ static ssize_t qeth_l3_dev_ipato_enable_
struct device_attribute *attr, const char *buf, size_t count)
{
struct qeth_card *card = dev_get_drvdata(dev);
- struct qeth_ipaddr *addr;
- int i, rc = 0;
bool enable;
+ int rc = 0;

if (!card)
return -EINVAL;
@@ -393,20 +392,12 @@ static ssize_t qeth_l3_dev_ipato_enable_
goto out;
}

- if (card->ipato.enabled == enable)
- goto out;
- card->ipato.enabled = enable;
-
- spin_lock_bh(&card->ip_lock);
- hash_for_each(card->ip_htable, i, addr, hnode) {
- if (addr->type != QETH_IP_TYPE_NORMAL)
- continue;
- if (!enable)
- addr->set_flags &= ~QETH_IPA_SETIP_TAKEOVER_FLAG;
- else if (qeth_l3_is_addr_covered_by_ipato(card, addr))
- addr->set_flags |= QETH_IPA_SETIP_TAKEOVER_FLAG;
+ if (card->ipato.enabled != enable) {
+ card->ipato.enabled = enable;
+ spin_lock_bh(&card->ip_lock);
+ qeth_l3_update_ipato(card);
+ spin_unlock_bh(&card->ip_lock);
}
- spin_unlock_bh(&card->ip_lock);
out:
mutex_unlock(&card->conf_mutex);
return rc ? rc : count;
@@ -432,20 +423,27 @@ static ssize_t qeth_l3_dev_ipato_invert4
const char *buf, size_t count)
{
struct qeth_card *card = dev_get_drvdata(dev);
+ bool invert;
int rc = 0;

if (!card)
return -EINVAL;

mutex_lock(&card->conf_mutex);
- if (sysfs_streq(buf, "toggle"))
- card->ipato.invert4 = (card->ipato.invert4)? 0 : 1;
- else if (sysfs_streq(buf, "1"))
- card->ipato.invert4 = 1;
- else if (sysfs_streq(buf, "0"))
- card->ipato.invert4 = 0;
- else
+ if (sysfs_streq(buf, "toggle")) {
+ invert = !card->ipato.invert4;
+ } else if (kstrtobool(buf, &invert)) {
rc = -EINVAL;
+ goto out;
+ }
+
+ if (card->ipato.invert4 != invert) {
+ card->ipato.invert4 = invert;
+ spin_lock_bh(&card->ip_lock);
+ qeth_l3_update_ipato(card);
+ spin_unlock_bh(&card->ip_lock);
+ }
+out:
mutex_unlock(&card->conf_mutex);
return rc ? rc : count;
}
@@ -611,20 +609,27 @@ static ssize_t qeth_l3_dev_ipato_invert6
struct device_attribute *attr, const char *buf, size_t count)
{
struct qeth_card *card = dev_get_drvdata(dev);
+ bool invert;
int rc = 0;

if (!card)
return -EINVAL;

mutex_lock(&card->conf_mutex);
- if (sysfs_streq(buf, "toggle"))
- card->ipato.invert6 = (card->ipato.invert6)? 0 : 1;
- else if (sysfs_streq(buf, "1"))
- card->ipato.invert6 = 1;
- else if (sysfs_streq(buf, "0"))
- card->ipato.invert6 = 0;
- else
+ if (sysfs_streq(buf, "toggle")) {
+ invert = !card->ipato.invert6;
+ } else if (kstrtobool(buf, &invert)) {
rc = -EINVAL;
+ goto out;
+ }
+
+ if (card->ipato.invert6 != invert) {
+ card->ipato.invert6 = invert;
+ spin_lock_bh(&card->ip_lock);
+ qeth_l3_update_ipato(card);
+ spin_unlock_bh(&card->ip_lock);
+ }
+out:
mutex_unlock(&card->conf_mutex);
return rc ? rc : count;
}


2018-01-01 14:36:22

by Greg Kroah-Hartman

Subject: [PATCH 4.9 59/75] usbip: stub: stop printing kernel pointer addresses in messages

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Shuah Khan <[email protected]>

commit 248a22044366f588d46754c54dfe29ffe4f8b4df upstream.

Remove and/or change debug, info, and error messages to not print
kernel pointer addresses.

Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/usbip/stub_main.c | 5 +++--
drivers/usb/usbip/stub_rx.c | 7 ++-----
drivers/usb/usbip/stub_tx.c | 6 +++---
3 files changed, 8 insertions(+), 10 deletions(-)

--- a/drivers/usb/usbip/stub_main.c
+++ b/drivers/usb/usbip/stub_main.c
@@ -252,11 +252,12 @@ void stub_device_cleanup_urbs(struct stu
struct stub_priv *priv;
struct urb *urb;

- dev_dbg(&sdev->udev->dev, "free sdev %p\n", sdev);
+ dev_dbg(&sdev->udev->dev, "Stub device cleaning up urbs\n");

while ((priv = stub_priv_pop(sdev))) {
urb = priv->urb;
- dev_dbg(&sdev->udev->dev, "free urb %p\n", urb);
+ dev_dbg(&sdev->udev->dev, "free urb seqnum %lu\n",
+ priv->seqnum);
usb_kill_urb(urb);

kmem_cache_free(stub_priv_cache, priv);
--- a/drivers/usb/usbip/stub_rx.c
+++ b/drivers/usb/usbip/stub_rx.c
@@ -225,9 +225,6 @@ static int stub_recv_cmd_unlink(struct s
if (priv->seqnum != pdu->u.cmd_unlink.seqnum)
continue;

- dev_info(&priv->urb->dev->dev, "unlink urb %p\n",
- priv->urb);
-
/*
* This matched urb is not completed yet (i.e., be in
* flight in usb hcd hardware/driver). Now we are
@@ -266,8 +263,8 @@ static int stub_recv_cmd_unlink(struct s
ret = usb_unlink_urb(priv->urb);
if (ret != -EINPROGRESS)
dev_err(&priv->urb->dev->dev,
- "failed to unlink a urb %p, ret %d\n",
- priv->urb, ret);
+ "failed to unlink a urb # %lu, ret %d\n",
+ priv->seqnum, ret);

return 0;
}
--- a/drivers/usb/usbip/stub_tx.c
+++ b/drivers/usb/usbip/stub_tx.c
@@ -102,7 +102,7 @@ void stub_complete(struct urb *urb)
/* link a urb to the queue of tx. */
spin_lock_irqsave(&sdev->priv_lock, flags);
if (sdev->ud.tcp_socket == NULL) {
- usbip_dbg_stub_tx("ignore urb for closed connection %p", urb);
+ usbip_dbg_stub_tx("ignore urb for closed connection\n");
/* It will be freed in stub_device_cleanup_urbs(). */
} else if (priv->unlinking) {
stub_enqueue_ret_unlink(sdev, priv->seqnum, urb->status);
@@ -204,8 +204,8 @@ static int stub_send_ret_submit(struct s

/* 1. setup usbip_header */
setup_ret_submit_pdu(&pdu_header, urb);
- usbip_dbg_stub_tx("setup txdata seqnum: %d urb: %p\n",
- pdu_header.base.seqnum, urb);
+ usbip_dbg_stub_tx("setup txdata seqnum: %d\n",
+ pdu_header.base.seqnum);
usbip_header_correct_endian(&pdu_header, 1);

iov[iovnum].iov_base = &pdu_header;


2018-01-01 14:36:25

by Greg Kroah-Hartman

Subject: [PATCH 4.9 60/75] usbip: vhci: stop printing kernel pointer addresses in messages

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Shuah Khan <[email protected]>

commit 8272d099d05f7ab2776cf56a2ab9f9443be18907 upstream.

Remove and/or change debug, info, and error messages to not print
kernel pointer addresses.

Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/usbip/vhci_hcd.c | 10 ----------
drivers/usb/usbip/vhci_rx.c | 23 +++++++++++------------
drivers/usb/usbip/vhci_tx.c | 3 ++-
3 files changed, 13 insertions(+), 23 deletions(-)

--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -506,9 +506,6 @@ static int vhci_urb_enqueue(struct usb_h
struct vhci_device *vdev;
unsigned long flags;

- usbip_dbg_vhci_hc("enter, usb_hcd %p urb %p mem_flags %d\n",
- hcd, urb, mem_flags);
-
if (portnum > VHCI_HC_PORTS) {
pr_err("invalid port number %d\n", portnum);
return -ENODEV;
@@ -671,8 +668,6 @@ static int vhci_urb_dequeue(struct usb_h
struct vhci_device *vdev;
unsigned long flags;

- pr_info("dequeue a urb %p\n", urb);
-
spin_lock_irqsave(&vhci->lock, flags);

priv = urb->hcpriv;
@@ -700,7 +695,6 @@ static int vhci_urb_dequeue(struct usb_h
/* tcp connection is closed */
spin_lock(&vdev->priv_lock);

- pr_info("device %p seems to be disconnected\n", vdev);
list_del(&priv->list);
kfree(priv);
urb->hcpriv = NULL;
@@ -712,8 +706,6 @@ static int vhci_urb_dequeue(struct usb_h
* vhci_rx will receive RET_UNLINK and give back the URB.
* Otherwise, we give back it here.
*/
- pr_info("gives back urb %p\n", urb);
-
usb_hcd_unlink_urb_from_ep(hcd, urb);

spin_unlock_irqrestore(&vhci->lock, flags);
@@ -741,8 +733,6 @@ static int vhci_urb_dequeue(struct usb_h

unlink->unlink_seqnum = priv->seqnum;

- pr_info("device %p seems to be still connected\n", vdev);
-
/* send cmd_unlink and try to cancel the pending URB in the
* peer */
list_add_tail(&unlink->list, &vdev->unlink_tx);
--- a/drivers/usb/usbip/vhci_rx.c
+++ b/drivers/usb/usbip/vhci_rx.c
@@ -37,24 +37,23 @@ struct urb *pickup_urb_and_free_priv(str
urb = priv->urb;
status = urb->status;

- usbip_dbg_vhci_rx("find urb %p vurb %p seqnum %u\n",
- urb, priv, seqnum);
+ usbip_dbg_vhci_rx("find urb seqnum %u\n", seqnum);

switch (status) {
case -ENOENT:
/* fall through */
case -ECONNRESET:
- dev_info(&urb->dev->dev,
- "urb %p was unlinked %ssynchronuously.\n", urb,
- status == -ENOENT ? "" : "a");
+ dev_dbg(&urb->dev->dev,
+ "urb seq# %u was unlinked %ssynchronuously\n",
+ seqnum, status == -ENOENT ? "" : "a");
break;
case -EINPROGRESS:
/* no info output */
break;
default:
- dev_info(&urb->dev->dev,
- "urb %p may be in a error, status %d\n", urb,
- status);
+ dev_dbg(&urb->dev->dev,
+ "urb seq# %u may be in a error, status %d\n",
+ seqnum, status);
}

list_del(&priv->list);
@@ -80,8 +79,8 @@ static void vhci_recv_ret_submit(struct
spin_unlock_irqrestore(&vdev->priv_lock, flags);

if (!urb) {
- pr_err("cannot find a urb of seqnum %u\n", pdu->base.seqnum);
- pr_info("max seqnum %d\n",
+ pr_err("cannot find a urb of seqnum %u max seqnum %d\n",
+ pdu->base.seqnum,
atomic_read(&vhci->seqnum));
usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
return;
@@ -104,7 +103,7 @@ static void vhci_recv_ret_submit(struct
if (usbip_dbg_flag_vhci_rx)
usbip_dump_urb(urb);

- usbip_dbg_vhci_rx("now giveback urb %p\n", urb);
+ usbip_dbg_vhci_rx("now giveback urb %u\n", pdu->base.seqnum);

spin_lock_irqsave(&vhci->lock, flags);
usb_hcd_unlink_urb_from_ep(vhci_to_hcd(vhci), urb);
@@ -170,7 +169,7 @@ static void vhci_recv_ret_unlink(struct
pr_info("the urb (seqnum %d) was already given back\n",
pdu->base.seqnum);
} else {
- usbip_dbg_vhci_rx("now giveback urb %p\n", urb);
+ usbip_dbg_vhci_rx("now giveback urb %d\n", pdu->base.seqnum);

/* If unlink is successful, status is -ECONNRESET */
urb->status = pdu->u.ret_unlink.status;
--- a/drivers/usb/usbip/vhci_tx.c
+++ b/drivers/usb/usbip/vhci_tx.c
@@ -83,7 +83,8 @@ static int vhci_send_cmd_submit(struct v
memset(&msg, 0, sizeof(msg));
memset(&iov, 0, sizeof(iov));

- usbip_dbg_vhci_tx("setup txdata urb %p\n", urb);
+ usbip_dbg_vhci_tx("setup txdata urb seqnum %lu\n",
+ priv->seqnum);

/* 1. setup usbip_header */
setup_cmd_submit_pdu(&pdu_header, urb);
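
For illustration only, and not part of the patch: a minimal userspace sketch of the pattern applied above, where a log line identifies a request by its sequence number rather than by a pointer value. The helper name format_txdata_msg and the use of snprintf are assumptions made for this example; the kernel code uses its own usbip_dbg_* macros.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build a debug message that identifies a request
 * by its sequence number. No pointer is formatted, so the message
 * cannot reveal a memory address.
 */
static int format_txdata_msg(char *buf, size_t len, unsigned long seqnum)
{
	return snprintf(buf, len, "setup txdata urb seqnum %lu", seqnum);
}
```

The sequence number is enough to correlate messages across the submit, unlink and giveback paths, which is what the pointer was being used for in the old messages.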


2018-01-01 14:36:29

by Greg Kroah-Hartman

Subject: [PATCH 4.9 61/75] USB: serial: ftdi_sio: add id for Airbus DS P8GR

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Max Schulze <[email protected]>

commit c6a36ad383559a60a249aa6016cebf3cb8b6c485 upstream.

Add AIRBUS_DS_P8GR device IDs to ftdi_sio driver.

Signed-off-by: Max Schulze <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/serial/ftdi_sio.c | 1 +
drivers/usb/serial/ftdi_sio_ids.h | 6 ++++++
2 files changed, 7 insertions(+)

--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -1017,6 +1017,7 @@ static const struct usb_device_id id_tab
.driver_info = (kernel_ulong_t)&ftdi_jtag_quirk },
{ USB_DEVICE(CYPRESS_VID, CYPRESS_WICED_BT_USB_PID) },
{ USB_DEVICE(CYPRESS_VID, CYPRESS_WICED_WL_USB_PID) },
+ { USB_DEVICE(AIRBUS_DS_VID, AIRBUS_DS_P8GR) },
{ } /* Terminating entry */
};

--- a/drivers/usb/serial/ftdi_sio_ids.h
+++ b/drivers/usb/serial/ftdi_sio_ids.h
@@ -914,6 +914,12 @@
#define ICPDAS_I7563U_PID 0x0105

/*
+ * Airbus Defence and Space
+ */
+#define AIRBUS_DS_VID 0x1e8e /* Vendor ID */
+#define AIRBUS_DS_P8GR 0x6001 /* Tetra P8GR */
+
+/*
* RT Systems programming cables for various ham radios
*/
#define RTSYSTEMS_VID 0x2100 /* Vendor ID */


2018-01-01 14:36:38

by Greg Kroah-Hartman

Subject: [PATCH 4.9 63/75] USB: serial: option: add support for Telit ME910 PID 0x1101

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniele Palmas <[email protected]>

commit 08933099e6404f588f81c2050bfec7313e06eeaf upstream.

This patch adds support for PID 0x1101 of Telit ME910.

Signed-off-by: Daniele Palmas <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/serial/option.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -283,6 +283,7 @@ static void option_instat_callback(struc
#define TELIT_PRODUCT_LE922_USBCFG3 0x1043
#define TELIT_PRODUCT_LE922_USBCFG5 0x1045
#define TELIT_PRODUCT_ME910 0x1100
+#define TELIT_PRODUCT_ME910_DUAL_MODEM 0x1101
#define TELIT_PRODUCT_LE920 0x1200
#define TELIT_PRODUCT_LE910 0x1201
#define TELIT_PRODUCT_LE910_USBCFG4 0x1206
@@ -648,6 +649,11 @@ static const struct option_blacklist_inf
.reserved = BIT(1) | BIT(3),
};

+static const struct option_blacklist_info telit_me910_dual_modem_blacklist = {
+ .sendsetup = BIT(0),
+ .reserved = BIT(3),
+};
+
static const struct option_blacklist_info telit_le910_blacklist = {
.sendsetup = BIT(0),
.reserved = BIT(1) | BIT(2),
@@ -1247,6 +1253,8 @@ static const struct usb_device_id option
.driver_info = (kernel_ulong_t)&telit_le922_blacklist_usbcfg0 },
{ USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910),
.driver_info = (kernel_ulong_t)&telit_me910_blacklist },
+ { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910_DUAL_MODEM),
+ .driver_info = (kernel_ulong_t)&telit_me910_dual_modem_blacklist },
{ USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE910),
.driver_info = (kernel_ulong_t)&telit_le910_blacklist },
{ USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE910_USBCFG4),


2018-01-01 14:36:44

by Greg Kroah-Hartman

Subject: [PATCH 4.9 30/75] net: reevalulate autoflowlabel setting after sysctl setting

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Shaohua Li <[email protected]>


[ Upstream commit 513674b5a2c9c7a67501506419da5c3c77ac6f08 ]

sysctl.ipv6.auto_flowlabels is 1 by default. On our hosts, we set it to 2.
If a sockopt doesn't set autoflowlabel, outgoing packets from the hosts are
supposed to not include a flowlabel. This is true for normal packets, but
not for reset packets.

The reason is that ipv6_pinfo.autoflowlabel is set at sock creation. If we
later change sysctl.ipv6.auto_flowlabels, ipv6_pinfo.autoflowlabel isn't
changed, so the sock keeps the old behavior in terms of auto
flowlabel. Reset packets suffer from this problem because they are
sent from a special control socket, which is created at boot
time. Since sysctl.ipv6.auto_flowlabels is 1 by default, the control
socket will always have its ipv6_pinfo.autoflowlabel set, even after the
user sets sysctl.ipv6.auto_flowlabels to 2, so reset packets will always
have a flowlabel. A normal sock created before the sysctl setting suffers
from the same issue. We can't even turn off autoflowlabel unless we kill all
socks on the hosts.

To fix this, if IPV6_AUTOFLOWLABEL sockopt is used, we use the
autoflowlabel setting from user, otherwise we always call
ip6_default_np_autolabel() which has the new settings of sysctl.

Note, this changes behavior a little bit. Before commit 42240901f7c4
(ipv6: Implement different admin modes for automatic flow labels), the
autoflowlabel behavior of a sock wasn't sticky, e.g., if the sysctl changed,
existing connections would change autoflowlabel behavior. After that
commit, autoflowlabel behavior is sticky for the whole life of the sock.
With this patch, the behavior is once again not sticky.

Cc: Martin KaFai Lau <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Tom Herbert <[email protected]>
Signed-off-by: Shaohua Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/ipv6.h | 3 ++-
net/ipv6/af_inet6.c | 1 -
net/ipv6/ip6_output.c | 12 ++++++++++--
net/ipv6/ipv6_sockglue.c | 1 +
4 files changed, 13 insertions(+), 4 deletions(-)

--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -246,7 +246,8 @@ struct ipv6_pinfo {
* 100: prefer care-of address
*/
dontfrag:1,
- autoflowlabel:1;
+ autoflowlabel:1,
+ autoflowlabel_set:1;
__u8 min_hopcount;
__u8 tclass;
__be32 rcv_flowinfo;
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -209,7 +209,6 @@ lookup_protocol:
np->mcast_hops = IPV6_DEFAULT_MCASTHOPS;
np->mc_loop = 1;
np->pmtudisc = IPV6_PMTUDISC_WANT;
- np->autoflowlabel = ip6_default_np_autolabel(sock_net(sk));
sk->sk_ipv6only = net->ipv6.sysctl.bindv6only;

/* Init the ipv4 part of the socket since we can have sockets
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -156,6 +156,14 @@ int ip6_output(struct net *net, struct s
!(IP6CB(skb)->flags & IP6SKB_REROUTED));
}

+static bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
+{
+ if (!np->autoflowlabel_set)
+ return ip6_default_np_autolabel(net);
+ else
+ return np->autoflowlabel;
+}
+
/*
* xmit an sk_buff (used by TCP, SCTP and DCCP)
* Note : socket lock is not held for SYNACK packets, but might be modified
@@ -219,7 +227,7 @@ int ip6_xmit(const struct sock *sk, stru
hlimit = ip6_dst_hoplimit(dst);

ip6_flow_hdr(hdr, tclass, ip6_make_flowlabel(net, skb, fl6->flowlabel,
- np->autoflowlabel, fl6));
+ ip6_autoflowlabel(net, np), fl6));

hdr->payload_len = htons(seg_len);
hdr->nexthdr = proto;
@@ -1691,7 +1699,7 @@ struct sk_buff *__ip6_make_skb(struct so

ip6_flow_hdr(hdr, v6_cork->tclass,
ip6_make_flowlabel(net, skb, fl6->flowlabel,
- np->autoflowlabel, fl6));
+ ip6_autoflowlabel(net, np), fl6));
hdr->hop_limit = v6_cork->hop_limit;
hdr->nexthdr = proto;
hdr->saddr = fl6->saddr;
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -874,6 +874,7 @@ pref_skip_coa:
break;
case IPV6_AUTOFLOWLABEL:
np->autoflowlabel = valbool;
+ np->autoflowlabel_set = 1;
retv = 0;
break;
}
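
For illustration only, and not part of the patch: a small userspace model of the lookup the patch introduces. The names np_model, model_setsockopt and model_autoflowlabel are hypothetical, and the sysctl_default parameter stands in for whatever ip6_default_np_autolabel() would return at the time a packet is sent.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-socket state touched by the patch. */
struct np_model {
	unsigned int autoflowlabel:1;     /* value chosen via setsockopt */
	unsigned int autoflowlabel_set:1; /* has setsockopt been used? */
};

/* Models IPV6_AUTOFLOWLABEL: remember the value and that it was set. */
static void model_setsockopt(struct np_model *np, bool on)
{
	np->autoflowlabel = on;
	np->autoflowlabel_set = 1;
}

/*
 * Models the decision made at transmit time: an explicit per-socket
 * choice wins, otherwise the current sysctl default is consulted.
 */
static bool model_autoflowlabel(const struct np_model *np, bool sysctl_default)
{
	if (!np->autoflowlabel_set)
		return sysctl_default;
	return np->autoflowlabel;
}
```

Because the default is evaluated on every call instead of being cached at socket creation, a socket that never used the sockopt follows later sysctl changes, which is the behavior the commit message describes for the boot-time control socket.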


2018-01-01 14:36:47

by Greg Kroah-Hartman

Subject: [PATCH 4.9 31/75] ptr_ring: add barriers

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Michael S. Tsirkin" <[email protected]>


[ Upstream commit a8ceb5dbfde1092b466936bca0ff3be127ecf38e ]

Users of ptr_ring expect that it's safe to give the
data structure a pointer and have it be available
to consumers, but that actually requires an smp_wmb
or a stronger barrier.

In the absence of such barriers, and on architectures that reorder writes,
a consumer might read an uninitialized value from an skb pointer stored
in the skb array. This was observed causing crashes.

To fix, add memory barriers. The barrier we use is a wmb, the
assumption being that producers do not need to read the value so we do
not need to order these reads.

Reported-by: George Cherian <[email protected]>
Suggested-by: Jason Wang <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/ptr_ring.h | 9 +++++++++
1 file changed, 9 insertions(+)

--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -99,12 +99,18 @@ static inline bool ptr_ring_full_bh(stru

/* Note: callers invoking this in a loop must use a compiler barrier,
* for example cpu_relax(). Callers must hold producer_lock.
+ * Callers are responsible for making sure pointer that is being queued
+ * points to a valid data.
*/
static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
{
if (unlikely(!r->size) || r->queue[r->producer])
return -ENOSPC;

+ /* Make sure the pointer we are storing points to a valid data. */
+ /* Pairs with smp_read_barrier_depends in __ptr_ring_consume. */
+ smp_wmb();
+
r->queue[r->producer++] = ptr;
if (unlikely(r->producer >= r->size))
r->producer = 0;
@@ -244,6 +250,9 @@ static inline void *__ptr_ring_consume(s
if (ptr)
__ptr_ring_discard_one(r);

+ /* Make sure anyone accessing data through the pointer is up to date. */
+ /* Pairs with smp_wmb in __ptr_ring_produce. */
+ smp_read_barrier_depends();
return ptr;
}
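
For illustration only, and not part of the patch: a userspace analogue of the publish/consume ordering, written with C11 atomics instead of kernel barriers. The names ring_model, ring_produce and ring_consume are hypothetical. The sketch assumes one producer and one consumer (the kernel code relies on producer_lock and consumer_lock instead), and it uses an acquire load on the consumer side, which is stronger than the dependency barrier used in the patch.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 4

/* Hypothetical single-producer, single-consumer ring of pointers. */
struct ring_model {
	_Atomic(void *) queue[RING_SIZE];
	unsigned int producer; /* touched only by the producer */
	unsigned int consumer; /* touched only by the consumer */
};

/* Returns 0 on success, -1 when the next slot is still occupied. */
static int ring_produce(struct ring_model *r, void *ptr)
{
	if (atomic_load_explicit(&r->queue[r->producer], memory_order_acquire))
		return -1;

	/* Release: earlier writes to *ptr are visible before ptr itself. */
	atomic_store_explicit(&r->queue[r->producer], ptr, memory_order_release);
	if (++r->producer >= RING_SIZE)
		r->producer = 0;
	return 0;
}

/* Returns the oldest queued pointer, or NULL when the ring is empty. */
static void *ring_consume(struct ring_model *r)
{
	/* Acquire: pairs with the release store in ring_produce(). */
	void *ptr = atomic_load_explicit(&r->queue[r->consumer],
					 memory_order_acquire);

	if (ptr) {
		/* Free the slot so the producer can reuse it. */
		atomic_store_explicit(&r->queue[r->consumer], NULL,
				      memory_order_release);
		if (++r->consumer >= RING_SIZE)
			r->consumer = 0;
	}
	return ptr;
}
```

Without the release/acquire pairing, a consumer on a weakly ordered machine could observe the pointer in the slot before the data it points to, which is the failure the commit message describes.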



2018-01-01 14:36:50

by Greg Kroah-Hartman

Subject: [PATCH 4.9 32/75] RDS: Check cmsg_len before dereferencing CMSG_DATA

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Avinash Repaka <[email protected]>


[ Upstream commit 14e138a86f6347c6199f610576d2e11c03bec5f0 ]

RDS currently doesn't check if the length of the control message is
large enough to hold the required data, before dereferencing the control
message data. This results in following crash:

BUG: KASAN: stack-out-of-bounds in rds_rdma_bytes net/rds/send.c:1013
[inline]
BUG: KASAN: stack-out-of-bounds in rds_sendmsg+0x1f02/0x1f90
net/rds/send.c:1066
Read of size 8 at addr ffff8801c928fb70 by task syzkaller455006/3157

CPU: 0 PID: 3157 Comm: syzkaller455006 Not tainted 4.15.0-rc3+ #161
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x25b/0x340 mm/kasan/report.c:409
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
rds_rdma_bytes net/rds/send.c:1013 [inline]
rds_sendmsg+0x1f02/0x1f90 net/rds/send.c:1066
sock_sendmsg_nosec net/socket.c:628 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:638
___sys_sendmsg+0x320/0x8b0 net/socket.c:2018
__sys_sendmmsg+0x1ee/0x620 net/socket.c:2108
SYSC_sendmmsg net/socket.c:2139 [inline]
SyS_sendmmsg+0x35/0x60 net/socket.c:2134
entry_SYSCALL_64_fastpath+0x1f/0x96
RIP: 0033:0x43fe49
RSP: 002b:00007fffbe244ad8 EFLAGS: 00000217 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fe49
RDX: 0000000000000001 RSI: 000000002020c000 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004017b0
R13: 0000000000401840 R14: 0000000000000000 R15: 0000000000000000

To fix this, we verify that the cmsg_len is large enough to hold the
data to be read, before proceeding further.

Reported-by: syzbot <[email protected]>
Signed-off-by: Avinash Repaka <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Reviewed-by: Yuval Shaia <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/rds/send.c | 3 +++
1 file changed, 3 insertions(+)

--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -1006,6 +1006,9 @@ static int rds_rdma_bytes(struct msghdr
continue;

if (cmsg->cmsg_type == RDS_CMSG_RDMA_ARGS) {
+ if (cmsg->cmsg_len <
+ CMSG_LEN(sizeof(struct rds_rdma_args)))
+ return -EINVAL;
args = CMSG_DATA(cmsg);
*rdma_bytes += args->remote_vec.bytes;
}
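
For illustration only, and not part of the patch: the same length check expressed in userspace with the standard CMSG_LEN and CMSG_DATA macros. The payload type demo_args and the function demo_read_args are hypothetical stand-ins for struct rds_rdma_args and the RDS parsing code.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical payload; stands in for struct rds_rdma_args. */
struct demo_args {
	uint64_t bytes;
};

/*
 * Read the payload of one control message, but only after confirming
 * that cmsg_len covers the header plus a full struct demo_args.
 */
static int demo_read_args(const struct cmsghdr *cmsg, uint64_t *bytes)
{
	struct demo_args args;

	if (cmsg->cmsg_len < CMSG_LEN(sizeof(struct demo_args)))
		return -EINVAL;

	/* memcpy avoids any alignment assumptions about the payload. */
	memcpy(&args, CMSG_DATA(cmsg), sizeof(args));
	*bytes = args.bytes;
	return 0;
}
```

The check has to come before CMSG_DATA is dereferenced, since cmsg_len is supplied by the caller and may describe a message shorter than the structure being read.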


2018-01-01 14:36:56

by Greg Kroah-Hartman

Subject: [PATCH 4.9 33/75] tcp_bbr: record "full bw reached" decision in new full_bw_reached bit

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Neal Cardwell <[email protected]>


[ Upstream commit c589e69b508d29ed8e644dfecda453f71c02ec27 ]

This commit records the "full bw reached" decision in a new
full_bw_reached bit. This is a pure refactor that does not change the
current behavior, but enables subsequent fixes and improvements.

In particular, this enables simple and clean fixes because the full_bw
and full_bw_cnt can be unconditionally zeroed without worrying about
forgetting that we estimated we filled the pipe in Startup. And it
enables future improvements because multiple code paths can be used
for estimating that we filled the pipe in Startup; any new code paths
only need to set this bit when they think the pipe is full.

Note that this fix intentionally reduces the width of the full_bw_cnt
counter, since we have never used the most significant bit.

Signed-off-by: Neal Cardwell <[email protected]>
Reviewed-by: Yuchung Cheng <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_bbr.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -81,7 +81,8 @@ struct bbr {
u32 lt_last_lost; /* LT intvl start: tp->lost */
u32 pacing_gain:10, /* current gain for setting pacing rate */
cwnd_gain:10, /* current gain for setting cwnd */
- full_bw_cnt:3, /* number of rounds without large bw gains */
+ full_bw_reached:1, /* reached full bw in Startup? */
+ full_bw_cnt:2, /* number of rounds without large bw gains */
cycle_idx:3, /* current index in pacing_gain cycle array */
has_seen_rtt:1, /* have we seen an RTT sample yet? */
unused_b:5;
@@ -151,7 +152,7 @@ static bool bbr_full_bw_reached(const st
{
const struct bbr *bbr = inet_csk_ca(sk);

- return bbr->full_bw_cnt >= bbr_full_bw_cnt;
+ return bbr->full_bw_reached;
}

/* Return the windowed max recent bandwidth sample, in pkts/uS << BW_SCALE. */
@@ -688,6 +689,7 @@ static void bbr_check_full_bw_reached(st
return;
}
++bbr->full_bw_cnt;
+ bbr->full_bw_reached = bbr->full_bw_cnt >= bbr_full_bw_cnt;
}

/* If pipe is probably full, drain the queue and then enter steady-state. */
@@ -821,6 +823,7 @@ static void bbr_init(struct sock *sk)
bbr->restore_cwnd = 0;
bbr->round_start = 0;
bbr->idle_restart = 0;
+ bbr->full_bw_reached = 0;
bbr->full_bw = 0;
bbr->full_bw_cnt = 0;
bbr->cycle_mstamp.v64 = 0;
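
For illustration only, and not part of the patch: a simplified model of why a separate latched bit helps. The names bbr_model and model_check_full_bw are hypothetical, the limit of 3 rounds is an assumed value for bbr_full_bw_cnt, and the 25% growth threshold is an assumed stand-in for the real bandwidth comparison.

```c
#include <stdbool.h>

#define FULL_BW_CNT 3 /* assumed value of bbr_full_bw_cnt */

/* Hypothetical subset of struct bbr. */
struct bbr_model {
	unsigned int full_bw;           /* best bandwidth seen so far */
	unsigned int full_bw_reached:1; /* latched "pipe is full" decision */
	unsigned int full_bw_cnt:2;     /* rounds without large bw gains */
};

/* Called once per round with the current bandwidth estimate. */
static void model_check_full_bw(struct bbr_model *b, unsigned int bw)
{
	if (b->full_bw_reached)
		return;

	/* Assumed rule: growth of at least 25% restarts the count. */
	if (bw >= b->full_bw + b->full_bw / 4) {
		b->full_bw = bw;
		b->full_bw_cnt = 0;
		return;
	}

	++b->full_bw_cnt;
	b->full_bw_reached = b->full_bw_cnt >= FULL_BW_CNT;
}
```

Once full_bw_reached is set, other code can zero full_bw and full_bw_cnt without losing the decision, which is the property the commit message relies on. A 2-bit counter is sufficient in this model because it never needs to count past 3.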


2018-01-01 14:37:02

by Greg Kroah-Hartman

Subject: [PATCH 4.9 25/75] ipv6: mcast: better catch silly mtu values

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit b9b312a7a451e9c098921856e7cfbc201120e1a7 ]

syzkaller reported crashes in IPv6 stack [1]

Xin Long found that lo MTU was set to silly values.

IPv6 stack reacts to changes to small MTU, by disabling itself under
RTNL.

But there is a window where threads not using RTNL can see a wrong
device mtu. This can lead to surprises, in mld code where it is assumed
the mtu is suitable.

Fix this by reading device mtu once and checking IPv6 minimal MTU.

[1]
skbuff: skb_over_panic: text:0000000010b86b8d len:196 put:20
head:000000003b477e60 data:000000000e85441e tail:0xd4 end:0xc0 dev:lo
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc2-mm1+ #39
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:skb_panic+0x15c/0x1f0 net/core/skbuff.c:100
RSP: 0018:ffff8801db307508 EFLAGS: 00010286
RAX: 0000000000000082 RBX: ffff8801c517e840 RCX: 0000000000000000
RDX: 0000000000000082 RSI: 1ffff1003b660e61 RDI: ffffed003b660e95
RBP: ffff8801db307570 R08: 1ffff1003b660e23 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff85bd4020
R13: ffffffff84754ed2 R14: 0000000000000014 R15: ffff8801c4e26540
FS: 0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000463610 CR3: 00000001c6698000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
skb_over_panic net/core/skbuff.c:109 [inline]
skb_put+0x181/0x1c0 net/core/skbuff.c:1694
add_grhead.isra.24+0x42/0x3b0 net/ipv6/mcast.c:1695
add_grec+0xa55/0x1060 net/ipv6/mcast.c:1817
mld_send_cr net/ipv6/mcast.c:1903 [inline]
mld_ifc_timer_expire+0x4d2/0x770 net/ipv6/mcast.c:2448
call_timer_fn+0x23b/0x840 kernel/time/timer.c:1320
expire_timers kernel/time/timer.c:1357 [inline]
__run_timers+0x7e1/0xb60 kernel/time/timer.c:1660
run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
__do_softirq+0x29d/0xbb2 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1d3/0x210 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:540 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:920

Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Tested-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/mcast.c | 25 +++++++++++++++----------
1 file changed, 15 insertions(+), 10 deletions(-)

--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1682,16 +1682,16 @@ static int grec_size(struct ifmcaddr6 *p
}

static struct sk_buff *add_grhead(struct sk_buff *skb, struct ifmcaddr6 *pmc,
- int type, struct mld2_grec **ppgr)
+ int type, struct mld2_grec **ppgr, unsigned int mtu)
{
- struct net_device *dev = pmc->idev->dev;
struct mld2_report *pmr;
struct mld2_grec *pgr;

- if (!skb)
- skb = mld_newpack(pmc->idev, dev->mtu);
- if (!skb)
- return NULL;
+ if (!skb) {
+ skb = mld_newpack(pmc->idev, mtu);
+ if (!skb)
+ return NULL;
+ }
pgr = (struct mld2_grec *)skb_put(skb, sizeof(struct mld2_grec));
pgr->grec_type = type;
pgr->grec_auxwords = 0;
@@ -1714,10 +1714,15 @@ static struct sk_buff *add_grec(struct s
struct mld2_grec *pgr = NULL;
struct ip6_sf_list *psf, *psf_next, *psf_prev, **psf_list;
int scount, stotal, first, isquery, truncate;
+ unsigned int mtu;

if (pmc->mca_flags & MAF_NOREPORT)
return skb;

+ mtu = READ_ONCE(dev->mtu);
+ if (mtu < IPV6_MIN_MTU)
+ return skb;
+
isquery = type == MLD2_MODE_IS_INCLUDE ||
type == MLD2_MODE_IS_EXCLUDE;
truncate = type == MLD2_MODE_IS_EXCLUDE ||
@@ -1738,7 +1743,7 @@ static struct sk_buff *add_grec(struct s
AVAILABLE(skb) < grec_size(pmc, type, gdeleted, sdeleted)) {
if (skb)
mld_sendpack(skb);
- skb = mld_newpack(idev, dev->mtu);
+ skb = mld_newpack(idev, mtu);
}
}
first = 1;
@@ -1774,12 +1779,12 @@ static struct sk_buff *add_grec(struct s
pgr->grec_nsrcs = htons(scount);
if (skb)
mld_sendpack(skb);
- skb = mld_newpack(idev, dev->mtu);
+ skb = mld_newpack(idev, mtu);
first = 1;
scount = 0;
}
if (first) {
- skb = add_grhead(skb, pmc, type, &pgr);
+ skb = add_grhead(skb, pmc, type, &pgr, mtu);
first = 0;
}
if (!skb)
@@ -1814,7 +1819,7 @@ empty_source:
mld_sendpack(skb);
skb = NULL; /* add_grhead will get a new one */
}
- skb = add_grhead(skb, pmc, type, &pgr);
+ skb = add_grhead(skb, pmc, type, &pgr, mtu);
}
}
if (pgr)
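
For illustration only, and not part of the patch: a userspace sketch of the "read once, then validate" pattern. The type demo_dev and the function demo_mtu_snapshot are hypothetical, and a relaxed C11 atomic load stands in for the kernel's READ_ONCE(). The constant 1280 is the minimum link MTU that IPv6 requires.

```c
#include <stdatomic.h>

#define DEMO_IPV6_MIN_MTU 1280 /* minimum link MTU required by IPv6 */

/* Hypothetical device whose MTU may be changed by another thread. */
struct demo_dev {
	_Atomic unsigned int mtu;
};

/*
 * Take a single snapshot of the MTU and validate it. Returns the
 * snapshot, or 0 when the value is too small for IPv6 and the caller
 * should give up instead of building a packet.
 */
static unsigned int demo_mtu_snapshot(struct demo_dev *dev)
{
	unsigned int mtu = atomic_load_explicit(&dev->mtu,
						memory_order_relaxed);

	if (mtu < DEMO_IPV6_MIN_MTU)
		return 0;
	return mtu;
}
```

Every later size calculation then uses the validated snapshot, so a concurrent change to the device MTU cannot make two steps of the same report disagree about the buffer size.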


2018-01-01 14:37:07

by Greg Kroah-Hartman

Subject: [PATCH 4.9 74/75] n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <[email protected]>

commit 966031f340185eddd05affcf72b740549f056348 upstream.

We added support for EXTPROC back in 2010 in commit 26df6d13406d ("tty:
Add EXTPROC support for LINEMODE") and the intent was to allow it to
override some (all?) ICANON behavior. Quoting from that original commit
message:

There is a new bit in the termios local flag word, EXTPROC.
When this bit is set, several aspects of the terminal driver
are disabled. Input line editing, character echo, and mapping
of signals are all disabled. This allows the telnetd to turn
off these functions when in linemode, but still keep track of
what state the user wants the terminal to be in.

but the problem turns out that "several aspects of the terminal driver
are disabled" is a bit ambiguous, and you can really confuse the n_tty
layer by setting EXTPROC and then causing some of the ICANON invariants
to no longer be maintained.

This fixes at least one such case (TIOCINQ) becoming unhappy because of
the confusion over whether ICANON really means ICANON when EXTPROC is set.

This basically makes TIOCINQ match the case of read: if EXTPROC is set,
we ignore ICANON. Also, make sure to reset the ICANON state if EXTPROC
changes, not just if ICANON changes.

Fixes: 26df6d13406d ("tty: Add EXTPROC support for LINEMODE")
Reported-by: Tetsuo Handa <[email protected]>
Reported-by: syzkaller <[email protected]>
Cc: Jiri Slaby <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/tty/n_tty.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1764,7 +1764,7 @@ static void n_tty_set_termios(struct tty
{
struct n_tty_data *ldata = tty->disc_data;

- if (!old || (old->c_lflag ^ tty->termios.c_lflag) & ICANON) {
+ if (!old || (old->c_lflag ^ tty->termios.c_lflag) & (ICANON | EXTPROC)) {
bitmap_zero(ldata->read_flags, N_TTY_BUF_SIZE);
ldata->line_start = ldata->read_tail;
if (!L_ICANON(tty) || !read_cnt(ldata)) {
@@ -2427,7 +2427,7 @@ static int n_tty_ioctl(struct tty_struct
return put_user(tty_chars_in_buffer(tty), (int __user *) arg);
case TIOCINQ:
down_write(&tty->termios_rwsem);
- if (L_ICANON(tty))
+ if (L_ICANON(tty) && !L_EXTPROC(tty))
retval = inq_canon(ldata);
else
retval = read_cnt(ldata);
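
For illustration only, and not part of the patch: the two conditions changed above, restated as standalone predicates. The flag values DEMO_ICANON and DEMO_EXTPROC are illustrative bits, not the real termios constants, and the function names are hypothetical.

```c
#include <stdbool.h>

/* Illustrative flag bits; not the real termios values. */
#define DEMO_ICANON  0x1u
#define DEMO_EXTPROC 0x2u

/* TIOCINQ counts canonical (line-buffered) input only in this case. */
static bool demo_use_canon_count(unsigned int lflag)
{
	return (lflag & DEMO_ICANON) && !(lflag & DEMO_EXTPROC);
}

/* Line state is reset when either flag differs between old and new. */
static bool demo_needs_reset(unsigned int old_lflag, unsigned int new_lflag)
{
	return ((old_lflag ^ new_lflag) & (DEMO_ICANON | DEMO_EXTPROC)) != 0;
}
```

The XOR in demo_needs_reset isolates the bits that changed, so toggling EXTPROC alone now triggers the same reset that toggling ICANON always did.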


2018-01-01 14:37:11

by Greg Kroah-Hartman

Subject: [PATCH 4.9 75/75] tty: fix tty_ldisc_receive_buf() documentation

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Johan Hovold <[email protected]>

commit e7e51dcf3b8a5f65c5653a054ad57eb2492a90d0 upstream.

The tty_ldisc_receive_buf() helper returns the number of bytes
processed so drop the bogus "not" from the kernel doc comment.

Fixes: 8d082cd300ab ("tty: Unify receive_buf() code paths")
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/tty/tty_buffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -446,7 +446,7 @@ EXPORT_SYMBOL_GPL(tty_prepare_flip_strin
* Callers other than flush_to_ldisc() need to exclude the kworker
* from concurrent use of the line discipline, see paste_selection().
*
- * Returns the number of bytes not processed
+ * Returns the number of bytes processed
*/
int tty_ldisc_receive_buf(struct tty_ldisc *ld, unsigned char *p,
char *f, int count)


2018-01-01 14:37:18

by Greg Kroah-Hartman

Subject: [PATCH 4.9 66/75] usb: add RESET_RESUME for ELSA MicroLink 56K

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Oliver Neukum <[email protected]>

commit b9096d9f15c142574ebebe8fbb137012bb9d99c2 upstream.

This modem needs this quirk to operate. It produces timeouts when
resumed without reset.

Signed-off-by: Oliver Neukum <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/core/quirks.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -155,6 +155,9 @@ static const struct usb_device_id usb_qu
/* Genesys Logic hub, internally used by KY-688 USB 3.1 Type-C Hub */
{ USB_DEVICE(0x05e3, 0x0612), .driver_info = USB_QUIRK_NO_LPM },

+ /* ELSA MicroLink 56K */
+ { USB_DEVICE(0x05cc, 0x2267), .driver_info = USB_QUIRK_RESET_RESUME },
+
/* Genesys Logic hub, internally used by Moshi USB to Ethernet Adapter */
{ USB_DEVICE(0x05e3, 0x0616), .driver_info = USB_QUIRK_NO_LPM },



2018-01-01 14:37:24

by Greg Kroah-Hartman

Subject: [PATCH 4.9 68/75] usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Thompson <[email protected]>

commit da99706689481717998d1d48edd389f339eea979 upstream.

When plugging in a USB webcam I see the following message:
xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs
XHCI_TRUST_TX_LENGTH quirk?
handle_tx_event: 913 callbacks suppressed

All is quiet again with this patch (and I've done a fair bit of soak
testing with the camera since).

Signed-off-by: Daniel Thompson <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/host/xhci-pci.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -190,6 +190,9 @@ static void xhci_pci_quirks(struct devic
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
+ pdev->device == 0x0014)
+ xhci->quirks |= XHCI_TRUST_TX_LENGTH;
+ if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
pdev->device == 0x0015)
xhci->quirks |= XHCI_RESET_ON_RESUME;
if (pdev->vendor == PCI_VENDOR_ID_VIA)


2018-01-01 14:37:30

by Greg Kroah-Hartman

Subject: [PATCH 4.9 70/75] timers: Invoke timer_start_debug() where it makes sense

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <[email protected]>

commit fd45bb77ad682be728d1002431d77b8c73342836 upstream.

The timer start debug function is called before the proper timer base is
set. As a consequence the trace data contains the stale CPU and flags
values.

Call the debug function after setting the new base and flags.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Sebastian Siewior <[email protected]>
Cc: [email protected]
Cc: Paul McKenney <[email protected]>
Cc: Anna-Maria Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/time/timer.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1019,8 +1019,6 @@ __mod_timer(struct timer_list *timer, un
if (!ret && pending_only)
goto out_unlock;

- debug_activate(timer, expires);
-
new_base = get_target_base(base, timer->flags);

if (base != new_base) {
@@ -1044,6 +1042,8 @@ __mod_timer(struct timer_list *timer, un
}
}

+ debug_activate(timer, expires);
+
timer->expires = expires;
/*
* If 'idx' was calculated above and the base time did not advance


2018-01-01 14:37:35

by Greg Kroah-Hartman

Subject: [PATCH 4.9 71/75] timers: Reinitialize per cpu bases on hotplug

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <[email protected]>

commit 26456f87aca7157c057de65c9414b37f1ab881d1 upstream.

The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with potentially stale clk and next_expiry values, which can cause
trouble when the CPU is plugged.

Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and resets the control flags to a known state.

Set base->must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Sebastian Siewior <[email protected]>
Cc: Anna-Maria Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/cpuhotplug.h | 2 +-
include/linux/timer.h | 4 +++-
kernel/cpu.c | 4 ++--
kernel/time/timer.c | 15 +++++++++++++++
4 files changed, 21 insertions(+), 4 deletions(-)

--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -48,7 +48,7 @@ enum cpuhp_state {
CPUHP_ARM_SHMOBILE_SCU_PREPARE,
CPUHP_SH_SH3X_PREPARE,
CPUHP_BLK_MQ_PREPARE,
- CPUHP_TIMERS_DEAD,
+ CPUHP_TIMERS_PREPARE,
CPUHP_NOTF_ERR_INJ_PREPARE,
CPUHP_MIPS_SOC_PREPARE,
CPUHP_BRINGUP_CPU,
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -274,9 +274,11 @@ unsigned long round_jiffies_up(unsigned
unsigned long round_jiffies_up_relative(unsigned long j);

#ifdef CONFIG_HOTPLUG_CPU
+int timers_prepare_cpu(unsigned int cpu);
int timers_dead_cpu(unsigned int cpu);
#else
-#define timers_dead_cpu NULL
+#define timers_prepare_cpu NULL
+#define timers_dead_cpu NULL
#endif

#endif
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1309,9 +1309,9 @@ static struct cpuhp_step cpuhp_bp_states
* before blk_mq_queue_reinit_notify() from notify_dead(),
* otherwise a RCU stall occurs.
*/
- [CPUHP_TIMERS_DEAD] = {
+ [CPUHP_TIMERS_PREPARE] = {
.name = "timers:dead",
- .startup.single = NULL,
+ .startup.single = timers_prepare_cpu,
.teardown.single = timers_dead_cpu,
},
/* Kicks the plugged cpu into life */
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1851,6 +1851,21 @@ static void migrate_timer_list(struct ti
}
}

+int timers_prepare_cpu(unsigned int cpu)
+{
+ struct timer_base *base;
+ int b;
+
+ for (b = 0; b < NR_BASES; b++) {
+ base = per_cpu_ptr(&timer_bases[b], cpu);
+ base->clk = jiffies;
+ base->next_expiry = base->clk + NEXT_TIMER_MAX_DELTA;
+ base->is_idle = false;
+ base->must_forward_clk = true;
+ }
+ return 0;
+}
+
int timers_dead_cpu(unsigned int cpu)
{
struct timer_base *old_base;


2018-01-01 14:37:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 72/75] nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <[email protected]>

commit 5d62c183f9e9df1deeea0906d099a94e8a43047a upstream.

The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
subsequently invokes tick_nohz_stop_sched_tick() are:

if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu))

If need_resched() is not set, but a timer softirq is pending then this is
an indication that the softirq code punted and delegated the execution to
ksoftirqd. need_resched() is not true because the current interrupted task
takes precedence over ksoftirqd.

Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
timer interrupts because the timer wheel contains an expired timer, but
softirqs are not yet executed. So get_next_timer_interrupt() returns an
immediate expiry request, which causes the timer to fire immediately
again. Lather, rinse and repeat....

Prevent that by adding a check for a pending timer soft interrupt to the
conditions in tick_nohz_stop_sched_tick() which avoid calling
get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
prevents a repetitive programming of an already expired timer.
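
The added condition can be sketched as a user-space model. Everything
here is an assumption made for illustration: the pending word is modelled
as a plain bitmask with one bit per softirq number, other_needs_cpu
stands in for the RCU, architecture and irq_work checks, and the function
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Softirq numbers are indices into the pending word, not masks. */
enum { HI_SOFTIRQ = 0, TIMER_SOFTIRQ = 1 };

/* True if the timer softirq bit is set in the modelled pending word. */
static bool timer_softirq_pending(uint32_t pending)
{
	return pending & (UINT32_C(1) << TIMER_SOFTIRQ);
}

/*
 * Model of the decision in tick_nohz_stop_sched_tick(): true means
 * "keep the periodic tick and do not ask the timer wheel for the next
 * expiry", which is what avoids re-arming an already expired timer.
 */
static bool keep_periodic_tick(bool other_needs_cpu, uint32_t pending)
{
	return other_needs_cpu || timer_softirq_pending(pending);
}
```

Note that the sketch derives the mask by shifting. The helper in the diff
below masks with TIMER_SOFTIRQ directly, which is worth checking against
the enum's value.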

Reported-by: Sebastian Siewior <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul McKenney <[email protected]>
Cc: Anna-Maria Gleixner <[email protected]>
Cc: Sebastian Siewior <[email protected]>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/time/tick-sched.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -663,6 +663,11 @@ static void tick_nohz_restart(struct tic
tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
}

+static inline bool local_timer_softirq_pending(void)
+{
+ return local_softirq_pending() & TIMER_SOFTIRQ;
+}
+
static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
ktime_t now, int cpu)
{
@@ -679,8 +684,18 @@ static ktime_t tick_nohz_stop_sched_tick
} while (read_seqretry(&jiffies_lock, seq));
ts->last_jiffies = basejiff;

- if (rcu_needs_cpu(basemono, &next_rcu) ||
- arch_needs_cpu() || irq_work_needs_cpu()) {
+ /*
+ * Keep the periodic tick, when RCU, architecture or irq_work
+ * requests it.
+ * Aside of that check whether the local timer softirq is
+ * pending. If so its a bad idea to call get_next_timer_interrupt()
+ * because there is an already expired timer, so it will request
+ * immeditate expiry, which rearms the hardware timer with a
+ * minimal delta which brings us back to this place
+ * immediately. Lather, rinse and repeat...
+ */
+ if (rcu_needs_cpu(basemono, &next_rcu) || arch_needs_cpu() ||
+ irq_work_needs_cpu() || local_timer_softirq_pending()) {
next_tick = basemono + TICK_NSEC;
} else {
/*


2018-01-01 14:37:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 73/75] x86/smpboot: Remove stale TLB flush invocations

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <[email protected]>

commit 322f8b8b340c824aef891342b0f5795d15e11562 upstream.

smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector()
invoke local_flush_tlb() for no obvious reason.

Digging in history revealed that the original code in the 2.1 era added
those because the code manipulated a swapper_pg_dir pagetable entry. The
pagetable manipulation was removed long ago in the 2.3 timeframe, but the
TLB flush invocations stayed around forever.

Remove them along with the pointless pr_debug()s which come from the same 2.1
change.

Reported-by: Dominik Brodowski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/smpboot.c | 9 ---------
1 file changed, 9 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -115,14 +115,10 @@ static inline void smpboot_setup_warm_re
spin_lock_irqsave(&rtc_lock, flags);
CMOS_WRITE(0xa, 0xf);
spin_unlock_irqrestore(&rtc_lock, flags);
- local_flush_tlb();
- pr_debug("1.\n");
*((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_HIGH)) =
start_eip >> 4;
- pr_debug("2.\n");
*((volatile unsigned short *)phys_to_virt(TRAMPOLINE_PHYS_LOW)) =
start_eip & 0xf;
- pr_debug("3.\n");
}

static inline void smpboot_restore_warm_reset_vector(void)
@@ -130,11 +126,6 @@ static inline void smpboot_restore_warm_
unsigned long flags;

/*
- * Install writable page 0 entry to set BIOS data area.
- */
- local_flush_tlb();
-
- /*
* Paranoid: Set warm reset code and vector here back
* to default values.
*/


2018-01-01 14:37:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 69/75] timers: Use deferrable base independent of base::nohz_active

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Anna-Maria Gleixner <[email protected]>

commit ced6d5c11d3e7b342f1a80f908e6756ebd4b8ddd upstream.

During boot and before base::nohz_active is set in the timer bases, deferrable
timers are enqueued into the standard timer base. This works correctly as
long as base::nohz_active is false.

Once base::nohz_active is set and a timer which was enqueued before that
is accessed, the lock selector code chooses the lock of the deferrable
base. This causes unlocked access to the standard base and, in case the
timer is removed, it does not clear the pending flag in the standard base
bitmap, which causes get_next_timer_interrupt() to return bogus values.

To prevent that, the deferrable timers must be enqueued in the deferrable
base, even when base::nohz_active is not set. Those deferrable timers also
need to be expired unconditionally.
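
The mismatch can be modelled in user space. This is a sketch that assumes
CONFIG_NO_HZ_COMMON is enabled; MODEL_TIMER_DEFERRABLE and both selector
functions are hypothetical names, not kernel code.

```c
#include <stdbool.h>

enum model_base { BASE_STD = 0, BASE_DEF = 1 };

#define MODEL_TIMER_DEFERRABLE 0x1u

/* Old selection: a deferrable timer only maps to BASE_DEF once nohz is
 * active, so the answer can change while the timer is still queued. */
static enum model_base pick_base_old(unsigned int tflags, bool nohz_active)
{
	if (nohz_active && (tflags & MODEL_TIMER_DEFERRABLE))
		return BASE_DEF;
	return BASE_STD;
}

/* New selection: depends on the timer flags alone, so enqueue and later
 * lookups always agree on the base (and therefore on the lock). */
static enum model_base pick_base_new(unsigned int tflags)
{
	return (tflags & MODEL_TIMER_DEFERRABLE) ? BASE_DEF : BASE_STD;
}
```

A deferrable timer enqueued during boot lands in BASE_STD under the old
selection, but a lookup after nohz_active flips returns BASE_DEF, so the
wrong base lock is taken.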

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Anna-Maria Gleixner <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sebastian Siewior <[email protected]>
Cc: [email protected]
Cc: Paul McKenney <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/time/timer.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -849,11 +849,10 @@ static inline struct timer_base *get_tim
struct timer_base *base = per_cpu_ptr(&timer_bases[BASE_STD], cpu);

/*
- * If the timer is deferrable and nohz is active then we need to use
- * the deferrable base.
+ * If the timer is deferrable and NO_HZ_COMMON is set then we need
+ * to use the deferrable base.
*/
- if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active &&
- (tflags & TIMER_DEFERRABLE))
+ if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && (tflags & TIMER_DEFERRABLE))
base = per_cpu_ptr(&timer_bases[BASE_DEF], cpu);
return base;
}
@@ -863,11 +862,10 @@ static inline struct timer_base *get_tim
struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);

/*
- * If the timer is deferrable and nohz is active then we need to use
- * the deferrable base.
+ * If the timer is deferrable and NO_HZ_COMMON is set then we need
+ * to use the deferrable base.
*/
- if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active &&
- (tflags & TIMER_DEFERRABLE))
+ if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && (tflags & TIMER_DEFERRABLE))
base = this_cpu_ptr(&timer_bases[BASE_DEF]);
return base;
}
@@ -1684,7 +1682,7 @@ static __latent_entropy void run_timer_s
base->must_forward_clk = false;

__run_timers(base);
- if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active)
+ if (IS_ENABLED(CONFIG_NO_HZ_COMMON))
__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));
}



2018-01-01 14:38:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 67/75] USB: Fix off by one in type-specific length check of BOS SSP capability

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mathias Nyman <[email protected]>

commit 07b9f12864d16c3a861aef4817eb1efccbc5d0e6 upstream.

USB 3.1 devices are not detected as 3.1 capable since 4.15-rc3 due to an
off-by-one in commit 81cf4a45360f ("USB: core: Add type-specific length
check of BOS descriptors").

It uses USB_DT_USB_SSP_CAP_SIZE() to get the SSP capability size, which
takes the zero-based SSAC as its argument, not the actual count of sublink
speed attributes.

USB3 spec 9.6.2.5 says "The number of Sublink Speed Attributes = SSAC + 1."
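
A worked example of the double increment, as a sketch. The two macros are
written here as I understand them from include/uapi/linux/usb/ch9.h and
should be treated as assumptions; the two check functions are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match include/uapi/linux/usb/ch9.h. */
#define USB_SSP_SUBLINK_SPEED_ATTRIBS	(0x1f << 0)
#define USB_DT_USB_SSP_CAP_SIZE(ssac)	(12 + (ssac + 1) * 4)

/* Check before the fix: SSAC is incremented here and again in the macro. */
static bool ssp_cap_accepted_old(uint32_t bm_attributes, unsigned int length)
{
	unsigned int ssac = (bm_attributes & USB_SSP_SUBLINK_SPEED_ATTRIBS) + 1;

	return length >= USB_DT_USB_SSP_CAP_SIZE(ssac);
}

/* Check after the fix: the macro receives the zero-based SSAC. */
static bool ssp_cap_accepted_new(uint32_t bm_attributes, unsigned int length)
{
	unsigned int ssac = bm_attributes & USB_SSP_SUBLINK_SPEED_ATTRIBS;

	return length >= USB_DT_USB_SSP_CAP_SIZE(ssac);
}
```

For SSAC = 1 the descriptor carries two attributes and is 12 + 2 * 4 = 20
bytes long. The old check demands 12 + 3 * 4 = 24 bytes and so rejects a
valid 20-byte descriptor.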

The type-specific length check patch was added to stable and needs to be
fixed there as well.

Fixes: 81cf4a45360f ("USB: core: Add type-specific length check of BOS descriptors")
CC: Masakazu Mokuno <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/core/config.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/usb/core/config.c
+++ b/drivers/usb/core/config.c
@@ -1002,7 +1002,7 @@ int usb_get_bos_descriptor(struct usb_de
case USB_SSP_CAP_TYPE:
ssp_cap = (struct usb_ssp_cap_descriptor *)buffer;
ssac = (le32_to_cpu(ssp_cap->bmAttributes) &
- USB_SSP_SUBLINK_SPEED_ATTRIBS) + 1;
+ USB_SSP_SUBLINK_SPEED_ATTRIBS);
if (length >= USB_DT_USB_SSP_CAP_SIZE(ssac))
dev->bos->ssp_cap = ssp_cap;
break;


2018-01-01 14:37:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 65/75] usb: Add device quirk for Logitech HD Pro Webcam C925e

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dmitry Fleytman <[email protected]>

commit 7f038d256c723dd390d2fca942919573995f4cfd upstream.

Commit e0429362ab15
("usb: Add device quirk for Logitech HD Pro Webcams C920 and C930e")
introduced a quirk to work around an issue with some Logitech webcams.

There is one more model that has the same issue, the C925e, so apply
the same quirk to it as well.

See the aforementioned commit message for a detailed explanation of the
problem.

Signed-off-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/core/quirks.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -57,10 +57,11 @@ static const struct usb_device_id usb_qu
/* Microsoft LifeCam-VX700 v2.0 */
{ USB_DEVICE(0x045e, 0x0770), .driver_info = USB_QUIRK_RESET_RESUME },

- /* Logitech HD Pro Webcams C920, C920-C and C930e */
+ /* Logitech HD Pro Webcams C920, C920-C, C925e and C930e */
{ USB_DEVICE(0x046d, 0x082d), .driver_info = USB_QUIRK_DELAY_INIT },
{ USB_DEVICE(0x046d, 0x0841), .driver_info = USB_QUIRK_DELAY_INIT },
{ USB_DEVICE(0x046d, 0x0843), .driver_info = USB_QUIRK_DELAY_INIT },
+ { USB_DEVICE(0x046d, 0x085b), .driver_info = USB_QUIRK_DELAY_INIT },

/* Logitech ConferenceCam CC3000e */
{ USB_DEVICE(0x046d, 0x0847), .driver_info = USB_QUIRK_DELAY_INIT },


2018-01-01 15:13:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 34/75] tcp md5sig: Use skbs saddr when replying to an incoming segment

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoph Paasch <[email protected]>


[ Upstream commit 30791ac41927ebd3e75486f9504b6d2280463bf0 ]

The MD5-key that belongs to a connection is identified by the peer's
IP-address. When we are in tcp_v4(6)_reqsk_send_ack(), we are replying
to an incoming segment from tcp_check_req() that failed the seq-number
checks.

Thus, to find the correct key, we need to use the skb's saddr and not
the daddr.
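
Why the source address is the right lookup key can be shown with a small
model. This is a sketch: the key table, struct model_ip_hdr and
model_md5_lookup() are hypothetical and only mimic a lookup keyed by the
peer's address.

```c
#include <stddef.h>
#include <stdint.h>

/* Keys are stored per peer address, as described above. */
struct model_md5_key {
	uint32_t peer_addr;
	const char *secret;
};

/* Addresses of an incoming segment: saddr is the peer, daddr is us. */
struct model_ip_hdr {
	uint32_t saddr;
	uint32_t daddr;
};

/* Linear lookup by peer address; returns NULL if no key matches. */
static const struct model_md5_key *
model_md5_lookup(const struct model_md5_key *keys, size_t n, uint32_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (keys[i].peer_addr == addr)
			return &keys[i];
	return NULL;
}
```

When replying to an incoming segment, looking up daddr searches for the
local address, which has no key, while saddr finds the peer's key.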

This bug seems to have been there for quite a while, but probably went
unnoticed because the consequences are not catastrophic. We call
tcp_v4_reqsk_send_ack() only to send a challenge ACK back to the peer,
so the connection doesn't really fail.

Fixes: 9501f9722922 ("tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().")
Signed-off-by: Christoph Paasch <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -828,7 +828,7 @@ static void tcp_v4_reqsk_send_ack(const
tcp_time_stamp,
req->ts_recent,
0,
- tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->daddr,
+ tcp_md5_do_lookup(sk, (union tcp_md5_addr *)&ip_hdr(skb)->saddr,
AF_INET),
inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0,
ip_hdr(skb)->tos);
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -962,7 +962,7 @@ static void tcp_v6_reqsk_send_ack(const
tcp_rsk(req)->rcv_nxt,
req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
tcp_time_stamp, req->ts_recent, sk->sk_bound_dev_if,
- tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr),
+ tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr),
0, 0);
}



2018-01-01 14:36:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 58/75] usbip: prevent leaking socket pointer address in messages

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Shuah Khan <[email protected]>

commit 90120d15f4c397272aaf41077960a157fc4212bf upstream.

The usbip driver leaks socket pointer addresses in messages. Remove
the messages that aren't useful and print sockfd in the ones that
are useful for debugging.

Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/usbip/stub_dev.c | 3 +--
drivers/usb/usbip/usbip_common.c | 14 ++++----------
drivers/usb/usbip/vhci_hcd.c | 2 +-
3 files changed, 6 insertions(+), 13 deletions(-)

--- a/drivers/usb/usbip/stub_dev.c
+++ b/drivers/usb/usbip/stub_dev.c
@@ -163,8 +163,7 @@ static void stub_shutdown_connection(str
* step 1?
*/
if (ud->tcp_socket) {
- dev_dbg(&sdev->udev->dev, "shutdown tcp_socket %p\n",
- ud->tcp_socket);
+ dev_dbg(&sdev->udev->dev, "shutdown sockfd\n");
kernel_sock_shutdown(ud->tcp_socket, SHUT_RDWR);
}

--- a/drivers/usb/usbip/usbip_common.c
+++ b/drivers/usb/usbip/usbip_common.c
@@ -335,13 +335,10 @@ int usbip_recv(struct socket *sock, void
char *bp = buf;
int osize = size;

- usbip_dbg_xmit("enter\n");
-
- if (!sock || !buf || !size) {
- pr_err("invalid arg, sock %p buff %p size %d\n", sock, buf,
- size);
+ if (!sock || !buf || !size)
return -EINVAL;
- }
+
+ usbip_dbg_xmit("enter\n");

do {
sock->sk->sk_allocation = GFP_NOIO;
@@ -354,11 +351,8 @@ int usbip_recv(struct socket *sock, void
msg.msg_flags = MSG_NOSIGNAL;

result = kernel_recvmsg(sock, &msg, &iov, 1, size, MSG_WAITALL);
- if (result <= 0) {
- pr_debug("receive sock %p buf %p size %u ret %d total %d\n",
- sock, buf, size, result, total);
+ if (result <= 0)
goto err;
- }

size -= result;
buf += result;
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -823,7 +823,7 @@ static void vhci_shutdown_connection(str

/* need this? see stub_dev.c */
if (ud->tcp_socket) {
- pr_debug("shutdown tcp_socket %p\n", ud->tcp_socket);
+ pr_debug("shutdown tcp_socket\n");
kernel_sock_shutdown(ud->tcp_socket, SHUT_RDWR);
}



2018-01-01 15:15:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 57/75] usbip: fix usbip bind writing random string after command in match_busid

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Juan Zea <[email protected]>

commit 544c4605acc5ae4afe7dd5914147947db182f2fb upstream.

usbip bind writes the command followed by random bytes when writing to
the match_busid attribute in sysfs, because it passes the full buffer
size instead of the string length.
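
The difference between the buffer size and the formatted length can be
sketched as follows. MODEL_BUS_ID_SIZE and build_command() are
hypothetical, and the truncation check is an addition of this sketch
rather than part of the patch.

```c
#include <stdio.h>
#include <string.h>

#define MODEL_BUS_ID_SIZE 32	/* assumed buffer size, for illustration */

/*
 * Format "add <busid>" or "del <busid>" into command and return the
 * number of bytes that should be written to the attribute, or -1 if
 * the command did not fit.
 */
static int build_command(char *command, size_t size, int add,
			 const char *busid)
{
	int len = snprintf(command, size, "%s %s", add ? "add" : "del",
			   busid);

	/* snprintf() returns the untruncated length, so compare to size. */
	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}
```

For the bus ID "1-1.2" the returned length is 9, whereas sizeof() on a
36-byte buffer would also write the 27 uninitialized bytes that follow
the string.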

Signed-off-by: Juan Zea <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
tools/usb/usbip/src/utils.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

--- a/tools/usb/usbip/src/utils.c
+++ b/tools/usb/usbip/src/utils.c
@@ -30,6 +30,7 @@ int modify_match_busid(char *busid, int
char command[SYSFS_BUS_ID_SIZE + 4];
char match_busid_attr_path[SYSFS_PATH_MAX];
int rc;
+ int cmd_size;

snprintf(match_busid_attr_path, sizeof(match_busid_attr_path),
"%s/%s/%s/%s/%s/%s", SYSFS_MNT_PATH, SYSFS_BUS_NAME,
@@ -37,12 +38,14 @@ int modify_match_busid(char *busid, int
attr_name);

if (add)
- snprintf(command, SYSFS_BUS_ID_SIZE + 4, "add %s", busid);
+ cmd_size = snprintf(command, SYSFS_BUS_ID_SIZE + 4, "add %s",
+ busid);
else
- snprintf(command, SYSFS_BUS_ID_SIZE + 4, "del %s", busid);
+ cmd_size = snprintf(command, SYSFS_BUS_ID_SIZE + 4, "del %s",
+ busid);

rc = write_sysfs_attribute(match_busid_attr_path, command,
- sizeof(command));
+ cmd_size);
if (rc < 0) {
dbg("failed to write match_busid: %s", strerror(errno));
return -1;


2018-01-01 15:15:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 55/75] s390/qeth: lock IP table while applying takeover changes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Julian Wiedmann <[email protected]>


[ Upstream commit 8a03a3692b100d84785ee7a834e9215e304c9e00 ]

Modifying the flags of an IP addr object needs to be protected against
e.g. concurrent removal of the same object from the IP table.

Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/s390/net/qeth_l3_sys.c | 2 ++
1 file changed, 2 insertions(+)

--- a/drivers/s390/net/qeth_l3_sys.c
+++ b/drivers/s390/net/qeth_l3_sys.c
@@ -397,6 +397,7 @@ static ssize_t qeth_l3_dev_ipato_enable_
goto out;
card->ipato.enabled = enable;

+ spin_lock_bh(&card->ip_lock);
hash_for_each(card->ip_htable, i, addr, hnode) {
if (addr->type != QETH_IP_TYPE_NORMAL)
continue;
@@ -405,6 +406,7 @@ static ssize_t qeth_l3_dev_ipato_enable_
else if (qeth_l3_is_addr_covered_by_ipato(card, addr))
addr->set_flags |= QETH_IPA_SETIP_TAKEOVER_FLAG;
}
+ spin_unlock_bh(&card->ip_lock);
out:
mutex_unlock(&card->conf_mutex);
return rc ? rc : count;


2018-01-01 14:36:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 28/75] netlink: Add netns check on taps

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kevin Cernekee <[email protected]>


[ Upstream commit 93c647643b48f0131f02e45da3bd367d80443291 ]

Currently, a nlmon link inside a child namespace can observe systemwide
netlink activity. Filter the traffic so that nlmon can only sniff
netlink messages from its own netns.
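
The filter amounts to an identity comparison of the two namespaces. The
following is a user-space sketch with hypothetical types and names; it
only mirrors the idea of comparing the tap device's namespace with the
sending socket's namespace.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct net. */
struct model_net {
	int id;
};

/* Namespaces are compared by identity, not by content. */
static bool model_net_eq(const struct model_net *a,
			 const struct model_net *b)
{
	return a == b;
}

/* True if a tap living in tap_net may see a message sent from
 * sender_net. */
static bool tap_should_deliver(const struct model_net *tap_net,
			       const struct model_net *sender_net)
{
	return model_net_eq(tap_net, sender_net);
}
```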

Test case:

vpnns -- bash -c "ip link add nlmon0 type nlmon; \
ip link set nlmon0 up; \
tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
spi 0x1 mode transport \
auth sha1 0x6162633132330000000000000000000000000000 \
enc aes 0x00000000000000000000000000000000
grep --binary abc123 /tmp/nlmon.pcap

Signed-off-by: Kevin Cernekee <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netlink/af_netlink.c | 3 +++
1 file changed, 3 insertions(+)

--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -261,6 +261,9 @@ static int __netlink_deliver_tap_skb(str
struct sock *sk = skb->sk;
int ret = -ENOMEM;

+ if (!net_eq(dev_net(dev), sock_net(sk)))
+ return 0;
+
dev_hold(dev);

if (is_vmalloc_addr(skb->head))


2018-01-01 15:16:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 52/75] net/mlx5: Fix error flow in CREATE_QP command

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Moni Shoua <[email protected]>


[ Upstream commit dbff26e44dc3ec4de6578733b054a0114652a764 ]

In the error flow, when the DESTROY_QP command should be executed, the
wrong mailbox was filled with data, not the one that is written to
hardware. Fix that.

Fixes: 09a7d9eca1a6 '{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc'
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/qp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
@@ -303,8 +303,8 @@ int mlx5_core_create_qp(struct mlx5_core
err_cmd:
memset(din, 0, sizeof(din));
memset(dout, 0, sizeof(dout));
- MLX5_SET(destroy_qp_in, in, opcode, MLX5_CMD_OP_DESTROY_QP);
- MLX5_SET(destroy_qp_in, in, qpn, qp->qpn);
+ MLX5_SET(destroy_qp_in, din, opcode, MLX5_CMD_OP_DESTROY_QP);
+ MLX5_SET(destroy_qp_in, din, qpn, qp->qpn);
mlx5_cmd_exec(dev, din, sizeof(din), dout, sizeof(dout));
return err;
}


2018-01-01 14:35:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 51/75] net/mlx5e: Prevent possible races in VXLAN control flow

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gal Pressman <[email protected]>


[ Upstream commit 0c1cc8b2215f5122ca614b5adca60346018758c3 ]

When calling add/remove VXLAN port, a lock must be held in order to
prevent race scenarios when more than one add/remove happens at the
same time.
Fix by holding our state_lock (mutex) as done by all other parts of the
driver.
Note that the spinlock protecting the radix-tree is still needed in
order to synchronize radix-tree access from softirq context.

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/vxlan.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vxlan.c
@@ -88,6 +88,7 @@ static void mlx5e_vxlan_add_port(struct
struct mlx5e_vxlan *vxlan;
int err;

+ mutex_lock(&priv->state_lock);
vxlan = mlx5e_vxlan_lookup_port(priv, port);
if (vxlan) {
atomic_inc(&vxlan->refcount);
@@ -117,6 +118,7 @@ err_free:
err_delete_port:
mlx5e_vxlan_core_del_port_cmd(priv->mdev, port);
free_work:
+ mutex_unlock(&priv->state_lock);
kfree(vxlan_work);
}

@@ -130,6 +132,7 @@ static void mlx5e_vxlan_del_port(struct
struct mlx5e_vxlan *vxlan;
bool remove = false;

+ mutex_lock(&priv->state_lock);
spin_lock_bh(&vxlan_db->lock);
vxlan = radix_tree_lookup(&vxlan_db->tree, port);
if (!vxlan)
@@ -147,6 +150,7 @@ out_unlock:
mlx5e_vxlan_core_del_port_cmd(priv->mdev, port);
kfree(vxlan);
}
+ mutex_unlock(&priv->state_lock);
kfree(vxlan_work);
}



2018-01-01 14:35:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 46/75] tcp: invalidate rate samples during SACK reneging

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Yousuk Seung <[email protected]>


[ Upstream commit d4761754b4fb2ef8d9a1e9d121c4bec84e1fe292 ]

Mark tcp_sock during a SACK reneging event and invalidate rate samples
while marked. Such rate samples may overestimate bw by including packets
that were SACKed before reneging.

< ack 6001 win 10000 sack 7001:38001
< ack 7001 win 0 sack 8001:38001 // Reneg detected
> seq 7001:8001 // RTO, SACK cleared.
< ack 38001 win 10000

In the above example, the rate sample taken after the last ACK will count
7001-38001 as delivered, while the actual delivery rate could likely
be much lower, i.e. 7001-8001.
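
The numbers in the trace above can be worked through in a small model.
This is a sketch only: struct model_rate_sample and model_rate_gen() are
hypothetical and merely mimic the "return an invalid sample" convention
of using -1 for both fields.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_rate_sample {
	int32_t delivered;	/* bytes counted as delivered, -1 if invalid */
	long interval_us;	/* sampling interval, -1 if invalid */
};

/* Produce a rate sample, or an invalid one if there is no timing
 * information or the connection is recovering from SACK reneging. */
static void model_rate_gen(struct model_rate_sample *rs, uint32_t delivered,
			   long interval_us, bool is_sack_reneg)
{
	if (interval_us <= 0 || is_sack_reneg) {
		rs->delivered = -1;
		rs->interval_us = -1;
		return;
	}
	rs->delivered = (int32_t)delivered;
	rs->interval_us = interval_us;
}
```

Counting 7001-38001 credits 31000 bytes to the sample, while only
7001-8001, i.e. 1000 bytes, were actually delivered in that interval, an
overestimate by a factor of 31.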

This patch adds a new field tcp_sock.is_sack_reneg and sets it when we
declare SACK reneging and enter TCP_CA_Loss, and clears it after
the last rate sample was taken before moving back to TCP_CA_Open. This
patch also invalidates rate samples taken while tcp_sock.is_sack_reneg
is set.

Fixes: b9f64820fb22 ("tcp: track data delivery rate for a TCP connection")
Signed-off-by: Yousuk Seung <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Acked-by: Priyaranjan Jha <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/tcp.h | 3 ++-
include/net/tcp.h | 2 +-
net/ipv4/tcp.c | 1 +
net/ipv4/tcp_input.c | 10 ++++++++--
net/ipv4/tcp_rate.c | 10 +++++++---
5 files changed, 19 insertions(+), 7 deletions(-)

--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -219,7 +219,8 @@ struct tcp_sock {
} rack;
u16 advmss; /* Advertised MSS */
u8 rate_app_limited:1, /* rate_{delivered,interval_us} limited? */
- unused:7;
+ is_sack_reneg:1, /* in recovery from loss with SACK reneg? */
+ unused:6;
u8 nonagle : 4,/* Disable Nagle algorithm? */
thin_lto : 1,/* Use linear timeouts for thin streams */
thin_dupack : 1,/* Fast retransmit on first dupack */
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1001,7 +1001,7 @@ void tcp_rate_skb_sent(struct sock *sk,
void tcp_rate_skb_delivered(struct sock *sk, struct sk_buff *skb,
struct rate_sample *rs);
void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
- struct skb_mstamp *now, struct rate_sample *rs);
+ bool is_sack_reneg, struct skb_mstamp *now, struct rate_sample *rs);
void tcp_rate_check_app_limited(struct sock *sk);

/* These functions determine how the current flow behaves in respect of SACK
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2297,6 +2297,7 @@ int tcp_disconnect(struct sock *sk, int
tp->snd_cwnd_cnt = 0;
tp->window_clamp = 0;
tcp_set_ca_state(sk, TCP_CA_Open);
+ tp->is_sack_reneg = 0;
tcp_clear_retrans(tp);
inet_csk_delack_init(sk);
/* Initialize rcv_mss to TCP_MIN_MSS to avoid division by 0
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1966,6 +1966,8 @@ void tcp_enter_loss(struct sock *sk)
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSACKRENEGING);
tp->sacked_out = 0;
tp->fackets_out = 0;
+ /* Mark SACK reneging until we recover from this loss event. */
+ tp->is_sack_reneg = 1;
}
tcp_clear_all_retrans_hints(tp);

@@ -2463,6 +2465,7 @@ static bool tcp_try_undo_recovery(struct
return true;
}
tcp_set_ca_state(sk, TCP_CA_Open);
+ tp->is_sack_reneg = 0;
return false;
}

@@ -2494,8 +2497,10 @@ static bool tcp_try_undo_loss(struct soc
NET_INC_STATS(sock_net(sk),
LINUX_MIB_TCPSPURIOUSRTOS);
inet_csk(sk)->icsk_retransmits = 0;
- if (frto_undo || tcp_is_sack(tp))
+ if (frto_undo || tcp_is_sack(tp)) {
tcp_set_ca_state(sk, TCP_CA_Open);
+ tp->is_sack_reneg = 0;
+ }
return true;
}
return false;
@@ -3589,6 +3594,7 @@ static int tcp_ack(struct sock *sk, cons
struct tcp_sacktag_state sack_state;
struct rate_sample rs = { .prior_delivered = 0 };
u32 prior_snd_una = tp->snd_una;
+ bool is_sack_reneg = tp->is_sack_reneg;
u32 ack_seq = TCP_SKB_CB(skb)->seq;
u32 ack = TCP_SKB_CB(skb)->ack_seq;
bool is_dupack = false;
@@ -3711,7 +3717,7 @@ static int tcp_ack(struct sock *sk, cons
tcp_schedule_loss_probe(sk);
delivered = tp->delivered - delivered; /* freshly ACKed or SACKed */
lost = tp->lost - lost; /* freshly marked lost */
- tcp_rate_gen(sk, delivered, lost, &now, &rs);
+ tcp_rate_gen(sk, delivered, lost, is_sack_reneg, &now, &rs);
tcp_cong_control(sk, ack, delivered, flag, &rs);
tcp_xmit_recovery(sk, rexmit);
return 1;
--- a/net/ipv4/tcp_rate.c
+++ b/net/ipv4/tcp_rate.c
@@ -106,7 +106,7 @@ void tcp_rate_skb_delivered(struct sock

/* Update the connection delivery information and generate a rate sample. */
void tcp_rate_gen(struct sock *sk, u32 delivered, u32 lost,
- struct skb_mstamp *now, struct rate_sample *rs)
+ bool is_sack_reneg, struct skb_mstamp *now, struct rate_sample *rs)
{
struct tcp_sock *tp = tcp_sk(sk);
u32 snd_us, ack_us;
@@ -124,8 +124,12 @@ void tcp_rate_gen(struct sock *sk, u32 d

rs->acked_sacked = delivered; /* freshly ACKed or SACKed */
rs->losses = lost; /* freshly marked lost */
- /* Return an invalid sample if no timing information is available. */
- if (!rs->prior_mstamp.v64) {
+ /* Return an invalid sample if no timing information is available or
+ * in recovery from loss with SACK reneging. Rate samples taken during
+ * a SACK reneging event may overestimate bw by including packets that
+ * were SACKed before the reneg.
+ */
+ if (!rs->prior_mstamp.v64 || is_sack_reneg) {
rs->delivered = -1;
rs->interval_us = -1;
return;


2018-01-01 15:17:57

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 45/75] sock: free skb in skb_complete_tx_timestamp on error

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <[email protected]>


[ Upstream commit 35b99dffc3f710cafceee6c8c6ac6a98eb2cb4bf ]

skb_complete_tx_timestamp must ingest the skb it is passed. Call
kfree_skb if the skb cannot be enqueued.
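
The ownership rule described above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel code: `struct buf`, `enqueue()`, `complete_timestamp()` and the two counters are hypothetical names invented for the example. It only shows the shape of the fix, where every path through the function either hands the buffer on or frees it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb; not a kernel type. */
struct buf {
	int payload;
};

static int queued;	/* buffers handed to the pretend error queue */
static int freed;	/* buffers released on the error path */

/* Pretend queue: it takes ownership and releases the buffer itself. */
static void enqueue(struct buf *b)
{
	queued++;
	free(b);
}

/*
 * Consumes @b on every path: it is either queued or freed, so the
 * caller must not touch it after the call.
 */
static void complete_timestamp(struct buf *b, bool may_timestamp)
{
	assert(b != NULL);

	if (!may_timestamp)
		goto err;	/* a bare "return" here would leak @b */

	enqueue(b);
	return;

err:
	free(b);
	freed++;
}
```

With this shape a caller can allocate a buffer, pass it in, and forget about it regardless of whether timestamping was permitted.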

Fixes: b245be1f4db1 ("net-timestamp: no-payload only sysctl")
Fixes: 9ac25fc06375 ("net: fix socket refcounting in skb_complete_tx_timestamp()")
Reported-by: Richard Cochran <[email protected]>
Signed-off-by: Willem de Bruijn <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/skbuff.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3823,7 +3823,7 @@ void skb_complete_tx_timestamp(struct sk
struct sock *sk = skb->sk;

if (!skb_may_tx_timestamp(sk, false))
- return;
+ goto err;

/* Take a reference to prevent skb_orphan() from freeing the socket,
* but only if the socket refcount is not zero.
@@ -3832,7 +3832,11 @@ void skb_complete_tx_timestamp(struct sk
*skb_hwtstamps(skb) = *hwtstamps;
__skb_complete_tx_timestamp(skb, sk, SCM_TSTAMP_SND);
sock_put(sk);
+ return;
}
+
+err:
+ kfree_skb(skb);
}
EXPORT_SYMBOL_GPL(skb_complete_tx_timestamp);



2018-01-01 14:35:13

by Greg Kroah-Hartman

Subject: [PATCH 4.9 41/75] net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nikolay Aleksandrov <[email protected]>


[ Upstream commit 84aeb437ab98a2bce3d4b2111c79723aedfceb33 ]

The early call to br_stp_change_bridge_id in the bridge's newlink can
cause a memory leak if an error occurs during newlink, because the fdb
entries are not cleaned up when a different lladdr was specified. A
minor related issue is that it generates fdb notifications with
ifindex = 0. An unrelated memory leak is the bridge sysfs entries,
which get added on the NETDEV_REGISTER event but are not cleaned up in
the newlink error path. To remove this special case, the call to
br_stp_change_bridge_id is now made after netdev registration, and the
bridge is cleaned up on changelink error via br_dev_delete to plug all
leaks.

This patch makes netlink bridge destruction on newlink error the same as
dellink and ioctl del which is necessary since at that point we have a
fully initialized bridge device.
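
The ordering argument can be sketched with a toy model. This is an illustration under stated assumptions, not the bridge code: `struct bridge`, its counters and the `bridge_*()` helpers are hypothetical, and the "changelink" step is reduced to an error code passed in by the caller. The point is only that once the device is registered, a later failure has to go through the full teardown.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device state; all names here are hypothetical. */
struct bridge {
	int registered;
	int fdb_entries;	/* created when an address is set */
	int sysfs_entries;	/* created as a side effect of registration */
};

static int bridge_register(struct bridge *b)
{
	b->registered = 1;
	b->sysfs_entries = 1;
	return 0;
}

static void bridge_set_address(struct bridge *b)
{
	assert(b->registered);	/* address is only set after registration */
	b->fdb_entries++;
}

/* Full teardown, the same path a normal delete would take. */
static void bridge_delete(struct bridge *b)
{
	b->fdb_entries = 0;
	b->sysfs_entries = 0;
	b->registered = 0;
}

static int bridge_newlink(struct bridge *b, bool has_addr, int changelink_err)
{
	int err = bridge_register(b);

	if (err)
		return err;
	if (has_addr)
		bridge_set_address(b);

	err = changelink_err;	/* stands in for the changelink step */
	if (err)
		bridge_delete(b);	/* releases fdb and sysfs state too */
	return err;
}
```

A partial undo that only cleared `registered` would leave the other two counters non-zero, which is the toy equivalent of the leaks listed above.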

To reproduce the issue:
$ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
RTNETLINK answers: Invalid argument

$ rmmod bridge
[ 1822.142525] =============================================================================
[ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
[ 1822.144821] -----------------------------------------------------------------------------

[ 1822.145990] Disabling lock debugging due to kernel taint
[ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100
[ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87
[ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 1822.150008] Call Trace:
[ 1822.150510] dump_stack+0x78/0xa9
[ 1822.151156] slab_err+0xb1/0xd3
[ 1822.151834] ? __kmalloc+0x1bb/0x1ce
[ 1822.152546] __kmem_cache_shutdown+0x151/0x28b
[ 1822.153395] shutdown_cache+0x13/0x144
[ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb
[ 1822.154669] SyS_delete_module+0x194/0x244
[ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a
[ 1822.156343] RIP: 0033:0x7f929bd38b17
[ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
[ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17
[ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0
[ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11
[ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80
[ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090
[ 1822.161278] INFO: Object 0x000000007645de29 @offset=0
[ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128

Fixes: 30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device")
Fixes: 5b8d5429daa0 ("bridge: netlink: register netdevice before executing changelink")
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/bridge/br_netlink.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1092,19 +1092,20 @@ static int br_dev_newlink(struct net *sr
struct net_bridge *br = netdev_priv(dev);
int err;

+ err = register_netdevice(dev);
+ if (err)
+ return err;
+
if (tb[IFLA_ADDRESS]) {
spin_lock_bh(&br->lock);
br_stp_change_bridge_id(br, nla_data(tb[IFLA_ADDRESS]));
spin_unlock_bh(&br->lock);
}

- err = register_netdevice(dev);
- if (err)
- return err;
-
err = br_changelink(dev, tb, data);
if (err)
- unregister_netdevice(dev);
+ br_dev_delete(dev, NULL);
+
return err;
}



2018-01-01 15:19:03

by Greg Kroah-Hartman

Subject: [PATCH 4.9 39/75] adding missing rcu_read_unlock in ipxip6_rcv

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Nikita V. Shirokov" <[email protected]>


[ Upstream commit 74c4b656c3d92ec4c824ea1a4afd726b7b6568c8 ]

Commit 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
introduced a new exit point in ipxip6_rcv. However, rcu_read_unlock is
missing there. This patch fixes that.

v1->v2:
Instead of calling rcu_read_unlock in place, jump to the "drop"
label (to prevent skb leakage).
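
The pattern behind the fix, a single exit label that both unlocks and drops, can be shown in a small userspace sketch. It is not the tunnel code: `lock_depth`, `dropped`, `read_lock()`, `read_unlock()` and `receive()` are hypothetical names, and the counters merely model the read-side lock nesting and the number of packets freed.

```c
#include <assert.h>
#include <stdbool.h>

static int lock_depth;	/* models read-side lock nesting */
static int dropped;	/* models packets freed on the drop path */

static void read_lock(void)
{
	lock_depth++;
}

static void read_unlock(void)
{
	assert(lock_depth > 0);	/* unlock must pair with a lock */
	lock_depth--;
}

/* Returns 0 once the packet has been consumed, one way or the other. */
static int receive(bool dst_alloc_ok)
{
	read_lock();

	if (!dst_alloc_ok)
		goto drop;	/* a bare "return 0" would skip the unlock */

	read_unlock();
	return 0;		/* packet handed on to the tunnel */

drop:
	read_unlock();
	dropped++;		/* packet freed here instead of leaked */
	return 0;
}
```

Routing the failure through the shared label keeps the lock balanced and also frees the packet, which an in-place unlock followed by `return 0` would not do.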

Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Signed-off-by: Nikita V. Shirokov <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ip6_tunnel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -911,7 +911,7 @@ static int ipxip6_rcv(struct sk_buff *sk
if (t->parms.collect_md) {
tun_dst = ipv6_tun_rx_dst(skb, 0, 0, 0);
if (!tun_dst)
- return 0;
+ goto drop;
}
ret = __ip6_tnl_rcv(t, skb, tpi, tun_dst, dscp_ecn_decapsulate,
log_ecn_error);


2018-01-01 14:34:23

by Greg Kroah-Hartman

Subject: [PATCH 4.9 06/75] iw_cxgb4: Only validate the MSN for successful completions

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steve Wise <[email protected]>

commit f55688c45442bc863f40ad678c638785b26cdce6 upstream.

If the RECV CQE is in error, ignore the MSN check. This was causing
recvs that were flushed into the sw cq to be completed with the wrong
status (BAD_MSN instead of FLUSHED).

Signed-off-by: Steve Wise <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/infiniband/hw/cxgb4/cq.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -575,10 +575,10 @@ static int poll_cq(struct t4_wq *wq, str
ret = -EAGAIN;
goto skip_cqe;
}
- if (unlikely((CQE_WRID_MSN(hw_cqe) != (wq->rq.msn)))) {
+ if (unlikely(!CQE_STATUS(hw_cqe) &&
+ CQE_WRID_MSN(hw_cqe) != wq->rq.msn)) {
t4_set_wq_in_error(wq);
- hw_cqe->header |= htonl(CQE_STATUS_V(T4_ERR_MSN));
- goto proc_cqe;
+ hw_cqe->header |= cpu_to_be32(CQE_STATUS_V(T4_ERR_MSN));
}
goto proc_cqe;
}


2018-01-01 14:34:19

by Greg Kroah-Hartman

Subject: [PATCH 4.9 05/75] ring-buffer: Mask out the info bits when returning buffer page length

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt (VMware) <[email protected]>

commit 45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 upstream.

Two info bits were added to the "commit" part of the ring buffer data page
when returned to be consumed. This was to inform the user space readers that
events have been missed, and that the count may be stored at the end of the
page.

What wasn't handled was the splice code that actually called a function to
return the length of the data in order to zero out the rest of the page
before sending it up to user space. These data bits were returned with the
length making the value negative, and that negative value was not checked.
It was compared to PAGE_SIZE, and only used if the size was less than
PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an
unsigned compare, meaning the negative size value did not end up causing a
large portion of memory to be randomly zeroed out.
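
The effect of the info bits on the length can be reproduced in a few lines of userspace C. This is a sketch under assumptions: the macro names are shortened, the bit positions 31 and 30 are assumed for the two flags, and `HDR_SIZE` and `PAGE_BYTES` are illustrative constants rather than the kernel's values.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed positions of the two info bits stored in "commit". */
#define MISSED_EVENTS	(1UL << 31)
#define MISSED_STORED	(1UL << 30)
#define MISSED_FLAGS	(MISSED_EVENTS | MISSED_STORED)

#define HDR_SIZE	16UL	/* assumed page header size */
#define PAGE_BYTES	4096UL	/* assumed page size */

/* Length of valid data in the page, with the info bits masked out. */
static size_t page_len(unsigned long commit)
{
	unsigned long len = commit & ~MISSED_FLAGS;

	assert(len + HDR_SIZE <= PAGE_BYTES);	/* sanity check for the sketch */
	return len + HDR_SIZE;
}

/* The same computation without the mask, for comparison. */
static unsigned long page_len_unmasked(unsigned long commit)
{
	return commit + HDR_SIZE;
}
```

With a flag set, the unmasked value is far larger than a page (or negative once stored in an `int`), while the masked value is the real data length.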

Fixes: 66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/trace/ring_buffer.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -280,6 +280,8 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data
/* Missed count stored at end */
#define RB_MISSED_STORED (1 << 30)

+#define RB_MISSED_FLAGS (RB_MISSED_EVENTS|RB_MISSED_STORED)
+
struct buffer_data_page {
u64 time_stamp; /* page time stamp */
local_t commit; /* write committed index */
@@ -331,7 +333,9 @@ static void rb_init_page(struct buffer_d
*/
size_t ring_buffer_page_len(void *page)
{
- return local_read(&((struct buffer_data_page *)page)->commit)
+ struct buffer_data_page *bpage = page;
+
+ return (local_read(&bpage->commit) & ~RB_MISSED_FLAGS)
+ BUF_PAGE_HDR_SIZE;
}



2018-01-01 15:20:51

by Greg Kroah-Hartman

Subject: [PATCH 4.9 03/75] tracing: Fix possible double free on failure of allocating trace buffer

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt (VMware) <[email protected]>

commit 4397f04575c44e1440ec2e49b6302785c95fd2f8 upstream.

Jing Xia and Chunyan Zhang reported that on failing to allocate part of the
tracing buffer, memory is freed, but the pointers that point to them are not
initialized back to NULL, and later paths may try to free the freed memory
again. Jing and Chunyan fixed one of the locations that does this, but
missed a spot.
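
The failure mode is easy to reproduce in miniature. The sketch below is userspace C with hypothetical names (`struct trace_buf`, `alloc_buf()`, `free_buf()`), and the `fail_data` parameter exists only to force the second allocation to fail. It shows why the pointer has to be reset once the memory behind it has been released.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical two-part buffer; not the kernel structure. */
struct trace_buf {
	void *buffer;
	void *data;
};

/* Allocates both parts; @fail_data forces the second step to fail. */
static int alloc_buf(struct trace_buf *b, int fail_data)
{
	b->buffer = malloc(64);
	if (!b->buffer)
		return -1;

	b->data = fail_data ? NULL : malloc(64);
	if (!b->data) {
		free(b->buffer);
		b->buffer = NULL;	/* later cleanup must not free it again */
		return -1;
	}
	return 0;
}

/* Generic cleanup that may run after a failed allocation. */
static void free_buf(struct trace_buf *b)
{
	free(b->buffer);	/* free(NULL) is a no-op */
	b->buffer = NULL;
	free(b->data);
	b->data = NULL;
}
```

Without the `b->buffer = NULL` line in the error branch, calling `free_buf()` after a failed `alloc_buf()` would free the same block twice.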

Link: http://lkml.kernel.org/r/[email protected]

Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code")
Reported-by: Jing Xia <[email protected]>
Reported-by: Chunyan Zhang <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/trace/trace.c | 1 +
1 file changed, 1 insertion(+)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6955,6 +6955,7 @@ allocate_trace_buffer(struct trace_array
buf->data = alloc_percpu(struct trace_array_cpu);
if (!buf->data) {
ring_buffer_free(buf->buffer);
+ buf->buffer = NULL;
return -ENOMEM;
}



2018-01-01 15:22:25

by Greg Kroah-Hartman

Subject: [PATCH 4.9 21/75] x86/mm: Enable CR4.PCIDE on supported systems

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit 660da7c9228f685b2ebe664f9fd69aaddcc420b5 upstream.

We can use PCID if the CPU has PCID and PGE and we're not on Xen.

By itself, this has no effect. A followup patch will start using PCID.
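
The decision described above reduces to a small truth table, sketched here in plain C. The enum and `pcid_decide()` are hypothetical names for illustration; the Xen case is modelled simply as "PCID not advertised", since the feature bit is masked there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch; not kernel identifiers. */
enum pcid_action {
	PCID_LEAVE_OFF,	/* PCID not advertised (or masked, as on Xen PV) */
	PCID_ENABLE,	/* turn the feature on */
	PCID_HIDE_CAP,	/* PCID without PGE: hide the capability instead */
};

static enum pcid_action pcid_decide(bool has_pcid, bool has_pge)
{
	if (!has_pcid)
		return PCID_LEAVE_OFF;
	return has_pge ? PCID_ENABLE : PCID_HIDE_CAP;
}
```

The third outcome exists only to stay safe on a misconfigured virtual machine that reports PCID without PGE.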

Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Nadav Amit <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/6327ecd907b32f79d5aa0d466f04503bbec5df88.1498751203.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/tlbflush.h | 8 ++++++++
arch/x86/kernel/cpu/common.c | 22 ++++++++++++++++++++++
arch/x86/xen/enlighten.c | 6 ++++++
3 files changed, 36 insertions(+)

--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -191,6 +191,14 @@ static inline void __flush_tlb_all(void)
__flush_tlb_global();
else
__flush_tlb();
+
+ /*
+ * Note: if we somehow had PCID but not PGE, then this wouldn't work --
+ * we'd end up flushing kernel translations for the current ASID but
+ * we might fail to flush kernel translations for other cached ASIDs.
+ *
+ * To avoid this issue, we force PCID off if PGE is off.
+ */
}

static inline void __flush_tlb_one(unsigned long addr)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -324,6 +324,25 @@ static __always_inline void setup_smap(s
}
}

+static void setup_pcid(struct cpuinfo_x86 *c)
+{
+ if (cpu_has(c, X86_FEATURE_PCID)) {
+ if (cpu_has(c, X86_FEATURE_PGE)) {
+ cr4_set_bits(X86_CR4_PCIDE);
+ } else {
+ /*
+ * flush_tlb_all(), as currently implemented, won't
+ * work if PCID is on but PGE is not. Since that
+ * combination doesn't exist on real hardware, there's
+ * no reason to try to fully support it, but it's
+ * polite to avoid corrupting data if we're on
+ * an improperly configured VM.
+ */
+ clear_cpu_cap(c, X86_FEATURE_PCID);
+ }
+ }
+}
+
/*
* Protection Keys are not available in 32-bit mode.
*/
@@ -1082,6 +1101,9 @@ static void identify_cpu(struct cpuinfo_
setup_smep(c);
setup_smap(c);

+ /* Set up PCID */
+ setup_pcid(c);
+
/*
* The vendor-specific functions might have changed features.
* Now we do "generic changes."
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -444,6 +444,12 @@ static void __init xen_init_cpuid_mask(v
~((1 << X86_FEATURE_MTRR) | /* disable MTRR */
(1 << X86_FEATURE_ACC)); /* thermal monitoring */

+ /*
+ * Xen PV would need some work to support PCID: CR3 handling as well
+ * as xen_flush_tlb_others() would need updating.
+ */
+ cpuid_leaf1_ecx_mask &= ~(1 << (X86_FEATURE_PCID % 32)); /* disable PCID */
+
if (!xen_initial_domain())
cpuid_leaf1_edx_mask &=
~((1 << X86_FEATURE_ACPI)); /* disable ACPI */


2018-01-01 15:22:48

by Greg Kroah-Hartman

Subject: [PATCH 4.9 02/75] tracing: Remove extra zeroing out of the ring buffer page

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt (VMware) <[email protected]>

commit 6b7e633fe9c24682df550e5311f47fb524701586 upstream.

The ring_buffer_read_page() takes care of zeroing out any extra data in the
page that it returns. There's no need to zero it out again from the
consumer. It was removed from one consumer of this function, but
read_buffers_splice_read() did not remove it, and worse, it contained a
nasty bug because of it.

Fixes: 2711ca237a084 ("ring-buffer: Move zeroing out excess in page to ring buffer code")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/trace/trace.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6181,7 +6181,7 @@ tracing_buffers_splice_read(struct file
.spd_release = buffer_spd_release,
};
struct buffer_ref *ref;
- int entries, size, i;
+ int entries, i;
ssize_t ret = 0;

#ifdef CONFIG_TRACER_MAX_TRACE
@@ -6232,14 +6232,6 @@ tracing_buffers_splice_read(struct file
break;
}

- /*
- * zero out any left over data, this is going to
- * user land.
- */
- size = ring_buffer_page_len(ref->page);
- if (size < PAGE_SIZE)
- memset(ref->page + size, 0, PAGE_SIZE - size);
-
page = virt_to_page(ref->page);

spd.pages[i] = page;


2018-01-01 14:33:32

by Greg Kroah-Hartman

Subject: [PATCH 4.9 16/75] x86/mm: Make flush_tlb_mm_range() more predictable

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit ce27374fabf553153c3f53efcaa9bfab9216bd8c upstream.

I'm about to rewrite the function almost completely, but first I
want to get a functional change out of the way. Currently, if
flush_tlb_mm_range() does not flush the local TLB at all, it will
never do individual page flushes on remote CPUs. This seems to be
an accident, and preserving it will be awkward. Let's change it
first so that any regressions in the rewrite will be easier to
bisect and so that the rewrite can attempt to change no visible
behavior at all.

The fix is simple: we can simply avoid short-circuiting the
calculation of base_pages_to_flush.

As a side effect, this also eliminates a potential corner case: if
tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
could have ended up flushing the entire address space one page at a
time.
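
The calculation that is now done up front can be sketched as a standalone function. This is an illustration, not the kernel routine: `pages_to_flush()` is a hypothetical name, `FLUSH_ALL` stands in for the all-ones sentinel, the hugetlb flag is reduced to a boolean, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT	12		/* assumed 4 KiB pages */
#define FLUSH_ALL	(~0UL)		/* sentinel: flush everything */

/*
 * Number of single-page flushes to issue, or FLUSH_ALL when a full
 * flush is the better choice. Computed once, before any decision
 * about which CPUs need flushing.
 */
static unsigned long pages_to_flush(unsigned long start, unsigned long end,
				    bool hugetlb, unsigned long ceiling)
{
	unsigned long n = FLUSH_ALL;

	if (end != FLUSH_ALL && !hugetlb) {
		assert(start <= end);
		n = (end - start) >> PAGE_SHIFT;
	}
	if (n > ceiling)
		n = FLUSH_ALL;
	return n;
}
```

Because the result no longer depends on whether the local TLB is flushed, local and remote CPUs are handed the same answer, and a ceiling equal to `FLUSH_ALL` can no longer turn a full flush into a page-by-page one.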

Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/mm/tlb.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -307,6 +307,12 @@ void flush_tlb_mm_range(struct mm_struct
unsigned long base_pages_to_flush = TLB_FLUSH_ALL;

preempt_disable();
+
+ if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
+ base_pages_to_flush = (end - start) >> PAGE_SHIFT;
+ if (base_pages_to_flush > tlb_single_page_flush_ceiling)
+ base_pages_to_flush = TLB_FLUSH_ALL;
+
if (current->active_mm != mm) {
/* Synchronize with switch_mm. */
smp_mb();
@@ -323,15 +329,11 @@ void flush_tlb_mm_range(struct mm_struct
goto out;
}

- if ((end != TLB_FLUSH_ALL) && !(vmflag & VM_HUGETLB))
- base_pages_to_flush = (end - start) >> PAGE_SHIFT;
-
/*
* Both branches below are implicit full barriers (MOV to CR or
* INVLPG) that synchronize with switch_mm.
*/
- if (base_pages_to_flush > tlb_single_page_flush_ceiling) {
- base_pages_to_flush = TLB_FLUSH_ALL;
+ if (base_pages_to_flush == TLB_FLUSH_ALL) {
count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
local_flush_tlb();
} else {


2018-01-01 15:23:51

by Greg Kroah-Hartman

Subject: [PATCH 4.9 13/75] ALSA: hda - fix headset mic detection issue on a Dell machine

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Hui Wang <[email protected]>

commit 285d5ddcffafa5d5e68c586f4c9eaa8b24a2897d upstream.

The machine has the ALC256 codec. Add its pin definition to the pin quirk
table so that ALC255_FIXUP_DELL1_MIC_NO_PRESENCE is applied to it.

Signed-off-by: Hui Wang <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/pci/hda/patch_realtek.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -5972,6 +5972,11 @@ static const struct snd_hda_pin_quirk al
{0x1b, 0x01011020},
{0x21, 0x02211010}),
SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
+ {0x12, 0x90a60130},
+ {0x14, 0x90170110},
+ {0x1b, 0x01011020},
+ {0x21, 0x0221101f}),
+ SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
{0x12, 0x90a60160},
{0x14, 0x90170120},
{0x21, 0x02211030}),


2018-01-01 15:24:09

by Greg Kroah-Hartman

Subject: [PATCH 4.9 12/75] ALSA: hda: Drop useless WARN_ON()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <[email protected]>

commit a36c2638380c0a4676647a1f553b70b20d3ebce1 upstream.

Since commit 97cc2ed27e5a ("ALSA: hda - Fix yet another i915
pointer leftover in error path") cleared the hdac_acomp pointer, the
WARN_ON() non-NULL check in snd_hdac_i915_register_notifier() may give
a false-positive warning, as the function gets called no matter
whether the component is registered or not. To fix it, let's get
rid of the spurious WARN_ON().

Fixes: 97cc2ed27e5a ("ALSA: hda - Fix yet another i915 pointer leftover in error path")
Reported-by: Kouta Okamoto <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/hda/hdac_i915.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/hda/hdac_i915.c
+++ b/sound/hda/hdac_i915.c
@@ -319,7 +319,7 @@ static int hdac_component_master_match(s
*/
int snd_hdac_i915_register_notifier(const struct i915_audio_component_audio_ops *aops)
{
- if (WARN_ON(!hdac_acomp))
+ if (!hdac_acomp)
return -ENODEV;

hdac_acomp->audio_ops = aops;


2018-01-01 20:38:27

by Naresh Kamboju

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On 1 January 2018 at 20:01, Greg Kroah-Hartman
<[email protected]> wrote:
> This is the start of the stable review cycle for the 4.9.74 release.
> There are 75 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.74-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm and x86_64.

Summary
------------------------------------------------------------------------

kernel: 4.9.74-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.9.y
git commit: b59b0bd326dce09081ffaa96d821f66b9dd4c8d6
git describe: v4.9.73-76-gb59b0bd326dc
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.73-76-gb59b0bd326dc


No regressions (compared to build v4.9.73)

Boards, architectures and test suites:
-------------------------------------

hi6220-hikey - arm64
* boot - pass: 20,
* kselftest - pass: 39, skip: 23
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 60,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 21, skip: 1
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 14,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 983, skip: 121
* ltp-timers-tests - pass: 12,

juno-r2 - arm64
* boot - pass: 20,
* kselftest - pass: 40, skip: 23
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 60,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 14,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 987, skip: 121
* ltp-timers-tests - pass: 12,

x15 - arm
* boot - pass: 20,
* kselftest - pass: 37, skip: 25
* libhugetlbfs - pass: 87, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 60,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 20, skip: 2
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 13, skip: 1
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1037, skip: 66
* ltp-timers-tests - pass: 12,

x86_64
* boot - pass: 20,
* kselftest - pass: 53, skip: 26
* libhugetlbfs - pass: 90, skip: 1
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 61, skip: 1
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 9, skip: 1
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1005, skip: 116
* ltp-timers-tests - pass: 12,

Documentation - https://collaborate.linaro.org/display/LKFT/Email+Reports
Tested-by: Naresh Kamboju <[email protected]>

2018-01-02 16:49:08

by Guenter Roeck

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On Mon, Jan 01, 2018 at 03:31:37PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.74 release.
> There are 75 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
> Anything received after that time might be too late.
>

Note: This is for v4.9.73-77-g79070be.

Build results:
total: 145 pass: 145 fail: 0
Qemu test results:
total: 126 pass: 126 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

2018-01-02 16:58:24

by Neal Cardwell

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On Mon, Jan 1, 2018 at 9:31 AM, Greg Kroah-Hartman
<[email protected]> wrote:
> This is the start of the stable review cycle for the 4.9.74 release.
> There are 75 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.74-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.

Hi Greg,

In looking at the 4.9 and 4.14 patches yesterday, I noticed there were
two TCP BBR fixes that made it into 4.14 but not 4.9. Doing an
inventory of the TCP BBR fixes, AFAICT we have:

c589e69b508d tcp_bbr: record "full bw reached" decision in new
full_bw_reached bit
- in 4.9 and 4.14 (great)

2f6c498e4f15 tcp_bbr: reset full pipe detection on loss recovery undo
- in 4.14 (but not 4.9)

600647d467c6 tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
- in 4.14 (but not 4.9)

Lacking the second and third patches in 4.9 will not cause any new
problems, but it will miss out on some nice fixes. If it's possible to
get 2f6c498e4f15 and 600647d467c6 either into 4.9.74 or 4.9.75, I
would be very grateful.

Thanks!
neal

2018-01-02 18:21:36

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On Tue, Jan 02, 2018 at 11:57:59AM -0500, Neal Cardwell wrote:
> On Mon, Jan 1, 2018 at 9:31 AM, Greg Kroah-Hartman
> <[email protected]> wrote:
> > This is the start of the stable review cycle for the 4.9.74 release.
> > There are 75 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.74-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> > and the diffstat can be found below.
>
> Hi Greg,
>
> In looking at the 4.9 and 4.14 patches yesterday, I noticed there were
> two TCP BBR fixes that made it into 4.14 but not 4.9. Doing an
> inventory of the TCP BBR fixes, AFAICT we have:
>
> c589e69b508d tcp_bbr: record "full bw reached" decision in new
> full_bw_reached bit
> - in 4.9 and 4.14 (great)
>
> 2f6c498e4f15 tcp_bbr: reset full pipe detection on loss recovery undo
> - in 4.14 (but not 4.9)
>
> 600647d467c6 tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
> - in 4.14 (but not 4.9)
>
> Lacking the second and third patches in 4.9 will not cause any new
> problems, but it will miss out on some nice fixes. If it's possible to
> get 2f6c498e4f15 and 600647d467c6 either into 4.9.74 or 4.9.75, I
> would be very grateful.

I go with the set of backported patches from DaveM, so I just assume he
didn't include these in the 4.9 set of patches for a good reason.

You can ask on netdev@ about this and cc: me to make it go a bit
faster. If I get an ACK from DaveM, I can queue them up directly.

thanks,

greg k-h

2018-01-02 18:21:59

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On Tue, Jan 02, 2018 at 08:49:05AM -0800, Guenter Roeck wrote:
> On Mon, Jan 01, 2018 at 03:31:37PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.9.74 release.
> > There are 75 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
> > Anything received after that time might be too late.
> >
>
> Note: This is for v4.9.73-77-g79070be.

Good! I snuck an i386 UP build fix in there :)

>
> Build results:
> total: 145 pass: 145 fail: 0
> Qemu test results:
> total: 126 pass: 126 fail: 0
>
> Details are available at http://kerneltests.org/builders.

Thanks for testing all of these and letting me know.

greg k-h

2018-01-02 18:32:04

by David Miller

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

From: Neal Cardwell <[email protected]>
Date: Tue, 2 Jan 2018 11:57:59 -0500

> On Mon, Jan 1, 2018 at 9:31 AM, Greg Kroah-Hartman
> <[email protected]> wrote:
>> This is the start of the stable review cycle for the 4.9.74 release.
>> There are 75 patches in this series, all will be posted as a response
>> to this one. If anyone has any issues with these being applied, please
>> let me know.
>>
>> Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
>> Anything received after that time might be too late.
>>
>> The whole patch series can be found in one patch at:
>> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.74-rc1.gz
>> or in the git tree and branch at:
>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
>> and the diffstat can be found below.
>
> Hi Greg,
>
> In looking at the 4.9 and 4.14 patches yesterday, I noticed there were
> two TCP BBR fixes that made it into 4.14 but not 4.9. Doing an
> inventory of the TCP BBR fixes, AFAICT we have:
>
> c589e69b508d tcp_bbr: record "full bw reached" decision in new
> full_bw_reached bit
> - in 4.9 and 4.14 (great)
>
> 2f6c498e4f15 tcp_bbr: reset full pipe detection on loss recovery undo
> - in 4.14 (but not 4.9)
>
> 600647d467c6 tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
> - in 4.14 (but not 4.9)
>
> Lacking the second and third patches in 4.9 will not cause any new
> problems, but it will miss out on some nice fixes. If it's possible to
> get 2f6c498e4f15 and 600647d467c6 either into 4.9.74 or 4.9.75, I
> would be very grateful.

These were not straightforward to backport, and I felt the risk outweighed
the gains.

If you want to do the backport yourself and you feel confident in it,
feel free.

2018-01-02 19:11:51

by Neal Cardwell

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On Tue, Jan 2, 2018 at 1:32 PM, David Miller <[email protected]> wrote:
> From: Neal Cardwell <[email protected]>
> Date: Tue, 2 Jan 2018 11:57:59 -0500
>
>> On Mon, Jan 1, 2018 at 9:31 AM, Greg Kroah-Hartman
>> <[email protected]> wrote:
>>> This is the start of the stable review cycle for the 4.9.74 release.
>>> There are 75 patches in this series, all will be posted as a response
>>> to this one. If anyone has any issues with these being applied, please
>>> let me know.
>>>
>>> Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
>>> Anything received after that time might be too late.
>>>
>>> The whole patch series can be found in one patch at:
>>> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.74-rc1.gz
>>> or in the git tree and branch at:
>>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
>>> and the diffstat can be found below.
>>
>> Hi Greg,
>>
>> In looking at the 4.9 and 4.14 patches yesterday, I noticed there were
>> two TCP BBR fixes that made it into 4.14 but not 4.9. Doing an
>> inventory of the TCP BBR fixes, AFAICT we have:
>>
>> c589e69b508d tcp_bbr: record "full bw reached" decision in new
>> full_bw_reached bit
>> - in 4.9 and 4.14 (great)
>>
>> 2f6c498e4f15 tcp_bbr: reset full pipe detection on loss recovery undo
>> - in 4.14 (but not 4.9)
>>
>> 600647d467c6 tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
>> - in 4.14 (but not 4.9)
>>
>> Lacking the second and third patches in 4.9 will not cause any new
>> problems, but it will miss out on some nice fixes. If it's possible to
>> get 2f6c498e4f15 and 600647d467c6 either into 4.9.74 or 4.9.75, I
>> would be very grateful.
>
> These were not straight-forward to backport and I felt the risk outweighed
> the gains.
>
> If you want to do the backport yourself and you feel confident in it,
> feel free.

Thanks, Greg and David. Looks like these 2 patches will cherry-pick
cleanly if cherry-picked in the following sequence, on top of
4.9.74-rc1, which already has 6c9e73ef9aa7 ("tcp_bbr: record "full bw
reached" decision in new full_bw_reached bit"):

$ git checkout linux-stable-rc/linux-4.9.y

$ git cherry-pick 2f6c498e4f15
Performing inexact rename detection: 100% (17803152/17803152), done.
[detached HEAD 0982234c57e1] tcp_bbr: reset full pipe detection on
loss recovery undo
Date: Thu Dec 7 12:43:31 2017 -0500
1 file changed, 4 insertions(+)

$ git cherry-pick 600647d467c6
Performing inexact rename detection: 100% (17803152/17803152), done.
[detached HEAD 7e866eccd083] tcp_bbr: reset long-term bandwidth
sampling on loss recovery undo
Date: Thu Dec 7 12:43:32 2017 -0500
1 file changed, 1 insertion(+)

$ git log --oneline --decorate | head -3
7e866eccd083 (HEAD) tcp_bbr: reset long-term bandwidth sampling on
loss recovery undo
0982234c57e1 tcp_bbr: reset full pipe detection on loss recovery undo
79070be7f1ae (linux-stable-rc/linux-4.9.y) Linux 4.9.74-rc1

I verified that this compiles without warnings, and boots, and BBR works.

Shall I prepare another version of these 2 patches, or do we think
this recipe will be sufficient? (Sorry I am not more familiar with the
backport-to-stable process.)

Thanks!
neal

2018-01-02 19:12:54

by David Miller

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

From: Neal Cardwell <[email protected]>
Date: Tue, 2 Jan 2018 14:11:25 -0500

> Looks like these 2 patches will cherry-pick cleanly if cherry-picked
> in the following sequence, on top of 4.9.74-rc1, which already has
> 6c9e73ef9aa7 ("tcp_bbr: record "full bw reached" decision in new
> full_bw_reached bit"):
>
> $ git checkout linux-stable-rc/linux-4.9.y
>
> $ git cherry-pick 2f6c498e4f15
> Performing inexact rename detection: 100% (17803152/17803152), done.
> [detached HEAD 0982234c57e1] tcp_bbr: reset full pipe detection on
> loss recovery undo
> Date: Thu Dec 7 12:43:31 2017 -0500
> 1 file changed, 4 insertions(+)
>
> $ git cherry-pick 600647d467c6
> Performing inexact rename detection: 100% (17803152/17803152), done.
> [detached HEAD 7e866eccd083] tcp_bbr: reset long-term bandwidth
> sampling on loss recovery undo
> Date: Thu Dec 7 12:43:32 2017 -0500
> 1 file changed, 1 insertion(+)
>
> $ git log --oneline --decorate | head -3
> 7e866eccd083 (HEAD) tcp_bbr: reset long-term bandwidth sampling on
> loss recovery undo
> 0982234c57e1 tcp_bbr: reset full pipe detection on loss recovery undo
> 79070be7f1ae (linux-stable-rc/linux-4.9.y) Linux 4.9.74-rc1
>
> I verified that this compiles without warnings, and boots, and BBR works.
>
> Shall I prepare another version of these 2 patches, or do we think
> this recipe will be sufficient? (Sorry I am not more familiar with the
> backport-to-stable process.)

If this works and Greg is OK with it, I am fine with it too.

2018-01-02 20:08:32

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On Tue, Jan 02, 2018 at 02:11:25PM -0500, Neal Cardwell wrote:
> On Tue, Jan 2, 2018 at 1:32 PM, David Miller <[email protected]> wrote:
> > From: Neal Cardwell <[email protected]>
> > Date: Tue, 2 Jan 2018 11:57:59 -0500
> >
> >> On Mon, Jan 1, 2018 at 9:31 AM, Greg Kroah-Hartman
> >> <[email protected]> wrote:
> >>> This is the start of the stable review cycle for the 4.9.74 release.
> >>> There are 75 patches in this series, all will be posted as a response
> >>> to this one. If anyone has any issues with these being applied, please
> >>> let me know.
> >>>
> >>> Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
> >>> Anything received after that time might be too late.
> >>>
> >>> The whole patch series can be found in one patch at:
> >>> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.74-rc1.gz
> >>> or in the git tree and branch at:
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> >>> and the diffstat can be found below.
> >>
> >> Hi Greg,
> >>
> >> In looking at the 4.9 and 4.14 patches yesterday, I noticed there were
> >> two TCP BBR fixes that made it into 4.14 but not 4.9. Doing an
> >> inventory of the TCP BBR fixes, AFAICT we have:
> >>
> >> c589e69b508d tcp_bbr: record "full bw reached" decision in new
> >> full_bw_reached bit
> >> - in 4.9 and 4.14 (great)
> >>
> >> 2f6c498e4f15 tcp_bbr: reset full pipe detection on loss recovery undo
> >> - in 4.14 (but not 4.9)
> >>
> >> 600647d467c6 tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
> >> - in 4.14 (but not 4.9)
> >>
> >> Lacking the second and third patches in 4.9 will not cause any new
> >> problems, but it will miss out on some nice fixes. If it's possible to
> >> get 2f6c498e4f15 and 600647d467c6 either into 4.9.74 or 4.9.75, I
> >> would be very grateful.
> >
> > These were not straight-forward to backport and I felt the risk outweighed
> > the gains.
> >
> > If you want to do the backport yourself and you feel confident in it,
> > feel free.
>
> Thanks, Greg and David. Looks like these 2 patches will cherry-pick
> cleanly if cherry-picked in the following sequence, on top of
> 4.9.74-rc1, which already has 6c9e73ef9aa7 ("tcp_bbr: record "full bw
> reached" decision in new full_bw_reached bit"):
>
> $ git checkout linux-stable-rc/linux-4.9.y
>
> $ git cherry-pick 2f6c498e4f15
> Performing inexact rename detection: 100% (17803152/17803152), done.
> [detached HEAD 0982234c57e1] tcp_bbr: reset full pipe detection on
> loss recovery undo
> Date: Thu Dec 7 12:43:31 2017 -0500
> 1 file changed, 4 insertions(+)
>
> $ git cherry-pick 600647d467c6
> Performing inexact rename detection: 100% (17803152/17803152), done.
> [detached HEAD 7e866eccd083] tcp_bbr: reset long-term bandwidth
> sampling on loss recovery undo
> Date: Thu Dec 7 12:43:32 2017 -0500
> 1 file changed, 1 insertion(+)
>
> $ git log --oneline --decorate | head -3
> 7e866eccd083 (HEAD) tcp_bbr: reset long-term bandwidth sampling on
> loss recovery undo
> 0982234c57e1 tcp_bbr: reset full pipe detection on loss recovery undo
> 79070be7f1ae (linux-stable-rc/linux-4.9.y) Linux 4.9.74-rc1
>
> I verified that this compiles without warnings, and boots, and BBR works.
>
> Shall I prepare another version of these 2 patches, or do we think
> this recipe will be sufficient? (Sorry I am not more familiar with the
> backport-to-stable process.)

That works, those two patches are now queued up for the next stable
release, thanks!

greg k-h

2018-01-02 22:23:50

by Shuah Khan

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On 01/01/2018 07:31 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.74 release.
> There are 75 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Jan 3 14:00:03 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.74-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

2018-01-02 22:32:24

by Neal Cardwell

Subject: Re: [PATCH 4.9 00/75] 4.9.74-stable review

On Tue, Jan 2, 2018 at 3:08 PM, Greg KH <[email protected]> wrote:
> On Tue, Jan 02, 2018 at 02:11:25PM -0500, Neal Cardwell wrote:
...
>> Thanks, Greg and David. Looks like these 2 patches will cherry-pick
>> cleanly if cherry-picked in the following sequence, on top of
>> 4.9.74-rc1, which already has 6c9e73ef9aa7 ("tcp_bbr: record "full bw
>> reached" decision in new full_bw_reached bit"):
>>
>> $ git checkout linux-stable-rc/linux-4.9.y
>>
>> $ git cherry-pick 2f6c498e4f15
>> Performing inexact rename detection: 100% (17803152/17803152), done.
>> [detached HEAD 0982234c57e1] tcp_bbr: reset full pipe detection on
>> loss recovery undo
>> Date: Thu Dec 7 12:43:31 2017 -0500
>> 1 file changed, 4 insertions(+)
>>
>> $ git cherry-pick 600647d467c6
>> Performing inexact rename detection: 100% (17803152/17803152), done.
>> [detached HEAD 7e866eccd083] tcp_bbr: reset long-term bandwidth
>> sampling on loss recovery undo
>> Date: Thu Dec 7 12:43:32 2017 -0500
>> 1 file changed, 1 insertion(+)
>>
>> $ git log --oneline --decorate | head -3
>> 7e866eccd083 (HEAD) tcp_bbr: reset long-term bandwidth sampling on
>> loss recovery undo
>> 0982234c57e1 tcp_bbr: reset full pipe detection on loss recovery undo
>> 79070be7f1ae (linux-stable-rc/linux-4.9.y) Linux 4.9.74-rc1
>>
>> I verified that this compiles without warnings, and boots, and BBR works.
>>
>> Shall I prepare another version of these 2 patches, or do we think
>> this recipe will be sufficient? (Sorry I am not more familiar with the
>> backport-to-stable process.)
>
> That works, those two patches are now queued up for the next stable
> release, thanks!
>
> greg k-h

Great. Thank you, Greg and David!

neal

2018-01-16 03:50:53

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

Please revert this on 4.9 and 4.14.
It breaks IGMP routing. The problem can be reproduced with any IPTV
connection using igmp-proxy. Reverting this patch fixes the issue.

Sebastian



On 01.01.2018 at 15:32, Greg Kroah-Hartman wrote:
> 4.9-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Kevin Cernekee <[email protected]>
>
>
> [ Upstream commit a46182b00290839fa3fa159d54fd3237bd8669f0 ]
>
> Closing a multicast socket after the final IPv4 address is deleted
> from an interface can generate a membership report that uses the
> source IP from a different interface. The following test script, run
> from an isolated netns, reproduces the issue:
>
> #!/bin/bash
>
> ip link add dummy0 type dummy
> ip link add dummy1 type dummy
> ip link set dummy0 up
> ip link set dummy1 up
> ip addr add 10.1.1.1/24 dev dummy0
> ip addr add 192.168.99.99/24 dev dummy1
>
> tcpdump -U -i dummy0 &
> socat EXEC:"sleep 2" \
> UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 &
>
> sleep 1
> ip addr del 10.1.1.1/24 dev dummy0
> sleep 5
> kill %tcpdump
>
> RFC 3376 specifies that the report must be sent with a valid IP source
> address from the destination subnet, or from address 0.0.0.0. Add an
> extra check to make sure this is the case.
>
> Signed-off-by: Kevin Cernekee <[email protected]>
> Reviewed-by: Andrew Lunn <[email protected]>
> Signed-off-by: David S. Miller <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> ---
> net/ipv4/igmp.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -89,6 +89,7 @@
> #include <linux/rtnetlink.h>
> #include <linux/times.h>
> #include <linux/pkt_sched.h>
> +#include <linux/byteorder/generic.h>
>
> #include <net/net_namespace.h>
> #include <net/arp.h>
> @@ -321,6 +322,23 @@ igmp_scount(struct ip_mc_list *pmc, int
> return scount;
> }
>
> +/* source address selection per RFC 3376 section 4.2.13 */
> +static __be32 igmpv3_get_srcaddr(struct net_device *dev,
> + const struct flowi4 *fl4)
> +{
> + struct in_device *in_dev = __in_dev_get_rcu(dev);
> +
> + if (!in_dev)
> + return htonl(INADDR_ANY);
> +
> + for_ifa(in_dev) {
> + if (inet_ifa_match(fl4->saddr, ifa))
> + return fl4->saddr;
> + } endfor_ifa(in_dev);
> +
> + return htonl(INADDR_ANY);
> +}
> +
> static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu)
> {
> struct sk_buff *skb;
> @@ -368,7 +386,7 @@ static struct sk_buff *igmpv3_newpack(st
> pip->frag_off = htons(IP_DF);
> pip->ttl = 1;
> pip->daddr = fl4.daddr;
> - pip->saddr = fl4.saddr;
> + pip->saddr = igmpv3_get_srcaddr(dev, &fl4);
> pip->protocol = IPPROTO_IGMP;
> pip->tot_len = 0; /* filled in later */
> ip_select_ident(net, skb, NULL);
>
>
>

--
Kind regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Registered office: Stubenwaldallee 21a, 64625 Bensheim
Register court: Amtsgericht Darmstadt, HRB 25473
Managing directors: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: [email protected]
Tel.: +496251-582650 / Fax: +496251-5826565

2018-01-16 03:59:10

by Kevin Cernekee

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On Mon, Jan 15, 2018 at 7:50 PM, Sebastian Gottschall
<[email protected]> wrote:
> please revert that on 4.9 and 4.14
> it breaks igmp routing. it can be reproduced with any iptv connection using
> igmp-proxy. reverting this patch fixes the issue.

Hi Sebastian,

Is this the correct igmp-proxy (based on mrouted)?

https://github.com/mirror/dd-wrt/tree/master/src/router/igmp-proxy

What is the actual vs. expected source address you are seeing on the
IGMP packets?

2018-01-16 04:26:37

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On 16.01.2018 at 04:58, Kevin Cernekee wrote:
> On Mon, Jan 15, 2018 at 7:50 PM, Sebastian Gottschall
> <[email protected]> wrote:
>> please revert that on 4.9 and 4.14
>> it breaks igmp routing. it can be reproduced with any iptv connection using
>> igmp-proxy. reverting this patch fixes the issue.
> Hi Sebastian,
>
> Is this the correct igmp-proxy (based on mrouted)?
>
> https://github.com/mirror/dd-wrt/tree/master/src/router/igmp-proxy
>
> What is the actual vs. expected source address you are seeing on the
> IGMP packets?
That GitHub mirror is unmaintained (svn.dd-wrt.com is the correct
repository), but yes. The same applies to upstream
https://github.com/pali/igmpproxy.

I haven't checked the source addresses yet. I basically discovered that
this patch breaks IGMP routing and all traffic stops.
The output below is from a working system with the patch reverted. If you
really need me to break it again by applying the patch, you will need to
wait a little.

05:14:22.697962 IP 10.88.195.138 > 239.35.100.8: igmp v2 report 239.35.100.8
root@shellfast:/proc/4032/net# /tmp/ip mroute
(193.158.35.251, 239.35.20.4)    Iif: ppp0       Oifs: briptv
(10.88.193.141, 239.255.255.250) Iif: ppp0       Oifs: briptv
(10.88.193.145, 239.255.255.250) Iif: ppp0       Oifs: briptv
(10.88.195.138, 239.255.255.250) Iif: ppp0       Oifs: briptv
(10.88.195.129, 239.255.255.250) Iif: ppp0       Oifs: briptv
(10.88.193.134, 239.255.255.250) Iif: ppp0       Oifs: briptv
(10.88.195.1, 239.255.255.250)   Iif: ppp0       Oifs: briptv
(10.88.193.109, 239.255.255.250) Iif: ppp0       Oifs: br0
(10.88.193.145, 239.192.152.143) Iif: ppp0       Oifs: br0
(10.88.193.1, 239.255.255.250)   Iif: ppp0       Oifs: br0

>


2018-01-16 04:32:39

by Kevin Cernekee

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On Mon, Jan 15, 2018 at 8:26 PM, Sebastian Gottschall
<[email protected]> wrote:
> havent check the source addresses right now. i basicly discovered that this
> patch breaks the igmp routing and all traffic stops
> this here is from a working system with the reverted patch. if you really
> need that i break it again using the patch you need to wait a little bit
>
> 05:14:22.697962 IP 10.88.195.138 > 239.35.100.8: igmp v2 report 239.35.100.8

The patch should only affect IGMPv3 behavior. I did not intend to
change IGMPv2 behavior. If it does, that might be a bug.

Is it possible that the kernel is using a source IP of 0.0.0.0, but
another host does not recognize it because it does not comply with RFC
3376?

Before/after packet traces would be the best way to see if the kernel
change is causing it to violate the standard.

2018-01-16 04:44:27

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On 16.01.2018 at 05:32, Kevin Cernekee wrote:
> On Mon, Jan 15, 2018 at 8:26 PM, Sebastian Gottschall
> <[email protected]> wrote:
>> havent check the source addresses right now. i basicly discovered that this
>> patch breaks the igmp routing and all traffic stops
>> this here is from a working system with the reverted patch. if you really
>> need that i break it again using the patch you need to wait a little bit
>>
>> 05:14:22.697962 IP 10.88.195.138 > 239.35.100.8: igmp v2 report 239.35.100.8
> The patch should only affect IGMPv3 behavior. I did not intend to
> change IGMPv2 behavior. If it does, that might be a bug.
It does change the behaviour. I don't know the reason, but after
discovering the issue on 4.14 last week and again on 4.9 this week while
testing (the last firmware I built was from 30 December and worked), I
tracked it down to this small patch, and it worked immediately after
reverting it.
> Is it possible that the kernel is using a source IP of 0.0.0.0, but
> another host does not recognize it because it does not comply with RFC
> 3376?
That is possible, yes, but I cannot look into the Deutsche Telekom host.
>
> Before/after packet traces would be the best way to see if the kernel
> change is causing it to violate the standard.
Let me take a look at the patch:
+ for_ifa(in_dev) {
+ if (inet_ifa_match(fl4->saddr, ifa))
+ return fl4->saddr;
+ } endfor_ifa(in_dev);
This looks like you are checking whether the source address matches a
subnet on the local interface, and if not, you return 0.0.0.0 instead of
the source address.

(193.158.35.251, 239.35.20.4)    Iif: ppp0       Oifs: briptv

Our first source address here, 193.158.35.251, is from a remote network,
so your patch will also change the behaviour, since the source address
will be ignored.
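
The check described above can be modelled in userspace. This is a minimal Python sketch, not the kernel code: `select_report_source` and its arguments are hypothetical names, and the device's subnets are passed in as plain strings. It only mirrors the rule that the source address is kept when it falls inside a subnet configured on the device and is replaced by 0.0.0.0 otherwise.

```python
import ipaddress


def select_report_source(saddr, device_subnets):
    """Return saddr if it lies in one of the device's subnets, else 0.0.0.0.

    Hypothetical userspace model of the for_ifa()/inet_ifa_match() loop
    in the patch; device_subnets is a list such as ["10.1.1.1/24"].
    """
    addr = ipaddress.ip_address(saddr)
    for subnet in device_subnets:
        # strict=False accepts an interface address such as 10.1.1.1/24
        network = ipaddress.ip_network(subnet, strict=False)
        if addr in network:
            return str(addr)
    # No configured subnet matched: fall back to the unspecified address.
    return "0.0.0.0"
```

For example, `select_report_source("192.168.99.99", [])` models the case from the patch description where the device has no address left, and returns "0.0.0.0".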
>


2018-01-16 04:55:36

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

3.18 and 4.4 are affected too.

On 16.01.2018 at 04:50, Sebastian Gottschall wrote:
> please revert that on 4.9 and 4.14
> it breaks igmp routing. it can be reproduced with any iptv connection
> using igmp-proxy. reverting this patch fixes the issue.
>
> Sebastian
>
>
>
> On 01.01.2018 at 15:32, Greg Kroah-Hartman wrote:
>> 4.9-stable review patch.  If anyone has any objections, please let me
>> know.
>>
>> ------------------
>>
>> From: Kevin Cernekee <[email protected]>
>>
>>
>> [ Upstream commit a46182b00290839fa3fa159d54fd3237bd8669f0 ]
>>
>> Closing a multicast socket after the final IPv4 address is deleted
>> from an interface can generate a membership report that uses the
>> source IP from a different interface.  The following test script, run
>> from an isolated netns, reproduces the issue:
>>
>>      #!/bin/bash
>>
>>      ip link add dummy0 type dummy
>>      ip link add dummy1 type dummy
>>      ip link set dummy0 up
>>      ip link set dummy1 up
>>      ip addr add 10.1.1.1/24 dev dummy0
>>      ip addr add 192.168.99.99/24 dev dummy1
>>
>>      tcpdump -U -i dummy0 &
>>      socat EXEC:"sleep 2" \
>> UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 &
>>
>>      sleep 1
>>      ip addr del 10.1.1.1/24 dev dummy0
>>      sleep 5
>>      kill %tcpdump
>>
>> RFC 3376 specifies that the report must be sent with a valid IP source
>> address from the destination subnet, or from address 0.0.0.0. Add an
>> extra check to make sure this is the case.
>>
>> Signed-off-by: Kevin Cernekee <[email protected]>
>> Reviewed-by: Andrew Lunn <[email protected]>
>> Signed-off-by: David S. Miller <[email protected]>
>> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>> ---
>>   net/ipv4/igmp.c |   20 +++++++++++++++++++-
>>   1 file changed, 19 insertions(+), 1 deletion(-)
>>
>> --- a/net/ipv4/igmp.c
>> +++ b/net/ipv4/igmp.c
>> @@ -89,6 +89,7 @@
>>   #include <linux/rtnetlink.h>
>>   #include <linux/times.h>
>>   #include <linux/pkt_sched.h>
>> +#include <linux/byteorder/generic.h>
>>     #include <net/net_namespace.h>
>>   #include <net/arp.h>
>> @@ -321,6 +322,23 @@ igmp_scount(struct ip_mc_list *pmc, int
>>       return scount;
>>   }
>>   +/* source address selection per RFC 3376 section 4.2.13 */
>> +static __be32 igmpv3_get_srcaddr(struct net_device *dev,
>> +                 const struct flowi4 *fl4)
>> +{
>> +    struct in_device *in_dev = __in_dev_get_rcu(dev);
>> +
>> +    if (!in_dev)
>> +        return htonl(INADDR_ANY);
>> +
>> +    for_ifa(in_dev) {
>> +        if (inet_ifa_match(fl4->saddr, ifa))
>> +            return fl4->saddr;
>> +    } endfor_ifa(in_dev);
>> +
>> +    return htonl(INADDR_ANY);
>> +}
>> +
>>   static struct sk_buff *igmpv3_newpack(struct net_device *dev,
>> unsigned int mtu)
>>   {
>>       struct sk_buff *skb;
>> @@ -368,7 +386,7 @@ static struct sk_buff *igmpv3_newpack(st
>>       pip->frag_off = htons(IP_DF);
>>       pip->ttl      = 1;
>>       pip->daddr    = fl4.daddr;
>> -    pip->saddr    = fl4.saddr;
>> +    pip->saddr    = igmpv3_get_srcaddr(dev, &fl4);
>>       pip->protocol = IPPROTO_IGMP;
>>       pip->tot_len  = 0;    /* filled in later */
>>       ip_select_ident(net, skb, NULL);
>>
>>
>>
>


2018-01-16 05:16:41

by Kevin Cernekee

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On Mon, Jan 15, 2018 at 8:44 PM, Sebastian Gottschall
<[email protected]> wrote:
> On 16.01.2018 at 05:32, Kevin Cernekee wrote:
>>
>> On Mon, Jan 15, 2018 at 8:26 PM, Sebastian Gottschall
>> <[email protected]> wrote:
>>>
>>> havent check the source addresses right now. i basicly discovered that
>>> this
>>> patch breaks the igmp routing and all traffic stops
>>> this here is from a working system with the reverted patch. if you really
>>> need that i break it again using the patch you need to wait a little bit
>>>
>>> 05:14:22.697962 IP 10.88.195.138 > 239.35.100.8: igmp v2 report
>>> 239.35.100.8
>>
>> The patch should only affect IGMPv3 behavior. I did not intend to
>> change IGMPv2 behavior. If it does, that might be a bug.
>
> it does change the behaviour indeed. i dont know the reason. but i while
> discovering the issue on 4.14 last week and newly on 4.9 this week while
> testing
> (my latest firmware i builded was from 30. december and worked) i got
> tracked it down to this small patch and it immediatly worked after reverting
> it
>>
>> Is it possible that the kernel is using a source IP of 0.0.0.0, but
>> another host does not recognize it because it does not comply with RFC
>> 3376?
>
> this is possible yes, but i cannot look into the "deutsche telekom" host
>>
>>
>> Before/after packet traces would be the best way to see if the kernel
>> change is causing it to violate the standard.
>
> let me just take a look into our patch
> + for_ifa(in_dev) {
> + if (inet_ifa_match(fl4->saddr, ifa))
> + return fl4->saddr;
> + } endfor_ifa(in_dev);
> this looks like you're checking if the source address matches to a local
> interface, if not you return 0.0.0.0 instead of the source address
>
> (193.158.35.251, 239.35.20.4) Iif: ppp0 Oifs: briptv
>
> our first source address here 193.158.35.251 is from a remote network. so
> your patch also will change the behaviour since the source address will get
> ignored

According to my understanding of igmpv3_newpack(), the destination
address should always be IGMPV3_ALL_MCR = 224.0.0.22. That is what I
see in my testing.

However, your packet trace says 239.35.100.8. I don't know how the
code that we touched would be generating an IGMPv2 packet with that
destination address.

Would it be possible to get a stack trace for the case where the
source address is being cleared to 0.0.0.0 in your configuration?
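
The mismatch between the two destination addresses can be made concrete. This is a minimal sketch, assuming the standard behaviour of the protocol versions (an IGMPv1/v2 report is addressed to the group being reported, an IGMPv3 report to 224.0.0.22); `report_destination` is a hypothetical helper for illustration, not a kernel function.

```python
IGMPV3_ALL_MCR = "224.0.0.22"  # all IGMPv3-capable multicast routers


def report_destination(igmp_version, group):
    """Return the IP destination a membership report for `group` is sent to."""
    if igmp_version == 3:
        # v3 reports always go to the all-IGMPv3-routers address
        return IGMPV3_ALL_MCR
    if igmp_version in (1, 2):
        # v1/v2 reports are addressed to the group being reported
        return group
    raise ValueError(f"unsupported IGMP version: {igmp_version}")
```

Under these assumptions, a v2 report for 239.35.100.8 is expected to be sent to 239.35.100.8, which is what the trace shows, while anything built by igmpv3_newpack() would be sent to 224.0.0.22.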

2018-01-16 05:55:37

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On Tue, Jan 16, 2018 at 04:50:39AM +0100, Sebastian Gottschall wrote:
> please revert that on 4.9 and 4.14
> it breaks igmp routing. it can be reproduced with any iptv connection using
> igmp-proxy. reverting this patch fixes the issue.

So Linus's kernel also is broken for you? Or does it work properly
there?

thanks,

greg k-h

2018-01-16 07:35:05

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On 16.01.2018 at 06:55, Greg Kroah-Hartman wrote:
> On Tue, Jan 16, 2018 at 04:50:39AM +0100, Sebastian Gottschall wrote:
>> please revert that on 4.9 and 4.14
>> it breaks igmp routing. it can be reproduced with any iptv connection using
>> igmp-proxy. reverting this patch fixes the issue.
> So Linus's kernel also is broken for you? Or does it work properly
> there?^
All kernels (3.18, 4.4, 4.9, 4.14) have been broken since 3 January unless
this patch is removed.
This can be verified with Deutsche Telekom IPTV and igmp-proxy.
I stumbled across this issue recently when I updated my kernel source to
the latest revision. The last working revision I tested with IPTV was from
late December, and IPTV stopped working after updating.
So I stepped back until I found this small patch, which turned out to be
the cause.

Sebastian
>
> thanks,
>
> greg k-h
>


2018-01-16 08:15:17

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On Tue, Jan 16, 2018 at 08:34:54AM +0100, Sebastian Gottschall wrote:
> On 16.01.2018 at 06:55, Greg Kroah-Hartman wrote:
> > On Tue, Jan 16, 2018 at 04:50:39AM +0100, Sebastian Gottschall wrote:
> > > please revert that on 4.9 and 4.14
> > > it breaks igmp routing. it can be reproduced with any iptv connection using
> > > igmp-proxy. reverting this patch fixes the issue.
> > So Linus's kernel also is broken for you? Or does it work properly
> > there?^
> all kernels are broken since 3th january. 3.18, 4.4, 4.9. 4.14 unless this
> patch is removed.

And also 4.15-rc8? That's what I meant by "Linus's kernel".

> this can be verified with deutsche telekom iptv and igmp-proxy.

I don't know what this means, sorry.

thanks,

greg k-h

2018-01-16 09:18:24

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports


> According to my understanding of igmpv3_newpack(), the destination
> address should always be IGMPV3_ALL_MCR = 224.0.0.22. That is what I
> see in my testing.
>
> However, your packet trace says 239.35.100.8. I don't know how the
> code that we touched would be generating an IGMPv2 packet with that
> destination address.
Easy answer, from Wikipedia: 224.0.x.x is not the only multicast block.

224.0.0.0 to 224.0.0.255 Local subnetwork
224.0.1.0 to 224.0.1.255 Internetwork control
224.0.2.0 to 224.0.255.255 AD-HOC block 1
224.3.0.0 to 224.4.255.255 AD-HOC block 2
232.0.0.0 to 232.255.255.255 Source-specific multicast
233.0.0.0 to 233.251.255.255 GLOP addressing
233.252.0.0 to 233.255.255.255 AD-HOC block 3
234.0.0.0 to 234.255.255.255 Unicast-prefix-based
239.0.0.0 to 239.255.255.255 Administratively scoped
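
The list above can be turned into a small lookup. This is a minimal Python sketch that uses only the ranges and labels quoted in the list; `multicast_block` is a hypothetical helper name, and addresses outside the listed ranges simply return `None`.

```python
import ipaddress

# Ranges and labels copied from the list above.
MULTICAST_BLOCKS = [
    ("224.0.0.0", "224.0.0.255", "Local subnetwork"),
    ("224.0.1.0", "224.0.1.255", "Internetwork control"),
    ("224.0.2.0", "224.0.255.255", "AD-HOC block 1"),
    ("224.3.0.0", "224.4.255.255", "AD-HOC block 2"),
    ("232.0.0.0", "232.255.255.255", "Source-specific multicast"),
    ("233.0.0.0", "233.251.255.255", "GLOP addressing"),
    ("233.252.0.0", "233.255.255.255", "AD-HOC block 3"),
    ("234.0.0.0", "234.255.255.255", "Unicast-prefix-based"),
    ("239.0.0.0", "239.255.255.255", "Administratively scoped"),
]


def multicast_block(address):
    """Return the label of the listed block containing address, or None."""
    addr = ipaddress.IPv4Address(address)
    for first, last, label in MULTICAST_BLOCKS:
        # IPv4Address objects compare numerically, so a range check suffices
        if ipaddress.IPv4Address(first) <= addr <= ipaddress.IPv4Address(last):
            return label
    return None
```

With this lookup, the group from the packet trace, 239.35.100.8, falls in the administratively scoped block, while 224.0.0.22 falls in the local subnetwork block.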


> Would it be possible to get a stack trace for the case where the
> source address is being cleared to 0.0.0.0 in your configuration?
You mean something like dump_stack(), and watching the flood wash over me?



2018-01-16 09:21:31

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

In addition, 224.0.0.x is not routable.
All the other multicast addresses I showed you are routable, so your code
will filter out valid multicast origins, since they aren't present on local
interfaces. It looks like your code only works for 224.0.0.x addresses,
but not for any other valid multicast address.




2018-01-16 15:26:03

by David Miller

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

From: Sebastian Gottschall <[email protected]>
Date: Tue, 16 Jan 2018 04:50:39 +0100

> please revert that on 4.9 and 4.14
> it breaks igmp routing. it can be reproduced with any iptv connection
> using igmp-proxy. reverting this patch fixes the issue.

Then should it be reverted in mainline as well?

Please submit a proper report to the netdev list with the patch
author CC:'d explaining the situation exactly so that we can move
forward.

Thank you.

2018-01-16 15:32:17

by Kevin Cernekee

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On Tue, Jan 16, 2018 at 1:18 AM, Sebastian Gottschall
<[email protected]> wrote:
>
>> According to my understanding of igmpv3_newpack(), the destination
>> address should always be IGMPV3_ALL_MCR = 224.0.0.22. That is what I
>> see in my testing.
>>
>> However, your packet trace says 239.35.100.8. I don't know how the
>> code that we touched would be generating an IGMPv2 packet with that
>> destination address.
>
> easy answer from wikipedia. 224.0.x.x is not the only multicast block

AFAICT the code that was changed by this patch should not have
anything to do with other multicast blocks. It generates an IGMPv3
report with destination address 224.0.0.22. So it would be useful to
get more information on how exactly it is causing a failure, so we can
find the root cause.

>> Would it be possible to get a stack trace for the case where the
>> source address is being cleared to 0.0.0.0 in your configuration?
>
> you mean something like dumpstack and watching the flood comes over me?

I would just add something like this into my local tree for testing:

WARN_ON_ONCE(pip->saddr == htonl(INADDR_ANY));
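
In the kernel, WARN_ON_ONCE() logs a warning with a backtrace the first
time its condition is true and stays quiet afterwards, so the check above
would give one stack trace rather than a flood. The once-only behaviour can
be sketched in user space as below. This is only an illustration under
stated assumptions: struct warn_once_state and warn_if_saddr_unset() are
hypothetical names, and the sketch prints a message instead of a backtrace.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical per-call-site state: remembers whether we already warned. */
struct warn_once_state {
    bool warned;
};

/*
 * Check a source address given in network byte order.
 * Returns true whenever the address is 0.0.0.0, but prints the warning
 * only the first time that happens for the given state.
 */
bool warn_if_saddr_unset(struct warn_once_state *st, uint32_t saddr_be)
{
    bool hit = (saddr_be == htonl(INADDR_ANY));

    if (hit && !st->warned) {
        st->warned = true;
        fprintf(stderr, "warning: report built with source address 0.0.0.0\n");
    }
    return hit;
}
```

Calling it repeatedly with a zero source address keeps returning true, but
only the first call writes to stderr.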

2018-01-16 15:34:37

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

On 16.01.2018 at 16:25, David Miller wrote:
> From: Sebastian Gottschall <[email protected]>
> Date: Tue, 16 Jan 2018 04:50:39 +0100
>
>> please revert that on 4.9 and 4.14
>> it breaks igmp routing. it can be reproduced with any iptv connection
>> using igmp-proxy. reverting this patch fixes the issue.
> Then should it be reverted in mainline as well?
yes
>
> Please submit a proper report to the netdev list with the patch
> author CC:'d explaining the situation exactly so that we can move
> forward.
Could the author of the patch please handle that bureaucracy?
I mean, I wanted to let you know what my findings are; besides this, I'm
still a hard-working developer. I have integrated it into my own kernels and
have already informed the OpenWrt/LEDE people.
I don't have time for hours of discussion. I explained my thoughts about
the cause of the fault in the patch earlier; this should be enough.
A more proper report would require deeper analysis and debugging of the
problem, which isn't really necessary for 10 lines of code, from my point of view.

Sebastian
>
> Thank you.
>


2018-01-16 15:40:32

by Sebastian Gottschall

Subject: Re: [PATCH 4.9 27/75] net: igmp: Use correct source address on IGMPv3 reports

>> easy answer from wikipedia. 224.0.x.x is not the only multicast block
> AFAICT the code that was changed by this patch should not have
> anything to do with other multicast blocks. It generates an IGMPv3
> report with destination address 224.0.0.22. So it would be useful to
> get more information on how exactly it is causing a failure, so we can
> find the root cause.
From your code, it always returns 0.0.0.0, and not the correct origin, if
your rule doesn't match. This is a different behaviour: before, the
origin was used as is and not modified.
If the origin is set to zero, the routing of the network doesn't work
anymore. This works for local multicast addresses, but not for neighbor
networks which use the routing table.
But correct me if I'm wrong.
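
The behaviour described above can be modelled in user space roughly as
follows. This is a simplified sketch, not the kernel code: struct ifaddr4
and pick_report_saddr() are hypothetical names, and the rule "keep the
candidate only if it falls inside a subnet configured on the interface" is
an assumption about what the patch checks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one address configured on an interface. */
struct ifaddr4 {
    uint32_t addr;  /* interface address, host byte order */
    uint32_t mask;  /* netmask, host byte order */
};

/*
 * Return the candidate source address if it lies in the subnet of any
 * configured interface address; otherwise return 0 (i.e. 0.0.0.0).
 * Before the change discussed here, the candidate would have been used
 * unmodified.
 */
uint32_t pick_report_saddr(uint32_t candidate,
                           const struct ifaddr4 *ifa, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Same subnet when all bits covered by the mask are equal. */
        if (((candidate ^ ifa[i].addr) & ifa[i].mask) == 0)
            return candidate;
    }
    return 0;
}
```

In this model, a candidate of 10.0.0.1 on an interface configured as
192.168.1.1/24 is replaced by 0.0.0.0, which is the case the objection is
about.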

