From: Anshuman Khandual
To: Michal Hocko
Cc: Linux Kernel Mailing List, Ingo Molnar, Alexander Viro
Subject: Re: Question regarding MAX_ARG_STRLEN with execve()
Date: Tue, 4 Jul 2017 16:36:25 +0530
Message-Id: <97c0bbde-0f1a-2096-f99c-9a884eb39a67@linux.vnet.ibm.com>
In-Reply-To: <20170703092151.GF3217@dhcp22.suse.cz>
References: <8138c533-dae2-6a6a-fabd-d090b72d4d99@linux.vnet.ibm.com> <20170630142218.GB22923@dhcp22.suse.cz> <20170703092151.GF3217@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/03/2017 02:51 PM, Michal Hocko wrote:
> On Mon 03-07-17 13:58:59, Anshuman Khandual wrote:
>> On 06/30/2017 07:52 PM, Michal Hocko wrote:
>>> On Fri 30-06-17 11:59:37, Anshuman Khandual wrote:
>>>> Hello,
>>>>
>>>> execve() system call should support argument length of
>>>> MAX_ARG_STRLEN (PAGE_SIZE * 32).
>>>> On 64K page size systems, we
>>>> are not able to pass 32 * PAGE_SIZE arguments into the execve()
>>>> system call because of the following reasons.
>>>>
>>>> * struct linux_binprm's vma starts with a size of PAGE_SIZE
>>>>
>>>>     vma->vm_end = STACK_TOP_MAX;
>>>>     vma->vm_start = vma->vm_end - PAGE_SIZE;
>>>>
>>>> * The VMA expands as much depending upon the argument size. So
>>>>   for 32 * PAGE_SIZE argument, it becomes 33 * PAGE_SIZE.
>>>>
>>>> * 33 * PAGE_SIZE with 64K pages fails the following test in
>>>>   get_arg_page() function. 33 * PAGE_SIZE is more than 2MB
>>>>   (8 MB /4) with 64K page size.
>>>>
>>>>     if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
>>>>
>>>> * Right now RLIMIT_STACK is hard coded 8MB which does not take
>>>>   PAGE_SIZE into account.
>>>>
>>>> Wondering what should be the solution for this problem ?
>>>>
>>>> * Change the default stack size from 8MB ?
>>> just increase the ulimit if you want to use such a large arguments.
>>>
>>
>> Yeah that is possible but it does not still offset the fact that
>> the calculation is broken on the page size of 64K. I mean, yeah
>> its not practical to have such a large argument. But the point
>> is whether we would want to support the MAX_ARG_STRLEN semantic
>> for execve system call or not. At present its broken for 64K
>> and I am asking whether we will be willing to revisit the
>> '1/4th of the stack' condition.
>
> I dunno. We have this 1/4 of RLIMIT semantic for years and it doesn't
> seem there were any bug reports. Yes, MAX_ARG_STRLEN being PAGE_SIZE
> dependent is unfortunate because it makes an arch independent default
> ulimit hard to get right but I am not sure we actually have to lose
> sleep over this.

I understand your point.

>
> Or do you have any specific proposal how to "fix" this limitation which
> wouldn't break other userspace?

There are three variables here: MAX_ARG_STRLEN, RLIMIT_STACK and the
25% condition.
execve() has supported MAX_ARG_STRLEN for a long time, so it cannot be
changed now. That leaves changing either the default RLIMIT_STACK value
or the 25% condition. Both are kernel-internal implementation details,
but I am not sure how changing them might affect other userspace
behavior, hence asking for suggestions. I just wanted to explore the
possibilities of a fix here.
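
To make the arithmetic concrete, here is a rough userspace sketch of how
the three variables interact. It is not kernel code. It only models the
numbers quoted in this thread (a 32 * PAGE_SIZE string, one initial page
in the VMA, and the size > rlim_cur / 4 test), and the helper names
arg_vma_size(), arg_vma_fits() and min_stack_rlimit() are made up for
illustration. RLIM_INFINITY is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the thread: MAX_ARG_STRLEN is PAGE_SIZE * 32, and the argument
 * VMA starts out with one page already in place. */
#define ARG_STRLEN_PAGES  32u
#define INITIAL_VMA_PAGES 1u

/* Hypothetical helper: VMA size after one maximal argument string,
 * i.e. the 33 * PAGE_SIZE figure from the original report. */
static uint64_t arg_vma_size(uint64_t page_size)
{
	assert(page_size > 0);
	return (ARG_STRLEN_PAGES + INITIAL_VMA_PAGES) * page_size;
}

/* Hypothetical helper modelled on the quoted test
 *     if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
 * Returns true when the size would be accepted. */
static bool arg_vma_fits(uint64_t page_size, uint64_t stack_rlimit)
{
	return arg_vma_size(page_size) <= stack_rlimit / 4;
}

/* Hypothetical helper: smallest RLIMIT_STACK value for which a maximal
 * argument string passes the 25% condition in this model. */
static uint64_t min_stack_rlimit(uint64_t page_size)
{
	return 4 * arg_vma_size(page_size);
}
```

Under this model, arg_vma_fits(4096, 8 MB) holds while
arg_vma_fits(65536, 8 MB) does not, and min_stack_rlimit(65536) comes
out to 8448 KB, slightly above the 8 MB default. So, as far as this
simplified model goes, either the default would need to scale with
PAGE_SIZE or the 25% fraction would need to be relaxed for 64K pages.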