Date: Wed, 28 Jun 2017 18:57:19 -0500
From: "Gustavo A. R. Silva"
To: Frans Klaver
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: Re: [kernel-sched-cputime] question about probable bug in cputime_adjust()

Hi Frans,

Quoting Frans Klaver:

> On Wed, Jun 28, 2017 at 7:35 AM, Frans Klaver wrote:
>> On Wed, Jun 28, 2017 at 1:03 AM, Gustavo A. R. Silva wrote:
>>>
>>> Hello everybody,
>>>
>>> While looking into Coverity ID 1371643 I ran into the following piece of
>>> code at kernel/sched/cputime.c:568:
>>>
>>> 568 /*
>>> 569  * Adjust tick based cputime random precision against scheduler runtime
>>> 570  * accounting.
>>> 571  *
>>> 572  * Tick based cputime accounting depend on random scheduling timeslices of a
>>> 573  * task to be interrupted or not by the timer. Depending on these
>>> 574  * circumstances, the number of these interrupts may be over or
>>> 575  * under-optimistic, matching the real user and system cputime with a variable
>>> 576  * precision.
>>> 577  *
>>> 578  * Fix this by scaling these tick based values against the total runtime
>>> 579  * accounted by the CFS scheduler.
>>> 580  *
>>> 581  * This code provides the following guarantees:
>>> 582  *
>>> 583  *   stime + utime == rtime
>>> 584  *   stime_i+1 >= stime_i, utime_i+1 >= utime_i
>>> 585  *
>>> 586  * Assuming that rtime_i+1 >= rtime_i.
>>> 587  */
>>> 588 static void cputime_adjust(struct task_cputime *curr,
>>> 589                            struct prev_cputime *prev,
>>> 590                            u64 *ut, u64 *st)
>>> 591 {
>>> 592         u64 rtime, stime, utime;
>>> 593         unsigned long flags;
>>> 594
>>> 595         /* Serialize concurrent callers such that we can honour our guarantees */
>>> 596         raw_spin_lock_irqsave(&prev->lock, flags);
>>> 597         rtime = curr->sum_exec_runtime;
>>> 598
>>> 599         /*
>>> 600          * This is possible under two circumstances:
>>> 601          *  - rtime isn't monotonic after all (a bug);
>>> 602          *  - we got reordered by the lock.
>>> 603          *
>>> 604          * In both cases this acts as a filter such that the rest of the code
>>> 605          * can assume it is monotonic regardless of anything else.
>>> 606          */
>>> 607         if (prev->stime + prev->utime >= rtime)
>>> 608                 goto out;
>>> 609
>>> 610         stime = curr->stime;
>>> 611         utime = curr->utime;
>>> 612
>>> 613         /*
>>> 614          * If either stime or both stime and utime are 0, assume all runtime is
>>> 615          * userspace. Once a task gets some ticks, the monotonicy code at
>>> 616          * 'update' will ensure things converge to the observed ratio.
>>> 617          */
>>> 618         if (stime == 0) {
>>> 619                 utime = rtime;
>>> 620                 goto update;
>>> 621         }
>>> 622
>>> 623         if (utime == 0) {
>>> 624                 stime = rtime;
>>> 625                 goto update;
>>> 626         }
>>> 627
>>> 628         stime = scale_stime(stime, rtime, stime + utime);
>>> 629
>>> 630 update:
>>> 631         /*
>>> 632          * Make sure stime doesn't go backwards; this preserves monotonicity
>>> 633          * for utime because rtime is monotonic.
>>> 634          *
>>> 635          *  utime_i+1 = rtime_i+1 - stime_i
>>> 636          *            = rtime_i+1 - (rtime_i - utime_i)
>>> 637          *            = (rtime_i+1 - rtime_i) + utime_i
>>> 638          *            >= utime_i
>>> 639          */
>>> 640         if (stime < prev->stime)
>>> 641                 stime = prev->stime;
>>> 642         utime = rtime - stime;
>>> 643
>>> 644         /*
>>> 645          * Make sure utime doesn't go backwards; this still preserves
>>> 646          * monotonicity for stime, analogous argument to above.
>>> 647          */
>>> 648         if (utime < prev->utime) {
>>> 649                 utime = prev->utime;
>>> 650                 stime = rtime - utime;
>>> 651         }
>>> 652
>>> 653         prev->stime = stime;
>>> 654         prev->utime = utime;
>>> 655 out:
>>> 656         *ut = prev->utime;
>>> 657         *st = prev->stime;
>>> 658         raw_spin_unlock_irqrestore(&prev->lock, flags);
>>> 659 }
>>>
>>> The issue here is that the value assigned to variable utime at line 619 is
>>> overwritten at line 642, which would make such variable assignment useless.
>>
>> It isn't completely useless, since utime is used in line 628 to
>> calculate stime.
>
> Oh, I missed 'goto update'. Never mind about this one.
>
>>> But I'm suspicious that such assignment is actually correct and that line
>>> 642 should be included into the IF block at line 640. Something similar to
>>> the following patch:
>>>
>>> --- a/kernel/sched/cputime.c
>>> +++ b/kernel/sched/cputime.c
>>> @@ -637,9 +637,10 @@ static void cputime_adjust(struct task_cputime *curr,
>>>           *            = (rtime_i+1 - rtime_i) + utime_i
>>>           *            >= utime_i
>>>           */
>>> -       if (stime < prev->stime)
>>> +       if (stime < prev->stime) {
>>>                 stime = prev->stime;
>>> -       utime = rtime - stime;
>>> +               utime = rtime - stime;
>>> +       }
>>>
>>> If you confirm this, I will send a patch in a full and proper form.
>>>
>>> I'd really appreciate your comments.
>>
>> If you do that, how would you meet the guarantee made in line 583?

You are right, I see now.
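To convince myself, I replayed the 'update' section with my change in a
tiny userspace sketch (plain C with a stand-in for scale_stime(); the
numbers and the helper are made up for illustration, this is not the
kernel code):

#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;

/* Stand-in mirroring what scale_stime() computes: stime * rtime / total */
static u64 scale_stime(u64 stime, u64 rtime, u64 total)
{
        return stime * rtime / total;
}

int main(void)
{
        /* Sample state: rtime grew, and the scaled stime already exceeds prev->stime */
        u64 rtime = 100, stime = 60, utime = 30;
        u64 prev_stime = 50, prev_utime = 20;

        stime = scale_stime(stime, rtime, stime + utime); /* 60 * 100 / 90 = 66 */

        /* My proposed change: utime only recomputed inside the if block */
        if (stime < prev_stime) {
                stime = prev_stime;
                utime = rtime - stime;
        }
        /* stime (66) >= prev_stime (50), so utime keeps the stale value 30 */

        if (utime < prev_utime) { /* 30 < 20 is false, so no correction here */
                utime = prev_utime;
                stime = rtime - utime;
        }

        prev_stime = stime;
        prev_utime = utime;

        /* Prints 96 != 100: the guarantee from line 583 is broken */
        printf("stime + utime = %llu, rtime = %llu\n",
               (unsigned long long)(prev_stime + prev_utime),
               (unsigned long long)rtime);
        return 0;
}

With the code as it is today, line 642 would instead set utime = 100 - 66
= 34 unconditionally, and the sum stays equal to rtime.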
Then in this case the following patch would be the way to go:

--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -615,10 +615,8 @@ static void cputime_adjust(struct task_cputime *curr,
          * userspace. Once a task gets some ticks, the monotonicy code at
          * 'update' will ensure things converge to the observed ratio.
          */
-       if (stime == 0) {
-               utime = rtime;
+       if (stime == 0)
                goto update;
-       }
 
        if (utime == 0) {
                stime = rtime;

but I think this one is even better:

--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -615,19 +615,11 @@ static void cputime_adjust(struct task_cputime *curr,
          * userspace. Once a task gets some ticks, the monotonicy code at
          * 'update' will ensure things converge to the observed ratio.
          */
-       if (stime == 0) {
-               utime = rtime;
-               goto update;
-       }
-
-       if (utime == 0) {
+       if (stime != 0 && utime == 0)
                stime = rtime;
-               goto update;
-       }
-
-       stime = scale_stime(stime, rtime, stime + utime);
+       else
+               stime = scale_stime(stime, rtime, stime + utime);
 
-update:
        /*
         * Make sure stime doesn't go backwards; this preserves monotonicity
         * for utime because rtime is monotonic.
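With that second hunk applied, the classification would collapse to the
shape below (just my reading of the resulting code, untested):

        /* No system ticks yet: attribute all runtime to userspace */
        if (stime != 0 && utime == 0)
                stime = rtime;
        else
                stime = scale_stime(stime, rtime, stime + utime);

and execution falls straight into the old 'update' section, which
recomputes utime as rtime - stime either way.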
What do you think?

Thank you!
--
Gustavo A. R. Silva