Optimization creates infinite loop

Users of numerical analysis packages such as those at www.netlib.org, especially the codes in the ACM TOMS group, often find themselves contending with Fortran 77 subroutines and functions that are now obsolete, because Fortran 90 and later provide intrinsics such as RADIX, EPSILON, HUGE and TINY that do the same job. The proper fix is to rework the old code, replacing the ad hoc pieces with RADIX, etc., but that is not a chore one wishes to undertake in every case; often, we just want to build and run the old code with minimum effort.
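
As a rough illustration of what that rework amounts to, the sketch below prints the values that legacy machine-parameter routines usually try to discover at run time. The program name PINQ and the variable X are made up for this example; the four intrinsics are standard Fortran 90 numeric inquiry functions. The code sticks to a subset that should compile as either fixed or free source form.

```fortran
      PROGRAM PINQ
! Hypothetical example: query the floating-point model directly
! instead of probing it with arithmetic at run time.
      IMPLICIT NONE
      DOUBLE PRECISION X
      X = 0.0D0
! Only the type and kind of X matter to these intrinsics,
! not its value.
      WRITE(*,*) 'RADIX   = ', RADIX(X)
      WRITE(*,*) 'EPSILON = ', EPSILON(X)
      WRITE(*,*) 'HUGE    = ', HUGE(X)
      WRITE(*,*) 'TINY    = ', TINY(X)
      END PROGRAM
```

Because these are inquiry functions resolved from the type of the argument, there is no run-time arithmetic for an optimizer to rearrange.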

Here is an example where the user's purpose is defeated by the high quality of the IFort optimizer. I extracted a reproducer from www.netlib.org/toms, file 768.gz (TENSOLVE, a solver for simultaneous nonlinear equations). The test code is intended to output the floating-point base (radix), which is 2 here. With other compilers, and with IFort /Od, it does. However, with /Ot or the default /fast, the program goes into an infinite loop, as did one of the test problem runs from TOMS-768 compiled with IFort.

      PROGRAM PBETA
      IMPLICIT NONE
      WRITE(*,*)'IBETA = ',IBETA()

      CONTAINS
      INTEGER FUNCTION IBETA()
C
C     returns radix(0d0)
C
      IMPLICIT NONE
      INTEGER ITEMP
      DOUBLE PRECISION A, B, TEMP, TEMP1
      DOUBLE PRECISION ZERO, ONE
      DATA ZERO, ONE/0.0D0, 1.0D0/

      A = ONE
      B = ONE
   10 CONTINUE
      A = A + A
      TEMP = A + ONE
      TEMP1 = TEMP - A
      IF (TEMP1-ONE .EQ. ZERO) GO TO 10
   20 CONTINUE
      B = B + B
      TEMP = A + B
      ITEMP = INT(TEMP-A)
      IF (ITEMP .EQ. 0) GO TO 20
      IBETA = ITEMP
      RETURN
      END FUNCTION
      END PROGRAM

The portion of the disassembly that corresponds to the lines from statement 10 to the IF statement four lines later is as follows. In effect, the optimizer replaces the IF statement before statement 20 with IF (0d0 .EQ. 0d0) GO TO 10. It similarly replaces lines 25 and 26 of the program (TEMP = A + B and ITEMP = INT(TEMP-A)) with, in effect, ITEMP = INT(B).

  00000076: 0F 28 CA           movaps      xmm1,xmm2     ; A=ONE
  00000079: 66 0F EF C0        pxor        xmm0,xmm0     ; should have been TEMP1 - ONE
  0000007D: 66 0F 2E C0        ucomisd     xmm0,xmm0     ; comparing 0D0 with 0D0
  00000081: F2 0F 58 C9        addsd       xmm1,xmm1     ; A=A+A
  00000085: 7A 02              jp          00000089      ; never happens
  00000087: 74 F4              je          0000007D      ; always true
  00000089: F2 0F 58 D2        addsd       xmm2,xmm2     ; B=B+B
  0000008D: F2 0F 2C C2        cvttsd2si   eax,xmm2

Lines 3 to 6 of the disassembly constitute the infinite loop. The loop is entered with xmm0 = 0 (from the PXOR), and xmm0 is never changed in the loop.

The compiler is allowed to make such transformations, so it cannot be faulted on grounds of standards conformance. In fact, it could have continued its aggressive work and replaced the jp/je pair with an unconditional jmp. It could perhaps also have removed the ucomisd instruction, which is no longer needed. However, once it has optimized the code, the compiler is in a position to see that it has created an infinite loop, and at that point it is probably worth issuing a warning to the unsuspecting user.

After all, when a program built from about 9000 lines of code "works" with competing compilers, hangs with IFort/default-options but works with IFort /Od, one may suspect a compiler bug. Nor is it trivial to locate the problem in the source code. What I did was to use the fsplit utility on the source files, and I then compiled all pieces with /Od for the base run. I then compiled each split file with /Ot, until I found the one that caused the infinite loop to occur.
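
For anyone who only wants the legacy routine to run, one possible workaround (an assumption on my part, not something verified against the IFort versions discussed here) is to mark the two temporaries VOLATILE, a Fortran 2003 attribute. Every reference to TEMP and TEMP1 then has to go through memory, so the compiler should no longer be able to simplify (A + ONE) - A algebraically. The names PBETAV and IBETAV are hypothetical; the algorithm is unchanged from IBETA.

```fortran
      PROGRAM PBETAV
      IMPLICIT NONE
      WRITE(*,*) 'IBETA = ', IBETAV()

      CONTAINS
      INTEGER FUNCTION IBETAV()
! Same algorithm as IBETA above; only the declarations differ.
      IMPLICIT NONE
      INTEGER ITEMP
      DOUBLE PRECISION A, B
      DOUBLE PRECISION, PARAMETER :: ZERO = 0.0D0, ONE = 1.0D0
! VOLATILE forces a store and a reload for each use, which is
! assumed to block the algebraic simplification.
      DOUBLE PRECISION, VOLATILE :: TEMP, TEMP1

      A = ONE
      B = ONE
! Double A until adding ONE to it is no longer exact.
   10 CONTINUE
      A = A + A
      TEMP = A + ONE
      TEMP1 = TEMP - A
      IF (TEMP1 - ONE .EQ. ZERO) GO TO 10
! Double B until A + B differs from A; the gap is the radix.
   20 CONTINUE
      B = B + B
      TEMP = A + B
      ITEMP = INT(TEMP - A)
      IF (ITEMP .EQ. 0) GO TO 20
      IBETAV = ITEMP
      END FUNCTION
      END PROGRAM
```

On IEEE 754 double precision hardware this should print 2 whatever the optimization level, at the cost of a few extra memory accesses. The Intel compilers also document a /fp family of options (for example /fp:strict) that restricts value-changing floating-point optimizations; whether that alone cures this case was not tested here.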
