A great many scientific studies rely on code written in Fortran. Unfortunately, "scientific" applications are not immune to trivial errors such as uninitialized variables. What can calculations like that lead to? Sometimes the effect of such mistakes is a "serious breakthrough" in science, and sometimes really big problems: who knows where the results will end up being used (though we can guess). I would like to share several simple and effective techniques for checking existing Fortran code with the Intel compiler so that you can avoid such trouble.
We will focus on problems with floating-point numbers. Errors with uninitialized variables are hard to find, especially if the code was started back in the Fortran 77 days. The catch is that even if we never declare a variable, it is declared implicitly according to the first letter of its name, following the so-called implicit typing rules (the latest standards still support this). Names starting with the letters I through N are of type INTEGER, and all others are of type REAL. So if a variable F unexpectedly turns up in our code as a multiplier, the compiler reports no error and simply makes F a REAL. This wonderful example compiles and runs just fine:
program test
  z = f*10
  print *, z, f
end program test
As you can guess, anything at all may appear on the screen. I got this:
-1.0737418E+09 -1.0737418E+08
Interestingly, the language also offers a way to forbid such games with variable declarations, but only within a single program unit, by writing implicit none. If you forget it in some module, "phantom" variables can still appear there. I once came across random characters appended to a variable name in a calculation. Apparently someone had been typing in another window, and some of the keystrokes landed in the program source while switching between windows. The program kept on computing, and nothing complained about the variable. Such errors are extremely hard to track down, especially if the code has worked for years without problems.
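To make the "phantom variable" problem concrete, here is a minimal sketch. The program and the names total and totl are made up for illustration. Because implicit none is missing, the misspelled name on the left of the assignment quietly becomes a new REAL variable, and the intended update never happens.

```fortran
program typo_demo
  ! IMPLICIT NONE is deliberately omitted, so implicit typing is active.
  real :: total

  total = 10.0
  ! Typo: "totl" is not declared, so it silently becomes a new REAL
  ! variable, and "total" is never updated.
  totl = total + 5.0

  print '(a,f6.1)', 'total = ', total
end program typo_demo
```

The program reports 10.0 instead of the intended 15.0. Adding implicit none after the program statement should turn the typo into a compile-time error.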
Therefore, I strongly recommend always using implicit none and letting the compiler report every variable that was not explicitly declared (even if it is initialized and otherwise fine):
program test
  implicit none
  ...
end program test

error #6404: This name does not have a type, and must have an explicit type. [Z]
error #6404: This name does not have a type, and must have an explicit type. [F]
If we are dealing with code that has already been written, changing every source file can be very laborious, so you can use the /warn:declarations (Windows) or -warn declarations (Linux) compiler option instead. It produces a warning:
warning #6717: This name has not been given an explicit type. [Z]
warning #6717: This name has not been given an explicit type. [F]
Once we have dealt with all the implicitly declared variables and made sure they cause no errors, we can move on to the next act: the search for uninitialized variables.
One standard technique is to have the compiler initialize every variable with a value that makes it obvious, as soon as the variable is used, that the developer forgot to initialize it. The value should be highly "unusual", and ideally using it should stop the application on the spot, so that we catch the culprit red-handed.
The logical choice is the signaling value SNaN, Signaling NaN (Not-a-Number). This is a floating-point number with a special representation, and any attempt to operate on it raises an exception. Note that a variable can also end up holding NaN as the result of ordinary operations, for example dividing zero by zero, multiplying zero by infinity, or dividing infinity by infinity. So before we start "catching" uninitialized variables, we should make sure that our code raises no floating-point exceptions of its own.
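The operations that produce NaN can be checked with the standard ieee_arithmetic module. This is a small sketch, assuming a compiler that supports IEEE arithmetic and a build with floating-point exceptions left masked (that is, without fpe0); the program name is made up.

```fortran
program nan_demo
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan, ieee_value, &
                                           ieee_positive_inf
  implicit none
  real :: zero, inf

  zero = 0.0
  inf  = ieee_value(zero, ieee_positive_inf)

  ! Each of these is an invalid operation; with exceptions masked,
  ! the result is a quiet NaN rather than a trap.
  print '(a,l1)', '0/0     is NaN: ', ieee_is_nan(zero / zero)
  print '(a,l1)', '0*inf   is NaN: ', ieee_is_nan(zero * inf)
  print '(a,l1)', 'inf/inf is NaN: ', ieee_is_nan(inf / inf)
end program nan_demo
```

With exceptions masked, all three lines should report T. Built with fpe0, the same program would be expected to stop at the first invalid operation instead.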
To do this, enable the /fpe:0 and /traceback (Windows) or -fpe0 and -traceback (Linux) options, build the application, and run it. If everything goes as usual and the application exits without raising an exception, we are in good shape. But it is quite possible that various "surprises" will surface at this stage already. The reason is that fpe0 changes the default handling of floating-point exceptions. By default they are masked, and we can happily divide by 0 without ever noticing; with fpe0 an exception is raised and the program stops. This happens not only on division by zero (divide-by-zero) but also on floating-point overflow and on invalid operations (floating invalid). The numerical results may change slightly as well, because denormalized numbers are now flushed to 0. That, in turn, can speed up your application considerably, since arithmetic on denormalized numbers is extremely slow, while arithmetic on zeros, as you can imagine, is not.
Another interesting point: with fpe0 you may get exceptions that are caused by certain compiler optimizations, for example vectorization. Suppose we loop over an array and divide by a value only if it is not 0, guarding the division with an if check. The division may still be executed, because the compiler has decided that this is much faster than using masked operations. In that case the code runs speculatively.
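The kind of loop in question looks roughly like the sketch below. The array names and sizes are made up, and whether a particular compiler version actually speculates on this exact loop depends on the optimization level and target, so treat it only as an illustration of the pattern.

```fortran
program guarded_division
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n), c(n)
  integer :: i

  a = 1.0
  b = [(real(mod(i, 2)), i = 1, n)]   ! alternating 1.0 and 0.0

  do i = 1, n
    ! The source divides only when b(i) is nonzero. A vectorizing
    ! compiler may still compute a(i)/b(i) for every element and
    ! select the result afterwards, which can trap under fpe0.
    if (b(i) /= 0.0) then
      c(i) = a(i) / b(i)
    else
      c(i) = 0.0
    end if
  end do

  print '(8f4.1)', c
end program guarded_division
```

The source itself never divides by zero, so an exception on this loop would point at the optimization rather than at the program logic.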
This can be controlled with the /Qfp-speculation:strict (Windows) or -fp-speculation=strict (Linux) option, which disables such optimizations for floating-point code. Another way is to switch the whole floating-point model with -fp-model strict, which has a large negative effect on overall application performance. I described the models available in the Intel compiler in an earlier post.
You can also simply lower the optimization level with the /O1 or /Od options on Windows (-O1 and -O0 on Linux).
The traceback option simply gives more detailed information about where the error occurred (function name, file, and line of code).
Let's run a test on Windows, compiling without optimization (with the /Od option):
program test
  implicit none
  real a,b
  a=0
  b = 1/a
  print *, 'b=', b
end program test
As a result, we will see the following on the screen:
b= Infinity
Now turn on the /fpe:0 and /traceback options and get the expected exception:
forrtl: error (73): floating divide by zero
Image     PC        Routine   Line   Source
test.exe  00F51050  _MAIN__   5      test.f90
…
Problems like this must be removed from the code before the next stage, namely forced initialization with SNaN values using the /Qinit:snan,arrays /traceback (Windows) or -init=snan,arrays -traceback (Linux) options.
Now every access to an uninitialized variable will result in a runtime error:
forrtl: error (182): floating invalid - possible uninitialized real/complex variable.
In the simplest example:
program test
  implicit none
  real a,b
  b = 1/a
  print *, 'b=', b
end program test

forrtl: error (182): floating invalid - possible uninitialized real/complex variable.
Image     PC        Routine   Line   Source
test.exe  00D01061  _MAIN__   4      test.f90
…
A few words about this strange init option. It appeared fairly recently, in compiler version 16.0 (the latest version at the time of writing is 17.0), and can initialize the following constructs to SNaN:
- Static scalars and arrays (with attribute SAVE)
- Local scalars and arrays
- Automatic (formed when calling functions) arrays
- Variables from modules
- Dynamically allocated (with the attribute ALLOCATABLE) arrays and scalars
- Pointers (variables with the POINTER attribute)
But there are a number of cases in which init does not work:
- Variables in EQUIVALENCE groups
- Variables in COMMON blocks
- Derived types and their components are not supported, except for ALLOCATABLE and POINTER components
- Formal (dummy) arguments are not initialized to SNaN inside the called procedure. However, the actual arguments passed to it can be initialized in the caller.
- References in arguments to intrinsic functions and in I/O expressions
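For variables that init does not reach, such as those in a COMMON block, one possible workaround is to fill them by hand with a signaling NaN from the standard ieee_arithmetic module. This is my own sketch, not something the compiler documentation prescribes; the block name state and the variables work and scale are hypothetical, and whether a later use actually traps still depends on building with fpe0 and on the hardware keeping the NaN signaling.

```fortran
program common_snan
  use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_is_nan, &
                                           ieee_signaling_nan, ieee_support_nan
  implicit none
  real :: work(4), scale
  common /state/ work, scale   ! hypothetical COMMON block

  if (.not. ieee_support_nan(1.0)) stop 'NaN is not supported for default REAL'

  ! Fill by hand what the init option does not cover.
  work  = ieee_value(1.0, ieee_signaling_nan)
  scale = ieee_value(1.0, ieee_signaling_nan)

  print '(a,l1)', 'all poisoned: ', all(ieee_is_nan(work)) .and. ieee_is_nan(scale)
end program common_snan
```

In a real code base the assignments would go at the very start of the main program, before anything reads the block.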
The option can not only initialize values to SNaN but also zero them. To do this, specify /Qinit:zero on Windows (-init=zero on Linux); then not only REAL and COMPLEX variables but also INTEGER and LOGICAL ones are initialized. Adding arrays initializes arrays as well, not just scalars.
For example, options:
-init=snan,zero     ! Linux and OS X systems
/Qinit:snan,zero    ! Windows systems
Here scalars of type REAL or COMPLEX are initialized to SNaN, and scalars of type INTEGER or LOGICAL are initialized to zero. The following example extends the initialization to arrays as well:
-init=zero -init=snan -init=arrays        ! Linux and OS X systems
/Qinit:zero /Qinit:snan /Qinit:arrays     ! Windows systems
In the past, Intel tried to provide this functionality through the -ftrapuv option, which was also meant to initialize values, but it never worked as intended. Today it is considered obsolete and is not recommended.
If you are working with the first-generation Intel Xeon Phi coprocessor (Knights Corner), this option is not available to you, because that hardware has no SNaN support.
Finally, here is a sample from the documentation, which we compile on Linux with all the options discussed and use to find uninitialized variables at run time:
! ==============================================================
!
! SAMPLE SOURCE CODE - SUBJECT TO THE TERMS OF SAMPLE CODE LICENSE AGREEMENT,
! http://software.intel.com/en-us/articles/intel-sample-source-code-license-agreement/
!
! Copyright 2015 Intel Corporation
!
! THIS FILE IS PROVIDED "AS IS" WITH NO WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT
! NOT LIMITED TO ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
! PURPOSE, NON-INFRINGEMENT OF INTELLECTUAL PROPERTY RIGHTS.
!
! ===============================================================
module mymod
  integer, parameter :: n=100
  real :: am
  real, allocatable, dimension(:) :: dm
  real, target, dimension(n) :: em
  real, pointer, dimension(:) :: fm
end module mymod

subroutine sub(a, b, c, d, e, m)
  use mymod
  integer, intent(in) :: m
  real, intent(in), dimension(n) :: c
  real, intent(in), dimension(*) :: d
  real, intent(inout), dimension(*) :: e
  real, automatic, dimension(m) :: f
  real :: a, b

  print *, a,b,c(2),c(n/2+1),c(n-1)
  print *, d(1:n:33) ! first and last elements uninitialized
  print *, e(1:n:30) ! middle two elements uninitialized
  print *, am, dm(n/2), em(n/2)
  print *, f(1:2) ! automatic array uninitialized

  e(1) = f(1) + f(2)
  em(1)= dm(1) + dm(2)
  em(2)= fm(1) + fm(2)
  b = 2.*am
  e(2) = d(1) + d(2)
  e(3) = c(1) + c(2)
  a = 2.*b
end

program uninit
  use mymod
  implicit none
  real, save :: a
  real, automatic :: b
  real, save, target, dimension(n) :: c
  real, allocatable, dimension(:) :: d
  real, dimension(n) :: e

  allocate (d (n))
  allocate (dm(n))
  fm => c

  d(5:96) = 1.0
  e(1:20) = 2.0
  e(80:100) = 3.0

  call sub(a,b,c,d,e(:),n/2)

  deallocate(d)
  deallocate(dm)
end program uninit
First, compile with -fpe0 and run:
$ ifort -O0 -fpe0 -traceback uninitialized.f90; ./a.out
  0.0000000E+00 -8.7806177E+13  0.0000000E+00  0.0000000E+00  0.0000000E+00
  0.0000000E+00   1.000000       1.000000      0.0000000E+00
   2.000000      0.0000000E+00  0.0000000E+00   3.000000
  0.0000000E+00  0.0000000E+00  0.0000000E+00
  1.1448686E+24  0.0000000E+00
We can see that the application raises no floating-point exceptions, but several values look "strange". Let's search for uninitialized variables with the init option:
$ ifort -O0 -init=snan -traceback uninitialized.f90; ./a.out
            NaN            NaN  0.0000000E+00  0.0000000E+00  0.0000000E+00
  0.0000000E+00   1.000000       1.000000      0.0000000E+00
   2.000000      0.0000000E+00  0.0000000E+00   3.000000
            NaN  0.0000000E+00  0.0000000E+00
  1.1448686E+24  0.0000000E+00
forrtl: error (182): floating invalid - possible uninitialized real/complex variable.
Image              PC                Routine   Line      Source
a.out              0000000000477535  Unknown   Unknown   Unknown
a.out              00000000004752F7  Unknown   Unknown   Unknown
a.out              0000000000444BF4  Unknown   Unknown   Unknown
a.out              0000000000444A06  Unknown   Unknown   Unknown
a.out              0000000000425DB6  Unknown   Unknown   Unknown
a.out              00000000004035D7  Unknown   Unknown   Unknown
libpthread.so.0    00007FC66DD26130  Unknown   Unknown   Unknown
a.out              0000000000402C11  sub_      39        uninitialized.f90
a.out              0000000000403076  MAIN__    62        uninitialized.f90
a.out              00000000004025DE  Unknown   Unknown   Unknown
libc.so.6          00007FC66D773AF5  Unknown   Unknown   Unknown
a.out              00000000004024E9  Unknown   Unknown   Unknown
Aborted (core dumped)
Now we see that on line 39 we refer to the uninitialized variable AM from the MYMOD module:
b = 2.*am
There are other errors in this code, which I suggest you find with the help of the Intel compiler. I hope this post is useful to anyone who writes Fortran code, and that your applications pass the necessary checks for uninitialized variables before they are released. Thank you for your attention and see you soon! Happy New Year, everyone!