Avoiding the Unnecessary Recompilation of Fortran 90 Modules

Abstract

A problem with the current use of modules in Fortran 90 under basic makefiles is that dependent modules are recompiled unnecessarily, even when the interface they depend on has not changed.

Presented here are two different methods to avoid the cascading recompilation effect when using modules. It is recommended that GNU Make be used for both methods. Additionally, the second method requires the use of a scripting language such as Perl.

Also addressed is the problem of matching target module information files to Fortran source files when they do not share the same base name, which can be solved by the use of variables in GNU Make.

See also the PDF article "Cascade Compilation Revisited", T. Stern and D. Grimwood, ACM Fortran Forum, April 2002, 12-24.

Background

With the introduction of modules in Fortran 90, the module information file was born. This allows the separation of the interface and object code of modules. Module interfaces provide an efficient means of error checking during compilation when calling routines in one module from another.

Often one makes changes to a source file that affect the object code but not the module interface. For example, the file containing

        module int
        contains
          function factorial(theint) result (res)
            integer :: theint,j,res
            res=1
            do j=1,theint
              res=res+j
            enddo
          end function
        end module

is found to have an error, and gets corrected to

        module int
        contains
          function factorial(theint) result (res)
            integer :: theint,j,res
            res=1
            do j=1,theint
              res=res*j
            enddo
          end function
        end module

Notice that only a mathematical formula has changed, and the interface has not changed.

If the interface to a module is not changed, and there are no inter-module optimisations, then there is no logical reason why other modules that depend on this module need recompiling. This presents a problem to Make: although object files need recompilation when source files are changed, module information files may not. Basic makefiles do not allow for this, as compilers usually replace a module information file even if its contents have not changed. Some compilers are slightly smarter, and do not touch the module information file if its contents have not changed, but the downside is that Make then thinks the module information file is out of date compared to the source file. To make matters worse, some compilers insert the compilation date into the module information file, which might be the only change made to that file, making it difficult to tell whether the interface really has changed.

Make uses file time stamps to determine whether dependencies need recompiling. What is required to prevent the unnecessary cascading recompilation of modules is the time when the interface of a module was last changed. This is not necessarily the last time the module was compiled.

For simplicity, the examples for the following methods use module, object and source files that share the same base name, e.g. a.mod, a.o and a.f90. Further down this page, it is shown how to extend Makefiles to deal with multiple modules per source file, module names in uppercase/lowercase, and modules whose names are unrelated to their source files.

It is recommended to use GNU Make or some other version of Make that supports variables and user-defined functions.

Method 1 - Recursive Make

This method works due to the non-recursive nature of the dependency tree determined by Make. That is, when Make is invoked, it determines the entire dependency graph before compiling anything. This means that once Make has tried to recompile a file, it will not try again during that invocation of Make.

In this method, the module information file depends only on its corresponding object file, not on any other module information files or source code. It is the object file that depends on the source code, and might also depend on other module information files depending on its "USE" statements. This means it does not explicitly matter whether a module information file is up to date or not compared to its own source file. Separate rules must be written for additional dependencies on the object file ("USE" statements, include files), although there are scripts available to do this.

%.mod : %.o
        @if [ ! -f $@ ]; then \
          rm $< ; \
          $(MAKE) $< ; \
        fi

%.o : %.f90
        $(FC) -c -o $@ $<

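In this example, there are several possibilities, taking "a.mod" as the target :

  • a.mod is missing. Make first brings a.o up to date; if a.mod still does not exist after that, a.o is removed and rebuilt by the recursive call to Make, which recreates a.mod.

  • a.mod exists but is older than a.o. The rule is run, but the test fails and nothing is done, so a.mod keeps its old timestamp and the modules that depend on it are not recompiled.

  • a.mod exists and is newer than a.o. Nothing is done.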

A variation on the theme :

%.mod :
        @if [ ! -f $@ ]; then \
          rm -f $(*F).o; \
        fi
        $(MAKE) $(*F).o

%.o : %.f90
        $(FC) -c -o $(*F).o $<

This is effectively the same as above, except Make is always recursively called to make the object file from the module information file rule. There are no explicit prerequisites for the module information files. It is expected that this method is slightly slower, due to the additional call to $(MAKE).
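
Both variants leave the dependencies of each object file on other modules (its "USE" statements) to be written separately. As a rough illustration of how such rules might be generated, the following shell sketch prints one dependency line per source file. The function name use_deps and the output file depend.mk are hypothetical, and the sketch assumes that module foo is stored in foo.mod and that each USE statement sits on its own line; neither assumption holds for every compiler or code base.

```shell
#!/bin/sh
# Sketch only: print a Make dependency line for one Fortran source file.
# Assumptions: module "foo" is stored in "foo.mod" (lowercase), and every
# USE statement sits on its own line.
use_deps() {
    src=$1
    obj=${src%.f90}.o
    mods=$(awk '
        {
            line = tolower($0)
            sub(/!.*/, "", line)                  # drop trailing comments
            if (line ~ /^[ \t]*use[ \t]+[a-z]/) {
                sub(/^[ \t]*use[ \t]+/, "", line) # drop the keyword
                sub(/[^a-z0-9_].*/, "", line)     # keep only the module name
                if (!(line in seen)) {
                    seen[line] = 1
                    printf " %s.mod", line
                }
            }
        }' "$src")
    printf '%s :%s\n' "$obj" "$mods"
}

# Hypothetical use: collect the rules in a file that the Makefile includes.
#   for f in *.f90; do use_deps "$f"; done > depend.mk
```

The sketch does not filter out modules that are defined in the same source file that uses them, and it ignores include files; a real dependency generator would need to handle both.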

Note that if the compiler conditionally overwrites a module information file depending on whether it has changed, this method is simple. If the compiler always overwrites the module information file, then the conditional overwriting behaviour must be mimicked. This can be achieved by backing up the module information file, compiling the new module information files, and keeping either the backup or the new file depending on a comparison of the two files. A common example of such a script is

     mv a.mod a.mod~ ; \
     $(FC) -c a.f90 ; \
     if cmp -s a.mod a.mod~ ; then \
       mv a.mod~ a.mod ; \
     else \
       rm a.mod~ ; \
     fi

If the compiler inserts the compile time into the module information file, a more exotic comparison than "cmp -s" may be required. Furthermore, if there are multiple modules in the source file, then the script should back up and compare each module information file.
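
The backup-and-compare script above can be generalised to several module information files. The following shell sketch shows one possible way to do so. The function name compile_keep_mods and its argument order are hypothetical, and it uses plain "cmp -s", so it assumes that the compiler does not write the date into the module information file.

```shell
#!/bin/sh
# Sketch only: run a compile command, then restore the old copy of every
# listed module information file whose contents did not change.
# Usage: compile_keep_mods "compile command" a.mod [b.mod ...]
compile_keep_mods() {
    cmd=$1; shift
    for m in "$@"; do
        if [ -f "$m" ]; then mv "$m" "$m~"; fi    # back up existing files
    done
    if ! sh -c "$cmd"; then
        # Compilation failed: put the backups back and report the failure.
        for m in "$@"; do
            if [ -f "$m~" ]; then mv "$m~" "$m"; fi
        done
        return 1
    fi
    for m in "$@"; do
        [ -f "$m~" ] || continue                  # no backup: a new module
        if [ -f "$m" ] && cmp -s "$m" "$m~"; then
            mv "$m~" "$m"                         # unchanged: keep old timestamp
        else
            rm -f "$m~"                           # changed: keep the new file
        fi
    done
}

# Hypothetical use from a Makefile recipe:
#   compile_keep_mods "$(FC) -c a.f90" a.mod b.mod
```

Because "mv" keeps the timestamp of the backup, an unchanged module information file looks untouched to Make, which is the behaviour this method relies on.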

Requirements

No additional requirements apart from Make are necessary if your compiler does not overwrite module information files that do not change. However, to use the above script you will need the utilities "cmp", "mv" and "rm", which are present in most flavours of Unix. You can get these for Windows from Cygwin.

Evaluation

Pros :

  • very simple to implement.

Cons :

  • module information file timestamps may always be out of date.

  • some people do not like the recursive use of "make".

Method 2 - Touch Files

This method uses the timestamps of additional files to record when each module information file last changed. The module information files themselves have their timestamps updated just as the compiler normally does, so Make thinks they are just being recompiled as usual. In this method, the Makefile itself is written almost as if the module recompilation issue does not exist. It is a wrapper script that takes care of all the module recompilation issues.

The module information file(s) and the object file have the same dependencies and prerequisites. A simple Makefile involves :

%.o %.mod : %.f90
        $(FC) -c -o $(*F).o $<

In the method proposed, the Makefile is very similar,

%.o %.mod : %.f90
        $(script) -fc "$(FC) -c -o $(*F).o $<" -provides "$(*F).mod $(*F).o" \
                -requires "$^"

The script takes the compiler command as an argument, and runs that command based on the script's determination of whether the module requires recompilation. The internal workings of the script do not need to be known by the user.

The way the script works is that for each .mod file, there is a corresponding .time file, which has the timestamp of when the .mod file was last changed. Whenever the module gets recompiled, the .mod file has its timestamp updated, but the .time file will only get updated if the contents of the .mod file changed during the recompilation.
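
As a minimal sketch of this bookkeeping (it is not the supplied Perl script), the shell function below compiles, always refreshes the .mod timestamp, and touches the .time file only when the .mod contents changed. The function name compile_and_stamp and the ".old" backup suffix are assumptions made for illustration, and the comparison uses plain "cmp -s".

```shell
#!/bin/sh
# Sketch only: keep NAME.time in step with the contents of NAME.mod.
# Usage: compile_and_stamp "compile command" a.mod [b.mod ...]
compile_and_stamp() {
    cmd=$1; shift
    for m in "$@"; do
        rm -f "$m.old"
        if [ -f "$m" ]; then cp "$m" "$m.old"; fi   # remember previous contents
    done
    sh -c "$cmd"; status=$?
    for m in "$@"; do
        stamp=${m%.mod}.time
        if [ "$status" -eq 0 ] && [ -f "$m" ]; then
            # New or changed interface, or no record yet: update the touch file.
            if [ ! -f "$stamp" ] || [ ! -f "$m.old" ] || ! cmp -s "$m" "$m.old"; then
                touch "$stamp"
            fi
            touch "$m"          # the .mod itself always looks freshly built
        fi
        rm -f "$m.old"
    done
    return "$status"
}
```

Unlike the backup script of Method 1, the old .mod file is never restored here; only the .time file records whether the interface changed.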

The way the targets and prerequisites are used by the script is as follows. Say source file "a.f90" depends on files "z.mod", "y.mod" and "defs.h", and that it produces files "a.o", "a.mod" and "b.mod" when compiled. The makefile rule is:

a.o a.mod b.mod : a.f90 z.mod y.mod defs.h
        $(script) -fc "$(FC) -c -o a.o a.f90" -provides "a.mod b.mod a.o" \
                -requires "a.f90 z.mod y.mod defs.h"

Telling the script which files go in the "-requires" argument can be done by using variables in Make, as discussed on the web page "GNU Make Variables and Pattern Rules". The script then takes the information from its arguments and internally constructs what is equivalent to :

a.o a.mod b.mod : a.f90 z.time y.time defs.h
        $(FC) -c -o a.o a.f90

If any of the target files are missing or out of date, then the compiler is run. Additionally, if the files "a.time" or "b.time" are missing, the compiler is also run. If the compiler is successfully run, then the target files are given up to date timestamps, and the "a.time" or "b.time" files are given new timestamps if their .mod files changed as a result of the compilation.

If none of the target files are missing or out of date, and the .time files exist (regardless of their timestamps), then the target files are given up to date timestamps, without recompilation.

Since there is no dependence on .o or .mod files in this method, but rather the .time files, the cascading recompilation is easily avoided when appropriate.
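
The decision described above can be sketched in shell as follows. This is an illustration rather than the logic of the supplied script: the function name needs_compile is hypothetical, file names are assumed to contain no spaces, and treating a missing prerequisite (or a missing .time file for a prerequisite module) as a reason to compile is an assumption.

```shell
#!/bin/sh
# Sketch only: decide whether the compiler must be run.
# Usage: needs_compile "targets" "prerequisites"
# Returns 0 if a compile is needed, 1 if touching the targets is enough.
needs_compile() {
    targets=$1; prereqs=$2
    for t in $targets; do
        [ -f "$t" ] || return 0                     # a target is missing
        case $t in
            *.mod) [ -f "${t%.mod}.time" ] || return 0 ;;   # no touch file yet
        esac
        for p in $prereqs; do
            case $p in
                *.mod) p=${p%.mod}.time ;;          # look at z.time, not z.mod
            esac
            [ -e "$p" ] || return 0                 # assumption: missing file forces a compile
            if [ -n "$(find "$p" -newer "$t")" ]; then
                return 0                            # target is out of date
            fi
        done
    done
    return 1
}

# Hypothetical use inside a wrapper, with $fc, $provides and $requires
# taken from the -fc, -provides and -requires arguments:
#   if needs_compile "$provides" "$requires"; then sh -c "$fc" || exit 1; fi
#   touch $provides
```

Because z.mod is replaced by z.time before the comparison, a recompilation of z that leaves its interface unchanged does not make a.o look out of date.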

Example Module Compilation Script

The script compile_mod.pl may be used to perform this method. To use it, change "$(script)" in the example above to read "perl -w compile_mod.pl".

Note that this script requires the additional argument -cmp command, for example, -cmp "cmp -s" on Unix. The command is used to compare module information files, and must return error code 0 for no difference and 1 for a difference between the two files being compared. An alternative to "cmp -s" is given in the section titled "Module Information Files from various compilers".

Requirements

To use the supplied script, you will need Perl. This comes standard with Linux. If you don't have it, you can get it from CPAN.

Evaluation

Pros :

  • module information file timestamps always get updated by recompilation.

Cons :

  • requires an external program such as a perl script.

  • fills a directory full of "touch files".

Module Information Files from various compilers

As stated earlier, module information files from some vendors contain the compilation date, which makes the comparison from before and after compilation difficult.

Here is what a comparison using "cmp -l" gave from successive compilations of the same module for different vendors.

The likely scenarios are:

No change
  The compiler does not put the date in. This is great!
Change
  The file position changes between each run.

System                                          Easy to compare?  Reason
----------------------------------------------  ----------------  ------
Fujitsu v1.0 on Linux                           yes               no change in .mod file
Fujitsu/Lahey 6.0 and 6.1 on Linux              yes               no change in .mod file
DEC on OSF1                                     yes               no change in .mod file
Pacific Sierra Research vf90 Personal on Linux  yes               no change in .vo file
SGI MIPSpro version 7.30                        yes               no change in .mod file
SUN WorkShop Compilers 5.0                      yes               no change in .mod file
Compaq V5.3 on TRU64                            yes               change in .mod file always same position
IBM XLF90 on AIX                                no/yes            it should be ok in principle, since the change in .mod file is near the end of file, but I haven't used xlf90 in a while to check.
Intel Fortran Compiler for Linux v6             yes               change in .d file at same offset. Note though a .d file is produced for each source file, not each module.
Intel Fortran Compiler for Linux v7.x           yes and no        change in .mod file at same offset from end of file, but this offset depends on the module's name and where temporary files are located.
Intel Fortran Compiler for Linux v8.x           yes               change in .mod file always same position
Absoft Pro 8.0 / Linux                          yes               no change in .mod file

The Perl script compare_module_file.pl may be used as a drop-in replacement for "cmp -s" to compare module information files. To use it as a direct replacement for "cmp -s", type

        perl -w compare_module_file.pl file1 file2

or if the script knows about your compiler, type something like

        perl -w compare_module_file.pl -compiler COMPAQ-f95-on-OSF1 file1 file2

Run the script without arguments to see what compilers it knows about.

If you have details of an unlisted compiler, please let me know. To test whether your compiler inserts extra garbage (such as the date) into the module information file, follow these steps:

  • Compile a module. Back up the module information file to a different directory.

  • Recompile the module. You may need to type "touch filename.f90" if you use make to do it.

  • Compare the backed up and new module information files by using "cmp -l backup/filename.mod filename.mod". This will print out the byte offsets where the files differ.

    Alternatively, use the Perl script print_cmp.pl and do "perl -w print_cmp.pl backup/filename.mod filename.mod". This script will also report the byte numbers ordered from the end of file, since some vendors insert the date at a specific offset from the end of the file.

  • If cmp returns nothing, you have a nice compiler and update_module_file will work as is.

  • If cmp returns some values, repeat the procedure with another module. If the two modules return the same offsets, then these are the values to insert into update_module_file. If they are different, then are they the same offset from the end of the files? We haven't yet written the code for this case, but it should not be difficult using Perl. If the offsets appear to have nothing in common, then the method probably cannot be applied to your compiler.
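
Once the offsets are known, a comparison that ignores them could look like the following shell sketch. It is not one of the Perl scripts mentioned on this page: the function name cmp_ignoring and its argument order are hypothetical. It accepts offsets counted from the start of the file (as printed by "cmp -l") and, as an untested idea for the end-of-file case, offsets counted from the end. It returns 0 for no relevant difference and 1 otherwise, and treats files of different length as different.

```shell
#!/bin/sh
# Sketch only: compare two files while ignoring known "date" bytes.
# Usage: cmp_ignoring "FROM_START" "FROM_END" file1 file2
#   FROM_START: offsets counted from the start (1 = first byte), as in "cmp -l"
#   FROM_END:   offsets counted from the end (1 = last byte)
# Returns 0 if the files differ at most at those offsets, 1 otherwise.
cmp_ignoring() {
    from_start=$1; from_end=$2; f1=$3; f2=$4
    [ -f "$f1" ] && [ -f "$f2" ] || return 1
    size=$(( $(wc -c < "$f1") ))
    [ "$size" -eq $(( $(wc -c < "$f2") )) ] || return 1   # lengths differ
    cmp -l "$f1" "$f2" | awk -v s="$from_start" -v e="$from_end" -v size="$size" '
        BEGIN {
            n = split(s, a, " "); for (i = 1; i <= n; i++) skip[a[i] + 0] = 1
            n = split(e, b, " "); for (i = 1; i <= n; i++) skip[size - b[i] + 1] = 1
        }
        !(($1 + 0) in skip) { bad = 1 }   # a difference outside the ignored bytes
        END { exit bad + 0 }'
}

# Hypothetical use: ignore bytes 17 to 20 from the start of the file.
#   cmp_ignoring "17 18 19 20" "" backup/filename.mod filename.mod
```

To use something like this where a two-argument command such as "cmp -s" is expected, the offsets would have to be fixed inside a small wrapper script.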

Wishlist

It would be easier if compilers did not insert the date and other near-random data into .mod files, so that the comparison would be easy. Even a compiler switch to turn this off, for those that do it, would help. Unfortunately, the Fortran standard at the moment does not say that the module information file must not change if the module interface does not change (module information files do not appear in the standard at all). Until that happens, we have little hope of vendors making our lives easier.

The easiest solution from a user's point of view would be a compiler switch so that, if the .mod file already exists and is identical to the new one, it is not overwritten at all. This would not be difficult for compilers to implement, especially if they use a temporary directory first, and it would remove the need for our scripts.

Acknowledgements

In compiling all this information, the author wishes to thank the following people for their ideas or input.

  • Ted Stern

  • Peter Shenkin

  • Arjen Markus

  • Dylan Jayatilaka

Author

Daniel Grimwood
reaper@ivec.org

Comments appreciated.