G95-XML HOWTO

Philippe Marguinaud
http://g95-xml.sourceforge.net


1- Description:

G95-XML is based on the front-end of g95 ( http://www.g95.org ). g95 is a Fortran 95 compiler ( some parts of the Fortran 2003 standard have also been implemented ).

G95-XML uses the parser of g95 and dumps the internal representation of g95 structures in XML. Some extra information regarding the exact form and semantics of the parsed Fortran code has been added. For instance, the exact locations of statements and expressions are recorded, and no simplification of Fortran expressions takes place; hence it is possible to rewrite the code from the XML description.

Some Perl scripts are provided to work with the big XML files. These scripts emulate a development environment; their names are very well known: f95, ar, ld ( and a dummy cc and c++ ). The basic idea is to parse the code in the same manner as it is compiled, that is, using traditional Makefiles.

There is also an Apache+mod_perl browser. This browser requires Apache v1.3 and mod_perl v1.0. Note that Apache v2.0 and mod_perl v2.0 do not work yet, but I will look at that very soon.


2- Configure and build G95-XML:

  $ tar zxvf g95-xml-20080505.tgz
  $ cd g95-xml-20080505
  $ ./configure
  G95-XML configuration
    --perl-executable=...   : Path to your Perl executable
    --httpd-executable=...  : Path to your httpd+mod_perl executable
    --fortran-src=...       : Path to your Fortran source directory
    --arch64                : Must be set to true on 64 bits machines
    --help                  : Display help message

The configure script has the following arguments:

  --perl-executable
      Full path to your Perl executable; defaults to /usr/bin/perl, which should be OK.

  --httpd-executable
      Path to your httpd v1.3 compiled with mod_perl v1.0.

  --fortran-src
      Path to your Fortran source directory; this is the directory you want to be visible to your Apache web server. It defaults to `../examples', which we will be using for the HOWTO.

  --arch64
      This option must be set on 64-bit platforms.

I run configure with the following arguments on my 32-bit PC:

  $ ./configure --perl-executable=/opt/perl/bin/perl \
                --httpd-executable=/opt/apache_1.3.34/bin/httpd

Change to the src directory and type make to build the G95-XML parser:

  $ cd src
  $ make

This will build an executable named g95, which is the modified g95 parser.


3- Basic usage of G95-XML:

Once you have built the g95 parser, you can run the sample programs. Change to the base directory and type:

  $ . ./f95.sh

This command brings in the modified g95 and the Perl scripts from perl/. Then change to the examples directory and type:

  $ cd examples
  $ ../src/g95 -xml program.F

This will create an XML file:

  $ ls -lrt
  total 12
  -rw-r--r-- 1 philou philou  129 mai 5 16:09 program.F
  -rw-r--r-- 1 philou philou 5456 mai 5 16:09 program.xml

Now you can look at the XML file. At first glance, it seems rather hard to understand, but the principles are very simple:

  - each entity ( as defined in the Fortran language ) which appears in program.F is described by an XML tag ( enclosed in < > brackets ).
  - each entity has an id ( which is unique within an XML file ).
  - entities can refer to each other by means of their id.

For instance, program.xml contains a fragment describing the symbol `x' ( the ids will differ on your machine ). It tells us that `x' is a symbol which is a VARIABLE. The dimension and shape attributes show that `x' is an array with explicit shape, whose dimensions are:

  DIMENSION( 0x818af58 )

where `0x818af58' is a CONSTANT expression of type `0x8120a40' ( look for the definition of `0x8120a40' somewhere else in program.xml ). The value of `0x818af58' is 10.

The expression `0x818af58' also carries a field named `loc'; this attribute shows the exact position of the expression in the current file ( first line, first character, last line, last character, all starting from zero ).
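
To make the `loc' convention concrete, here is a minimal sketch of how a `loc' quadruple maps back to source text. It is written in Python only to stay self-contained and is not part of G95-XML. The helper name `text_at` is hypothetical, the source layout ( fixed form, six leading blanks ) is an assumed reconstruction of program.F, and the last character index is treated as exclusive, which is what the sample values printed later in this HOWTO suggest. The `loc' used is the one printed for the ASSIGNMENT statement in section 5.

```python
# Assumed layout of examples/program.F (fixed form, six leading blanks);
# an illustrative reconstruction, not a copy of the real file.
SOURCE = "\n".join([
    "      PROGRAM MAIN",
    "",
    "      IMPLICIT NONE",
    "",
    "      REAL, DIMENSION( 10 ) :: X",
    "",
    "      X(1) = 12.",
    "",
    "      END PROGRAM",
])


def text_at(source, loc):
    """Return the text covered by loc = (line0, col0, line1, col1).

    Lines and columns are zero-based; col1 is treated as exclusive
    (an assumption inferred from the sample values in this HOWTO).
    """
    line0, col0, line1, col1 = (int(v) for v in loc)
    lines = source.split("\n")
    if line0 == line1:
        return lines[line0][col0:col1]
    # Multi-line span: tail of the first line, whole middle lines,
    # head of the last line.
    parts = [lines[line0][col0:]]
    parts.extend(lines[line0 + 1:line1])
    parts.append(lines[line1][:col1])
    return "\n".join(parts)


# `loc' of the ASSIGNMENT statement, as printed by Data::Dumper (strings).
print(text_at(SOURCE, ["6", "6", "6", "16"]))
```

Since the values come out of Data::Dumper as strings, the helper converts them to integers before slicing.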

It may look very difficult to adopt such a description for a source code file, but in fact it is very handy when the XML description is reparsed with Perl, as we shall see in the next section. This description also provides all the details needed to analyse the source code.

Now that you have understood the basics of the XML data produced by the g95 parser, write your own small programs and play with the parser to get a better understanding of the output of G95-XML.


4- Pseudo-compiler:

Stay in the `examples' directory and type:

  $ f95 -c -keep-xml program.F
  Warnings: 0
  Errors: 0
  $ ls -lrt
  total 36
  -rw-r--r-- 1 philou philou   129 mai 5 16:09 program.F
  -rw-r--r-- 1 philou philou  5456 mai 5 16:39 program.xml
  -rw-r--r-- 1 philou philou 24576 mai 5 16:39 program.o

You now have a pseudo-object file, which contains Perl structures describing your Fortran source code. Now type:

  $ objdump program.o
  MAIN____________________________________ : D : PROGRAM
  main____________________________________ : D : PROGRAM

You now see the names of the external entities defined in program.o. The program is named `main', but since a Fortran program can be unnamed, the g95 parser also creates a default `MAIN_' symbol, which appears above.

Now type:

  $ objdump -silent -dump-symbol main program.o
  $VAR1 = bless( {
      'namespace' => 'fortran::namespace:0x8193580',
      'dump' => 'fortran::dump:0x0',
      'name' => 'main',
      'id' => '0x81939d0',
      'flavor' => 'PROGRAM'
  }, 'fortran::symbol' );

  $ objdump -silent -dump-id 0x8193580 program.o
  $VAR1 = bless( {
      'dump' => 'fortran::dump:0x0',
      'intrinsic_operators' => [],
      'datas' => [],
      'user_operators' => [],
      'commons' => [],
      'id' => '0x8193580',
      'labels' => [],
      'statement_tail' => 'fortran::statement:0x818be18',
      'interface_namespaces' => [],
      'symbol' => 'fortran::symbol:0x8193af8',
      'contained_namespaces' => [],
      'implicit' => [],
      'generics' => [],
      'symbols' => [
          'fortran::symbol:0x8193af8',
          'fortran::symbol:0x81939d0',
          'fortran::symbol:0x818b078'
      ],
      'equivs' => [],
      'statement_head' => 'fortran::statement:0x818a3a0'
  }, 'fortran::namespace' );

Those familiar with Perl will easily recognize Perl objects which describe Fortran entities. I will describe in the next section how to write custom programs in Perl to access pseudo-object files.

Perl objects contain real Perl references to each other, but in order to make them printable, we have to fiddle with the output of Data::Dumper so that Perl references are replaced by character strings such as `fortran::symbol:0x...'. Otherwise, Data::Dumper would print out the whole structure.

Now type:

  $ ar rv libprogram.a program.o
  $ ld -o program.exe program.o
  $ ls -lrt
  total 60
  -rw-r--r-- 1 philou philou   129 mai 5 16:09 program.F
  -rw-r--r-- 1 philou philou  5456 mai 5 16:39 program.xml
  -rw-r--r-- 1 philou philou 24576 mai 5 16:39 program.o
  -rw-r--r-- 1 philou philou 12288 mai 5 16:51 libprogram.a
  -rw-r--r-- 1 philou philou 12288 mai 5 16:51 program.exe

This has created two more files, as you would expect: a pseudo-library and a pseudo-executable.

You cannot execute `program.exe', but it contains the description of how the program was linked:

  $ objdump program.exe
  MAIN____________________________________ : D : PROGRAM : program.o
  main____________________________________ : D : PROGRAM : program.o

We see from the lines above that the `main' symbol has been linked in from the `program.o' file.

Once they have been created, pseudo-objects, pseudo-libraries and pseudo-executables should not be moved around on your filesystem. At least their relative paths should be preserved; this is because `program.exe' does not contain `program.o', it just records that the `main' symbol has been loaded from `program.o'.

Now type:

  $ objdump -silent -rewrite program.o
  PROGRAM main
  IMPLICIT NONE
  REAL, DIMENSION(10) :: x
  x(1) = 1.2E+1
  END PROGRAM main

The code has been rewritten from the description contained in `program.o'. The semantics are preserved; no simplification should ever occur. You can even recompile the rewritten code:

  $ objdump -silent -rewrite program.o > program1.F
  $ f95 -c program1.F
  Warnings: 0
  Errors: 0

And compare it with the previous compilation:

  $ objdump -silent -compare program.o program1.o

You can try `objdump' with the `-compare' option on other source code. It will tell you the first place where the two pseudo-object files differ.


5- Programming with the Perl modules:

Here is a sample script which opens a Fortran pseudo-object and prints each statement type, and some extra info about ASSIGNMENT statements:

____
#!/opt/perl/bin/perl -w

# the two following lines bring in the
# Perl fortran modules
use lib "/utemp/philou/fortran/g95-xml-20080505/perl";
use fortran;

( my $obj = shift )
  or die( "Usage: $0 file.o\n" );

# open the pseudo-object and retrieve its fortran::dump object
my $o = 'fortran'->open( file => $obj );
my $dump = $o->dump();

# walk the linked list of statements, from head to tail
for( my $stmt = $dump->{statement_head}; $stmt; $stmt = $stmt->{next} ) {
  print( $stmt->{type}, "\n" );
  if( $stmt->{type} eq 'ASSIGNMENT' ) {
    print $stmt->as_dump();
  }
}
____

Note that in the Perl program above, you have to replace `/opt/perl/bin/perl' with the path to your Perl executable, and the `use lib' path with the path to the perl/ directory of your own G95-XML tree. Save the program to a file named `inspect' in your `examples' directory and type:

  $ ./inspect program.o
  PROGRAM
  IMPLICIT_NONE
  TYPE_DECLARATION
  ASSIGNMENT
  $VAR1 = bless( {
      'expr2' => 'fortran::expr:0x818b868',
      'next' => 'fortran::statement:0x818be18',
      'prev' => 'fortran::statement:0x818b118',
      'namespace' => 'fortran::namespace:0x8193580',
      'dump' => 'fortran::dump:0x0',
      'ext_locs_index' => [ '2', '2' ],
      'loc' => [ '6', '6', '6', '16' ],
      'expr1' => 'fortran::expr:0x818b480',
      'type' => 'ASSIGNMENT',
      'id' => '0x818bae8',
      'f' => 'fortran::file:0x8193528'
  }, 'fortran::statement' );
  END_PROGRAM

Some comments about this program:

  - $o represents the object file.
  - $dump is an object of class 'fortran::dump'; it holds the whole Perl structure describing the Fortran source code. We will explore this structure in depth later.


6- Apache+mod_perl browser:

This is where G95-XML shines. Look at the demo on http://g95-xml.sourceforge.net.

To have the Fortran browser running, you need Apache v1.3 ( get the source from http://httpd.apache.org/download.cgi ) and mod_perl v1.0 ( get it at http://perl.apache.org/download/index.html ).

We briefly explain here how to set up Apache+mod_perl ( be careful not to be in the G95-XML environment, because the C compiler is overridden there ):

  $ tar zxvf mod_perl-1.0-current.tar.gz
  $ tar zxvf apache_1.3.41.tar.gz
  $ cd mod_perl-1.30/
  $ /opt/perl/bin/perl Makefile.PL USE_APACI=1 EVERYTHING=1 \
        APACI_ARGS=' --prefix=/opt/apache_1.3.41 '
  Will configure via APACI
  Configure mod_perl with ../apache_1.3.41/src ? [y] y
  Shall I build httpd in ../apache_1.3.41/src for you? y
  $ make
  $ make install

We need Apache::Request too:

  $ /opt/perl/bin/perl -MCPAN -eshell
  Terminal does not support AddHistory.

  cpan shell -- CPAN exploration and modules installation (v1.7601)
  ReadLine support available (try 'install Bundle::CPAN')

  cpan> install Apache::Request

Note that in the installation steps described above, I have used my own installation of Perl ( `/opt/perl/bin/perl' ). You may have to install a custom Perl so that you can add custom modules such as `Apache::Request'.

Assuming you have configured G95-XML with the right httpd location, you can now change to the `apache' directory of the G95-XML tree and issue the command:

  $ ./apachectl start

This will start an httpd+mod_perl on port 8282. Open a browser and type in the location entry: http://localhost:8282/fortran/ ( do not forget the trailing slash ). There you will see the contents of the `../examples' directory. You can click on any of the files you see, but the result looks pretty dull.

Load the G95-XML environment ( . ./f95.sh ), change to the `examples' directory and issue the command:

  $ f95 -c -html program.F
  Warnings: 0
  Errors: 0

The `-html' switch compiles the source code and embeds HTML in the object file. Then, in your browser, click on the `program.o' link. You should now see the HTML produced by G95-XML. All variables, labels, blocks and operators are clickable. If you hold Ctrl+Shift and click on a local variable, you will see its representation as a Perl object.

Now type:

  $ ld -html -o program.exe program.o

This creates a `program.exe' with built-in HTML. Go back to your browser and give `program.exe' a try.

I advise you to compile your own programs and experiment to get used to the browser.


7- Anatomy of a fortran::dump object:

For those who are used to Perl, I thought it would be good to introduce the fortran::dump object. A Fortran unit is represented by a set of fortran::* objects; for instance, the following program would get 3 fortran::statement objects, 2 fortran::symbol objects ( for MAIN and X ), etc.

  PROGRAM MAIN
  REAL :: X
  END PROGRAM

All those objects are embedded in an object whose class is fortran::dump; this object is unique in a pseudo-object file, and always has id `0x0'. It contains references to all other fortran::* objects present in the unit. You can explore it using the `objdump' command:

  $ objdump -silent -dump-id 0x0 program.o | less
  $VAR1 = bless( {
      'basic' => {
          '0x812100c' => 'fortran::type:0x812100c',
          '0x8120f84' => 'fortran::type:0x8120f84',
          '0x8117ff7' => 'fortran::operator:0x8117ff7',
          '0x8117fd2' => 'fortran::operator:0x8117fd2',
          '0x815c9c8' => 'fortran::type:0x815c9c8',
          '0x8118007' => 'fortran::operator:0x8118007',
          '0x8193af8' => 'fortran::symbol:0x8193af8',
          '0x8118019' => 'fortran::operator:0x8118019',
          '0x8120f85' => 'fortran::type:0x8120f85',
          '0x8117fdd' => 'fortran::operator:0x8117fdd',
          '0x8193528' => 'fortran::file:0x8193528',
          '0x8120a80' => 'fortran::type:0x8120a80',
          '0x818b480' => 'fortran::expr:0x818b480',
          '0x8117fe3' => 'fortran::operator:0x8117fe3',
          ...

You see that the `basic' slot contains all objects, whatever their class; the same objects are also sorted by class in other slots ( for instance, the `expr' slot contains all objects of class fortran::expr ).

The fortran::dump object has a few important attributes:

  - file_head: the top-level file of this source code.
  - statement_head: the first statement encountered in the source code.
  - statement_tail: the last statement.
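
The printed references in the `basic' slot follow a regular `fortran::CLASS:ID' pattern, so a dump saved as text can be post-processed outside Perl. The following is a minimal sketch of that idea, in Python to keep it self-contained; it is not part of G95-XML and does not use the Perl `fortran' module. The helper names `split_ref` and `group_by_class` are hypothetical, and the table holds only a few entries copied from the output above. It rebuilds a grouping by class, similar in spirit to the per-class slots.

```python
from collections import defaultdict

# A few entries copied from the `basic' slot printed above.
BASIC = {
    "0x812100c": "fortran::type:0x812100c",
    "0x8117ff7": "fortran::operator:0x8117ff7",
    "0x8193af8": "fortran::symbol:0x8193af8",
    "0x8193528": "fortran::file:0x8193528",
    "0x8120a80": "fortran::type:0x8120a80",
    "0x818b480": "fortran::expr:0x818b480",
}


def split_ref(ref):
    """Split 'fortran::expr:0x818b480' into ('expr', '0x818b480')."""
    # The id follows the last single colon; the class name is the
    # last component of the 'fortran::...' package name.
    package, _, obj_id = ref.rpartition(":")
    return package.split("::")[-1], obj_id


def group_by_class(basic):
    """Group the ids of a `basic'-like table by class name."""
    groups = defaultdict(list)
    for obj_id, ref in basic.items():
        cls, ref_id = split_ref(ref)
        if ref_id != obj_id:
            raise ValueError("key %s does not match reference %s" % (obj_id, ref))
        groups[cls].append(obj_id)
    return dict(groups)


for cls, ids in sorted(group_by_class(BASIC).items()):
    print(cls, ids)
```

Within Perl itself none of this is needed, since the objects hold real references to each other; the string form only exists to make the dumps printable.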

When opening a pseudo-object file, the first thing to do is to retrieve the fortran::dump object. After doing so, you can start inspecting the Fortran code. This is what we did in the little `inspect' program.


8- Locations of Fortran entities:

In the big XML file, all Fortran items appearing in the code have a full and explicit location. Statements, for instance, have `f' and `loc' attributes.

The `f' attribute is a reference to the file object which contains the statement.

The `loc' attribute of a statement is the exact position of the statement within the file referenced by `f'; hence, the ASSIGNMENT statement dumped in section 5, whose `loc' is [ 6, 6, 6, 16 ], starts at line 6, character 6 and ends at line 6, character 16, all indices starting at 0.

The statement also holds an attribute named `ext_locs_index'; these two numbers are an index and a length in the `ext_locs' attribute of the file object; hence for this statement, the ext_locs are:

  [6,6,6,7,'0x818b078'],
  [6,11,6,12,'0x8117d67']

These are the positions and ids of the symbols, operators and labels appearing in the statement ( the listing below numbers lines from 1, so the zero-based line 6 is listed as line 7 ):

  1        PROGRAM MAIN
  2
  3        IMPLICIT NONE
  4
  5        REAL, DIMENSION( 10 ) :: X
  6
  7        X(1) = 12.
           ^    ^
  8
  9        END PROGRAM

`0x818b078' is a reference to the `x' symbol, and `0x8117d67' is a reference to the assignment operator.
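
The relation between `ext_locs_index' and `ext_locs' can be sketched as follows. This is a minimal illustration in Python, not part of G95-XML: the helper name `statement_ext_locs` is hypothetical, the pair is read as ( zero-based index, length ) as described above, the statement line is assumed to have six leading blanks ( fixed form ), and entries 0 and 1 of `ext_locs' are placeholders because this HOWTO does not show them.

```python
# Zero-based line 6 of program.F, assuming six leading blanks (fixed form).
LINE6 = "      X(1) = 12."

# `ext_locs' of the file object. Only entries 2 and 3 are given in this
# HOWTO; entries 0 and 1 are placeholders, not real data.
EXT_LOCS = [
    None,
    None,
    [6, 6, 6, 7, "0x818b078"],
    [6, 11, 6, 12, "0x8117d67"],
]


def statement_ext_locs(ext_locs, ext_locs_index):
    """Return the ext_locs entries belonging to one statement.

    ext_locs_index is an (index, length) pair, printed as strings
    by Data::Dumper, hence the conversion to int.
    """
    index, length = (int(v) for v in ext_locs_index)
    return ext_locs[index:index + length]


# `ext_locs_index' of the ASSIGNMENT statement dumped in section 5.
for line0, col0, line1, col1, obj_id in statement_ext_locs(EXT_LOCS, ["2", "2"]):
    # Both entries sit on a single line, so a plain slice is enough;
    # the end column is treated as exclusive.
    print(obj_id, repr(LINE6[col0:col1]))
```

Under these assumptions, the first entry covers the `X' and the second the `=' sign, which matches the two positions marked in the listing.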