ABL2DB - Databasing ABL information for Analysis

The ABL compiler provides some valuable tools for analyzing individual compile units, but this information can become far more valuable if systematically gathered into a database and supplemented with other information and analysis tools. Presentations on this project can be found at http://cintegrity.com/content/Databasing-ABL-Code-and-Data-Relationships and http://cintegrity.com/content/How-Can-I-Fix-Applied-Use-ABL2DB%0B-Real-W... .

I am in the process of filling out the documentation, which should appear gradually, so check back for additions and corrections. Check the Installation page for copies of the code. The initial release was 0.50. See the change log for ongoing changes. I am in the process of integrating Proparse to perform some new functions which could not be achieved with just the listing and xref files. The Proparse versions will begin numbering with 1.01 to signify the substantial leap in capability. As this is new and has complexity not present in prior versions, proceed with caution at first, but I am anxious to get some real testing done on other code bases.

This is very much a work in progress, so as people try it out and find bugs or opportunities for improvement, please let me know at thomas at cintegrity dot com. I have a number of improvements and extensions in the pipeline and will happily consider others.

NOTE: The ABL code in this project will probably run on any reasonably recent version of Progress on any platform, but recent releases of ABL2DB use Proparse in multiple steps, and those steps will only run on Windows. ABL2DB has no UI other than writing to logs, so it can be used as a batch process without any particular development environment. Thus, a minimal installation should be possible with a client networking installation on Windows. The Analysis database could be on another platform, and the reports should run on that platform.

Report bugs and feature requests on the ABL2DB & ABL2UML Forum


Installation

This tool has been designed for easy customization to site-specific requirements. The installation instructions below assume installation in the context of PDSOE, but they are easily adapted to batch operation, which is likely to be the norm for day-to-day operation.

All code, except the launcher, is packaged under com.cintegrity. One should define a project for work on ABL2DB which has separate source and run directories. I use src and run, but you can use whatever you like as long as you make the appropriate modifications as indicated on the configuration page. The current tool creates XREF, LIST, and DEBUG output from its empirical compile. The DEBUG output is not currently used, but it is expected to be used as the tool evolves. I use xrf, lst, and debug for these destinations respectively. You should create directories for your preferred names under the same project. I also create a rpt directory as the destination for the logs which are produced when running ABL2DB. The source code in the zip below should be installed in the src (or whatever you called it) directory.

My inclination is to put the database in a directory *not* under the project containing ABL2DB because it facilitates multiple projects accessing this same database as well as easier version transition. Again, put it where you like and edit accordingly.

The structure file below is just a sample to get you started.

The instructions for installing Proparse are in the PDF below. The expectation is that you will be using it in PDSOE, but there is no requirement that you do so.


Configuration for Site Specific Setup and Characteristics

The primary tailoring occurs in the launch program, abl2db_launcher.p which is found at the top of the src directory. At the top of this program you will find a section of variable definitions labelled Application Specific Definitions. You need to edit the initial values for these definitions to match your application.

The values in the code provided apply to sample code from an ERP package called Integrity/Solutions which was based on Varnet and which has not been maintained for 10 years. In some ways, this makes it a good "bad" example, including source files with no extensions and a lot of trash in the source tree which would have been cleaned up if maintained by modern standards. The code for this application is located on my system in \work\IS with the database under the DB directory and the code under ISRel. It uses two databases.

Variables in Launcher
The tailorings should be fairly self-explanatory, but here is an explanation of the variables:

Variables for selecting files from source directories

Database and Schema Variables:

Propath for compile

Source Variables

Other variables

Note that for the directory lists, there is a positional correspondence: the first source directory will be compiled into the first run directory, with the xref in the first xref directory, and so on.
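
For illustration, a hypothetical excerpt from such a definition block might look like the following. The variable names, paths, and number of entries are placeholders only; edit the definitions actually shipped in abl2db_launcher.p to match your site.

    /* Hypothetical directory list variables - names and values are
       illustrative only, not the shipped definitions. */
    define variable chSourceDirs as character no-undo
       initial "C:\work\IS\ISRel\ap,C:\work\IS\ISRel\ar".
    define variable chRunDirs    as character no-undo
       initial "C:\work\ABL2DB\run\ap,C:\work\ABL2DB\run\ar".
    define variable chXrefDirs   as character no-undo
       initial "C:\work\ABL2DB\xrf\ap,C:\work\ABL2DB\xrf\ar".
    /* Entry 1 of chSourceDirs is compiled into entry 1 of chRunDirs with its
       xref written to entry 1 of chXrefDirs, and so on down the lists. */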

Menu, Menu Item, and Functional Unit Files
The mechanism for menus is inherently specific to the code base, but it provides an important clue to the functionality of the application because it is the mechanism by which a user accesses that functionality (or most of it). To make the menu structure easy to load, ABL2DB uses three text files in export format.

The first file is a list of all menus, not the items on the menu. This consists of:

For batch programs and other functions not accessed by a user menu per se, pseudomenu entries should be created.

The second is a list of menu items, i.e., things that appear on a menu (or, in the case of pseudomenus, the items in that category). This consists of:

The link name is either the name of a menu (submenu), as in the prior file, or of a functional unit (see below).

The third file is a list of functional units. In our UML Profile for ABL, a Functional Unit is all code which can be reached from a particular menu selection. That is empirically constructed here by associating a name with the compile unit which is run by the menu item. We can identify the full Functional Unit by following the links from the initial compile unit to all include files and compile units run from or by all those compile units. This is obviously a natural unit of study when making changes to any menu item functionality. The file consists of:

This compile unit path is then used to find the previously loaded compile unit and link to it.

For any application with a data-driven menu system, it should be reasonably straightforward to export files in the desired format. It may be necessary to supplement this automatic list with some manual entries to cover batch functions and the like which do not appear on a menu. For an application which has hard-coded menus in the code (shame, shame), manual or other automated procedures will be required to create these input files. While one may start using ABL2DB with empty files, users are encouraged to create valid input by whatever means, since a functional unit is the basic unit of analysis in many cases and, if no functional units are defined, one will have to identify the appropriate compile unit manually instead.
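
As an illustration of the export step for a data-driven menu system, a dump of the first (menu) file might look something like the sketch below. The table name, field names, and output file name are placeholders; the exported fields must, of course, match what ABL2DB expects for each file.

    /* Hypothetical export of the menu list in export format. */
    output to value("menus.d").
    for each app-menu no-lock:          /* placeholder for your menu master table */
       export app-menu.menu-name app-menu.menu-label.
    end.
    output close.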

Program Descriptions
If they are available, it can be very handy to have even a short description of source code units. With some frequency (though varying quality), there is some kind of description at the top of some or all source code disk files. The exact position and the label, if any, identifying this description varies. GetDescription.cls has been provided to extract these descriptions when they are available. The customizable portion is in the method ExtractDescription, which takes a source file as input. It copies the source file into a longchar variable where its lines can be accessed with entry() using a newline delimiter. The supplied version looks for a comment on the first line and extracts everything following an initial argument and a space, which works for Varnet-style code. This is easily adapted to look for a particular line or to search through the initial lines for a keyword such as "Purpose:" or "Description:".
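
As a sketch of the kind of adaptation described above, the following fragment scans the first twenty lines of the source for a "Purpose:" label. The parameter and variable names are illustrative and do not reflect the actual signature in GetDescription.cls.

    /* Illustrative method-body fragment: scan the first lines for "Purpose:". */
    define variable lcSource as longchar  no-undo.
    define variable chLine   as character no-undo.
    define variable iLine    as integer   no-undo.

    copy-lob from file pchSourceFile to lcSource.   /* pchSourceFile: assumed input */
    do iLine = 1 to minimum(20, num-entries(lcSource, "~n")):
       chLine = entry(iLine, lcSource, "~n").
       if chLine matches "*Purpose:*" then
          return trim(substring(chLine, index(chLine, "Purpose:") + 8)).
    end.
    return "".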

SchemaUtilities.cls
This class has two methods which are appropriate for site customization. MassageMetaDBName is called when there is a table reference in which the table name begins with an underscore, i.e., a metaschema table. When these occur in code unqualified, the cross reference will indicate Analysis as the database since that is the first connected database. The standard version converts this to a reference to DICTDB to indicate that it might be any database. Qualified references to other databases are left alone. One might wish to change this default.
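
As an illustration only, the behavior described above might be expressed roughly as follows; the method signature is an assumption and not the actual code in SchemaUtilities.cls.

    /* Illustrative sketch: remap unqualified metaschema references that the
       xref reports against the Analysis database. */
    method public character MassageMetaDBName (input pchDBName    as character,
                                                input pchTableName as character):
       if pchTableName begins "_" and pchDBName = "Analysis" then
          return "DICTDB".       /* could be any connected database     */
       return pchDBName.         /* qualified references are left alone */
    end method.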

The other method is FindTableForBuffer which is intended to use local naming conventions to convert a buffer name to a table name. How easy this is to do depends on the adherence to standards in the code and the nature of the standards. The sample code illustrates a number of prefix and suffix standards. In the worst case, one might have to create a lookup table.
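
A sketch of the kind of convention-based mapping this method might perform is shown below; the prefixes, suffix, and signature are illustrative rather than the shipped code.

    /* Illustrative buffer-name to table-name mapping based on local conventions. */
    method public character FindTableForBuffer (input pchBufferName as character):
       if pchBufferName begins "b-" then
          return substring(pchBufferName, 3).
       if pchBufferName begins "buf" then
          return substring(pchBufferName, 4).
       if pchBufferName matches "*-2" then
          return substring(pchBufferName, 1, length(pchBufferName) - 2).
       return pchBufferName.     /* default: assume the buffer is named for the table */
    end method.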

Not Yet Implemented Features
Note that there are a number of features, particularly related to schema, which are not yet implemented because I have no sample .dfs incorporating them to work from. Generally, these are things that are trivial to implement if I just have a sample, so let me know if you run into something in your own code base. In particular, on the schema side, support for non-OE databases and LOB fields is still missing.


Questions and Answers

This section is intended to provide information on various questions which have come up in the use of this tool.


Proparse Utilities

Having started to use Proparse as a part of ABL2DB, I have created several utilities for Proparse, either to call from ABL2DB or to provide information to use in development of ABL2DB. Since these utilities might be useful to someone doing other development with Proparse, I am documenting them here. For now, they are available as a part of the ABL2DB distribution. If there was demand, I could package them separately. All are published in the com.cintegrity.Proparse package.

DocSchemaFromDB.cls - Creates proparse.schema from attached DBs in the fashion of the old schemadump1.p and schemadump2.p. Uses DumpOneSchema.cls.

DocSchemaFromDF.cls - Work in Progress - Creates proparse.schema from .df files. This function is currently performed in the context of ABL2DB by BuildSchemaDF which processes .df files and simultaneously builds both the ABL2DB schema tables and the proparse.schema.

SetEnvironment.cls - Creates the Environment for Proparse.

SetSchemaFromFile.cls - Loads the proparse.schema into the Proparse Schema object at runtime.

SharedInfoLister.cls - Creates a report on a single compile unit showing all of the shared objects. sharedinfo_launch.p is the launch program.

TestScanner.cls - Walks through all specified source directories looking for files that appear to be compile units and attempts to compile them. If a file compiles, it does a minimal Proparse run on it. The purpose is to identify any syntax in the source files which Proparse does not yet support so that one can work on getting it supported. TestScanner.p is the launch program. This uses BuildDirectoryTree.cls from ABL2DB and ScanOneCU.

TokenLister.cls - Parses a single compile unit and produces a report showing the parse tree. Handy for developing ABL code to walk new syntax. tokenlist_launch.p is the launch program.


Sample Reports

While it is possible that fancier and more sophisticated reports and inquiries will arise on top of ABL2DB over time, not the least of which is its possible role as the front end for ABL2UML, it is likely that the bulk of analytical reports for ABL2DB will be specific, project-oriented programs. Fortunately, they are also likely to be quite simple. Therefore, the samples presented here are not fancy works of art, but very simple little programs designed to get the desired information with as little development effort as possible. In particular, they are all simple .p programs, so they can just be run, modified quickly, and run again. In all cases the relevant parameters are at the top of the program, so one can easily pick different functions or schema elements or whatever to report on without altering the program. And if different information is desired for a particular report, adding that information should be simple as well.
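
The general shape of such a report is a handful of "parameter" variables at the top, one or more queries against the Analysis database, and plain text output, roughly as sketched here; the names and the query are placeholders only.

    /* Skeleton of a simple report .p - placeholders only; see the sample
       reports for real queries against the ABL2DB tables. */
    define variable chFunctionalUnit as character no-undo initial "order-entry".
    define variable chReportFile     as character no-undo initial "rpt/report.txt".

    output to value(chReportFile).
    /* for each <relevant ABL2DB table> ... : display the columns of interest. end. */
    output close.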

I hope that users will contribute their own reports so that we will build up a library to make it easier for people to get started with ABL2DB.


Blocks in Compile Unit

We all know how important buffer and transaction scope can be to the proper operation of an ABL program. We have COMPILE LISTING to show us these scopes, but many programmers are either unaware of the tool or seem to forget to use it to check their work. Yet this information is critical to understanding many kinds of problems. Here is a program which starts with a particular FunctionalUnit, i.e., one menu selection or batch program which performs some function in the application, links to the main compile unit which is run to perform that function, and then follows the links to all of the compile units reached by those compile units. For each compile unit, it shows the blocks which have transaction or buffer scope and what that scope is.

The ABL report and some sample output are provided.


Call Tree

Builds a call tree starting with a DiskFile record. Only procedure calls are logged; currently (OE 11.5.1) XML-XREF does not provide nodes for function calls the way it provides RUN nodes for procedures. The report output has a .p extension; the purpose is to provide clickable links to files. Of course, the PROPATH has to be set up to make the clickable links possible.

Contributed by Stefan Houtzager.


References to an Integer DB column in programs

With the advent of the INT64 datatype, one may wish to change the datatype of an existing database column from INTEGER to INT64. When this is done, it is desirable to examine the code where the column is referenced to see if there are local variables which also need to be changed.

This report starts with a designated database column and finds all of the references in code where that column is used.

The ABL code and a sample report are provided. The IntRefs_James sample is from James Palmer examining the proposed change of their customer number to INT64.


Shared Variable Usage in a Functional Unit

One of the common tasks in modernization is replacing the use of shared variables with explicitly passed parameters or references to persistent procedures or objects.

This report starts with a particular FunctionalUnit, i.e., one menu selection or batch program which performs some function in the application, links to the main compile unit which is run to perform that function, and then follows the links to all of the compile units reached by those compile units. In each of the compile units it shows all the shared variables and whether they are declared new. The report shows when the variable is assigned to or assigned from and flags any that are neither as unused.

The ABL report and a sample output are provided.


Unused Database Column

In the course of evolving an application, it is not uncommon for database columns to be added in anticipation of some change that never happens, for a once-used column to be replaced by other schema, or even for whole tables to be abandoned because the functionality they supported is no longer needed or relevant. These columns present no actual harm, but they may be confusing since one assumes they are used, especially as they may contain legacy data from when they were used.

This report starts with a particular database, examines all of the columns of all of the tables, and looks for references in compile units. If there are more than a specified number of references, the column is skipped. If there are no references, or fewer than the specified number, the column is reported. In the sample report, a limit of 1 is used on the theory that a column which is referenced only once may not be useful.

The ABL report and sample output are provided. The UnusedDB_James sample is from James Palmer.
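
For anyone adapting the idea, the core loop is roughly as follows. The buffer and field names here are placeholders and do not correspond to the actual ABL2DB schema; the provided report is the working version.

    /* Rough sketch with placeholder schema names (DbTable, DbColumn, ColumnRef). */
    define variable iLimit as integer no-undo initial 1.
    define variable iRefs  as integer no-undo.

    for each DbTable no-lock where DbTable.DbName = "mydb",
        each DbColumn no-lock where DbColumn.TableGuid = DbTable.TableGuid:
       iRefs = 0.
       for each ColumnRef no-lock where ColumnRef.ColumnGuid = DbColumn.ColumnGuid:
          iRefs = iRefs + 1.
       end.
       if iRefs <= iLimit then
          display DbTable.TableName DbColumn.ColumnName iRefs.
    end.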


The Driver Class

To run ABL2DB, one runs abl2db_launcher.p, whose configuration is discussed in the Configuration section. It sets up the overall log (one can add additional tracking if desired for debugging), creates the Driver class, initializes it with the configuration values, and then executes the Initialize() method, which tests some preconditions, and the Process() method, which does the actual work.

The Driver class Process() method simply calls each of the classes which make up the ABL2DB processing in a logical order, one at a time. This process is easily customized to either skip some steps or to alter what happens in a particular step.
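
Stripped of the configuration assignments, the launcher flow described above amounts to roughly the following sketch. The package qualifier and the log handling shown are assumptions; see the shipped launcher for the real details.

    /* Sketch of the launcher flow - configuration assignments elided. */
    define variable oDriver as com.cintegrity.ABL2DB.Driver no-undo.

    output to value("rpt/abl2db.log").       /* overall log                  */
    oDriver = new com.cintegrity.ABL2DB.Driver().
    /* ... assign the site-specific configuration values to the Driver ... */
    oDriver:Initialize().                    /* test preconditions           */
    oDriver:Process().                       /* run the build steps in order */
    delete object oDriver.
    output close.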

In particular, ABL2DB has been designed to start with an "empirical" compile of the code under the source directory that matches the specified criteria for a compilable unit of source code. This was done to ensure:

In ABL2DB, the decision was made to determine whether a file is or is not a compile unit based on whether an r-code file exists for it. Thus, compile-on-the-fly source code units are not considered compile units because, not being compilable except in a run-time context, one cannot generate the xref and list files for them.

If a particular application context can ensure that all files will be present and current, one could easily skip this step, but it was included to be safer than assuming that this processing occurred previously.

Similarly, if a particular usage of ABL2DB did not require some component of the information which it collects, one could easily skip that step as long as it was logically possible.

At present, it is assumed that a full build from scratch is done every time this process is executed. Providing a suitable empty database is external to this code.


Analysis & Reporting

In addition to targeted queries related to specific enhancement or maintenance projects, we will be exploring a number of possible analysis functions including:

No doubt, other ideas will arise and will be reported on here as they are created. Ideas are always welcome.


Roadmap & Wish List

Some of the items on the roadmap include:

Following are some items which are not specifically on the roadmap yet, but which have been requested and are being considered:


Change Log

See below for pre-1.00 log
1.01 - Initial Proparse release. Includes tracking of shared variables
1.02 - Convert to case insensitive GUID keys
1.03 - Add all shared object types - buffer, browse, dataset, frame, menu, query,
          stream, temp-table, variable, and work-table.
1.04 - Change compilable test to look for XREF
1.05 - Add Proparse buffer tracking to BuildBlocks to resolve buffer names to
          tables
1.06 - Bug fixes in BuildBlocks
1.07 - Additional Proparse utilities and revised handling
1.08 - Revisions for logical & physical DB names and aliases
1.09 - Bug fixes for shared objects
1.10 - Bug fixes for blocks with multiple buffers
1.11 - Shift in strategy for collecting buffers in BuildBlocks to get buffer
          parameters plus new Proparse utility SharedInfoList.
1.12 - Add Assigned From/To to Shared Variable tracking
1.13 - Fix bug and add log option in TestScanner
1.14 - Scope qualifier for shared objects, support for class scope qualifiers, and
          variable vs property.
1.15 - Support for multiple tables in a shared query.
1.16 - Restructuring BuildBlocks and BuildShared to create a fresh Proparse
          environment for each compile unit.
1.17 - Collecting field lists and additional info on shared queries.
1.18 - Revise Driver.cls to eliminate Proparse
1.19 - Modify BuildSchemaDF to ignore Alternate Buffer definitions.
1.20 - Improve detection of inappropriate incremental .df
1.21 - Add tree walk option to TestScanner
1.22 - Add tracking of include file containing RUN (Stefan Houtzager)
1.23 - Fix to TreeScanner


Change Log prior to version 1.00

0.50 - Initial Release
0.51 - Parameterize overall log location in variable
0.52 - Clean up GetDescription logic
0.53 - Fix separator bug in BuildRunLinks.cls
0.59 - Add support for LOB-SIZE and LOB-BYTES
       - Support TABLE-TRIGGER
       - Support N separate DBs with their own schema
       - Support DB in table links
       - Support DB Connect params (including live DBs)
       - Fix potential bug in destructor
0.60 - Support sequences in .df
0.70 - Modify path utilities to better handle Windows drives.
       - Shift to support multiple sort and run directories
       - Change DiskFile.chSourceDirectory to DiskFile.chBaseDirectory.
       - Add fields to CompileUnit for RCode, Xref, List, and Debug base directory.
       - Add SchemaUtilities class to centralize MetaDB Name logic.
       - Add FindTableForBuffer method in SchemaUtilities.cls to match buffer
          names to tables with potential site customization.
0.71 - Add handling for encrypted source
0.72 - Add handling for reverse slash in .df triggers
0.73 - Add support for SA fields in table and column schema
0.74 - Add support for CLOB-* properties in .df
0.75 - Corrections for \, <> "", and "*.cls"
0.76 - Speed enhancements to BuildBlocks.cls
0.77 - Accommodate large line numbers in BuildBlocks.cls
0.78 - Add FOREIGN-NAME and LABEL-SA in BuildSchemaDF
0.79 - Page size and width options in CompileDirectoryTree
0.80 - Index instead of entry in BuildBlocks.cls
0.81 - Shift BuildTableLinks structure to handle class references that look like
          database references
0.82 - Add optional DB alias handling


Thank you

This page is intended to provide recognition for the people who have contributed to the development of this utility. I will probably forget someone, for which apologies are offered in advance, but I will try not to.

And, of course, one can't forget the communities at PEG, Progresstalk, and PSDN for important information as questions have come up.