new rule help

Tue, 2007-05-29 18:23 — gwest

Hi, all.

We just recently acquired proparse and prolint. I am looking into writing some new rules. I have not finished reading all the doc available at joanju.com, or fully absorbed the syntax tree. But I was hoping you folks could help me short-circuit the process a little. I am wondering if any of you
a) have written these rules
b) can suggest starting points to help me find them in the tree
c) can tell me "No can do"

on several rules I am looking at implementing.
1) No nested includes
2) use of positional arguments (instead of named arguments) in an include
3) use of a preprocessor inside an include that is not defined in the include (nor a named argument)
4) detecting a transaction that spans UI blocking
5) 'check existence before content' - before 'entry(x,string)' make sure there is an 'if num-entries(string)'
6) unnecessary break-by - break-by on for each with no 'first-of' / 'last-of'... inside the block
7) source lines longer than 80 characters

I have figured out how to do most everything else I would like to implement in this round, or what to look for (or at a minimum made sure it CAN be found). I just have a little difficulty determining if the 7 items above can be 'noticed'. Any help would be appreciated. We intend to publish these rules back to the Hive when complete, along with another dozen or so we are working on.

Wed, 2007-05-30 00:40 — john

Re: new rule help

Hi Glen and all,

#1 - #3: Please see the "Preprocessor Listing File" section of the Proparse User's Guide. I think you can get what you need from it. That Proparse preprocessor listing file is something I've used a fair amount within ProRefactor (in Java), but nobody has done anything with it from the 4gl yet, that I'm aware of. If you play with some output from small example compile units (with preprocessing and include files), the listing file should become fairly obvious. It's one of those things that makes most sense with examples.

#4 This would probably have to be done using ProRefactor and Callgraph. It would be challenging to write, but I don't think that it is beyond reasonable.

#5 could be done by making sure that ENTRY(n,s) is within an IF NUM-ENTRIES(s) block.

#6 shouldn't be too tough. The first trick would be to have a good grip on the semantics: exactly which syntaxes using field f depend on break-by field f.

#7 would be easiest with INPUT FROM as Jurjen suggested. It could be done by looking at the column position of the tokens in the syntax tree, but that rule would be more difficult to write, and it would be inefficient.


Tue, 2007-05-29 19:26 — jurjen

re: new rule help

(for the record: answered at proparse@peg.com)

Hi Glen,

Yeah I have some experience with writing Prolint rules. John has a lot more knowledge of Proparse, but he is currently on vacation for two weeks so I will try to answer some of your questions.

The good thing about Proparse is that it expands preprocessors and includefiles. The downside of this genius is that this makes it hard to find preprocessors and includefiles in the parsed result, just like it is hard to find them in compiler r-code. To find includefiles I think it may be easier to look in XREF output instead, and you can easily do that in a Prolint rule.

1) no nested includes:

I think I would make a rule that is based on XREF instead of Proparse, and search the XREF file for entries where sourcefile is not equal to compilationunit and operation is "include". Try rule "wholeindex.p" as an example.
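
A rough sketch of that XREF filter, written in Python for brevity rather than as a real Prolint rule. It assumes each XREF line is laid out as "compilation-unit source-file line-number reference-type details", with fields separated by whitespace and optionally double-quoted; check that against your own XREF output before relying on it. The function name nested_includes is made up for this example.

```python
import re

# Assumption: fields are whitespace-separated; a field may be double-quoted.
TOKEN = re.compile(r'"[^"]*"|\S+')


def nested_includes(xref_lines):
    """Return (source_file, line_no, include_reference) for every INCLUDE
    entry whose source file is not the compilation unit itself."""
    hits = []
    for raw in xref_lines:
        fields = [token.strip('"') for token in TOKEN.findall(raw)]
        if len(fields) < 5 or not fields[2].isdigit():
            continue  # not a line in the assumed layout
        comp_unit, source_file, line_no, ref_type = fields[:4]
        if ref_type.upper() != "INCLUDE":
            continue
        # An include referenced from a file other than the compilation
        # unit means an include is pulling in another include.
        if source_file.lower() != comp_unit.lower():
            hits.append((source_file, int(line_no), " ".join(fields[4:])))
    return hits
```

Feeding it the lines of the XREF file produced for one compilation unit gives one tuple per nested include, which a rule could then report as warnings.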

2) use of positional arguments:

Good idea! But sorry, I don't know how. John?

3) use of a preprocessor that is not defined in the same file

Hmm, interesting: it looks like you found a rare example of something that's easier to find with grep than with the advanced parser. Or instead of grep: use 4GL to read the sourcefile line by line, make a list of preprocessor definitions as you go, and if you see a reference then check if it is in the list.
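
The line-by-line scan could look roughly like this (Python here only to keep the sketch short; the same loop works with INPUT FROM in 4GL). It assumes definitions are written as &SCOPED-DEFINE or &GLOBAL-DEFINE and references as {&name}. It ignores comments, &UNDEFINE and abbreviated directive names, and the "known" parameter is a made-up way to pass in the named arguments and built-in names that should not be reported.

```python
import re

# Assumption: only the full directive names are used, not abbreviations.
DEFINE = re.compile(r'&(?:SCOPED-DEFINE|GLOBAL-DEFINE)\s+([\w-]+)', re.IGNORECASE)
REFERENCE = re.compile(r'\{&([\w-]+)\}')


def undefined_references(lines, known=()):
    """Return (line_no, name) for each {&name} that is used before any
    definition of that name has been seen in this file."""
    defined = {name.lower() for name in known}
    problems = []
    for line_no, line in enumerate(lines, start=1):
        # Check references first: a definition cannot satisfy a
        # reference that appears on the same line.
        for name in REFERENCE.findall(line):
            if name.lower() not in defined:
                problems.append((line_no, name))
        for name in DEFINE.findall(line):
            defined.add(name.lower())
    return problems
```

Called on the lines of a single include file, with the include's named arguments in "known", it returns the references that would need a definition from somewhere outside the file.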

But first you need a list of includefiles that are referenced by the current compilation unit. The current version of Prolint does not have such a list, but I think it can be pulled from the footer of the parse tree.

Another interesting fact is that this rule only applies to includefiles (while other rules apply to compilation units) and that the includefile needs to be unparsed. This means that it is actually inefficient to make it as a regular rule: when you test 100 smartobjects, you will be checking the same "smart.i" 100 times. That's 99 times too many. It would be more efficient if Prolint would find includefiles AND compilationunits and apply different rulesets to both types. That's a framework change, and not an easy one.

4) detect a transaction that spans ui blocking

Difficult, I have looked into this a couple of times. One of the difficulties is determining the transaction scope: the compiler listing shows the transaction scope, but the line numbers in the listing file are not equal to the line numbers in the original sourcefile, and I have never succeeded in finding a function like real_lineno=function(listing_lineno).

Another difficulty is that when the transaction runs from line 200 to 300, you can call a function or procedure in line 250 that is implemented in line 400 (or even out of sight, in a persistent procedure) where the actual UI blocking appears. This may not be impossible to trace, but it is sure a lot of work.

Actually I think UI and transactions don't both belong in one procedure at all.

5) test num-entries(c) before entry(i,c)

Possible. But beware that the rule will probably raise a bunch of false positives (if too strict) and a bunch of false negatives (if not strict enough), because the acceptable distance between the num-entries function and the entry function can vary.

I think I would start by scanning for ENTRY. On each result, find the parent node repeatedly until you are on the level of a procedure/function/method/trigger. From that point, start a query for "NUMENTRIES" and see if the line number of the result is less than the line number of the ENTRY node. If it is, check if both involve the same character variable. If they do, check if the variable assigned by the num-entries function is the same variable that is used as the index in the entry function.
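
To make the steps concrete, here is a toy version in Python. The Node class is a stand-in invented for this sketch, not the Proparse API, and the block type names other than ENTRY and NUMENTRIES are guesses. It covers the first checks only (a NUMENTRIES in the same block, not after the ENTRY, on the same list variable) and skips the final check on the index variable. It also accepts a guard on the same line, which is slightly looser than "less than".

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical node types that mark a procedure/function/method/trigger level.
BLOCK_TYPES = {"PROCEDURE", "FUNCTION", "METHOD", "ON", "PROGRAM_ROOT"}


@dataclass
class Node:
    type: str
    line: int
    text: str = ""
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def walk(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def enclosing_block(node):
    """Climb parents until a block-level node (or the root) is reached."""
    while node.parent is not None and node.type not in BLOCK_TYPES:
        node = node.parent
    return node


def list_argument(node):
    """Assumed child layout: ENTRY(index, list) and NUM-ENTRIES(list)."""
    position = 1 if node.type == "ENTRY" else 0
    if len(node.children) <= position:
        return None
    return node.children[position].text.lower()


def unguarded_entries(root):
    """Return line numbers of ENTRY calls with no earlier NUMENTRIES on
    the same list variable inside the same enclosing block."""
    hits = []
    for node in walk(root):
        if node.type != "ENTRY":
            continue
        target = list_argument(node)
        guarded = target is not None and any(
            other.type == "NUMENTRIES"
            and other.line <= node.line  # same-line guards are accepted
            and list_argument(other) == target
            for other in walk(enclosing_block(node))
        )
        if not guarded:
            hits.append(node.line)
    return hits
```

How strict the distance test should be is exactly the false positive versus false negative trade-off mentioned above, so the comparison on line numbers is the part most worth tuning.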

6) unnecessary break-by

I suppose it is possible, but I don't know what defines "unnecessary" ....
In my 15 years of Progress coding I have never used break-by, so it seems to me like it is always unnecessary :-) Or I have always overlooked a great feature :-(

7) lines longer than 80 characters

Like answer 3: probably easier to check with a simple "INPUT FROM" than with the parser, but I am curious if John agrees.
The tricky part is that lines may contain TAB characters that you need to count as an uncertain number of spaces.
(By the way, I am tempted to ask what's wrong with lines longer than 80, but that's none of my business. Still curious though).
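
For what it's worth, the check itself is tiny once you pick a TAB width. A sketch in Python (the 4GL version would read the file with INPUT FROM and IMPORT UNFORMATTED); the function name and the default tab width of 8 are assumptions to adjust to your editor settings.

```python
def long_lines(lines, limit=80, tab_width=8):
    """Return (line_no, width) for each line wider than `limit` columns,
    after expanding TABs to the next multiple of `tab_width`."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        width = len(line.rstrip("\r\n").expandtabs(tab_width))
        if width > limit:
            hits.append((line_no, width))
    return hits
```

Running it with two different tab_width values shows quickly how much of the result depends on the TAB assumption.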

Not sure how much detail you want in each question, but I hope this is enough to get you going. If not, just keep asking.

Regards,
Jurjen.
