Proparse API : Problem with arrays

Wed, 2008-08-13 15:35 — jmls

Stealing a comment from the performance thread:

[snip] For Prolint, I know that we can come up with an API that is a layer of easy-to-use methods which save the developer from working directly with the memptr, but at the same time, is very fast because the API works directly with the memptr rather than create intermediate objects or records.
[snip]
For Prolint, I think Jurjen and I should consider a blob API that is a drop-in replacement for the old DLL API (using integer node handles). We talked about this before.

This way, we'll have an API used by Prolint that is compatible with existing Prolint code, and also very fast.
[snip]
The map from handleNum to nodeOffset would not be a temp or work table, since those have some overhead. It would just be an array:

offset = nodeHandles[handleNum].
[snip]

The last comment (nodeHandles[handleNum]) will give you a problem. If you declare a variable with an extent greater than 28000, you get message 3350:

Array extent may not be greater than  (3350)

Program variables may not have an extent of more than approximately 28000, which is 32000 minus some control information.

So you won't be able to store more than 28000 nodes. I think the API will just have to work off the blob directly:

GET-LONG(DataBlob, offsetOfIndex + (GET-LONG(DataBlob, FirstOffset + p_Node * 4)) * 4) + 1
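
As a rough illustration of why the blob sidesteps the extent limit, here is a minimal sketch. The layout (one 4-byte entry per node, starting at the first byte) is made up for the example and is not the real proparse blob format; GetNodeValue, NumNodes and NodeNum are hypothetical names.

```abl
/* Sketch only: hypothetical layout, NOT the real proparse blob format. */
DEFINE VARIABLE DataBlob AS MEMPTR  NO-UNDO.
DEFINE VARIABLE NumNodes AS INTEGER NO-UNDO INITIAL 50000.
DEFINE VARIABLE NodeNum  AS INTEGER NO-UNDO.

/* Read the 4-byte entry for a zero-based node number. */
FUNCTION GetNodeValue RETURNS INTEGER (INPUT p_Node AS INTEGER):
    /* MEMPTR byte positions are 1-based, so node 0 lives at position 1. */
    RETURN GET-LONG(DataBlob, 1 + p_Node * 4).
END FUNCTION.

/* 50000 entries: more than an array extent would allow. */
SET-SIZE(DataBlob) = NumNodes * 4.

/* Fill the table with dummy values so there is something to read back. */
DO NodeNum = 0 TO NumNodes - 1:
    PUT-LONG(DataBlob, 1 + NodeNum * 4) = NodeNum * 10.
END.

MESSAGE GetNodeValue(40000) VIEW-AS ALERT-BOX.

/* MEMPTR memory is not garbage collected; release it explicitly. */
SET-SIZE(DataBlob) = 0.
```

GET-LONG and PUT-LONG take 1-based byte positions, so any zero-based node number or offset has to be shifted by one position somewhere along the way.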

I would still suggest that we create a separate ParseUnit class for handling memptrs, called MemParseUnit, so you could still call

ParseUnit:GetNodeFirstChild(x). In this class, this would return an integer, not a node class.

However, you would obviously not be able to chain attributes together, as no classes are returned.

Wed, 2008-08-13 19:11 — jmls

Don't want to be

Don't want to be presumptuous, but is it ok if I start on the MemParseUnit? I have already done some coding as part of the exploratory work and can upload it if that's ok with people. I don't want to tread on anyone's toes.

Wed, 2008-08-13 19:43 — jurjen

There are no toes here ;-)
I would start with an inventory of API functions that are currently used by Prolint. That's just a subset of what proparse.i defines.

Wed, 2008-08-13 20:14 — jmls

ok, let me take out the debug code and pretty it up first, and I'll upload the initial code.

It returns FirstChildNode, ParentNode, NodeText and NodeTypeName

Wed, 2008-08-13 19:34 — john

Re: Don't want to be

I sure don't mind! It's great that you are working on it. Jurjen should have a lot of input into its design and implementation, maybe he wanted to do some of it himself... Jurjen?

Wed, 2008-08-13 18:06 — john

no problem with arrays

With the old API, we define one variable for each node handle we want (like defining a BUFFER or a HANDLE). If we want four node handles, then we define four variables. If we want 28000 node handles, we define 28000 variables. No, wait, we've never done that. :-)

Usually only one, two, or a few handles would be needed by any program using the old API.

Even in recursive programs (walking the tree), there wouldn't be a need for all that many. Say a few for each recursion of the program, multiplied by the depth of the tree, which might still only add up to 100 or so handles at the most. If the programmer writing the recursive program forgot to release handles, then the limit (the size of the array) would be reached.
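
A minimal sketch of the kind of handle pool described above, assuming a fixed array of 100 slots. AllocHandle, ReleaseHandle, nodeOffsets and slotInUse are hypothetical names for illustration, not the old proparse.dll API.

```abl
/* Sketch only: a small fixed pool that maps handle numbers to node offsets. */
DEFINE VARIABLE nodeOffsets AS INTEGER NO-UNDO EXTENT 100.
DEFINE VARIABLE slotInUse   AS LOGICAL NO-UNDO EXTENT 100.
DEFINE VARIABLE hNode       AS INTEGER NO-UNDO.
DEFINE VARIABLE lOk         AS LOGICAL NO-UNDO.

/* Claim the first free slot and store the node offset in it. */
FUNCTION AllocHandle RETURNS INTEGER (INPUT p_Offset AS INTEGER):
    DEFINE VARIABLE i AS INTEGER NO-UNDO.
    DO i = 1 TO EXTENT(nodeOffsets):
        IF NOT slotInUse[i] THEN DO:
            ASSIGN slotInUse[i]   = TRUE
                   nodeOffsets[i] = p_Offset.
            RETURN i.
        END.
    END.
    /* Pool exhausted: most likely handles were never released. */
    RETURN ?.
END FUNCTION.

/* Give a slot back to the pool so it can be reused. */
FUNCTION ReleaseHandle RETURNS LOGICAL (INPUT p_Handle AS INTEGER):
    slotInUse[p_Handle] = FALSE.
    RETURN TRUE.
END FUNCTION.

hNode = AllocHandle(1234).
MESSAGE hNode nodeOffsets[hNode] VIEW-AS ALERT-BOX.
lOk = ReleaseHandle(hNode).
```

With a pool like this, the array only needs to be as large as the number of handles in use at one time, not the number of nodes in the tree.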

If Jurjen decides to do something other than use handles, then this is all moot anyway. :)

Wed, 2008-08-13 17:13 — jurjen

why handles?

nodetype = ParseUnit:GetNodeFirstChild(x, y).

Actually I do not understand why x and y have to be handles (I mean integer indices into an array of record numbers). Wouldn't it work if x and y were record numbers? Aside from the issue that y would have to be an output parameter:

nodetype = ParseUnit:GetNodeFirstChild(x, output y).

Wed, 2008-08-13 18:50 — jmls

First of all, being thick, I assumed that all output values would be the node numbers themselves, not pointers to the nodes. See my earlier example ;)

Also, I would like to see the API returning all values via the methods, and not via output parameters.

so,

 nodetype = ParseUnit:GetNodeFirstChild(x, output y).

should be

 nodetype = ParseUnit:GetNodeType(ParseUnit:GetNodeFirstChild(x)).

or (longhand)

 ChildNode = ParseUnit:GetNodeFirstChild(x).
 nodetype = ParseUnit:GetNodeType(ChildNode).
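
To make the shape of such an API concrete, here is a rough sketch of a class whose methods return plain integers. Everything about the record layout is an assumption made for the example (a 4-byte node type followed by a 4-byte first-child node number, 8 bytes per node); it is not the real proparse blob format, and returning the node type as an integer is also just an assumption.

```abl
/* MemParseUnit.cls - sketch only, with a made-up record layout. */
CLASS MemParseUnit:

    /* Assumed record per node: 4-byte node type, 4-byte first-child number. */
    DEFINE PRIVATE VARIABLE DataBlob AS MEMPTR NO-UNDO.

    CONSTRUCTOR PUBLIC MemParseUnit (INPUT p_Blob AS MEMPTR):
        DataBlob = p_Blob.
    END CONSTRUCTOR.

    /* Node numbers are zero-based; MEMPTR byte positions are 1-based. */
    METHOD PUBLIC INTEGER GetNodeType (INPUT p_Node AS INTEGER):
        RETURN GET-LONG(DataBlob, 1 + p_Node * 8).
    END METHOD.

    METHOD PUBLIC INTEGER GetNodeFirstChild (INPUT p_Node AS INTEGER):
        RETURN GET-LONG(DataBlob, 5 + p_Node * 8).
    END METHOD.

END CLASS.
```

Because every method takes and returns a node number, the nested call and the longhand form above both work unchanged, and no intermediate node objects are created.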

Wed, 2008-08-13 19:20 — john

old API

For performance reasons, and because of the way the handles worked, the old proparse.dll API did two things in one step. What I'm suggesting here is a drop-in replacement with the same behavior, so that fewer changes have to be made to existing Prolint code.

Outside of those constraints, I would agree with you.

We'll want to have some functions that behave much like the old proparse.dll, so that it's easy to get Prolint running with proparse.jar. We'll also want to have new (replacement) functions that aren't held back by legacy constraints.

Wed, 2008-08-13 19:50 — jurjen

Indeed I would appreciate it if the form would remain in general nodetype=function(node,node). A change to nodetype=function(node,output node) is not so bad because that is mostly dumb edit work, but a larger structural change might involve a lot of work in Prolint code. And you know how lazy I am.

Wed, 2008-08-13 18:21 — john

re: why handles?

I like your idea. The internals of the API would certainly be simpler.

Working with node numbers would also be easier to understand and explain than the old handles API.

On the Java side, I do sometimes work with node numbers and all of the nodes as an array. (This is because sometimes I need to store the node number.) To keep things consistent between the node numbers used in Java, and the node numbers used in ABL, we should count node numbers from zero.

So, using Jurjen's API suggestion, parseUnit:getNodeType(0) would always return Program_root.

Wed, 2008-08-13 17:49 — john

Re: why handles?

I was only thinking in terms of backward compatibility with the old API. Yes, it would work to use record numbers directly.

I guess that would be a nice compromise. I think most of the old API functions would look the same... not too many would be changed. So, it wouldn't be a huge change to the existing Prolint code.
