Re: [Fwd: Gnuplot: MAX_NUM_VAR in src/syscfg.h]

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: [Fwd: Gnuplot: MAX_NUM_VAR in src/syscfg.h]

Jonathan Thornburg
On Thu, 2 Jun 2005, Ethan Merritt wrote:

> On Wednesday 25 May 2005 05:03 am, Lars Hecking wrote:
>>
>> may I ask to advance the #define'd compile-time parameter
>>      MAX_NUM_VAR
>> in src/syscfg.h from 5 to something like 20 or so?  :-)
>
> The arrays that use MAX_NUM_VAR are horribly wasteful,
> particularly if we bump it up to a larger number.
>  e.g.:     char c_dummy_var[MAX_NUM_VAR][MAX_ID_LEN+1];
> would be reserving 20*50 chars of storage to hold what
> is usually 2-3 instances of single-characters names like
> "u", "v" or "w".
>
> How about changing these arrays to hold dynamically
> allocated pointers instead.  [[...]]

Is it worth complexifying the code to save 1K bytes of memory?
Does current gnuplot still run on MS-DOS 16-bit systems?

ciao,

--
-- Jonathan Thornburg <[hidden email]>
    Max-Planck-Institut fuer Gravitationsphysik (Albert-Einstein-Institut),
    Golm, Germany, "Old Europe"     http://www.aei.mpg.de/~jthorn/home.html
    "Washing one's hands of the conflict between the powerful and the
     powerless means to side with the powerful, not to be neutral."
                                       -- quote by Freire / poster by Oxfam



-------------------------------------------------------
This SF.Net email is sponsored by: NEC IT Guy Games.  How far can you shotput
a projector? How fast can you ride your desk chair down the office luge track?
If you want to score the big prize, get to know the little guy.  
Play to win an NEC 61" plasma display: http://www.necitguy.com/?r=20
_______________________________________________
gnuplot-beta mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gnuplot-beta
Reply | Threaded
Open this post in threaded view
|

Re: [Fwd: Gnuplot: MAX_NUM_VAR in src/syscfg.h]

Dave Denholm
Jonathan Thornburg <[hidden email]> writes:

> On Thu, 2 Jun 2005, Ethan Merritt wrote:
>
>> On Wednesday 25 May 2005 05:03 am, Lars Hecking wrote:
>>>
>>> may I ask to advance the #define'd compile-time parameter
>>>      MAX_NUM_VAR
>>> in src/syscfg.h from 5 to something like 20 or so?  :-)
>>
>> The arrays that use MAX_NUM_VAR are horribly wasteful,
>> particularly if we bump it up to a larger number.
>>  e.g.:     char c_dummy_var[MAX_NUM_VAR][MAX_ID_LEN+1];
>> would be reserving 20*50 chars of storage to hold what
>> is usually 2-3 instances of single-characters names like
>> "u", "v" or "w".
>>
>> How about changing these arrays to hold dynamically
>> allocated pointers instead.  [[...]]


This reminds me of some work I had planned long ago (when I had time
to help out on gnuplot). A couple of thoughts...


- how are string literals in expressions represented ?  When I was
  deliberating over this area, I was tending towards having the notion
  of a string pool associated with an exression. The pool would
  survive for as long as the expression did, and if that meant the
  expression was stored in a user-defined fn, then the string pool
  would persist.

  If that approach was used, then the names of the dummy variables
  could be stored in that string pool.


- the other aspect of MAX_NUM_VAR is to actually have space for the
  values during (recursive) expression parsing. My recollection is
  that gnuplot used to have a fixed area for the parameter values, and
  when recursing into a udf, would copy the old values into temporary
  space, and put the new values into the globals. I don't know if this
  is still how it works. But a much more elegant solution would be to
  use an evaluation stack. (I work on java vm's these days, so it's
  obvious where this idea comes from...)

  Basically, there is one block of memory for estack. Each function
  has a frame on the stack. When a fn is invoked, the frame of a
  function overlaps the estack of the previous one.

  eg

  g(x,y,z)=x+y+z
  f(x,y,z)=6 + g(x+2, y+3, 42)

gnuplot> show at f(10,11,12)

        pushc 10
        pushc 11
        pushc 12
        pushc 3
        calln f
          pushc 6
          pushd1 f dummy
          pushc 2
          plus
          pushd2  dummy
          pushc 3
          plus
          pushc 42
          pushc 3
          calln g
            pushd1 g dummy
            pushd2  dummy
            plus
            pushc 2
            pushd
            plus
          plus

consider estack growing from left to right.

push parameters

10 11 12

invoke f : I assume the 3 is a parameter count (ie linkage). This
becomes the start of the frame for f, with the parameter values still
on the bit of the stack belonging to the caller. (Or we could consider
this bit of the estack to be shared)

10 11 12 | 3 |

start interpreting f. The pushd1 (push dummy 1) reaches down in the
estack to retrieve the parameter values from the top of the estack of
the caller.

10 11 12 | 3 | 6 10 2

plus

10 11 12 | 3 | 6 12

so on until we invoke g

10 11 12 | 3 | 6 12 14 42 | 3 |

etc.


When a function exits, we merely take the value on top of the estack,
remove the linkage, and push the value back on top of the estack. so
the net effect is to pop the parameters and push the result.


dd
--
Dave Denholm              <[hidden email]>       http://www.esmertec.com


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
gnuplot-beta mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gnuplot-beta
Reply | Threaded
Open this post in threaded view
|

Re: [Fwd: Gnuplot: MAX_NUM_VAR in src/syscfg.h]

Ethan Merritt
On Friday 24 June 2005 12:11 pm, you wrote:
> >> How about changing these arrays to hold dynamically
> >> allocated pointers instead.  [[...]]
>
> - how are string literals in expressions represented?

If a literal string appears in-line on a single command line, then it
has no continuing existence.  It exists as part of the command line
being parsed.

String variables do have continuing existence, in dynamically-allocated
storage associated with the udv for that variable name.

A string literal that is part of a user-defined function is held in
a dynamically-allocated udv in the pre-built evaluation stack for
that function.

In any case, when the string is referenced during the course of evaluating
an expression, a copy of the string is made by allocating dynamic storage.
This temporary copy is pushed/popped/dereferenced using the stack during
the expression evaluation, and freed again at the end of evaluation.

> When I was
>   deliberating over this area, I was tending towards having the notion
>   of a string pool associated with an exression. The pool would
>   survive for as long as the expression did, and if that meant the
>   expression was stored in a user-defined fn, then the string pool
>   would persist.

If I understand correctly, that is what it currently does.
But I don't quite follow the use of the word "pool", except insofar
as the strings exist in the heap managed by malloc() via gp_alloc().

>   If that approach was used, then the names of the dummy variables
>   could be stored in that string pool.

There you lost me.  The dummy variable names are independent of
any particular function, and must be available to the parser before
any user-defined functions have been defined.  Unless, as before,
by "stored in that pool" you mean "dynamically allocated by gp_alloc()"?

> - the other aspect of MAX_NUM_VAR is to actually have space for the
>   values during (recursive) expression parsing. My recollection is
>   that gnuplot used to have a fixed area for the parameter values, and
>   when recursing into a udf, would copy the old values into temporary
>   space, and put the new values into the globals. I don't know if this
>   is still how it works. But a much more elegant solution would be to
>   use an evaluation stack. (I work on java vm's these days, so it's
>   obvious where this idea comes from...)

It does use an evaluation stack.  Currently the stack is of fixed size
   eval.c:  static struct value stack[STACK_DEPTH];
but I think it would be straightforward to make it dynamically evaluated
instead. The push() routine would have to check whether the stack needs
to be expanded via realloc().


--
Ethan A Merritt
Biomolecular Structure Center
University of Washington, Seattle 98195-7742


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
gnuplot-beta mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gnuplot-beta
Reply | Threaded
Open this post in threaded view
|

Re: [Fwd: Gnuplot: MAX_NUM_VAR in src/syscfg.h]

Dave Denholm
Ethan A Merritt <[hidden email]> writes:

> On Friday 24 June 2005 12:11 pm, you wrote:
>> >> How about changing these arrays to hold dynamically
>> >> allocated pointers instead.  [[...]]
>>
>> - how are string literals in expressions represented?
>
> A string literal that is part of a user-defined function is held in
> a dynamically-allocated udv in the pre-built evaluation stack for
> that function.

Hmm - not sure what you mean by the pre-built eval stack - I know
gnuplot fairly intimately back in 3.6 but that was a while ago now.

I was thinking of expressions mostly for user-defined functions, but
there are also the expressions which are anonymous, but which are
evaluated many times (eg  plot ... using  )


> In any case, when the string is referenced during the course of evaluating
> an expression, a copy of the string is made by allocating dynamic storage.
> This temporary copy is pushed/popped/dereferenced using the stack during
> the expression evaluation, and freed again at the end of evaluation.
>
>> When I was
>>   deliberating over this area, I was tending towards having the notion
>>   of a string pool associated with an exression. The pool would
>>   survive for as long as the expression did, and if that meant the
>>   expression was stored in a user-defined fn, then the string pool
>>   would persist.
>
> If I understand correctly, that is what it currently does.
> But I don't quite follow the use of the word "pool", except insofar
> as the strings exist in the heap managed by malloc() via gp_alloc().
>

Okay... I had invented a scheme using a pool simply to avoid the pain
of having to clean up when the expression itself was freed. Rather
than having an arbitrary number of pointers to strings stored
elsewhere, I had in mind one lump of memory associated with the
expression, and then string literals were stored within that
pool, and the "action" to access the string literal stored an index into
that pool, rather than a direct pointer (so that as the pool grows, we
don't need to fixup pointer). Then since all expressions (or action
tables) have one constant-pool pointer, there is only one lump of
memory to free when the expression gets deleted. But this is just a
detail.

>>   If that approach was used, then the names of the dummy variables
>>   could be stored in that string pool.
>
> There you lost me.  The dummy variable names are independent of
> any particular function, and must be available to the parser before
> any user-defined functions have been defined.  Unless, as before,
> by "stored in that pool" you mean "dynamically allocated by gp_alloc()"?
>

Ah - sorry - perhaps I'm being stupid. I vaguelly recalled that the
dummy variable names were preserved in order to be able to echo them
back to the user when listing / saving udf's. But it probably just
saves the entire expression as a string. So it doesn't need to keep
the names of the variables.



Perhaps MAX_NUM_VAR is being misused - it is the size of an array to
store the current "global" dummy names, but also gets used as the size
of array for parameters passed to fns.


>> - the other aspect of MAX_NUM_VAR is to actually have space for the
>>   values during (recursive) expression parsing. My recollection is
>>   that gnuplot used to have a fixed area for the parameter values, and
>>   when recursing into a udf, would copy the old values into temporary
>>   space, and put the new values into the globals. I don't know if this

yeah...

/* user-defined function table entry */
typedef struct udft_entry {
    struct udft_entry *next_udf; /* pointer to next udf in linked list */
    char *udf_name; /* name of this function entry */
    struct at_type *at; /* pointer to action table to execute */
    char *definition; /* definition of function as typed */
    t_value dummy_values[MAX_NUM_VAR]; /* current value of dummy variables */
} udft_entry;


Okay, not globals, but a dummy_values[] array stored with each udf.


>>   is still how it works. But a much more elegant solution would be to
>>   use an evaluation stack. (I work on java vm's these days, so it's
>>   obvious where this idea comes from...)
>
> It does use an evaluation stack.  Currently the stack is of fixed size
>    eval.c:  static struct value stack[STACK_DEPTH];
> but I think it would be straightforward to make it dynamically evaluated
> instead. The push() routine would have to check whether the stack needs
> to be expanded via realloc().
>


yes, it uses an eval stack. But it's a "pure" estack. When a udf
function is invoked, the parameters are copied off the estack into
some other storage. What I'm suggesting a small change to leave them
on the estack, with a frame being pushed. This gets round the problem
of having an upper limit number on the number of dummy variables,
without having to set aside storage to copy parameters.

Looking at the 4.0 sources, f_calln() in internal.c has to

    for (i = 0; i < MAX_NUM_VAR; i++)
        save_dummy[i] = udf->dummy_values[i];

copy the old parameters out into some spare space (since this might be
a recursive call, presumably), then copy the parameters off the stack
into the dummy_values for the udf, and then resume resume. And then on
return, it has to copy back the saved ones into the current
dummy_values[]


I'm proposing that we (er... someone !)  changes this so that,
instead, f_calln() lays down some notion of frame linkage in the
estack, so that it can find dummy parameters.

Then f_pushd() [push dummy variable onto estack] is changed to not
fetch it from the (fixed size) array in the udf defn, but instead,
fetch from lower down in the estack.

And returning from a udf just removes the linkage, actually pops the
fn parameters off the caller's estack, and pushes the result.


dd
--
Dave Denholm              <[hidden email]>       http://www.esmertec.com


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
gnuplot-beta mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gnuplot-beta
Reply | Threaded
Open this post in threaded view
|

Re: [Fwd: Gnuplot: MAX_NUM_VAR in src/syscfg.h]

Ethan Merritt
On Saturday 25 June 2005 06:50 am, you wrote:

>
> yes, it uses an eval stack. But it's a "pure" estack. When a udf
> function is invoked, the parameters are copied off the estack into
> some other storage. What I'm suggesting a small change to leave them
> on the estack, with a frame being pushed. This gets round the problem
> of having an upper limit number on the number of dummy variables,
> without having to set aside storage to copy parameters.
>
> Looking at the 4.0 sources, f_calln() in internal.c has to
>
>     for (i = 0; i < MAX_NUM_VAR; i++)
>         save_dummy[i] = udf->dummy_values[i];
>
> I'm proposing that we (er... someone !)  changes this so that,
> instead, f_calln() lays down some notion of frame linkage in the
> estack, so that it can find dummy parameters.

OK. I get it now, at least conceptually.
I think you are/were more familiar with the inner workings of the
evaluation code than I am.

It looks to me that there is also a less drastic, though less elegant,
fix. Instead of pre-reserving a fixed size array in the udft_entry
structure, we could allocate the space dynamically immediately prior
to the code you quote above.

--- f_calln.c.orig      2005-06-25 15:14:01.620104672 -0700
+++ f_calln.c   2005-06-25 15:16:07.559958904 -0700
@@ -13,12 +13,13 @@
     if (!udf->at) {            /* undefined */
        int_error(NO_CARET, "undefined function: %s", udf->udf_name);
     }
-    for (i = 0; i < MAX_NUM_VAR; i++)
-       save_dummy[i] = udf->dummy_values[i];
-    /* if there are more parameters than the function is expecting */
-    /* simply ignore the excess */
     (void) pop(&num_params);
+    num_pop = num_params.v.int_val;
+    udf->dummy_values = gp_alloc(sizeof(struct value) * num_pop);
+    for (i = 0; i < num_pop; i++)
+       save_dummy[i] = udf->dummy_values[i];



It also looks to me that there is an overflow flaw in the current
code.  It tries to "ignore the excess parameters" by popping them into
the dummy_values array first, and then overwriting them with the actual
parameters up to MAX_NUM_VAR.  If the number of parameters to be thrown
away itself exceeds MAX_NUM_VAR, it will end up trashing whatever
is next in the malloc heap.

internal.c 198:
    /* if there are more parameters than the function is expecting */
    /* simply ignore the excess */
    (void) pop(&num_params);
    if (num_params.v.int_val > MAX_NUM_VAR) {
        /* pop the dummies that there is no room for */
        num_pop = num_params.v.int_val - MAX_NUM_VAR;
        for (i = 0; i < num_pop; i++)
            (void) pop(&(udf->dummy_values[i]));
    }

Hmmm. Let's see...
gnuplot> f(x) = sin(x)
gnuplot> print f(pi,1,2)
1.0
gnuplot> print f(pi/2,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,
17,18,19,20,21,22,23,24,25,26,27,28,29,30)
gnuplot> Segmentation fault

So indeed there is room to improve this code.


--
Ethan A Merritt
Biomolecular Structure Center
University of Washington, Seattle 98195-7742


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
gnuplot-beta mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gnuplot-beta