[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Scheme-reports] Write procedure is not backwards compatible

John Cowan writes:
> If you have the data handy, how much space did the hash-eq occupy,
> as distinct from how many elements it had in it?

I don't know of a way to observe this directly.  What I did is this: rewrite write/safe to save the hash-eq in a global variable, then see how much memory is in use (forcing three garbage collections beforehand), then rewrite that variable to nil, force three more garbage collections, and compute the difference in memory used.  The results are very consistent, so I think they're accurate:

The hash-eq occupies [a power of two] bytes, between 22.5 and 45 bytes per cons cell [or, generally, per key-value pair] on my machine, doubling at very definite boundaries; this is true across the gamut of structure sizes.  A cons cell consumes 32 bytes on my machine.  The memory overhead of the hash-eq is between .7 and 1.4.

Alex Shinn writes:
> It is not unreasonable to work with a single data
> structure that large, and we should ensure it's possible
> to write it out.

True.  Still, this can in theory be done with all desired space-efficiency and portability (given that all data structures used are portable) by the user: write "write/fast" recursively in Scheme.  (Note that even a built-in write/fast would likely have a worst-case space overhead of .5 for the stack.)  It would merely be a royal pain in the general case to enumerate all the different types of structures to traverse (possibly including user-defined records), and write the code to traverse each type.  (One might do what I did--handle only cons cells myself and degrade to the built-in writing procedure on anything else.  This doesn't work for cons cells inside other structures, but I suspect most giant single data structures would be pretty homogeneous.)

Meanwhile, write/safe and write/shared cannot be defined efficiently and portably at all in the absence of portable hash-eqs or equivalent, as SRFI-38 complains.

I'm inclined to insist that write/safe and write/shared appear in some form in the standard, and don't care much whether the unsafe write/fast is a) prescribed by the standard, b) recommended but implementation-named and -provided, or c) not explicitly mentioned by the standard at all (though implementations could still provide it).

--John Boyle
Science is what we understand well enough to explain to a computer. Art is everything else we do. --Knuth

On Sun, Jul 8, 2012 at 12:45 AM, John Cowan <cowan@x> wrote:
John Boyle scripsit:

> write/fast consistently takes 4500-4700 milliseconds to print all of these
> structures out, which makes sense, because it does the same amount of work
> on each cons cell, no matter how long the tail of the list is.  Over the
> range of structures with 40 cons cells to those with 40,000 cons cells,
> "write/safe" is generally 10% slower than "write/fast".  At 400,000 cons
> cells, "write/safe" takes 33% longer (6100 msec), and at 4 million, 65%
> longer (7600 msec).  This seems consistent with my estimate of "O(n) -> O(n
> log n), or perhaps O(n), depending on exactly how hash-eqs work".  And
> incidentally, at 4 cons cells, write/fast took 5400 and write/safe took
> 6400 msec (20% slower).

If you have the data handy, how much space did the hash-eq occupy,
as distinct from how many elements it had in it?

John Cowan        http://ccil.org/~cowan   cowan@x
Lope de Vega: "It wonders me I can speak at all.  Some caitiff rogue
did rudely yerk me on the knob, wherefrom my wits yet wander."
An Englishman: "Ay, belike a filchman to the nab'll leave you
crank for a spell." --Harry Turtledove, Ruled Britannia

Scheme-reports mailing list