[Gambas-user] fastest way to "zero-out" a huge binary file
Doriano Blengino
doriano.blengino at ...1909...
Mon Jan 25 10:25:03 CET 2010
kevinfishburne wrote:
> Doriano Blengino wrote:
>
>> Read carefully the documentation about the WRITE instruction - the
>> explanation is in the first few lines. When you write something out,
>> you can specify the length of the data you write. If you don't specify
>> it, Gambas decides it for you, and that is what you don't want.
>>
>>
>
> I did exactly that last night, and also found the String$ function, which was
> helpful in creating the string of zeros. I can now create a
> zero-filled 8 GB file in about 60 seconds. My roots are in QuickBASIC 4.5,
> but my Gambas skills are improving the more I use it, thankfully. This forum
> is probably the most helpful forum in forum history as well.
>
>
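For the archives, the trick is entirely in the optional length argument: with
it, WRITE dumps the raw bytes of the string instead of serializing the value.
A rough sketch of the chunked zero-fill - untested and from memory, so check
the exact OPEN/WRITE forms for your Gambas version; the file name and chunk
size are only examples:

  ' Build one chunk of zero bytes with String$() and write it repeatedly,
  ' passing an explicit length so WRITE emits the raw bytes.
  Dim hFile As File
  Dim sChunk As String
  Dim i As Integer

  sChunk = String$(1048576, Chr$(0))       ' 1 MiB of zero bytes
  hFile = Open "/tmp/heightmap.raw" For Write Create
  For i = 1 To 8192                        ' 8192 * 1 MiB = 8 GiB, adjust to taste
    Write #hFile, sChunk, Len(sChunk)      ' explicit length -> raw bytes, no serialization
  Next
  Close #hFile

Writing a big chunk in a loop is much faster than billions of two-byte writes,
because the per-call overhead is paid only a few thousand times.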
> Doriano Blengino wrote:
>
>> I don't know the algorithm you want to use on so much data, but if you
>> look at it closely, maybe you can find some shortcut to minimize the
>> reads and writes to the file. You could read data in chunks, calculate
>> intermediate results in RAM, and so on. If you try to randomly read
>> 65537*65537 small pieces of data, your program will take forever to run,
>> because the input/output overhead is paid on every two bytes. By reading
>> 4 bytes at a time instead of two, you gain twice the speed, and so on.
>> If the computation is heavy, then there will be a point of best compromise
>> between the cost of I/O and the cost of computation. If the computation is
>> very, very heavy, then the I/O time will be negligible.
>>
>> Another nice thing would be to compress the data on its way to and from
>> disk, if the data can be managed in blocks. You might even reduce it to
>> 1/10 of its size, which would be a big improvement.
>>
>>
>
> Those are accurate observations and my main devil at this point. Here's an
> explanation of the algorithm:
>
> http://www.gameprogrammer.com/fractal.html#diamond
> http://en.wikipedia.org/wiki/Diamond-square_algorithm
>
> Basically think of the file as a square grid of byte pairs (shorts), each
> representing an elevation point on a map. Each pair of bytes is read
> directly into a short for processing as necessary. Initially the four corner
> points are read and the algorithm is applied, generating the center point
> which is then written back to the file (figure b in the first link).
>
> The second part of the algorithm uses these five points plus four points
> initially outside the file to generate four more points (figure c in the
> first link). It's the same as the first part of the algorithm, but rotated
> 45 degrees and performed four times in different areas.
>
> After the two algorithm variations are applied they are repeated at
> half-scale across all the points, exponentially increasing the number of
> iterations per pass (1x1, 4x4, 16x16, 64x64, etc.).
>
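Just to pin down the arithmetic behind that: a point at column x, row y of a
65537-wide grid of shorts sits at byte offset (y * 65537 + x) * 2, and with an
~8 GB file that offset no longer fits in a 32-bit Integer, so it must be
computed as a Long. An untested sketch, Gambas syntax from memory (the
procedure names are mine, and the exact SEEK/READ forms may differ between
Gambas versions):

  ' Read/write one elevation point of a 65537 x 65537 grid of Shorts.
  ' The offset must be a Long: 65537 * 65537 * 2 bytes overflows an Integer.
  Sub WritePoint(hFile As File, x As Integer, y As Integer, nValue As Short)
    Seek #hFile, (CLong(y) * 65537 + x) * 2
    Write #hFile, nValue
  End

  Function ReadPoint(hFile As File, x As Integer, y As Integer) As Short
    Dim nValue As Short
    Seek #hFile, (CLong(y) * 65537 + x) * 2
    Read #hFile, nValue
    Return nValue
  End

The square step of figure (b) is then just the average of the four corners
plus the random displacement, for example:

  ' center of a square of half-size h around (cx, cy); nNoise stands for
  ' whatever random displacement the current pass uses
  Dim nSum As Integer
  nSum = ReadPoint(hFile, cx - h, cy - h) + ReadPoint(hFile, cx + h, cy - h)
  nSum += ReadPoint(hFile, cx - h, cy + h) + ReadPoint(hFile, cx + h, cy + h)
  WritePoint(hFile, cx, cy, nSum \ 4 + nNoise)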
> The only thing I can think to do to increase I/O efficiency without adding
> more overhead to the number crunching (which is already significant) would
> be to assign each read point to a small array of shorts so the points would
> not have to be read from disk more than once per pass.
>
> Another idea would be to wait for the subdivisions to become small enough
> that one could be loaded at once into a huge array of shorts (2 GB, for
> instance), then operate solely on that chunk until it was finished and
> commit it back to disk. Because of the 45-degree rotation in the second part
> of the algorithm I'd still have to read some bytes from outside the array,
> however. That would also disrupt the nice nested FOR...NEXT loop I have
> controlling the iterations.
>
This is exactly what I thought of at first. The algorithm starts by
examining points far apart from each other, but then, by halving, it moves to
smaller and smaller squares. At a certain point the squares get small
enough to be kept easily in RAM; say, a 1024*1024 square. From then on,
if I understand correctly, there is no "pollution" from the surrounding
points, so you can use an array and recursively call a routine on it,
passing the four vertexes.
The subroutine could look something like this:
sub square_diamond(left as integer, right as integer, top as integer, bottom as integer)
  dim step as integer

  step = (right - left) \ 2          ' half the side of the current square
  if step < 1 then return            ' nothing left to subdivide

  if right - left <= MAX_ARRAY_SIZE then
    ' use internal cached data,
    ' but first load it from disk if this is the first time we got this small
    if right - left = MAX_ARRAY_SIZE then load_array_from_disk(...)
    cache[left + step, top + step] = (blah + blah + blah + blah) / 4
  else
    ' seek in the disk file, load the 4/8 values, calculate,
    ' and write the result back to disk
  endif

  ' apply this algorithm to the four resulting squares
  square_diamond(left, left + step, top, top + step)         ' upper left
  square_diamond(left + step, right, top, top + step)        ' upper right
  square_diamond(left, left + step, top + step, bottom)      ' lower left
  square_diamond(left + step, right, top + step, bottom)     ' lower right

  if step = 1 then
    ' finished with this square: if it is the lower-rightmost square
    ' of the cache, write the cache back to disk
    ' sorry... too lazy to continue this...
  endif
end
The above code surely still has bugs and omissions, but this is the
whole idea.
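To start it off you would call it once on the whole grid - hypothetically,
with 65537 points per side the corner indexes are 0 and 65536:

  ' top-level call on the full grid (indexes 0 .. 65536)
  square_diamond(0, 65536, 0, 65536)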
Hope this helps - I cannot think of anything better at the moment.
Sometimes one can try to imagine the data in a totally different way; for
example, are we sure we must see the file as 65537 rows of 65537
columns each? Could it be that seeing it as a series of bigger and bigger frames
(1 dot, the central one, then the 8 surrounding dots starting from the top,
then the next frame and so on) would help? Probably not, but this is
just a quick attempt at a different arrangement.
Regards,
Doriano