[Gambas-user] using a "file system database"
Doriano Blengino
doriano.blengino at ...1909...
Wed Apr 20 09:08:25 CEST 2011
Kevin Fishburne wrote:
>>> My current plan is to create a directory for each region
>>> ([65536/32/32]^2). Each region directory contains 32^2 data files
>>> (1024). Hopefully this won't stress any particular file system as far as
>>> how many directories and files are contained within a single directory.
>>>
>>>
>> But... I am missing something... the number of files was 4M, right? And
>> 64 directories with 1024 files does not sum up to 4M...
>>
>
> A cell is 32x32 tiles (bytes), and the map is 65536^2 tiles, so there
> are 2048^2 cells. Each of these are organized into directories of 32^2
> cells (regions). The data files alone are 4.2 million in number, the
> directories that organize them hopefully add a layer of filesystem
> efficiency. Right now a separate directory is created for each region
> and empty data files are created for each cell within that region.
>
>
I can't explain to myself why I did not understand at first. Well (not)...
apparently I missed the square "^2".
So there are 4096 directories containing 1024 files each, and each file
is 1 KB in size (or not?). If this is correct, you should choose a small
allocation unit for your file system (1 or 2 KB?). Assuming that the
file system uses, say, 128 bytes to store the metadata for a file,
you will have 4 GB of data, 512 MB of metadata for the files, and
4K*128 = 512 KB of metadata for the directories. More than 12% of the total
volume of data is spent on organizing it, which is not a bad number,
but in your case this could be improved by merging together, for
example, all the files (1024) of a single directory, if this is
possible. If the files were different in length, then using the file
system to organize them would have been the easiest way. But this is
just a theoretical supposition...
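To make the merging idea concrete, here is a minimal sketch in Python (not Gambas). It assumes that every cell is exactly 1024 bytes (32x32 tiles, one byte per tile) and that the 1024 cells of a region are stored row by row in a single region file. The function names and the row-major ordering are only assumptions made for this example.

```python
CELL_SIDE = 32                      # tiles per cell side, one byte per tile
CELL_BYTES = CELL_SIDE * CELL_SIDE  # 1024 bytes per cell
REGION_SIDE = 32                    # cells per region side
REGION_BYTES = CELL_BYTES * REGION_SIDE * REGION_SIDE  # 1 MiB per region file


def cell_offset(cx, cy):
    """Byte offset of cell (cx, cy) inside its region file (row-major, assumed)."""
    if not (0 <= cx < REGION_SIDE and 0 <= cy < REGION_SIDE):
        raise ValueError("cell coordinates out of range")
    return (cy * REGION_SIDE + cx) * CELL_BYTES


def create_region(path):
    """Create a zero-filled region file holding all 1024 cells."""
    with open(path, "wb") as f:
        f.write(bytes(REGION_BYTES))


def write_cell(path, cx, cy, data):
    """Overwrite one cell in place; the file size never changes."""
    if len(data) != CELL_BYTES:
        raise ValueError("a cell must be exactly %d bytes" % CELL_BYTES)
    with open(path, "r+b") as f:
        f.seek(cell_offset(cx, cy))
        f.write(data)


def read_cell(path, cx, cy):
    """Read one cell by seeking to its fixed offset."""
    with open(path, "rb") as f:
        f.seek(cell_offset(cx, cy))
        return f.read(CELL_BYTES)
```

With this layout the map would need 4096 files of 1 MiB each instead of about 4M files of 1 KB, so the per-file metadata shrinks by a factor of 1024. It only works because all cells have the same length.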
I have checked the file systems on my old server. The root partition is
2 GB (your cells would not fit there), uses 1K blocks, and has a total of
1M inodes (again the cells would not fit). The /usr partition is 18 GB,
uses a block size of 4K, and has a total of 1M inodes. Again, your data
would not fit because of the number of files. The biggest partition is
200 GB, uses a 4K block size, and has 12M inodes. On this partition,
your data would take 16 GB of space, but it would fit anyway, at the cost
of transferring 4 times the useful data at every read (probably). I
think the only pitfall is speed, not wasted disk space. If there is
enough RAM in the computer to cache the 512 MB of metadata, access is
very quick (10 milliseconds? Or less?). Otherwise, the same metadata
must be read over and over, at every access, which means that the
computer will read several MB to access a single 1 KB file. These are the
two opposite extremes - in reality a certain amount of
disk cache will always be in use. My old server has only 96 MB of RAM,
but as a normal file server it does not perform badly. Clearly, I don't
have 4M files on it... :-)
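As a rough check of the numbers above, the same arithmetic can be written as a small Python sketch. The 128 bytes of metadata per file is only my guess from earlier in this message, not a property of any particular file system, and the function names are made up for the example.

```python
def layout_cost(n_files, file_bytes, block_bytes, inode_bytes=128):
    """Rough cost of storing n_files files of equal size.

    inode_bytes=128 is an assumed per-file metadata size.
    """
    blocks_per_file = -(-file_bytes // block_bytes)  # ceiling division
    allocated = n_files * blocks_per_file * block_bytes
    return {
        "data": n_files * file_bytes,       # useful bytes
        "allocated": allocated,             # bytes actually occupied on disk
        "metadata": n_files * inode_bytes,  # per-file metadata estimate
        # bytes transferred per useful byte when a whole block is read
        "read_amplification": blocks_per_file * block_bytes / file_bytes,
    }


def fits(n_files, allocated_bytes, total_inodes, partition_bytes):
    """True only if both the inode count and the space are sufficient."""
    return n_files <= total_inodes and allocated_bytes <= partition_bytes
```

For 4096 * 1024 files of 1024 bytes on 4K blocks this gives 16 GiB allocated for 4 GiB of data, 512 MiB of metadata, and a read amplification of 4. On a Unix system the real figures for a mounted partition can be read with os.statvfs (f_files is the total number of inodes, f_frsize * f_blocks the total size).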
Regards,
Doriano