GpStructuredStorage internals
[This is the second article in the 'embedded file system' series. If you missed the first part, you can read it here.]
A GpStructuredStorage compound file is organized in 1 KB blocks. The first block contains the header; after it, the content alternates between a file allocation table (FAT) fragment block and the 256 blocks managed by the preceding FAT fragment. Each block is identified by a number: the header is block #0, the first FAT fragment is block #1, and so on.
[header:HEADER:1024]
[fat entry:FATENTRY:1024]
256 x [block:FOLDER/FILE:1024]
[fat entry:FATENTRY:1024]
256 x [block:FOLDER/FILE:1024]
...
[fat entry:FATENTRY:1024]
<=256 x [block:FOLDER/FILE:1024]
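The block numbering above turns into simple arithmetic. The following is a minimal sketch in Python (the library itself is written in Delphi); the function names are made up for illustration, and the positions of the FAT fragments (blocks 1, 258, 515, ...) are derived from the layout described here rather than taken from the library source.

```python
from typing import Tuple

BLOCK_SIZE = 1024                # every block is 1 KB
BLOCKS_PER_FAT = 256             # file blocks managed by one FAT fragment
GROUP_SIZE = BLOCKS_PER_FAT + 1  # one FAT fragment block plus its 256 blocks


def block_offset(block: int) -> int:
    """Byte offset of a block inside the compound file."""
    return block * BLOCK_SIZE


def is_fat_block(block: int) -> bool:
    """True if the block number falls on a FAT fragment (1, 258, 515, ...)."""
    return block >= 1 and (block - 1) % GROUP_SIZE == 0


def fat_slot(block: int) -> Tuple[int, int]:
    """Return (FAT fragment block number, entry index 0..255) for a file block."""
    if block < 2 or is_fat_block(block):
        raise ValueError(f"block {block} is not a folder/file block")
    group, position = divmod(block - 1, GROUP_SIZE)
    return 1 + group * GROUP_SIZE, position - 1
```

For example, block #257 is the last block managed by the first FAT fragment, and block #259 is the first block managed by the second one.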
Header starts with a signature, which must always be 'GpStructuredStorage file'#13#10#26#0.
HEADER
[signature:32] // PChar
[unused:964]
[storage attribute file:4]
[storage attribute file size:4]
[first FAT block:4]
[first unused block:4]
[root folder:4]
[root folder size:4]
[version:4] // storage system version
Header ends in:
- block number and size of the internal file containing global storage attributes
- block number of the first FAT block (always 1)
- block number of the first unused file block
- block number and size of the root folder
- storage system version (at the moment $01000200, or 1.0.2.0)
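To make the header layout concrete, here is a rough Python sketch that checks the signature and decodes the seven trailing fields. It is only an illustration: the names Header, parse_header and version_string are hypothetical, and little-endian byte order is an assumption (the format description does not state it).

```python
import struct
from typing import NamedTuple

BLOCK_SIZE = 1024
SIGNATURE = b"GpStructuredStorage file\r\n\x1a\x00"  # text + #13#10#26#0
# Seven 32-bit fields at the very end of the header block.
# "<" (little-endian) is an assumption; the article does not state byte order.
TRAILER = struct.Struct("<7I")


class Header(NamedTuple):
    attr_file_block: int
    attr_file_size: int
    first_fat_block: int
    first_unused_block: int
    root_folder_block: int
    root_folder_size: int
    version: int


def parse_header(block: bytes) -> Header:
    """Validate the signature and decode the last 28 bytes of block #0."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("header block must be exactly 1024 bytes")
    if not block.startswith(SIGNATURE):
        raise ValueError("missing GpStructuredStorage signature")
    return Header(*TRAILER.unpack_from(block, BLOCK_SIZE - TRAILER.size))


def version_string(version: int) -> str:
    """Render $01000200 as '1.0.2.0' (one byte per component)."""
    return ".".join(str((version >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

A repair tool would typically read the first 1024 bytes of the storage, pass them to parse_header, and refuse to continue if first_fat_block is not 1.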
Each FAT fragment contains 256 32-bit numbers, one for each of the 256 file blocks that follow it. Each number is either 0, if the block is the last one in its FAT chain, or the number of the next file block in the chain. Each block is linked into exactly one chain. It belongs either to a file/folder or to the empty block list. The address of the first block in the empty list is stored in the header ([first unused block] entry).
FATENTRY
256*[next block:4] // next-pointers for this block; 0 = unused
The FAT structure defines the limits of my file format: the last block must have a number that is less than 4294967294. As blocks are 1 KB in size, total file size cannot exceed 4294967294 * 1 KB, which is slightly less than 4 TB. Enough for all practical purposes, I think.
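Following a FAT chain could look roughly like the Python sketch below. It assumes little-endian 32-bit entries and FAT fragments located at blocks 1, 258, 515, ... (both inferred from the description, not confirmed against the library), and the helper names next_block and chain are invented for this example.

```python
import struct
from typing import BinaryIO, List

BLOCK_SIZE = 1024
GROUP_SIZE = 257  # one FAT fragment block followed by its 256 file blocks


def next_block(storage: BinaryIO, block: int) -> int:
    """Read the FAT entry for a file block; 0 means the chain ends here."""
    group, position = divmod(block - 1, GROUP_SIZE)
    if block < 2 or position == 0:
        raise ValueError(f"block {block} is not a folder/file block")
    fat_block = 1 + group * GROUP_SIZE
    storage.seek(fat_block * BLOCK_SIZE + (position - 1) * 4)
    raw = storage.read(4)
    if len(raw) != 4:
        raise ValueError("FAT entry lies beyond the end of the storage")
    # "<I" (little-endian unsigned 32-bit) is an assumption.
    return struct.unpack("<I", raw)[0]


def chain(storage: BinaryIO, first_block: int) -> List[int]:
    """List every block of a file, a folder, or the empty list, in order."""
    blocks, seen, block = [], set(), first_block
    while block != 0:
        if block in seen:
            raise ValueError("FAT chain loops back on itself")
        seen.add(block)
        blocks.append(block)
        block = next_block(storage, block)
    return blocks
```

The loop check is not part of the format; it is just a cheap safeguard that seems useful when reading a possibly damaged storage.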
Folders are simple - each folder is just a file. It contains file information records and is terminated with two 0 bytes.
FOLDER //potentially split over several blocks
[FILE_INFO]
[FILE_INFO]
...
[FILE_INFO]
[0:2]
File information record is a variable length record containing file name (up to 64 KB), attributes, length, and address of the first file block (additional blocks are reached by following the FAT chain).
FILE_INFO
[file name length:2]
[file name:1..65535]
[file attributes:ATTRIBUTES:4]
[file length:4] // 4 GB per file
[first file block:4]
At the moment, only two attributes are defined. One specifies that the file is actually a subfolder, and the other designates a special file containing file attributes (for a discussion of attributes, see the previous article).
ATTRIBUTES
$0001 = attrIsFolder
$0002 = attrIsAttributeFile
That's just about everything there is to tell about the compound file format. Armed with this knowledge, one can easily write a compound file browser/repair utility.
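As a starting point for a browser, the FOLDER and FILE_INFO layouts can be decoded with a few lines of Python. This is a sketch under assumptions: the folder content has already been collected into one bytes object (by following its FAT chain and trimming to the folder size), integers are little-endian, and the file name is kept as raw bytes because its encoding is not specified here. Attr, FileInfo and parse_folder are illustrative names only.

```python
import struct
from enum import IntFlag
from typing import Iterator, NamedTuple


class Attr(IntFlag):
    IS_FOLDER = 0x0001          # attrIsFolder
    IS_ATTRIBUTE_FILE = 0x0002  # attrIsAttributeFile


class FileInfo(NamedTuple):
    name: bytes        # raw bytes; the encoding is not specified in the article
    attributes: Attr
    length: int
    first_block: int


def parse_folder(data: bytes) -> Iterator[FileInfo]:
    """Yield FILE_INFO records until the two-zero-byte terminator."""
    pos = 0
    while True:
        # [file name length:2]; little-endian is an assumption.
        (name_len,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if name_len == 0:  # [0:2] terminator ends the folder
            return
        name = data[pos:pos + name_len]
        pos += name_len
        # [file attributes:4][file length:4][first file block:4]
        attributes, length, first_block = struct.unpack_from("<3I", data, pos)
        pos += 12
        yield FileInfo(name, Attr(attributes), length, first_block)
```

Truncated input surfaces as struct.error, which a repair utility would probably want to catch and report as a damaged folder.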


5 Comments:
Brilliant. I found this in the GExperts code and decided to use it in a newsreader program I'm writing. These posts will surely come in handy.
Thanks, regards :)
Thanks for the thumbs up!
(And thank you ;) )
Good work, but I found a bug.
Try increasing the file size in the test:
function TForm1.TestGSSBigAndSmall: boolean;
var
  storage: IGpStructuredStorage;
begin
  Result := false;
  Log('Testing big and small files');
  try
    storage := CreateStructuredStorage;
    storage.Initialize(CStorageFile, fmCreate);
    // 0 bytes
    TestFile(storage, '/small.dat', true, -1);
    TestFile(storage, '/small2.dat', true, -1);
    TestFile(storage, '/small2.dat', true);
    // cross the 257-block boundary
    TestFile(storage, '/large.dat', true, 64);
    storage := CreateStructuredStorage;
    storage.Initialize(CStorageFile, fmOpenRead);
    TestFile(storage, '/small.dat', false, -1);
    TestFile(storage, '/small2.dat', false);
    TestFile(storage, '/large.dat', false, 64);
    Result := true;
  except
    on E: Exception do
      Log(' '+E.Message);
  end;
end; { TForm1.TestGSSBigAndSmall }
The self-test fails!
I use Delphi 2007, version 11.0.2804.9245, on XP.
I can confirm the problem. I'll fix it as soon as I have a little spare time.
Thanks for reporting this!
It turned out to be only a bug in the test suite.
When you passed a value larger than 63 as the fourth parameter to the TestFile method, TestFile made some incorrect assumptions and compared a value that was larger than $FFFF to another value that had been truncated to two bytes. As a result, an error was raised (incorrectly).
I have updated the test suite at http://gp.17slon.com/gp/files/testgpstructuredstorage_src.zip.