Michael Angelo Ravera
9/6/2011 6:45:00 AM
On Saturday, September 3, 2011 6:04:23 AM UTC-7, pozz wrote:
> Suppose I have a structure:
>
> typedef struct {
>     int version;
>     DUMMY dummy;
>     FOO foo;
>     BAR bars[128];
> } CONFIG;
>
> stored in a "config.dat" file with fwrite(). At startup, the
> application opens the file and reads the configuration. I think it is a
> normal approach to store the configuration of an application in a
> non-volatile way.
> Of course, there are many file types for storing application
> configuration (INI, XML, CSV, database...), but in my case a pure binary
> file is sufficient and simple to use.
>
> Now suppose I have a new version of the software and a new version of
> the CONFIG structure:
>
> typedef struct {
>     int version;
>     DUMMY dummy;
>     FOOOLD foo;
>     BAR bars[128];
> } CONFIGOLD;
>
> typedef struct {
>     int version;
>     DUMMY dummy;
>     FOO foo;
>     NEWELEM newelem;
>     BAR bars[256];
> } CONFIG;
>
> Note that some elements are inserted in the middle of the structure, the
> size of the array bars is changed and the definition of sub-structure
> (FOO in the example) is also changed.
>
> I want to write a function that opens the configuration file and, based
> on the version, either reads the configuration or upgrades the
> configuration file.
>
> Normally I would open the file, read the version and, if it is old,
> read the old configuration structure, copy it to the new configuration
> structure (adapting it as needed), delete the old file and write the
> new structure to a new file. Something similar to this (without error
> checking):
>
> int fd;
> int i;
> CONFIG cfg;
> fd = open("config.dat", O_RDONLY);
> read(fd, &cfg.version, sizeof(cfg.version));
> if (cfg.version == 2) {
>     lseek(fd, 0, SEEK_SET);
>     read(fd, &cfg, sizeof(cfg));
>     close(fd);
> } else if (cfg.version == 1) {
>     CONFIGOLD cfgold;
>     BAR bar_default = { ... };
>     lseek(fd, 0, SEEK_SET);
>     read(fd, &cfgold, sizeof(cfgold));
>     /* Copy from old to new configuration, filling the new elements
>      * with default values */
>     cfg.version = 2;
>     cfg.dummy = cfgold.dummy;
>     <...adapt cfgold.foo to cfg.foo, it's application dependent...>
>     cfg.newelem = newelem_default;
>     memcpy(cfg.bars, cfgold.bars, 128 * sizeof(BAR));
>     /* bar_default is a single BAR, so fill the new slots one at a time */
>     for (i = 128; i < 256; i++)
>         cfg.bars[i] = bar_default;
>     close(fd);
>     remove("config.dat");
>     /* O_CREAT requires the third (mode) argument */
>     fd = open("config.dat", O_WRONLY | O_CREAT, 0644);
>     write(fd, &cfg, sizeof(cfg));
>     close(fd);
> }
>
> This algorithm requires keeping both structures in RAM, which I can't
> afford on my embedded platform with its small amount of memory. So I
> have to take a different approach: open the file with the old
> configuration and create a new file with the new configuration.
> The upgrade will be made field by field, reading a field from the old
> file and writing it to the new file. Afterwards, I can delete the old
> file and rename the new file. Something similar to this:
>
> int fd;
> CONFIG cfg;
> fd = open("config.dat", O_RDONLY);
> read(fd, &cfg.version, sizeof(cfg.version));
> if (cfg.version == 2) {
>     lseek(fd, 0, SEEK_SET);
>     read(fd, &cfg, sizeof(cfg));
>     close(fd);
> } else if (cfg.version == 1) {
>     int fdnew;
>     BAR bar_default = { ... };
>     /* O_CREAT (with a mode) so that the new file is actually created */
>     fdnew = open("config.new", O_WRONLY | O_CREAT | O_TRUNC, 0644);
>     cfg.version = 2;
>     write(fdnew, &cfg.version, sizeof(cfg.version));
>
>     { /* dummy */
>         /* !!! I'm not sure whether to read dummy here or after some padding */
>         read(fd, &cfg.dummy, sizeof(cfg.dummy));
>         /* !!! I'm not sure whether to write dummy here... */
>         write(fdnew, &cfg.dummy, sizeof(cfg.dummy));
>     }
>     { /* foo */
>         ...
>     }
>     ...
>
>     close(fd);
>     close(fdnew);
>     remove("config.dat");
>     rename("config.new", "config.dat");
> }
>
> The problem I couldn't solve is related to the reading/writing of each
> field. Indeed, between fields the compiler could add padding bytes, so
> reading/writing the entire structure (with padding) is completely
> different than reading/writing field by field (without padding).
>
> I think the solution is to calculate the offset of each field and move
> the current position with lseek() accordingly. Something similar to
> this for reading:
>
> lseek(fd,
>       offsetof(CONFIGOLD, dummy) - lseek(fd, 0, SEEK_CUR),
>       SEEK_CUR);
> read(fd, &cfg.dummy, sizeof(cfg.dummy));
>
> In other words, I move the position to the exact position of dummy field
> (skipping padding bytes, if any), starting from the current position.
> And for writing...
>
> lseek(fdnew,
>       offsetof(CONFIG, dummy) - lseek(fdnew, 0, SEEK_CUR),
>       SEEK_CUR);
> write(fdnew, &cfg.dummy, sizeof(cfg.dummy));
>
> Here, seeking past the end of the file with lseek() works, and the
> subsequent write fills the intermediate bytes (between the old end of
> file and the new current position) with zeros.
>
> What do you think? Do you have other better suggestions?
I did not look at your code in enough depth to say whether it is correct as far as it goes.
The best approach, provided the configuration file's size is within reason, is to read the whole damn file and use the version to set the canonical (probably latest) configuration variables. Basically, create a union into which all configuration versions fit and read the largest one.
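For what it's worth, here is a rough sketch of how that union could look. It is only one way to do it, not tested against your real types: the definitions of DUMMY, FOOOLD, FOO, NEWELEM and BAR are made-up placeholders, and the names CONFIG_ANY, load_config() and upgrade_v1_in_place() are my own invention. It assumes the file was written with fwrite() by a build with the same compiler and ABI, so the structure padding in the file matches the padding in memory. The old layout is converted in place inside the union, so only one full-size buffer plus a small FOOOLD temporary is needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Placeholder field types: hypothetical, substitute the real ones. */
typedef struct { int id; } DUMMY;
typedef struct { short level; } FOOOLD;
typedef struct { long level; int flags; } FOO;
typedef struct { int enabled; } NEWELEM;
typedef struct { int value; } BAR;

#define OLD_BARS 128
#define NEW_BARS 256

typedef struct {
    int version; DUMMY dummy; FOOOLD foo; BAR bars[OLD_BARS];
} CONFIGOLD;

typedef struct {
    int version; DUMMY dummy; FOO foo; NEWELEM newelem; BAR bars[NEW_BARS];
} CONFIG;

/* One buffer big enough for every known version of the file. */
typedef union {
    int version;      /* first member of every layout */
    CONFIGOLD v1;
    CONFIG v2;        /* canonical (latest) layout */
} CONFIG_ANY;

/* Hypothetical defaults for the fields that did not exist in version 1. */
static const BAR BAR_DEFAULT = { -1 };
static const NEWELEM NEWELEM_DEFAULT = { 0 };

/* Convert a version 1 image held in the union to version 2, in place. */
static void upgrade_v1_in_place(CONFIG_ANY *u)
{
    FOOOLD old_foo = u->v1.foo;   /* save it before its bytes are reused */
    size_t i;

    /* Old and new bars overlap inside the union, hence memmove(). */
    memmove(u->v2.bars, u->v1.bars, sizeof u->v1.bars);
    for (i = OLD_BARS; i < NEW_BARS; i++)
        u->v2.bars[i] = BAR_DEFAULT;

    /* newelem and foo lie below v2.bars, so they cannot clobber it. */
    u->v2.newelem = NEWELEM_DEFAULT;
    u->v2.foo.level = old_foo.level;   /* application-dependent adaptation */
    u->v2.foo.flags = 0;

    /* version and dummy sit at the same offsets in both layouts. */
    u->v2.version = 2;
}

/* Read the whole file; on success u->v2 holds the canonical config.
 * Returns 0 on success, -1 on any error or unknown version. */
int load_config(const char *path, CONFIG_ANY *u)
{
    FILE *fp;
    size_t n;

    assert(path != NULL && u != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    memset(u, 0, sizeof *u);
    n = fread(u, 1, sizeof *u, fp);   /* reads at most the largest layout */
    fclose(fp);
    if (n < sizeof u->version)
        return -1;

    switch (u->version) {
    case 2:
        return n == sizeof(CONFIG) ? 0 : -1;
    case 1:
        if (n != sizeof(CONFIGOLD))
            return -1;
        upgrade_v1_in_place(u);
        return 0;
    default:
        return -1;
    }
}
```

After a successful load_config("config.dat", &u), the rest of the program only ever looks at u.v2, whatever version was on disk. Since the compiler lays out both structures itself, the padding question goes away as long as the reader and writer agree on the ABI.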
I wouldn't bother writing an update as long as you have the ability to read old configuration file formats.
There are usually better ways to handle configurations than a binary file, but binary files certainly can be made to work.