Wednesday, April 6, 2016

On Data Files and Configuration in Delphi Programs

On a Delphi programmers' group on Facebook the other day, someone asked a question about configuration files. They were going to use an INI file to hold their data, and someone suggested the Registry. I chimed in that I find INI files superior to the Registry, with the proviso that one exception might be when your application needs to configure where its textual configuration files live. The reasons why textual configuration files (in JSON or INI format) are better than the Registry are:


  • Because it's easier to back up and restore application state.
  • Because for your support people, reading and inspecting a text file is easier than finding and inspecting the registry, which is in the end like one enormous INI file that describes the state of your entire system.
  • Because even if the system won't boot any more, it's easy to get that configuration data off the machine, either by booting from a live CD image like GParted or by putting the dead PC's disk in another machine after the motherboard has died. In my career this has turned out to be a good choice: data recovery has saved my customers' bacon when they were taught and told to make backups and didn't.
  • In my automated bug report submission systems, using MadExcept (or you could use EurekaLog), I find it helpful to attach log files (showing what happened to the app recently, on the client and the server side) and even some of the configuration files, so you know what options the user has configured.
  • In some domains we might even check the configurations into version control, or keep backup copies of configurations. It's harder to do that with a registry.
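To make the backup and inspection points above concrete, here is a minimal sketch, written in Python rather than Delphi purely for brevity. The function names, the `.bak` suffix, and the file layout are all hypothetical choices made for this illustration, not anything from a real product.

```python
import configparser
import shutil
from pathlib import Path


def save_settings(path: Path, settings: dict[str, dict[str, str]]) -> None:
    """Write settings as a plain INI file that support staff can open in any editor."""
    parser = configparser.ConfigParser()
    parser.read_dict(settings)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        parser.write(handle)


def load_settings(path: Path) -> dict[str, dict[str, str]]:
    """Read the INI file back into nested dictionaries (section -> key -> value)."""
    parser = configparser.ConfigParser()
    parser.read(path, encoding="utf-8")
    return {section: dict(parser[section]) for section in parser.sections()}


def backup_settings(path: Path, backup_dir: Path) -> Path:
    """Backing up a text configuration is just a file copy (hypothetical .bak naming)."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / (path.name + ".bak")
    shutil.copy2(path, target)
    return target
```

The same copy could just as easily be attached to a bug report or checked into version control, which is the point: no export step is needed. Note that Python's `configparser` lower-cases key names by default.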

Some interesting technical questions that remain:

  • Where should data files, including configuration files, be stored? For state that differs for each user, I believe in the user's AppData folders, either Local or Roaming, under a subfolder named for your company, with a subfolder under that for your product.
    Similarly, folders under C:\ProgramData should be used for global data that does not differ for each user who logs in. So for AwesomeSoft's product SuperTool, on my computer, C:\Users\warren\AppData\Local\AwesomeSoft\SuperTool is used. If the installer version is known and this folder is backed up, then your entire application state can be backed up and restored without even backing up the Program Files folder. Perhaps a local database or data file might also live here by default if it hasn't been configured and moved elsewhere.
  • Even back in the Windows XP era (2002), Microsoft gave clear guidance that non-privileged processes should not have write access to the Program Files folders. To preserve system stability, only the installer, which runs at an elevated privilege level and is started by a person with an appropriate system administration role, should be able to write to the Program Files folder. I consider hacking NTFS permissions on Program Files to be a bad practice on end-user machines. What I think is kind of hilarious, though, is that in 2016, Microsoft SQL Server 2016 still defaults to storing its primary SQL table data files under Program Files, instead of moving that global state to ProgramData, where Microsoft's own guidance would suggest it belongs. What do you do when even Microsoft's left hand doesn't follow Microsoft's right hand? I would say you do what is right, what is a solid engineering principle: protect users and their systems by separating your binary code and non-user-defined static state (maintained by installers and patch updaters) from your user-defined static state (your data folders).
  • When, if ever, are binary configuration files appropriate? I would say only when the problem domain or technical requirements cannot be met with a text file, for example if the configuration contains a million entries and really just needs to be a key-value store. In that case you should use a proven binary container like SQLite, and not invent your own shabby one or use some unmaintainable binary blob technology, like that B+ tree implementation you found somewhere random on the internet. Binary files are opaque, testing binary file reads and writes is a harder problem, and inspecting a binary file for damage is difficult, unless you choose something like SQLite that already has tests and file integrity checking built in.
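The folder layout described in the first point above can be sketched roughly as follows. This is Python rather than Delphi, and only an illustration: the `data_dir` function and its scope names are hypothetical, and it assumes the standard Windows environment variables LOCALAPPDATA, APPDATA, and PROGRAMDATA are set. A real Delphi program would use its own runtime library or the Windows shell API to find these folders.

```python
import os
from pathlib import PureWindowsPath
from typing import Mapping, Optional

# Hypothetical scope names mapped to the Windows environment variables
# that point at the per-user and all-users data folders.
_SCOPE_TO_ENV = {
    "user-local": "LOCALAPPDATA",
    "user-roaming": "APPDATA",
    "all-users": "PROGRAMDATA",
}


def data_dir(
    scope: str,
    company: str,
    product: str,
    env: Optional[Mapping[str, str]] = None,
) -> PureWindowsPath:
    """Return <base>\\<company>\\<product> for the requested scope."""
    env = os.environ if env is None else env
    variable = _SCOPE_TO_ENV[scope]  # KeyError for an unknown scope
    base = env.get(variable)
    if not base:
        raise LookupError(f"environment variable {variable} is not set")
    return PureWindowsPath(base) / company / product
```

For example, with LOCALAPPDATA set to C:\Users\warren\AppData\Local, `data_dir("user-local", "AwesomeSoft", "SuperTool")` yields the per-user folder from the text, while `"all-users"` yields the matching folder under C:\ProgramData. Nothing here ever points into Program Files.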

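For the case where a large key-value configuration really does call for a binary container, a minimal sketch of leaning on SQLite might look like this. It is Python rather than Delphi, the table name `settings` and the function names are hypothetical, and it only shows the idea: the store is a single file, and SQLite's own `PRAGMA integrity_check` does the damage inspection.

```python
import sqlite3
from typing import Optional


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) a single-file key-value store backed by SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def put(conn: sqlite3.Connection, key: str, value: str) -> None:
    """Insert or overwrite one entry inside a transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
            (key, value),
        )


def get(conn: sqlite3.Connection, key: str, default: Optional[str] = None) -> Optional[str]:
    """Return the stored value, or the default when the key is missing."""
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchone()
    return default if row is None else row[0]


def is_intact(conn: sqlite3.Connection) -> bool:
    """Ask SQLite to verify the file; a healthy database reports a single 'ok' row."""
    return conn.execute("PRAGMA integrity_check").fetchall() == [("ok",)]
```

A support tool could call `is_intact` before trusting a customer's file, which is exactly the kind of check you would otherwise have to write and test yourself for a home-grown binary format.
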
7 comments:

  1. Warren, how do you attach log files using MadExcept? I can't find anything in Settings to allow me to do that.

  2. 1. Registry exports are easy to create and restore.
    2. You have used RegEdit, right? Finding stuff is easy - it is all hierarchical, with well defined places for things.
    3. You can use RegEdit to view registry hives from offline machines, just install their HD via a USB drive holder.
    4. You can create readable versions of settings if required.
    5. Version control is an awful backup tool for deployed software.

  3. Xepol,

    1) They're not as easy as simply copying a file. I can't use rsync to sync things between machines. And this is 2016, we just don't use proprietary, clunky, platform-specific file formats anymore.

    2) Who wants to do that?

    3) See above.

    What Warren is describing is the Unix/Linux way of doing things. On Linux there is no registry; programs use their own text-based settings files. They're stored in a hidden folder in the user's home directory, often exactly as Warren said, e.g. .local/Mozilla/Firefox. It has all the benefits Warren listed, in addition to things like being able to reset a program to "fresh install" default settings simply by deleting its settings folder. When Linux programs don't see their settings folders they assume they've just been installed, recreate one with default settings, run a start-up wizard if necessary, etc.

    Text-based settings files make things easily scriptable/reproducible and allow your settings to be handled/processed by lots of different utilities. I should add that of course it's a cross-platform method that won't require rewriting if the software is ported in the future, unlike the registry. Warren definitely has the right idea here.

  4. 1) In *nix, every application uses its own proprietary syntax for configuration files. They are text based because *nix architecture is so old and outdated it understands little more.
    2) In Linux, each distro family puts some of them in different places. System-specific settings are not in the user home, anyway. Meanwhile I can run regedit, even access a remote machine, and find everything in a single place...
    3) What an application does if no settings are found depends only on how it is written. Many Windows applications also recreate settings if they are not found (it is a sensible thing to do).
    4) Registry access can be scripted as well. Acting on registry nodes is no different than acting on XML ones, for example. Sometimes it's even easier to do because of the clear tree structure and data types. Try to properly script changes to an Apache conf file...
    5) If you use OO properly, it doesn't really matter how you store your settings. It could be an INI, a registry branch, a YAML file, or an XML one, and for the app it's utterly transparent. The real problem is apps that try to run on Windows as if it were Linux.
    6) Today there are better remote management systems than trying to use rsync to keep machine settings updated. And some of them understand the registry too...

  5. Warren, thank you.
    It's reassuring to see the conclusions that we have come to over time, and through real-world pain, be confirmed by other developers.

    Your thoughts on the location of the configuration files are also very helpful, as that's one issue that we have still been debating the best approach for. On that subject I'd like to ask for more of your thoughts please.

    One side of our debate argues for using folders under AppData as you have described above, whereas the other side argues for somewhere completely independent (e.g. C:\CompanyName\ProductName) to avoid complications the next time Microsoft decide to change the rules and guidelines (again), and also to make the folders easier to find on a client system using Windows Explorer.

    Could you please provide any arguments one way or the other that might help to clarify this choice?

  6. Hello NCook, I have known MANY vertical-market Delphi companies that want a single folder that contains their binaries and their data. In my view this is a very good idea, and one that Unix systems have always allowed. Imagine the operating system and program folders (Windows, Program Files, and AppData) as ONE layer of the filesystem, conceptually. Then a SECOND layer is everything under C:\AWESOMESOFT, and under that you have C:\AWESOMESOFT\bin and C:\AWESOMESOFT\data. That's a FANTASTIC and easily understandable layout.

    In less vertical markets, where your product is purchased and installed by consumers, they will EXPECT you to live by the laws of the land, and that means following the advice I gave in my blog. Polluting the root directory is considered a bad idea by some users.

    Once a company has a 10-year tradition of doing it one way, I would say changing would be worse than not changing. Customers are used to the current practice, whatever it is. Even putting all your data in Program Files (which is objectively a bad idea) is still actually DONE by Microsoft. If I had to theorize on why, it's because they felt changing this would be too disruptive to companies that rely on the old behaviour. That makes it all the more important to consider wisely before you make your initial design. Changing this stuff later will provoke universal condemnation and confusion.

    Engineering never has simple one-size-fits-all answers, but there are some principles, like "make it easy to support your system", that lead me to advise against registry configuration (except where you have to use it, as with OLE server registration) and toward keeping most of your configuration in readable, editable text files, even if you plan to provide a GUI that makes editing those text files by hand unnecessary. In every major product I have built, end users do NOT want to edit text files. That doesn't make text files a bad idea.

    In every major product where binary configuration files have existed, we have found that they limited growth, crippled support people, and otherwise confused matters unnecessarily. All negative, no positive.

  7. Our vertical-market application has used INI files since 2000. Application-global settings are stored in an INI file located in the data directory (our app is a database app). Any user-specific settings are stored in the user's Local folders, as you've described.

    I've been very happy not having to walk users through making changes to the registry!
