Saturday, August 16, 2008

Data Duck Typing

Some friends of mine at 1060 Research recently sent me a new version of some software they are working on. After reading one of the XML configuration files, I asked if they had an XML Schema for it (which would define the grammar for legal configurations). The answer was that they did not, as they were moving away from formal grammars and towards a rule-based approach like Schematron, which uses a set of pattern assertions (rules) for XML validation.

When I thought about this approach, I realized that it is duck typing for data. In object-oriented programming, duck typing means that an object's behavior, rather than its class or inheritance structure, determines how it is interpreted and used. Applying a rule-based system to categorize a data file or message is a data-oriented form of duck typing. With "data duck typing", data is categorized (in this case, validated) by whether it has the right elements in the right locations.
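For readers less familiar with the object-oriented version, here is a minimal Python sketch of duck typing. The class and function names (`Duck`, `Robot`, `Rock`, `can_quack`, `make_it_quack`) are made up for illustration; the point is only that the caller checks for behavior and never asks what class it was handed.

```python
class Duck:
    def quack(self):
        return "Quack"


class Robot:
    # Unrelated to Duck by inheritance, but it offers the same behavior.
    def quack(self):
        return "Beep-quack"


class Rock:
    # No quack method at all.
    pass


def can_quack(thing):
    """Accept anything that has a callable 'quack', whatever its class."""
    return callable(getattr(thing, "quack", None))


def make_it_quack(thing):
    # No isinstance check: behavior alone decides whether this works.
    return thing.quack()


for thing in (Duck(), Robot(), Rock()):
    if can_quack(thing):
        print(type(thing).__name__, "->", make_it_quack(thing))
    else:
        print(type(thing).__name__, "is not usable as a duck")
```

`Robot` is accepted even though it shares no ancestry with `Duck`, which is the same kind of leniency the rule-based approach extends to data.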

Data duck typing means that a data file does not have to fully conform to a specific, rigid grammar as long as the parts that matter meet the requirements of the particular rule set used for categorization. Thus, data messages for an application can come in all shapes and sizes as long as they contain the essential required elements with the right structural relationships. Applications that use this approach embody the design principle which says "be lenient in the messages that you accept" and will be much more flexible than applications based on rigid adherence to formal grammars.
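To make this concrete, here is a rough sketch in Python using only the standard library's `xml.etree.ElementTree`. It is not Schematron and not how 1060 Research validates its files; the element names (`config`, `name`, `endpoint`, `uri`) and the rule list are invented for the example. Each rule is a path that must match somewhere in the document, and anything the rules do not mention is simply ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set: (description, path that must match at least once).
RULES = [
    ("config has a name", "./name"),
    ("at least one endpoint has a uri", "./endpoint/uri"),
]


def failed_rules(xml_text, rules=RULES):
    """Return the descriptions of the rules the document does not satisfy."""
    root = ET.fromstring(xml_text)
    return [desc for desc, path in rules if root.find(path) is None]


# Exactly the elements the rules ask for.
minimal = "<config><name>a</name><endpoint><uri>x:1</uri></endpoint></config>"

# Extra elements the rules know nothing about; still acceptable.
extended = (
    "<config><name>a</name><comment>hi</comment>"
    "<endpoint><uri>x:1</uri><weight>3</weight></endpoint></config>"
)

# A required element is absent, so one rule fails.
incomplete = "<config><endpoint><uri>x:1</uri></endpoint></config>"

for label, doc in (("minimal", minimal), ("extended", extended), ("incomplete", incomplete)):
    print(label, "->", failed_rules(doc) or "ok")
```

A grammar-based validator would typically reject the `extended` document unless the schema had been written to allow the extra elements; the rule set only cares that the essential elements are present in the right places.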
