Thursday, May 16, 2019

Opaque Identifier Validation

The content of an opaque identifier should not be validated except by the class that implements it. An API should not validate the format of an identifier passed to it. An API endpoint that accepts a required identifier should validate only that a value is provided. Of course, if the identifier is part of a RESTful endpoint's path, the router implicitly ensures it is provided. For security, an identifier may also be checked to ensure that it is not too long.
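As a minimal sketch of those two permitted checks, the function below verifies only that an identifier is present and not excessively long. The names check_identifier and InvalidRequest and the 256-character limit are assumptions made for illustration, not part of any particular framework.

```python
MAX_ID_LENGTH = 256  # hypothetical limit; choose one that suits the system


class InvalidRequest(ValueError):
    """Raised when a request is missing a required value or is oversized."""


def check_identifier(raw_id):
    """Check presence and length only; never inspect the ID's format."""
    if raw_id is None or raw_id == "":
        raise InvalidRequest("identifier is required")
    if len(raw_id) > MAX_ID_LENGTH:
        raise InvalidRequest("identifier is too long")
    # The value is returned unchanged and forwarded as an opaque string.
    return raw_id
```

Note that a numeric-looking ID such as "12345" and a symbolic one such as "cust-abc" are treated identically.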

Identifiers are generated by the system and should be considered opaque by both the caller and the API.

Every component in a system that validates the format of an identifier would need to be updated to support the new format whenever the format of the identifier changes. And formats do change. For example, perhaps an ID started as a numeric value, and components validate that the ID consists only of digits (i.e., 0-9). Later, when symbolic values are also needed for that ID, any component performing that validation would needlessly reject them. Only the component that needs to parse and interpret the ID should validate the ID. Knowledge of the identifier's format should be isolated to the class that produces the ID and the class that ultimately consumes it. Classes that accept an identifier and pass it down to lower classes should not care about the format of the ID.
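The scenario can be sketched with two hypothetical layers. OrderStore is the only class that interprets the ID, OrderService merely forwards it, and digits_only_describe shows the kind of check that breaks once symbolic IDs appear. All names and sample data here are invented for illustration.

```python
class OrderStore:
    """Lowest layer: the only place that interprets the ID."""

    def __init__(self):
        # One original numeric ID and one later symbolic ID.
        self._orders = {"1001": "widget", "promo-spring": "gift card"}

    def find(self, order_id):
        return self._orders.get(order_id)  # None when the ID is unknown


class OrderService:
    """Middle layer: forwards the ID without examining its format."""

    def __init__(self, store):
        self._store = store

    def describe(self, order_id):
        item = self._store.find(order_id)
        if item is None:
            return "not found"
        return f"order contains {item}"


def digits_only_describe(service, order_id):
    """Anti-pattern: a caller that assumes IDs are always numeric."""
    if not order_id.isdigit():
        raise ValueError("order ID must be numeric")
    return service.describe(order_id)
```

OrderService handles "promo-spring" without any change, while digits_only_describe rejects it even though the store knows the order.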

In the API, identifiers should be represented as strings (which includes serialized objects). If an identifier happens to be numeric, it should not be represented as a number within the API. It should be a string that happens to consist of a sequence of the characters "0" through "9". Number types (e.g., integer, long) should be reserved for values upon which mathematical operations can be performed, for example, a percentage, the number of items, a size or a dimension. Since arithmetic operations do not apply to an ID, identifiers should not be treated as numbers. Internally, an ID may be stored using a long or Guid for better performance. Within the system's own code, a change to the ID's type can easily be caught by the compiler.
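A small sketch of that boundary, assuming a record whose internal key is an integer: the key is converted to a string when it leaves the API and converted back only by the code that owns the storage format. The function names to_api and from_api are hypothetical.

```python
import json


def to_api(record_key, name):
    """Serialize a record; the integer key leaves the system as a string."""
    return json.dumps({"id": str(record_key), "name": name})


def from_api(api_id):
    """Convert an API identifier back to the internal key.

    Only this function knows the key is an integer today. int() raises
    ValueError for anything else, which the caller can report as
    "not found" rather than as a format error.
    """
    return int(api_id)
```

If the internal key later changes to a different type, only these two functions need to change; the JSON seen by callers still carries a string.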

Identifiers that are system-wide data types may be represented by a concrete class. The class may perform validation because it "owns" the format. Thus, when parsing a string, it may check that the string is in the format of a GUID. By only permitting the identifier class to validate the contents of the value, information about the format remains hidden from the various components that receive and forward identifiers. Using a class for the identifier provides the additional benefit that readability is improved by using semantic types (e.g., CustomerID) instead of primitive data types (e.g., Guid).
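One way such a class might look is sketched below, using Python's standard uuid module in place of a Guid. The class name CustomerId and its parse method are illustrative assumptions; the point is that the format check lives in exactly one place.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerId:
    """Semantic identifier type; the only code that knows the ID is a UUID."""

    _value: uuid.UUID

    @classmethod
    def parse(cls, text):
        # uuid.UUID raises ValueError when the text is not a well-formed UUID.
        return cls(uuid.UUID(text))

    def __str__(self):
        # Callers and the API only ever see the string form.
        return str(self._value)
```

Components that receive a CustomerId can compare it, store it, or pass it along without knowing what is inside. If the format changes, only parse and __str__ need to be revisited.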

Within a UI, validation helps guide the user to enter proper values. A user, however, should never be asked to enter an opaque identifier. Values entered by a user should be validated by both the UI and the backing API: the former to help the user succeed and the latter to protect the persisted data. With an API, a developer may need to pass an opaque identifier, but that ID almost certainly was returned by a previous API call. There is no need for the API to explicitly examine the format of the ID. The distinction between an ID that may succeed and one which definitely won't is slight indeed.
