On the Importance of Protocols

2017-12-13 12:10

Introduction

A protocol specifies the interactions of different entities. In the context of computing, networking, and other automated systems, this specifies the interactions between software. This can be between different instances of the same software, as is the case for networked video games, where each player runs a copy of the game, and those copies interact to provide the multiplayer experience. Or it can be between different software, as is the case for web browsers and web servers.

In broad terms, a protocol specifies not only the interactions, but also some amount of meaning, or interpretation, associated with those interactions. It provides not only constraints on the observable behavior of an entity (from the perspective of the other participating entities), but also a model of the internal operation of that entity. For example, HTTP specifies that certain of its methods are safe or idempotent in their semantics. An HTTP server would not otherwise be required to behave that way if the protocol did not specify that aspect of the server's behavior.
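To make the idempotence example concrete, here is a minimal sketch in Python. The `ResourceStore` class and its `put` and `post` methods are hypothetical stand-ins for the state behind a server and are not taken from any HTTP library. They only illustrate why repeating an idempotent request leaves the server in the same state, while repeating a non-idempotent one does not.

```python
class ResourceStore:
    """Hypothetical in-memory stand-in for the state behind an HTTP server."""

    def __init__(self):
        self.resources = {}

    def put(self, path, body):
        # PUT-like: replace whatever is stored at `path`.
        # Applying it twice leaves the same state as applying it once.
        self.resources[path] = body

    def post(self, path, item):
        # POST-like: append a new item under `path`.
        # Applying it twice adds two items, so it is not idempotent.
        self.resources.setdefault(path, []).append(item)


store = ResourceStore()
store.put("/profile", "v1")
store.put("/profile", "v1")      # repeated request, state unchanged
store.post("/comments", "hi")
store.post("/comments", "hi")    # repeated request, state grows
```

A client that knows a method is idempotent can safely retry it after a timeout. That knowledge comes from the protocol's model of the server, not from anything visible in a single message.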

A protocol is intuitively valuable as a tool for describing communications. Even though in computing we are somewhat unused to describing APIs or other more intimate details of our software in terms of protocols, we can think of those as protocols too: definitions of interactions and constraints on those interactions.

What I seek to explore herein is the usefulness and importance of this description, independent of any standardization or uniformity. A standards document, published by a standards organization or by a de facto implementer (cf. the APIs of many popular Internet-scale companies), will often provide a suitably detailed description, but a protocol need not be widely used, widely implemented, or defined between different entities to be usefully described.

However, a protocol must be carefully and completely defined, to be useful. At least, it is only as useful as it is completely and carefully defined. A partially specified protocol is more useful than an unspecified one, but less useful than a fully specified one.

What becomes more useful than a single (well-specified) protocol is a collection of them. Often it is simpler to build complex systems out of parts, of which implementations of protocols are especially useful building blocks. The assembly of these parts to form a solution (to a problem, or with reference to a set of requirements, which form an implicit problem to 'solve') is often the majority of the work in a software project.

Existing implementations of suitable protocols are reused, and new protocols (APIs) are developed in order to meet all the requirements. The 'glue' that joins together these parts is, to me, the interesting and valuable piece. In cases where the glue joins together two protocols with similar functions (as would be the case for ISUP and SIP), we would generally label the glue part as an 'interworking' part. This part implements both protocols, and performs some mapping between them, potentially a very complex one.

Given no obvious limit on the complexity of that interworking, any software that implements more than one protocol could be said to 'interwork' those protocols. For example, we could describe a web application as 'interworking' HTTP and a SQL database. This is not commonly referred to as interworking; generally the web application itself would be broken down into smaller parts. Viewed in aggregate, a web application turns an HTTP request into a series of database queries and an HTTP response, but that is not always a useful level of abstraction, especially when talking about the behavior of the web application itself. Within a web application, however, different parts define their own protocols (usually called APIs, in this context), which define the interactions between the component pieces of the application. The part of the application that processes the HTTP request and produces an object representation of that request, used by the rest of the application, could reasonably be said to interwork the HTTP protocol with an application-internal request protocol.
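As a rough sketch of that last case, the following Python maps a raw HTTP/1.1 request onto an application-internal request object. The `AppRequest` type and the `interwork_http` function are hypothetical names invented for illustration, and the parsing is deliberately simplified (no chunked bodies, no repeated or folded headers), so it should be read as a picture of the 'glue', not as a usable HTTP parser.

```python
from dataclasses import dataclass, field


@dataclass
class AppRequest:
    """Hypothetical application-internal request representation."""
    method: str
    path: str
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def interwork_http(raw: bytes) -> AppRequest:
    """Map a raw HTTP/1.1 request onto the internal request protocol."""
    # The header section ends at the first blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: METHOD SP TARGET SP VERSION
    method, target, _version = lines[0].split(" ", 2)

    # Header names are case-insensitive, so normalize them to lowercase.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return AppRequest(method=method, path=target, headers=headers, body=body)


request = interwork_http(
    b"GET /users/42 HTTP/1.1\r\nHost: example.com\r\n\r\n"
)
```

The rest of the application then depends only on `AppRequest`, so the internal protocol can be specified and documented without reference to HTTP's wire format.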

At what level we stop decomposing the software depends on the goal of our description and the level of abstraction required for the description. There seems to be no fundamental reason we could not fully specify the protocols used for every layered abstraction. Practical limits on resources for a project would be what force the decisions.

The ability to interwork protocols is critical to successfully assembling a solution from parts that use various protocols in their interactions. Assembling parts, as a design, is often motivated by the parts already existing, and thus requiring effort only to assemble, not to implement. (This need not be the case. Assembly from well-specified parts also allows for division of labor, so that implementation can proceed in parallel, and it leaves open the potential to reuse those parts in the future.)

The role of standards is to limit the required amount of interworking, so that solutions can use these standard protocols and be assembled without much glue. Standard protocols represent something akin to interchangeable parts in software.

Since standards provide an efficiency, in that they reduce the necessary interworking to solve a problem, I posit that the implementation of interworking is an extremely valuable problem to solve.

Motivating the use of "protocol"

Since I am using the word "protocol" in a very expansive sense, I would like to motivate the choice of word, as contrasted with other options.

Firstly, outside of the space of networking, "protocol" is not often used, and expanding the use of the word to other areas, such as internal to software, provides a novel word in those contexts, not to be confused with language features or specific existing practices.

Within the space of networking, "protocol" usage is consistent with my description in the Introduction.

The documentation, specificity, and completeness that I emphasize are not 'naturally' occurring properties of protocols. It is the importance of those properties, and the recognition of that importance, that I wish to discuss in the context of developing software systems.

Why focus on protocols?

As discussed above, a protocol is the software 'thing' that is being specified. This is in contrast to most API documentation, or documentation produced for the "internals" of a software system.

These sorts of internal documentation are not often useful, and are expensive to create, in terms of developer time. It is easy to get bogged down in the documentation of the 'easy' parts, or in creating documentation for parts of the software that are not needed to support other developers (or users).

Documenting the protocols for the software provides a framework to organize around when documenting a system. It allows abstraction to be handled cleanly at the boundaries between protocols, and it prompts a decision about how much abstraction each protocol provides.

The documentation of a protocol is more than simply a definition of contracts for interacting, in the sense of a simple precondition/postcondition/ensures/guarantees kind of contract.
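For reference, the kind of simple contract meant here can be sketched as follows. The `withdraw` function is a hypothetical example chosen only to show a precondition and a postcondition around a single call.

```python
def withdraw(balance: int, amount: int) -> int:
    """Hypothetical operation guarded by a simple contract."""
    # Precondition: the caller must request a positive amount
    # that does not exceed the current balance.
    assert 0 < amount <= balance, "precondition violated"

    new_balance = balance - amount

    # Postcondition: the result is never negative and is
    # strictly smaller than the starting balance.
    assert 0 <= new_balance < balance, "postcondition violated"
    return new_balance


remaining = withdraw(100, 30)
```

A contract of this kind constrains one call in isolation. It says nothing about the meaning attached to the interaction or about the model of the other entity's internal operation, which the Introduction describes as part of a protocol.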

The definition of a protocol, through documentation, is a way to provide useful documentation to other developers (and users).