By Stan Schneider, CEO, Real-Time Innovations
Kings Play Chess On Fine Glass Stools. Anyone remember this?
For most, that is probably gibberish. But not for me. This mnemonic helps me remember the taxonomy of life: Kingdom, Phylum, Class, Order, Family, Genus, Species.
The breadth and depth and variety of life on Earth are overwhelming. A taxonomy logically divides types of systems by their characteristics. The Science of Biology would be impossible without a good taxonomy. It allows scientists to classify plants and animals into logical types, identify commonalities, and construct rules for understanding whole classes of living systems.
The breadth and depth and variety of the Industrial Internet of Things (IIoT) are also overwhelming. The Science of IIoT Systems needs a similar, organized taxonomy of application types. Only then can we proceed to discuss appropriate architectures and technologies to implement systems.
The first problem is to choose top-level divisions. In the Animal Kingdom, you could label most animals as "land, sea, or air" animals. However, those environmental descriptions don't help much in understanding animals. The "architecture" of a whale is not much like that of an octopus, but it is very like that of a bear. To be understood, animals must be divided by their characteristics and architecture.
It is also not useful to divide applications by industry, such as "medical, transportation, and power". While these environments are important, the applicable architectures and technologies simply do not split along industry lines. Here again, we need a deeper understanding of the characteristics that define the major challenges, and those challenges will determine architecture.
I realize that this is a powerful, even shocking claim. It implies, for instance, that the bespoke standards, protocols, and architectures in each industry are not useful ways to design the future architectures of IIoT systems. Nonetheless, it is clearly borne out by the systems in the field. As in the transformation that became the enterprise Internet, generic technologies will replace special-purpose approaches. To grow our understanding and realize the promise of the IIoT, we must abandon our old industry-specific thinking.
So, what can we use for divisions? What defining characteristics can we use to separate the Mammals from the Reptiles from the Insects of the IIoT?
There are thousands and thousands of requirements, both functional and non-functional, that could be used. As with animals, we need to find those few requirements that divide the space into useful, major categories.
The task is simplified by the realization that the goal is to divide the space so we can determine system architecture. Thus, good division criteria are a) unambiguous and b) impactful on the architecture. That may sound easy, but it is far from trivial. The only way to do it is through experience. We are early on our quest. However, significant progress is within our collective grasp.
From RTI's extensive experience with nearly 1000 real-world IIoT applications, I suggest a few early divisions below. To be as crisp as possible, I have also chosen a "metric" for each division. The lines, of course, are not that stark. But the numbers force clarity, and that is critical; without numerical yardsticks (meter sticks?), meaning is too often lost.
IIoT Taxonomy Proposal
Reliability [Metric: Continuous availability must be better than 99.999%]
We can't be satisfied with the platitude "highly reliable". Almost everything "needs" that. To be meaningful, we must be more specific about the architectural demands of achieving that reliability. That requires understanding how quickly a failure causes problems and how bad those problems are.
We have found that the simplest, most useful way to categorize reliability is to ask: "What are the consequences of unexpected failure for 5 minutes per year?" (We choose 5min/yr here only because that is the "5-9s" golden specification for enterprise-class servers. Many industrial systems cannot tolerate even a few milliseconds of unexpected downtime.)
This is an important characteristic because it greatly impacts the system architecture. A system that cannot fail, even for a short time, must support redundant computing, sensors, networking, and more. When reliability is truly critical, it quickly becomes a key architectural driver, perhaps the key driver.
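For intuition, the "5-9s" arithmetic is easy to check. Here is a quick sketch in plain Python; nothing is assumed beyond the availability targets themselves:

```python
# Allowed annual downtime at a given availability level.
MINUTES_PER_YEAR = 365.25 * 24 * 60

for label, availability in [("3-9s", 0.999), ("4-9s", 0.9999),
                            ("5-9s", 0.99999), ("6-9s", 0.999999)]:
    downtime = MINUTES_PER_YEAR * (1 - availability)
    print(f"{label}: {downtime:7.2f} min/yr of allowed downtime")
```

Five 9s allows about 5.3 minutes per year; six 9s, barely half a minute. The budget shrinks fast, and the architecture must keep up.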
Real Time [Metric: Response <100ms]
There are thousands of ways to characterize "real time". All systems should be "fast". But to be useful, we must specifically understand which speed requirements drive architecture.
An architecture that can satisfy a human user unwilling to wait more than 8 seconds for a website will never satisfy an industrial control that must respond in 2ms. We find the "knee in the curve" that greatly impacts design occurs when the required speed of response is measured in a few tens of milliseconds (ms) or even microseconds (µs). We choose 100ms simply because that is roughly the jitter (worst-case added latency) that a server or broker in the data path can impose. Systems that must respond faster than this usually must be peer-to-peer, and that is a huge architectural impact.
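To see why the broker matters at these timescales, consider a rough latency budget for a single update. Every number below is an assumption chosen for illustration, not a measurement; the structure is the point:

```python
# Illustrative latency budget (all figures assumed, not measured).
# A brokered path crosses the network twice and queues at the broker;
# a peer-to-peer path crosses it once.
network_hop_ms = 0.5      # assumed one-way LAN latency
broker_queue_ms = 2.0     # assumed typical broker processing time
broker_jitter_ms = 100.0  # assumed worst case under load (bursts, GC, paging)

peer_to_peer = network_hop_ms
brokered_typical = 2 * network_hop_ms + broker_queue_ms
brokered_worst = 2 * network_hop_ms + broker_jitter_ms

print(f"peer-to-peer:      ~{peer_to_peer} ms")
print(f"brokered, typical: ~{brokered_typical} ms")
print(f"brokered, worst:   ~{brokered_worst} ms  # blows a 2 ms control loop")
```

A control loop with a 2ms deadline cannot live with even the typical brokered path, let alone the worst case.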
Data Set Scale [Metric: Data set size >10,000 items]
Again, there are thousands of dimensions to scale, including number of "nodes", number of applications, number of data items, and more. We cannot divide the space by all these parameters. In practice, they are related. For instance, a system with many data items probably has many nodes.
Despite the broad space, we have found that two simple questions correlate with architectural requirements. The first is "data set size", and the knee in the curve is about 10k items. When systems get this big, it is no longer practical to send every data update to every possible receiver. So, managing the data itself becomes a key architectural need. These systems need a "data centric" design that explicitly models the data, thereby allowing selective filtering and delivery.
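Here is a minimal sketch of what "data centric" means in practice. The DataBus class and its API are invented stand-ins for real middleware; the point is that subscribers filter on the content of the data model, so they never receive the other ten thousand items:

```python
from dataclasses import dataclass

@dataclass
class Temperature:
    sensor_id: str
    zone: str
    celsius: float

# Toy in-process "data bus". Real middleware does this across the
# network, but the shape of the interaction is the same.
class DataBus:
    def __init__(self):
        self._subscriptions = []

    def subscribe(self, predicate, on_data):
        # The predicate filters on the *content* of the data model.
        self._subscriptions.append((predicate, on_data))

    def publish(self, sample):
        for predicate, on_data in self._subscriptions:
            if predicate(sample):  # selective delivery
                on_data(sample)

bus = DataBus()
bus.subscribe(lambda t: t.zone == "boiler_room" and t.celsius > 80.0,
              lambda t: print("alarm:", t))

bus.publish(Temperature("T-0042", "boiler_room", 97.3))  # delivered
bus.publish(Temperature("T-1918", "lobby", 21.5))        # filtered out
```

In a real system the filtering should happen at or near the source, so the 10k+ items never even cross the network to uninterested receivers.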
Team or Application Scale [Metric: number of teams or interacting applications >10]
The second scale parameter we choose is the number of teams or independently developed applications on the "project", with a break point around 10. When many independent groups of developers build applications that must interact, data interface control dominates the interoperability challenge. Again, this often indicates the need for a data-centric design that explicitly models and manages these interfaces.
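A minimal sketch of the idea, with invented type and field names: the data interface is a single, versioned definition that every team codes against, so changing it is an explicit, reviewable event rather than silent drift between ten codebases.

```python
from dataclasses import dataclass

# The shared, versioned data contract. Teams depend on this definition,
# not on each other's application code.
SCHEMA_VERSION = 2

@dataclass(frozen=True)
class PumpStatus:
    pump_id: str
    rpm: float
    inlet_pressure_kpa: float
    fault_code: int = 0  # added in v2; defaulted so v1-era readers still work
```

In data-centric middleware the same role is played by formally defined types (in DDS, for example, types defined in IDL); the mechanism differs, but the principle is the same.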
Device Data Discovery Challenge [Metric: >20 types of devices with multi-variable data sets]
Some IIoT systems can (or even must) be configured and understood before runtime. This does not mean that every data source and sink is known, but rather only that this configuration is relatively static.
However, when IIoT systems integrate racks and racks of machines or devices, they must often be configured and understood during operation. For instance, a plant controller HMI may need to discover the characteristics of an installed device or rack so a user can choose data to monitor. The choice of "20" different devices is arbitrary. The point: when the devices in a rack can have many different configurations, this "introspection" becomes an important architectural need to avoid manual gymnastics. Most systems with this characteristic have many more than 20 device types.
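A toy sketch of the introspection pattern, with invented names: each device publishes a machine-readable description of its data set, and the HMI queries those descriptions at runtime instead of being compiled against every device type.

```python
from dataclasses import dataclass

@dataclass
class DeviceDescription:
    device_type: str
    variables: dict[str, str]  # variable name -> engineering unit

# Toy registry. Real middleware learns this via built-in discovery
# traffic rather than a central dictionary.
registry: dict[str, DeviceDescription] = {}

def announce(device_id: str, desc: DeviceDescription):
    registry[device_id] = desc

# Devices announce themselves when installed in the rack...
announce("rack3/slot1", DeviceDescription(
    "vfd_drive", {"speed": "rpm", "torque": "Nm", "temp": "degC"}))
announce("rack3/slot2", DeviceDescription(
    "flow_meter", {"flow": "L/min", "total": "L"}))

# ...and the HMI discovers what is monitorable with no prior knowledge.
for device_id, desc in registry.items():
    print(device_id, desc.device_type, "->", sorted(desc.variables))
```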
Distribution Focus [Metric: Fan out >10]
We define "fan out" as the number of data recipients that must be informed upon change of a single data item. This impacts architecture because many protocols work through single 1:1 connections. Most of the enterprise world works this way, often with TCP, a 1:1 session protocol. Examples include connecting a browser to a web server, a phone app to a backend, or a bank to a credit card company.
However, IIoT systems often need to distribute information to many more destinations. If a single data item must go to many destinations, the architecture must support efficient multiple updates. When fan out exceeds 10 or so, it becomes impractical to do this branching by managing a set of 1:1 connections.
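The difference shows up directly in the publisher's write path. A schematic comparison (toy code, with counters standing in for network writes) shows why fan out drives the cost of the 1:1 approach:

```python
sends = {"unicast": 0, "multicast": 0}

def publish_connection_oriented(update, subscribers):
    # 1:1 style: one send (and one managed connection) per reader.
    for dest in subscribers:
        sends["unicast"] += 1  # stand-in for a per-socket TCP write

def publish_data_centric(update):
    # Distribution style: one write; the branching happens below the
    # application (e.g., UDP multicast, or inside the middleware).
    sends["multicast"] += 1

subscribers = [f"node-{i}" for i in range(25)]  # fan out of 25
for _ in range(1000):                           # say, 1000 updates
    publish_connection_oriented("sample", subscribers)
    publish_data_centric("sample")

print(sends)  # {'unicast': 25000, 'multicast': 1000}
```

The 1:1 publisher's cost and connection count grow with every new reader; the data-centric publisher's stay flat.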
Collection Focus [Metric: One-way data flow with fan in >100]
Systems that are essentially restricted to the collection problem do not share significant data between devices. They instead transmit copious information to be stored or analyzed in higher-level servers or the cloud.
This has huge architectural impact. Collection systems can often use a hub-and-spoke "concentrator" or even a cloud-based server design.
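A toy sketch of a concentrator, standing in for a cloud ingestion endpoint (all names invented): many devices push one-way readings up, nothing flows back down, and that asymmetry is what makes hub-and-spoke viable.

```python
from collections import defaultdict

# Toy concentrator: devices push one-way readings "up" for storage
# and analytics; no data flows back down or between devices.
class Concentrator:
    def __init__(self):
        self.history = defaultdict(list)

    def ingest(self, device_id: str, reading: float):
        self.history[device_id].append(reading)

hub = Concentrator()
for i in range(500):  # fan in of 500 devices
    hub.ingest(f"meter-{i}", 0.1 * i)

print(len(hub.history), "devices reporting; zero device-to-device traffic")
```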
Taxonomy Benefits
Defining an IIoT taxonomy will not be trivial. This blog just scratches the surface. However, the benefits are enormous. Understanding these divisions will help system architects choose protocols, network topologies, and compute capabilities. Today, we see designers struggling with issues like server location or configuration when the right design may not even require servers. Overloaded terms like "real time" and "thing" cause massive confusion between technologies with no practical use-case overlap.
It's time the Industrial Internet Consortium took on this important challenge. Its newest Working Group will address this problem, with the goal of clarifying these most basic business and technical imperatives. I am excited to help kick off this group at the next Industrial Internet Consortium Members meeting in Barcelona. If you are interested, contact me ([email protected]), Dirk Slama ([email protected]), or Jacques Durand ([email protected]). We will begin by sharing our extensive joint experiences across the breadth of the IIoT.