Since we started working on the STOIC platform eighteen months ago, we’ve been keen on making sure that meta-data behaves the same way as business data. In fact, for the longest time, there was no way to distinguish one from the other.
As the platform matured, though, their respective life-cycles started to diverge. For example, when we added support for meta-data caching, we had to explicitly indicate which objects would be included in this cache, which implicitly designated those objects as meta-data. Similarly, when we started to implement our Commit process, we had to identify a subset of these meta-data objects as special cases requiring explicit commit operations.
Coming back to our original idea, treating meta-data and business data alike had clear benefits. For one, it allowed us to use the same canonical user interface for both. In other words, from the viewpoint of developers and users, meta-data and business data are the same thing. But from the viewpoint of the implementers of the platform (STOIC employees), they’re quite different, for rather good reasons. Clearly, we needed a way to reconcile both sets of requirements.
Today, we know that we want them to be both different and the same, all at once.
Then, as we started to implement our meta-data update framework to support cascading levels of meta-data custody, we realized that such a capability was required only for meta-data, not business data. The reason is simple: while the platform vendor (STOIC), software vendors developing packaged applications on top of it, systems integrators customizing these applications to suit the needs of their customers, and customers configuring these applications could all make changes to meta-data, only customers (referred to as end custodians) would need to create and manage actual business data. As a result, the multi-custodian life-cycle could be applied to meta-data only, and we could pretty much ignore the concept of custodian for business data. This sudden reduction of scope opened the door to many opportunities for simplification and optimization, which we’re now taking full advantage of.
This is especially important because we’re doing all this work while finishing the implementation of our distributed meta-data cache and adding support for clustering. Taken individually, caching, clustering, and custody are hard enough to implement. Put together, they’re like rocket science, and the more you can simplify, the better your chances of ever making it work.
With that in mind, we’re now streamlining the end-to-end data lifecycle. Here is how it will work.
First, we’re separating data from meta-data entirely. Meta-data is defined as the records of the objects for which Cached (a field of the Objects object) is set to TRUE. All records of these objects will be part of our meta-data cache (mdCache). This cache will have two versions: one for servers, containing all fields of cached objects, and one for clients, containing only the fields for which Cached (a field of the Fields object) is also set to TRUE.
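To make this concrete, here is a minimal sketch in TypeScript of how such a cache could be assembled. The type and function names (ObjectDef, FieldDef, Row, buildMdCache) are hypothetical, invented for illustration; only the Cached flags on the Objects and Fields objects come from the actual design.

```typescript
// Hypothetical shapes for records of the Objects and Fields objects.
interface ObjectDef { name: string; cached: boolean }
interface FieldDef { object: string; name: string; cached: boolean }
type Row = { [field: string]: unknown };

// Build both versions of mdCache: servers cache every field of cached
// objects; clients cache only the fields whose own Cached flag is TRUE.
function buildMdCache(
  objects: ObjectDef[],
  fields: FieldDef[],
  records: Map<string, Row[]>, // all records, keyed by object name
): { server: Map<string, Row[]>; client: Map<string, Row[]> } {
  const server = new Map<string, Row[]>();
  const client = new Map<string, Row[]>();
  for (const obj of objects.filter((o) => o.cached)) {
    const rows = records.get(obj.name) ?? [];
    server.set(obj.name, rows);
    const cachedFields = fields
      .filter((f) => f.object === obj.name && f.cached)
      .map((f) => f.name);
    client.set(
      obj.name,
      rows.map((r) =>
        Object.fromEntries(cachedFields.map((n) => [n, r[n]] as const)),
      ),
    );
  }
  return { server, client };
}
```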
Second, we’re creating one schema on PostgreSQL or one index on Elasticsearch for each and every custodian, following the architecture described in this previous post, but these schemas or indexes are used for meta-data only. We then create a separate schema or index for business data, used by the end custodian only.
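As an illustration, the mapping from custodians to stores could look like the following sketch. The naming convention here is an assumption of mine, not the platform’s actual one.

```typescript
// One meta-data schema (PostgreSQL) or index (Elasticsearch) per custodian,
// plus a single business-data store owned by the end custodian.
interface Custodian { id: string; isEndCustodian: boolean }

function metaDataStore(custodian: Custodian): string {
  return `md_${custodian.id}`; // hypothetical naming convention
}

function businessDataStore(chain: Custodian[]): string {
  const end = chain.find((c) => c.isEndCustodian);
  if (!end) throw new Error("custody chain has no end custodian");
  return `data_${end.id}`; // used by the end custodian only
}
```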
Third, we acknowledge the fact that changes made to meta-data by upstream custodians (custodians other than the end custodian) follow a different lifecycle than changes made by the end custodian. The former are traditionally called upgrades, while the latter are called configurations, customizations, or extensions. Upgrades happen rather infrequently, in a very controlled environment, while end-custodian changes happen on a daily basis, in a very ad hoc fashion. For this reason, the two can be implemented very differently: the former by simply replacing a schema or index file with a new one, the latter with incremental updates.
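Sketched as code, the two lifecycles reduce to a single dispatch point. Everything below (names, signatures, the placeholder bodies) is hypothetical and only illustrates the replace-versus-increment distinction.

```typescript
// A meta-data change tagged with its origin; payload details elided.
interface MetaChange {
  custodianId: string;
  fromEndCustodian: boolean;
}

// Placeholder effects; real implementations would talk to PostgreSQL
// or Elasticsearch.
async function swapSchemaOrIndex(custodianId: string): Promise<void> {}
async function applyIncrementalUpdate(change: MetaChange): Promise<void> {}

// Upstream upgrades replace the schema or index wholesale; end-custodian
// changes are applied incrementally, in place.
async function applyMetaDataChange(change: MetaChange): Promise<void> {
  if (change.fromEndCustodian) {
    await applyIncrementalUpdate(change); // daily, ad hoc
  } else {
    await swapSchemaOrIndex(change.custodianId); // infrequent, controlled
  }
}
```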
Fourth, we implement a cluster-friendly incremental update process for all updates made to meta-data by the end custodian. For these, we build an aggregated image of the meta-data by combining the meta-data schemas or indexes of all custodians, according to simple overloading rules. Usually, changes made by the most downstream custodian in the custody chain take precedence. Then, we deploy this meta-data in memory on all servers and clients, and make sure that they remain synchronized at all times.
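The overloading rule itself is easy to express. Here is a minimal sketch, assuming layers are ordered from the most upstream custodian to the most downstream one and that records carry a stable id; both assumptions are mine.

```typescript
type Row = { id: string; [field: string]: unknown };
type Layer = Map<string, Row[]>; // records keyed by object name

// Merge custodian layers in custody-chain order. Later (more downstream)
// layers overwrite earlier ones record by record, so the end custodian's
// changes take precedence.
function aggregate(layers: Layer[]): Layer {
  const merged: Layer = new Map();
  for (const layer of layers) {
    for (const [objectName, rows] of layer) {
      const byId = new Map<string, Row>();
      for (const row of merged.get(objectName) ?? []) byId.set(row.id, row);
      for (const row of rows) byId.set(row.id, row); // downstream wins
      merged.set(objectName, [...byId.values()]);
    }
  }
  return merged;
}
```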
Fifth, to keep everything synchronized, incremental updates to meta-data are first applied to a persistent copy of the aggregated meta-data stored in PostgreSQL or Elasticsearch. The meta-data is kept consistent through locking, which today is implemented in an optimistic fashion through the use of Change Identifiers (CIDs), but might be migrated to a pessimistic locking mechanism if we decide that it would improve the overall end-user experience. And we make sure that the internal structure of our meta-data cache supports incremental updates in a robust and high-performance fashion, by getting rid of extraneous cross-references that had been added to it.
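The optimistic flavor of that locking can be sketched as a compare-and-swap on the CID. This is my own illustration of the idea, not STOIC’s actual code.

```typescript
// Each meta-data record carries a Change Identifier (CID). A write only
// succeeds if the CID the writer read is still current; otherwise another
// writer got there first and the caller must re-read and retry.
interface VersionedRow { id: string; cid: number; data: unknown }

class MetaStore {
  private rows = new Map<string, VersionedRow>();

  read(id: string): VersionedRow | undefined {
    return this.rows.get(id);
  }

  // Optimistic write: compare-and-swap on the CID.
  write(id: string, expectedCid: number, data: unknown): boolean {
    const current = this.rows.get(id);
    if (current && current.cid !== expectedCid) return false; // stale read
    this.rows.set(id, { id, cid: expectedCid + 1, data });
    return true;
  }
}
```

A failed write simply means the caller re-reads the record, reapplies its change on top of the fresh copy, and tries again; no locks are ever held across user interactions.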
As a result of this architecture, the complete refresh of our meta-data cache will happen a lot less frequently than it has so far. In fact, it will be limited to instances where meta-data needs to be upgraded by upstream custodians, or where clients come back online after a period of offline activity (once we add full support for offline access). This should improve overall performance and reduce the latency of both server operations and client interactions.
That’s pretty much all for now. If you’ve followed me this far, good for you. If you haven’t, don’t worry: you don’t really have to understand any of this unless you’re planning to deploy the STOIC platform at a very large scale. All you need to know is that this stuff is what makes it work.