My name is Ismael Chang Ghalimi. I fight against software illiteracy. I am a stoic, and this blog is my agora.

Migrating to Faye

We’re almost done migrating from Socket.IO to Faye. While Socket.IO has served us well so far, we decided to migrate to Faye because it is based on a well-defined protocol (Bayeux) and it is more actively developed and supported. Pascal led this effort and made the necessary changes to mapper (very low-level stuff). He was a bit trigger-happy yesterday and made a commit that broke pretty much everything, but we managed to fix most problems before calling it a day. While this migration won’t give us any new features or improve performance, it will reduce our long-term technical debt and risk.

Abel to the rescue

This week’s Dojo is focused on view processing. We’re methodically implementing paging, sorting, filtering, grouping, and summaries for views defined on Objects or Composites, with a powerful pipeline architecture.

With this architecture, some processing happens directly within the datastore, some within the application server, and some within the web browser. All 10 datatype families are being implemented, with careful attention to the many possible performance issues.
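
To make the pipeline idea a bit more concrete, here is a minimal sketch in JavaScript. It assumes that a view is just an ordered list of stages applied to an in-memory array of records. The stage names (filter, sort, page) and the runView helper are hypothetical and not taken from our codebase, and in the real engine some of these stages would run in the datastore or the application server rather than in a single process.

```javascript
// Hypothetical sketch: a view as an ordered list of stages.
// Each stage takes an array of records and returns a new array.
const stages = {
  filter: (predicate) => (records) => records.filter(predicate),
  sort: (field) => (records) =>
    [...records].sort((a, b) => (a[field] > b[field]) - (a[field] < b[field])),
  page: (size, index) => (records) =>
    records.slice(index * size, (index + 1) * size),
};

// Runs every stage in order, feeding each result to the next stage.
function runView(pipeline, records) {
  return pipeline.reduce((current, stage) => stage(current), records);
}

const records = [
  { name: "c", amount: 30 },
  { name: "a", amount: 10 },
  { name: "d", amount: 5 },
  { name: "b", amount: 20 },
];

// Keep amounts of 10 or more, sort by name, then take the first page of two.
const firstPage = runView(
  [stages.filter((r) => r.amount >= 10), stages.sort("name"), stages.page(2, 0)],
  records
);
console.log(firstPage.map((r) => r.name)); // [ 'a', 'b' ]
```

Because every stage has the same shape (records in, records out), a stage can be moved to another tier without changing the ones around it.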

This is taking us deep into some pretty interesting algorithms directly influenced by group algebra. In a nutshell, we’ve constructed an abelian group to model the filtering and grouping rules defined by complex datatype families such as Chronological, Locational, and Numerical. It’s all pretty esoteric, but it’s allowing us to implement the whole thing with just meta-data and Formula.js expressions, making it fully extensible.

Demos coming later this week.

Thank you Niels!

RequireJS refactoring

We’re in the process of refactoring our entire codebase with RequireJS. It’s a pretty massive undertaking, but it will allow us to dynamically load only the libraries that we need at any given time, both on the server and on the client. It will also remove all the global variables that have been creeping in over time, especially the all-too-convenient AngularJS $scope. Florian has been spearheading this effort and got some help from Pascal in the end. We’re just a few days away from reaching our goal.

Here is the final entity-relationship model for our new view engine.

Kudos to Florian and Denys for their help in developing such a powerful model.

In case you’re wondering, here is the meta-model we built for Perspectives, Views, Objects, and Composites. Boxes and arrows drawn in red show the objects and relationships that have been fully implemented. All four objects are working, and four out of seven relationships are supported. Dotted relationships are implicit and only reflected in the code (but not in the model). The rendering of Perspectives from Objects and Composites is pretty simple and will be added this morning. The inclusion of Views in Composites is a lot more challenging, because it creates a recursion. This means that Views could be defined on Composites that are themselves defined from Views, etc. Let’s hope that we can get that working today as well… Once we’re done, Florian will re-factor all perspectives according to this new architecture. Currently, only Map and Timeline use it. This work should be pretty quick though…

Performance improvements

A few users have complained about poor performance of the user interface, and we’ve decided to do something pretty radical about it: we will cache in the web browser all records of 16 objects which are part of what we call the Platform’s kernel. These include core meta-data objects like Object, Field, Datatype, as well as most objects required by the user interface such as Dashboard, Dashlet, Perspective, or View. All this meta-data, once properly streamlined, takes about 1.7MB, which fits into the web browser’s local storage. With this new architecture, we will get the following benefits:

  • 1.7MB less data to download when initializing the user interface
  • Many user interactions will be handled with no call back to the server
  • Most user interactions will take fewer calls back to the server
  • Most user interface code will be much simpler
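
As a rough illustration of the caching idea, here is a minimal sketch. It assumes that kernel records are stored as JSON strings under one key per object; the createKernelCache and memoryStorage helpers and the "kernel:" key prefix are hypothetical. In a web browser, window.localStorage could be passed in place of the in-memory stand-in, since it offers the same getItem and setItem methods.

```javascript
// In-memory stand-in with the same getItem/setItem shape as localStorage,
// so that the sketch also runs outside of a web browser.
function memoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => {
      items.set(key, String(value));
    },
  };
}

// Hypothetical cache for the records of kernel objects.
function createKernelCache(storage) {
  const keyFor = (objectId) => "kernel:" + objectId;
  return {
    // Stores all records of one kernel object as a JSON string.
    put(objectId, records) {
      storage.setItem(keyFor(objectId), JSON.stringify(records));
    },
    // Returns the cached records, or null when nothing was cached yet.
    get(objectId) {
      const raw = storage.getItem(keyFor(objectId));
      return raw === null ? null : JSON.parse(raw);
    },
  };
}

const cache = createKernelCache(memoryStorage());
cache.put("Datatype", [{ id: "text" }, { id: "number" }]);
const datatypes = cache.get("Datatype"); // served locally, no call to the server
const missing = cache.get("Dashboard"); // null: not cached yet
```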

This architecture should be fully deployed next week.

Historical Dashboards

We’ve made great progress with Dashboards (Cf. article) last week and I hope to be able to record a first screencast tomorrow or Tuesday. In the meantime, a new requirement has emerged, which has the creative side of my brain running at full speed day and night.

The idea is the following: could we combine Dashboards and Schedules to schedule regular snapshots of dashboards, then automatically display historical renderings of dashboards, for which the Widgets used by Dashlets would be automatically converted to their historical duals?

Yes, I know, that all sounds awfully complicated, but bear with me for a minute. Imagine that you have a dashlet in a dashboard tracking a particular value. Let’s say that this dashlet is defined with the Value widget, which displays a single number. Now, let’s say that you configure your dashboard to be snapshotted once a week. And now let’s say that you open this dashboard after a month. By default, it will show the real-time dashboard, but because you configured a scheduled snapshot for it, a selector will let you show the dashboard for any prior week since it was configured. Not only that, but it will also let you display a historical dashboard, for which you’ll be able to select the first and last week.

In this historical dashboard, the dashlet showing the value that you were monitoring is now implemented with the historical dual of the Value widget, which is the Chart widget, configured to display a line chart, with weeks on the x axis and values on the y axis. And all that is done transparently, without you having to make the link between Value and Chart. That’s because all widgets will define their historical duals (with a new Historical dual field in the Widget object) and the mapping rules will be defined at the datatype level (I’m still working on that part).

Want some more examples? Here are some more historical dual mappings:

  • A field, summary, or value is turned into a line chart
  • A grid is turned into a line chart (one line per numeric field shown in the grid)
  • A record is turned into a timeline showing all changes made to its fields
  • A tile (custom HTML content) is turned into a timeline of tiles
  • A line chart is kept as a line chart, but with an extended x-axis for time
  • A bar chart is turned into a stacked bar chart with variable height
  • A pie chart is turned into a stacked bar chart with fixed height (100%)
  • A bullet chart is turned into a set of bullet charts (one for each time period)
  • A bubble chart is turned into an animated bubble chart (bubbles moving around over time)
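
The mappings listed above could be captured as a simple lookup table. The sketch below is only an illustration: the widget identifiers are made up for the example, and the real mapping rules are meant to live at the datatype level, which is not modeled here.

```javascript
// Hypothetical widget identifiers, following the list of mappings above.
const HISTORICAL_DUALS = {
  field: "line_chart",
  summary: "line_chart",
  value: "line_chart",
  grid: "line_chart",
  record: "timeline",
  tile: "timeline",
  line_chart: "line_chart",
  bar_chart: "stacked_bar_chart",
  pie_chart: "stacked_bar_chart_100",
  bullet_chart: "bullet_chart_set",
  bubble_chart: "animated_bubble_chart",
};

// Replaces the widget of every dashlet with its historical dual.
function toHistoricalDashboard(dashlets) {
  return dashlets.map((dashlet) => {
    const dual = HISTORICAL_DUALS[dashlet.widget];
    if (dual === undefined) {
      throw new Error("No historical dual for widget " + dashlet.widget);
    }
    return { ...dashlet, widget: dual };
  });
}

const historical = toHistoricalDashboard([
  { name: "Open deals", widget: "value" },
  { name: "Deals by stage", widget: "pie_chart" },
]);
console.log(historical.map((d) => d.widget));
// [ 'line_chart', 'stacked_bar_chart_100' ]
```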

If you’re into data analytics, this should be a pretty big deal for you…

Last email building block

A proper email automation framework needs to:

  • Send individual emails (email action)
  • Send emails to lists (email broadcast)
  • Execute actions upon reception of emails (email facet)
  • Capture all inbound and outbound emails

At this point, the first three are either done or fully specified. The last building block we need is the fourth one, which will be implemented using a newly-created Email object. This object will capture all inbound and outbound emails managed by the email server defined with the stc_email_server setting (or only a subset of the inboxes it manages).

The rationale for this object is that email remains a critical tool for businesses, and email servers only capture a small subset of an email’s context, especially with respect to its relationship to business objects and workflows. The Email object does not replace an email server. Instead, it complements it by enriching all received and sent emails with critical meta-data, such as relations to users, objects, records, etc.

Let’s review its most important fields:

Direction

“Received” or “Sent”.

Reply to

A hierarchical relationship used to capture email discussion threads. As a result, one will be able to represent such discussions using any hierarchical perspective, such as Dendrogram or Gridtree. How cool is that for an object-oriented email client?

From

A relationship to the Contact or User who sent the email.

To

A multiple relationship to the Contacts or Users included in To.

Cc

A multiple relationship to the Contacts or Users included in Cc.

Bcc

A multiple relationship to the Contacts or Users included in Bcc.

Subject

The email’s subject, also used as Name for the Email record.

Body

The email’s body (using the HTML datatype).

Raw

The email’s raw content (using the Text datatype).

Object

The object to which the email is related if the email was sent to an object (Cf. Email Facet).

Record

The record to which the email is related if the email was sent to a record.

Action

The action that was executed if the email was sent to an object or record.

Batch

The batch that was created if the email was sent to an object.

Job

The job that was created if the email was sent to a record.

Interaction

The interaction that was created if the email was sent as part of a broadcast.

EID

The email’s identifier on the email server (e.g. the Gmail email identifier).

Thread

The identifier of the email’s thread on the email server.

URL

The URL of the email on the email server.

Comments

Comments can be added to emails, without being included into email threads.

Files

All files that were attached to the original email.

Events

The event that was created or updated by the original email.

Tasks

All tasks that were created in response to the email.

Owner

The user who received or sent the email.

Created by

The user who received or sent the email.

Created on

The timestamp at which the email was received or sent.

Transitioned by

The user who received or sent the email.

Transitioned on

The timestamp at which the email was received or sent.

Updated by

The user who received or sent the email.

Updated on

The timestamp at which the email was received or sent.
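
To illustrate how the Reply to field makes discussion threads available to hierarchical perspectives, here is a small sketch that turns a flat list of email records into a tree. The id and reply_to property names are assumptions made for this example, not the actual field identifiers.

```javascript
// Builds a forest of threads from flat email records.
// `id` and `reply_to` are hypothetical property names for this sketch.
function buildThreads(emails) {
  const nodes = new Map(emails.map((e) => [e.id, { ...e, replies: [] }]));
  const roots = [];
  for (const node of nodes.values()) {
    const parent = node.reply_to ? nodes.get(node.reply_to) : undefined;
    if (parent) {
      parent.replies.push(node);
    } else {
      // No known parent: this email starts a thread.
      roots.push(node);
    }
  }
  return roots;
}

const threads = buildThreads([
  { id: "e1", subject: "Quote", reply_to: null },
  { id: "e2", subject: "Re: Quote", reply_to: "e1" },
  { id: "e3", subject: "Re: Re: Quote", reply_to: "e2" },
]);
console.log(threads[0].replies[0].replies[0].id); // e3
```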

Once implemented, developers and users will be able to use this Email object in different ways.

First, it will instantly provide a new kind of email client that can leverage the multiple perspectives that are transparently offered for all objects. At this stage of the game, it’s hard to tell whether this will be really useful or not. But if we do a good job at optimizing the performance of some of our perspectives, it could be a game changer, and dramatically extend the usefulness of email within a business context, which is something that I’m really passionate about (I’m hopelessly part of the email generation).

Second, it will provide the foundation for very powerful email analytics. If this email service is turned on for all users within an organization, it will transparently create an email graph that can be leveraged in all kinds of ways, assuming that most privacy issues can be properly addressed (some organizations might decide not to capture email bodies for example).

Third, the infrastructure we’re building for email could be reused for other communication channels, especially social channels like Facebook, LinkedIn, or Twitter, opening the door to powerful social analytics applications.

To make a long story short, we’re just scratching the surface here, but it sure is shiny underneath.

Email Facet

Following our introduction to facets, let me tell you some more about the upcoming email facet. The idea is that every object and record will have its own canonical email address to which emails can be sent. When sending an email to an object or a record, a specific action will be executed on the object or the record, based on the email’s subject and body. This will essentially give objects and records an email persona, capable of automating many different actions.

Canonical Object Email Address

The local part of the canonical email address for an object will be the object’s identifier (namespace included). Alternatively, a different local part can be assigned to the object by populating its stc_local_part field. For example, the User object will have the following canonical email address:

stc_user@yourdomain.com

And if you were to set stc_local_part to “user” for it, the following would also become available:

user@yourdomain.com

Canonical Record Email Address

The local part of the canonical email address for a record will be the identifier of the record’s object, followed by a dot, then followed by the record’s UUID. Alternatively, the object’s custom Local Part can be used, and the object’s identifier can be omitted for brevity purposes (once we support high-performance reverse lookup with elasticsearch). As a result, the record for my user (Ismael Ghalimi) will have the following canonical email addresses:

stc_user.31415926-9df7-4aa6-994f-600567b0a37a@stoic.com

user.31415926-9df7-4aa6-994f-600567b0a37a@stoic.com

31415926-9df7-4aa6-994f-600567b0a37a@stoic.com
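
The address rules above can be sketched as two small functions. The function names and the shape of the object argument are assumptions made for this example. The UUID-only form is included because it is listed above, even though it depends on the reverse lookup mentioned earlier.

```javascript
// Canonical addresses of an object: its identifier, plus its custom
// local part when stc_local_part is set.
function objectAddresses(object, domain) {
  const locals = [object.id];
  if (object.stc_local_part) locals.push(object.stc_local_part);
  return locals.map((local) => local + "@" + domain);
}

// Canonical addresses of a record: object identifier (or custom local
// part), a dot, then the record's UUID. The last form omits the object.
function recordAddresses(object, uuid, domain) {
  const locals = [object.id + "." + uuid];
  if (object.stc_local_part) locals.push(object.stc_local_part + "." + uuid);
  locals.push(uuid);
  return locals.map((local) => local + "@" + domain);
}

const userObject = { id: "stc_user", stc_local_part: "user" };
const uuid = "31415926-9df7-4aa6-994f-600567b0a37a";

console.log(objectAddresses(userObject, "yourdomain.com"));
// [ 'stc_user@yourdomain.com', 'user@yourdomain.com' ]
const addresses = recordAddresses(userObject, uuid, "stoic.com");
```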

From there, a set of email actions is made automatically available, some implicit, others explicit. Implicit email actions are executed for all inbound emails, irrespective of their subjects. Explicit email actions are executed in response to inbound emails that have a specific subject. Some explicit email actions also require additional parameters, which are set in the email’s body, following a very simple syntax.

Attach Files (Implicit)

All files attached to an email sent to a record are added to the record’s stc_files field.

Store as Comment (Implicit)

The original email sent to a record is added as a comment to the record.

Create (Explicit)

To create a new record of an object, simply send an email to the object with the “create” subject (case insensitive), and set the record’s field values with one line per field, the field’s identifier (namespace optional), a colon sign (:), and the field’s value. For example:

stc_field_one:foo
stc_field_two:bar

Or:

field_one:foo
field_two:bar

All spaces around the colon sign are automatically stripped. Also, if the field’s name is omitted, the stc_name field is used by default. Therefore, if I wanted to create a task so that I remember to buy milk, I could simply send the following email:

To: task@stoic.com
Subject: create
Buy milk
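
Here is a minimal sketch of how such an email body could be parsed. It is an illustration under a few assumptions: the parseFieldLines name is hypothetical, a line is split on its first colon only, and a line without any colon is treated as the value of stc_name. Resolving namespace-free identifiers (field_one to stc_field_one) is left out.

```javascript
// Parses "field: value" lines into an object of field values.
// Assumption: a line without a colon sets the default stc_name field.
function parseFieldLines(body) {
  const fields = {};
  for (const line of body.split(/\r?\n/)) {
    if (line.trim() === "") continue; // skip blank lines
    const colon = line.indexOf(":");
    if (colon === -1) {
      fields.stc_name = line.trim();
    } else {
      // Spaces around the colon are stripped, as described above.
      const name = line.slice(0, colon).trim();
      fields[name] = line.slice(colon + 1).trim();
    }
  }
  return fields;
}

console.log(parseFieldLines("stc_field_one : foo\nstc_field_two:bar"));
// { stc_field_one: 'foo', stc_field_two: 'bar' }
console.log(parseFieldLines("Buy milk"));
// { stc_name: 'Buy milk' }
```

One open question with this approach is a default value that itself contains a colon, which this sketch would read as a field name followed by a value.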

Read (Explicit)

To read a record, send an email to it with the “read” subject. A reply is sent back with all the record’s field values. The reply is also sent to all recipients included in CC: in the original email.

Update (Explicit)

To update a record, do the same as Create, but with the “update” subject.

Delete (Explicit)

To delete a record, send an email to the record with the “delete” subject.

Action (Explicit)

To execute an action on an object (batch) or record (job), send an email to the object or the record, and put the action’s identifier (namespace optional) in the email’s subject, then set the action’s input parameters using the same syntax as the one used for Create or Update. For the namespace to be omitted, the stc_local_part field of the action must be set to the proper alias (namespace-free identifier). Alternatively, any other alias can be defined in that field.

Event (Explicit)

To create an event and add it to a record, send an email to the record with the “event” subject, and set the event’s field values using the same syntax as the one used for Create or Update.

Task (Explicit)

To create a task and add it to a record, do the same as for event, with the “task” subject.

Whenever an email is sent to an object or a record, an email reply is automatically sent to the original sender, as well as to all recipients included in CC: in the original email. This allows the sender to check that the email action was properly executed, or to be notified of some possible warnings and errors. This reply includes details about the action and hyperlinks to the object or record to which the email was sent, as well as to any relevant records. For example, if a custom action is executed on an object, the email reply includes a link to the record of the Batch object that is created in order to execute the action on all records of the object, or the records of a view defined as a parameter in the original email’s body.
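
The way implicit and explicit actions combine can be sketched as follows. This is only an illustration: the action identifiers, the planEmailActions name, and the shape of the email argument are assumptions, and the sketch treats every subject as case insensitive, which the text above only states for “create”.

```javascript
// Implicit actions apply to every email sent to a record.
const IMPLICIT_RECORD_ACTIONS = ["attach_files", "store_as_comment"];
// Explicit actions that are selected by the email's subject.
const BUILT_IN_SUBJECTS = ["create", "read", "update", "delete", "event", "task"];

// Returns the ordered list of actions to execute for an inbound email.
// `customActions` lists lowercase identifiers (or aliases) of other actions.
function planEmailActions(email, customActions = []) {
  const subject = email.subject.trim().toLowerCase();
  const plan = email.target === "record" ? [...IMPLICIT_RECORD_ACTIONS] : [];
  if (BUILT_IN_SUBJECTS.includes(subject) || customActions.includes(subject)) {
    plan.push(subject);
  }
  return plan;
}

const plan = planEmailActions({ target: "record", subject: "Update" });
console.log(plan); // [ 'attach_files', 'store_as_comment', 'update' ]
```

An email whose subject matches no action still goes through the implicit actions, so nothing sent to a record is lost.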

Introduction to Facets

One of STOIC’s most important design principles is the concept of facets. The main idea behind facets is that entities managed by the platform (objects and their records) manifest themselves through different facets visible to users or available to systems. Facets reflect an entity’s properties in different manners, in much the same way a diamond’s facets reflect light across a broad spectrum of colors, some visible to the human eye, and some not.

Here are the facets that we currently support or are working on:

  • Meta-data facet (meta-data defined in the Object and Field objects)
  • Data facet (data stored in datastores)
  • Workflow facet (an object can define a workflow and a record defines a workflow’s state)
  • Email facet (an object or a record has its own canonical email address)
  • Web facet (an object or a record has its own canonical web page)
  • User interfaces facet (object view and record view, across multiple form factors)
  • Widgets facet (an object or record is canonically exposed through widgets)
  • API facet (API for objects and records exposed by the Object Data Mapper)
  • Action facet (the set of actions that can be executed on objects and records)
  • Feeds facet (canonical Object Feeds, View Feeds, and Record Feeds)
  • File facet (an object is materialized by an Object File on the user’s online drive)
  • Timeline facet (a record can be viewed through the timeline of changes made to it)

As we add all these facets, they will be available through the Info tab in the record view.

PS: Facets should not be confused with perspectives, for perspectives apply to a particular facet: the object view. But one could think of perspectives as sub-facets. Continuing with the diamond analogy, if we consider the set of perspectives exposed by an object as being one of the object’s most important facets, we could consider this set of perspectives as being the diamond’s crown (the 33 facets that make the top half in Marcel Tolkowsky’s brilliant diamond cut).

More on Batches and Jobs

As mentioned earlier, we’re in the process of building a batch processing framework. As part of this project, we just created two new objects called Batch and Job. They’re not very complex, but together they’re quite powerful. Here is how they’re supposed to work:

The Batch object is used to create batches of automated jobs. A batch is defined with a set of Actions (modeled as a JSON object) to be executed on the records of a View. This allows developers to benefit from the full power of the View Editor to select the set of records that will be processed by a batch.

When the workflow of a batch is transitioned from Prepared to Tested, the set of records defined by the view is immediately captured in the Records field of the batch, which is modeled using an Advanced Relationship. This allows the Batch to capture which records will be processed, even if the view is changed after the batch is started. The number of records to be processed is captured in the Total jobs field, and the first record in the view is processed as part of the Tested step in the batch’s workflow. To keep things simple, we do not define an alternative set of test actions.

At that point, the developer can check if the test passed successfully by looking up the related record in the Job object. If everything worked as planned, the batch’s workflow can be transitioned from Tested to Started. This triggers the execution of the batch for all records captured in the Records field, with the exception of the first one, which has already been processed as part of the Tested step.

As the batch processing framework goes through each and every record, a new record of the Job object is created. The outcome of each job is captured using the Workflow field and the Definition field (modeled as a JSON object), and the Completed jobs and Failed jobs fields of the Batch object are incremented according to whether the job succeeded or not (respectively). The Success rate field of the Batch is implemented using a simple formula and is therefore automatically updated.

When the Batch is complete, its workflow is transitioned from Started to Completed, unless it failed completely, in which case the workflow is transitioned to Failed. While the batch is on the Started step, it can be suspended, resumed, or canceled at any time, by simply changing the value of its Workflow field.
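
The batch lifecycle described above can be sketched as a small simulation. It makes several assumptions: the property names are hypothetical, the manual check between Tested and Started is skipped, “failed completely” is read as “no job succeeded”, and the success rate is taken to be completed jobs divided by total jobs, since the actual formula is not given here.

```javascript
// Simulates one batch run over the records of a view.
function runBatch(viewRecords, action) {
  const batch = {
    workflow: "Prepared",
    records: [],
    totalJobs: 0,
    completedJobs: 0,
    failedJobs: 0,
    jobs: [],
  };

  // Creates one job per record and updates the batch counters.
  const runJob = (record) => {
    const job = { record, workflow: "Completed", definition: {} };
    try {
      job.definition.result = action(record);
      batch.completedJobs += 1;
    } catch (error) {
      job.workflow = "Failed";
      job.definition.error = error.message;
      batch.failedJobs += 1;
    }
    batch.jobs.push(job);
  };

  // Prepared -> Tested: freeze the record set and process the first record.
  batch.records = [...viewRecords];
  batch.totalJobs = batch.records.length;
  batch.workflow = "Tested";
  if (batch.records.length > 0) runJob(batch.records[0]);

  // Tested -> Started: process every remaining record.
  batch.workflow = "Started";
  batch.records.slice(1).forEach(runJob);

  // Started -> Completed, or Failed when no job succeeded (assumption).
  const failedCompletely = batch.totalJobs > 0 && batch.completedJobs === 0;
  batch.workflow = failedCompletely ? "Failed" : "Completed";
  return batch;
}

// Assumed formula: share of jobs that completed successfully.
const successRate = (batch) =>
  batch.totalJobs === 0 ? 0 : batch.completedJobs / batch.totalJobs;

const batch = runBatch([1, 2, 3, 4], (n) => {
  if (n === 3) throw new Error("cannot process 3");
  return n * 10;
});
console.log(batch.workflow, batch.completedJobs, batch.failedJobs, successRate(batch));
// Completed 3 1 0.75
```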

Once again, it’s all pretty simple, but it should be quite powerful, and it should prove very useful whenever we need to implement jobs that will automate the migration of data in customer instances after we deploy software upgrades that include significant changes to our meta-data.

We hope to have a first version working sometime next week.

Multiple relationships

After much anticipation, we’re now ready to start adding support for multiple relationships, also known as many-to-many or N-N relationships. Currently, we only support single relationships, also known as one-to-many or 1-N relationships.

For the multiple kind, we will use a new datatype called advanced relationship.

First, we can now specify whether the relationship is multiple or not through the stc_multiple option. By default, this option is set to false, and the relationship is of the single kind. What this means is that all relationships are modeled as fields (we don’t have any relationship object). When the relationship is single, this field contains a scalar value (in most cases), which is the UUID for the target record. When the relationship is multiple, it contains a list of target records.

Second, we can now specify a list of target objects. So far, we could only specify a single target object, using the stc_target_object option. We’ve now added another option called stc_target_objects, which is mutually exclusive with the previous one, and takes an array of objects. Granted, it would have been cleaner if we just had the latter, but refactoring everything with that today would be a massive challenge, for very little benefit. When a relationship is defined with multiple target objects, it is called a dynamic relationship. And when it is defined with no target object at all (like the Related items relationship that is part of Bootstrap), it is called an ad hoc relationship.

Third, we can define a set of fields in the source and target objects whose values will be cached by the relationship. Multiple relationships are implemented using individual lookup tables (for performance reasons), and such tables can be extended with additional columns that cache some data about the source and target records of relations (relations are instances of relationships). This reduces the number of joins required for looking up the data we need when showing lists of records for objects that are defined with many multiple relationships. The cached fields are defined using the stc_source_cache and stc_target_cache options, which take lists of fields from the source objects and target objects respectively.

Fourth, we can also define attributes that can be attached to relationships and whose values are set for individual relations (instances of relationships). Values for these attributes are captured by adding columns to a relationship’s lookup table, and can be used for capturing things like ordering positions in relationships.

Fifth, we can also define the cardinality of a relationship. It is defined as an optional integer using the stc_cardinality option, and it can be used to limit the number of times a record of a target object can be referenced from a relationship. If you set it to 1, you essentially define a one-to-one relationship (aka 1-1 relationship).

With all that, we need a simple rule to decide when lookup tables are created for relationships. The rule is pretty simple: always, unless the relationship is single, and has a single target object. In all other cases, a lookup table is created. These include:

  • Multiple relationship with 1 target object
  • Multiple relationship with 2 or more target objects (dynamic relationship)
  • Multiple relationship with no pre-defined target object (ad hoc relationship)
  • Single relationship with 2 or more target objects
  • Single relationship with no pre-defined target object (ad hoc relationship)
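
The rule above fits in a few lines of code. In this sketch, the option names come from the text, while the function names and the way options are passed (as a plain object) are assumptions made for the example.

```javascript
// Normalizes the two mutually exclusive target options into one array.
function targetObjects(options) {
  if (
    options.stc_target_object !== undefined &&
    options.stc_target_objects !== undefined
  ) {
    throw new Error("stc_target_object and stc_target_objects are mutually exclusive");
  }
  if (options.stc_target_object !== undefined) return [options.stc_target_object];
  return options.stc_target_objects || [];
}

// A lookup table is always created, unless the relationship is single
// and has exactly one target object.
function needsLookupTable(options) {
  const single = options.stc_multiple !== true; // stc_multiple defaults to false
  return !(single && targetObjects(options).length === 1);
}

console.log(needsLookupTable({ stc_target_object: "stc_user" })); // false
console.log(needsLookupTable({ stc_multiple: true, stc_target_object: "stc_user" })); // true
console.log(needsLookupTable({})); // true (single ad hoc relationship)
```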

Last but not least, I should point out that relationships are defined in relation to a field’s identifier. What that means is that a single lookup table is created for all fields that have the same identifier. For example, the fields of the Bootstrap object are automatically cloned for every new object, and a given field of Bootstrap has the same identifier across all its clones. As a result, the five fields of Bootstrap that are defined as multiple relationships (Comments, Files, Events, Tasks, Related items) are each supported by a single lookup table (five lookup tables in total), shared by all objects. This will make it easier to implement reverse lookups.

Pascal has four days to implement all this before he leaves for a week of well-deserved time off…

We should wish him luck…