My name is Ismael Chang Ghalimi. I build the STOIC platform. I am a stoic, and this blog is my agora.

Performance tuning

Jim is currently testing multiple deployment configurations in order to improve the performance of our platform. One thing we’re testing is the ability to deploy our middleware in the same AWS datacenter and availability zone as our database. The former runs on top of Cloud Foundry, which is itself hosted on AWS, while the latter is deployed directly on EC2 and S3. Having both in the same datacenter would reduce network latency from a few dozen milliseconds down to about 1ms, and having both within the same availability zone would bring it down to about 0.2ms. We’re also looking at migrating from Redis Cloud to ElastiCache, which would both improve performance and reduce costs. With a bit of luck, all these provisioning details should settle down sometime next week.

Update on instance provisioning

As many of you have experienced over the past few days, Kickstarter instances are very unstable right now. This is because we migrated to a production version of Cloud Foundry and are still making adjustments to the system’s configuration. Also, spreadsheet import is not working yet, and won’t be before next week. In the meantime, we’re doing some testing on various platforms in order to improve performance. Early results are very encouraging, but we still need a few days to get our ducks in a row.

So much for the bad news. The good news is that things are rapidly converging, and problems are being solved one after the other. And while half of the team is working on caching, provisioning, and upgrades, the other half is fixing bugs and plugging holes left and right. Conclusion: we’ll end up being a couple of months late relative to our original schedule, but the end result should be a lot more complete and a lot more stable, with a lot less technical debt in it. When we had to pick two out of features, quality, and timeframe, we picked the first two. But the time has come to ship, and it will happen this month.

Indices refactoring

Yesterday, Hugues, Pascal, and I finally agreed on the architecture we should use for our database indices in order to properly implement a smooth meta-data upgrade process. This is something that we’ve been struggling with for over a year now, and a resolution is finally in sight.

To make a long story short, the data and meta-data about any application will be split across two indices, one for its original meta-data, and one for everything else. The first one will be shared across all tenants within an instance, while a copy of the second will be created for every tenant.

While we still do not have a totally clear definition of what meta-data is compared to data, there seems to be agreement that any object shipped without any records (like Contacts, for example) should be considered data. We’ll rely on this simple assumption for the time being, and add more complexity down the road only if necessary.

The tenant-specific second index will contain the following:

  • User data
  • Custom meta-data
  • Forks of original meta-data records

The third item is the tricky one, and we do not know yet how it will be implemented. One solution would be to remove forked meta-data records from the first index, but this would prevent us from sharing it across tenants. Another would be to use terms filters with Elasticsearch.

In essence, any forked meta-data record would be added to a master list on the tenant-specific index, and this list would be used as a terms filter in order to dynamically remove forked records stored on the shared index from queries. That way, forked records would be duplicated across the two indices, but only one copy (the forked one) would be returned by queries.
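The filtering idea above can be sketched as a query builder. This is only an illustration of the approach, not our actual implementation: the field names and record ids are hypothetical, and the exact Elasticsearch query syntax we end up using may differ.

```python
# Sketch of the terms-filter approach: the tenant index keeps a master list
# of forked record ids, and queries against the shared index exclude any
# record whose id appears in that list. All names here are illustrative.

def build_shared_metadata_query(forked_ids):
    """Build a query for the shared index that hides records the tenant forked."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                # Exclude shared records that have a tenant-specific fork,
                # so only the forked copy is ever returned to the tenant.
                "must_not": {"terms": {"_id": forked_ids}},
            }
        }
    }

forked = ["field-42", "object-7"]  # hypothetical ids from the master list
query = build_shared_metadata_query(forked)
```

The key property is that the exclusion list lives on the tenant side, so the shared index itself never needs to be modified when a tenant forks a record.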

We’re not sure whether the solution described above will work or not. If it does, it’s awesome, because it will allow us to share indices for application meta-data across any number of tenants, thereby reducing storage requirements on Elasticsearch and dramatically simplifying the upgrade process for the meta-data of applications. If it does not, the overall architecture will still work, but we won’t get to enjoy these two benefits.

With our proposed architecture, we might even create one index per datasource. By default, each and every application is defined with its own datasource, but large applications can have multiple datasources, with one datasource usually being tied to a particular object, or to a collection of objects. For example, if you define an application that would have many records for a particular object, you might decide to package this object within a separate datasource, thereby getting its records stored into a dedicated index.
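The per-datasource routing described above might look something like the following sketch. The naming scheme and function names are assumptions made for illustration, not the actual scheme we have settled on.

```python
# Hypothetical sketch of per-datasource index routing: each datasource gets
# its own index, and a record is routed based on the datasource its object
# belongs to. The "instance--datasource" naming convention is invented here.

def index_name(instance, datasource):
    """Derive a dedicated index name for a datasource within an instance."""
    return f"{instance}--{datasource}".lower()

def route_record(instance, record, datasource_by_object, default="main"):
    """Pick the target index for a record from its object's datasource."""
    datasource = datasource_by_object.get(record["object"], default)
    return index_name(instance, datasource)

# An object with many records, packaged in its own datasource:
mapping = {"Invoices": "billing"}
print(route_record("acme", {"object": "Invoices"}, mapping))  # acme--billing
print(route_record("acme", {"object": "Contacts"}, mapping))  # acme--main
```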

Another use case would be for applications that make use of connectors to large applications like SAP or Salesforce.com. In this case, you could have one datasource per connector, whereby all records of SAP would be duplicated into a dedicated index, and all records of Salesforce.com into another one. This would make the maintenance of your composite application a lot easier.

Now that we have an agreement on the architecture, it’s time to write some code…

All instances migrated

Last night, Jim managed to migrate all our Kickstarter instances to paid Cloud Foundry instances. If your instance was at foo.cfapps.io, it should now be available at foo.stoic.io. We’re using one 1GB Cloud Foundry instance for every 100 tenants, thanks to the multi-tenant architecture developed by Hugues.

Cloud Foundry provisioning

The trial instances of Cloud Foundry that we’ve been using for our Kickstarter backers up until now will soon expire. It’s time to move to production instances, and Jim has taken the lead on this project. He will build upon the foundation laid by Hugues and implement a simple provisioning framework that will allow us to manage a handful of multi-tenant instances for our 300 beta customers. Upgrades will take place within the next two weeks.

New spreadsheet importer coming

Pascal is almost done with the new spreadsheet importer. The basic infrastructure is in place, and we’re now wiring together all the components necessary to implement the main use cases we want to support initially:

  • Deploy instance
  • Deploy application
  • Upgrade instance
  • Upgrade application
  • Reset instance
  • Reset application

We should have a first version working tomorrow or next week. Stay tuned…


Pascal and I have started work on a new instance manifest called build.json that describes the spreadsheets from which a tenant’s instance should be created. This manifest is a simple JSON file indicating where the Über spreadsheet is located, which applications should be loaded, and what the structure of application folders should be, including Archives, Resources, and Tests.

From this information, Pascal will refactor the upper parts of our spreadsheet importer so that data from different spreadsheets is stored in the appropriate indexes based on their original custodians. He will also refactor the meta-data cache build process to fuse all this information into a single instance of our meta-data.
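To make the idea concrete, here is a rough sketch of what a build.json manifest might contain and how the importer could validate it. The field names and values below are assumptions for illustration only, not the shipped format, and the spreadsheet location is a placeholder.

```python
import json

# Hypothetical shape of a build.json instance manifest: where the Über
# spreadsheet lives, which applications to load, and the expected
# application folder structure. All field names here are assumptions.
manifest_text = """
{
  "uber": "placeholder-uber-spreadsheet-location",
  "applications": ["Platform", "Contacts"],
  "folders": ["Archives", "Resources", "Tests"]
}
"""

manifest = json.loads(manifest_text)

def check_manifest(m):
    """Fail fast if a required section of the manifest is missing."""
    for key in ("uber", "applications", "folders"):
        if key not in m:
            raise ValueError(f"build.json is missing '{key}'")
    return m

check_manifest(manifest)
```

A validation step like this would let the importer reject a malformed manifest before any spreadsheet is fetched.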


Victory! After a couple of days of work (including plenty of blood, sweat, and tears), our Über meta-data micro-kernel is working. My minimalist instance is now up and running, and we can start implementing the spreadsheet export process. Great work Hugues!

Hugues and I are currently refactoring our meta-data in order to create a micro-kernel from which the core platform can be built. For that purpose, we’ve created a minimalist spreadsheet called Über with the following objects in it:

  • Applications
  • Datatypes
  • Families
  • Fields
  • Objects

Then, we’ve removed these objects from the original Platform spreadsheet, and we’ve packaged all our Resources and Tests objects in two separate folders. Resources are deployed on every instance automatically, while Tests are optional and can be used for testing purposes. This improved packaging should simplify and speed up our overall provisioning process quite a bit.

Victory! The new spreadsheet importer is working. This list of units was dynamically loaded from an externalized spreadsheet referenced from the Objects sheet of the Platform spreadsheet. Next, we need to add support for spreadsheet export with automatic backup creation, and that will give us all we need for round-tripping…

Spreadsheet import/export

Next week, Hugues and I will work on the last piece of infrastructure that we need in order to start using STOIC in production: spreadsheet export. So far, we could not deploy our platform in production because its user interface was not reliable enough. As a result, the Platform spreadsheet and all the other spreadsheets that we have created to design applications have been stored in Google Spreadsheets, which we’ve used as our golden master. This ensured that our precious meta-data would remain safe at all times, even after being mangled by our temperamental user interface. Clearly not the kind of limitation that we can live with forever…

With the recent introduction of our commit process, we’re hopeful that our user interface will become usable within a week or two. As a result, the time has come to start using it for productive work on a daily basis. And we want that to happen as soon as possible, because it’s the only way that we’ll be able to fully debug it before we can ship it to customers. But for that to happen, we need to ensure that we have a reliable way of backing up all our structured data on a regular basis. We also need to make sure that we can quickly verify the integrity of this backup, so that we don’t wake up one day realizing that our backups for the past few weeks were pure garbage (one of my worst nightmares).

To do that, we need a way to export all our structured data into spreadsheets, which are human readable and can be handled easily. As you’ve already guessed, modularizing our spreadsheets is a pre-requisite for all this work, and it will allow us to collaborate better with all the people that are now involved in the development of our platform. For example, it will make it easier for me to share all the meta-data about our 500+ Formula.js functions with Hannes, who has been helping us improve its documentation and website.

Once we’re done with this refactoring, you’ll be able to store any piece of structured data into Google Spreadsheets or CSV files on your drive (any drive, thanks to STOIC Drive), rebuild your instance from scratch in a single click of the mouse, then export everything back into neatly modularized files with another click, with byte-perfect roundtripping.

Why would you use CSV instead of spreadsheets? For any piece of reference data or business data that would contain more than 400,000 data elements (think cells in a spreadsheet), because of size limitations imposed by Google Spreadsheets. This will allow you to import large datasets from external systems and to export them from STOIC.
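The format decision above can be sketched in a few lines. The 400,000-cell threshold comes from the size limit Google Spreadsheets imposed at the time; the function name is illustrative.

```python
# Sketch of the export-format decision: datasets whose cell count exceeds
# the Google Spreadsheets limit go to CSV, everything else to a spreadsheet.

GOOGLE_SHEET_CELL_LIMIT = 400_000

def export_format(rows, columns):
    """Choose an export target based on the dataset's total cell count."""
    cells = rows * columns
    return "csv" if cells > GOOGLE_SHEET_CELL_LIMIT else "spreadsheet"

print(export_format(1_000, 20))   # 20,000 cells -> spreadsheet
print(export_format(50_000, 10))  # 500,000 cells -> csv
```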

Hugues will work on all this next week. I can’t wait for him to be done with it…