performance issue
Incident Report for Intuiface
Postmortem

What happened

At 5:30am UTC on 14 June, we started a planned maintenance operation. At the end of the operation, we observed a significant increase in CPU usage on the Intuiface user database, which caused high latency for my.intuiface.com access and license activation. Eventually, database CPU usage reached 100%, preventing both access to my.intuiface.com and license activation for Players and Composers. (Players that were already running were not impacted, thanks to an efficient fallback mechanism.)

During the incident, we made several attempts to identify and fix the issue. None of them fully succeeded, so we rolled back the maintenance update.

How do we plan to prevent such incidents in the future?

We are actively working on enhancing our performance testing tools and procedures, particularly those for environments that simulate near-real-life license checks and license activations on pre-production servers.

We are also improving our incident management process for long-lasting incidents so that rollbacks are triggered earlier, typically after one hour, thus reducing the negative impact of such incidents.

Posted Jun 21, 2022 - 13:19 CEST

Resolved
The incident is closed. Our users will receive a post-mortem email in the coming days detailing this event, its causes, and what we are putting in place to prevent another occurrence.
Posted Jun 14, 2022 - 15:17 CEST
Monitoring
We decided to roll back to the version prior to our 7:30am CEST database migration. Everything appears to be back to normal now.
Posted Jun 14, 2022 - 13:34 CEST
Update
my.intuiface.com, including license activation, is currently experiencing performance issues that are preventing authentication on support.intuiface.com. Players that are already activated are not impacted by the issue, but Composers using “release license on exit” are impacted and cannot get a new license. We are currently working on a fix and will update you on progress in 30 minutes.
Posted Jun 14, 2022 - 13:08 CEST
Update
my.intuiface.com, including license activation, is currently experiencing performance issues that may also prevent authentication on support.intuiface.com. We have found the root cause and are currently testing a fix. We continue to expect the fix to be deployed within the next hour. We will update you on progress in 30 minutes.
Posted Jun 14, 2022 - 12:00 CEST
Update
my.intuiface.com, including license activation, is currently experiencing performance issues that may also prevent authentication on support.intuiface.com. We have found the root cause and are currently testing a fix. We expect the fix to be deployed within the next hour. We will update you on progress in 30 minutes.
Posted Jun 14, 2022 - 11:27 CEST
Identified
my.intuiface.com, including license activation, is currently experiencing performance issues that may also prevent authentication on support.intuiface.com. We have found the root cause and are actively working on a fix. We expect the issue to be fixed within the next hour. We will update you on progress in 30 minutes.
Posted Jun 14, 2022 - 10:18 CEST
Investigating
my.intuiface.com, including license activation, is currently experiencing performance issues. We have found the root cause and are actively working on a fix. We expect the issue to be fixed within the next hour. We will update you on progress in 30 minutes.
Posted Jun 14, 2022 - 09:45 CEST
This incident affected: Management Console and Analytics (Charts & Dashboards).