Intermittent outage for API Explorer
Incident Report for Intuiface
Postmortem

What Happened

On Friday, 13-December, after midnight (UTC), an Intuiface employee using API Explorer reported an outage. The Intuiface Server Team also received automatic alerts that the server was not responding. The outages recurred several times until the morning of Monday, 16-December.

On 16-December, the Server Team performed a root cause analysis and found that the outages mainly affected Intuiface employees. Further analysis traced the performance degradation to an unused but resource-intensive feature available only to Intuiface employees. Non-employees encountered API Explorer outages as a side effect.

Corrective Actions

On the morning of 17-December, a plan was executed to remove the unused feature and rebuild the server. The rebuilt server was tested and ready at 2 PM UTC. In addition, as a precaution against future performance issues, the host supporting the server was accelerated by a factor of two.

Remediation Plans

Five optimization options have been identified and will be implemented over the coming weeks. These optimizations will tune request storage to accommodate an ever-growing database. The stored requests are used by the machine learning recommendation algorithm in API Explorer.

Posted Dec 19, 2019 - 01:17 CET

Resolved
This incident has been resolved.
Posted Dec 17, 2019 - 15:09 CET
Update
We have updated the underlying software infrastructure and the outage issue appears to have been corrected. Testing is ongoing.
Posted Dec 17, 2019 - 14:57 CET
Update
We are continuing to investigate this issue.
Posted Dec 17, 2019 - 09:12 CET
Investigating
API Explorer is currently subject to short outages of 2 to 3 minutes, occurring infrequently but repeatedly. Investigation is underway, with a plan to implement a correction or workaround sometime tomorrow, Tuesday 17-Dec (CET).
In the meantime, if attempts to use API Explorer fail, wait a minute or two and then attempt to reconnect.
Posted Dec 16, 2019 - 20:11 CET
This incident affected: API Explorer.