Microsoft Teams went down for nearly three hours this morning, delaying and upending countless meetings and office chatter. The cause? Microsoft forgot to renew a security certificate.
That's right: Microsoft, a trillion-dollar corporation with thousands of employees in IT alone, forgot to renew a cert and, in doing so, kicked 20M users off a business-critical communication app. As the saying goes, it happens to the best of us.
What Went Down
Users first began reporting issues with Teams, Microsoft's collaboration platform and Slack killer, around 8:30 am. By approximately 9 am, Microsoft had taken note of the issue.
By 10 am, the team at Microsoft had isolated the issue: an expired certificate. Roughly an hour later, they'd deployed an updated certificate.
By noon, the issue was largely resolved, but for users around the globe, one question remained: how could this happen?
An Easy Fix
Honestly, we can enjoy a cheap laugh at Microsoft's expense, but who hasn't let a cert of some kind lapse? It's a classic amateur IT mistake. It just so happens that when one of Microsoft's certificates lapses, it causes a massive outage for millions of users...
Of course, someone could have just written a script to alert on upcoming cert expirations.
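For a sense of how little such a script needs, here is a minimal sketch in Python, using only the standard library. It is not what Microsoft runs or should have run; the function names (days_until_expiry, fetch_not_after, check_host) and the 30-day warning threshold are assumptions made up for illustration, and the same idea carries over to PowerShell.

```python
import socket
import ssl
from datetime import datetime, timezone
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[datetime] = None) -> float:
    """Days left before a cert's notAfter date (negative if already past).

    not_after uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a TLS connection and return the server certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check_host(host: str, port: int = 443, warn_days: float = 30.0) -> str:
    """Return a one-line status for host; warn_days is an assumed threshold."""
    try:
        not_after = fetch_not_after(host, port)
    except ssl.SSLCertVerificationError as exc:
        # An already-expired (or otherwise invalid) cert fails the handshake.
        return f"ALERT {host}: certificate failed verification ({exc.verify_message})"
    except OSError as exc:
        return f"ERROR {host}: could not connect ({exc})"

    remaining = days_until_expiry(not_after)
    if remaining <= warn_days:
        return f"WARN {host}: certificate expires in {remaining:.1f} days"
    return f"OK {host}: certificate valid for {remaining:.1f} more days"
```

Run something like print(check_host("example.com")) from a daily scheduled job, pipe any WARN or ALERT line to email or chat, and the renewal reminder takes care of itself.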
But all of this raises the question: if it can happen to a company spending millions, if not billions, on IT, can it happen to me? Short answer: yes, but not if you're prepared.
Want to Monitor SSL Certificates? WUG Does That
If you can't be bothered to write a PowerShell script, or just want a simpler way to monitor certificates, you can set up an SSL certificate monitor in WhatsUp Gold. It checks a specific location for a valid SSL certificate at whatever interval you choose, so you always know exactly when a certificate expires.