Step-by-step process to identify failure modes in distributed systems

Example application architecture

Let’s begin, shall we?

Step 1:

Step 2:

Step 3:

Step 4:

Possible failures in Kafka broker
Schema Registry Failure
Control Center Failures
Example of possible failure mode of the architecture

Step 5:

Step 6:

Step 7:

Why categorise, you ask?

Good question. Categorisation lets you consolidate the steps for fixing common failures into a single document (a runbook) that the SRE or support team can follow.
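As a rough illustration of that idea, here is a minimal Python sketch that groups failure modes by category and renders one runbook section per category. The failure names come from the list in Step 4; the category labels ("data path", "observability") and the fix steps are hypothetical placeholders, not the categories used in this article.

```python
from collections import defaultdict

# (failure mode, category) pairs. The categories are illustrative assumptions.
FAILURE_MODES = [
    ("Kafka broker failure", "data path"),
    ("Schema Registry failure", "data path"),
    ("Control Center failure", "observability"),
]

# Hypothetical fix steps shared by every failure in a category.
FIX_STEPS = {
    "data path": ["Check service health", "Restart the affected node"],
    "observability": ["Confirm the data path is unaffected", "Restart the UI service"],
}


def group_by_category(failure_modes):
    """Return {category: [failure mode, ...]}, preserving input order."""
    grouped = defaultdict(list)
    for name, category in failure_modes:
        grouped[category].append(name)
    return dict(grouped)


def render_runbook(grouped, fix_steps):
    """Build a plain-text runbook with one section per category."""
    lines = []
    for category, names in grouped.items():
        lines.append(f"Category: {category}")
        lines.append("  Covers: " + ", ".join(names))
        for number, step in enumerate(fix_steps.get(category, []), start=1):
            lines.append(f"  {number}. {step}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_runbook(group_by_category(FAILURE_MODES), FIX_STEPS))
```

Because the fix steps hang off the category and not the individual failure, adding a new failure mode to an existing category needs no new runbook section.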

Step 8:

Step 9:

Step 10:

AK


Software engineer, big data application architect, and programming language enthusiast. A guy who likes technical discussions. Author at www.cloudkapoor.com