Instance Health Dashboard

An all-in-one dashboard for a virtual machine (or server) to assess overall health and identify root causes.

Context

AIM

COMPANY

Role & TASKS

Team

TIMELINE

The Problem

Support staff currently start their day by gathering data about app issues from multiple dashboards, leading to fragmented workflows and delayed issue resolution.

Into the Deep End

This project presented a steep learning curve, requiring me to quickly get up to speed with unfamiliar terminology and concepts. To bridge the gap, I researched data visualisation dashboards, explored design options, and audited existing dashboards. Initially scoped for Level 3 support team leaders, the project expanded to cover all four support levels, which meant recalibrating the scope together with the Product Manager. With a refined scope, I was ready to conduct user interviews.

Shown: Affinity map for user interview data

Understanding Our Users

As I began interviewing users, useful insights quickly emerged about the root causes of their issues, their health criteria, the dashboards they already used, and their ideal features. The main roadblocks were the inability to poll tickets and the automation not triggering as expected (above).

These insights, combined with my newly developed user flows, helped define the first iteration of my wireframes (below), which would allow users to easily diagnose why these roadblocks were occurring. The wireframes were then refined in collaboration with the PM and Tech Lead to lock in the final solution.

Shown: Initial wireframes

✨ The Solution

A new all-in-one health dashboard that makes it easier for support staff to view key data at a glance in a user-defined period and investigate potential issues further.

Issue Identification

Pinpoint areas of concern with intuitive border colours, clear icons, and key data highlights. Gain deeper insights by using dashboard links or exploring the rest of the dashboard.

Failure Analysis

Gain clearer insights into ticket failures by using filters to exclude graph groups and focus on data drops within a defined time period.

Optimised Performance

Improve system performance and prevent future issues through enhanced CPU and memory management.

Project Takeaways

Some of my key learnings include:

1. Limit the number of revisions.

Doing so maintains focus and avoids unnecessary back-and-forth. This project went through four iterations because stakeholders shifted between ideas and the platform had limitations, and I had to push back to prioritise user needs. Setting clear boundaries early can streamline decision-making and reduce scope changes.

2. Factor in extra time to address gaps in design systems.

As the design system lacked support for data visualisation, I conducted independent research, created an accessible colour palette, and identified this as a future improvement area – none of which was initially planned.

3. Stats are love, stats are life.

Or at least in my case. This was a fulfilling project as I was able to lean into my strengths and interests.

Although time didn’t allow for a complete end-to-end process, the product has since been shipped. Looking ahead, it would be valuable to conduct usability testing to gain insights into user behaviour and further enhance the dashboard’s effectiveness.
