Sunday, October 1, 2017

A DevOps Engineer Inside The World of Data Science - Survival Guide (Part 1)

Recently, the CTO tasked me with helping the Operations department become more efficient and self-sustaining in its day-to-day routine. I was told that the only way to dig out what and where the problems are is to go through the messy gold mines (aka data) where the details are kept.

As someone with no background in data science or big data, I wasn't sure I could deliver what was needed accurately and on time. Still, a voice at the back of my mind said "take it and explore". So the troll face in me said "challenge accepted".
What is fascinating about a startup is that you can be anyone! Wearing many hats is a privilege, and it's always good to have a taste of everything...
Before jumping into the waters of data science, the first thing I set out to do was learn its fundamentals. I am not only after the formulas but also the logic of how "factors" and "components" affect them. The foundation I am trying to build starts from the question "When does data make sense?"

My strategy for this role would be:
  • Research - about the tools, practices and know-how
  • Design Thinking - conceptualization, formulation and composition
  • Delivery - reporting, analysis and dashboards
NOTE: The catch with data science is that you're solving a problem that was either asked about or never thought to exist. It's always the underlying message that you're after.
The task was given to me on a Friday, just before the end of my shift. Not wanting to waste any time, I made sure the weekend was well spent and that I was all set for my Monday shift.

This writeup isn't a complete, comprehensive guide to being a data scientist; rather, it gives you a good kickstart as you take your baby steps towards becoming one.

The main thing I was able to grasp is "visualization". Structured data is nonsense when it doesn't tell you a story at first glance. That's the reason people build great dashboards that explain it all.
When your work talks for itself, don't interrupt.
Since the early web, people have loved reporting with graphs and images to represent a body of information. As we modernize, reporting adapts to the innovation too. Dashboards play a great role in reporting nowadays, not only for analytics but also for user experience.

As I dove deeper into the topic of "dashboards", here are the pointers I noted from the different articles and podcasts I went through.

Organizing Dashboard:
There are studies showing that some dashboards are not as cool as they look. Smart dashboards are what we are after; thus, knowing what makes a bad dashboard is vital as we go along in our research.

This is my personal structure of what a good dashboard looks like, labelled based on "emphasis" and how people will look at it.

In creating dashboards that people will love to look at, using the right color scheme is something to watch. You need to be aware that not everyone who will be looking at your graphs and data details has 100% vision.

Choosing the right colors, fonts and highlights will spice up the dashboard. It makes the "important" things easier to see and pick out.
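
One way to sanity-check a color choice before it goes on a dashboard is to compute the contrast between the text color and its background. The sketch below is only an illustration and not something from the articles I read: it assumes the WCAG contrast-ratio formula is a reasonable yardstick, and the function names (relative_luminance, contrast_ratio) are made up for this example.

```python
def _linear(channel):
    """Convert one 0-255 sRGB channel to its linear-light value (WCAG definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color written as '#RRGGBB'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors: 1 means identical, 21 is black on white."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


# Light grey text on a white background scores far lower than black on white.
print(round(contrast_ratio("#000000", "#FFFFFF"), 2))
print(round(contrast_ratio("#CCCCCC", "#FFFFFF"), 2))
```

WCAG suggests a ratio of at least 4.5 for normal-sized text, so a pair that falls well below that is a hint the "important" number may be hard to read for some of your viewers.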

Numerical Representation:
When numbers are involved in your dashboard, you might want to consider placing identifiers on every numerical value. This way, just by looking at the dashboard, users already know what the message is.

Like the image below, which do you think says more? The one on the left or the one on the right? Which makes more sense?

NOTE: When a bare number is added to your dashboard, it doesn't represent anything. It purely states the "value" but doesn't tell you anything more. Adding a "symbol" (e.g. an arrow) tells you what that number means (whether it is good or bad).
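
To make the idea concrete, here is a minimal sketch of how a bare value could be turned into a labelled one with a unit, a trend arrow and a verdict. It is only an illustration: format_kpi and its parameters are hypothetical names I made up, and the metrics in the usage lines are invented sample numbers, not real data.

```python
def format_kpi(label, current, previous, unit="", higher_is_better=True):
    """Return a one-line, dashboard-style summary of a metric.

    The arrow shows the direction of change against the previous value;
    the verdict says whether that direction is good or bad for this metric.
    """
    change = current - previous

    if change > 0:
        arrow = "▲"
    elif change < 0:
        arrow = "▼"
    else:
        arrow = "="

    if change == 0:
        verdict = "flat"
    else:
        # A rise is good only when higher values are better, and vice versa.
        improved = (change > 0) == higher_is_better
        verdict = "good" if improved else "bad"

    # Avoid dividing by zero when there is no previous value to compare with.
    pct_text = f"{change / previous * 100:+.1f}%" if previous else "n/a"

    return f"{label}: {current:g}{unit} {arrow} {pct_text} ({verdict})"


# Invented sample metrics, for illustration only.
print(format_kpi("Open tickets", 42, 50, higher_is_better=False))
# Open tickets: 42 ▼ -16.0% (good)
print(format_kpi("Sales", 120, 100))
# Sales: 120 ▲ +20.0% (good)
```

Note that the arrow alone is not enough: fewer open tickets is a drop, yet it is good news, which is why the sketch keeps the direction and the verdict separate.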

Other common mistakes that people think make their dashboard cool are the following:
  • Placing useless decorations
  • Implementing crosstabs
  • Using scrollbars 

As for the tools, there are now lots of choices to pick from, ranging from open source to enterprise grade. As for me, the company is subscribed to Tableau tools.

Here is a list of software that can help you set your course.


Part 2 of this writeup will tackle the usage of the Tableau tools and certain topics on data sorting, modelling and structuring. I wish you all the best in your data science career.

PS: I really appreciate how the company gives me this kind of opportunity.