#MakeoverMonday Week 46 Diary

This week’s #MakeoverMonday, Week 46, is Diversity in Tech and covers several key technology companies and their breakdown of employees by gender and ethnicity. Starting this week and moving forward, this #MakeoverMonday Diary will take on a slightly different approach. In doing a couple of time-boxed posts now, it has quickly become clear that the approach of trying to complete the project in a set amount of time, while also taking notes and documenting my steps along the way, hinders my ultimate goal of becoming a better analyst. What’s important to me is that each week I’m learning and growing my analytical skills and also taking the time required to share my learnings with others, who may be looking to either begin building analytical skills of their own or improve upon their current skill set. Let’s get started!

original

Step 1. Know and Understand the Data

After first looking over the original visualization (above), which I liked quite a bit, I flipped over to data.world to download the data set and become familiar with it. The fields included in the data were Date, Type (of company) and Company (name), as well as nine columns for the percentage of employees who were Female, Male, White, Latino, etc. The Date field contained five values, but I had already determined my focus would be on the latest data only, so I added a data source filter to remove the previous four time periods. Under Type, I was only interested in Tech and Social Media, so I used another data source filter to exclude Entity and Government. I also needed to keep Country for some later calculations. One last filter on Company kept only those that were Tech and Social Media companies…as well as U.S. Population, again needed for those calcs that we’ll get to.
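For anyone who prefers to see the filtering logic outside of Tableau, here is a rough sketch of the same three filters in plain Python. The field names (Date, Type, Company) come from the data set described above, but every row value below is a made-up placeholder, and `apply_filters` is a hypothetical helper, not anything from Tableau itself.

```python
# Placeholder rows: field names follow the post, all values are invented for illustration.
rows = [
    {"Date": "2016", "Type": "Tech", "Company": "Company A"},
    {"Date": "2017", "Type": "Tech", "Company": "Company A"},
    {"Date": "2017", "Type": "Social Media", "Company": "Company B"},
    {"Date": "2017", "Type": "Government", "Company": "Agency C"},
    {"Date": "2017", "Type": "Country", "Company": "U.S. Population"},
    {"Date": "2017", "Type": "Country", "Company": "Some Other Row"},
]

KEEP_TYPES = {"Tech", "Social Media", "Country"}


def apply_filters(rows):
    """Keep the latest Date, the wanted Types, and only U.S. Population among Country rows."""
    latest = max(r["Date"] for r in rows)
    kept = []
    for r in rows:
        if r["Date"] != latest:
            continue  # filter 1: latest time period only
        if r["Type"] not in KEEP_TYPES:
            continue  # filter 2: drop Entity and Government
        if r["Type"] == "Country" and r["Company"] != "U.S. Population":
            continue  # filter 3: the only Country row needed is U.S. Population
        kept.append(r)
    return kept


filtered = apply_filters(rows)
for r in filtered:
    print(r["Type"], "|", r["Company"])
```

The order of the filters doesn’t matter here, since each one only removes rows.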

Step 2. Keep It Simple

Now that I had a good feeling for the data, it was time to think about design. Earlier, I mentioned that I liked the original viz quite a bit. So, in an effort to keep it simple, my approach was to stick with a similar layout, but really emphasize where companies were either overrepresented or underrepresented for a specific gender or ethnicity. In the original viz, I found it a bit inconvenient to always have to go back and reference the very top row (USA Population) to see if a company had more or fewer employees than the US Population for a given gender or ethnicity. This is where those previously mentioned calculations would come in, but first we’ll touch on color.

Step 3. Effective Use of Color

Going back to the original viz, once you looked past the Gender section (to the right), it didn’t make a ton of sense to me why each ethnicity needed its own color. It was more confusing than anything…did the color actually mean anything or was it there just because? So, in my version of the viz, I stuck with the maroon and gold of the Gender section, letting anything in my viz that signaled overrepresentation be colored gold and anything that signaled underrepresentation be colored maroon. This way it would be extremely easy for the user to understand, at a glance, the breakdown across companies. And to make it even easier yet, I added a highlight when hovering on a company name. This action highlights the row you hover over while also adding the value next to each bar. In an attempt to keep the view clean, I went this route as opposed to adding permanent labels on all bars like in the original. Lastly, to avoid the clutter of any sort of color legend, I tied the colors into the title.

Title with color tied throughout the viz

Step 4. Choosing the Right Chart Type

So what would be an effective chart type that could achieve the goal of emphasizing where companies were either overrepresented or underrepresented, for a specific gender or ethnicity? Given the two-color approach, I felt an effective way to do this would be to use a diverging bar chart and focus on the difference within each company from the US Population. So for each field (Female, Male, etc.) I needed to calculate the difference between the share employed at a company and the share represented in the US Population. For example, women make up 51% of the US Population and 17% of employees at Nvidia. But to simplify a bit, I took the percentages out of the equation and instead went with counts per 100 people. So, we could say:

  • For every 100 people in the US, 51 are female
  • For every 100 employees at Nvidia, 17 are female
  • 17 minus 51 is negative 34, so;
    • At Nvidia, for every 100 employees, there is an underrepresentation of 34 females. And conversely, males would be overrepresented by 34 for every 100 employees.
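The arithmetic above can be sketched in a few lines of Python. This is only an illustration and not the Tableau calculation itself: `representation_gap` is a hypothetical helper, the 51 and 17 come from the example above, and the Male figures (49 and 83) assume the two gender percentages add up to 100.

```python
def representation_gap(company_per_100, population_per_100):
    """People per 100 employees above (+) or below (-) the US Population figure."""
    return company_per_100 - population_per_100


# Female figures are from the post; Male figures assume Female + Male = 100.
us_population = {"Female": 51, "Male": 49}
nvidia = {"Female": 17, "Male": 83}

gaps = {
    field: representation_gap(nvidia[field], us_population[field])
    for field in us_population
}

for field, gap in gaps.items():
    if gap > 0:
        label = "overrepresented"
    elif gap < 0:
        label = "underrepresented"
    else:
        label = "at parity"
    print(f"{field}: {label} by {abs(gap)} per 100 employees")
```

Looping over the fields like this is one way to avoid writing a separate calculation per field, though it assumes the data can be reshaped so each gender or ethnicity is a key rather than its own column.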

For reference, I included these figures in my tooltips (see below).

There’s likely a more efficient way of going about the calculations, but since each gender and ethnicity was its own field, I created six calculations, one for each field that would be included in my visualization. And once it came time to move on to the tooltip, several more calculations came into play in order to get the color coding to work. This approach worked here, but if there’s a quicker, easier way of tackling this part of the project and you happen to be reading this, I’m all ears!! So anyway, after going the diverging bar route, here’s what the view started to look like.

wk46.1

With the addition of a ‘sort by’ parameter and the highlight action mentioned earlier, I was starting to like how the visualization was coming together. It encouraged exploration, while providing a quick snapshot of the entire picture. It was easy to see, for instance, that Latinos were underrepresented at all companies (in the above image), while Asians were overrepresented at all companies. The user could sort the data various ways and also had the option of seeing more detail about a particular company if that was of interest; either through the highlight action or through the tooltips.

My final visualization is below and the interactive version can be found here. My hope is that this post and future posts are helpful to those who are early on in their analytical and #dataviz journeys and are looking to either build their skills from the ground up or improve upon their existing skills. If you have any questions or feedback at all, whether it’s something you liked or something you did not like, please don’t hesitate to reach out to me through Twitter at @JtothaVizzo. Thanks for reading and have a great day!

wk46final


#MakeoverMonday Week 44 Diary

IMG_3289

My first #MakeoverMonday Live came last week at Tableau Conference in New Orleans. It was an awesome experience that I’m happy to have been a part of!! As far as #MakeoverMondays go, in the past few months I’ve been trying to do a better job of time-boxing myself to a one-hour limit, which helped me be more prepared for the Live version than I would have been several months ago. So, moving forward, my goal is to combine staying around that time limit with the following format…For those of you who have ever read or listened to sports writer Bill Simmons, he is a favorite of mine. I was a big fan of the NBA Draft Diary columns he used to write. In his articles, Simmons would watch the NBA Draft and simply record his thoughts as the draft unfolded. Here’s an example…and of course, being a Minnesota Timberwolves fan, it just happened that I clicked on the 2009 draft, one that haunts Wolves fans everywhere to this day. YOU’RE WELCOME GOLDEN STATE!!!!!!!! Anyway, in 2009 Simmons writes:

MN1MN2MN3

Ok, so you get the point. I’ll set the timer, work through the week’s project and record some key moments as we go. With this week’s data set bound to be a fun one, why not get right to work?!!

9:11pm – Since seeing Eva’s tweet about the poopy data set, my mind instantly began thinking of ways I could work in an Austin Powers reference, “Who Does Number 2 Work For?” Unfortunately, I didn’t come up with anything great, but hopefully somebody else does. While looking over the data a bit, it became clear to me that the aim should be to call out those people whose hand you should think twice about shaking. For the record, it blows my mind that people choose to NOT wash their hands after using the restroom, it’s just absolutely disgusting!!

9:18pm – With the decision made to call out those who fail to wash their hands 100% of the time, I grouped all other responses together. This way I could incorporate some easy-to-understand bar charts while having just two bars for each gender, as opposed to six. One bar would represent the percentage of females/males whose hand you should feel confident shaking, while the other would represent those where you should think twice. The reason behind this decision is that if you aren’t washing your hands 100% of the time after using the restroom, I do not want to shake your hand!!
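For anyone curious what that grouping looks like as logic, here’s a minimal sketch in Python. Everything in it is an assumption made for illustration: the response labels and percentages are placeholders rather than the survey’s actual categories or results, and `two_buckets` is a hypothetical helper.

```python
def two_buckets(response_pcts, confident_label="Always"):
    """Collapse all responses into 'always washes' vs. everything else."""
    confident = response_pcts.get(confident_label, 0)
    think_twice = sum(
        pct for label, pct in response_pcts.items() if label != confident_label
    )
    return {"Shake with confidence": confident, "Think twice": think_twice}


# Placeholder labels and percentages, NOT the real survey responses.
example_responses = {
    "Always": 75,
    "Most of the time": 10,
    "Sometimes": 6,
    "Rarely": 4,
    "Never": 3,
    "Not sure": 2,
}

buckets = two_buckets(example_responses)
print(buckets)
```

Six response categories go in and two bars come out, which is the whole idea behind the simplified chart.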

9:24pm – With the decision made on how to display the data, I was still left with three locations. In an attempt to make my visual simple and clean, I decided to focus on only the “While at work” location, as I felt it made for an interesting, albeit disturbing story line…that there are likely co-workers among you who failed to wash their hands after last using the restroom. Here’s the final bar chart, displaying the percentage of co-workers who always wash their hands. Simple and to the point…80% of females wash their hands all the time after going number 2 at work, making it ok to shake their hands. For the men, 77% do the same. The only calculations I made this week were simple text calculations that I would use to label the left side of my bar charts.

shake3

9:33pm – Probably 60-70% of my time with this viz was spent searching for and editing the two icons below, that indicate the act of shaking hands and giving knucks/fist bump. Taking a quick look back through my Tableau Public profile, I noticed that I really don’t use icons often, so this was a fun change of pace, but also fairly time consuming. For those of you who may be newer to #MakeoverMonday and Tableau Public, two great resources for finding icons are flaticon.com and thenounproject.com. For more on fonts, colors, etc. be sure to check out The Tableau Assistant Directory from Rebecca Roland.

10:08pm – Closing in on one hour, I finally had my icons edited through the use of PowerPoint and placed on my dashboard with the final visual looking like this.

visual

10:34pm – After adding a title (I took Eva’s comment, below, to heart!!) and some text to explain the viz, I tacked on the typical info on the bottom, including the source and it was time to save to Tableau Public…after a handful of tweaks to get the formatting to display correctly on Tableau Public, I was finished. One hour and twenty-three minutes, from start to finish, not too bad for my first #MakeoverMonday Diary.

evatweet

Click here for the final product…Thank you for reading!!

#MakeoverMonday Week 15 (Arctic Sea Ice Extent)

During #MakeoverMonday Week 15, I learned a lesson I’d like to share. First off, I felt the original viz did a good job of telling the story that in the Arctic, the area of ocean with at least 15% sea ice (known as the Arctic Sea Ice Extent) has been declining and that in recent decades that decline has become more rapid. So, with the original viz being an effective one in my opinion, I decided to go for what I believed to be a first in my short #MakeoverMonday tenure…stick with the original viz and simply create a variation of it.

With a plan in place, I set out…the line graph itself took virtually no time to make. I created a dimension called ‘week,’ threw it on the columns shelf, put AVG(Extent (million sq km)) on rows and I had a line. Then, I needed to separate out the years, so I added YEAR(Date) to detail and got many little lines that looked something like ‘First pass,’ below. This was a good start, but I wanted to be able to clearly tell which years were most recent, so I added ‘Date years’ to color and arrived at ‘Adding color,’ below.

wk15.1
First pass
wk15.2
Adding color

So, now it looked like we were getting somewhere, as recent years were now displayed in orange. However, as opposed to seeing all the lines for each year, I wanted to sort of blend the years together, so I cranked up the line size and boy did I like what I saw…

wk15.3
Thickest lines possible

At this point, I was thinking, yeah this makes sense. Blue represents cold and the orange color represents the warming, which is in turn causing the Arctic Sea Ice Extent to decrease. Perfect, we’re good to go. So, I published my viz and shared it on Twitter, getting some positive feedback along the way. Then, the highly anticipated #MakeoverMonday blog post came out, where Andy covered a couple lessons. His lesson on color hit me right away…

wk15.4

I realized that with the use of blue/orange, I had done exactly what Andy mentioned, which was use color to convey temperature. However, the data was about more ice or less ice as opposed to hotter or cooler temperatures. So, I made the mental note and as soon as I had a chance to make the change, later that morning, I swapped out the blue/orange for blue/white, resulting in the below. A much more impactful final product, thanks to a great lesson from Andy, one that has taught me to be more mindful of what the data is about before jumping into design and color choices.

wk15.5
Final result

Name That Baby!!

baby-name-surprised

In 2014, when my wife and I went to the hospital to have our first child, we were all packed up and as prepared to go as we could possibly be. Living just a few blocks from the hospital, the option was available for me to swing home, with ease, if needed. But, nonetheless, the bags that would accompany us sat, packed in our spare bedroom, for the better part of two weeks. However, as prepared as we were with packing, we were equally unprepared in another major part of this whole baby having process…what the hell would we name the baby??? As there are few surprises in life, we chose not to find out the sex, though everyone assured us we were having a boy. So, needing both a girl and boy name, over several months we periodically looked up lists of baby names and talked about which ones we liked or didn’t like, but never seemed to gain much ground. Finally, the day was here and as we rushed out the door, our list was still incomplete, consisting of a single maybe for a girl name and exactly zero boy names. Well, as it turns out, we wound up having a beautiful baby girl and our maybe name, Ruby, seemed to fit her perfectly. Whew, crisis averted!!

Now, as 2017 comes to an end and we usher in 2018, we are expecting our second child in just over three weeks. And here we are sitting in the same situation. Once again, not wanting to find out the sex, this time we’ve been able to muster up one boy name, but zero girl names!! So, how does any of this pertain to Tableau and/or Data Visualization? Funny you should ask…

Screen Shot 2017-12-29 at 11.19.00 AM

Why the Viz?

After going through the same song and dance we went through in 2014, I decided to leverage my passion for Tableau and Data Viz as a new way to approach searching for baby names. Having lost track of how many times I’ve Google searched phrases including “baby names,” it seemed only right to try and make the process more simple and fun. Eventually, I landed on the Social Security Administration website, where I was able to find data on the top baby names, by decade. After narrowing down my list to go back only to the 1920s, as opposed to the 1880s, I began gathering the data.

How Can it Help?

The process of picking out baby names may be easy for some, but very difficult for others. For us, it has been the latter for a few reasons that I won’t go into. Either way, in our situation, my wife and I both tend to stay away from the ultra popular names of today, as we prefer classic names that are beginning to come back in a small way, especially for girls. This is how we landed on Ruby, which also happened to have some meaning to us. So, with these thoughts in mind, I wanted to trend the popularity of baby names over time and use that to determine if the criteria are met for a specific name.

How Does it Work?

Dating back to the 1920s, a lot of names have landed in the Top 200 most popular baby names for a given decade. So, with so many names to weed through, I needed a way to filter down the options of what was viewable at a particular time. Thus, the viz is basically useless without the first of three dashboard actions:

  1. Name Begins with Filter: Including an A to Z list on the left-hand side of the viz allows the user to filter to names that begin with a desired letter. Once a letter has been selected, the second and third dashboard actions come into play.
  2. Name Rank Trend Highlights: These are the second and third actions. Hovering on a girl name will highlight the name rank trend below, while hovering on a boy name will do the same for the boy name rank trends.

Once your name is highlighted in the line chart, you will see its initial Top 200 Rank, as well as all subsequent ranks, allowing you to easily see if the name has increased or decreased in popularity. Here’s a quick example: although the spelling is different, the name Brittany entered the Top 200 in the 1980s, ranking #21 among girl names. By the 1990s it had climbed to #7. And then in the late 1990s, Britney Spears became a thing and by the 2000s the popularity of the name Brittany had plummeted to #189. Coincidence? You be the judge.
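If you wanted to check a trend like this outside of the viz, the decade-over-decade change is simple to compute. Here’s a small Python sketch using only the three Brittany ranks quoted above; `rank_changes` is a hypothetical helper, and it assumes the decades are listed in chronological order.

```python
# Ranks for Brittany as quoted in the post (lower number = more popular).
brittany_ranks = {"1980s": 21, "1990s": 7, "2000s": 189}


def rank_changes(ranks_by_decade):
    """Return (decade, places gained) pairs; negative means the name fell."""
    decades = list(ranks_by_decade)  # assumes chronological insertion order
    changes = []
    for prev, curr in zip(decades, decades[1:]):
        gained = ranks_by_decade[prev] - ranks_by_decade[curr]
        changes.append((curr, gained))
    return changes


changes = rank_changes(brittany_ranks)
for decade, gained in changes:
    direction = "climbed" if gained > 0 else "fell" if gained < 0 else "held steady"
    print(f"{decade}: {direction} {abs(gained)} places")
```

Because a smaller rank number means a more popular name, the subtraction is previous minus current, so a positive result reads as a climb.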

My hope is that this viz can be helpful in several different ways, regardless of whether you like popular names, classic names or anything in between. Thank you for reading, now GO NAME THAT BABY!!

 

Viz What You Love: Part II

Just over three weeks ago, I posted a viz about Notre Dame football, supporting it with a blog post called ‘Viz What You Love,’ professing and detailing my love for the Fighting Irish football program. A few days after that post, I shared a viz outlining the history of the CMA (Country Music Association) Awards Album of the Year winners. Having grown up in the middle of nowhere, literally, in northwestern Minnesota, sports and music were two of the things that became very important to me early on in life. While my desire to be active and my competitive fire were fueled through sports, music was always there when it was time to relax, study or have fun. I love several genres of music, but where I grew up, country music was big and it has always had a place in my heart. My first ever CD was John Michael Montgomery…no seriously!! And my first ever concert was Tim McGraw, way back when his only hit was “Don’t Take the Girl.” The point is that I love country music and that one really fun way to continue improving your Tableau skills is to produce data visualizations about things you love. I like to call this “Viz What You Love.” Part II is about my CMA Awards 51 Albums of the Year viz.

When I first saw Sean Miller’s ‘The 100 Greatest Metal Albums of All-Time’ viz, I was blown away not only by how cool it was, but also by how much information was right there at my fingertips. Now, while I’m not a huge metal-head, I’ve listened to enough to know many of the artists and albums on the list, among them Black Sabbath and Ozzy Osbourne. The very first thing that caught my attention on Sean’s viz was the range of energy in Black Sabbath/Ozzy albums vs. those of Slayer, which is all energy, all the time. I hadn’t heard much Slayer before, so I pulled them up on Spotify. You could say their music is…aggressive!! Anyway, I thought Sean’s viz was awesome and I wanted to try something similar with some music more familiar to me. The first step would be to find a data set…well wouldn’t you know Sean also blogged about his viz and included a sweet little trick you can do in Spotify to capture several different attributes. Thanks for sharing Sean!! Here’s the link he included in his blog that helps you sort your music, so you can then throw it into a spreadsheet and start visualizing. This process was much more seamless than I was expecting, so that was a pleasant surprise!!

As for song attributes, I chose beats per minute, energy, acoustic and popularity. Since country music was my choice, I thought valence might also be interesting, but it didn’t tell the story I was hoping for. I included all songs from each album, because I wanted to see any clustering, especially on the low and high ends of each attribute category. For instance, a majority of the music from two-time Album of the Year award winner Charlie Rich is low energy and highly acoustic, while recent two-time winner Chris Stapleton offers a wide variety on his albums. The extreme unpopularity of country music from the 60s through the 80s is clear, save a few notable exceptions such as Merle Haggard, Kenny Rogers and Willie Nelson. Popularity gradually increases the newer the music is, and neither of these facts is a huge surprise when you think about the demographics of Spotify listeners. I’m really going out on a limb here, but my hunch is that more millennials are using Spotify than senior citizens. I mean, my dad certainly isn’t on Spotify…can you get Spotify on a track phone??? Wait, is it track phone or TracFone? Ah, who the hell knows, the point is not many millennials are listening to Ronnie Milsap, Alabama or George Strait, but they damn well should be!! Ok, here’s what I like about the viz:

  • Like I mentioned earlier, I’m a fan of including all songs on the dot plot, as the clustering of songs within an album is interesting to see.
  • I would have never chosen these colors on my own, but a quick Google search led me to colors associated with each genre of music. So, I chose four related to country music and feel that they actually look pretty nice together, thanks in large part to the dark blue background.
  • I think the highlight actions work well, as you can hover on a song under one column and easily see where that song falls in the other categories as well.

I hope you enjoyed reading, now go out and Viz What You Love!! Thank you again for the inspiration Sean, this was a really fun project!!