American Behemoth: Our Trillion-Dollar Healthcare System
When I first conceived of this project, I came at it from a conversation with a close friend of mine about how her own friend visited nine doctors before finally getting a diagnosis from the tenth, and still had to pay for his care after paying for the doctors’ services. Coming off of that conversation, I thought the real story here was about medical debt, but when I went looking for data, I instead ran across how much healthcare costs on the whole. The data on healthcare costs was deeply compelling; I found myself looking at the tables, asking “It’s going to go up by how much?” and realizing it wasn’t going to stop. One of my primary goals in creating this project was to make concrete the enormity of how much we spend on healthcare, and the story unfolded from the premise of inherently falling short of that goal: no one can really grasp how much trillions of dollars is, and I came to realize that this is precisely the point.
American Behemoth takes a look at healthcare on a national health expenditure level, and then breaks it down into: 1) insurance/out-of-pocket spending; 2) percentage shifts in that spending; 3) categorical healthcare spending; and 4) a combination of the two types: categorical spending by source of funding (insurance/out-of-pocket). The research questions I ended up formulating were: What are Americans more likely to be paying out of pocket for? How much of insurance is private versus public versus paid for by “third party providers”? Are any healthcare costs projected to fall instead of rising? The short answers ended up being: Other Non-Durable Medical Products, a lot versus a moderate amount versus a tiny amount, and not in the slightest. At the final review, I even had more than a few people tell me outright: “I’m sorry, I just can’t get a handle on how much this is.” “You said trillion?” “It does just keep going up.” “This is depressing.”
The data that I worked with comes from the Centers for Medicare & Medicaid Services and includes an amalgam of data from a few governmental offices, including the Office of the Actuary. Out of 17 tables, I ended up consolidating 15 into four separate datasets that I linked to Tableau for cleanliness purposes. Those four datasets contained very similar information, just broken down in complicated ways. One of the challenges I faced with regard to the data was keeping it clean and sanity-checking my own work frequently. A couple of times, I found that I had pivoted tables and immediately ruined the data. The data remained a challenge even after the first iteration, when I realized that if I wanted to create an effective visualization, I would need to separate and pivot certain parts of it. The fact that I had consolidated 15 of those tables into four different Excel spreadsheets really hammered home how careful I would need to be with the numbers. In the end, creating those four consolidated datasets felt like perhaps a bit much; however, as I built my visualizations, it became clear that each dataset meaningfully contributed to the point of each chart.
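As a rough illustration of the pivot-then-check habit described above, here is a minimal Python sketch. It is not the workflow used in the project, which relied on spreadsheets and Tableau. The function names `wide_to_long` and `check_totals`, the column names, and every number are hypothetical placeholders, not CMS figures.

```python
# Hypothetical wide table: one row per funding source, one column per year.
# All numbers are made-up placeholders, NOT real CMS figures.
wide_rows = [
    {"source": "Out of pocket", "2019": 10.0, "2020": 11.0},
    {"source": "Private insurance", "2019": 30.0, "2020": 32.0},
    {"source": "Public insurance", "2019": 40.0, "2020": 43.0},
]
# Hypothetical totals as they would be stated in the source table.
stated_totals = {"2019": 80.0, "2020": 86.0}


def wide_to_long(rows, id_key="source"):
    """Reshape wide rows into one record per (source, year) pair."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue  # the identifier column is not a year
            long_rows.append({id_key: row[id_key], "year": key, "amount": value})
    return long_rows


def check_totals(long_rows, totals, tolerance=0.01):
    """Return the years whose summed amounts disagree with the stated total."""
    sums = {}
    for rec in long_rows:
        sums[rec["year"]] = sums.get(rec["year"], 0.0) + rec["amount"]
    return [
        year
        for year, total in totals.items()
        if abs(sums.get(year, 0.0) - total) > tolerance
    ]


long_rows = wide_to_long(wide_rows)
mismatched_years = check_totals(long_rows, stated_totals)
```

An empty `mismatched_years` list means the pivot preserved every yearly total; any year that shows up in it is a sign the reshape dropped or duplicated rows.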
In total, I made 12 visualizations, consolidated into five dashboards, which were all fitted into one story sheet. Six of those visualizations were the categorical breakdowns, which I turned into a selectable view for the most detailed breakdown in the entire project. Compared to the prior iteration, which only included four visualizations in two dashboards, the final iteration was a lot more complex and accomplished my goal of telling a story with the data. My visualizations included: 1) a scatter plot with a money bag shape where a circle would be; 2) an area graph; 3) a heatmap; 4) two sets of area graphs, split by category and meant to work in tandem; and 5) several treemaps that accomplished the goal of first showing the categorical breakdown by percentage, and then showing that categorical breakdown by insurance or non-insurance type. For example, if you want to see what percentage of our overall spending in 2019 would go to hospital care, you would look at the first treemap, and then if you wanted to know what percentage of that comes from people’s own funds, you would go to the next section and select the year 2019 in out-of-pocket costs.
The area graph gave each category its own band and showed a breakdown of each insurance type, where the amounts for each year added up to the total expenditures. I color-coded each category to attempt to match the heatmap on the second page of the story. I felt that the area graph prepares the viewer to understand the heatmap, which is then broken out into percentages to attempt to show the minute movements. The heatmap itself is essentially a fancy table with colors that get darker the larger the corresponding number is. In this case, it was fairly simple: over time, all of the categories shifted toward the darker end of the color scale, because every single category just kept increasing. It’s slightly clunkier to explain the two area charts working in tandem; the second is the same graph as the first, just made to show smaller changes when the first graph is interacted with in a dashboard.
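The “darker as the number grows” behavior of a heatmap can be sketched as a simple linear color interpolation. This is only an assumption about how such a scale works in general, not a description of Tableau’s internals; the function name `shade_for` and the two RGB endpoint colors are hypothetical choices.

```python
def shade_for(value, vmin, vmax, light=(254, 229, 217), dark=(165, 15, 21)):
    """Linearly interpolate an RGB color so larger values get a darker shade.

    The light and dark endpoint colors are arbitrary illustrative choices.
    """
    if vmax == vmin:
        t = 0.0  # avoid dividing by zero when every value is the same
    else:
        t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values that fall outside the range
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(light, dark))


# Placeholder values, not real data: three cells from lowest to highest.
lowest = shade_for(0.0, 0.0, 10.0)
middle = shade_for(5.0, 0.0, 10.0)
highest = shade_for(10.0, 0.0, 10.0)
```

With a scale like this, a table in which every value rises year over year will visibly drift toward the dark endpoint from left to right.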
Essentially, the reason I chose another area chart instead of, for example, a line chart is that I felt that the line chart would just not show the same density of amount. The point of the project is to show that healthcare costs are rising astronomically, and choosing a line chart would have felt emptier, in a sense, than the area chart. For the same reason, I chose to use treemaps to represent the categorical parts of a yearly whole; a treemap uses size and color to show the portions of a selected amount, and in this case, my decision to turn each insurance type/overall total into a percentage ended up making something actually comparable. A lot of my visualizations ended up being either parts of a whole or on some kind of Cartesian plane, which really speaks to the topic at hand. The struggle I kept coming back to was: how do I represent such a huge system in such a small amount of space?
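The conversion to percentages that makes the treemaps comparable is plain arithmetic, sketched below in Python. The function name `percent_shares` is hypothetical, the amounts are made-up placeholders rather than CMS figures, and “Category C” is an invented label.

```python
def percent_shares(amounts):
    """Convert one year's raw amounts into percentages of that year's total."""
    total = sum(amounts.values())
    if total == 0:
        raise ValueError("total is zero; shares are undefined")
    return {category: 100.0 * value / total for category, value in amounts.items()}


# Placeholder amounts for a single year, NOT real figures.
spending_one_year = {
    "Hospital care": 50.0,
    "Other Non-Durable Medical Products": 30.0,
    "Category C (hypothetical)": 20.0,
}
shares = percent_shares(spending_one_year)
```

Because each year’s shares sum to 100, two treemaps for different years or different funding sources can be compared by area even when the underlying dollar totals differ enormously.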
Representing the fact that the data was made up of historical estimates and projections was difficult, and I constantly felt clumsy pointing this out on my graphs—however, one of my favorite things that I did to show the projections was creating a blank annotation, stretching the square out to cover the part of the graph that was projected, and then reducing the opacity to 15% to give the effect of slightly faded colors. It’s one of the few things that remained the same from start to finish, and one of the few things that was also very finicky when I was first starting out (it would disappear when I was trying to figure out what scale I wanted to use for the scatter shape chart).
A lot of the time, I made sure to follow Tufte’s principles, removing anything that was unnecessary for the graph to be understood. The feedback I received at the pin-up was absolutely invaluable for telling me what that was: certain legends were necessary, and others weren’t. Definitions were required for understanding the idea of the healthcare system, as were instructions on how to engage with the graphs themselves, and though they ended up taking up a lot of space, they were absolutely essential, and without them, people would be asking questions about what the data actually meant. These questions were not something that could be answered with the data alone, so I decided to try to make those definitions as unassuming as possible.
At first, I wanted the first set of definitions to pop up when someone hovered over a certain portion of the graphs, but I couldn’t find any way to create an annotation that appeared on hover. Finally, I decided that if I simply created a balanced view on the first story pane, it would still look aesthetically pleasing and I would also accomplish my goal of providing definitions. The second set of definitions was much easier, as I built them right into the area graphs, but at the review, I ran into an issue that doesn’t seem to have an immediate solution: sometimes, people were completely ignoring those annotations and moving on to the treemaps, and then asking me what each categorical expenditure meant. This only happened twice, when I had larger crowds, so I have a feeling it has something to do with whether viewers are able to really sit with this story. The instructions were fairly straightforward: I included them only for things that I felt weren’t immediately obvious, like my final interactive piece. Because I couldn’t include titles (the dropdown view toggle wouldn’t work otherwise), I had to point the viewer to the dropdown menu, indicate what the default view was, and then point out that there was a year filter.
One of the things I most wanted to do, from the very beginning, was tell a color story using two complementary colors: green to represent money/greed, and red to represent health/bodies. A lot of the time, green can signal something good, while red can signal something bad; however, by immediately associating green with money bags, I asked the viewer to see it as something financial rather than any of its other associations, and from what I heard at the pin-up and review, it seems to have worked. The only thing that mildly concerned me with this color story was the fact that, as per Nathan Yau’s Data Points, red is not exactly accessible to those with color-blindness. However, theoretically, the color-blindness issue only comes into play when red has to be distinguished from other colors in the same visualization. The saturation in the heatmaps was varied enough that the main function, showing less and more, remained, so I kept it. Additionally, the urgency inherent in the color red convinced me that it was a necessary component of the visualizations’ story.
With regard to font, I settled on a mix of sans serif for the story title, serif for the dashboard title, and sans serif again for the individual graph titles and annotations. My intent was simply to create an aesthetic contrast, something eye-catching and interesting so that my viewers won’t get bored. A difficulty I ran into was when I was sizing everything for the final review as opposed to publishing on Tableau Public—though the former allowed me more space to stretch everything out, I also had to remember that people needed to be able to read everything, especially from a distance. For that reason, a lot of the instructions had large font sizes, while the descriptions had slightly smaller font sizes. When publishing to Tableau Public, I found that my definitions and instructions were getting cut off, so I reduced font sizes, made some of the graphs smaller, and set everything to a fixed size so that I could control what the view looks like. When it came to tooltips and ancillary annotations, I varied font size and color in shades of gray/black in order to emphasize the information I thought was most important—this was usually years, categories, and insurance types. I wanted ancillary annotations to be legible but unobtrusive, so with regard to things like sources, images, and citing the methodology paper, I found smaller corners and font sizes in which they wouldn’t offend the eye, but would still be accessible.
The biggest challenge I faced overall with regard to editing the visualizations was getting a treemap to become small multiples. One of the points of feedback I received at the pin-up was that the out-of-pocket treemap I had made was too difficult to understand, because it combined all of the years. In my first iteration of this visualization, there was no side-by-side comparison of the categorical breakdown by year; to compare the years, you would have to first combine them, which made it more confusing in the long run. I did a lot of research in attempting to address the critique, and as far as I can tell, there is no real way to create “small multiples” of a treemap, at least not in the traditional sense. I tried a few different things before I finally found an explanation of a “toggle” technique, which accomplished my goal of swapping between the different worksheets that I had created. Though I had to sacrifice the titles above each of the treemaps, the fact that my “small multiples” ended up working at all was possibly the highlight of that iteration.
Finally, I took some time to think about the title of each dashboard as well as the overall story title. During the pin-up, I got a piece of feedback from one of my classmates about my titles, saying that I should swap my first dashboard’s title with the story title. This is where I really thought about what I was saying with this project, because I was wrapping the entire story up with a bow. I knew I was trying to show just how high healthcare costs are and how much higher they’re going to get, but until I had to rename my story, I wasn’t sure how to present it. I settled on the idea of a monster, something so vast and archaic that it could make the attempt at coming close to describing the United States’ healthcare system. Coming from the idea of “my friend’s friend ended up paying way too much for healthcare” and moving towards “this system is genuinely going to continue to grow larger and larger until we do something,” I purposely made all of my titles reflect this, and the result was a more cohesive end than the previous iteration had been.
This project pushed me beyond my comfort zone when it came to working with Tableau, and participating in the pin-ups especially encouraged me to seek out more advanced ways of creating visualizations using this software. The hands-on approach of figuring out how to make certain tutorials work with my own visualization goals allowed me to cement these skills in a way that working with theory alone never would have. Content-wise, my classmates, TA, and professor really motivated me to think outside the box and go beyond aesthetics to tell a story that might make a difference. In the end, I think that because of all of the challenges I faced in creating this project, I’m far more confident using Tableau to tell stories with data, and in the future, I hope to continue creating data stories that make information accessible to a wider audience.
Acknowledgments:
This project topic would not have been what it is without a conversation I had with one of my closest friends about the heaviness of how profit corrupts care and how the high cost of medical care could hit any one of us at any time—thank you, Allie!
Additionally, huge thanks to my pin-up group, our class’s TA Andi Cupallari, and Professor Michelle McSweeney, who all really pinpointed what could be made better and helped me find the sculpture in the marble of my pin-up iteration. I’m really proud of this project, and it wouldn’t have come together as cohesively as it did without the insight and help everyone offered. Thank you all!
Final project reference: