Following up on yesterday's discussion of Medicare physician co-occurrence graphs (see Part 1 if you missed it), here's Part 2: Digging Deeper on CMS Physician Co-occurrence Clusters.
The visualizations we shared in yesterday's post are pretty impressive without any additional context, but of course one of the primary goals of data visualization is to help the viewer gain a more intuitive understanding of the data. To that end, let's focus on a few of the distinct clusters in Dr. McGinnis' co-occurrence neighborhood. My hypothesis is that the neighbor clusters correspond to geographic areas or large healthcare organizations like hospitals, where individual providers regularly work with each other. To test this, I mapped the locations of each neighbor in a selected cluster. All of the maps below use the same viewport, centered on New Jersey (where Dr. McGinnis' practices are based) and ranging from Connecticut in the upper right to Maryland and the Chesapeake Bay in the lower left.
The first cluster we'll focus on is the center-top cluster in the neighborhood graph visualization. It is interesting because it is the most independent of all the large clusters. Plotting those neighbors' physical locations on the map shows that they are highly concentrated in and around Dover, DE. The isolation of the cluster within the graph reflects Dover's relative isolation from the other urban centers between DC and NYC.
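One way to put a rough number on "highly concentrated" is to measure how many of a cluster's providers fall within some radius of a city center. The snippet below is a minimal sketch of that idea using the haversine great-circle distance. It is not the analysis behind the maps in this post: the function names, the 40 km radius, and the sample points are all hypothetical, and the Dover coordinates are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fraction_within(points, center, radius_km):
    """Share of (lat, lon) points that lie within radius_km of center."""
    if not points:
        return 0.0
    near = sum(1 for lat, lon in points if haversine_km(lat, lon, *center) <= radius_km)
    return near / len(points)


# Approximate coordinates for Dover, DE.
DOVER_DE = (39.158, -75.524)

# Made-up sample locations for illustration only, NOT taken from the CMS data.
sample_cluster = [
    (39.16, -75.52),  # in Dover
    (39.10, -75.55),  # just south of Dover
    (39.30, -75.60),  # a short drive north of Dover
    (40.71, -74.01),  # New York City, far outside the radius
]

share = fraction_within(sample_cluster, DOVER_DE, radius_km=40)
print(f"{share:.0%} of sample providers are within 40 km of Dover")
```

A high share at a small radius would back up what the map shows visually, while a low share would suggest the cluster is more spread out than it looks.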
The second cluster is just to the right of the first in the neighborhood graph visualization. Plotting those physicians on the map shows that they are concentrated north of DC, particularly along I-270 toward Germantown, MD.
The third and final truly distinct neighbor cluster we'll take a look at is physically located around Bridgeport, CT.
Having examined those three clusters, let's take a look at the massive cluster of neighbor physicians in the lower right quadrant of the graph. As I mentioned earlier, Dr. McGinnis is located in New Jersey, so it's natural to suspect that this large group of co-occurring neighbors falls within New Jersey as well, and indeed that is the case! However, taking a closer look at the large cluster, you can see a slight separation among three subregions. Let's plot those three regions separately and check out the results.
The three subregions represent fairly distinct geographical regions, all based in New Jersey. There is some geographic overlap between them, which corresponds to the loose interconnections among the subregions in the graph.
What Does It All Mean?
So what sort of conclusions can we draw? (Granted, this is purely anecdotal, and any strong conclusions would require a more rigorous statistical analysis across a broader set of data.) We found supporting evidence for the suspicion that clusters within a co-occurrence neighborhood correspond to geographical regions. However, after looking up individual physicians in these neighbor clusters, I was surprised that the clusters were not more tightly bound to particular hospitals or hospital systems. I suspect this is due to the co-occurrence (rather than true referral) nature of the CMS referral patterns data; one external party, such as our top pathologist, has the potential to link otherwise unrelated providers by mere coincidence.
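To make that last point concrete, here is a small sketch of how a single shared party can tie together providers who never work with each other. It assumes, as a simplification, that a co-occurrence edge just means two providers billed for the same patient; the record layout, function names, and provider labels are hypothetical and not taken from the CMS files.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_edges(visits):
    """Count shared patients for every provider pair.

    visits: iterable of (patient_id, provider_id) records.
    Returns {(provider_a, provider_b): shared_patient_count}, with each
    pair stored in sorted order.
    """
    providers_by_patient = defaultdict(set)
    for patient, provider in visits:
        providers_by_patient[patient].add(provider)

    counts = defaultdict(int)
    for providers in providers_by_patient.values():
        for a, b in combinations(sorted(providers), 2):
            counts[(a, b)] += 1
    return dict(counts)


def shared_neighbors(edges, a, b):
    """Providers that have a co-occurrence edge to both a and b."""
    neighbors = defaultdict(set)
    for x, y in edges:
        neighbors[x].add(y)
        neighbors[y].add(x)
    return neighbors[a] & neighbors[b]


# Hypothetical records: two patients, each seen by a different specialist,
# with both patients' lab work read by the same pathologist.
visits = [
    ("patient_1", "surgeon_A"),
    ("patient_1", "pathologist"),
    ("patient_2", "cardiologist_B"),
    ("patient_2", "pathologist"),
]

edges = cooccurrence_edges(visits)
print(edges)
print(shared_neighbors(edges, "surgeon_A", "cardiologist_B"))
```

The surgeon and the cardiologist share no patients and have no direct edge, yet both attach to the pathologist, so a layout or clustering algorithm will tend to pull them into the same neighborhood.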
The two datasets were loaded, joined, aggregated and filtered using GraphLab Create. The graph visualizations here were created using Gephi. The map scatter plots were created using Matplotlib with Basemap. In a future post I'll detail how you can connect GraphLab and Gephi using NetworkX, as well as perform some graph analysis to find potential influencers in the co-occurrence graph beyond just looking at gross Medicare reimbursements.
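Until that post is up, a simpler route into Gephi is a plain edge-list CSV, since Gephi's spreadsheet importer reads edge tables with Source, Target and Weight columns. The sketch below uses only the Python standard library; it is not the NetworkX approach mentioned above, and the function name and example edges are hypothetical.

```python
import csv
import io


def write_gephi_edge_csv(edges, fileobj):
    """Write {(source, target): weight} as an edge table Gephi can import."""
    writer = csv.writer(fileobj)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])


# Hypothetical edges for illustration; the weights stand in for shared-patient counts.
example_edges = {
    ("provider_1", "provider_2"): 12,
    ("provider_2", "provider_3"): 5,
}

# Written to an in-memory buffer here; pass an open file object
# (opened with newline="") to write to disk instead.
buffer = io.StringIO()
write_gephi_edge_csv(example_edges, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```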
- Visualizing Medicare Physician Co-occurrence Graphs with Payment Data (Part 2 of 2) - September 4, 2014
- Visualizing Medicare Physician Co-occurrence Graphs with Payment Data (Part 1 of 2) - September 3, 2014