Earlier this year, I spoke at PyCon about running our Python code with PyPy and how it has sped up our healthcare transaction processing. After the talk, I was pleased to be greeted outside the room by so many people who were also working in healthcare. They were interested in how PyPy performed in scenarios applicable to their daily work. We recently upgraded our production environment to PyPy 5.6.0, so I thought I’d use this week’s Snake Byte installment to share more about what we’ve seen since the switch, specifically with our claims processing.
Prior to rolling out an interpreter update, we run a series of tests to ensure our code remains stable on the new interpreter. We did this when switching from CPython to PyPy, and we continue to do it for each new PyPy release. We start using the new interpreter version locally as part of daily development to ensure things work as expected and our test suite continues to pass. After a week or two of stable local development, we verify the new interpreter within our continuous integration system. Once that checks out, we do some performance testing on a minimal cluster to compare against previous versions, and then roll it out to production in one of our regular deployments.
As we were doing the last bit of performance testing for our code running with PyPy 5.6.0, I decided to see how a minimal cluster (two WSGI instances running gunicorn with four gevent workers each, behind an elastic load balancer spreading requests across the two instances) would handle being flooded with claims API requests. I used Locust, a load testing tool, to see what improvements PyPy 5.6.0 might bring to our claims processing compared to PyPy 5.3.1 and CPython 2.7.
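A swarm like the one described above can be sketched as a small Locust file. This uses the current Locust API (`HttpUser`/`@task`, which differs from the older `HttpLocust` API of that era); only the `/api/v4/claims/` path comes from this article, and the claim payload fields are hypothetical stand-ins:

```python
# locustfile.py -- a minimal sketch of the claims API swarm, not our actual test.
from locust import HttpUser, task, constant


class ClaimsApiUser(HttpUser):
    # No rate limiting: each simulated application pushes requests
    # as fast as responses come back.
    wait_time = constant(0)

    @task
    def submit_claim(self):
        # Stand-in claim body for a typical office visit with a few services.
        claim = {
            "trading_partner_id": "MOCKPAYER",  # hypothetical value
            "place_of_service": "office",
            "services": [
                {"procedure_code": "99213", "charge_amount": "125.00"},
                {"procedure_code": "87880", "charge_amount": "35.00"},
            ],
        }
        self.client.post("/api/v4/claims/", json=claim)
```

Something like `locust -f locustfile.py -u 1000 --host https://<your-api-host>` would then simulate the 1,000 clients against a test cluster.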
We do quite a bit of claim validation upfront for each claim request received. If something in the claims request isn’t valid for the specified trading partner, we immediately respond with a list of validation errors to be corrected so the claim can be submitted successfully. If a claims request is valid, we respond with platform activity information that may be used to track the claim during its processing lifecycle with a trading partner. Our goal is to always do this within milliseconds, similar to what you’d expect from any other modern API. As we learn more about a specific trading partner (or when their rules change), we roll that knowledge back into our claims request validation for the benefit of all of our customers.
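The validate-or-reject flow above can be sketched roughly as follows. The field names and rules here are hypothetical stand-ins, not our actual trading-partner rules; the point is the shape of the check, where per-partner knowledge feeds back into validation:

```python
def validate_claim(claim, partner_rules):
    """Return a list of validation errors for a claims request (empty if valid).

    `partner_rules` stands in for the per-trading-partner knowledge we
    accumulate over time and fold back into validation.
    """
    errors = []
    if not claim.get("trading_partner_id"):
        errors.append({"field": "trading_partner_id", "code": "required"})
    if not claim.get("services"):
        errors.append({"field": "services", "code": "required"})
    allowed = partner_rules.get("allowed_procedures", set())
    for service in claim.get("services", []):
        if service.get("procedure_code") not in allowed:
            errors.append({"field": "services.procedure_code", "code": "not_covered"})
    return errors


rules = {"allowed_procedures": {"99213", "99214"}}
good = {"trading_partner_id": "MOCKPAYER",
        "services": [{"procedure_code": "99213"}]}
print(validate_claim(good, rules))  # -> [] (valid: respond with activity info)

bad = {"services": [{"procedure_code": "00000"}]}
print(validate_claim(bad, rules))   # two errors: missing partner id, uncovered procedure
```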
I set up a quick test simulating 1,000 platform applications (with no rate limits) pushing claims API requests as fast as possible. Each request represented a valid claim for a typical office visit with a few services included. After spinning up this test, our minimal cluster peaked at around 335 claims/second before the two WSGI instances maxed out on CPU. In production, we’d scale out our WSGI tier to accommodate additional API clients and their requests as volumes spiked. For this type of testing, though, it’s helpful to keep the cluster size fixed so we can gauge the performance benefit of a new interpreter version or a specific code change on a minimal configuration. It also helps us plan for scaling our WSGI tier during request spikes.
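As a rough back-of-the-envelope for the scaling planning mentioned above, assuming throughput grows linearly with instance count (which only holds while the load balancer and backing services keep up), WSGI-tier sizing might look like:

```python
import math

# Measured above: a 2-instance cluster peaks around 335 claims/second.
MEASURED_RPS = 335.0
CLUSTER_SIZE = 2


def instances_needed(target_rps, per_cluster_rps=MEASURED_RPS, cluster_size=CLUSTER_SIZE):
    """Estimate WSGI instances needed for a target request rate,
    scaling linearly from the measured two-instance rate."""
    per_instance = per_cluster_rps / cluster_size  # ~167.5 claims/sec/instance
    return max(cluster_size, int(math.ceil(target_rps / per_instance)))


print(instances_needed(1000))  # a 1,000 claims/sec spike -> 6 instances
```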
After running this Locust swarm for about an hour, here’s what we saw for request response times:
Percentage of the requests completed within given times (milliseconds):
Name                 50%    66%    75%    80%    90%    95%
/api/v4/claims/      480    590    670    740    980   1300
We strive to keep our API response times as low as possible, and this round of testing helps us plan for scaling up our WSGI tier so that folks see the best possible claims API response times, even under heavier loads. We still appear to be CPU bound on the front side of our claims request processing (claims request validation and serialization of the claims API response).
Adding WSGI instances should allow for much higher spikes in our request rates, and we’ll continue to review and optimize our code for better JIT’ing under PyPy.
In our pre-PyPy era (when we were running CPython 2.7), we'd max out around 45 claims/second on this test configuration. So we're seeing a good boost from PyPy for claims request validation and serialization of validation errors and claims API responses. Our numbers also improved over previous PyPy 5.3.1 testing: our last round of similar tests with 5.3.1 came in at 162 claims/second on a similar configuration.
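Putting those measured rates side by side, the relative speedups work out to roughly:

```python
cpython_rps = 45     # CPython 2.7
pypy_531_rps = 162   # PyPy 5.3.1
pypy_560_rps = 335   # PyPy 5.6.0

print(round(pypy_560_rps / float(cpython_rps), 1))   # ~7.4x over CPython 2.7
print(round(pypy_560_rps / float(pypy_531_rps), 1))  # ~2.1x over PyPy 5.3.1
```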
The PyPy team does quite a bit of work to fix bugs and incorporate new performance improvements with each release (including bug fixes pulled in from the Python standard library). I haven’t yet isolated all of the improvements between 5.3.1 and 5.6.0 that might be contributing to the boost we’re seeing, but we had encountered a few memory leaks that were fixed in Python’s 2.7.12 standard library, and those fixes are included in PyPy 5.6.0. It’s likely that under heavier load we experienced some swapping before workers were recycled, and that lowered the claims/second rates we saw previously.
With PyPy 5.6.0 and this minimal cluster configuration, we can handle just under 26M claim submissions per day without adding any additional instances. If that were sustained daily, it would put us at around 9.5B claims per year. While those numbers may seem low compared to API usage in other industries, think about how many people there are in the United States, and how many healthcare claims you and/or your family submit each year. Assuming a population of 325,110,000, you could route (and validate) about 29 claims per person per year through this setup before needing to scale it up. Now consider how much is spent each year on “Health IT” and what we just tested with a few AWS instances and some Python code running with PyPy.
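Those capacity figures fall out of some quick arithmetic:

```python
claims_per_day = 26000000      # "just under 26M" from the test above
us_population = 325110000

claims_per_year = claims_per_day * 365
print(claims_per_year)                            # 9,490,000,000 (~9.5B/year)
print(round(claims_per_year / float(us_population), 1))  # ~29.2 claims per person per year
```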
It’s unlikely that the PyPy team is thinking about US healthcare claims processing when they’re working on the next round of performance improvements. But we are, and we’re extremely thankful for the benefits they enable, namely lowering the cost per healthcare transaction so that people can maximize their quality of life. If you’re looking to speed up your Python-based system, I cannot recommend PyPy enough.