Unpacking Averages: Using Natural Language Processing to Extract Quality Information from MDRs
Over the spring and summer, I did a series of posts on extracting quality information from FDA enforcement initiatives like warning letters, recalls, and inspections. But obviously FDA enforcement actions are not the only potential sources of quality data that FDA maintains. FDA has what is now a massive data set on Medical Device Reports (or “MDRs”) that can be mined for quality data. Medical device companies can, in effect, learn from the experiences of their competitors about what types of things can go wrong with medical devices.
The problem, of course, is that the interesting data in MDRs are what a data scientist would call unstructured data: English-language text describing a product problem, from which information and insights cannot be easily extracted given the sheer volume of reports. In calendar year 2021, for example, FDA received almost 2 million MDRs. It just isn’t feasible for a human to read all of them.
That’s where a form of machine learning, natural language processing, or more specifically topic modeling, comes in. I used topic modeling last November for a post about major trends over the course of a decade in MDRs. Now I want to show how the same topic modeling can be used to find more specific experiences with specific types of medical devices to inform quality improvement.
As I explained last November, topic modeling requires the exercise of judgment to find what is relevant to the business. In this post, I once again choose to use Latent Dirichlet Allocation (LDA), a form of unsupervised learning, but this time implemented through scikit-learn, the Python machine learning library. In unsupervised learning, a data scientist still needs to make decisions about what information to include in, and what to exclude from, the data set to surface the most meaningful topics.
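As a rough sketch of the approach, fitting an LDA model with scikit-learn looks something like the following. The four-document corpus here is an invented stand-in for the MDR narratives, not actual data:

```python
# Sketch: fit an LDA topic model on a tiny invented corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "pump alarm battery failure over delivery",
    "cannula bent insertion site infusion set",
    "lead fracture implant surgery replacement",
    "pump battery depleted alarm sounded",
]

# LDA works on raw term counts, not TF-IDF weights.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # one row per document; each row sums to 1

print(doc_topics.shape)  # (4, 2)
```

Each row of `doc_topics` is that document’s estimated mixture over the topics, which is what the rest of the analysis builds on.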
For example, in preparing the data set, I chose to exclude the single most common MDR, that for product code DZE for the dental “Implant, Endosseous, Root-Form.” As you may know, that particular product code is responsible for an enormous number of MDRs, roughly half a million in calendar year 2021. Including those data skews the other data, and frankly doesn’t add a lot of information because most of those MDRs are very similar.
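That exclusion step is mechanically simple. A minimal sketch, using an invented DataFrame in place of the real MAUDE download (pandas is my assumption here; any tabular tool would do):

```python
import pandas as pd

# Invented slice of MDR-style data for illustration only.
mdrs = pd.DataFrame({
    "product_code": ["DZE", "LZG", "DZE", "FRN"],
    "text": ["implant failed", "pump over delivery", "implant failed", "cannula bent"],
})

# Drop the dental implant product code that dominates the data set.
filtered = mdrs[mdrs["product_code"] != "DZE"]
print(len(filtered))  # 2
```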
As in most data science exercises, the goal is to find the signal among the noise. I also removed words that are too common in MDRs to carry any signal, bringing the signal into clearer focus. These noise words included ‘MDR’, ‘investigation’, ‘issue’, ‘test’, ‘information’, ‘conclusion’, ‘product’, ‘cause’, ‘(B)(4)’ (which is the language FDA inserts wherever a document is redacted for confidential information), ‘medwatch’, and ‘CFR’.
Running some tests on the topics, I settled on 20 as the optimal number for these purposes: 20 topics yielded the most meaningful and common themes across all the remaining MDRs. Then, while any given MDR might address multiple topics, my algorithm went through every MDR and classified it according to the primary topic addressed.
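Classifying each MDR by its primary topic reduces to taking the highest-weight column of the document-topic matrix. A sketch with an invented matrix shaped like what LDA’s transform step returns:

```python
import numpy as np

# Invented document-topic matrix: one row per MDR, one column per topic;
# each row is that MDR's mixture over topics and sums to 1.
doc_topics = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
])

# The "primary topic" of each MDR is simply its highest-weight column.
primary = doc_topics.argmax(axis=1)
print(primary)  # [1 0 2]
```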
My goal in this analysis was to find a particularly good visual representation of a topic, and then of the types of products (by product code) for which MDRs addressing that topic were filed. Below I share one topic that was relatively focused on a given product, and another that cut across many different product categories. I like to use word clouds for presenting the topics themselves because they give a good sense of which words matter most to each topic. The larger the word, the more important it is to the topic mathematically.
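The word sizes in those clouds come straight from the model’s topic-word weights. As a sketch with invented weights, here is how the heaviest words per topic can be pulled from a matrix shaped like scikit-learn’s `lda.components_` (one row per topic, one column per vocabulary word):

```python
import numpy as np

# Invented vocabulary and topic-word weights for illustration only.
vocab = ["pump", "alarm", "cannula", "bent", "lead", "surgery"]
components = np.array([
    [9.0, 7.5, 0.2, 0.1, 0.3, 0.2],   # topic 0: pump alarms
    [0.2, 0.1, 8.1, 6.4, 0.2, 0.3],   # topic 1: bent cannulas
])

# These weights are what a word cloud scales font size by;
# here we just list the two heaviest words in each topic.
top_words = []
for row in components:
    order = row.argsort()[::-1][:2]   # indices of the two largest weights
    top_words.append([vocab[i] for i in order])

print(top_words)  # [['pump', 'alarm'], ['cannula', 'bent']]
```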
To truly understand the topic, it’s important to look at some examples of the MDRs that are summarized by that topic. Here’s one:
ACCORDING TO THE COMPLAINANT THE DEVICE WILL NOT BE RETURNED FOR INVESTIGATION. WE ARE UNABLE TO CONFIRM THE BENT CANNULA OR TO DETERMINE IF IT COULD HAVE CONTRIBUTED TO THE REPORTED HYPERGLYCEMIA. NO LOT RELEASE RECORDS WERE REVIEWED, AS THE PRODUCT LOT NUMBER WAS NOT PROVIDED.
Here are the product codes where that topic was found in calendar year 2021:
While this cannula topic shows up in other places, it is predominantly found in insulin pumps.
Now let’s look at a topic that cuts across several product codes.
Again, to understand the topic, it’s helpful to look at an example MDR.
INFORMATION WAS RECEIVED INDICATING THAT THE CADD LEGACY PCA PUMP MALFUNCTIONED. THE PUMP CAUSED OVER DELIVERY AND THE PATIENT HAD TROUBLE BREATHING AND HAD LOW OXYGEN LEVELS WHICH LED TO AN EMERGENCY ROOM VISIT. ADDITIONAL INFORMATION IS BEING GATHERED.
Notice, for example, that not every word in the topic appears every place where the topic is assigned as the primary topic. In the above MDR, the word “lead,” for example, does not appear. Sometimes when the word “lead” does appear, instead of an electrical lead, it’s referring to the verb, as in something “leads” to something else.
I want to emphasize this point. Topic modeling is a high-level approximation. It shouldn’t be taken too literally to mean that every single one of the MDRs means the same thing. They don’t. Indeed, in my examination of actual MDRs under this topic, they represented a somewhat wide variety. Here are a few others that make this point:
CONCOMITANT MEDICAL PRODUCTS: C6TR01 CRTP, IMPLANTED: (B)(6) 2015. IF INFORMATION IS PROVIDED IN THE FUTURE, A SUPPLEMENTAL REPORT WILL BE ISSUED.
CUSTOMER (PERSON) NOT PROVIDED, INFORMATION PROVIDED BY (B)(6). OCCUPATION: NON-HEALTHCARE PROFESSIONAL. FILTER INTERACTS WITH IVC WALL, E.G. PENETRATION/PERFORATION/EMBEDMENT. THIS MAY BE EITHER SYMPTOMATIC OR ASYMPTOMATIC. POTENTIAL CAUSES MAY INCLUDE IMPROPER DEPLOYMENT; AND (OR) EXCESSIVE FORCE OR MANIPULATIONS NEAR AN IN-SITU FILTER (E.G., A SURGICAL OR ENDOVASCULAR PROCEDURE IN THE VICINITY OF A FILTER). POTENTIAL ADVERSE EVENTS THAT MAY OCCUR INCLUDE, BUT ARE NOT LIMITED TO, THE FOLLOWING: TRAUMA TO ADJACENT STRUCTURES, VASCULAR TRAUMA, VENA CAVA PERFORATION, VENA CAVA PENETRATION. THIS REPORT INCLUDES INFORMATION KNOWN AT THIS TIME. A FOLLOW UP REPORT WILL BE SUBMITTED SHOULD ADDITIONAL INFORMATION BECOME AVAILABLE.
The fact that these topics are at a higher level and are probably not the way a human would construct them if a human read the millions of MDRs does not mean they are without value. It just means they need to be appreciated in the appropriate context. There are insights to be gleaned, but those insights are not perhaps as literal as if a human had designed the topics.
Here are the product codes in which topic 15 was found.
Notice that I’ve deliberately stayed at a high level to illustrate how this is done. If I were working for a particular company on a particular product code or set of product codes, the analysis would be much more specific. But I wanted to show how this works without embarrassing any one company or set of companies regarding the quality of their products.
The purpose of this is to illustrate how quality information can be gleaned from the text portion of MDRs. Topic modeling will seize on very specific words that are unique to a given product type, and summarize the number of MDRs where that topic is discussed within that product code. This can be useful for companies that make medical devices that produce – across the industry – many MDRs. Rather than reading every single MDR filed by your competitors, topic modeling will give you a sense of what the topics are within that product code.
Topic modeling will also show where topics are common to multiple product categories. The utility of finding those topics is in then examining how products in the other categories are addressing the quality problem. It may be that companies that make other products that suffer from the same problem have come up with useful solutions that you can import for your own products.
Admittedly, topic 15 seems to focus on implants that may require surgery as a result of some malfunction. Some of the products within that set may suffer from defects involving electronic leads, while others may not. It is certainly possible to sort the MDRs by the inclusion of a specific word such as “lead.” Further, given the natural language processing tools available, we can separate those MDRs where the word “lead” is used as a noun from those where it is used as a verb. For that, we use the Python library spaCy.
Natural language processing offers a variety of tools that help us glean insights from large databases, in this case MDRs, which might number in the millions in a given year. These insights allow companies to learn from the experiences of others, both competitors who make the same product and other medical device companies that make products with similar features or at least similar problems. This post is just the tip of the iceberg. In the context of a particular product, I’ve tried to explain how it’s possible to drill down to more and more specific information that becomes actionable in a quality improvement initiative.