Throughout the documented and undocumented history of humankind, infectious diseases have followed us wherever we traveled or settled. Some outbreaks were localized, whereas others were global in scale. Various cultures and civilizations called them the wrath of God. “God is punishing us through these diseases,” they said.
But the history of infectious disease is also the history of individuals who have described infections, differentiated one from another, isolated or characterized the pathogenic microorganism, developed the diagnostic tests, pioneered treatments, developed preventive public health measures, or developed vaccines or chemoprophylaxis to prevent infections.
By studying the history of infectious diseases, we have learned more about the diseases themselves, how they are transmitted, and how to prevent and cure them. Along the way, we have set up labs to diagnose these diseases, trained medical professionals to treat them, and created research labs to widen our understanding of them. This has led many people to believe that our society is advanced and sophisticated, and relatively immune to the devastating effects of infectious disease that people in the ancient world experienced. But new infectious diseases are always lurking, as attested by recent outbreaks of plague in India, Ebola in Africa, and severe acute respiratory syndrome in China. With the advent of international air travel and extensive intercontinental movement of animals, insects, and humans, there is always the potential for new and devastating infectious diseases to emerge. The worldwide spread of the coronavirus disease (COVID-19), believed to have originated in China, is the most recent example of a communicable disease that threatens humankind.
Yet despite all the advances we have made in medical science and pathology over the years, we are still too slow to check the spread of these diseases. Take the current scenario, where the coronavirus is ubiquitous. No doubt we are trying our best to stop the spread of the virus, but we are falling short, probably because we lack a robust contact tracing mechanism, up-to-date lab testing equipment, and an infrastructure that can handle a large number of infected patients. Medical documents siloed within the healthcare system do not help us either. The need of the hour is to build infrastructure that can handle infected patients, update our lab testing equipment, and gradually build robust contact tracing mechanisms for the future.
But that still leaves us with medical documents stuck in the silos of hospitals and labs. With advances in Artificial Intelligence and Machine Learning, we can glean information from these documents too and use it to help predict future outbreaks, but only if the documents are aggregated and made available to researchers for such studies.
In this blog, we are going to briefly discuss what we can do with the medical documents to prepare ourselves for such outbreaks in the future.
But first, a few things about the Medical Documents.
When we think about medical documents, the first thing that comes to mind is a doctor’s prescription. But CT scans, MRIs, and lab reports of all kinds are medical documents too.
Why do we care about these documents? Are we going to gather information from these documents that can help humanity on a large scale?
The answer is that an individual document does not tell us much at a larger scale, but documents in aggregate can help us in numerous ways. They can help us find out how a communicable disease is spreading in a community. They can help us find COVID-19 patient clusters. They can help us plan the supply chain for a COVID-19 vaccine: demand will be very high throughout the world, but supply will be scarce in the initial days after release, so we need to understand the COVID-19 clusters in advance.
Once we understand the need to aggregate medical documents, we also need to understand the formats they come in, because the format determines how we store the documents and which compliance requirements apply to them. A doctor’s prescription comes as text. CT scans and MRIs are images in DICOM format. Lab reports come in formats such as HL7 (for example, HL7v2) or CSV, and we can convert them to FHIR if needed.
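To make the lab report format a little more concrete, here is a minimal sketch of reading the observation results out of an HL7v2 message using only the Python standard library. It assumes the conventional `|` field separator and `^` component separator, and it ignores escape sequences and repeated fields, so treat it as an illustration and not as a full HL7v2 parser. The sample message, the lab and patient identifiers, and the function names are all made up for this example.

```python
from typing import Dict, List


def parse_segments(message: str) -> Dict[str, List[List[str]]]:
    """Group the segments of an HL7v2 message by segment ID (MSH, PID, OBX...)."""
    segments: Dict[str, List[List[str]]] = {}
    # HL7v2 separates segments with carriage returns; tolerate newlines too.
    for raw in message.replace("\n", "\r").split("\r"):
        raw = raw.strip()
        if not raw:
            continue
        fields = raw.split("|")  # assumption: "|" is the field separator
        segments.setdefault(fields[0], []).append(fields)
    return segments


def extract_observations(message: str) -> List[Dict[str, str]]:
    """Return the code, name, and value of every OBX (observation) segment."""
    observations = []
    for obx in parse_segments(message).get("OBX", []):
        if len(obx) < 6:
            continue  # skip segments too short to carry a value
        # OBX-3 is the observation identifier: code^name^coding system.
        identifier = obx[3].split("^")
        observations.append({
            "code": identifier[0],
            "name": identifier[1] if len(identifier) > 1 else "",
            "value": obx[5],  # OBX-5 is the observation value
        })
    return observations


# Made-up sample message; identifiers and the test code are for illustration.
SAMPLE_MESSAGE = "\r".join([
    r"MSH|^~\&|LAB_A|LAB_A_SITE|AGGREGATOR|HQ|20200901120000||ORU^R01|MSG0001|P|2.5",
    "PID|1||PATIENT-001",
    "OBX|1|ST|94500-6^SARS-CoV-2 RNA^LN||Detected",
])

if __name__ == "__main__":
    print(extract_observations(SAMPLE_MESSAGE))
```

In a real pipeline, a maintained HL7 library or the managed HL7v2 datastore would do this parsing; the sketch only shows what kind of information sits inside a lab report.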
What to do with these Medical Documents?
In a recent project, we aggregated lab reports from different labs to find out how many tests (of all kinds) each lab performed, how many of those were COVID-19 tests, and how many of the COVID-19 tests were positive. It is one of the best use cases in the current scenario because it helps us plan COVID-19 vaccine distribution. It also tells us about each lab’s efficiency and the steps needed to modernize these labs.
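The aggregation described above can be sketched in a few lines of Python. This is not the project's actual code: the `LabResult` record, the `COVID_TEST_CODES` set, the set of values treated as positive, and the sample rows are all hypothetical and stand in for whatever the parsed lab reports contain.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class LabResult:
    lab_id: str     # which lab ran the test
    test_code: str  # what kind of test it was
    result: str     # free-text result as reported by the lab


# Hypothetical codes and result values; real reports would need a mapping
# agreed with each lab.
COVID_TEST_CODES = {"COVID19_PCR", "COVID19_ANTIGEN"}
POSITIVE_VALUES = {"positive", "detected"}


def summarize_by_lab(results: Iterable[LabResult]) -> Dict[str, Dict[str, int]]:
    """Count total tests, COVID-19 tests, and COVID-19 positives per lab."""
    summary: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"total_tests": 0, "covid_tests": 0, "covid_positive": 0}
    )
    for r in results:
        counts = summary[r.lab_id]
        counts["total_tests"] += 1
        if r.test_code in COVID_TEST_CODES:
            counts["covid_tests"] += 1
            if r.result.strip().lower() in POSITIVE_VALUES:
                counts["covid_positive"] += 1
    return dict(summary)


# Made-up rows for illustration.
SAMPLE_RESULTS = [
    LabResult("lab-a", "COVID19_PCR", "Detected"),
    LabResult("lab-a", "COVID19_PCR", "Not detected"),
    LabResult("lab-a", "CBC", "Normal"),
    LabResult("lab-b", "COVID19_ANTIGEN", "Positive"),
]

if __name__ == "__main__":
    for lab, counts in summarize_by_lab(SAMPLE_RESULTS).items():
        print(lab, counts)
```

Keeping the three counters per lab makes it easy to derive a positivity rate (positives divided by COVID-19 tests) later without re-reading the reports.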
The solution is to build a user interface (UI) and expose a REST service so that labs can upload or send their lab reports to our system. Once received, the reports are brought onto Google Cloud Platform (GCP) and stored in the datastore that matches their format within Google’s healthcare offering: HL7v2 messages go into an HL7v2 datastore, and CSV files go into Google Cloud Storage (GCS). All access to these GCP components is governed by access controls and service accounts.
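The routing step, deciding where an uploaded report should be stored, can be sketched as follows. The detection rules here are assumptions made for illustration (an HL7v2 message is recognized by its leading `MSH|` segment, a CSV by its file extension), and the `Destination` names are hypothetical labels. The actual calls that write to the HL7v2 datastore or to a GCS bucket are deliberately left out.

```python
from enum import Enum


class Destination(Enum):
    """Hypothetical labels for where an uploaded report should be stored."""
    HL7V2_STORE = "hl7v2_store"
    CLOUD_STORAGE = "gcs_bucket"


def choose_destination(filename: str, payload: bytes) -> Destination:
    """Pick a storage destination from the upload's content and file name."""
    body = payload.lstrip()
    # Assumption: HL7v2 messages begin with an MSH segment and use "|"
    # as the field separator.
    if body.startswith(b"MSH|"):
        return Destination.HL7V2_STORE
    # Assumption: CSV uploads are identified by their file extension.
    if filename.lower().endswith(".csv"):
        return Destination.CLOUD_STORAGE
    raise ValueError(f"Unsupported lab report format: {filename!r}")


if __name__ == "__main__":
    print(choose_destination("report.hl7", b"MSH|^~\\&|LAB_A|SITE"))
    print(choose_destination("daily_tests.csv", b"lab_id,test_code,result\n"))
```

Rejecting unknown formats up front, instead of storing them somewhere by default, keeps unrecognized files out of the datastores that downstream aggregation reads from.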
You might be thinking: Why GCP? What about the privacy and security of the data? What about HIPAA and HITRUST compliance?
Google Cloud Platform is one of the most innovative cloud platforms, with expertise in healthcare solutions. All the products on Google Cloud Platform inherit Google’s privacy and security policy. In addition, GCP holds the compliance offerings and certifications needed to provide solutions in the healthcare domain, such as HIPAA compliance, ISO/IEC 27001, ISO/IEC 27017, ISO/IEC 27018, and HITRUST CSF certification. If a certification or compliance program is missing from this list, Google is likely working on it. So, on privacy, security, and compliance, we are in good shape.
GCP is also known for its machine learning capabilities, which we can leverage for text documents as well as images. We can use machine learning to build solutions that detect anything from brain tumors to hairline fractures in bones to cataracts in the eyes. Case in point: Google Brain, Google’s AI arm, has trained image recognition algorithms to detect signs of diabetes-related eye disease roughly as well as human experts. The software examines photos of a patient’s retina to spot tiny aneurysms that indicate the early stages of a condition called diabetic retinopathy, which can cause blindness if untreated.
Future of Medical Documents at SpringML
Machine learning on medical documents opens the door to endless possibilities. The best products in the healthcare domain use machine learning in one way or another. That is only possible because some documents are made available for specific research purposes, enabling different teams to come up with a myriad of products. We, at SpringML, specialize in Artificial Intelligence and Machine Learning. We have been releasing products in these domains for quite some time, from document processing for text to image processing and video processing. One of our products, Patient 360, is a living example of our expertise in the healthcare domain.
Finally, I hope this blog gives you some clarity about medical documents in aggregate and the opportunities that come with them. With SpringML by your side, you will have a strategic advantage over the products available in the market.