Linear Algebra: Can It Be a Cost-Effective Protection for HL7 Traffic?

For beginners, HL7 stands for Health Level Seven, a set of standards for transferring clinical and administrative data in a healthcare organization.

One of the principal advantages of HL7, and probably the reason for its wide acceptance, is ease of integration. It allows disparate systems such as patient admission, insurance, billing, labs and other patient care support systems to communicate using a common language, which improves efficiency and helps the healthcare sector provide quality patient care.

Arguably, it is the most critical ‘operational circuitry’ in healthcare, carrying some of the most sensitive data.

Though it is one of the most prevalent communication standards within healthcare, it is beset with some serious security flaws.

To start with, it is a simple, clear-text messaging protocol with no fixed or assigned ports (though a vendor may have standardized on particular ports if it maintains its own list of reserved ports). In addition, the standard lacks authentication, so a rogue system can readily connect to an HL7 port. Encryption is left to lower layers in the stack and is not enforced. It was probably written at a time when trust was implied, and the systems talking HL7 were manageable in number and stayed within the limits of an enterprise.

Typical attacks that can severely compromise an HL7 system are Denial of Service (DoS) and Man-in-the-Middle (MITM) attacks. Executing these attacks is straightforward: they can be achieved relatively easily using techniques like ARP poisoning or a simple Python for-loop script that spawns multiple TCP connections to a receiving HL7 port. HL7 interface engines are the most vulnerable to these attacks because some default settings are permissive: “keep connection open” is set to ‘yes’ and receive timeouts are disabled.

However, the goal of this blog is not to elaborate on HL7 flaws and their impact (enough studies and papers on the web describe its existing limitations in detail), but to find practical, cost-effective remedies to secure an HL7 network from bad actors.

This is where linear algebra comes in handy.

As we learned in school, minimizing a distortion function over unlabeled feature vectors can help us find coherent subsets (clusters) of the data. These subsets can be used to model normal behavior with close approximation. Any disproportionate deviation from these subsets can be flagged as anomalous. Moreover, for an unlabeled data set with many dimensions, PCA (Principal Component Analysis) can be used to reduce the representational space for faster computation.
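As a minimal sketch of these two ideas, the Python/NumPy snippet below assumes k-means as the distortion-minimizing algorithm (the distortion being the sum of squared distances from each point to its assigned centroid) and uses synthetic data in place of real HL7 traffic features. The function names, the six-dimensional feature space and the 99th-percentile threshold are illustrative assumptions, not a tuned detector.

```python
import numpy as np

def pca_project(X, n_components):
    """Center X and project it onto its top principal components (via SVD)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, mean, components

def kmeans(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm: alternately assign points and recompute centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centroid: shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def nearest_distance(X, centroids):
    """Euclidean distance from each row of X to its closest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return np.sqrt(d2.min(axis=1))

# Synthetic "normal" traffic (an assumption): two behavior groups, 6 features each.
rng = np.random.default_rng(42)
normal = np.vstack([
    rng.normal(0.0, 0.5, size=(200, 6)),
    rng.normal(5.0, 0.5, size=(200, 6)),
])

Z, mean, components = pca_project(normal, n_components=2)  # 6 dims -> 2 dims
centroids = kmeans(Z, k=2)
# Anything farther from its centroid than 99% of normal points is flagged.
threshold = np.percentile(nearest_distance(Z, centroids), 99)

def is_anomalous(x):
    """Project new observation(s) with the fitted PCA and compare to the threshold."""
    z = (np.atleast_2d(x) - mean) @ components.T
    return nearest_distance(z, centroids) > threshold
```

With this setup, `is_anomalous(np.full(6, 20.0))` flags a point far from both groups, while a point near the center of either group, such as `np.zeros(6)`, is not flagged. Note that a deviation lying entirely outside the retained principal components would be missed by this sketch, which is the price paid for the reduced dimensionality.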

Linear algebra principles such as those described above, when used with machine learning algorithms, can help detect acute behavioral anomalies in the HL7 network, such as ARP poisoning and packet modification, which often go unnoticed by traditional security tools. The reason is that when threat vectors are multi-dimensional with large feature sets, previously unseen, and often span different layers of the OSI stack, traditional detection mechanisms fall short in sensing anomalous behavioral patterns.
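To make the idea of a multi-dimensional, cross-layer feature set more concrete, here is a hedged sketch of how raw observations might be aggregated into feature vectors. The record fields (`src_ip`, `src_mac`, `new_conn`, `payload_len`) and the three chosen features are hypothetical; a real deployment would pick its own. The first feature mixes link-layer and network-layer information: an IP address seen with more than one MAC address is a commonly used hint of ARP spoofing.

```python
from collections import defaultdict
import numpy as np

def build_features(records):
    """Aggregate per-packet records into one feature row per source IP.

    Features (all illustrative): distinct MACs seen, new TCP connections,
    mean payload length.
    """
    macs = defaultdict(set)
    conns = defaultdict(int)
    sizes = defaultdict(list)
    for r in records:
        ip = r["src_ip"]
        macs[ip].add(r["src_mac"])
        conns[ip] += int(r["new_conn"])
        sizes[ip].append(r["payload_len"])
    ips = sorted(macs)
    X = np.array(
        [[len(macs[ip]), conns[ip], np.mean(sizes[ip])] for ip in ips],
        dtype=float,
    )
    return ips, X

def standardize(X):
    """Scale each column to zero mean and unit variance so no feature dominates."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (X - X.mean(axis=0)) / std

# Made-up observations for illustration only.
records = [
    {"src_ip": "10.0.0.5", "src_mac": "aa:aa:aa:aa:aa:01", "new_conn": True, "payload_len": 420},
    {"src_ip": "10.0.0.5", "src_mac": "aa:aa:aa:aa:aa:01", "new_conn": False, "payload_len": 380},
    {"src_ip": "10.0.0.9", "src_mac": "aa:aa:aa:aa:aa:02", "new_conn": True, "payload_len": 400},
    {"src_ip": "10.0.0.9", "src_mac": "de:ad:be:ef:00:01", "new_conn": True, "payload_len": 400},
]
ips, X = build_features(records)
X_scaled = standardize(X)
```

Standardizing matters because distance-based methods are sensitive to scale: without it, a feature measured in bytes would swamp one measured in counts.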

Adding to that, techniques used in deep learning, such as one-hot vectors fed into an encoder network, can provide an additional layer of security for HL7 traffic.
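One way this could look in practice is sketched below, under stated assumptions: categorical fields are one-hot encoded, and a very small linear autoencoder (swapped in here as the simplest possible encoder network) is trained on normal traffic only, so that unfamiliar combinations reconstruct poorly. ADT, ORM and ORU are HL7 v2 message types; the sender names, the network size and the training settings are hypothetical.

```python
import numpy as np

def one_hot(value, vocabulary):
    """Return a vector of zeros with a single 1 at the index of `value`."""
    vec = np.zeros(len(vocabulary))
    vec[vocabulary.index(value)] = 1.0
    return vec

MSG_TYPES = ["ADT", "ORM", "ORU"]      # HL7 v2 message types
SENDERS = ["ADMIT", "ORDERS", "LAB"]   # hypothetical sending applications

def encode(msg_type, sender):
    """Concatenate the one-hot vectors of both categorical fields."""
    return np.concatenate([one_hot(msg_type, MSG_TYPES), one_hot(sender, SENDERS)])

# Assumed "normal" behavior: each sender emits exactly one message type.
X = np.array([
    encode("ADT", "ADMIT"),
    encode("ORM", "ORDERS"),
    encode("ORU", "LAB"),
])

# Linear autoencoder: 6 inputs -> 3 hidden units -> 6 outputs, no biases.
rng = np.random.default_rng(0)
W_enc = rng.normal(0.0, 0.1, size=(6, 3))
W_dec = rng.normal(0.0, 0.1, size=(3, 6))
lr = 0.1
for _ in range(3000):
    H = X @ W_enc                 # encoded representation
    E = H @ W_dec - X             # reconstruction residual
    # Gradients of the mean squared reconstruction error.
    grad_dec = (2.0 / len(X)) * H.T @ E
    grad_enc = (2.0 / len(X)) * X.T @ E @ W_dec.T
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

def reconstruction_error(x):
    """Squared error between x and its reconstruction; high means unfamiliar."""
    return float(np.sum((x @ W_enc @ W_dec - x) ** 2))
```

After training, the three normal combinations reconstruct with very small error, while an unseen pairing such as `encode("ORU", "ADMIT")` reconstructs poorly and can be flagged for review. A production model would use non-linear layers and far more fields, but the principle of scoring by reconstruction error is the same.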