simMachines



Protecting sensitive and confidential data is a primary requirement in industries such as healthcare, insurance, and government. Privacy laws such as HIPAA, FCRA, and ECPA prohibit the disclosure or misuse of such information.

The safety of our customers' data is essential to us. At simMachines, we have developed machine learning techniques that allow fast, flexible searches over such sensitive data without risking disclosure. Building on our proprietary querying algorithm, we can also perform classification and clustering while adhering to the applicable privacy laws.

The following demo illustrates our ability to sift through medical data and find a specific person's record across different databases, discerning and disregarding errors and typographical mistakes in that person's particulars on the various platforms, while ensuring that the search is performed in a HIPAA-compliant fashion. This is achieved by converting each record into a fixed-size binary string. The conversion is one-way: it is designed so that the sensitive information cannot be recreated or retrieved from the binary string.
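To make the idea of a fixed-size, typo-tolerant encoding concrete, here is a minimal sketch in Python. It is not the simMachines conversion, which is proprietary and not described here. As a stand-in, it hashes character bigrams into bit positions (a Bloom-filter-style encoding with a single hash function). The names `encode_record` and `to_bit_string`, the 256-bit width, and the choice of bigrams and SHA-256 are all assumptions made for illustration. This sketch alone should not be read as providing any privacy or compliance guarantee.

```python
import hashlib

NUM_BITS = 256  # assumed fingerprint width; the source does not state one


def encode_record(text: str, num_bits: int = NUM_BITS, ngram: int = 2) -> int:
    """Map a record to a fixed-width bit string, returned as a Python int.

    Hypothetical stand-in encoding: every character n-gram of the
    normalized text sets one bit, chosen by hashing the n-gram.
    """
    # Normalize case and whitespace so trivial variations encode identically.
    normalized = " ".join(text.lower().split())
    padded = f" {normalized} "  # pad so the first and last characters form n-grams
    bits = 0
    for i in range(len(padded) - ngram + 1):
        gram = padded[i:i + ngram]
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        position = int.from_bytes(digest[:4], "big") % num_bits
        bits |= 1 << position
    return bits


def to_bit_string(bits: int, num_bits: int = NUM_BITS) -> str:
    """Render the fingerprint as a zero-padded string of '0' and '1'."""
    return format(bits, f"0{num_bits}b")


original = encode_record("John Smith 1970-01-01")
with_typo = encode_record("Jon Smith 1970-01-01")
differing_bits = bin(original ^ with_typo).count("1")
```

Because the two sample records differ in only three bigrams ("oh" and "hn" versus "on"), their fingerprints can differ in at most three bit positions, which is what lets a small typo survive the encoding as a small distance.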

Please note that, for the purposes of this demo only, we display the original text in the query results; in the real application, such information would not be revealed. To start the demo, please select a record from the list of sample records. The record will be converted to bits, and the 10 closest matching records will be used to predict its class. Matched records are highlighted in green.

Tech-Dive

In this demo, each record is stored as a binary string together with its human-readable class. Our proprietary similarity engine computes the distance between pairs of encoded records; if two binary strings are similar, the records they represent are inferred to be alike. Although the user cannot access the original data, the query results still include the justification for the predicted record class.
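The nearest-neighbor step described above can be sketched as follows. This is an illustration under stated assumptions, not the proprietary similarity engine: it uses Hamming distance (the number of differing bits) as the distance, a brute-force scan instead of an index, and a simple majority vote. The names `hamming` and `predict_class` and the toy 8-bit fingerprints are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def predict_class(
    query: int, index: List[Tuple[int, str]], k: int = 10
) -> Tuple[str, List[Tuple[int, str]]]:
    """Return the majority class of the k nearest fingerprints.

    The neighbors are returned alongside the prediction so they can
    serve as the justification for the predicted class.
    """
    neighbors = sorted(index, key=lambda item: hamming(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    predicted = votes.most_common(1)[0][0]
    return predicted, neighbors


# Toy 8-bit fingerprints with human-readable classes (illustrative data only).
index = [
    (0b11110000, "A"),
    (0b11100000, "A"),
    (0b00001111, "B"),
    (0b00011111, "B"),
    (0b11110001, "A"),
]
predicted, neighbors = predict_class(0b11110010, index, k=3)
```

Only the encoded fingerprints and their class labels are consulted here; the original record text never enters the computation, which mirrors the separation the demo describes.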