Efficiently Identifying PII in Unstructured Data Using Analytics
The Growing Problem with PII
In today’s world, driven by big data and online transactions, the use of PII (Personally Identifiable Information) has exploded, raising major concerns among consumers about the privacy of their data. These concerns have translated into a number of regulations enacted into law by regulatory bodies around the world. The range of data that can be classified as PII is fairly broad and can generally be defined as follows:
Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means. Further, PII is defined as information: (i) that directly identifies an individual (e.g., name, address, social security number or other identifying number or code, telephone number, email address, etc.) or (ii) by which an agency intends to identify specific individuals in conjunction with other data elements, i.e., indirect identification.
U.S. Department of Labor
PII-related regulations may be issued at the state, national, and international levels, with each state and country setting its own rules on how PII must be handled and where it can be stored (data sovereignty laws). While PII privacy laws and regulations are typically written for all businesses within a jurisdiction, some are written for specific industries.