Detect person names in text: Part 2 (Technical)

Jan Procházka | Deep Learning, Personal Data, Whitepaper

Neural network architecture

In Detect Person Names in Text: Part 1 (Results), we benchmarked our new named entity recognizer (NER) against popular open-source alternatives such as Stanford NER, Stanza and SpaCy. Today we dig deeper into the NER architecture and its technical details. First, recall our main NER objectives. In short, we require our NER to be practical, rather than just …

Detect person names in text: Part 1 (Results)

Jan Procházka | Deep Learning, Personal Data, Whitepaper

F1 scores for different software doing personal name detection.

Detecting people’s names is part and parcel of PII discovery. Traditional techniques like regexps and keywords don’t work, because the set of all names is too varied. How do open-source Named Entity Recognition (NER) engines compare, and can we do better? Part 1 covers NER results and benchmarks; Part 2 covers the technical details of the neural network. Developing …
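To make the claim about regexps and keywords concrete, here is a minimal Python sketch. It is not taken from either post: the keyword list, the pattern, and the sample sentence are all hypothetical, chosen only to show how a naive "capitalized words plus known first names" detector misses names outside its list, and how the raw pattern flags place names when the keyword filter is dropped.

```python
import re

# Hypothetical, deliberately tiny keyword list of first names
# (an assumption for illustration only).
KNOWN_FIRST_NAMES = {"John", "Mary", "Peter"}

# Naive pattern: two capitalized ASCII words in a row.
NAME_PATTERN = re.compile(r"\b([A-Z][a-z]+) ([A-Z][a-z]+)\b")


def raw_matches(text):
    """Return every 'Capitalized Capitalized' pair, with no keyword filter."""
    return [m.group(0) for m in NAME_PATTERN.finditer(text)]


def find_names(text):
    """Return only the pairs whose first word is a known first name."""
    return [
        m.group(0)
        for m in NAME_PATTERN.finditer(text)
        if m.group(1) in KNOWN_FIRST_NAMES
    ]


sample = "John Smith met Radim Řehůřek and Nguyen Van An in New York."

print(raw_matches(sample))  # pattern alone: includes the place name
print(find_names(sample))   # pattern + keywords: misses most of the names
```

On this sample, the keyword-filtered version finds only one of the three people, while the unfiltered pattern also picks up "New York". Growing the keyword list or loosening the pattern trades one kind of error for the other.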

How to evaluate PII discovery software

Radim Řehůřek | Deep Learning, Personal Data

So, you’re considering buying software for discovery of PII / PCI / PHI. Or you’re about to start your trial of PII Tools. How do you test discovery software properly? Don’ts: Be careful what you test for. Consider the following “passport”: Why won’t PII Tools detect the “PII” in this passport scan? This is an actual file submitted to our support team during …

Finding Affected Persons in a Data Breach

Radim Řehůřek | Data breach, Personal Data

There was a data breach, and the clock starts ticking. The dataset is large. How do you quickly determine who’s affected and how? Whose Data Was Breached? Manual discovery of sensitive information is tedious and costly, so automated solutions like PII Tools come in handy. In the latest 3.7.0 release, we implemented new features in PII Tools to support breach workflows. …

Exclude PII / PCI / PHI From a Breach Report

Radim Řehůřek | Data breach, Personal Data

When responding to a breach incident, having a clear idea of who’s affected and how is a matter of urgency. Manual discovery of PII is tedious and costly, so automated solutions come in handy. But how do you deal with false positives? PII Exclusions: One typical task during a data review is removing unwanted data instances. PII Tools already automates PII …

How To Automate Personal Data Discovery

Radim Řehůřek | CCPA, Relativity

The recent wave of privacy legislation around the world has introduced new challenges for experts in litigation support, incident response and auditing. How can modern automation help with reliable PII discovery across emails, files, and databases? 3 Reasons Keywords Fail: Traditional approaches based on manually defined keywords and regular expressions fail for three fundamental reasons: High cost. Keywords and regexps are …

Export PII drill-down reports

Radim Řehůřek | Personal Data Protection

In the latest February release (version 2.4.0), we combined Personal Data Analytics search with dynamic HTML report generation to make GDPR compliance and auditing easier. PII Tools already supports dynamic PII Analytics queries to locate personal, sensitive and intimate information. It also generates drill-down PII audit reports for completed data scans. In the latest release, we combined these two capabilities. You can now …

Personal Data Analytics

Radim Řehůřek | Deep Learning, Personal Data Protection

The latest 2.0 release of PII Tools brings a brand-new SAR dashboard, allowing targeted personal data search, filtering and analytics. For example, search for all files on John that contain any financial information, while restricting the results to CRITICAL severity only. Locating Information for Data Subject Access Requests: What if you want to find all data related to a specific person …