Top Features to Look For When Comparing Data Discovery Tools

Matyáš Vejskal | Data Discovery, Personal Data, Sensitive Data Discovery Tool

Comparing features of data discovery tools is a complex task that can easily turn into a nightmare. Here is a shortlist of the most important features to take into account when selecting the best option for your business, so you won’t get caught in the net of unimportant details and technical specifications.

The main purpose of data discovery tools is to help decision-makers, such as CTOs, CIOs, and data analysts, base their decisions on big data by gleaning value from large and varied data sets. That’s why there is a wide range of data discovery tools that help identify personally identifiable information (PII) across various sources and enable users to detect patterns and trends. For illustration, check out how manual document reviews can be effectively replaced by an automated tool for personal data identification.

Choosing the right tool for your company can’t happen without proper research. Tools are designed for specific objectives or use cases, offering different levels of security, precision, and efficiency. How do you quickly identify and compare the quality tools suitable for you? To make your work easier, let’s go through the essential features you should look for.

Compare the most important features

1. Clear Mission & Pricing

Some tools on the market are aimed exclusively at the enterprise – expensive to implement and maintain, complex to use. For those who don’t have teams of experts to understand and operate such robust solutions, there are lighter alternatives, laser-focused on providing value to clients of any size.

The best tools know and clearly communicate their mission and scope: what they do, but also what they don’t do, so that customers know what to expect. Contrast that with piles of marketing-speak: do clichés like “privacy is essential” or “we care deeply about your privacy” inspire your confidence? Practical tools put emphasis on the ease of implementation and use.

Finally, there’s one quick rule of thumb: the price of such tools shouldn’t be a mystery. The ability to evaluate the software on your own datasets before purchase, for free, goes without saying.

2. Self-hosted or SaaS

Private and personal data processing is a sensitive matter often requiring the top level of security. The self-hosted option requires a little extra work, but you are rewarded by full control of PII data security as the information is stored on your servers, or in your own private cloud.

On the other hand, the SaaS model is much quicker to implement and cheaper to maintain for customers, at the cost of a loss of control.

No matter which model you choose, a data discovery tool should be able to guarantee that personal and sensitive data remain safe and private at all times.

3. Code or no-code?

The easier the installation and implementation, the better for you. Some tools require a lot of coding, which makes them difficult to set up as well as maintain. Modern best-of-breed solutions come as a turnkey installation using a virtual image, without writing a single line of code.

Standard code-heavy implementation can cost as much as the whole license, so be sure to check this line item twice.

4. Technologies & Formats

People often try to compare features they don’t fully understand. Yes, it may be important how many formats are supported by the tool, but there are more relevant questions to ask.

  • Can the tool analyze unstructured data from images?
  • Is the tool only capable of reading text from images (OCR), or does it analyze the visual content of each image for faces, passports, etc.?
  • Can it handle both structured (Excel, SQL, database…) and unstructured (PDF, Word…) data?

Most software claims to be “AI” these days, but what matters more is whether its technology can bring you any real benefit. Think about what you can actually use. For example, only a few solutions offer functionality for detecting faces, passports, and other PII in images via machine learning algorithms. For companies in the travel, healthcare, or public sectors, such functionality is critical. On the other hand, if all you need to scan are Oracle databases, there’s no need for AI for images.
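To make the structured versus unstructured distinction concrete, here is a minimal sketch in Python. It assumes a deliberately simple email regex as the only detector; the function names `scan_text` and `scan_csv` are hypothetical and do not describe how any particular product works. The point is only that the same detector reports a text offset for free text and a row and column for tabular data.

```python
import csv
import io
import re

# Simplified email pattern, for illustration only (an assumption, not a
# production-grade PII detector).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_text(text):
    """Unstructured input: return (character offset, match) pairs."""
    return [(m.start(), m.group()) for m in EMAIL.finditer(text)]


def scan_csv(csv_text):
    """Structured input: return (row number, column name, match) triples."""
    hits = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column, cell in row.items():
            if not isinstance(cell, str):
                continue  # skip missing cells and overflow fields
            for match in EMAIL.finditer(cell):
                hits.append((row_number, column, match.group()))
    return hits


if __name__ == "__main__":
    print(scan_text("Contact jane@example.com today"))
    print(scan_csv("name,contact\nJohn Smith,john@example.org\n"))
```

A real tool would add many more detectors and file parsers, but the difference in how positions are reported is the same.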

5. Integrated Visual Analytics

To squeeze the most benefit out of discovery, you should look for a tool that offers an intuitive way to visualize the risk in your data inventory and also answer ad-hoc PII data queries, without coding. By ad-hoc, we mean queries like:

  • “Show me all the locations where we keep information on John Smith” (for subject access requests),
  • “Show me all high-risk files and emails” (for remediation or breach incident investigations),
  • or “Show me all locations where we keep financial information” (for PCI-DSS compliance and auditing).

Integrated analytics that allow you to get concrete insights on the risk inherent in your data should be high on your list of required features, as that’s the only way to get an actionable output beyond simple discovery.
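For readers curious what such ad-hoc queries amount to under the hood, the three examples above can be sketched in Python as simple filters over a data inventory. Everything here is an assumption made for illustration: the `Finding` record, its fields, the sample inventory, and the function names are hypothetical. A no-code tool would expose the same questions through its user interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    location: str  # file path, mailbox item, or table where PII was found
    pii_type: str  # e.g. "name", "email", "credit_card"
    value: str     # the detected value
    risk: str      # "low", "medium", or "high"


# Hypothetical inventory produced by a discovery scan.
INVENTORY = [
    Finding("hr/contracts/smith.pdf", "name", "John Smith", "medium"),
    Finding("crm.customers", "name", "John Smith", "low"),
    Finding("mail/inbox/0042.eml", "credit_card", "4111 1111 1111 1111", "high"),
    Finding("finance/q3.xlsx", "iban", "DE89 3704 0044 0532 0130 00", "high"),
    Finding("crm.customers", "email", "jane@example.com", "low"),
]

FINANCIAL_TYPES = {"credit_card", "iban"}


def locations_for_person(inventory, name):
    """Subject access request: every location holding data on one person."""
    wanted = name.lower()
    return sorted({f.location for f in inventory
                   if f.pii_type == "name" and f.value.lower() == wanted})


def high_risk_locations(inventory):
    """Remediation or breach investigation: all high-risk locations."""
    return sorted({f.location for f in inventory if f.risk == "high"})


def financial_locations(inventory):
    """PCI-DSS style audit: every location holding financial information."""
    return sorted({f.location for f in inventory
                   if f.pii_type in FINANCIAL_TYPES})


if __name__ == "__main__":
    print(locations_for_person(INVENTORY, "John Smith"))
    print(high_risk_locations(INVENTORY))
    print(financial_locations(INVENTORY))
```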

6. Drill-down Reports

Automated report generation is a common add-on that helps companies drill down from an executive “bird’s-eye view” all the way down to the file level, or even the exact position within each file (spreadsheet cell, text offset) where a PII instance was detected.
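As a rough illustration of what drilling down means, the sketch below groups the same detections at three levels: an executive summary by PII type, a count per file, and the exact positions within one file. The sample data, the position notation, and the function names are hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical detections: (file, position within the file, PII type).
DETECTIONS = [
    ("finance/q3.xlsx", "Sheet1!B7", "iban"),
    ("finance/q3.xlsx", "Sheet1!C7", "name"),
    ("hr/notes.txt", "offset 118", "email"),
    ("hr/notes.txt", "offset 342", "name"),
]


def executive_summary(detections):
    """Bird's-eye view: number of PII instances per type across all files."""
    return Counter(pii_type for _, _, pii_type in detections)


def by_file(detections):
    """File level: number of PII instances found in each file."""
    return Counter(path for path, _, _ in detections)


def positions_in_file(detections, path):
    """Lowest level: exact position and type of each instance in one file."""
    return [(position, pii_type)
            for file_path, position, pii_type in detections
            if file_path == path]


if __name__ == "__main__":
    print(executive_summary(DETECTIONS))
    print(by_file(DETECTIONS))
    print(positions_in_file(DETECTIONS, "finance/q3.xlsx"))
```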

Some data discovery tools come without reports, leaving it to you to interpret and assemble the discovered data, to effectively create your own reports. Other tools come with this feature included, so you don’t need to pay for developers.

What’s next?

When you have a shortlist of 2-3 best-fitting data discovery tools, put them to the test. Evaluating the solutions on your own data is the most efficient way to understand the strengths and benefits of each tool, when done properly. Avoiding the most common “don’ts” might save you quite a bit of time.

Why PII Tools Rocks

We consider it a professional courtesy not to let you go without at least a tiny sample of our attitude towards data discovery. PII Tools is our product for sensitive data discovery, and it’s packed with advanced features rarely offered by any other players on the market. At least not with an incredibly simple, 30-minute implementation and super intuitive, no-code user interface. That’s where PII Tools shines and gives a powerful tool to even non-technical roles within the company, such as DPOs, lawyers, and auditors.

Unlike all-in-one GRC platforms that often spread themselves too thin, we are focused on providing a specialized solution for sensitive data discovery, analytics, and remediation. This means that PII Tools won’t help you with activities like cookie consent tracking or policy management because they are simply out of our scope.

With those points aside, our solution offers all of the above-mentioned features plus a few more:

  • Incredibly simple implementation
  • Self-hosted and Cloud SaaS
  • Structured & unstructured data
  • OCR in different languages, rotated documents
  • Support for 400+ file formats
  • Discovery mode (identify PII for GDPR, PCI-DSS, CCPA, HIPAA, LGPD, etc.)
  • Remediation mode (PII quarantine, erasure, etc.)
  • Intuitive visual analytics & Interactive reports

To discover the power of PII Tools, take its product tour, or request a FREE DEMO.