Best Practices for Securing PII: Why is PII important if it’s publicly available?

By Scott McKelvey, VP of Technology
May 4, 2023

The terms “Personal Information” (PI) and “Personally Identifiable Information” (PII) are widely known, and many companies now take great care to protect this data for their customers. WebCE is adamant about data privacy (here’s our policy). One of the key best practices we strive to follow is to secure PII both “at rest” (through database or server encryption) and “in transit” (via HTTPS on our site or encrypted file transfers).

This policy is obvious for confidential information such as Social Security Number (SSN), but as we work with individual customers, corporate clients, and regulatory agencies, there’s one data privacy question that comes up more than any other: “Why bother protecting PII that is already available to the public?”

The first phonebook was published in 1878, and today you can find identifiable personal information in an instant. PII could be individual names and phone numbers, job titles on LinkedIn, email addresses from ZoomInfo, or anything personal that has been published on the internet.

In financial services and other government-regulated industries, this question most commonly comes up in relation to state-issued License Numbers or the National Producer Number (NPN). Each of these is used to clearly identify a specific person, meaning that they are unquestionably “PII” by definition. That said, they aren’t confidential or sensitive. In fact, they are often published on state or association websites or available to the public through simple lookup portals. With the data so easily accessible and not confidential, why would anyone care to protect it in uploads, emails, or Excel reports?

The answer is this: The public PII is not sensitive data, but by protecting the “identifiable” components you automatically protect the entire data record, including all the other personal data associated with the user that is not publicly available.

For example, here’s the smallest amount of data needed to report two student course completions:

LicenseNumber   LastName   Grade   CompletionDate   CourseNumber
111             Smith      70      2/28/2023        4556
222             Johnson    100     2/11/2023        6554

The personally identifiable (PII) pieces are LicenseNumber and LastName. Both of those might be published openly on the state website. The grade value of “70” is just a number and not identifiable at all, nor are the completion date or the course approval number. On their own, these are not PII. However, when combined in a single row, they form a “personal data record” because they tell us something done by a specific person. It is no longer just “grade,” but rather “Mr. Smith’s grade.” If we allow this record to be transmitted unprotected because some of the information is public, then we also expose the other personal data to hackers or any man-in-the-middle who might be involved.
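To make the point concrete, here is a minimal Python sketch using the two example rows above. The name IDENTIFYING_FIELDS and the helper split_record are hypothetical, introduced only for illustration; they are not part of any WebCE system. The sketch shows that the same values are anonymous on their own and become personal data once joined to the identifying columns.

```python
# Assumption: these are the two identifying columns named in the article.
IDENTIFYING_FIELDS = ("LicenseNumber", "LastName")

# The two example course completions from the article.
rows = [
    {"LicenseNumber": "111", "LastName": "Smith", "Grade": 70,
     "CompletionDate": "2/28/2023", "CourseNumber": "4556"},
    {"LicenseNumber": "222", "LastName": "Johnson", "Grade": 100,
     "CompletionDate": "2/11/2023", "CourseNumber": "6554"},
]


def split_record(row):
    """Separate a row into its identifying part and everything else."""
    identity = {k: v for k, v in row.items() if k in IDENTIFYING_FIELDS}
    details = {k: v for k, v in row.items() if k not in IDENTIFYING_FIELDS}
    return identity, details


for row in rows:
    identity, details = split_record(row)
    # On their own, the details say nothing about any particular person.
    print("alone: ", details)
    # Joined to the identity, the same details describe one person's activity.
    print("joined:", {**identity, **details})
```

The values in the "alone" line could belong to anyone. The "joined" line is what actually travels in an upload or report, which is why the whole row needs protection.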

But who cares about that other information if it’s not identifiable? We may not know, but we must assume that both Mr. Smith and Ms. Johnson care about their data. Since this data directly describes their activity and behavior, they are the only ones who should decide where and when it is made public. As a business, we are the stewards of this information and need to protect it to the best of our ability while it is in our care.

Maybe Mr. Smith doesn’t want his boss to know that he barely passed with a 70, that he waited until the last day of the month, or that he completes the same course every year. Maybe Ms. Johnson doesn’t want everyone to know she was taking courses on a Saturday. What if she works for a competitor and doesn’t want her coworkers to know she’s taking courses from WebCE at all? In that case, even just the fact that she has an account in our system could be valuable information worth protecting.

We don’t get to judge the importance of other people’s data for them, or choose for them which data is put at risk. We need to protect it all: the entire data record. As soon as identifiable components are involved, the record must be protected, regardless of how public those components are. It’s the connection to the PII that triggers the need to secure and encrypt the entire package, both in transit (inbound or outbound) and at rest.
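One way to picture this rule is as a simple check that runs before a record leaves the building. The Python sketch below is an illustration under stated assumptions, not a description of WebCE's actual systems: the field list, the function names needs_protection and check_transport, and the example.com URLs are all hypothetical. It covers only the in-transit half of the rule; encryption at rest would be handled separately.

```python
from urllib.parse import urlsplit

# Assumption: the identifying columns from the article's example.
IDENTIFYING_FIELDS = {"LicenseNumber", "LastName"}


def needs_protection(record):
    """True if the record carries any non-empty identifying field.

    It does not matter whether that field is publicly available:
    its presence ties every other value in the record to a person.
    """
    return any(record.get(field) not in (None, "")
               for field in IDENTIFYING_FIELDS)


def check_transport(record, url):
    """Raise ValueError if a protected record would travel over non-HTTPS."""
    scheme = urlsplit(url).scheme.lower()
    if needs_protection(record) and scheme != "https":
        raise ValueError(
            f"record contains identifying fields; refusing to send over {scheme!r}"
        )


record = {"LicenseNumber": "111", "LastName": "Smith", "Grade": 70}
check_transport(record, "https://example.com/upload")  # allowed, returns None
```

Note that the check looks only at whether an identifying field is present, never at how sensitive that field is. That mirrors the argument above: a public license number is enough to make the grade next to it worth encrypting.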

WebCE regularly reviews its practices and pushes the industries we serve toward more secure data handling across all channels. Together we can create a secure environment for all student data throughout the licensing and education process.