Free Lunch?

Data Mining Our Personal Information
By Don Denman



‘There is no free lunch’...an old adage which many stamp collectors have understood for years. But for some reason, hobbyists seem to suspend this common-sense understanding when it comes to websites. Many think that if they are not submitting a payment or subscription fee, the use of the website is free. But in the case of the larger websites, this is simply not true. Companies do not invest significant amounts of cash and resources out of the goodness of their hearts. Companies exist to generate revenue, and in this internet age how they generate revenue has become much less obvious than in years past. In many cases, the ‘product’ these companies are peddling is our data and personal information.

The tip of the ‘data ownership’ iceberg was previously uncovered in an infamous chapter of philatelic forum history. One philatelic forum, administered by Lloyd A. de Vries, was popular for a number of years before running afoul of data ownership and usage. The forum was established by Lloyd using a ‘free’ forum service offered by Delphi; as the forum’s popularity grew, members supplied more and more email addresses and content.

After some time, Lloyd was offered a job with a major stamp company that saw part of his value as the ability to deliver a large base of forum users and their years of posted content. But Lloyd and his new employer quickly found out that Delphi aggressively defended its data ownership. Legal actions were threatened, Lloyd immediately lost his administrator permissions on the forum, and his new employer got none of the Delphi content. Delphi had invested large amounts of time and money in building the website infrastructure needed to host ‘free’ forums and felt it owned the data.

But in today’s online world, examples of data ownership are often not nearly as obvious as the Delphi example; too many hobbyists look only for a direct relationship between a free website and revenue. They think, ‘I have not received any spam emails after submitting my email address so this website is ok’. It is not this simple or straightforward. Online companies have become extremely sophisticated at data mining, and many base almost their entire revenue stream upon it. Data mining turns your personal information into a product.

What is data mining? Data mining is the process of analyzing large amounts of data for new visualizations or insights. Data mining typically has several stages, including trend detection, modelling, clustering, classification, and finally report generation. By seeking patterns and associations in the data sets, new information can emerge. And this new information can be productized.

Data mining can be illustrated with a Facebook example. Say you start a free Facebook account and profile but think to yourself, ‘I am not going to post any personal information on my Facebook page’. You make your page ‘private’ and post no personal information on it; you only use it to ‘friend’ your acquaintances and a few family members. Facebook can still data mine this information, noting that the majority of those you have friended are stamp collectors. If Amos, SG, or APS comes along and is willing to buy a list of those who are stamp collectors, your name will be on the list even though you never posted anything on your page.
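
The friend-list inference described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Facebook or any other company actually does it: the member names, the interest labels, the infer_interest function, and the 50% threshold are all made up for the example.

```python
# Hypothetical data: none of these names or labels come from a real service.
# "you" has posted nothing; the service only knows who you friended.
friends = {
    "you": ["alice", "bob", "carol", "dave"],
}

# Interests the service already knows about other members.
known_interests = {
    "alice": {"stamp collecting"},
    "bob": {"stamp collecting", "cycling"},
    "carol": {"stamp collecting"},
    "dave": {"gardening"},
}


def infer_interest(member, interest, threshold=0.5):
    """Guess that a member has an interest if more than `threshold`
    of their friends are known to have it (assumed rule, for illustration)."""
    member_friends = friends.get(member, [])
    if not member_friends:
        return False
    matches = sum(
        1 for f in member_friends if interest in known_interests.get(f, set())
    )
    return matches / len(member_friends) > threshold


# Three of four friends collect stamps, so "you" lands on the list.
print(infer_interest("you", "stamp collecting"))
print(infer_interest("you", "gardening"))
```

The point of the sketch is that the only input about you is your friend list; everything else is borrowed from what the service knows about other people.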

And of course, data mining becomes more and more effective when you increase the total amount of data. A good example of this is Google and YouTube (Google owns YouTube and they combine their data sets into a single massive database). So if you have a free Google Gmail account, use Google as a free search engine, and occasionally watch free YouTube videos, Google can data mine from all three data sets to form a very valuable profile on you.
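
To see why combining data sets is so powerful, here is a rough Python sketch of merging records from three services into one profile per account. It is an assumption-laden toy, not a description of Google's systems: the email address, the sample records, and the build_profiles function are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records, keyed by the same account identifier in each data set.
email_topics = {"collector@example.com": ["auction receipts"]}
search_terms = {"collector@example.com": ["penny black value"]}
videos_watched = {"collector@example.com": ["how to soak stamps"]}


def build_profiles(**sources):
    """Merge per-service records into one combined profile per account."""
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for account, items in records.items():
            profiles[account][source_name] = list(items)
    return dict(profiles)


profiles = build_profiles(
    email=email_topics, search=search_terms, video=videos_watched
)
print(profiles["collector@example.com"])
```

Each data set on its own says little, but once they share a common key (here, the account's email address) they can be stitched into a single, much more revealing profile.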

So collectors should keep this in mind when using and recommending free websites. The more a company has invested in a website, the more they will be driven to get returns on their money. Personal websites (like this one) and websites which request no personal information at all are less likely to be productizing your personal information.

As recent news reveals, data mining has also raced ahead of our laws. Facebook and other ‘free’ online websites are using our personal data as their products, and privacy laws have not kept pace. Over the next few years, I expect that additional transparency will be demanded, including the ability to opt out of much of the data mining and sharing. The increased transparency and oversight will drive these companies to add more human resources since this kind of effort cannot be done programmatically. (Facebook has already announced it will double its 10,000-person safety and security staff by the end of 2018.)

So if these companies are incurring more cost while also seeing fewer people participate in the generation of saleable data, the handwriting is on the wall. The online services which were previously offered for free will be greatly reduced and new paid services will replace them.

There is no free lunch.