Google Analytics and Privacy

Collecting web usage data through services like Google Analytics is a top priority for any library. But what about user privacy?

Most libraries (and websites for that matter) lean on Google Analytics to measure website usage and learn about how people access their online content. It’s a great tool. You can learn where people are coming from (the geolocation of their IP addresses, anyway) and what devices, browsers and operating systems they are using. You can learn how big their screens are. You can identify your top pages and much, much more.

Google Analytics is really indispensable for any organization with an online presence.

But then there’s the privacy issue.

Is Google Analytics a Privacy Concern?

The question is often asked: what personal information is Google Analytics actually collecting? And then, how does this data collection jibe with our organization’s privacy policies?

It turns out, as a user of Google Analytics, you’ve already agreed to publish a privacy document on your site outlining the why and what of your analytics program. So if you haven’t done so, you probably should if only for the sake of transparency.

Personally Identifiable Data

Fact is, if someone really wanted to learn about a particular person, it’s not entirely outside the realm of possibility that they could glean a limited set of personal attributes from the generally anonymized data Google Analytics collects. IP addresses can be loosely linked to people. If you wanted to, you could set up filters in Google Analytics that look at a single IP.

Of course, on the Google side, any user who is logged into their Gmail, YouTube or other Google account is already being tracked and identified by Google. This is a broadly underappreciated fact. And it’s a critical one when it comes to how we approach the privacy issue.

In both the case of what your organization collects with Google Analytics and what all those web trackers, including Google’s trackers, collect, the onus falls entirely on the user.

The Internet is Public

Over the years, the Internet has become a public space and users of the Web should understand it as such. Everything you do is recorded and seen. Companies like Google, Facebook, Microsoft, Yahoo! and many, many others are all in the data mining business. Carriers and Internet Service Providers are also in this game. They deploy technologies in websites that identify you, and then they sell information about your interests, shopping habits, web searches and other activities to companies interested in selling to you. They’ve made billions on selling your data.

Ever done a search on Google and then seen ads all over the Web trying to sell you that thing you searched last week? That’s the tracking at work.

Only You Can Prevent Data Fires

The good news is that with little effort, individuals can stop most (but not all) of the data collection. Browsers like Chrome and Firefox have plugins like Ghostery, Avast and many others that will block trackers.

Google Analytics can be stopped cold by these plugins. But that won’t solve all the problems. Users also need to set up their browsers to delete the cookies that websites save. And moving off of the “free” accounts and services provided by data mining companies, like Facebook, Gmail and Google.com, can also help.

But you’ll never be completely anonymous. Super cookies are a thing and are very difficult to stop without breaking websites. And some trackers are required in order to load content. So sometimes you need to pay with your data to play.

Policies for Privacy Conscious Libraries

All of this means that libraries wishing to be transparent and honest about their data collection also need to contextualize the information in the broader data mining debate.

First and foremost, we need to educate our users on what it means to go online. We need to let them know it’s their responsibility alone to control their own data. And we need to provide instructions on doing so.

Unfortunately, this isn’t an opt-in model. That’s too bad. It actually would be great if the world worked that way. But don’t expect the moneyed interests involved in data mining to allow the US Congress to pass anything that cuts into their bottom line. This ain’t Germany, after all.

There are ways, with a little JavaScript, to add a temporary opt-in/opt-out feature to your site. This will toggle tags added by Google Tag Manager on and off with a single click. But let’s be honest. Most people will ignore it. And if they do opt out, the choice is temporary and easy to overlook on every later visit unless a much more robust opt-in/opt-out function is baked into your site. For most sites and users, this is asking a lot. Meanwhile, it diverts attention from the real solution: users concerned about privacy need to protect themselves and not take a given website’s word for it.
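To make the toggle idea concrete, here is a minimal sketch in TypeScript. It is not necessarily how any particular site does it, and it rests on a few assumptions: the cookie name `analytics_opt_out`, the helper names, and the placeholder property ID `UA-XXXXX-Y` are all made up for illustration. The one piece borrowed from Google is the documented `ga-disable-<property ID>` window flag, which tells the Google Analytics snippet not to send data. A Google Tag Manager setup would more likely read the same cookie in a variable and use it to block tags.

```typescript
// Hypothetical cookie name used to remember the visitor's choice.
const OPT_OUT_COOKIE = "analytics_opt_out";

/** Returns true when a cookie string (e.g. document.cookie) carries the opt-out flag. */
function hasOptedOut(cookieString: string): boolean {
  return cookieString
    .split(";")
    .map((pair) => pair.trim())
    .some((pair) => pair === `${OPT_OUT_COOKIE}=1`);
}

/**
 * Builds the cookie string that records the choice.
 * Opting back in uses max-age=0, which asks the browser to delete the cookie.
 */
function optOutCookie(optOut: boolean, maxAgeDays = 365): string {
  const maxAge = optOut ? maxAgeDays * 24 * 60 * 60 : 0;
  return `${OPT_OUT_COOKIE}=${optOut ? 1 : 0}; max-age=${maxAge}; path=/; SameSite=Lax`;
}

/** Name of the window flag Google documents for disabling a property. */
function gaDisableKey(propertyId: string): string {
  return `ga-disable-${propertyId}`;
}

/**
 * Applies the stored choice to a window-like object.
 * This has to run before the analytics snippet loads to have any effect.
 */
function applyChoice(
  win: Record<string, unknown>,
  cookieString: string,
  propertyId: string
): boolean {
  const optedOut = hasOptedOut(cookieString);
  win[gaDisableKey(propertyId)] = optedOut;
  return optedOut;
}
```

In a browser, a click handler on the opt-out link could set `document.cookie = optOutCookie(true)`, and a script placed ahead of the tracking code could call `applyChoice(window as unknown as Record<string, unknown>, document.cookie, "UA-XXXXX-Y")`. Keeping the choice in a cookie with a long `max-age` is one way to make it stick between visits, though it disappears whenever the visitor clears cookies, which is exactly the fragility described above.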

We actually do our users a service by going with the opt-out model. This underlines the larger privacy problems on the Wild Wild Web, which our sites are a part of.

One thought on “Google Analytics and Privacy”

  1. Publishing a privacy document for the sake of transparency is indispensable and, I hope, a no-brainer.

    Let’s consider the opt-in from a different perspective for a moment. Imagine a few years ahead that opt-in is highly successful and most of our library web users are in. Our conversions and goals of whether users are completing their tasks will have to shift away from analytics reporting to some other methods for measuring. Where will we turn to if our analytics data is highly unrepresentative? Qualitative research is great for understanding the what and why but not so useful at capturing how much.

    Does personalisation of services fit into the same privacy policy document? Libraries are already personalising services to some extent (my checkouts and holds, saved searches, notifications of new resources, and so on). Personalised service is bound to grow in the coming months and years. If we have an opt-in mechanism, personalisation should form part of that. Personalised service is where the richer quantitative data is too.

    I’m all for better privacy and better library user experiences, but I wonder how decisions can be made without reasonably accurate insights based on at least some quantitative measures.
