29 January 2021
Does Federated Learning of Cohorts (FLoC) spell the death of third party cookies?
A few days ago, Google announced an update on their efforts to curtail the use of third party cookies. Their reasoning? Building a better advertising ecosystem for the web with end-user privacy at the forefront of these efforts. With many online advertising products and platforms built around third party cookie use and the data collection it has enabled, you may be wondering: why the need for change?
The answer is a complex mix of governmental regulatory action, high profile data leaks and a growing concern and unease from end users around how they are being targeted, monitored and tracked as they move across the web.
Louder has been commenting on these changes throughout 2020. In July we discussed Apple’s ITP changes, in September we posted Safari Tightens the ITP screws again, and in October we published Apple Just Made 10 Louder, as Apple ramped up its efforts to protect user privacy.
Since the initial ITP launch, there has been an ongoing cat and mouse game between various big tech firms who have tried to adapt to these changes by patching over the cracks and trying to address data gaps and limitations on an ad-hoc basis. But there was still a reluctance to give up the use of third party cookies.
That paradigm started to shift in early 2020, however, when bigger industry players such as Google acknowledged that change is inevitable; a post back in January 2020 described building a web where third party cookies are all but obsolete.
Wait a minute, are third party cookies bad?
The short answer is no, the long answer is… it’s complicated. Like any tool, third party cookies are perfectly valid and not nefarious on their own. It’s the context in which they are used or misused that has brought about all this hoo-ha.
You see, third party cookies are used by ad-tech vendors to monitor user behaviour, not just on a single website or property, but often across vast numbers of web-properties (and in some instances devices too). In doing so, significant user profiles have been created with this data, which in turn has made targeted advertising very effective and profitable, but it’s this ability to monitor user activity across multiple properties that causes the biggest concern.
Unlike first-party cookies, where tracking activity is ring-fenced to a specific domain or business, third party cookies can be used to build significant profiles of what users are or aren’t doing. For example: what they viewed, what they clicked on, what they bought (or didn’t buy), and how often and when they performed these tasks.
But perhaps the bigger concern was how that data was being used or misused once collected. Often these significant cookie pools and user profile data were sold off to the highest bidder, whilst the user was none the wiser (unless they had spent the required hours poring over various T&C documents - probably not practical).
In effect, this has caused an ad tech user profile big data arms race, where each firm has been desperate to collect more and more personal details about you. After all, advertisers want to know as much as possible about the users they target, and have demanded to know more before unleashing their budgets on the respective platforms. Not to mention that most of the big ad tech vendors have been more than happy to share this so-called honey pot of information, because it makes for very effective advertising and targeting, which ultimately translates into more revenue and quarterly profit growth for the platform owners.
But there’s a growing backlash
Historically, the most savvy tech professionals have often been aware of the potential for their data to be exploited to some degree. These professionals often had a mitigation strategy in place, such as configuring firewalls to block requests to known offenders, blocking third party cookies, or installing ad-blocker tools. But while those of us savvy enough to “defend” ourselves and “our data” may have felt better knowing that these big ad tech vendors would only have limited insight into who we are and what ads make us tick, the challenge has been educating less tech-savvy individuals to do the same.
Enter Regulatory Enforcement
Revelations from the likes of Edward Snowden and the NSA along with Facebook’s Cambridge Analytica scandal were high profile privacy exploits that kicked off their own round of regulatory responses and debates in various geographic regions, mainstream media and political circles.
However, even if initiatives such as the GDPR or COPPA had already begun to force wider industry changes to privacy and respect of user data before these stories broke, the user privacy concern suddenly shifted from the fringes (where the IT-savvy geeks were making noise) and was placed front and centre in the minds of the average Internet user. A small but noisy cohort of users who rejected what was happening was easy to ignore previously, but now there was a loud chorus of users demanding greater reform and control over how their data was being shared, used and consumed.
Closer to home, Australia, which until recently had been lagging behind other parts of the world (Europe and the USA in particular) on the user privacy front, is now in the process of reviewing its Privacy Act 1988 and how it pertains to online activity. This has been driven, in part, by annoyed citizens writing to local political representatives stating that they want something done to protect their privacy and prevent large tech firm overreach.
The existing framework and laws have been criticised as insufficient for dealing with an Internet based world. So far the submissions received are vast and include many existing industry players and consultants. Whatever the outcome, it’s fairly safe to assume that changes will be coming soon!
Privacy now matters, especially if you care about profit!
Other browser developers are taking action
Whilst law makers are busy debating how best to protect the user privacy of their citizens in an online world, other browser developers are following Apple in efforts to quash exploitation of user privacy through manipulation of existing browser features. Firefox, for example, recently announced steps to curtail Supercookies, while Google is cutting off Chromium based derivative browsers from storing user data on Google’s servers (potentially creating a data security hole).
The key theme here is that walls around data are quickly being erected and big tech firms need a way to continue to deliver value to their advertisers, while also walking the tight rope of respecting end user privacy.
Enter Google’s recent proposal, Federated Learning of Cohorts (FLoC).
So, what the FLoC is this all about then?
Now that you’ve got more context on the broader industry changes, we can discuss how Google is attempting to pivot away from its reliance on third party cookies. Their recent announcement on the privacy sandbox is an attempt to build similar functionality for advertisers as before, without the need to share individual user-level data.
How does it work?
Federated Learning of Cohorts (FLoC) proposes to cluster groups of users based on similar interests. The idea is that your data is mixed in with a crowd of other users’ data, making fingerprinting difficult if not impossible. The key to this working is that cohort IDs don’t need to be assigned by a central server; rather, they are determined on the end client device (your browser), which uses an algorithm that takes into account signals such as your prior web browsing activity (URLs) to decide which cohort ID to assign to you.
Since more than one user (often thousands) are likely to all share the same cohort ID, it’s not an effective method of identifying individual users or devices (fingerprinting).
So advertisers can still target ads to you based on prior interest, without revealing who you are in granular detail the way that third party cookies may have enabled.
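To make the mechanism above concrete, here is a minimal sketch of on-device cohort assignment. Google’s FLoC prototype used a SimHash-style locality-sensitive hash over browsing history, so users with similar histories land in the same cohort; the specific feature set, hash width, and prefix length below are illustrative assumptions, not Google’s actual parameters.

```python
import hashlib

def simhash(features, bits=64):
    """Compute a SimHash: similar feature sets yield similar hash values."""
    weights = [0] * bits
    for feature in features:
        # Hash each feature (e.g. a visited domain) to a 64-bit integer.
        digest = int.from_bytes(hashlib.sha256(feature.encode()).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (digest >> i) & 1 else -1
    # Each output bit is the majority vote across all features.
    return sum(1 << i for i in range(bits) if weights[i] > 0)

def cohort_id(browsing_history, bits=64, prefix_bits=16):
    """Assign a cohort ID from the leading bits of the SimHash,
    so many users with broadly similar histories share one ID."""
    return simhash(browsing_history, bits) >> (bits - prefix_bits)

# Two users with overlapping interests may fall into nearby (or identical) cohorts.
alice = cohort_id(["cars.example", "garden.example", "news.example"])
bob = cohort_id(["cars.example", "garden.example", "sport.example"])
```

Because only the cohort ID ever leaves the browser, an ad server sees a bucket shared by thousands of users rather than an individual identifier.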
But what about targeting advertising based on prior actions, such as conversions or the likelihood to convert? The approach is similar to interest-based targeting: Google proposes using aggregate-level data to mask individual activity, but for conversion tracking they propose introducing what they call “noise” into the dataset to prevent individual identification, while still allowing machine learning techniques to build relevant audiences to target. For what it’s worth, Google has not yet defined exactly what they consider “noise” to be, but they have provided a whitepaper on how these algorithms work for those who are curious.
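Since Google has not specified its noise mechanism, the sketch below uses Laplace noise, a common differential-privacy technique, purely as an illustration of the idea: an advertiser receives an aggregate conversion count perturbed just enough that no single user’s conversion can be inferred from it.

```python
import random

def noisy_count(true_count, epsilon=1.0):
    """Return an aggregate count plus Laplace(0, 1/epsilon) noise.

    The difference of two independent Exponential(epsilon) draws is
    Laplace-distributed, which avoids numerical edge cases of the
    inverse-CDF method. Smaller epsilon means more noise (more privacy).
    """
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# An advertiser sees a slightly perturbed campaign total, not exact user counts.
reported = noisy_count(1342, epsilon=1.0)
```

Because the noise is zero-mean, reports stay accurate in aggregate across many campaigns, even though any single report is deliberately fuzzed.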
Keep in mind that much of this is still in a state of flux and subject to feedback and input from multiple parties, so it is likely to evolve and change between now and the end of 2021.
So how effective is it?
At the moment, the only data on how well these techniques replicate existing third party cookie tracking is the data provided by Google. Many of these algorithms and techniques are still proposals and are not yet widely adopted across the advertising community.
However, Google has stated that initial studies show these new anonymised techniques to be 95% as effective as existing targeting tools based on third party cookie data.
But all this is not without criticism
The UK Competition and Markets Authority (CMA) has already expressed concern that these proposed changes could allow Google to abuse its online dominance further:
“As the CMA found in its recent market study, Google’s Privacy Sandbox proposals will potentially have a very significant impact on publishers like newspapers, and the digital advertising market,” said Andrea Coscelli, chief executive of the CMA.
“There are also privacy concerns to consider, which is why we will continue to work with the Information Commissioner’s Office as we progress this investigation, while also engaging directly with Google and other market participants about our concerns.”
You can see how they arrived at this viewpoint. Google is able to pull this off because of the sheer volume of users at its disposal; it can infer trends about user cohorts far more readily than smaller publishers or advertising platforms can.
Of course, other big players like Facebook, with their 1B+ users, will likely be able to make similar inferences in ways that smaller players cannot. Facebook has not yet shared how it plans to deal with third party cookies going away, so we can only speculate that it will follow a route similar to the one Google is taking.
What are the next steps?
Google’s 2023 target to phase out cookies is approaching quickly, so they are ambitiously looking for a viable alternative to the existing reliance on third party cookies. Whether the privacy sandbox will prove as effective as they suggest remains to be tested, and we believe Google’s priority in 2021 will be rolling out these sandbox features. In the interim, industry-wide discussions will continue via the W3C’s Improving Web Advertising Business Group.
Louder can help you navigate the chasm forming below
At Louder, our 2021 objective is to keep our finger on the pulse of these industry-wide shifts, continue to document the changes, and call it as we see it. If you would like help navigating your business through these challenges and preparing for these broad, sweeping changes, be sure to get in contact with us. We’d be glad to help.
Resources and further discussion:
Gavin Doolan is a Consultant at Louder specialising in web analytics technology and integration. He views the future of analytics and business intelligence as being driven by Big Data and Machine Learning. In his spare time he enjoys fixing/restoring vintage cars, gardening, spending time with the family and walking his dog “Datsun” around local nature reserves.