Building a better community through user agents
Podcasting is brilliantly different. Unlike its digital predecessor, podcasting has unparalleled collaboration. The Open Podcast Analytics Working Group (OPAWG) shows that our space can come together to better document and improve an area that takes considerable effort to manage: user agents.
So what is a device user agent?
We talk a lot about user agents, so let’s explain what we mean.
When I load up Safari on my iPhone X running iOS 14 and check my user agent, this is what I see:
Mozilla/5.0 (iPhone; CPU iPhone OS 14_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Mobile/15E148 Safari/604.1
Depending on the app, be it browser, mail client, or podcast player, the user agent might be completely different. User agents were built to ensure compatibility, enabling a server to take into account the environment it’s responding to. In ad tech, user agents add a whole additional layer of identification as well.
Just receiving that short bit of information above, you can tell quite a bit about my phone. Services like DeviceAtlas provide detailed breakdowns of what you can find out from most user agents.
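As a rough illustration of how much that one string gives away, here is a minimal Python sketch that pulls the device, OS version, and browser version out of the Safari user agent above. The regular expressions and the `parse_safari_ua` name are my own assumptions written for this one string; they are not how DeviceAtlas or any other service works.

```python
import re

UA = ("Mozilla/5.0 (iPhone; CPU iPhone OS 14_0 like Mac OS X) "
      "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 "
      "Mobile/15E148 Safari/604.1")

def parse_safari_ua(ua):
    """Return device, OS, and browser labels; a value is None when that part is missing."""
    device = re.search(r"\((iPhone|iPad|iPod)[;\s]", ua)
    os_version = re.search(r"OS (\d+(?:_\d+)*) like Mac OS X", ua)
    browser = re.search(r"Version/([\d.]+).*Safari/", ua)
    return {
        "device": device.group(1) if device else None,
        # The OS version uses underscores in the string (14_0), so swap them for dots.
        "os": "iOS " + os_version.group(1).replace("_", ".") if os_version else None,
        "browser": "Safari " + browser.group(1) if browser else None,
    }

print(parse_safari_ua(UA))
```

Note what is missing: the string says "iPhone" and "iOS 14.0", but nothing in it says which iPhone model sent the request.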
What do user agents look like in podcasting?
Most podcast consumption happens through an app, not a web browser, which reduces how much information you can get from the user agent.
User agents for apps can be as succinct as:
Overcast/3.0 (+http://overcast.fm/; iOS podcast app)
As raw as:
PodcastAddict/v2 - Dalvik/2.1.0 (Linux; U; Android 9; SM-N950U Build/PPR1.180610.011)
Or as vague as:
okhttp
Large commercial services like DeviceAtlas are focused heavily on user agent detection from web browsers, not podcast apps. It’s hard enough for those of us invested in identifying user agents correctly to figure out what app the request is coming from, let alone a third party that probably can’t make a lot of money off our industry.
The most commonly received and important pieces of information you can interpret from a podcast user agent are:
- Name of the app (eg Apple, Spotify, Chrome)
- Type of the device (eg Phone, Tablet, Speaker)
- Operating System (eg iOS, Android, Windows)
- Is it a known bot (yes/no)
But distilling the user agent correctly into those four pieces of information is not so clear cut.
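To make the problem concrete, here is a minimal Python sketch that tries to distill the three example user agents above into those four values. The rule tables, the `classify` function, and the generic bot pattern are all hypothetical and deliberately tiny; a real lookup needs far more entries, which is exactly the maintenance burden this article is about.

```python
import re

# Hypothetical, deliberately tiny rule tables; the first match wins in each.
APP_RULES = [
    (r"^Overcast/", "Overcast"),
    (r"^PodcastAddict/", "Podcast Addict"),
]
DEVICE_RULES = [
    (r"iPad|Tablet", "Tablet"),
    (r"iPhone|Mobile", "Phone"),
]
OS_RULES = [
    (r"\biOS\b|iPhone|iPad", "iOS"),
    (r"Android", "Android"),
    (r"Windows", "Windows"),
]
# Assumed heuristic: many, but not all, bots name themselves this way.
BOT_PATTERN = r"bot\b|crawler|spider"

def first_match(rules, ua):
    """Return the label of the first rule whose pattern appears in the user agent."""
    for pattern, label in rules:
        if re.search(pattern, ua):
            return label
    return "Unknown"

def classify(ua):
    """Distill a raw user agent into app, device, operating system, and bot flag."""
    return {
        "app": first_match(APP_RULES, ua),
        "device": first_match(DEVICE_RULES, ua),
        "os": first_match(OS_RULES, ua),
        "bot": re.search(BOT_PATTERN, ua, re.IGNORECASE) is not None,
    }

for ua in [
    "Overcast/3.0 (+http://overcast.fm/; iOS podcast app)",
    "PodcastAddict/v2 - Dalvik/2.1.0 (Linux; U; Android 9; SM-N950U Build/PPR1.180610.011)",
    "okhttp",
]:
    print(classify(ua))
```

Even with rules written specifically for these strings, the device type comes back as "Unknown" for all three, and "okhttp" yields nothing at all.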
How are they used?
App, device, and operating system are the basis for most of the reporting we see in our space. Understanding how your audience breaks down compared to industry norms can give you unique insight into your listenership, but we could do so much more with this data.
Larger hosting platforms could tie together the knowledge of the devices within a digital household, skipping some of the need for using an external device graph. From there, they could do clever things such as targeting ads to any household that has a smart speaker, not just to streaming plays specifically on one.
It could even allow them to identify households that are using multiple podcasting apps. We have the information to understand if people are fully moving to Spotify and abandoning their old apps to listen to exclusives or straddling between two.
Those pieces of information are also coveted for targeting. While targeting gets less focus in podcasting than in digital, there’s still value for an iOS-only service in not wasting impressions on Android devices.
But the fourth value, identifying if it’s a bot, provides a significant amount of value by saving you time and money.
Bots aren’t bad and we absolutely want to provide space for them to exist. They enable companies like Magellan AI to identify what advertisers are spending the most in our space. Apple and Google use bots to download and transcribe full episodes for indexing. Spotify uses bots to cache episodes. Allowing bots to operate like a user while preventing them from impacting reporting is a critical part of working with user agents.
IABv2 certification does provide best practices for filtering out bots, but it doesn’t provide a detailed and updated list to be followed.
A better solution: Open Podcast Analytics Working Group
Founded in March 2019 by Mark Steadman of Podiant, Open Podcast Analytics Working Group (OPAWG) launched with the goal of providing a framework for analytics and sharing of data between companies in the podcast ecosystem.
Buy-in from Dan Benjamin of Fireside and James Cridland of Podnews helped immensely, with the most actionable part of the project being the large user agent database. But there’s a lot more to it than that and we as a community can help this budding group go a lot further.
OPAWG is a free and open source resource with over 22 active users on GitHub and many more on their Slack channel documenting device user agents, RSS feed user agents, hosting URLs, and prefix tracking URLs. These fantastic individuals dig into the values listed above whenever any new app, host, or tech company enters the space.
Instead of your company going it alone and trying to make sense of every single value it receives, participating in OPAWG gives you access to an active, thriving space that documents every new value its members come across. One that you can and should participate in, too.
It’s not uncommon for podcast hosting, tracking, or analytics companies to build their own solutions to handle this process themselves. While a valiant effort, it creates more problems than benefits.
With no uniformity, comparing data between partners can become rough. It also can mean that updates only happen when issues are reported.
Imagine a new bot for a tracking company makes a series of single download requests for a new client’s entire back catalogue. If the host doesn’t have that bot flagged properly, hello hundreds or thousands of IABv2 valid downloads. Even though the IP address is the same for each download, it still counts as one download per episode if the bot isn’t flagged. Now, the hosting platform needs to manually update its bot list and filter the inappropriate downloads from historical reports.
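The scenario above can be sketched in a few lines of Python. This is a deliberately simplified model that only counts unique combinations of IP address, user agent, and episode; it is not the IAB's actual methodology, which involves more than this. The `ExampleTrackerBot` user agent, the IP addresses, and the `count_downloads` function are all made up for illustration.

```python
def count_downloads(requests, bot_markers):
    """Count unique (ip, user agent, episode) requests whose user agent is not a flagged bot."""
    seen = set()
    for ip, ua, episode in requests:
        if any(marker in ua for marker in bot_markers):
            continue  # flagged bots never reach the report
        seen.add((ip, ua, episode))
    return len(seen)

# A made-up bot fetching a 300-episode back catalogue from a single IP address.
requests = [("203.0.113.7", "ExampleTrackerBot/1.0", f"episode-{n}") for n in range(1, 301)]
# One real listener requesting the same episode twice.
requests += [("198.51.100.20", "Overcast/3.0 (+http://overcast.fm/; iOS podcast app)", "episode-300")] * 2

print(count_downloads(requests, bot_markers=[]))                     # bot not flagged: 301
print(count_downloads(requests, bot_markers=["ExampleTrackerBot"]))  # bot flagged: 1
```

Because every request from the bot is for a different episode, deduplicating on IP address alone does nothing to stop it. Only the bot flag does.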
OPAWG is definitely the better solution.
I truly encourage everyone with a stake in developing or using data sources like those listed above to check out the resource directly.
Benefits of using OPAWG’s user agent repository
Every podcasting company will receive this user agent or one like it.
Podcasts/1150.47 CFNetwork/811.5.4 Darwin/16.6.0
Would it shock you to hear that’s a user agent for Apple Podcasts on iOS? Nothing in that string identifies the app or the device. If I wanted to confirm it manually, I’d need to make a test download from my iPhone, using Apple Podcasts, and read the raw user agent from my host to confirm it.
But then I also see this value come along at incredibly high volumes:
AppleCoreMedia/1.0.0.15G77 (iPhone; U; CPU OS 11_4_1 like Mac OS X; en_us)
Do I make the assumption that this is just another variation of Apple Podcasts? Do I treat it as something different? Do I name it “AppleCoreMedia”, “Apple Core Media”, or “Apple Podcasts”?
To cover all my bases, I’d need to buy multiple devices across all the categories (phone, tablet, smart speaker, etc) and operating systems (Android, iOS, Windows, etc), download the top 20 or more podcast players, and then trace the calls to see what user agent is sent when I download an episode and when I stream an episode.
That’s a lot of work.
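For a sense of what the tracing step involves, here is a minimal Python sketch that reads user agents out of a server access log after a test download. It assumes the host exposes logs in the common "combined" log format, which is not a given; the sample lines, IP addresses, and file paths are invented, and `user_agents_for` is a hypothetical helper.

```python
import re
from collections import Counter

# Combined log format ends with: "request" status bytes "referer" "user-agent"
LINE = re.compile(r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"$')

def user_agents_for(log_lines, path_fragment):
    """Count the user agents seen on requests whose request line contains path_fragment."""
    counts = Counter()
    for line in log_lines:
        match = LINE.search(line.strip())
        if match and path_fragment in match.group("request"):
            counts[match.group("ua")] += 1
    return counts

# Invented log lines for one test episode and one unrelated feed request.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Oct/2020:10:00:00 +0000] "GET /episodes/test.mp3 HTTP/1.1" 200 1048576 "-" "Podcasts/1150.47 CFNetwork/811.5.4 Darwin/16.6.0"',
    '198.51.100.20 - - [01/Oct/2020:10:00:05 +0000] "GET /episodes/test.mp3 HTTP/1.1" 200 1048576 "-" "AppleCoreMedia/1.0.0.15G77 (iPhone; U; CPU OS 11_4_1 like Mac OS X; en_us)"',
    '198.51.100.20 - - [01/Oct/2020:10:01:00 +0000] "GET /feed.xml HTTP/1.1" 200 2048 "-" "okhttp"',
]

for ua, hits in user_agents_for(SAMPLE_LOG, "/episodes/test.mp3").items():
    print(hits, ua)
```

The log only tells you which strings arrived. Knowing which app and device produced each one still depends on you having made that test download yourself, once per app, per device, per listening mode.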
OPAWG is a place for our community to come together, identify our own user agents and URLs for the appropriate repository, and collaborate when we come across new values to properly document them.
If your company operates a podcast player, hosts shows, uses prefix URLs, or runs a bot, you absolutely should confirm that OPAWG correctly lists your information. You also should make sure your user agent is following best practices in general.
If your company uses the user agent in any way, you should adopt this free resource, saving yourself a headache and making your customers’ data more consistent with what they see elsewhere.
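Confirming your own listing can be as simple as running your user agent against the shared list and checking what comes back. The Python sketch below assumes each entry carries a set of regular expression patterns plus an app name and a bot flag; those field names, the sample entries, and the `check_listing` helper are my assumptions for illustration, so check the repository itself for the actual schema.

```python
import re

# Assumed entry shape (regex patterns plus labels); the real repository's schema may differ.
ENTRIES = [
    {"patterns": [r"^Overcast/"], "app": "Overcast", "bot": False},
    {"patterns": [r"^PodcastAddict/"], "app": "Podcast Addict", "bot": False},
    {"patterns": [r"^ExampleTrackerBot/"], "app": "Example Tracker", "bot": True},  # made-up bot
]

def check_listing(ua, expected_app, expected_bot, entries):
    """Return a list of problems found when looking up ua; an empty list means it looks right."""
    matches = [e for e in entries if any(re.search(p, ua) for p in e["patterns"])]
    if not matches:
        return ["no entry matches this user agent"]
    problems = []
    if len(matches) > 1:
        problems.append("more than one entry matches: " + ", ".join(e["app"] for e in matches))
    first = matches[0]
    if first["app"] != expected_app:
        problems.append(f"listed as {first['app']!r}, expected {expected_app!r}")
    if first["bot"] != expected_bot:
        problems.append(f"bot flag is {first['bot']}, expected {expected_bot}")
    return problems

print(check_listing("Overcast/3.0 (+http://overcast.fm/; iOS podcast app)", "Overcast", False, ENTRIES))
print(check_listing("MyNewPlayer/1.0", "My New Player", False, ENTRIES))
```

An empty list means the entry matches what you expect. Anything else is something to raise with the group.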
Device user agent is just one of the many values listed in OPAWG.
Next week we’re going to explain how OPAWG clears up the black hole that has been identifying what podcast players your listeners used on the web.
Say goodbye to seeing “Chrome” listed as a top podcast app when it’s really Spotify.
Homework
The goal of Sounds Profitable is to educate and empower each of you. If we’ve had a chance to talk directly, you know that I am truly passionate about both adtech and podcasting. We learn through asking tough questions and discussing the answers. Armed with today’s new knowledge, I want to help you ask more questions. Please consider supporting Sounds Profitable through our Patreon.
- What user agent lookup database does your hosting, analytics, or tracking provider use?
- If they have their own database:
  - Is it accessible for you to review it?
  - Will they accept edits and feedback?
  - How often do they update it?
  - What process do they follow to test for accuracy and discover new user agents?
  - Why don’t they use OPAWG?
- If they use OPAWG:
  - Have they confirmed their entries in OPAWG are correct?
  - How often do they update their local copy?
  - Are they active in improving the resource for the greater community?