Are there any legal issues recreating YouTube SponsorBlock for Podcasts?

I’m part of a small group of Jr Self Taught Web Developers who were recently brainstorming ideas for a Group Project App we could put together and actually build a user base for.

I offered up the suggestion of a podcast application which would have the major feature of being akin to YouTube Sponsor Block, but specifically for podcast episodes.

Essentially, it would be a user-contributed database of timestamps for podcast episodes, marking where the show cuts to sponsored ads or mentions sponsorships so those segments could be edited out. The user could then download the episode with the ads cut out of the final audio file.

My idea was shot down due to fears of possibly infringing on copyright and we ended up going with another idea. I’m certainly not upset, and am actually excited about the project idea we did choose, but it did get me wondering about whether this idea actually could have legal implications.

I know specifically with YouTube there appears to be a sort of legal loophole that prevents Google from suing projects like invidious, yt-dlp, and YouTube Sponsor Block, but am unaware of the specific details as to how this works.

Thusly, I just wanted to ask if anyone has any insights into whether this project idea would incur any legal infractions from the likes of iHeartRadio and other media platforms?

To be clear, I’m not seeking legal advice here, and I’ll be taking any responses with a grain of salt, but I just wanted to see if anyone knows anything on this subject and the legal concerns raised.

I very much dislike being advertised to and podcasts are one of the last bastions of media where advertisements still come up regularly and I’d love to make this application for those who are frustrated with how often they have to skip through sponsor mentions.

Thanks in advance.

tubbadu,

Another answer seems to suggest that the problem is that the same podcast can have a different length depending on where and who is listening to it, due to different ads being injected into it. Here’s my probably stupid and completely ignorant suggestion: instead of using timestamps for both the beginning and end of the ad segment, you could use a timestamp for the beginning, and a hash of the first part of the “non-ads” segment. I’ll try to explain better:


|----------------xxxxx--------------------|
                ^     |___|

The xxx is the ads segment, the ^ is the timestamp of the beginning of the ads, the |___| is a small duration segment (for example, 0.5 seconds) right after the ads segment. The data of that segment is hashed and used as “end ads segment indicator”.

On the other device, where the ads have a different duration, you would start hashing from the ad’s start timestamp onward until you find the corresponding segment.
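A minimal sketch of that idea, assuming the client has the decoded audio as raw bytes. The names `clip_hash` and `find_resume_point` are hypothetical, and the toy byte strings stand in for real samples. Note that an exact hash like SHA-256 only matches when the decoded bytes are identical, so real, lossy-encoded downloads would probably need a perceptual fingerprint instead.

```python
import hashlib
from typing import Optional


def clip_hash(audio: bytes, start: int, size: int) -> str:
    """Exact hash of `size` bytes of decoded audio beginning at `start`."""
    return hashlib.sha256(audio[start:start + size]).hexdigest()


def find_resume_point(audio: bytes, ad_start: int, marker: str, size: int) -> Optional[int]:
    """Slide a window forward from the ad start until its hash equals the marker."""
    for pos in range(ad_start, len(audio) - size + 1):
        if clip_hash(audio, pos, size) == marker:
            return pos
    return None  # marker never found, so do not skip anything


# Toy "audio": each byte stands in for a chunk of decoded samples.
reference = b"SHOW1" + b"ADADAD" + b"SHOW2-continues"
marker = clip_hash(reference, 11, 5)  # hash of the audio right after the ad

# Same episode, but with a longer ad injected at the same starting point.
other_download = b"SHOW1" + b"LONGERADVERT" + b"SHOW2-continues"
resume_at = find_resume_point(other_download, 5, marker, 5)
```

Only the ad start and the marker hash would need to be shared; the scan happens on the listener's own copy of the file.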

Is this doable or did I just say a bunch of idiot things?

z3rOR0ne,

Possibly doable, and definitely not a bunch of idiot things.

The beginning of the FIRST ad certainly does start at a consistent point in the podcast episode, but because the injected targeted ads that follow are dynamic, the timestamps for the beginning of subsequent ads would differ. Hashing the audio was suggested by another helpful person earlier, and it has piqued my interest, though its implementation, at least as I currently conceive of it, would be rather clunky. It would require an ad-free version of the audio file to compare against, as well as a way (possibly a hash, some sort of audio recognition software, or an AI trained to recognize ads) to recognize exactly when the ads start and stop by comparing the ad-free version second by second with the ad-injected version.

Additionally, because re-distribution of said audio files would definitely land me in legal trouble, I’d have to dynamically generate those timestamps and send it to you while you received the audio file from the official distributor, all to be edited on your device upon arrival.

I’m still very much a noob at programming and webdev, so this is definitely something that is probably a few years down the line in the making as I continue to upskill, but it’s good to think about as I would like to produce this if at all feasible (and won’t land me in legal trouble or hurdles, I’d like this to be akin to Invidious and Sponsorblock if possible).

Thanks for the suggestion! You definitely got me thinking hard on this problem and its potential solutions.

some_guy,

Seems like time-shifting (VCRs, Tivo) is protected, but I agree with advice to consult a lawyer.

smileyhead,

Retransmission of a podcast from your own server - no.

Cutting sponsored fragments on the end device - yes.

At least in most countries.

SSJ2Marx,

edited out of the episode and then the user could also download said episode where ads are cut out of the final audio file

This is your problem, because you’re redistributing someone else’s work with the ads cut out, which isn’t sufficiently transformative to qualify for fair use. Sponsorblock is allowed because it doesn’t actually interfere with the video stream, it just tells your computer when to skip ahead using YouTube’s already-existing playback features - your app should work the same way, integrating into an existing podcast platform and skipping forward based on crowdsourced timestamps, then the only thing you’re providing are the timestamps, which don’t violate copyright.
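As a rough illustration of the "only provide timestamps" approach, here is a sketch of the skip decision a player could make locally. The `Segment` type and `next_position` function are hypothetical names, not part of SponsorBlock or any podcast app, and the segment list is assumed to come from a crowdsourced database.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds) of a sponsor read


def next_position(position: float, segments: List[Segment]) -> float:
    """Return where playback should continue from `position`.

    If the position falls inside a marked segment, jump to that segment's end.
    Segments are visited in order, so overlapping ones are chained.
    """
    for start, end in sorted(segments):
        if start <= position < end:
            position = end
    return position


crowdsourced = [(60.0, 90.0), (85.0, 120.0), (300.0, 330.0)]
```

The audio file itself is never modified or redistributed here; the app only seeks within a file the listener already has.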

Zagorath,

Exactly this. What you’d want is to develop your own podcatcher where instead of the listener pressing skip to skip forward 1 minute manually until they get to the end of the ad, your app does it automatically.

Yearly1845,

Go get a lawyer and don’t trust any other answer you get.

z3rOR0ne,

Definitely. Though posting this did help me think through the logistics of the problem itself, so I’m glad I did! Thanks for the good advice.

harsh3466,

I know nothing of the implications of developing what you’re proposing, but I 1000% support it and would actually start listening to podcasts again if it ever came to fruition and I could use it.

JimboDHimbo,

Am I wrong in thinking that if you obscure your identity well enough, there’s no reason to worry about legality?

Silentiea,

I mean, only insofar as that’s true of anything illegal?

z3rOR0ne,

I thought about this, and I believe I just wouldn’t want to increase my threat profile that much. I’d have to put the API up on either a .onion or i2p vps instance, in which case I’d have to brush up my knowledge on the dark web more than I currently have. I’d also have to become a much better cybersecurity expert than I am now. Don’t get me wrong, these are all great skills to have, but if I can go the more legit route of being able to do this legally and without fear of unnecessary legal hurdles (frivolous lawsuits, etc.), then I’d prefer that.

Thorny_Insight,

As much as I hate ads on podcasts, that still wouldn’t be a good enough reason for me to switch apps unless the new one is at least as good in other ways as the one I’m using now.

GlenRambo,

I’ve seen SponsorBlock for YouTube added to a few different players. It seems it operates more like a DB that other clients can hook into. So OP’s idea wouldn’t need to be a new podcast player exactly.

z3rOR0ne,

NewPipe has variations that include additional features like incorporating the YouTube SponsorBlock API. I had thought about forking an existing podcast application and simply adding this API once I figure out how to generate the timestamps.

Bristle1744,

From what I understand, they’re able to practically make custom audio files for every download. Sharing the time stamps wouldn’t work that well. Re-distributing podcasts without the ads would definitely land you in legal trouble, cause every audio file is their “work of art”.

Not a problem for ublock because you’re editing their work of art for your personal use, and sharing unaltered stuff.

And youtube sponsor block is just sharing time stamps you might be interested in.

AI system that can recognize patterns and auto skip forward?

lemming934,

Maybe if you could distribute audio files (or hashes of audio files) that mark the start and stop of ads, that would solve the problem.

I guess podcasters could combat this by inserting random noise into their audio files, but they probably wouldn’t do that.

Bitrot,

They don’t need to add random noise, they just do what they already do and insert new advertising materials. Your static timestamps mean the ads and content end up at different locations.

z3rOR0ne,

Yeah, that’s a problem. Dynamic length of targeted advertisement breaks would mean even a user generated database of timestamps wouldn’t be that useful.

lemming934,

I’m not suggesting static timestamps, but small audio files of the podcast about to enter, and just exited an ad.

The app could then search for the clips in the podcast to get the timestamp.

If there are copyright issues with sharing small clips, you can just save a hash of a clip, which will allow the app to find a match but is not itself the intellectual property of the podcaster; the hash cannot be turned back into the audio file. The hash would be smaller than the audio clip anyway, so sharing hashes would be better.

ScreaminOctopus,

Some perceptual hash of the actual ads could work too. You could run into legal trouble sending the ads themselves or the hosts speaking.

lemming934,

Good idea! I bet you could make a good ad library by comparing the audio between episodes of the same podcast (to catch the ads read by the host) and between different podcasts (to catch the targeted ads inserted into a lot of podcasts).

z3rOR0ne,

Such a library of ads would quickly become massive, even if stored in a series of hash references. Interesting idea though.

z3rOR0ne,

That’s an intriguing approach. Saving a hash of the clip instead of the timestamps MIGHT work. I’m still a bit worried about legal ramifications in that case.

z3rOR0ne,

Yes, using a trained AI model that recognizes ad segments could be possible for this, albeit expensive due to the cost of GPUs on a VPS.

fiat_lux,

so they could be edited out of the episode and then the user could also download said episode where ads are cut out of the final audio file.

This is the part that might be problematic and I can see being part of a civil suit (I am not a lawyer). Depending on how you collect and store the episodes (which you may not actually have to do to achieve your goal, but is the easiest solution) you would likely run afoul of "distribution" precedents in the US that may result in a judgement against you.

But even if you didn't actually break the law, the media lobbies globally are well known for filing huge numbers of lawsuits over anything that even looks a little like it might be costing them money. Defending yourself at all is hard, time-consuming, and often expensive. It's not something I would recommend going into casually.

https://torrentfreak.com/category/lawsuits/ is a great site for learning about the current lawsuits from a tech perspective, and has helped me out many times over the last decade. It's one of the gems of the internet, in my opinion.

z3rOR0ne,

I love that site.

Ultimately, as others have suggested, the most probable way of doing this legally would be to not distribute anything other than timestamps as well as a simple binary/executable that would use a built in media editing tool like ffmpeg to cut out the advertisements/sponsor mentions and then recut the audio file back together. This is much akin to how sponsorblock works from what I’ve gathered so far.
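A hedged sketch of what that local cut step could look like: the helper below (a hypothetical name, `build_ffmpeg_cut_command`) only builds an ffmpeg argument list from a list of ad timestamps. It assumes ffmpeg's `aselect` and `asetpts` audio filters, which drop the selected time ranges and then regenerate timestamps so the remaining audio plays back to back. This re-encodes the file, and the file names are placeholders.

```python
from typing import List, Tuple


def build_ffmpeg_cut_command(src: str, dst: str, ads: List[Tuple[float, float]]) -> List[str]:
    """Build an ffmpeg command that removes the given (start, end) ranges in seconds."""
    if not ads:
        raise ValueError("no ad segments given, nothing to cut")
    # aselect keeps audio frames for which the expression is non-zero,
    # so not(between(...)+between(...)) drops every listed range.
    drops = "+".join(f"between(t,{start:.3f},{end:.3f})" for start, end in ads)
    # asetpts rebuilds timestamps so the kept audio is contiguous.
    audio_filter = f"aselect='not({drops})',asetpts=N/SR/TB"
    return ["ffmpeg", "-i", src, "-af", audio_filter, dst]


cmd = build_ffmpeg_cut_command(
    "episode.mp3", "episode_noads.mp3", [(62.5, 120.0), (900.0, 930.0)]
)
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would run it on the user's device, assuming ffmpeg is installed there; only the timestamps would ever need to be shared.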

parpol,

Legally there are no issues as long as the download option directly downloads from the original host where your application then cuts out the ads. Blocking ads is not illegal. It is neither copyright infringement nor piracy. It might break the ToS but that is ultimately not illegal, and may also be circumventable if carefully implemented.

Invidious does not break the youtube ToS because it doesn’t use the youtube API that the ToS is applied to. Same with Grayjay.

pixel,

speaking of grayjay, how is it? I remember it releasing and then heard a whole lot of nothing after the fact, is it a solid client? I’d love something like vanced that doesn’t require a whole bunch of fiddling to achieve a similar featureset

parpol,

I stopped using revanced in favor of Grayjay. It is an excellent app. It was initially a bit buggy, but not anymore. But don’t get the playstore version. Download it directly to get all plugins.

eatham,

Really good app, I haven’t had many issues with it and I have been using it since it first became available for download.

z3rOR0ne,

I’ve listened to an Invidious Developer talk about why Google is unable to sue the Invidious developers (even though apparently they have wanted to). Apparently they web scrape the data, but I still don’t know how they manage to actually get the Videos if not through the Youtube API? Any clarification on how this is done would be greatly appreciated, if only solely for my own curiosity’s sake.

sin_free_for_00_days,

Oh man, the number of times I’ve wanted this to be a thing! I have no idea the legal issues surrounding it, but it is a product that would be used by a lot of people.

z3rOR0ne,

That’s the hope. I’m still a Jr Web Developer and have a few personal projects under my belt, but none that have gained any user traction due to the things I’ve built being a bit too humble and the use case too niche. This is also a niche use application, but would still have more wide appeal than my previous endeavors. This is still very much a Concept. I might never actually follow through on it, but I’m doing research now to see what sort of solutions or complications I might be missing.

rollingflower,

Great plan! Forking Antennapod would be a good idea I guess.

Also many youtube videos are 1:1 available as podcasts, using the sponsorblock db here would already help.

z3rOR0ne,

That’s a good point. Perhaps that would be a place to start. Thanks!

federalreverse,

It likely won’t work (well), because lots of podcasts actually use Megaphone and similar services that add interest-based ads into your download. I.e. ads can be of variable length or there may even be no ads, because the podcast targets the US but you’re downloading from Pakistan.

z3rOR0ne,

I see. I think there might be an issue in redistribution to a certain extent. Some podcasts you can download directly from their website using RSS feeds and command line tools like wget. But a lot of those don’t directly have sponsor mentions, but if they do, those are easily removed because they aren’t injected at download time.

Others would require downloading through a service like Spotify, then editing the audio file and redistributing it from a centralized data store, and that’s where I believe the legal question would certainly gain more validity.

The alternative is just providing the timestamps and running a script that removes those clips as part of the download from another source (like how yt-dlp can query the SponsorBlock API and cut out sponsor mentions via a command line flag), which I believe would fall into more of a legal grey area.

But yeah, injection of ads based off of location is one potential hiccup I had considered when thinking on the proposed app’s implementation. Unless the ads are always loaded at a specific timestamp in the episode, this means that the length of the ads would be of varying length, making it less likely to work consistently, as you indicated.

So the only way would be to keep the audio files with the sponsor mentions removed in a centralized data store to be redistributed from, which I’m pretty sure isn’t legal…not sure though.

Thanks for the insights!

federalreverse,

Ftr: I was talking about regular RSS feeds+MP3 downloads, not Spotify exclusives.

If you really wanted to do something about Spotify exclusives, the likely only way to do this legally is building a custom Spotify client—Spotify allows custom clients, but only for paying customers, not for free users.

SomeoneSomewhere,

You definitely would have legal issues redistributing the ad-free version.

Sponsor block works partly because it simply automates something the user is already allowed to do - it’s legally very safe. No modification or distribution of the source file is necessary, only some metadata.

It’s an approach that works against the one-off sponsorships read by the actual performers, but isn’t effective against ads dynamically inserted by the download server.

One option could be to crowdsource a database of signatures of audio ads, Shazam style. This could then be used by software controlled by the user (c.f. SB browser extension) to detect the ads and skip them, or have the software cut the ads out of files the user had legitimately downloaded, regardless of which podcast or where the ads appear.

Sponsorships by the actual content producers could then be handled in the same way as SB: check the podcast ID and total track length is right (to ensure no ads were missed) then flag and skip certain timestamps.
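The "check the podcast ID and total track length" step could be sketched roughly like this. `SegmentRecord`, `usable_segments`, and the half-second tolerance are all hypothetical choices for illustration, not how SponsorBlock itself is implemented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SegmentRecord:
    episode_id: str
    duration: float  # length in seconds of the file the timestamps were made against
    segments: List[Tuple[float, float]]  # (start, end) of host-read sponsorships


def usable_segments(
    record: SegmentRecord, episode_id: str, local_duration: float, tolerance: float = 0.5
) -> List[Tuple[float, float]]:
    """Return the stored segments only if the local file looks like the same cut."""
    if record.episode_id != episode_id:
        return []
    # A different length suggests injected ads were added, removed, or missed,
    # so the stored timestamps would no longer line up.
    if abs(record.duration - local_duration) > tolerance:
        return []
    return record.segments


record = SegmentRecord("show-ep42", 3600.0, [(120.0, 180.0)])
```

Returning an empty list when the lengths disagree means the app falls back to skipping nothing rather than cutting real content.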

z3rOR0ne,

One option could be to crowdsource a database of signatures of audio ads, Shazam style. This could then be used by software controlled by the user (c.f. SB browser extension) to detect the ads and skip them, or have the software cut the ads out of files the user had legitimately downloaded, regardless of which podcast or where the ads appear.

That is one of the more unique ideas presented thus far. The other similar approach would be utilizing a trained AI model that would recognize advertisements and sponsor mentions. I’m not exactly sure how Shazam works, but that might be something to research in figuring out how best to approach this. Thanks.

SomeoneSomewhere,

Yeah, I have no idea either, but it’s been around for more than a decade so it should be fairly easy to find a library that duplicates it.

I would be wary of AI-based solutions. There’s a risk of it picking up e.g. satirical/spoof sponsorships as actual ads, and perhaps not detecting unusual ads.

I’m slightly terrified of the day someone starts getting AI to reword and read out individual ads for each stream.

z3rOR0ne,

Perhaps that would be a good first step then. Figure out how Shazam works, then create a standalone application that catalogues and recognizes the audio of advertisements. An obvious name for such an app would be along the lines of “IsAnAd?”. Then hook that standalone application up to a podcast aggregation client and use the timestamps of that to create the desired sponsor block functionality.

Thanks again. Just hashing this out with others like yourself has been super helpful.

ahto,

Even if you’re downloading the file directly from the URL found in the RSS feed, that doesn’t mean that ads can’t be dynamically injected into the file. A URL like download.my.podcast/episode4.mp3 can still be answered by a script that serves a custom version of the podcast with region specific ads.

z3rOR0ne,

You’re right, I had forgotten about targeted ads; that changes the length of the ad break dynamically.

rollingflower,

Not for anything I listen to, they just embed standard product ads in their talking

Greg,

Have you downloaded your podcasts while in another country or with a VPN set to another country?

rollingflower,

AntennaPod doesn’t embed ads. Mostly EU or close to EU countries.

tja,

Not your app, but the server on which the podcast is hosted. They will see from which country you are trying to download it and sometimes insert different ads. But this mostly depends on the podcasts you are listening to.

rollingflower,

Interesting, but this may be at defined timestamps, right? So it wouldn’t change the core idea.

derf82,

They may be added at a defined time stamp, but if the ad length varies, then the timing would just be thrown off.
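To make the drift concrete, here is a small worked example. It assumes, purely for illustration, that timestamps are stored relative to the ad-free content and that the position and length of each injected break in a given download are somehow known (in practice, finding those is the hard part). `content_to_file_time` is a hypothetical helper.

```python
from typing import List, Tuple


def content_to_file_time(content_t: float, inserted_ads: List[Tuple[float, float]]) -> float:
    """Map a time in the ad-free content to a time in one particular download.

    inserted_ads holds (content_position, ad_length) pairs for that download.
    Every break inserted at or before content_t pushes the target later.
    """
    offset = 0.0
    for position, length in sorted(inserted_ads):
        if position <= content_t:
            offset += length
    return content_t + offset


# A host-read sponsor spot begins 1200 s into the actual content.
in_download_a = content_to_file_time(1200.0, [(600.0, 30.0)])  # 30 s injected ad
in_download_b = content_to_file_time(1200.0, [(600.0, 75.0)])  # 75 s injected ad
```

Same break position, different ad length: a raw timestamp taken from download A would be 45 seconds early in download B.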

I know they get pretty local. I listen to a podcast from Canada that inserts ads for concerts in my home city in Ohio.

rollingflower,

I wonder if those inserted ads could be detected. Antennapod also supports download, I wonder how that would work.

Also I wonder if such ads always need to have a given length, but maybe not.

AeroLemming,

Detecting the ads directly would be hard. The real way to do this would be to mark segments of *non-*advertisement and then send the information necessary to identify them to the client so that it can scan through the downloaded audio file and remove anything that shouldn’t be there. The algorithm would still be pretty complicated, but feasible.

rollingflower,

This makes sense.

So mark the segment where an ad starts, then the segment when the show goes on. The client catches the audio snippet which can then be moved to autoskip ads.

Gurfaild,

It should be possible to detect non-ads by downloading different versions of the audio file and checking which sections are identical, but you’d need some way of detecting transitions between sections.
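A toy sketch of the comparison, using Python's standard `difflib.SequenceMatcher` on two lists of frame labels. It assumes each download has already been reduced to a sequence of comparable per-frame fingerprints, which is the genuinely hard part with lossy audio; `shared_sections` and the labels are made up for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Sequence, Tuple


def shared_sections(a: Sequence[str], b: Sequence[str], min_len: int = 3) -> List[Tuple[int, int, int]]:
    """Return (index_in_a, index_in_b, length) for runs present in both versions."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # get_matching_blocks() ends with a zero-length sentinel, filtered out here.
    return [(m.a, m.b, m.size) for m in matcher.get_matching_blocks() if m.size >= min_len]


# Two downloads of the same episode with different injected ads in the middle.
version_a = ["s1", "s2", "s3", "s4", "adA1", "adA2", "s5", "s6", "s7", "s8"]
version_b = ["s1", "s2", "s3", "s4", "adB1", "adB2", "adB3", "adB4", "s5", "s6", "s7", "s8"]

common = shared_sections(version_a, version_b)
```

The gaps between the shared runs are the candidate ad breaks, and the edges of those runs are the transitions mentioned above.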

If the ads use a voice actor who doesn’t talk on the podcast, maybe you could try to detect that.

AeroLemming,

Those are both solutions that only work sometimes and for the former, you have no way of knowing if it worked unless you actually listen to the result. Having to download the podcast twice is also rather undesirable.

z3rOR0ne,

This gets to the heart of the difficulty of this proposed project though. Thanks for going down this train of thought to all involved, very interesting. I had hoped to utilize a series of user contributed timestamps, but this would get more complicated depending on region, distributor, etc. This is a project I’ll be thinking about long term though (and if I really think I have a solid plan, I’ll seek legal advice last to ensure I have all my ducks in a row). Thanks for the advice.

tja,

I don’t really know. However, they will probably have different lengths, so this might be a problem.

aniki,

I think that’s highly app dependent, no? I can see it giving an IP based download. I cannot imagine AntennaPod sending client information on a download request.

makeshiftreaper,

Oh man, that answers some questions I’ve had for a while. Some of the podcasts I listened to have custom ad reads and then some will just blast the same ad as another unrelated one. Especially considering I get republican ads on podcasts with very liberal hosts. Plus gambling ads fucking everywhere
