Suggested, yes, but anecdotally I release a lot of web services in the wild that are not open source, and I don't have any plans to make money off them. It's not a binary "OS or monetization".
I can see content holders willing to pay for an automated service if it's better than what's already available. Particularly if it issued take down requests as well.
> No, we just get the video and search the web based on the keywords.
So you're not indexing every video, you can only find videos that share some keywords with the original. If I had a video with no keywords it wouldn't show up even if it was an exact copy?
How many videos will you analyze per search? If it's something with popular keywords and shows a million videos, will you download all of those and compare?
ikeboy: At this stage we are not indexing to control server costs. However, our goal is to have all videos indexed to reduce the searching time. We are searching the web based on keywords to reduce the searching space.
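To make the keyword approach concrete, here is a minimal sketch of narrowing the search space by shared keywords before any video comparison happens. The function name, the `catalog` structure, and the `min_shared` threshold are all hypothetical and for illustration only; this is not the service's actual code. It also shows the limitation raised above: an exact copy uploaded with no keywords never becomes a candidate.

```python
def keyword_candidates(query_keywords, catalog, min_shared=1):
    """Return ids of videos sharing at least `min_shared` keywords with the query.

    `catalog` maps a video id to its list of keywords (hypothetical structure).
    """
    query = {k.lower() for k in query_keywords}
    hits = []
    for video_id, keywords in catalog.items():
        shared = query & {k.lower() for k in keywords}
        if len(shared) >= min_shared:
            hits.append((video_id, len(shared)))
    # Most shared keywords first; ties broken by id so the order is deterministic.
    hits.sort(key=lambda h: (-h[1], h[0]))
    return [video_id for video_id, _ in hits]


catalog = {
    "a": ["pepsi", "ad", "2017"],
    "b": ["cat", "funny"],
    "c": [],  # an exact copy uploaded without keywords is never considered
    "d": ["Pepsi", "commercial"],
}
candidates = keyword_candidates(["pepsi", "ad"], catalog)
print(candidates)
```

Only the returned candidates would then be downloaded and compared, which keeps server costs down at the price of missing copies with unrelated or empty keywords.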
You don't have to. Fair use is a defence only applicable in court; the easiest way is to sue everyone and let their lawyers assert fair use. AFAIK it's not unlawful, nor will you incur any fines, if you sue someone for using your copyrighted material and it ends up being declared fair use. Additionally, I think YouTube is even more lenient than that towards people asserting unlawful use of their copyrighted material. In short, fair use does not exist outside of the courtroom, and no amount of "I own no copyrights" or "No copyright intended" tags, or citing the copyright code on your videos, can summon it.
YouTube already does an absolutely horrible job of handling takedowns and fair use. I have zero interest in seeing any systems or algorithms built that would aid the sloppy, lazy, and greedy organizations shotgunning takedowns, even on their own material and channels (always funny).
I posted an unlisted video of my daughter at a noisy indoor theme park. It got automatically removed because the venue had a song playing in the background.
It means the average person has no idea how copyright law works. It means lots of average people think there is nothing wrong with non-commercial usage of copyrighted works so long as you don't try to pass something off as your own. It means the legal definition of copyright infringement is out of step with what most people think of as right and wrong.
Hi abdias, thanks for your question! At this stage we only inform our users where the uploaded video is online. Do you think we should deal with fair-use?
Interesting in what way?
Everyone who has spent more than five minutes on youtube knows their bread and butter is copyright infringing videos. It's the unspoken dirty secret of online video.
It's hardly unspoken, Viacom sued them for a billion dollars way back before Google bought them. They only lost the case because they couldn't find evidence that YouTube execs mentioned any specific video when discussing the money they made from piracy.
The fact that they included videos they had specifically authorised or uploaded themselves in the list of supposedly infringing videos couldn't have helped.
Beyond the outright movie/album piracy, there's another phenomenon that happens constantly at Youtube:
1. Someone uploads funny/interesting content.
2. Other people (bots maybe) notice that it's popular and download and re-upload it, titling and promoting it so that it ranks above the original in search results. Often the re-uploaders file takedown notices against the original video or other re-uploaders, so they're removed.
3. Repeat step 2 hundreds of times, until any search for "funny/interesting content" returns only a series of almost-identical 240p 15fps cropped, mirrored, or otherwise unwatchable versions of the original video.
I don't know if they condone this because it results in more overall views, or if they're just terrible at fighting it, but it's largely made Youtube useless for its original purpose as a many-to-many video sharing platform.
We've been looking into this for a couple of years now. We're making steady progress, but we're far from a final result.
The problem is incredibly complex, because Fair Use doesn't have any quantifiable metrics that you can translate into a computer algorithm. The gist of Fair Use is that you have to prove the intent of the uploader trying to rip you off.
As you can imagine, this is a very hard thing to prove, and the number of videos being uploaded every day makes this even harder to crack.
If you're interested in this problem, reach out and I will be happy to share more.
I did their suggested Pepsi video search[1], then picked a lower ranked YouTube video titled "GiveMeNews"[2] where it claims that "61% of your video was found here".
But that video consists of a single frozen frame from the Pepsi ad and someone talking for a couple of minutes. One frame isn't 61% of the video.
Hi personlurking! Yes, because we just launched today and need to control our servers, we are asking for your email so we can send you the reports and hear back about your experience. Thanks for trying :)
rwinn, thanks for checking our platform and for your patience waiting for the results. It's great to know your experience and the results you got (or lack of those :P). We'll check it and get back to you with some news.
I know you didn't ask for this, but I took the liberty of running the video you asked for through our data, and here are the results [0]. As you can see, a pretty decent number of copies were found across many different sites. These results look much more impressive for massively viral videos, but even this paints an interesting picture.
Impressive results! Even where the original is distorted and mixed with other videos it's detected. I take it you use a different technique than spotter?
Thank you. Can't comment on their technology as I have no idea what they are using. Our service is a very complex environment that consists of many different parts and pieces. We run at a huge scale (many thousands of servers) and have indexed more than 4B videos. That allows us to do what you see above.
We struggled with that, but it was the only way we found to provide a fair and good service. You can check an example report on our dashboard (https://dashboard.spotter.tech). Check the Psy video, for example.
Fair use is a thing in the US. It's not a thing in most other places, at least not in anything like the same generic form. And of course a great deal of online copying isn't fair use even in the US.
A shotgun approach is one thing, but speaking as someone who's seen content produced by his businesses shared online, 100% of the copies we've found were blatantly infringing. Trying to keep track of them and get them taken down does waste time that we'd rather spend on activities like making more content. Given that YouTube are awful at dealing with legitimate takedown requests from small businesses (again, personal experience talking here) a tool that would rapidly identify infringing content and let us review it and submit a properly formed takedown notice quickly could be very useful.
>Trying to keep track of them and get them taken down does waste time that we'd rather spend on activities like making more content
Then don't. If people want to steal your content, it's going to be stolen. You cannot solve that. It's better to spend your time making more content and better content so your paying customers pay more and refer more paying friends.
> If people want to steal your content, it's going to be stolen. You cannot solve that. It's better to spend your time making more content and better content so your paying customers pay more and refer more paying friends.
You can solve a lot of it. I wrote a book that took me a lot of time and effort to produce. People started posting it for free. I forced them to take it down. That was much, much easier than writing another book (which would take me about a year)
How many of the people who were going to steal it do you think were converted into legitimate buyers when you took down the free source? Also, I expect that your book is still available in certain circles for free.
> How many of the people who were going to steal it do you think were converted into legitimate buyers when you took down the free source
More than zero. My book is a particular niche that people need, and they will pay what they have to in order to get it. If zero is an option some will take zero, if not they'll pay.
> How many of the people who were going to steal it do you think were converted into legitimate buyers when you took down the free source?
I can't speak for Austen, but in our case we've seen people attempting to rip our stuff and use it in such a way that there was definitely revenue being generated. What's more, we know that the people providing that revenue liked what they saw and were looking for more of it, because some of the comments or ratings on some of the distribution channels are public. It's just that those people who were demonstrably willing to support the content were sending their support to the wrong place.
> Also, I expect that your book is still available in certain circles for free.
Again, I can't speak for Austen, but in our case those "certain circles" would have to be pretty tight and obscure. As I said before, we operate in a relatively small world, and if someone were leaking any of our stuff on a large scale, we'd almost certainly know about it. And if anything out there is only on a small enough scale that we haven't heard about it, then almost all of our other potential customers probably haven't either. That's a big difference from seeing people putting something up on YouTube or Vimeo or whatever and watching them pick up thousands of views/listens and positive feedback in a matter of hours.
> If people want to steal your content, it's going to be stolen. You cannot solve that.
Man who say something cannot be done should not interrupt man who is doing it.
We're talking about small producers operating in niche markets here, not the latest Hollywood summer blockbuster, Taylor Swift album or Game of Thrones episode. You can't just go onto YouTube and find our stuff, except on the rare occasions when someone's managed to post a little of it for a short time before we get it taken down.
The downside of this is that I'm writing this from a normal house, not a luxury yacht somewhere nice and sunny.
The upside is that most of our customers don't know how to save/share our stuff, and the few who do stick out like a sore thumb in their access patterns and can rapidly be cut off. The biggest real world threat to us is not acting quickly on those or the redistribution they attempt, meaning we wind up losing other potential customers to rips of our stuff with someone else's branding/advertising slapped all over them.
> It's better to spend your time making more content and better content so your paying customers pay more and refer more paying friends.
No, it isn't. The evidence in our case is beyond any doubt. Of course we would much rather be making new content and doing more for our paying customers, but we can't afford to just ignore the small minority of abusers.
OP's software identifies videos that could potentially be copyright infringement. The conclusion that we shouldn't build software like that because there's such a thing as fair use doesn't follow.
The subsequent DMCA notice that's filed is the problem. DMCA has a huge problem with abuse from automated filings. Every takedown notice should be prepared by a human. Writing software that automates the process is a crime against culture.
Again, the conclusion doesn't follow. DMCA being abused doesn't mean that copyright is inherently wrong or that software that makes it possible for copyright holders to find offenders is evil or a "crime against culture."
At this stage we analyze YouTube and Facebook. But we are also able to do it with LinkedIn, Vimeo, Twitter, Instagram, the Chinese platform Tencent Video and a couple of porn sites :D
Wrong. You should always consider the potential for abuse when you make software. OP seems to be pushing a commercial angle as well - who exactly do you think is going to pay?
Congratulations to the team on the release. It's exciting to see more companies entering the market.
We built a similar service [0] although from the first look it seems we took a different path. Our approach is to crawl all videos and music on the web, fingerprint the multimedia content and search through the fingerprints.
We just recently passed 4B indexed videos (we run at a fairly large scale [1]), so our results are a bit different. Here is an example for Gangnam Style for comparison [2].
Anyway good luck. Feel free to reach out if there is anything we can do to help.
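For readers wondering what "fingerprint the content and search through the fingerprints" can look like, here is a toy sketch. It is an assumption-heavy illustration, not the service described above: frames are tiny grayscale grids, the fingerprint is a simple average hash, matching is exact, and `FingerprintIndex` with its methods is a hypothetical name. Real systems would need tolerant matching and audio fingerprints as well.

```python
from collections import defaultdict


def average_hash(frame):
    """Toy perceptual hash: one bit per pixel, set when the pixel is above the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


class FingerprintIndex:
    """Hypothetical inverted index from frame fingerprint to the videos containing it."""

    def __init__(self):
        self._index = defaultdict(set)  # fingerprint -> set of video ids

    def add(self, video_id, frames):
        for frame in frames:
            self._index[average_hash(frame)].add(video_id)

    def search(self, frames):
        """Return {video_id: fraction of distinct query fingerprints found in that video}."""
        hashes = {average_hash(f) for f in frames}
        counts = defaultdict(int)
        for h in hashes:
            for video_id in self._index.get(h, ()):
                counts[video_id] += 1
        return {video_id: n / len(hashes) for video_id, n in counts.items()}


# 2x2 grayscale "frames" standing in for decoded video frames.
left = [[10, 200], [10, 200]]
top = [[10, 10], [200, 200]]
diag = [[200, 10], [10, 200]]
left_brighter = [[40, 230], [40, 230]]  # same picture, brightness shifted
bottom = [[200, 200], [10, 10]]

index = FingerprintIndex()
index.add("original", [left, top, diag])
index.add("reupload", [left_brighter, top])
index.add("unrelated", [bottom])

result = index.search([left, top, diag])
print(result)
```

Because the hash compares each pixel to the frame's own mean, the brightness-shifted frame in "reupload" still lands on the same fingerprint as the original. Indexing up front moves the cost to crawl time, so a search is a set of lookups rather than a download and compare per candidate.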
Please don't reply to people with their usernames; HN's threaded comments make that redundant, and it breaks the sense of normal conversation in much the way that repeating people's names out loud would.