Show HN: Spotter – Search for copies of videos on the internet

aaaawweeeee | karma 11 | avg karma 2.2 · 2017-05-10 12:28:16+00:00

Do you track what people search? What do you do with that data? Who do you share it with? What's the business model, if any?

Kiro | karma 10888 | avg karma 1.51 · 2017-05-10 12:30:13+00:00

Who says they're looking to monetize?

aaaawweeeee | karma 11 | avg karma 2.2 · 2017-05-10 12:30:27+00:00

Just asking.

nerdponx | karma 22397 | avg karma 2.51 · 2017-05-10 12:33:45+00:00

Suggested by the fact that it's not Free Software, or even open source.

Kiro | karma 10888 | avg karma 1.51 · 2017-05-10 07:38:12

Suggested, yes, but anecdotally I release a lot of web services in the wild that are not open source and I don't have any plans to make any money off them. It's not a binary "OS or monetization".

dickbasedregex | karma 358 | avg karma 2.65 · 2017-05-10 07:58:51

Servers ain't free, mate.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 13:49:56+00:00

Thanks for asking aaaawweeeee. Answers below:

Do you track what people search? - No, we just get the video and search the web based on the keywords.

What do you do with that data? - We don't keep the videos. We use our own fingerprints.

Who do you share it with? - With the uploader of the video

What's the business model - Trying to figure it out... Have any suggestions? :)

laumars | karma 12945 | avg karma 2.49 · 2017-05-10 09:27:59

I can see content holders willing to pay for an automated service if it's better than what's already available. Particularly if it issued take down requests as well.

ikeboy | karma 14321 | avg karma 2.95 · 2017-05-10 14:34:06+00:00

> No, we just get the video and search the web based on the keywords.

So you're not indexing every video, you can only find videos that share some keywords with the original. If I had a video with no keywords it wouldn't show up even if it was an exact copy?

How many videos will you analyze per search? If it's something with popular keywords and shows a million videos, will you download all of those and compare?

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 10:45:51

ikeboy: At this stage we are not indexing, in order to control server costs. However, our goal is to have all videos indexed to reduce search time. For now, we search the web based on keywords to narrow the search space.

smnscu | karma 1141 | avg karma 2.87 · 2017-05-10 07:38:45

It'd be interesting to build a tool to detect all the copyrighted material on YouTube that bypasses the automated checks. Some of the tricks used:

* picture-in-picture: https://www.youtube.com/watch?v=Rp1aOWUSRZg

* mirrored image

* higher pitch

* other?
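
For the mirrored-image trick in particular, here is a minimal sketch of one way a matcher could tolerate it. This is an assumption for illustration only, not how YouTube, Spotter, or any other service is known to work. It computes a simple average hash of a grayscale frame and compares the original against both the candidate and its left-to-right flip. The names `average_hash`, `mirror`, and `mirror_tolerant_distance` are hypothetical, and frames are plain lists of pixel rows.

```python
from typing import List

Frame = List[List[int]]  # grayscale pixel rows, values 0-255 (assumed layout)


def average_hash(frame: Frame) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def mirror(frame: Frame) -> Frame:
    """Flip a frame left to right."""
    return [list(reversed(row)) for row in frame]


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def mirror_tolerant_distance(original: Frame, candidate: Frame) -> int:
    """Smaller of the distances to the candidate and to its mirror image."""
    h = average_hash(original)
    return min(
        hamming(h, average_hash(candidate)),
        hamming(h, average_hash(mirror(candidate))),
    )


# Tiny made-up frame: a naive comparison sees the flipped copy as different,
# while the mirror-tolerant comparison treats it as identical.
frame = [[10, 200, 30], [40, 50, 220]]
flipped = mirror(frame)
naive = hamming(average_hash(frame), average_hash(flipped))
tolerant = mirror_tolerant_distance(frame, flipped)
```

In practice frames would first be downscaled to a small fixed grid so that re-encoding noise flips only a few bits. Picture-in-picture and pitch shifts would need different handling that this sketch does not attempt.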

abdias | karma 289 | avg karma 3.75 · 2017-05-10 12:42:42+00:00

How will you deal with fair-use?

Asooka | karma 1643 | avg karma 0.98 · 2017-05-10 12:52:25+00:00

You don't have to. Fair use is a defence only applicable in court, so the easiest way is to sue everyone and let their lawyers assert fair use. AFAIK it's not unlawful, nor will you incur any fines, if you sue someone for using your copyrighted material and it ends up being declared fair use. Additionally, I think YouTube is even more lenient than that towards people asserting unlawful use of their copyrighted material. In short, fair use does not exist outside of the courtroom, and no amount of "I own no copyrights" or "No copyright intended" tags, or citing the copyright code on your videos, can summon it.

dickbasedregex | karma 358 | avg karma 2.65 · 2017-05-10 12:57:43+00:00

YouTube already does an absolutely horrible job of handling takedowns and fair use. I have zero interest in seeing any systems or algorithms built that would aid the sloppy, lazy, and greedy organizations shotgunning takedowns, even on their own material and channels (always funny).

yomly | karma 2841 | avg karma 3.05 · 2017-05-10 13:17:19+00:00

This amuses me greatly, got any tangible examples?

emodendroket | karma 21781 | avg karma 1.83 · 2017-05-10 13:29:47+00:00

jwz posted a thing about his long saga fighting a takedown of horror movie reviews.

yomly | karma 2841 | avg karma 3.05 · 2017-05-10 20:43:07+00:00

>even on their own material and channels (always funny).

I was referring specifically to this? Am I right to infer that DMCAs are so scatter gun that publishers have managed to DMCA themselves?

Nadya | karma 4019 | avg karma 1.63 · 2017-05-10 15:44:01

>Am I right to infer that DMCAs are so scatter gun that publishers have managed to DMCA themselves?

Yes this has happened multiple times. Notable one that's somewhat recent that comes to mind: https://torrentfreak.com/warner-bros-flags-website-piracy-po...

brlewis | karma 11715 | avg karma 2.69 · 2017-05-10 15:32:55+00:00

I posted an unlisted video of my daughter at a noisy indoor theme park. It got automatically removed because the venue had a song playing in the background.

corobo | karma 8116 | avg karma 2.43 · 2017-05-10 12:57:52+00:00

> sue everyone and let their lawyers assert fair-use

YouTube would turn into a tumbleweed landscape overnight!

I've often thought YT should just add "No copyright intended" et al as a ContentID trigger though.

emodendroket | karma 21781 | avg karma 1.83 · 2017-05-10 13:29:16+00:00

"No copyright intended" always makes me laugh. What does that even mean?

rmc | karma 15660 | avg karma 2.36 · 2017-05-10 13:35:20+00:00

It means the average person has no idea how copyright law works. It means lots of average people think there is nothing wrong with non-commercial usage of copyrighted works so long as you don't try to pass something off as your own. It means the legal definition of copyright infringement is out of step with what most people think of as right and wrong.

SomeStupidPoint | karma 2326 | avg karma 1.61 · 2017-05-10 14:34:24+00:00

Your line of thinking is why we need anti-SLAPP style laws protecting Fair Use.

Strategic lawsuits against Fair Use are a form of stifling protected speech.

dagw | karma 25328 | avg karma 2.79 · 2017-05-10 12:56:28+00:00

Since there is no algorithmic way of defining fair use, it can only really be dealt with by lawyers in court.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 09:39:14

Hi abdias, thanks for your question! At this stage we only inform our users where the uploaded video is online. Do you think we should deal with fair-use?

retox | karma 1890 | avg karma 2.51 · 2017-05-10 12:56:00+00:00

Interesting in what way? Everyone who has spent more than five minutes on youtube knows their bread and butter is copyright infringing videos. It's the unspoken dirty secret of online video.

Just search for 'full album' or 'full movie'.

sp332 | karma 55607 | avg karma 2.75 · 2017-05-10 13:22:29+00:00

It's hardly unspoken, Viacom sued them for a billion dollars way back before Google bought them. They only lost the case because they couldn't find evidence that YouTube execs mentioned any specific video when discussing the money they made from piracy.

aphexbr | karma 73 | avg karma 2.52 · 2017-05-10 08:40:13

The fact that they included videos they had specifically authorised or uploaded themselves in the list of supposedly infringing videos couldn't have helped.

frogpelt | karma 2144 | avg karma 2.1 · 2017-05-10 14:12:18+00:00

One correction: Google bought YouTube in 2006. Viacom sued YouTube in 2007.

sp332 | karma 55607 | avg karma 2.75 · 2017-05-10 09:52:06

I guess that makes more sense, since YouTube wouldn't have had $1B to sue for. But the actions brought up in the trial were pre-acquisition.

CommieBobDole | karma 4787 | avg karma 10.57 · 2017-05-10 08:36:35

Beyond the outright movie/album piracy, there's another phenomenon that happens constantly at Youtube:

1. Someone uploads funny/interesting content.

2. Other people (bots maybe) notice that it's popular and download and re-upload it, titling and promoting it so that it ranks above the original in search results. Often the re-uploaders file takedown notices against the original video or other re-uploaders, so they're removed.

3. Repeat step 2 hundreds of times, until any search for "funny/interesting content" returns only a series of almost-identical 240p 15fps cropped, mirrored, or otherwise unwatchable versions of the original video.

I don't know if they condone this because it results in more overall views, or if they're just terrible at fighting it, but it's largely made Youtube useless for its original purpose as a many-to-many video sharing platform.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 08:54:20

Hey CommieBobDole, that's exactly the problem we are solving. Our technology is robust enough to detect all those tweaks.

robbiemitchell | karma 1894 | avg karma 3.42 · 2017-05-10 12:56:55+00:00

Other tricks I've seen:

* skip 1-3 seconds every so often

* chop it into blocks and re-arrange them, so you watch out of order but still get the gist (1,2,3,4,5 becomes 1,3,2,5,4)
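
Both tricks change the order or continuity of the video but leave most individual segments intact. As a rough sketch (an assumption for illustration, not any service's actual method), a matcher can ignore ordering entirely by treating each video as a set of per-segment fingerprints and measuring how much of the original's set shows up in the candidate. The function name `containment` and the letter fingerprints are hypothetical stand-ins for real segment hashes.

```python
from typing import Hashable, Sequence


def containment(original: Sequence[Hashable], candidate: Sequence[Hashable]) -> float:
    """Fraction of the original's distinct segment fingerprints found in the candidate."""
    orig_set = set(original)
    if not orig_set:
        return 0.0
    return len(orig_set & set(candidate)) / len(orig_set)


original = ["a", "b", "c", "d", "e"]    # segments 1,2,3,4,5
rearranged = ["a", "c", "b", "e", "d"]  # 1,3,2,5,4: same segments, new order
skipping = ["a", "b", "d", "e"]         # one segment dropped
unrelated = ["x", "y", "z"]

scores = [containment(original, v) for v in (rearranged, skipping, unrelated)]
```

The rearranged copy scores as a full match and the copy with a skipped segment still scores high, which is the point of comparing sets rather than sequences. The trade-off is that ordering information is thrown away, so very short or repetitive videos would be easier to match by accident.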

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 13:56:02+00:00

Hi robbiemitchell, we can also recognize videos altered with that technique :) You should try the platform

emodendroket | karma 21781 | avg karma 1.83 · 2017-05-10 13:28:37+00:00

Cell phone video of a television.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 08:57:43

Hey emodendroket, that's one of the use cases we are better at: recognizing videos from a scan of the TV.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 08:52:58

Thanks for starting a good discussion smnscu. That's exactly what we are here for: we are solving that problem!

my_ghola | karma 187 | avg karma 2.34 · 2017-05-10 15:59:21+00:00

It'd be more interesting to build a tool to detect all the videos on YouTube that fall under fair use but are still being DMCA'd.

austenallred | karma 22490 | avg karma 7.52 · 2017-05-10 11:03:46

Could be a cool open source project. Would almost definitely be a worse business.

doh | karma 3588 | avg karma 5.11 · 2017-05-10 11:44:13

We've been looking into this for a couple of years now. We're making steady progress, but we're far from a final result.

The problem is incredibly complex, because Fair Use doesn't have any quantifiable metrics that you can translate into a computer algorithm. The gist of Fair Use is that you have to prove the intent of the uploader trying to rip you off.

As you can imagine, this is a very hard thing to prove, and the number of videos being uploaded every day makes this even harder to crack.

If you're interested in this problem, reach out and I will be happy to share more.

doh | karma 3588 | avg karma 5.11 · 2017-05-10 11:40:44

Yeah, those are fun ones. We [0] are dealing with most of those.

Here are some fun ones based on Shia LaBeouf's green screen performance https://www.youtube.com/watch?v=ZXsQAXx_ao0

- https://vk.com/video-67185996_171327939

- https://www.youtube.com/watch?v=w_TuTzzSUb0&t=0s

- https://www.youtube.com/watch?v=i4ktEzJvaGM&t=186s

- https://www.youtube.com/watch?v=5BLMW3vQV20&t=124s

etc. There is much more we can do, but there are some we can't. For instance, a 3D movie recorded on a phone in a cinema from a weird angle.

[0] https://pex.com

mholmes680 | karma 454 | avg karma 2.23 · 2017-05-10 12:50:52+00:00

just fyi - got flagged as malware in my company network.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 08:50:59

Hi mholmes680! That's important for us to know. Can you share some details? What page were you on, or what were you doing?

mholmes680 | karma 454 | avg karma 2.23 · 2017-05-10 10:19:32

Just clicked the link directly from HN, redirect ends up "This domain is blocked due to a security threat."

No real info on that page that I can share, though. My company is very aggressive with security, and maybe the blacklist is out of date?

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 15:27:19+00:00

Thanks mholmes680... probably related to the previous owner... not sure how we can change that. If you have any insight please share

And thanks for the heads up :)

mysterypie | karma 2931 | avg karma 6.26 · 2017-05-10 12:59:39+00:00

It's still beta (i.e., slightly buggy) I think.

I did their suggested Pepsi video search[1], then picked a lower ranked YouTube video titled "GiveMeNews"[2] where it claims that "61% of your video was found here".

But that video consists of a single frozen frame from the Pepsi ad and someone talking for a couple of minutes. One frame isn't 61% of the video.

[1] https://dashboard.spotter.tech/reports/1?platform=youtube

[2] https://www.youtube.com/watch?v=sKrDECFhosA

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 08:42:26

Hey mysterypie. Yes, we just launched and want to collect as much feedback as possible.

Thanks for that note, we are checking that. The good thing is that we were able to find a single frame from the original video :)

nefitty | karma 2511 | avg karma 1.62 · 2017-05-10 09:37:59

Yeah, that actually sounds like it could be very useful as well! Good work on this so far.

personlurking | karma 2915 | avg karma 2.67 · 2017-05-10 08:01:24

FYI: You can only test the default videos. Trying any other (Youtube) video URLs leads to them asking for you to sign up in order to see the results.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 13:59:06+00:00

Hi personlurking! Yes, because we just launched today and need to control our servers, we are asking for your email to send you the reports and hear about your experience. Thanks for trying :)

rwinn | karma 722 | avg karma 6.94 · 2017-05-10 13:18:23+00:00

Jumped through the hoops and signed up to submit my own video, still "Queued to analyze..." after 1h.

rwinn | karma 722 | avg karma 6.94 · 2017-05-10 13:33:50+00:00

Got the results now, posted one of my own videos I know has been reuploaded multiple times:

    0 total copies were found
    with 0 total views

Original: https://vimeo.com/132700334

Some known re-uploads:

https://www.youtube.com/watch?v=X0oSKFUnEXc

https://www.youtube.com/watch?v=onbi3Ws8fng

https://www.youtube.com/watch?v=SCE-QeDfXtA

---

Great concept, would love if it actually worked.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 08:40:13

rwinn, thanks for checking our platform and for your patience waiting for the results. It's great to know your experience and the results you got (or lack of those :P). We'll check it and get back to you with some news.

(We just launched the tool and it's still in beta.)

doh | karma 3588 | avg karma 5.11 · 2017-05-11 02:44:27+00:00

I know you didn't ask for this, but I took the liberty of running the video you asked about through our data, and here are the results [0]. As you can see, a pretty decent number of copies were found across many different sites. These results look much more impressive for massively viral videos, but even this paints an interesting picture.

[0] https://www.dropbox.com/s/2913aiiug2bvwb9/pex_Inside_an_Arti...

rwinn | karma 722 | avg karma 6.94 · 2017-05-11 15:38:16+00:00

Impressive results! Even where the original is distorted and mixed with other videos it's detected. I take it you use a different technique than spotter?

doh | karma 3588 | avg karma 5.11 · 2017-05-12 05:05:28+00:00

Thank you. Can't comment on their technology as I have no idea what they are using. Our service is a very complex environment that consists of many different parts and pieces. We run at a huge scale (many thousands of servers) and have indexed more than 4B videos. That allows us to do what you see above.

rixrax | karma 1503 | avg karma 4.28 · 2017-05-10 14:16:39+00:00

Duh. Spotter requires sign-up with an email address to receive the report. No Thank You.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 14:22:27+00:00

Hi rixrax!

We struggled with that, but it was the only way we found to provide a fair and good service. You can check an example report on our dashboard (https://dashboard.spotter.tech). Check the Psy video, for example.

Thanks for your feedback!

Sir_Cmpwn | karma 19702 | avg karma 5.33 · 2017-05-10 14:05:08+00:00

Sounds like an excellent tool for censorship and copyright enforcement. Can we not make things like this?

austenallred | karma 22490 | avg karma 7.52 · 2017-05-10 09:06:07

Because copyright enforcement is inherently evil? Please

Sir_Cmpwn | karma 19702 | avg karma 5.33 · 2017-05-10 09:15:54

Yes, if you use a shotgun approach. Fair use is a thing.

Silhouette | karma 13968 | avg karma 1.95 · 2017-05-10 15:02:28+00:00

Fair use is a thing in the US. It's not a thing in most other places, at least not in anything like the same generic form. And of course a great deal of online copying isn't fair use even in the US.

A shotgun approach is one thing, but speaking as someone who's seen content produced by his businesses shared online, 100% of the copies we've found were blatantly infringing. Trying to keep track of them and get them taken down does waste time that we'd rather spend on activities like making more content. Given that YouTube are awful at dealing with legitimate takedown requests from small businesses (again, personal experience talking here) a tool that would rapidly identify infringing content and let us review it and submit a properly formed takedown notice quickly could be very useful.

Sir_Cmpwn | karma 19702 | avg karma 5.33 · 2017-05-10 10:06:46

>Trying to keep track of them and get them taken down does waste time that we'd rather spend on activities like making more content

Then don't. If people want to steal your content, it's going to be stolen. You cannot solve that. It's better to spend your time making more content and better content so your paying customers pay more and refer more paying friends.

austenallred | karma 22490 | avg karma 7.52 · 2017-05-10 10:14:57

> If people want to steal your content, it's going to be stolen. You cannot solve that. It's better to spend your time making more content and better content so your paying customers pay more and refer more paying friends.

You can solve a lot of it. I wrote a book that took me a lot of time and effort to produce. People started posting it for free. I forced them to take it down. That was much, much easier than writing another book (which would take me about a year)

Sir_Cmpwn | karma 19702 | avg karma 5.33 · 2017-05-10 15:25:24+00:00

How many of the people who were going to steal it do you think were converted into legitimate buyers when you took down the free source? Also, I expect that your book is still available in certain circles for free.

ygaf | karma 218 | avg karma 1.0 · 2017-05-10 15:42:15+00:00

Legitimate buyers can convert the other way.

austenallred | karma 22490 | avg karma 7.52 · 2017-05-10 16:00:33+00:00

> How many of the people who were going to steal it do you think were converted into legitimate buyers when you took down the free source

More than zero. My book is a particular niche that people need, and they will pay what they have to in order to get it. If zero is an option some will take zero, if not they'll pay.

Silhouette | karma 13968 | avg karma 1.95 · 2017-05-10 16:42:10+00:00

> How many of the people who were going to steal it do you think were converted into legitimate buyers when you took down the free source?

I can't speak for Austen, but in our case we've seen people attempting to rip our stuff and use it in such a way that there was definitely revenue being generated. What's more, we know that the people providing that revenue liked what they saw and were looking for more of it, because some of the comments or ratings on some of the distribution channels are public. It's just that those people who were demonstrably willing to support the content were sending their support to the wrong place.

> Also, I expect that your book is still available in certain circles for free.

Again, I can't speak for Austen, but in our case those "certain circles" would have to be pretty tight and obscure. As I said before, we operate in a relatively small world, and if someone were leaking any of our stuff on a large scale, we'd almost certainly know about it. And if anything out there is only on a small enough scale that we haven't heard about it, then almost all of our other potential customers probably haven't either. That's a big difference from seeing people putting something up on YouTube or Vimeo or whatever and watching them pick up thousands of views/listens and positive feedback in a matter of hours.

Silhouette | karma 13968 | avg karma 1.95 · 2017-05-10 16:37:28+00:00

> If people want to steal your content, it's going to be stolen. You cannot solve that.

Man who say something cannot be done should not interrupt man who is doing it.

We're talking about small producers operating in niche markets here, not the latest Hollywood summer blockbuster, Taylor Swift album or Game of Thrones episode. You can't just go onto YouTube and find our stuff, except on the rare occasions when someone's managed to post a little of it for a short time before we get it taken down.

The downside of this is that I'm writing this from a normal house, not a luxury yacht somewhere nice and sunny.

The upside is that most of our customers don't know how to save/share our stuff, and the few who do stick out like a sore thumb in their access patterns and can rapidly be cut off. The biggest real world threat to us is not acting quickly on those or the redistribution they attempt, meaning we wind up losing other potential customers to rips of our stuff with someone else's branding/advertising slapped all over them.

> It's better to spend your time making more content and better content so your paying customers pay more and refer more paying friends.

No, it isn't. The evidence in our case is beyond any doubt. Of course we would much rather be making new content and doing more for our paying customers, but we can't afford to just ignore the small minority of abusers.

austenallred | karma 22490 | avg karma 7.52 · 2017-05-10 10:05:37

OP's software identifies videos that could potentially be copyright infringement. The conclusion that we shouldn't build software like that because there's such a thing as fair use doesn't follow.

Sir_Cmpwn | karma 19702 | avg karma 5.33 · 2017-05-10 15:12:12+00:00

The subsequent DMCA notice that's filed is the problem. DMCA has a huge problem with abuse from automated filings. Every takedown notice should be prepared by a human. Writing software that automates the process is a crime against culture.

austenallred | karma 22490 | avg karma 7.52 · 2017-05-10 10:58:24

Again, the conclusion doesn't follow. DMCA being abused doesn't mean that copyright is inherently wrong or that software that makes it possible for copyright holders to find offenders is evil or a "crime against culture."

tokenizerrr | karma 2693 | avg karma 2.1 · 2017-05-10 09:15:14

And here I was thinking porn.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 14:24:02+00:00

Hi tokenizerrr, I was counting down until someone mentioned porn, you were the first!!! YAY!!!! :)))

tokenizerrr | karma 2693 | avg karma 2.1 · 2017-05-10 14:34:50+00:00

So what sites do you scrape? Just YouTube?

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 10:02:20

At this stage we analyze YouTube and Facebook. But we are also able to do it with LinkedIn, Vimeo, Twitter, Instagram, the Chinese platform Tencent Video and a couple of porn sites :D

darkstar999 | karma 2159 | avg karma 2.69 · 2017-05-10 09:23:36

A knife can be used for good or for evil. You don't stop tools from being made just because they can be used in bad ways.

Sir_Cmpwn | karma 19702 | avg karma 5.33 · 2017-05-10 14:26:06+00:00

Wrong. You should always consider the potential for abuse when you make software. OP seems to be pushing a commercial angle as well - who exactly do you think is going to pay?

gondo | karma 567 | avg karma 2.21 · 2017-05-10 15:44:16+00:00

similar service https://pex.com/

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 10:47:52

Thanks gondo! Please try our tool and give us feedback ;) Which one got the best results?

doh | karma 3588 | avg karma 5.11 · 2017-05-10 16:36:59+00:00

Congratulations to the team on the release. It's exciting to see more companies entering the market.

We built a similar service [0] although from the first look it seems we took a different path. Our approach is to crawl all videos and music on the web, fingerprint the multimedia content and search through the fingerprints.

We just recently passed 4B indexed videos (we run at a fairly large scale [1]). Thus our results are a bit different. Here is an example for Gangnam Style for comparison [2].

Anyway good luck. Feel free to reach out if there is anything we can do to help.

[0] https://pex.com

[1] https://news.ycombinator.com/item?id=13259415

[2] http://i.imgur.com/3KDKHsI.png
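
The crawl, fingerprint, and search approach described above can be pictured as an inverted index from fingerprints to the videos that contain them. The sketch below is only a toy under stated assumptions: the class `FingerprintIndex`, its methods, and the integer fingerprints are all hypothetical and say nothing about how Pex or Spotter actually store or match data.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple


class FingerprintIndex:
    """Toy inverted index: fingerprint -> ids of videos containing it."""

    def __init__(self) -> None:
        self._postings: Dict[Hashable, Set[str]] = defaultdict(set)

    def add(self, video_id: str, fingerprints: Iterable[Hashable]) -> None:
        """Record every fingerprint of a crawled video."""
        for fp in fingerprints:
            self._postings[fp].add(video_id)

    def search(self, fingerprints: Iterable[Hashable]) -> List[Tuple[str, int]]:
        """Rank indexed videos by how many distinct query fingerprints they share."""
        hits: Counter = Counter()
        for fp in set(fingerprints):
            for video_id in self._postings.get(fp, ()):
                hits[video_id] += 1
        return hits.most_common()


index = FingerprintIndex()
index.add("copy", [1, 2, 3, 9])   # made-up fingerprints of a re-upload
index.add("other", [7, 8, 9])     # mostly unrelated video
results = index.search([1, 2, 3, 9])
```

Because lookups go through the index, a query touches only videos that share at least one fingerprint with it, rather than comparing against every crawled video. The cost is paid up front in crawling and storage, which is the trade-off between indexing everything and searching by keywords first.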

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 16:53:42+00:00

Hey doh! Thanks for reaching out. We have no videos indexed and we are doing strictly visual search - we are Computer Vision evangelists :)

Thanks for your availability, likewise: feel free to drop us a note.

dang | karma 18142 | avg karma 0.25 · 2017-05-10 14:07:33

Please don't reply to people with their usernames; HN's threaded comments make that redundant, and it breaks the sense of normal conversation in much the way that repeating people's names out loud would.

joaodmj | karma 54 | avg karma 1.35 · 2017-05-10 19:53:05+00:00

Thanks for letting me know. I'm new to HN :)

dang | karma 18142 | avg karma 0.25 · 2017-05-10 19:56:02+00:00

You're welcome, and welcome!