I've thought about applying something like a threshold filter to thumbnails to distort them enough that they're no longer potentially "offensive", while still making it apparent whether they might contain bad content. It's too difficult a problem for me to actually implement, though, and I suspect it could be easily circumvented.
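For what it's worth, here's a rough sketch of that kind of filter using Pillow (the threshold and thumbnail size are numbers I made up):

    # Reduce a thumbnail to a harsh black-and-white silhouette: fine
    # detail is destroyed, but overall shapes stay human-recognizable.
    from PIL import Image

    def thresholded_thumbnail(path, threshold=128, size=(128, 128)):
        img = Image.open(path).convert("L")   # grayscale first
        img.thumbnail(size)                   # shrink in place
        # Map every pixel to pure black or pure white.
        return img.point(lambda p: 255 if p > threshold else 0)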
You could run through the sites you grab the thumbnails from and take an educated guess about whether they should be shown.
Most adult sites have an RTA meta tag. You can use that along with Alexa data and scraping the keywords and domain name for the usual suspect text fragments. It might help block out those horrific images that can't be unseen...
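A minimal sketch of that check, assuming requests and BeautifulSoup are available; the keyword list is obviously just illustrative, and I'm quoting the RTA label string from memory:

    import requests
    from bs4 import BeautifulSoup

    RTA_LABEL = "RTA-5042-1996-1400-1577-RTA"   # the standard RTA label
    SUSPECT_WORDS = {"porn", "xxx", "adult"}    # hypothetical fragments

    def looks_adult(url):
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        # RTA-labelled sites carry the label in a meta tag.
        for tag in soup.find_all("meta"):
            if RTA_LABEL in (tag.get("content") or ""):
                return True
        # Fall back to scanning the keywords meta tag and the domain.
        kw = soup.find("meta", attrs={"name": "keywords"})
        text = ((kw.get("content") or "") if kw else "") + " " + url
        return any(w in text.lower() for w in SUSPECT_WORDS)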
Otherwise, I love this service, and I'm pretty sure I'll be using it.
No. I looked into this a lot for a dating app I ran, and no algorithm came close to human moderation (which can get expensive), even for images you'd consider obviously pornographic.
A funny idea I had was to reverse the whole system: feed the UGC of a site that's supposed to be SFW into a porn site that's definitely NSFW, one with lots of thumbnails. The images that never get clicked to enlarge probably aren't porn and pass the test :)
This is just an idea, since I've never built such a filter, but you could automate a large part of NSFW image filtering. A quick Google search led to this paper: http://cs229.stanford.edu/proj2005/HabisKrsmanovic-ExplicitI...
Once you have that in place, I guess it's better to make it aggressive and let false positives get reported as NSFW.
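I never built it either, but the core skin-detection idea in papers like that one is simple enough to sketch (assuming Pillow; the RGB rule is a classic rule of thumb, and the 40% cutoff is a made-up number, deliberately aggressive so borderline images land on the NSFW side):

    from PIL import Image

    def is_skin(r, g, b):
        # A classic rule-of-thumb test for skin tones in RGB space.
        return (r > 95 and g > 40 and b > 20 and
                max(r, g, b) - min(r, g, b) > 15 and
                abs(r - g) > 15 and r > g and r > b)

    def probably_nsfw(path, cutoff=0.40):
        img = Image.open(path).convert("RGB")
        img.thumbnail((64, 64))               # trade accuracy for speed
        pixels = list(img.getdata())
        skin = sum(1 for p in pixels if is_skin(*p))
        return skin / len(pixels) > cutoff    # aggressive by design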
Google's "safe image search" has the additional help of scanning the content of the page where the image is used. You might be able to do the same, up to a point, by checking the HTTP Referer header to see where requests are coming from, then scanning the referring page for certain keywords. That would give you a better idea of the context in which the image is used. Note that this can be tricky, since you probably don't want traffic going out of your server to some child porn site.
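Something like this, roughly (the keyword list is illustrative, and per the caveat above you'd want to be careful about which referrers your server actually fetches):

    import requests

    SUSPECT_WORDS = ("porn", "xxx", "escort")   # hypothetical list

    def referer_looks_bad(referer_url):
        if not referer_url:
            return False      # no Referer header: nothing to go on
        try:
            page = requests.get(referer_url, timeout=5).text.lower()
        except requests.RequestException:
            return False      # unreachable referrer: don't block on it
        return any(w in page for w in SUSPECT_WORDS)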
That said, those are just some ideas. YouTube has a good community that flags videos, but also an army of reviewers who look at the flagged content.
I know they have blurred previews for content marked sensitive, and I imagine porn marketers, or people sharing content that's illegal anyway, are more than happy to ignore that. I wonder if you could blur, or even better, not load any images at all and display them selectively? Or not load images from people you don't follow?
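Server-side blurring is cheap, at least; a quick Pillow sketch (the radius is arbitrary), where you'd only serve the original once the user opts in:

    from PIL import Image, ImageFilter

    def blurred_preview(path, radius=12):
        # Strong Gaussian blur: enough to hide detail at preview size.
        return Image.open(path).filter(ImageFilter.GaussianBlur(radius))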
Defining "bad content" is difficult. I don't think we're nearly advanced enough to avoid human intervention. Using user reports works well.
Here's an example: on YouTube you can find harmless videos of parents bathing their babies, and I doubt anyone would call that child pornography. The same goes for something like a birth video, even though it shows a vagina. How do you detect the difference?
I think it would be an improvement if Facebook were more open with their policies and allowed nudity, even if it required flagging the content as mature. To get around the current rules, people end up placing tiny black bars over the "offensive" parts. But how does that seriously change anything?
I'd say it's pretty reasonable to ask users to flag content. Detecting it programmatically is currently impossible, so what other choice do you have if you want to allow users to upload content? Should only multi-billion-dollar companies be able to host user-uploaded content? How would you handle this issue if you were building a small media-sharing site?
Isn't the easy solution to require a login for particularly adult content (even if it's only required for part of an article's content), the same way YouTube does? You could also enable "mouse-overs" for authenticated users so they don't see potentially "offensive" images straight away.
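As a sketch of the gating idea, assuming Flask (the flag lookup and route are placeholders I invented):

    from flask import Flask, abort, send_file, session

    app = Flask(__name__)
    app.secret_key = "change-me"          # needed for sessions

    def is_flagged_adult(image_id):
        return image_id in {"img123"}     # hypothetical flag store

    @app.route("/image/<image_id>")
    def serve_image(image_id):
        # Anonymous users never get flagged images straight away.
        if is_flagged_adult(image_id) and "user_id" not in session:
            abort(403)
        return send_file(f"images/{image_id}.jpg")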
By malicious, do you mean inappropriate, or containing a malicious metadata inside the image? The images are probably stripped and/or compressed, removing any malicious metadata.
You can use machine learning to classify images into various categories and detect inappropriate ones. For example, this is what Yahoo open-sourced: https://github.com/yahoo/open_nsfw
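Usage is roughly like this, assuming Caffe's Python bindings; I'm citing the model file names from memory, so check the repo, and the 0.8 cutoff is just a starting point:

    import caffe
    import numpy as np

    net = caffe.Net("nsfw_model/deploy.prototxt",
                    "nsfw_model/resnet_50_1by2_nsfw.caffemodel",
                    caffe.TEST)

    t = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
    t.set_transpose("data", (2, 0, 1))             # HWC -> CHW
    t.set_mean("data", np.array([104, 117, 123]))  # BGR channel means
    t.set_raw_scale("data", 255)
    t.set_channel_swap("data", (2, 1, 0))          # RGB -> BGR

    def nsfw_score(path):
        img = caffe.io.load_image(path)
        net.blobs["data"].data[...] = t.preprocess("data", img)
        prob = net.forward()["prob"][0]
        return float(prob[1])          # index 1 is the NSFW probability

    print(nsfw_score("photo.jpg") > 0.8)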
Additionally, user-generated content sites will almost always have a way for users to flag inappropriate content. If something's flagged disproportionately, it will be removed.
I don't think anyone's suggesting filtering out that content, but having a warning that there may be some disturbing images seems fair, and we'll likely put one on the site when we have a chance.
You could look at it as trying to get them blocked by search engines. Can you detect when they're proxying a search bot as opposed to a real user? As for punishment, you don't have to make it eye-bleach; just make it firmly NSFW, so nobody can get any business value from the site or even use it safely at work.
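One detection trick: the major engines let you verify their crawlers by reverse-then-forward DNS, so a request with a Googlebot user agent whose IP fails that check is coming from somewhere else, e.g. a scraping proxy. A rough sketch:

    import socket

    def is_real_googlebot(ip):
        try:
            host = socket.gethostbyaddr(ip)[0]
            if not host.endswith((".googlebot.com", ".google.com")):
                return False
            # Forward-confirm so a faked rDNS record doesn't pass.
            return ip in socket.gethostbyname_ex(host)[2]
        except (socket.herror, socket.gaierror):
            return False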
A little soft NSFW content would also greatly accelerate their being added to a block list, especially if you submitted their site to the blocklists as soon as you started including it. You can include literally anything that won't get you arrested: terrorist manifestos, The Anarchist Cookbook, insane hentai porn... Use all those block categories - gore/extreme, terrorist, adult, etc.
This is a bit complicated, but what if you had some sort of captcha that required users to classify images as NSFW/SFW/illegal?
If users normally encounter photos by requesting them (e.g., by entering a search query or browsing a friend's album) rather than having random photos thrown in their faces (as on HotOrNot.com), I would think you could run into some very upset users (and possibly legal problems) by serving random photos that might contain disturbing images. I mean, if you went to a website intending to browse photos your friend took of his boat and the site threw up some random child porn on your screen, you'd be pretty annoyed, right?
I should clarify that on the home page. Thanks for the feedback! We are planning to review images and flag them for objectionable content, e.g. explicit, violent, etc.
Do you want cops filtering through all online sites looking for child porn? Do you think that's a good use for their time?
It's the threat of law enforcement that leads people who run websites to remove illegal content.
Generally speaking (to, say, please advertisers), there is an expectation that sites will be proactive about removing offensive (or illegal) material; simply responding on a whack-a-mole basis is not good enough. I ran a site where something like 1 in 10,000 images was offensive (not illegal, but images of dead Nazis, people with terrible tumors on their genitals, etc.), and that was not clean enough for AdSense. From the viewpoint of quality control, particularly Deming's viewpoint of statistical quality control, finding offensive images at that rate is an absolute bear of a problem -- and note how many people write papers where an A.I. program that gets 70% accuracy is presented as state of the art.
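To put numbers on why that level is such a bear, a back-of-the-envelope calculation (reading "70% accuracy" as 70% sensitivity and 70% specificity, which is my assumption):

    prevalence = 1 / 10_000           # the 1-in-10,000 rate above
    sensitivity = specificity = 0.70

    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    precision = true_pos / (true_pos + false_pos)
    print(precision)   # ~0.0002: roughly 4,300 false alarms for every
                       # genuinely offensive image the filter catches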
Maybe I'm missing something here, but I suspect this would be a technical and political minefield. Sure, you can recognize certain types of abuse imagery, but then what about similar content in fiction? It's entirely plausible that something that seems problematic in real life would be fine in, say, a Call of Duty game, a blockbuster movie, or some other work of fiction.
Heck, there have been cases where scenes in movies and games have been 'reappropriated' as real-life military or terrorist events by clueless nations and groups. For example:
So your system would have to figure out not just whether something is seen as 'offensive' or 'against the rules', but whether it's from a fictional work that might be allowed on the site.
We would be interested. We run http://picturepush.com, where adult material is forbidden, but a lot of it gets added anyway (of course). A service for fixing that would be great.
I would have expected someone to have built something automated for that kind of thing by now, though...
OK. I don't think this is the solution you're after if your problem includes "crowd sourced video".
Nudity detection, though - I'd probably at least try something like "check every 50+rand(100) frames, and only examine more carefully if you get hits on that sampling". Sure, that's gameable - but subliminal nudity isn't something I'd expect trolls or griefers to expend much effort on to slip past your filters...
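A sketch of that sampling scheme with OpenCV, where looks_like_nudity stands in for whatever per-frame classifier you'd actually plug in:

    import random
    import cv2

    def sampled_hits(video_path, looks_like_nudity):
        """Classify frames at random strides of 50 + rand(100)."""
        cap = cv2.VideoCapture(video_path)
        hits, frame_no = [], 0
        next_check = 50 + random.randrange(100)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frame_no += 1
            if frame_no >= next_check:
                if looks_like_nudity(frame):
                    hits.append(frame_no)
                next_check = frame_no + 50 + random.randrange(100)
        cap.release()
        return hits      # frame numbers worth a more careful look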