I just cleaned up my archive using the cli tool »findimagedupes«

You are currently reading a thread in /g/ - Technology

Thread replies: 12
Thread images: 1
File: 1423895627242.jpg (41 KB, 717x430)
I just cleaned up my archive using the cli tool »findimagedupes« and I'm wondering how /g/ is dealing with file duplicates.

Welcome:
- md5 checking solutions?
- visual checking solutions?
- browser extensions?
- scripts?

It's time to boost the quality of our archives, /g/ents.
>>
I usually just delete duplicates as and when I come across them.
>>
>>52064635
I have a little Python3 script that reads through a directory tree, sha512's all files and reports any duplicates.

Everyone should make one.
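
For anyone who wants a starting point: the script itself never got posted in this thread, so what follows is only a minimal sketch of the idea described above (walk a tree, SHA-512 every file, report hashes that occur more than once). The names `sha512_of`, `find_duplicates` and `report` are made up for illustration, and it assumes every file under the root is readable.

```python
import hashlib
import os
from collections import defaultdict


def sha512_of(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Map digest -> sorted list of paths, keeping only digests seen more than once."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link to a file is not reported as its duplicate.
            if os.path.isfile(path) and not os.path.islink(path):
                by_hash[sha512_of(path)].append(path)
    return {d: sorted(paths) for d, paths in by_hash.items() if len(paths) > 1}


def report(root):
    """Print each group of byte-identical files under root."""
    for digest, paths in find_duplicates(root).items():
        print(digest[:16])
        for path in paths:
            print("  " + path)
```

Calling `report("/path/to/archive")` prints one group per set of identical files. It only reports and deletes nothing, so you decide which copy to keep.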
>>
>>52064829
Care to share here m8o?
>>
>>52064829
>>52065058
if you're a linux user just make a simple find->hash->sort hashes->print duplicates shell "script"
>>
>>52065363
Yeah, probably. But fuck everything about "find".

>>52065058
Uh, it's not terribly neat. Are you really sure you want it?
>>
>using common cryptographic hashes to find duplicates
What about the same image, but one is 600x900 and the other one is 599x900? MD5 will give you different hashes. You need dhash or phash.

https://realpython.com/blog/python/fingerprinting-images-for-near-duplicate-detection/
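
A rough sketch of the dhash (difference hash) idea, for illustration only and not the code from the linked article: shrink the image to a 9x8 grayscale grid, then record for each pair of horizontally adjacent pixels whether the left one is brighter. The function names are made up here. `dhash_file` assumes Pillow is installed; the other two functions are plain Python.

```python
def dhash_bits(pixels):
    """Build a difference hash from rows of grayscale values.

    Each row of N+1 values contributes N bits: 1 if a pixel is brighter
    than its right-hand neighbour, else 0. An 8-row, 9-column grid gives 64 bits.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def dhash_file(path, size=8):
    """Hash an image file. Assumption: Pillow (PIL) is available."""
    from PIL import Image

    img = Image.open(path).convert("L").resize((size + 1, size))
    data = list(img.tobytes())  # mode "L" is one byte per pixel, row by row
    width = size + 1
    rows = [data[i * width:(i + 1) * width] for i in range(size)]
    return dhash_bits(rows)
```

Two files are then likely the same picture when `hamming(dhash_file(a), dhash_file(b))` is small. Where to put the cutoff is a judgement call: 0 means identical hashes, and a few bits of difference usually points at a resized or recompressed copy. The 600x900 vs 599x900 pair should end up at or near 0, because both shrink to almost the same 9x8 grid.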
>>
>>52065638
findimagedupes works exactly like this
>>
>>52065673
Which one? The one on the Debian packages or the Go implementation on gitcuck?
>>
>>52064635
XNView has a decent duplicate search feature.
>>
>>52066314
the debian one
I use two:

fdupes to sort out "real" (byte-identical) duplicates
findimagedupes to sort out the other stuff, like the same pic at a different resolution, etc.
>>
http://www.nirsoft.net/utils/search_my_files.html
not only for images but for any files

This is a 4chan archive - all of the content originated from them. If you need IP information for a Poster - you need to contact them. This website shows only archived content.