I just cleaned up my archive using the CLI tool »findimagedupes« and I'm wondering how /g/ deals with file duplicates.
Welcome:
- md5 checking solutions?
- visual checking solutions?
- browser extensions?
- scripts?
It's time to boost the quality of our archives, /g/ents.
I usually just delete duplicates as and when I come across them.
>>52064635
I have a little Python3 script that reads through a directory tree, sha512's all files and reports any duplicates.
Everyone should make one.
>>52064829
Care to share here m8o?
>>52064829
>>52065058
if you're a linux user, just make a simple find -> hash -> sort hashes -> print duplicates shell "script"
>>52065363
Yeah, probably. But fuck everything about "find".
>>52065058
Uh, it's not terribly neat. Are you really sure you want it?
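The core of it is basically just this anyway (rough from-memory sketch, not the actual file; assumes Python 3 with nothing outside the stdlib):
[code]
#!/usr/bin/env python3
# Walk a directory tree, sha512 every file, report paths that share a digest.
import hashlib
import sys
from collections import defaultdict
from pathlib import Path

def sha512sum(path, bufsize=1 << 20):
    """Hash a file in chunks so big files don't eat all the RAM."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()

def main(root="."):
    seen = defaultdict(list)              # digest -> [paths]
    for p in Path(root).rglob("*"):
        if p.is_file():
            seen[sha512sum(p)].append(p)
    for digest, paths in seen.items():
        if len(paths) > 1:                # only report actual duplicates
            print(digest[:16])
            for p in paths:
                print("   ", p)

if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else ".")
[/code]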
>using common cryptographic hashes to find duplicates
What about the same image, but one is 600x900 and the other one is 599x900? MD5 will give you different hashes. You need dhash or phash.
https://realpython.com/blog/python/fingerprinting-images-for-near-duplicate-detection/
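The rough idea, if you'd rather roll your own than read the article (just a sketch, needs Pillow installed): shrink to a tiny grayscale image, compare neighbouring pixels, then compare hashes by Hamming distance instead of equality.
[code]
# Difference hash (dhash): resize to 9x8 grayscale, compare each pixel to its
# right-hand neighbour, pack the 64 comparisons into one int. Near-duplicates
# (resized, re-encoded, slightly cropped) differ in only a few bits.
from PIL import Image  # pip install Pillow

def dhash(path, hash_size=8):
    img = Image.open(path).convert("L").resize(
        (hash_size + 1, hash_size), Image.LANCZOS)
    px = list(img.getdata())
    bits = 0
    for row in range(hash_size):
        for col in range(hash_size):
            left = px[row * (hash_size + 1) + col]
            right = px[row * (hash_size + 1) + col + 1]
            bits = (bits << 1) | (left > right)
    return bits

def hamming(a, b):
    return bin(a ^ b).count("1")

# e.g. the 600x900 and the 599x900 versions of the same pic should land
# within a few bits of each other, so treat hamming() <= ~5 as "same image".
[/code]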
>>52065638
findimagedupes works exactly like this
>>52065673
Which one? The one in the Debian repos or the Go implementation on gitcuck?
>>52064635
XnView has a decent duplicate search feature.
>>52066314
the debian one
I use two:
- fdupes to sort out "real" duplicates (byte-identical files)
- findimagedupes to sort out the other stuff, like the same pic at different resolutions, etc.
http://www.nirsoft.net/utils/search_my_files.html
not just for images, works on any kind of file