How the heck does internets analyze 14 billion images in a split second?

You are currently reading a thread in /g/ - Technology

Thread replies: 16
Thread images: 2
File: Screenshot_2016-03-19-09-40-07.png (176 KB, 720x1280)
How the heck does internets analyze 14 billion images in a split second? I can't fathom how this is possible.
>>
the power of an ssd
>>
didn't you know op? the internet is controlled by one central processor.
>>
>What are hashsums?
>>
>>53570669
Idk lol
>>
>>53570700
Then GTFO and get some education, nigger.
>>
>>53570713
I don't take advice from bigots desu senpai
>>
>>53570725
I didn't type that wtf
>>53570669
That's irrelevant
How does internets analyze 14 billion hash sums in a split second
>>
>>53570631
MD5's

>>53570762
>first you upload the image
>then they get the hash of it
>get the first X characters
>go to that folder and search that hash
It only searches in that hash folder; the other ones are untouched.
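
The steps above can be sketched roughly like this. It is a minimal in-memory sketch, not how any real search engine is known to work: the "folders" are plain dict buckets, and `PREFIX_LEN`, `BucketIndex`, `add` and `lookup` are made-up names for illustration. Note that an exact hash like MD5 only finds byte-identical files.

```python
import hashlib
from collections import defaultdict
from typing import Optional

PREFIX_LEN = 2  # hypothetical "first X characters"; the post does not say what X is


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the raw image bytes as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


class BucketIndex:
    """In-memory stand-in for the folders: one bucket per hash prefix."""

    def __init__(self) -> None:
        self.buckets = defaultdict(dict)  # prefix -> {full digest -> file name}

    def add(self, data: bytes, name: str) -> None:
        digest = md5_hex(data)
        self.buckets[digest[:PREFIX_LEN]][digest] = name

    def lookup(self, data: bytes) -> Optional[str]:
        digest = md5_hex(data)
        # Only the bucket for this prefix is examined; all others are skipped.
        bucket = self.buckets.get(digest[:PREFIX_LEN], {})
        return bucket.get(digest)


index = BucketIndex()
index.add(b"cat picture bytes", "cat.png")
index.add(b"dog picture bytes", "dog.png")
print(index.lookup(b"cat picture bytes"))  # cat.png
print(index.lookup(b"unknown bytes"))      # None
```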
>>
>>53570762
They changed words to trigger shit desu senpai
>>
>>53570631
>heck
Now mister what have we discussed about bad words?
>>
>>53570631
Quamtum cormputingmreating.
>>
>the hash starts with 1
>you now eliminated 90% of all hashes
>the second number is a 1
>you now eliminated 99% of all hashes
>the third number is a 1
>you now eliminated 99,9% of all hashes
>the fourth number is a 1
>you now eliminated 99,99% of all hashes
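
The arithmetic above, as a small sketch. It assumes hashes are spread evenly over all prefixes. The 90% per character figure holds for decimal digits; a hex digest such as MD5 has 16 possible characters per position, so each one narrows things down even faster. `fraction_remaining` is a made-up helper name.

```python
def fraction_remaining(prefix_len: int, alphabet_size: int) -> float:
    """Fraction of evenly distributed hashes that still match a fixed prefix."""
    return 1 / alphabet_size ** prefix_len


TOTAL_IMAGES = 14_000_000_000  # the figure from the thread title

for n in range(1, 5):
    decimal_gone = 1 - fraction_remaining(n, 10)
    hex_gone = 1 - fraction_remaining(n, 16)
    left = TOTAL_IMAGES * fraction_remaining(n, 16)
    print(f"{n} chars: decimal {decimal_gone:.2%} eliminated, "
          f"hex {hex_gone:.4%} eliminated, about {left:,.0f} hex candidates left")
```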
>>
>>53570631
As nobody here seems to know what the fuck they are talking about let me explain.

Modern image search systems make use of machine learning. They don't actually search 14b images. They calculate an ID for the image based on a handful of attributes such as layout, size, colours present, colour quantities etc.

From there they query their data store for images which match to a certain % in several categories.
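
A toy sketch of that "match to a certain %" idea, using only one attribute (colour quantities). Everything here is an assumption for illustration: images are plain lists of (r, g, b) tuples, the signature is a coarse colour histogram, and the score is histogram intersection. Real systems use far richer features.

```python
from collections import Counter


def colour_signature(pixels, levels=4):
    """Normalised coarse colour histogram: fraction of pixels per quantised RGB bin."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels) or 1  # avoid dividing by zero on an empty image
    return {bin_: n / total for bin_, n in counts.items()}


def similarity(sig_a, sig_b):
    """Histogram intersection: 1.0 for identical colour mixes, 0.0 for disjoint ones."""
    return sum(min(share, sig_b.get(bin_, 0.0)) for bin_, share in sig_a.items())


def query(store, sig, min_score=0.8):
    """Return (name, score) pairs scoring at least min_score, best first."""
    hits = [(name, similarity(sig, other)) for name, other in store.items()]
    return sorted((h for h in hits if h[1] >= min_score),
                  key=lambda h: h[1], reverse=True)


mostly_red = [(250, 10, 10)] * 90 + [(10, 10, 250)] * 10
all_green = [(10, 250, 10)] * 100
store = {
    "red.png": colour_signature(mostly_red),
    "green.png": colour_signature(all_green),
}

# A slightly different red/blue mix should still match red.png but not green.png.
uploaded = [(240, 20, 20)] * 85 + [(10, 10, 250)] * 15
results = query(store, colour_signature(uploaded))
print(results)
```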

So far this isn't really machine learning but indexing and lookup. Where ML comes into play is using a system to work out what the image IS. For example, if you upload an image of a cat, the system "sees" that it is a picture of a cat, so it can remove all of the non-cat images from the results.

All of this is done on a massive scale in parallel so it happens very fast. Usually around 1 second.
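
The "in parallel" part can be sketched as fan-out and merge: split the data into shards, search each shard separately, then keep the best answer. This is only the shape of the idea under heavy assumptions. The shards here are small in-memory lists, the feature vectors and `score` function are invented, and Python threads stand in for what would really be many separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def score(a, b):
    """Similarity in (0, 1]: 1.0 when the two feature vectors are equal."""
    dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 / (1.0 + dist)


def search_shard(shard, query_vec):
    """Scan one non-empty shard and return its best (score, name) pair."""
    return max((score(vec, query_vec), name) for name, vec in shard)


def parallel_search(shards, query_vec):
    """Search every shard concurrently, then merge by taking the overall best."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_results = pool.map(lambda s: search_shard(s, query_vec), shards)
        return max(partial_results)


shards = [
    [("a", (0.0, 0.0)), ("b", (1.0, 1.0))],
    [("c", (5.0, 5.0)), ("d", (0.9, 1.1))],
]
best = parallel_search(shards, (1.0, 1.0))
print(best)  # (1.0, 'b')
```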

Also note that the results are only the first 20 or so highest-scoring matches. The search actually continues for a few seconds while you look at the results, so that it can present page 2, then page 3, etc. should you not find what you want.

Analytics is also used on how you make use of the results: if you go to page 2, it knows it didn't do a great job of finding the right result the first time. When you click an image and then leave the page, it makes an educated guess as to whether the result was correct and learns from that. If you go and do another search for the same thing right away, it will know it didn't do a great job, etc.
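
Returning only the top 20 or so results per page can be sketched with a heap-based top-k selection. This does not model the search continuing in the background or the feedback loop, only the paging. `page` and `PAGE_SIZE` are made-up names, and the scores are dummy integers.

```python
import heapq

PAGE_SIZE = 20  # "the first 20 or so" from the post


def page(scored, page_number, page_size=PAGE_SIZE):
    """Return one page of (score, name) results, best first. Pages start at 1."""
    # nlargest avoids fully sorting every candidate when only a few pages are needed.
    top = heapq.nlargest(page_number * page_size, scored)
    return top[(page_number - 1) * page_size:]


# 100 dummy candidates with scores 0..99.
scored = [(i, f"img{i}") for i in range(100)]

first = page(scored, 1)
second = page(scored, 2)
print(first[0], len(first))  # (99, 'img99') 20
print(second[0])             # (79, 'img79')
```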

Hope that helps senpai
>>
>>53571829
interesting
>>
File: 1454745030613.jpg (40 KB, 550x512)
>>53571793
>1488 starts with 1
>you've now eliminated 90% of Jews


This is a 4chan archive - all of the content originated from them. If you need IP information for a Poster - you need to contact them. This website shows only archived content.