You are currently reading a thread in /qa/ - Question & Answer

Thread replies: 22
Thread images: 4
File: 1462411017807.jpg (127 KB, 720x576)
I've always wondered, how do archive sites work?
Say I posted something but deleted it moments after will it be archived?
>>
It depends on the individual archive I imagine.
https://desuarchive.org/qa/thread/584952/#584952
>>
>>584956
Damn, that's insane. How does it keep track of all these threads?
>>
File: 1461253293326.png (435 KB, 720x720)
>>584972
When did you start using the internet? Or maybe I'm just being an elitist. The internet can work in real time, you know.
>>
>>584974
I don't know how archiving a live, moving page works. I figured that bots would periodically crawl and update. But with the massive number of active threads on this site, I had no idea it was possible to do it almost immediately without the kind of server power a big tech company would have. These archive sites don't seem big.
>>
>>584987
>I figured that bots would periodically crawl and update
yes, that's what happens. once you make one bot to crawl one board, you just change a couple lines of code to make another bot to crawl another board. it is possible for a post to be made and deleted between passes.
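
To make the "between passes" point concrete, here is a minimal sketch of a polling bot. Everything in it is an assumption made for illustration: the `BoardArchiver` class, the injected `fetch` function, and the simulated `timeline` are hypothetical, and no real archive is claimed to work exactly this way. The board name is the only thing that changes from one bot to the next.

```python
from typing import Callable, Dict

# A snapshot maps post number -> post text for one board at one moment.
Snapshot = Dict[int, str]


class BoardArchiver:
    """Keeps every post it has ever seen on one board (hypothetical design)."""

    def __init__(self, board: str, fetch: Callable[[str], Snapshot]):
        self.board = board  # the part you change to crawl another board
        self.fetch = fetch  # injected so the sketch runs without a network
        self.saved: Snapshot = {}

    def crawl_once(self) -> None:
        # Merge the live snapshot into the archive. Posts that have since
        # been deleted from the board stay in self.saved.
        self.saved.update(self.fetch(self.board))


# Simulated state of one board at three moments in time.
timeline = [
    {1: "op"},                    # pass 1 happens here
    {1: "op", 2: "oops"},         # post 2 is made, no pass happens
    {1: "op", 3: "later reply"},  # post 2 is deleted, pass 2 happens here
]

live = {"qa": timeline[0]}
bot = BoardArchiver("qa", lambda board: dict(live[board]))

bot.crawl_once()          # sees post 1
live["qa"] = timeline[1]  # post 2 exists, but the bot is asleep
live["qa"] = timeline[2]  # post 2 is gone again
bot.crawl_once()          # sees posts 1 and 3

print(sorted(bot.saved))  # [1, 3]
```

Post 2 never makes it into `saved` because it lived and died entirely between two passes. A shorter polling interval shrinks that window but does not close it.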

>These archive sites seem like they aren't big.
that's why they go down more often than your mother
>>
>>584952
You have to wait 60 seconds before deleting things so...
>>
>>585010
Yeah, I know that. I managed to delete it and it wasn't in the archive, but another anon reposted it and it showed up there. I accidentally uploaded a screenshot with my gf's pic in the messenger floating chat head thing. I'm an idiot.
>>
>>585083
as long as you don't bring attention to it, it'll be obscure enough

People think everything lasts forever on the internet now, but a lot of stuff dies. Whatever archive it appeared on will probably die in a year and that data will be lost forever. Right now in the archive it's already pretty obscure and no one has any particular reason to care about your gf's pic
>>
>>585098
I know all that, it just feels weird knowing that at least one anon saved the pic to upload and it was in a huge thread. I know that nothing is likely to happen, but I can't help but feel bad about it. Thanks for the help though.
>>
>>585098
kek, that's what you think.
I've already found the pic, doxed that bitch, and am currently sending 50 pizzas to her doorstep along with a singing telegram letting her know her bf uses 4chan and what a massive faghole he is
>>
File: 1463656994019.jpg (101 KB, 500x501)
>>585119
That's pretty nice of you m8, thanks.
>mfw I actually met her on /b/
And that was two years ago.
>>
>>585125
Holy shit you are cancer. Please, please, PLEASE kill yourself.
>>
>>585083
Most, if not all, of the archives will respond to takedown requests (not just copyright stuff). Just give the relevant information and it should be dealt with.
http://archived.moe/_/articles/faq/
But in reality no one is going to care about some random chick's picture.
>>
>>585150
Whoops, wrong link.
http://www.archiveteam.org/index.php?title=4chan
>>
>>585127
>implying we're not all cancer
Welcome to 4chan, enjoy your stay you sad, lonely faggot
>>>/r9k/ may be more to your liking
>>
>>585159
Fuck off normalfag
>>
>>585295
Looks like we got a real bad ass roody poo here
>>
File: ravenkeys.jpg (620 KB, 1920x1200)
Archive sites do one of two things, sometimes both. They can either:

A) Visit the boards periodically (like once every minute or whatever) and use an algorithm to discern which threads should be archived, or

B) Visit threads on a referral basis, which means a user copy-pastes the URL into the archive site.

Once it has determined it's going to archive a thread, through either of these methods, it scrapes the HTML content from 4chan, parses that content, and downloads the images, associating them with the scraped posts' content.
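
The parse-and-associate step described above could look roughly like the sketch below. It swaps in JSON for the HTML scraping mentioned in the post, purely to keep the example short. The field names (`no`, `com`, `tim`, `ext`) and the `i.4cdn.org` image host are assumptions based on 4chan's public read-only JSON API as best I recall it, so check them before relying on this. `ArchivedPost` and `parse_thread` are hypothetical names, and nothing is actually downloaded.

```python
from typing import Dict, List, NamedTuple


class ArchivedPost(NamedTuple):
    number: int
    comment: str
    image_url: str  # empty string when the post has no attachment


def parse_thread(board: str, thread: Dict) -> List[ArchivedPost]:
    """Turn one thread's JSON into posts, each tied to its own image URL."""
    posts = []
    for raw in thread.get("posts", []):
        image_url = ""
        if "tim" in raw and "ext" in raw:
            # Assumed URL layout: image host / board / timestamp + extension.
            image_url = f"https://i.4cdn.org/{board}/{raw['tim']}{raw['ext']}"
        posts.append(ArchivedPost(raw["no"], raw.get("com", ""), image_url))
    return posts


# Hand-written sample shaped like the first two posts of this thread.
sample = {
    "posts": [
        {
            "no": 584952,
            "com": "I've always wondered, how do archive sites work?",
            "tim": 1462411017807,
            "ext": ".jpg",
        },
        {"no": 584956, "com": "It depends on the individual archive I imagine."},
    ]
}

for post in parse_thread("qa", sample):
    print(post.number, post.image_url or "(no image)")
```

An archive would then fetch each non-empty `image_url` and store the file under the post number, which is the association the post is talking about.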
>>
Actually curious on how archive sites delete cp

Even if porn is posted on a blue board, most archive sites will keep the image, but as soon as cp is posted and deleted on this site, the archive site deletes it too

Do mods tip off the people in charge of those sites?
>>
>>585713
I've never looked carefully, but there might be some hidden HTML under the hood that clues them in that it was deleted for a rules violation, perhaps even describing which rules violation, such as it being cheese pizza. Which would prolly be a good thing. I don't exactly object to efforts to suppress that shit.
>>
test fooble

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.
If a post contains personal/copyrighted/illegal content you can contact me at [email protected] with that post and thread number and it will be removed as soon as possible.
DMCA Content Takedown via dmca.com
All images are hosted on imgur.com, send takedown notices to them.
This is a 4chan archive - all of the content originated from them. If you need IP information for a Poster - you need to contact them. This website shows only archived content.