What was your first big-boy project, /g/?

You are currently reading a thread in /g/ - Technology

Thread replies: 55
Thread images: 6
File: TERMESTHETIC.png (20 KB, 1418x874)
What was your first big-boy project, /g/? Which /g/-related project made you say "yo, this is pretty legit and I can be mildly proud of its utility"?

if( user.getProjects() == SHIT && user.postNumber == TRIPS)
{
OP.becomePajeet();
OP.moveToIndia();
}

Also the pic is a terminal fuck up I had a while back that resulted in a hella A E S T H E T I C
>>
i made a tool to help me grep the latest porn in a category on iafd

:^)
>>
I made a robot that jerks me off using an arduino board.
>>
>>53496424
Private cloud (I hate that term, btw) deployment tool.

8 HP DL560 G8's, EMC VNX5400.

Cable everything according to diagram, boot first node from flash, come back after a long lunch, everything is configured.

Automation is fun...
>>
Made a web framework (backend) that was faster than any other within the same language and runtime. I have like <20 real users. I made it because I thought the API I designed was neat. After 6 months of working on it, there are several things I'd do differently in hindsight.

I just wish it had helped me get a job.
>>
>>53496464
best in the thread? Vote now!
>>
>>53496553
Stuff like that usually would.
>>
first compiler outside of college classes
>>
>>53496464
aren't the rigid edges of the board uncomfortable?
>>
>>53496424
a crawler that went through 4chan posts and tried to identify people through their choice of words, grammar and the exif data of pictures.

i worked on it with my old cs prof and it was shockingly, even disgustingly easy to get accurate results after letting it work for about a week. that thing was so accurate it could link specific posts to facebook and twitter accounts.

the idea was actually my prof's, since he's a big conspiracy nut, and i was interested in the subject and learned a lot through this.
i suggested 4chan since it was perfect for this.
on boards with ids we could test the accuracy of identifying a user that made several posts.

one of the gems from this is that we could assign 85% of a day's posts to the right posters on /pol/.
the other 15 percent were non-english, shitposts or just backlinks with or without shitposts.
>>
>>53497760
that's some NSA tier shit. Pretty cool. Good thing I don't use my facebook or twitter accounts for shit.
>>
File: raingraph.png (510 B, 211x135)
>>53497760
can you link the code? this seems pretty interesting.
>>
>>53497809
funny you'd mention that.
we actually dropped the project because we were approached by the NSA.
in fucking germany.
>>
>>53497860
i can't, since neither of us has the rights to that stuff anymore.
however, that project was based on a research paper. i can try to dig it up for you.
>>
>>53497867
whoa, how?
also, what language was it made in?
>>53497896
I would be eternally grateful
>>
>>53497903
they wanted to recruit me and my prof to work for them, and also put pressure on the uni to hand over all of the research materials and code available.
it was made in c++ and python. we crawled 4chan with a python application, fed the information into a sql database, and let the c++ program apply a few algorithms to the data to find connections in frequently repeated grammar mistakes and words.
after some time spent figuring out which words mean nothing and which words to filter, we made progress, uncanny progress, and we had a bit of fun with it too, linking posts back to some kids and also grown men with families.
we didn't contact them though.

unfortunately i lost access to the student library after i graduated and probably won't find the research papers on this. what i do remember is that the research paper was originally from 2005 and released by either Stanford or MIT.
you might be able to find the document on Sci-Hub.
the program we made was in 2011-2012.
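
For anyone trying to picture the crawl-then-analyze split described above, here is a minimal sketch of the hand-off into a database. It is not the original code, which was never posted. It assumes SQLite as the SQL store (the post only says "a sql database"), and the table layout, the function names (`open_store`, `save_posts`, `posts_containing`) and the sample rows are all made up for illustration.

```python
import sqlite3

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the post store. The schema is a hypothetical minimum."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " post_id INTEGER PRIMARY KEY,"
        " board TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn

def save_posts(conn: sqlite3.Connection, posts) -> None:
    """Store (post_id, board, body) tuples; ids that already exist are skipped."""
    conn.executemany(
        "INSERT OR IGNORE INTO posts (post_id, board, body) VALUES (?, ?, ?)",
        posts,
    )
    conn.commit()

def posts_containing(conn: sqlite3.Connection, word: str) -> list:
    """Return the ids of posts whose body contains the given word."""
    rows = conn.execute(
        "SELECT post_id FROM posts WHERE body LIKE ? ORDER BY post_id",
        (f"%{word}%",),
    )
    return [row[0] for row in rows]

# Made-up sample data standing in for crawler output.
conn = open_store()
save_posts(conn, [
    (1, "g", "first compiler outside of college"),
    (2, "g", "wrote a crawler in python"),
    (2, "g", "same id crawled twice, ignored"),
])
matches = posts_containing(conn, "crawler")
```

Keeping the crawler and the analysis separated by a plain table like this means the slow part (crawling) can run for a week while the analysis is rerun as often as needed.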
>>
>>53498030
Thanks for the info. You might want to post this on lain's /cyb/.
that is unbelievable.
>>
>>53498073
maybe i will someday. i actually frequent both lainchan and 8ch/cyber.
>>
>>53498030
In your own opinion, were these links strong, or weak?
>>
>>53498101
it varied.
depending on the data available, a link could be anywhere from very weak to strong.
for example we linked some kid to posts about metal gear because he'd make the same grammatical mistakes over and over and always wrote kewl instead of cool and big baws instead of boss.

that was really enough to identify the kid. we verified it by checking his profile manually and saw 4chan and /v/ groups on his facebook, along with this distinct trait of misspelling these words.

we also did this on commonly available hardware of that time. this ran mostly on a quad core Q6600 with 8 gigs of ram.
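
To make the "same quirks over and over" idea concrete, here is a toy sketch in Python. It is not the method used in the project (that code is not available); it swaps in plain word-count profiles compared by cosine similarity, which is about the simplest thing that could work. The stop-word list `STOP_WORDS`, the function names and the three sample posts are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list: common words that say little about who wrote a post.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "to", "of", "that", "was", "so", "this", "i"}

def profile(text: str) -> Counter:
    """Count the words in a post, ignoring case and stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def cosine_similarity(p: Counter, q: Counter) -> float:
    """Cosine similarity between two word-count profiles, from 0.0 to 1.0."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm_p = math.sqrt(sum(c * c for c in p.values()))
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)

# Made-up posts: a and b share the same habitual misspellings, c does not.
post_a = "that game was kewl, the final baws fight was kewl too"
post_b = "kewl trailer, hope the baws is harder this time"
post_c = "the compiler rejects this template, any ideas?"

sim_ab = cosine_similarity(profile(post_a), profile(post_b))
sim_ac = cosine_similarity(profile(post_a), profile(post_c))
```

Rare spellings like the ones above dominate the score precisely because almost nobody else uses them, which is why a couple of quirks can be enough to tie posts together.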
>>
>>53498164
I am super interested in the C++ algorithms you used for this. I know you no longer have the code, but could you give a quick rundown of how it identified similarities?
>>
>>53498164
G/r/e/a/t NOW I g.otta 3ncr7pt meye lL /A lN (G (U /A (G lE
>>
>>53498197
great, now I need to develop anti-obfuscation code

actually that might be useful for ebonics. I'll use text-to-voice and then pipe it back through voice-recognition to create usable text.
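
A much cruder take on anti-obfuscation than the text-to-voice round trip, as a sketch only: lowercase each token, undo a few digit-for-letter swaps, and strip the punctuation wedged between letters. The substitution table `LEET_MAP` is a small hypothetical one, not a complete list, and the function name is made up.

```python
import re

# Hypothetical digit-to-letter table; real obfuscation uses many more variants.
LEET_MAP = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s"})

def normalize(text: str) -> str:
    """Undo simple obfuscation token by token and return plain lowercase words."""
    words = []
    for token in text.lower().split():
        token = token.translate(LEET_MAP)      # 3 -> e, 0 -> o, ...
        token = re.sub(r"[^a-z]", "", token)   # drop separators such as / and .
        if token:
            words.append(token)
    return " ".join(words)

cleaned = normalize("G/r/e/a/t NOW I g.otta")
```

This only handles junk inside a token. Letters separated by spaces, or digits used inconsistently, would need something smarter, such as matching against a dictionary.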
>>
File: 1453132496211.jpg (122 KB, 594x594)
>>53498218
when the teacher thinks you're making shitposts but you're actually masking your identity online
>>
>>53498191
my prof did most of the work, but we used an ai approach. we fed a neural network with the data plus dictionaries for words and grammar and let it iterate over posts and word parts. we'd spawn a new generation every time it successfully made a connection with another post and went from there. we did this for every post, and if the generations came to a standstill we analyzed the text manually. sometimes very small differences, like how a single word was capitalized, could make or break an association.


https://www.youtube.com/watch?v=qv6UVOQ0F44
this can tell you about neural networks.
there are a few resources online about how this works, but it'd be too complicated to explain right now. i don't know the exact algorithms anymore, since my prof did all the groundwork and i just helped make it faster and more efficient, and did the web component.
>>
File: 1427297744043.jpg (565 KB, 1600x1200)
>>53498260

That's pretty epic. I've been trying to build a web scraper for about six months now. I want to crawl The Guardian's Comment is Free sections and illustrate political bias in comment moderation.

I decided to do this with Node.js and the Cheerio and Request modules, but it's become a nightmare. It seems that I'll need to use PhantomJS because the comments won't even load without some user input / scrolling. This shit is killing me, I've never coded before. I'm glad I'm semi au fait with javascript now but maan, so little progress...
>>
>>53498260
Have you made anything else since then? Have you ever attempted to make a shitpost generator?
>>
>>53498360
keep going. node is surprisingly fast.
i'm mostly doing webdev myself now too, and i work a lot with node, but mostly i do react and angular stuff.
whenever i want to scrape a site i usually use x-ray.
maybe this will help you a bunch.
https://github.com/lapwinglabs/x-ray

>>53498409
nothing big. i'm mostly a webdev now. i'm working on shitty mobile games here and there, but i'm mostly just trying to get by and not be noticed by anyone.
>>
>>53498418

Thanks Friendo, I'll check X-Ray out. I thought it was kinda deprecated but will revisit with a vengeance.

Yeah, Node is sweet. I'm running it on an old 70 dollar Core 2 Duo desktop in the corner. Fan hasn't come on once. It's giving me major head spazz though, callbacks and all. Node is definitely a thing of beauty. I've just decided that I should probs step back and cut my teeth on some more traditional projects / tutorials first though. Modern web pages do not seem to appreciate web scraping AT ALL...
>>
>>53498418
What is the money like in webdev?
>>
>>53498511
between 40k and 200k depending on what you do.
increasingly webdev pays better than just being a codemonkey.
everyone should know some webdev. at least javascript and maybe python or ruby.
>>
>>53498503
>Modern web pages do not seem to appreciate web scraping AT ALL...
most websites nowadays are just an empty dom and are built with javascript upon visiting.
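
A small sketch of why a plain HTTP fetch comes back with nothing to scrape on such pages. It uses Python's standard `html.parser` instead of Cheerio, and the two HTML strings are invented stand-ins: `STATIC_SHELL` for what the server sends, `RENDERED` for what the DOM looks like after the page's script has run. The class name `comment` is also just an assumption for the example.

```python
from html.parser import HTMLParser

class ClassCounter(HTMLParser):
    """Counts start tags that carry a given CSS class."""

    def __init__(self, class_name: str):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1

def count_elements(html: str, class_name: str) -> int:
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count

# What a plain HTTP client receives: an empty container plus a script tag.
STATIC_SHELL = (
    '<html><body><div id="comments"></div>'
    '<script src="/app.js"></script></body></html>'
)
# What a browser ends up with after the script has filled the container.
RENDERED = (
    '<html><body><div id="comments">'
    '<div class="comment">first</div><div class="comment">second</div>'
    '</div></body></html>'
)

found_static = count_elements(STATIC_SHELL, "comment")
found_rendered = count_elements(RENDERED, "comment")
```

The parser is fine in both cases. The comments simply are not in the first document, so the options are to run the script in a headless browser or to find the endpoint the script itself loads the comments from.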
>>
>>53496424
Yoooooooooooo, I might make this my background.
>>
File: 1451671864301.jpg (1 MB, 1841x1227)
Is this big-boy stuff?...

I've got an old desktop box running airodump-ng 24x7. One of the neighbours has been letting their dog shit up and down the street. Motherfucker has dropped chods outside my gaffe a few times now, so I got airodump-ng logging away like a motherfucker. Basically any mobile phone with wifi active (every mobile phone in the universe) that gets walked by my house gets picked up, and the time logged. With a bit of luck, it will also pick up any wifi access points that the phone has attached to before. The idea was I'd capture that latter information and then use it to work out who the phantom chodder was by walking the neighbourhood with a wifi scanner (and then fill their front garden with... I dunno)

What's innarestin is there has been no dog shit for about six months now. However, a few months back some fucker tried to break into my car. He failed but made off with a neighbour's car. I called the police and gave them the iPhone MAC address that blipped into view on the logs at 4 am (the neighbour's car was stolen at 4.05).

Turned out the police didn't need this; the thief ditched the car not far away after a chase, soon after he stole it.

Still, using airodump-ng as a real world IDS is a pretty rad project y'all.

INB4 I'ma sinister curtain twitching nightmare neighbour. I don't care...
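
Pulling "who blipped into view around 4 am" out of the logs can be sketched like this. It assumes the CSV layout airodump-ng writes as I remember it (an access point section, a blank line, then a section whose header starts with `Station MAC`, with first-seen and last-seen timestamps in the next two columns), so check it against your own files. The sample text, the MAC addresses and the dates are invented.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def stations_seen(log_text: str, start: datetime, end: datetime) -> list:
    """Return (mac, probed_essids) for stations whose sighting overlaps [start, end]."""
    in_stations = False
    hits = []
    for line in log_text.splitlines():
        if line.startswith("Station MAC"):
            in_stations = True           # everything after this header is a station row
            continue
        if not in_stations or not line.strip():
            continue
        # Probed ESSIDs may contain commas themselves, so split at most 6 times.
        fields = [f.strip() for f in line.split(",", 6)]
        if len(fields) < 6:
            continue
        first_seen = datetime.strptime(fields[1], TIME_FORMAT)
        last_seen = datetime.strptime(fields[2], TIME_FORMAT)
        if first_seen <= end and last_seen >= start:
            probed = fields[6] if len(fields) > 6 else ""
            hits.append((fields[0], probed))
    return hits

# Invented sample, loosely modeled on the assumed layout (access point columns trimmed).
SAMPLE_LOG = """\
BSSID, First time seen, Last time seen
00:00:5E:00:53:10, 2016-03-01 00:00:01, 2016-03-01 04:30:00

Station MAC, First time seen, Last time seen, Power, # packets, BSSID, Probed ESSIDs
00:00:5E:00:53:01, 2016-03-01 03:58:10, 2016-03-01 04:01:22, -70, 12, (not associated) , CoffeeShop
00:00:5E:00:53:02, 2016-03-01 01:15:00, 2016-03-01 01:16:30, -80, 3, (not associated) ,
"""

hits = stations_seen(SAMPLE_LOG, datetime(2016, 3, 1, 3, 55), datetime(2016, 3, 1, 4, 10))
```

The overlap test matters more than an exact timestamp match, since a phone walking past is logged as a first-seen/last-seen interval rather than a single moment.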
>>
>>53498547

>most websites nowadays are just an empty dom and are built with javascript upon visiting.

Thanks man, but the Request module should pull all that? It seems the lack of input from the end user (my bot) is what's causing the arse-ache? As I see it, this is a job for PhantomJS ...?
>>
>>53498634
yeah it's weird. phantomjs is probably the best tool.
i don't actually know exactly how this works, but it's something about telling web browsers apart from regular http requests and only serving assets when a browser, or an indexer like google, requests the website.
>>
>>53498600
Damn that is smart, makes me want to buy a faraday cage for my phone.
>>
>>53498600
I really like that idea. Now for the cartoon super villain part. Make a quad that can follow MAC addresses or can fly around town searching for them.
>>
>>53498671
>i don't actually know exactly how this works, but it's something about telling web browsers apart from regular http requests and only serving assets when a browser, or an indexer like google, requests the website.

Thanks for the intel Friendo, I feel like all the web scraping knowledge / tutorials out on the web just don't cover this problem. They pretty much just step up to real basic 90's style web pages. I'll crack into Phantom then. Sigh (taking me forevers)
>>
Antox
>>
>>53498711
> Damn that is smart, makes me want to buy a faraday cage for my phone.

airodump-ng is a slice of delicious. Much fun...
>>
>>53498719

There's a penetration testing company who actually did something like this I think.
>>
Nyaa scraper. Up to ID 750k in nyaa, and ID 1.95 mil in sukebei.

https://github.com/altbdoor/php-nyaa-archiver
>>
>>53498711
Or you could turn off Wi-Fi when you're not using it
>>
A program I used to monitor my ghetto renderfarm in real-time, with a lot of cool graphs and stats about temps, resource usage, job progress, etc.
I wrote it because I was afraid shit would catch on fire.

I figured it would be quite useful for sysadmins looking for a quick and easy monitoring solution and I've been planning to rewrite it from scratch to be usable for much more general use cases, but I can't be assed.
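
The "is it about to catch fire" part of a monitor like that can be sketched in a few lines. This is not the program described above, just an illustration of one common approach: keep a rolling window of recent readings and alert on the average, so a single spike does not trigger a false alarm. The class name, window size and 80-degree limit are made up, and reading real sensors is left out.

```python
from collections import deque

class RollingMonitor:
    """Keeps the last `window` readings and flags when their average exceeds `limit`."""

    def __init__(self, window: int, limit: float):
        self.samples = deque(maxlen=window)  # old readings fall off automatically
        self.limit = limit

    def average(self) -> float:
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)

    def add(self, value: float) -> bool:
        """Record a reading and return True if the rolling average is over the limit."""
        self.samples.append(value)
        return self.average() > self.limit

# Made-up temperature readings from one render node, in degrees Celsius.
monitor = RollingMonitor(window=3, limit=80.0)
alerts = [monitor.add(temp) for temp in [70, 75, 85, 90, 95]]
```

Each node would get its own monitor, and the graphs can be drawn from the same `samples` window.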
>>
I wrote an entire site in Perl with FastCGI in 2009. Code is still running fine today, with some changes obviously.
>>
>>53500499
Have you watched the Perl Jam?
>>
>>53500472
anon just release as-is. Push updates later
>>
File: 1432737239287.jpg (54 KB, 680x571)
>>53500499
>Perl with FastCGI in 2009

please put a trigger warning next time you say something like this
>>
>>53500508
Yes, and I'm not affected by any of that stuff.
>>
>>53500541
But it's written in unportable and unmaintainable python and has an embarrassingly hideous god object doing most of the work.
I would at least want to clean the code up before releasing it publicly, especially since a lot of potential recruiters might read it before considering an interview.

Besides, a lot has changed since I wrote it and new technologies like NoSQL DBs and WebSockets would make it much more efficient and reduce its code complexity by a lot, particularly on the client (web viewer) side.
>>
>>53500674
why not just host it on neetcode or something like that instead of github? no employer will look on there
>>
>>53496553

This would get you a job. Either stop 'sperging out, or move away from your parents' place in the middle of nowhere.
>>
>>53498260
Could you use it to identify that faggot who uses multiple quote marks like """" this"""" to emphasize shit? I've seen his posts around here, r9k and even pol.
>>
>>53500508
I didn't know DSP attended security conferences.