/mlds/ - Machine Learning and Data Science >May Project

You are currently reading a thread in /g/ - Technology

Thread replies: 50
Thread images: 3
File: mltgeneral.jpg (103 KB, 638x359)
/mlds/ - Machine Learning and Data Science

>May Project
(open to suggestion, something related to /g/ traffic probably)
>4chan API
https://github.com/4chan/4chan-API
>Interesting projects, or other ML stuff, or ?
https://www.jukedeck.com
http://cs229.stanford.edu/projects2
http://imgur.com/a/K4RWn - challenge images
http://vizdoom.cs.put.edu.pl/compet
https://www.kaggle.com/ - ML competitions with rewards
https://github.com/alexjc/neural-do
https://github.com/awentzonline/ima

>Cheatsheets, infographics
http://www.datasciencecentral.com/p


>Beginner Links
https://medium.com/@ageitgey/machin
https://www.youtube.com/watch?v=bxe

>/mlds/ projects

>Free learning
https://www.coursera.org/learn/mach
https://www.coursera.org/learn/prac
https://lagunita.stanford.edu/cours
http://www-bcf.usc.edu/~gareth/ISL/


We are still in the stages of making a dev team, I'll make a fake email after I eat breakfast. Throw some ideas out there!
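For the /g/ traffic idea, here is a minimal sketch of turning one thread's JSON into posts-per-hour counts. It assumes the read-only JSON layout described in the 4chan-API repo linked above: a top-level "posts" list whose entries carry "no", a UNIX timestamp in "time", and an HTML body in "com". The endpoint pattern in the comment, the helper names, and the sample data are assumptions for illustration, so check them against the repo before relying on them.

```python
import html
import json
import re
from collections import Counter
from datetime import datetime, timezone

# Assumed endpoint pattern (verify against the 4chan-API docs):
#   https://a.4cdn.org/<board>/thread/<thread_number>.json
# Fetching is left out so the sketch runs offline on a saved response.

TAG_RE = re.compile(r"<[^>]+>")


def clean_comment(raw_html):
    """Turn the HTML comment body into plain text."""
    text = raw_html.replace("<br>", "\n")
    text = TAG_RE.sub("", text)      # drop remaining tags such as quote links
    return html.unescape(text)       # &gt;&gt;1 becomes >>1


def posts_per_hour(thread_json):
    """Count posts per UTC hour from a thread's JSON text."""
    counts = Counter()
    for post in json.loads(thread_json)["posts"]:
        ts = datetime.fromtimestamp(post["time"], tz=timezone.utc)
        counts[ts.strftime("%Y-%m-%d %H:00")] += 1
    return counts


# Hypothetical sample shaped like an API response, not real thread data.
SAMPLE = json.dumps({"posts": [
    {"no": 1, "time": 1462800000, "com": "first<br>post"},
    {"no": 2, "time": 1462800900, "com": "&gt;&gt;1<br>reply"},
    {"no": 3, "time": 1462804000, "com": "later reply"},
]})

if __name__ == "__main__":
    print(posts_per_hour(SAMPLE))
    print(clean_comment("&gt;&gt;1<br>reply"))
```

Summing these counters over many saved threads would give a rough hourly traffic curve for the board.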
>>
>>54495147
>no links or mention of kaggle

I'm very disappointed
>>
>Data Science

you mean statistics?
>>
>>54495414
No, I mean like combinations and permutations of data sets. Fuck outta here with your pussy stats.
>>
>>54495147
How many of these threads have to 404 before you decide it's a bad idea
>>
>>54495398
Well the whole idea was started last night while I was on Xanax, and I thought to myself "Hey, I'd really like to do something with this."

Ergo the thread.
>>
>>54495452
Well the most I'm gonna post is 3. After three, I'll just go enjoy my vacation.
>>
Does anyone have experience with unbalanced data?

I'm still kinda learning, but I'm applying logistic regression to some data which has something like 200 positives and 1800 negatives.

I've read some things online like bootstrap or undersampling, I'll try them later.
But everything I've read seems to be written for cases where the imbalance comes from how the data was gathered. In my case the real data will have that same ratio or lower. Are those techniques still appropriate?

Also most of the links on the OP got cut off.
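The two usual options for the 200 positive / 1800 negative case can be sketched as follows: reweighting the classes, or randomly undersampling the majority class. The weight formula is the common "balanced" heuristic, n_samples / (n_classes * class_count); the function names and the placeholder rows are hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the rarest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(items) for items in by_class.values())
    out = []
    for label, items in by_class.items():
        for row in rng.sample(items, target):
            out.append((row, label))
    rng.shuffle(out)
    return out


# Synthetic stand-in for the 200 positive / 1800 negative case.
labels = [1] * 200 + [0] * 1800
rows = list(range(len(labels)))   # placeholder feature rows

weights = balanced_class_weights(labels)   # positives weigh 5.0, negatives about 0.56
balanced = undersample(rows, labels)       # 200 rows of each class
```

On the question of whether this is still valid when the real data has the same ratio: both tricks change the class prior the model effectively sees, so predicted probabilities for the rare class come out inflated. If calibrated probabilities matter, evaluate (and pick the decision threshold) on an untouched split that keeps the real ratio.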
>>
>>54495471
I'm just kidding dude, you mentioned kaggle. Did you write the OP?
>>
>>54496071
I didn't write it, but I guess the person who did wasn't serious about doing some work. So I copied it. I'm already taking Linear Algebra in the fall so I'd love to get some practice with applications of that.
>>
>>54496060
I do not have experience with unbalanced data sorry.
>>
>>54496060
You may try an approach from anomaly detection rather than classification
>>
>>54495147
kool, nice thread op, was just thinking it would be nice to see some real computer science on /g sometimes

have been thinking for a while it would be fun to try and analyse post times, unique posters, replies, punctuation etc to essentially build a samefag detector
I think it could be reasonably reliable
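A rough sketch of the writing-style part of that idea, using character trigram profiles compared by cosine similarity. This is a swapped-in, standard stylometry trick that only captures punctuation, capitalisation, and spelling habits; post times and reply patterns are not modelled here. The function names and the shape of the `posts` mapping are hypothetical.

```python
import math
from collections import Counter


def trigram_profile(text):
    """Count overlapping character trigrams, keeping case and punctuation."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(p, q):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * q[gram] for gram, count in p.items())
    norm_p = math.sqrt(sum(c * c for c in p.values()))
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def most_similar_pairs(posts, top=3):
    """Rank pairs of posts by style similarity; posts maps post number to text."""
    profiles = {no: trigram_profile(text) for no, text in posts.items()}
    numbers = sorted(profiles)
    pairs = [
        (cosine(profiles[a], profiles[b]), a, b)
        for i, a in enumerate(numbers)
        for b in numbers[i + 1:]
    ]
    return sorted(pairs, reverse=True)[:top]
```

Short posts give very noisy profiles, so a high score between two one-liners means little; pooling several posts per suspected poster would probably be needed before trusting it.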
>>
>>54498928
I tried KNN and it gave really bad results, though granted, that probably doesn't work well for much of anything.

I'm going to try with the one class support vector machine, mostly because I already kinda read about support vector machines, thanks for the advice.
>>
>>54500513
I've had good results with naive bayes on similarly sized datasets as well.
>>
>>54500609
I just tried it and it didn't work for me, 0.73 ROC AUC compared to the ~0.87 I'm getting with logistic regression.
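For reference, ROC AUC can be read as the probability that a randomly chosen positive gets a higher score than a randomly chosen negative. A minimal sketch that computes it directly from that definition (the function name is hypothetical, and the pairwise loop is quadratic, which is fine for roughly 200 x 1800 pairs):

```python
def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Read this way, 0.73 versus 0.87 means the second model orders positive/negative pairs correctly noticeably more often. With data this skewed, it may also be worth looking at precision and recall on the positive class, since ROC AUC only measures ranking.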
>>
>>54496060
try data augmentation maybe.
>>
How do I into network/graph analytics?
>>
File: 1458829091123.jpg (453 KB, 1872x1990)
thanks op i saved your thread opener
i'll play with it over the summer
>>
>>54495147
op, half the links are broken. could you fix them?
>>
>>54501917
You're supposed to write an AI that can predict the content that those pages should have.
>>
>>54501959
nigga please not when the beginner links are broken
>>
Has anyone here applied any of these techniques to the stock market?

I've been reviewing a lot of computational intelligence techniques over the past few months and am thinking of giving it a try with neural networks.

My end goal is to make a living off an automated trading program.
>>
>>54502263
considering even a mediocre stock trader knows how to quickly scan news articles about their assets, you have a pretty high bar to beat there (you'd need NLP)
>>
>>54502263
If you're talking about looking at graphs and finding patterns you can forget about it.
>>
>>54502263
good way to lose all of your money fast
>>
>>54495147
tell me a difference between Machine learning/ANNs and optimized brute force.

PROTIP:you can't

ANNs are the biggest meme in science in the last 50 years
>>
File: 1461042249562.gif (1 MB, 371x209)
>>54502263
RIP in pepes
>>
>>54502263

Take a look at https://numer.ai//
>>
>>54502642
what the fuck are you even on about?
>>
>>54503576
That looks so l33t
>>
>>54502642
>optimized brute force
this term accurately describes solving literally any problem using literally any solution (except "unoptimized" brute force, i guess)
>>
>>54502719
>dat gif
am I the only one who thinks it looks like she's blaming the hammer?
>>
anyone using Julia?

reminder there is no reason to have to use R, Python, C/C++ anymore with Julia's speed, memory capacity, ease of syntax, low level and network ability
>>
>>54495147
Just got done with a data science class. Count me in!

What if we designed a decision making program that reads the words in posts and determines the appropriate thread for the particular post?
>>
>>54505689
Sorry, meant appropriate board.
>>
>>54505689
>>54506112

Oh shit, this would be great

Integrate into 4chan so every thread gets posted into exactly the board/containment it should be

Move threads between boards when they get spammed with too much anime or go off-topic
>>
>>54505689
Good luck making it learn dank memes tbqhwy famalampai
>>
>>54503576
Nice website

>>54502263
>neural networks
Won't work, good luck
>>
>>54506913
Not as hard as you think. Acronyms and board-related jargon would actually help a text learner classify the proper board, given it uses a Bayesian decision forest.
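To illustrate why jargon helps, here is a minimal sketch of a board classifier. It uses plain multinomial naive Bayes with add-one smoothing as a simpler stand-in, not the "Bayesian decision forest" mentioned above. The class name and the tiny training set are made up for the example; a real attempt would need many labelled posts per board.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesBoardClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, posts, boards):
        self.word_counts = defaultdict(Counter)
        self.board_counts = Counter(boards)
        self.vocab = set()
        for text, board in zip(posts, boards):
            tokens = tokenize(text)
            self.word_counts[board].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_posts = sum(self.board_counts.values())
        best_board, best_score = None, -math.inf
        for board, n_posts in self.board_counts.items():
            counts = self.word_counts[board]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_posts / total_posts)   # log prior for the board
            for tok in tokenize(text):
                if tok in self.vocab:                 # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_board, best_score = board, score
        return best_board


# Tiny made-up training set, for illustration only.
train_posts = [
    "which gpu for deep learning on linux",
    "install gentoo and compile your kernel",
    "best thinkpad for programming",
    "this season of anime has great animation",
    "which manga should i read after this anime",
    "the opening song of that anime is great",
]
train_boards = ["g", "g", "g", "a", "a", "a"]

clf = NaiveBayesBoardClassifier().fit(train_posts, train_boards)
print(clf.predict("what linux kernel for my thinkpad"))
```

Words that appear almost exclusively on one board get a large likelihood ratio, so board-specific slang ends up as the strongest signal rather than an obstacle.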
>>
>>54502263
Unlikely to work unless you have access to a massive financial database. Focusing solely on price ignores far too many factors, but it's definitely a fun little project
>>
>tfw too stupid for machine learning
But I'm really fucking interested in it.

I know it's not really ML related, but are genetic algorithms a good starting point if I want to get into this sort of stuff?
>>
>>54508178
Sure bruh.
Start with those, then I suggest you read up on fuzzy logic and then move on to perceptron networks.
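A genetic algorithm small enough to read in one sitting, as a starting point. This sketch solves the toy OneMax problem (maximise the number of 1 bits) with tournament selection, single-point crossover, bit-flip mutation, and elitism. All parameter values and names are arbitrary choices for illustration.

```python
import random


def fitness(bits):
    """OneMax: the more 1s, the fitter the individual."""
    return sum(bits)


def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]                      # elitism: keep the best unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of 3 random individuals, done twice.
            mum = max(rng.sample(pop, 3), key=fitness)
            dad = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, n_bits - 1)        # single-point crossover
            child = mum[:cut] + dad[cut:]
            # Bit-flip mutation.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print(history[0], "->", history[-1])
```

Swapping `fitness` for any other function of a bit string turns this into a solver for that problem, which makes it easy to play with.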
>>
Anyone taking the Coursera course taught by Andrew Ng right now?
>>
Anyone experimenting with cellular automata neural networks?
Something similar to this: https://en.wikipedia.org/wiki/CoDi
>>
>>54508470
When should I move onto neural networks?
I'm mainly interested in those.
>>
>>54508543
If you want you can skip the first two, it was just to give you a list of easy stuff to do.
You can jump into NN as soon as you have a basic understanding of linear algebra and some calculus (derivatives, gradients...).
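To show where the linear algebra and calculus come in, here is a minimal sketch of a single sigmoid neuron trained with batch gradient descent on logical AND. The dot product is the linear algebra part; the gradient of the log-loss is the calculus part. The function names, learning rate, and epoch count are arbitrary choices for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, lr=0.5, epochs=5000):
    """Fit one sigmoid neuron with batch gradient descent on log-loss."""
    n_inputs = len(samples[0][0])
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        for x, y in samples:
            # Linear algebra: the pre-activation is the dot product w.x + b.
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Calculus: for log-loss, d(loss)/d(w_i) = (p - y) * x_i.
            err = p - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Step against the mean gradient.
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Logical AND is linearly separable, so a single neuron is enough.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_neuron(AND)

if __name__ == "__main__":
    for x, y in AND:
        print(x, y, round(predict(w, b, x), 3))
```

A multi-layer network repeats the same two ingredients, with the chain rule carrying the gradient back through each layer.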
>>
>>54508728
>You can jump into NN as soon as you have a basic understanding of linear algebra and some calculus (derivatives, gradients...).
Well I'm already familiar with those two, so I guess I will do NN's once I'm done with genetic algorithms.
I thought they would require more knowledge than that, though.
>>
>>54508799
They do, but not if you only want to play around with easy problems.
If you want to do image recognition you'll need a bit more general knowledge.