Computer Science

Thread replies: 7
Thread images: 1

Anonymous
Computer Science 2016-01-12 18:59:42 Post No. 7779060

File: 2016-01-12-185924_752x799_scrot.png (107 KB, 752x799)

How does one get a good understanding of preprocessing data before starting to think about neural network architecture, etc.?

Is there a checklist or something? I guess there's imputation if needed, converting categorical variables to numerical ones, then... I look for correlations (correlation matrix) and maybe for mutual information (to check for non-linear dependence), but what else? I don't know, is there a complete guide for this?
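For reference, the steps listed above can be sketched in plain Python. This is only a rough illustration, not a complete guide: the helper names (`impute_mean`, `one_hot`, `pearson`, `mutual_information`) are made up for this example, mean imputation and one-hot encoding are just one common choice each, and the mutual information estimate uses a crude equal-width histogram (4 bins, assumed). It shows why checking mutual information on top of the correlation matrix is worthwhile: for y = x^2 on a symmetric range the Pearson correlation is essentially zero while the mutual information is clearly positive.

```python
import math
from collections import Counter

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def one_hot(values):
    """Turn a categorical column into a dict of 0/1 indicator columns."""
    return {cat: [1 if v == cat else 0 for v in values]
            for cat in sorted(set(values))}

def pearson(x, y):
    """Pearson correlation coefficient (measures linear dependence only)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def bin_equal_width(x, bins):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant columns
    return [min(int((v - lo) / width), bins - 1) for v in x]

def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information, in nats."""
    bx, by = bin_equal_width(x, bins), bin_equal_width(y, bins)
    n = len(x)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())

# Toy data: y depends on x, but not linearly.
x = [i / 2 for i in range(-4, 5)]
y = [v * v for v in x]

print(impute_mean([1.0, None, 3.0]))   # missing value filled with the mean
print(one_hot(["red", "blue", "red"])) # one indicator column per category
print(pearson(x, y))                   # close to zero
print(mutual_information(x, y))        # clearly above zero
```

In practice a library would handle all of this (and with better estimators), but the order of operations is the same: fill gaps, encode categories, then look at both linear and non-linear dependence between each feature and the target.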

Also, computer science general

>>

Anonymous 2016-01-12 19:02:20 Post No.7779065

>http://blog.kaggle.com/2016/01/04/how-much-did-it-rain-ii-winners-interview-1st-place-pupa-aka-aaron-sim/

>mfw random physics guy jumps into ML and gets #1

>>

Anonymous 2016-01-12 19:18:54 Post No.7779090

>>7779060
fuck NNs, Bayesian program learning BTFO deep learning: http://science.sciencemag.org/content/350/6266/1332.full

>>

Anonymous 2016-01-12 19:20:12 Post No.7779091

>>7779090
nice paywall

>>

Anonymous 2016-01-12 19:54:22 Post No.7779172

>If I were to take one point away from this contest, it is that the days of manually constructing features from data are almost over. The machines will win. I experienced this in the Plankton classification contest where the monumental effort that my teammate and I put into extracting image features was eclipsed within minutes by even the shallowest of CNNs.

>>

Anonymous 2016-01-13 08:35:19 Post No.7780343

>>7779060
That basically means you have to learn the field you are trying to do learning on.

>>7779172
People in general don't bother reading it if it's behind a paywall. Also the machines won't win if you don't have a method of selecting relevant training data. Any machine learning method could fail if you train it using the wrong data. Manually selected features could be used to disqualify the worst training data to avoid ruining the network.
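As a rough illustration of that last point, here is a minimal sketch of using one hand-built feature to throw out the worst training samples before anything is fed to a network. Everything in it is an assumption made for the example: the feature (`missing_fraction`, the share of missing readings in a sample), the 0.5 cutoff, and the toy data. Which feature and threshold actually make sense depends on knowing the field the data comes from.

```python
def missing_fraction(sample):
    """Hand-built feature: share of readings in a sample that are None.

    Assumes the sample is a non-empty list.
    """
    return sum(v is None for v in sample) / len(sample)

def filter_training_data(samples, labels, max_missing=0.5):
    """Keep only the (sample, label) pairs that pass the quality check."""
    kept = [(s, y) for s, y in zip(samples, labels)
            if missing_fraction(s) <= max_missing]
    return [s for s, _ in kept], [y for _, y in kept]

# Toy data (hypothetical): each sample is a short series of sensor readings.
samples = [
    [0.1, 0.4, None],    # 1/3 missing, kept
    [None, None, None],  # all missing, dropped
    [0.3, None, None],   # 2/3 missing, dropped
    [0.2, 0.2, 0.5],     # nothing missing, kept
]
labels = [1.0, 7.5, 0.0, 2.0]

clean_x, clean_y = filter_training_data(samples, labels)
print(len(clean_x), clean_y)
```

The filter runs before training, so the learned features never see the disqualified samples.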

>>

Anonymous 2016-01-13 09:28:47 Post No.7780382

>>7780343
>not being part of a group that provides access to all papers you want