Help with Python script

Thread replies: 11
Thread images: 1

Anonymous
Help with Python script 2016-04-04 17:52:40 Post No. 53863908

File: 9da8e6ba78759e88180d41445029fb16.jpg (13 KB, 236x346)

Need some help implementing certain logic in a .py script.

The subject is a command-line app for downloading videos from the FC2 portal. The problem is that most recent videos fail to download. I've found a possible solution.

Original file: https://github.com/h-collector/youtube-dl/blob/master/youtube_dl/extractor/fc2.py#L79
What needs to be done: the request to "info_url" has to carry an additional parameter named gk (like &gk=n27cXGgTDW). Its value is computed per page and hidden/scrambled inside the cass() JavaScript function. You can get the value by running alert(cass()) in the browser console. However, this logic needs to live in the .py script itself. I'm too dumb to do this myself.
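
A minimal sketch of the URL side of this, assuming the gk value has already been recovered as a string. The helper name add_gk and the example URL are made up for illustration; the real info_url built in fc2.py may carry different parameters. Written for Python 3.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def add_gk(info_url, gk):
    """Return info_url with a gk=<value> query parameter appended (hypothetical helper)."""
    parts = urlsplit(info_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Drop any stale gk so the parameter appears only once.
    query = [(k, v) for k, v in query if k != "gk"]
    query.append(("gk", gk))
    return urlunsplit(parts._replace(query=urlencode(query)))

# Placeholder URL, not the real FC2 endpoint layout.
example = add_gk("http://example.invalid/info.php?v=abc&lang=en", "n27cXGgTDW")
print(example)
```

Going through urlencode rather than plain string concatenation keeps the result valid even if the recovered value ever contains characters that need escaping.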

The file is from an outdated youtube-dl fork that works better than the original. The original authors never bothered to fix the FC2 script properly, and it still has other problems in addition to this one.

>>

Anonymous 2016-04-04 19:40:53 Post No.53865549

cass() isn't showing up as a function

>>

Anonymous 2016-04-04 20:47:28 Post No.53866536

>>53863908
If you don't mind evaluating the JavaScript source, it's easily doable with spidermonkey or js2py, but it probably won't be accepted if you open a pull request. Have you seen that cass calls another function, ca***, whose scrambling differs for each page?

>>

Anonymous 2016-04-04 20:48:46 Post No.53866552

>>53863908
Just update to the latest youtube-dl, mate.

>>

Anonymous 2016-04-04 20:50:54 Post No.53866586

>>53863908
>>53866536
However, I strongly suspect there is a better way to get the video URL. I remember watching a video tutorial some time ago on how to reverse it using the network panel in the Firefox/Chrome developer tools.

>>

Anonymous 2016-04-04 21:10:51 Post No.53866884

>>53866536
Pulling in an additional parser would probably do no good. You can tell the sequence just from looking at the source of cass(): it's an array of (char_number, char) pairs. Also, I'm not going to do a pull request. Looks like both projects are as good as dead in terms of FC2 support.

>>53866552
The latest youtube-dl has the same bug and crashes on the same videos. An example is http://video.fc2.com/en/a/content/20160404BTueH9Sx/
Also, it fails to save non-ASCII characters in the filename, while this two-year-old fork does that just fine.

>>

Anonymous 2016-04-04 21:25:22 Post No.53867101

>>53866884
I was trying to write my solution with spidermonkey when I accidentally came across https://github.com/fent/node-youtube-dl
Then it's just "sudo npm install youtube-dl -g" if you already have node and npm.
Bad news is that it now gets called instead of my Python youtube-dl on my Linux box.
Good news is that it works, so no Python practice for today.
Can we call it solved for now?

>>

Anonymous 2016-04-04 21:31:04 Post No.53867200

>>53866884
>>53867101
>http://video.fc2.com/en/a/content/20160404BTueH9Sx/
Never mind, it works only for some videos.

>>

Anonymous 2016-04-04 22:14:54 Post No.53867925

>>53863908
>>53866884
Ok, you were right.
I used a string named html_source containing the page source, because right now I have no time to check what the "webpage" variable in youtube-dl contains.
And sorry for the ugliness.

<code>
html_source = " ,,, "

import re
our_lines = [l for l in html_source.split("\n") if re.match(r"c[0-9]= new Array\(", l)]

# removing superfluous:
new_lines = []
for l in our_lines:
    m = re.match(r"c.= new Array\((.,'.)'\);", l)
    if m:
        new_lines.append(m.groups()[0])

splitted = [s.split(",'") for s in new_lines]
our_sequence = "".join([l[1] for l in sorted(splitted)])

print(our_sequence)
</code>

>>

Anonymous 2016-04-04 22:17:01 Post No.53867964

>>53867925
<.<

html_source = " ,,, "

import re
our_lines = [l for l in html_source.split("\n") if re.match(r"c[0-9]= new Array\(", l)]

# removing superfluous:
new_lines = []
for l in our_lines:
    m = re.match(r"c.= new Array\((.,'.)'\);", l)
    if m:
        new_lines.append(m.groups()[0])

splitted = [s.split(",'") for s in new_lines]
our_sequence = "".join([l[1] for l in sorted(splitted)])

print(our_sequence)

>>

Anonymous 2016-04-04 22:19:05 Post No.53867999

>>53867964
Bah. I shouldn't reply when I'm in a hurry.

import re
our_lines = [l for l in html_source.split("\n") if re.match(r"c[0-9]= new Array\(", l)]
## removing superfluous
new_lines = []
for l in our_lines:
    m = re.match(r"c.= new Array\((.,'.)'\);", l)
    if m:
        new_lines.append(m.groups()[0])

splitted = [s.split(",'") for s in new_lines]
our_sequence = "".join([l[1] for l in sorted(splitted)])
print(our_sequence)
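
A slightly more forgiving variant of the snippet above, as a sketch only. It assumes the page still declares the pieces as lines shaped like c0= new Array(3,'n'); (the format the regex above expects). The sample string is invented and extract_gk is a hypothetical name. The differences are that it scans the whole source instead of line starts, tolerates extra whitespace, and sorts positions numerically so that multi-digit positions order correctly.

```python
import re

# Matches e.g.  c0= new Array(3,'n');  and captures (position, character).
PIECE_RE = re.compile(r"c\d+\s*=\s*new Array\(\s*(\d+)\s*,\s*'(.)'\s*\)")

def extract_gk(html_source):
    """Rebuild the scrambled string from (position, character) pairs (hypothetical helper)."""
    pairs = [(int(pos), ch) for pos, ch in PIECE_RE.findall(html_source)]
    if not pairs:
        raise ValueError("no c<N>= new Array(...) declarations found")
    # Sort numerically by position so that 10 comes after 9, not after 1.
    return "".join(ch for _, ch in sorted(pairs))

# Invented sample, only shaped like the lines the original regex targets.
sample = "\n".join([
    "c0= new Array(2,'c');",
    "c1= new Array(0,'a');",
    "c2= new Array(1,'b');",
])
print(extract_gk(sample))
```

Inside the extractor, the page source would be passed in place of sample. Whether the real pages keep this exact declaration format is untested here.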