So I thought maybe I could simply get them off of Facebook. No go!
Why? Facebook doesn't provide plain-text email addresses, which presents a bit of a problem. After a little research, it became clear that FB uses one of those string-to-image scripts. Hah! Easy, I thought, I'll just decode the Base64 string and voila... As it happens, it's not that easy. It's not a Base64 string, and to be honest I couldn't figure out what it was. So that left me with the other option: OCR.
This didn't prove too difficult at all. For the most part, all I had to do was go through all my friends' profile pages, extract the string_image hash, and pass that to
http://m.facebook.com/string_image.php?ct=XXXXXXX&fp=8.7&state=0
where ct takes the hash and fp is a float that controls the size of the output image. 8.7 is the standard value. You can crank that up to improve the OCR detection rate. I found 35 to be the best trade-off between size and clarity.
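To make the fp swap concrete, here is a minimal sketch in Python (not the Perl used further down) that rewrites only the fp query parameter and leaves ct alone. The helper name set_fp is made up for illustration, and the ct value is the same placeholder as in the URL above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_fp(url: str, size: float) -> str:
    """Return url with its fp query parameter replaced by size.

    set_fp is a hypothetical helper name, not part of any library.
    """
    parts = urlsplit(url)
    # Rebuild the query pair by pair so that only fp changes.
    query = [
        (key, str(size) if key == "fp" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


example = "http://m.facebook.com/string_image.php?ct=XXXXXXX&fp=8.7&state=0"
print(set_fp(example, 35))
```

Working on the parsed query rather than doing a blind text substitution means a ct hash that happens to contain something like "8x7" is never touched. Note that urlencode may percent-encode characters such as / or = inside ct, which is equivalent as far as the server is concerned.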
Based on this, I was able to whip up a quick bash script to take in a list of user IDs (just a bunch of numbers that correspond to a given user; do what you will to grab them), grab the email image, and run OCR on it. I used OCRAD to do the OCR and ImageMagick for the conversion.
EDIT: It saddens me that some people have been making money off the code that I wrote. I helped you guys out in good faith. Really sucks that you took advantage of it. Anyway, I've decided to re-post the code here so the lamesters can be exposed for what they are. I'm posting the rewritten Perl code here, since the original bash thing didn't work anyway.
NOTE: I have made some deliberate omissions here. Modifications are needed before the code will be functional. You WILL GET BANNED by Facebook if you overdo it.
Here's the bash stuff:
BOING BOING!!! Where did the code go? Sorry guys, I had to remove it.
And in Perl (the xxxx's should be easy to figure out if you see my other scripts):
#!/usr/bin/perl
use strict;
use xxxxx;
use xxxxx;
use Image::Magick;
use Shell qw[ocrad];

my $username = $ARGV[0];
my $password = $ARGV[1];
my $iurl;               # temp var
my $id;                 # temp var
my $x;                  # temp var
my $uids   = "uids";    # path of uid list file
my $idlist = "idlist";  # path of output file
my $size   = 35;        # size of email image to download

my $mech  = xxxxxxx->new();
my $image = Image::Magick->new();
$mech->cookie_jar(xxxxxxxxx->new());

# login
$mech->post("https://login.facebook.com/login.php?m&next=http://m.facebook.com/inbox", {email=>$username, pass=>$password});

# start processing uids
open(UIDS, $uids) or die "can't read $uids: $!";
open(IDLS, ">>$idlist") or die "can't append to $idlist: $!";
foreach $id (<UIDS>) {
    chomp($id);
    $mech->get("http://m.facebook.com/profile.php?id=".$id."&v=info&refid=17");
    if (defined ($iurl = $mech->find_image( url_regex => qr/string_image\.php/ ))) {
        # swap the default size for ours; anchored on fp= so the ct hash is never touched
        ($iurl = $iurl->url_abs()) =~ s/fp=8\.7/fp=$size/;
        chomp($iurl);
        $x = $image->Read($iurl);
        $x = $image->Write(gamma=>0.3, colorspace=>'rgb', filename=>$id.".ppm");
        # one line per user: id, then whatever ocrad read from the image
        print IDLS "$id,".ocrad("$id.ppm")."\n";
        @$image = ();   # empty the image list before the next user
    } else {
        print IDLS "$id,undefined\n";
    }
}
close(UIDS);
close(IDLS);
This works remarkably well for the most part, although ocrad did confuse some 1's for l's. I had better results with tesseract, but I had to convert all the images to bi-tonal graymaps first. Otherwise it's simply useless.
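For anyone wondering what "bi-tonal graymap" means in practice, here is a rough sketch in plain Python (not the Perl above) of the idea: snap every gray value to pure black or pure white, then write the result out as a plain-text PGM. The threshold of 128 and the function names binarize and to_pgm are assumptions for illustration. This is not the conversion I actually ran, and the right cut-off will depend on the image.

```python
def binarize(pixels, threshold=128):
    """Map each gray value (0-255) to 0 (black) or 255 (white).

    The default threshold of 128 is an assumed midpoint, not a tuned value.
    """
    return [[255 if value >= threshold else 0 for value in row] for row in pixels]


def to_pgm(pixels):
    """Serialize rows of gray values as a plain-text PGM (P2) image."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    lines = ["P2", f"{width} {height}", "255"]
    lines += [" ".join(str(value) for value in row) for row in pixels]
    return "\n".join(lines) + "\n"


# Tiny made-up 3x2 image, just to show the effect of the threshold.
sample = [[12, 200, 130], [127, 128, 255]]
bitonal = binarize(sample)
print(to_pgm(bitonal))
```

A real image would be loaded through an imaging library rather than typed in as a list, but the thresholding step is the same.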
11 Comments:
isn't it sort of a bad idea posting this? script-kiddies, spam, end-of-the-world. Get my idea?
Bad...BAAD!
niiiicce... hmmm but i do agree with anon up there...but u could remove the script and post the article just not the entire thing???
thanks. I used this with some modifications to backup my contacts emails and then make a new facebook account and invite everybody to it. Very helpful. I wish facebook would provide this information by default.
they got to you huh? removing the code?
the thing that is interesting is ct.
ct is double base64.
the first part is a 128-bit checksum/MD5 hash, followed by a double-byte length. then the final bit is length/8 blocks of code-book encrypted data ("code book" is like a lookup table: one block doesn't affect the next). However, I don't know how to determine the encryption algorithm.
However, the iPhone interface uses plain text to specify email addresses.
dam script kiddie , please post the code ! ...
sprnch.com has a scraper that does this as well
Hi there, can I peek at the script? I'm trying to learn this stuff.
My email addy is highrider778@gmail.com Let me know if you can or can't either way let me look at the script, thanks!