I'm sorry again for the last email, but I also have something to ask.
(m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/)
My first question is about this element, ([\w\d]{2,4}), which seems to match
an extension of 2 to 4 characters (.ex, .ext or .exte), for example .mp3.
I understand that \w matches an alphanumeric character, including "_",
the same as [A-Za-z0-9_] in ASCII. So it already covers digits, letters
and the underscore, which is correct here. The thing that confuses me is
that we have also used \d, which matches a digit, the same as [0-9] in
ASCII. So we have used 0-9 twice! Any comment about it?
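To check whether the extra \d changes anything, a small test like the following could be run. This is only a sketch: the sample extensions and the two sub names are made up for illustration and are not part of the original script. Since every character that \d matches is also matched by \w, the two columns should always agree.

```perl
use strict;
use warnings;

# Returns 1 if the string matches the class exactly as written in the pattern.
sub matches_with_d    { return $_[0] =~ /^[\w\d]{2,4}$/ ? 1 : 0 }

# Returns 1 if the string matches once the \d has been removed.
sub matches_without_d { return $_[0] =~ /^\w{2,4}$/ ? 1 : 0 }

# Made-up sample extensions: some valid, one too short, one too long.
for my $ext (qw(mp3 exe jpeg a_b x toolong)) {
    printf "%-8s [\\w\\d]=%d  \\w=%d\n",
        $ext, matches_with_d($ext), matches_without_d($ext);
}
```

If the columns really are identical, ([\w\d]{2,4}) could presumably be shortened to (\w{2,4}) without changing what it matches.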
I'm also looking at these URLs again:
#generic http://variable.domain.com/path/filename."ex", "ext" or "exte"
#http://cdn1-28.projectplaylist.com
#http://s1sdlod041.bcst.cdn.s1s.yimg.com
^ matches the beginning of a line or string.
After m/^http:\/\/ we used (.*?) at the start, which seems to match anything!
Let's look at this URL: #http://s1sdlod041.bcst.cdn.s1s.yimg.com
If I'm correct, then (.*?) matches "s1sdlod041", and the second
element, (\.[^\.\-]*?\..*?), starts at the . after "s1sdlod041",
so now we have "http://s1sdlod041.". But I want to know about
"[^\.\-]*?\..*?": what do the [] do, and is the ^ applied to \. and \-?
I ask because we are also looking for dashes or dots. After that we used
"*" (anything!) and then the question mark "?". Something else that
confuses me is "\.." or "\..*?".
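For reference, as far as I can tell from the Perl regex documentation: inside [], a leading ^ negates the class, so [^\.\-] means "any one character that is not a dot or a dash"; *? repeats as few times as possible; \. is a literal dot, while an unescaped . is any character. To see what the first two groups actually capture, the host part of the pattern can be tried on its own. This is only a sketch: the sub name is made up, and a trailing "/" is added to the two host names as an assumption, because the pattern needs a slash after group 2.

```perl
use strict;
use warnings;

# Only the host part of the pattern: group 1, group 2, then the first slash.
# In list context the match returns the captured groups (empty list on failure).
sub split_host {
    my ($url) = @_;
    return $url =~ m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\//;
}

# The trailing "/" on each URL is an assumption added for this test.
for my $url ('http://cdn1-28.projectplaylist.com/',
             'http://s1sdlod041.bcst.cdn.s1s.yimg.com/') {
    my ($first, $rest) = split_host($url);
    print "$url\n  group 1: $first\n  group 2: $rest\n";
}
```

Tracing it by hand, group 1 should come out as "cdn1-28" and "s1sdlod041", and group 2 as ".projectplaylist.com" and ".bcst.cdn.s1s.yimg.com". So the dash is allowed in group 1 but not in the label that immediately follows it.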
Another question is about ([^\?\&\=]*). I think this one is for
folders, or what?
I saw the slash \/ before it, and it seems to catch
/?url=blah&C=blah2, with the "*" matching "blah" and "blah2".
If you don't mind, could you explain or illustrate
(\.[^\.\-]*?\..*?) in more detail? Or maybe explain it in
your own way, as I'm sure you are a good teacher, hehehe.
Please explain the whole match to me:
(m/^http:\/\/(.*?)(\.[^\.\-]*?\..*?)\/([^\?\&\=]*)\.([\w\d]{2,4})\??.*$/)
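To make the pieces easier to discuss, here is the same pattern spread out with the /x flag, one group per line, with my current reading of each group as a comment. It is a sketch, not the original script: the sub name is made up, the slashes are left unescaped only because the delimiters are braces, and the test URL is the FileHippo one quoted further down in this message.

```perl
use strict;
use warnings;

# Same pattern, one group per line. With /x, whitespace and comments inside
# the pattern are ignored.
my $url_re = qr{
    ^http://
    (.*?)                # group 1: as little as possible before the next dot
    (\.[^\.\-]*?\..*?)   # group 2: a dot, a label with no dot or dash, a dot, rest of host
    /
    ([^\?\&\=]*)         # group 3: path and file name, no question mark, ampersand or equals
    \.
    ([\w\d]{2,4})        # group 4: extension of 2 to 4 word characters
    \??.*$               # optional question mark, then anything up to the end
}x;

# Returns the four captured groups, or an empty list if the URL does not match.
sub split_url {
    my ($url) = @_;
    return $url =~ $url_re;
}

my @parts = split_url(
    'http://fs34.filehippo.com/6574/058e5771e07c467cb38d70ab6fbed3c0/Opera_1150b1_int_Setup.exe'
);
print "group ", $_ + 1, ": $parts[$_]\n" for 0 .. $#parts;
```

If I trace it correctly, this gives "fs34", ".filehippo.com", "6574/058e5771e07c467cb38d70ab6fbed3c0/Opera_1150b1_int_Setup" and "exe". So group 3 looks like the whole path up to the last dot rather than a single folder.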
I was eager to ask you all these questions from the start, but I was
afraid you would not help anyway.
What I have been working on so far is the FileHippo domain:
http://fs34.filehippo.com/6574/058e5771e07c467cb38d70ab6fbed3c0/Opera_1150b1_int_Setup.exe
In this case we have to change the URL into
"cdn.filehippo.com/6574/Opera_1150b1_int_Setup.exe", because we removed
the hashed folder!
It's okay, I have the script for it:
#cdn, variable 1st path
} elsif (($u =~ /filehippo/) &&
(m/^http:\/\/(.*?)\.(.*?)\/(.*?)\/(.*)\.([a-z0-9]{3,4})(\?.*)?/)) {
@y = ($1,$2,$4,$5);
$y[0] =~ s/[a-z0-9]{2,5}/cdn./;
print $x . "http://" . $y[0] . $y[1] . "/" . $y[2] . "." . $y[3] . "\n";
and it's working 100%. I can get it from the cache too. What if I want
to add wlxrs.com, as in ($u =~ /filehippo|wlxrs/)?
Would that match this URL?
http://css.wlxrs.com/HGjlAVvMlW6-1!iEEpuBkgo2TZKpU8RH!W4mH-UPgteZ8OD6Oxte!sCQWfQ1OB7A6B-NZoBS1jrItq7zq!v10A/OOB_30_IllustratedKai/15.40.1211/img/Kai_Sunny_thumbnail.jpg
I don't think so, as it has "!". Where should I add this so that it
matches a folder like
"/HGjlAVvMlW6-1!iEEpuBkgo2TZKpU8RH!W4mH-UPgteZ8OD6Oxte!sCQWfQ1OB7A6B-NZoBS1jrItq7zq!v10A/"?
Sometimes the CDN folder is the 1st folder, or the 2nd or 3rd;
it depends on the website.
Can you point me to where I should edit this script to handle WLXRS.COM?
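One way I could imagine handling a variable folder at different positions is sketched below. Everything here is an assumption for illustration: the sub name store_url, the $skip parameter (1 for the first folder, 2 for the second, and so on) and the choice to replace the first host label with "cdn" are made up and are not taken from the original script. Classes such as [^/]+ and [^?]* match "!" like any other character, so the "!" by itself should not prevent a match.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the first host label with "cdn" and drop the
# path folder at position $skip (1-based). Returns nothing if the URL does
# not look like http://host/folders/file.ext or $skip is out of range.
sub store_url {
    my ($url, $skip) = @_;
    my ($host, $path) =
        $url =~ m{^http://([^/]+)/([^?]*\.[a-z0-9]{3,4})(?:\?.*)?$}
        or return;
    my @folders = split m{/}, $path;
    my $file    = pop @folders;            # last element is the file name
    return if $skip < 1 || $skip > @folders;
    splice @folders, $skip - 1, 1;         # remove the variable folder
    $host =~ s/^[^.]+/cdn/;                # e.g. fs34.filehippo.com -> cdn.filehippo.com
    return join '/', "http://$host", @folders, $file;
}

# FileHippo: the hashed folder is the 2nd one.
print store_url('http://fs34.filehippo.com/6574/058e5771e07c467cb38d70ab6fbed3c0/Opera_1150b1_int_Setup.exe', 2), "\n";

# wlxrs.com: the variable folder is the 1st one.
print store_url('http://css.wlxrs.com/HGjlAVvMlW6-1!iEEpuBkgo2TZKpU8RH!W4mH-UPgteZ8OD6Oxte!sCQWfQ1OB7A6B-NZoBS1jrItq7zq!v10A/OOB_30_IllustratedKai/15.40.1211/img/Kai_Sunny_thumbnail.jpg', 1), "\n";
```

Note that with $skip set to 2 this keeps the first folder and drops the second, whereas the snippet above keeps $4 and leaves out $3, the first folder.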
By the way, you really helped a lot with those complicated examples, which
means I can now start matching any known cases.
Thanks a lot
Received on Tue May 31 2011 - 17:47:20 MDT