perl.beginners: postings from April 2011

Re: Regular Expressions Question

From: Rob Dixon
Date: April 11, 2011 08:51
Subject: Re: Regular Expressions Question
Message ID: 4DA32372.7020903@gmx.com

On 11/04/2011 06:43, Shlomi Fish wrote:
> On Sunday 10 Apr 2011 14:05:49 cityuk wrote:
>>
>> This is more of a generic question on regular expressions as my
>> program is working fine but I was just curious.
>>
>> Say you have the following URLs:
>>
>> http://www.test.com/image.gif
>> http://www.test.com/?src=image.gif?width=12
>>
> 
> Don't use regular expressions to parse URLs - instead use URI.pm:
> 
> http://cpan.uwinnipeg.ca/dist/URI

I agree. The program below shows a subroutine which will extract the
file type from either form of URL. It first checks to see if there is a
'src' option in the query, using this for the file name if so; otherwise
it uses the last segment of the URL path. The file type is extracted by
capturing the trailing run of non-dot characters at the end of the file
name.

(I assume your second address should read
<http://www.test.com/?src=image.gif&width=12> with an ampersand instead
of a second question mark?)

HTH,

Rob


use strict;
use warnings;

use URI;

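# Return the file type (extension) for a URL, whatever the calling context.
sub filetype_from_url {
  my $url  = URI->new($_[0]);
  my %form = $url->query_form;
  # Prefer the 'src' query parameter; otherwise take the last path segment.
  my $file = $form{src} || ($url->path_segments)[-1];
  # Capture the trailing run of non-dot characters; the list assignment
  # keeps the captured text even when the caller wants a scalar.
  my ($type) = $file =~ /([^.]+)\z/;
  return $type;
}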

print filetype_from_url('http://www.test.com/image.gif'), "\n";
print filetype_from_url('http://www.test.com/?src=image.gif&width=12'), "\n";
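
The capture step can also be tried on its own, without URI. What follows is only a minimal sketch: the helper name trailing_type is hypothetical and not part of the program above. It shows what the pattern yields for a few file names, and that a name with no dot comes back whole because the pattern does not require a dot.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): return the trailing run of
# non-dot characters at the end of a file name.
sub trailing_type {
  my ($file) = @_;
  # List assignment keeps the captured text; a plain scalar assignment
  # would store only the true/false result of the match.
  my ($type) = $file =~ /([^.]+)\z/;
  return $type;
}

print trailing_type('image.gif'), "\n";       # gif
print trailing_type('archive.tar.gz'), "\n";  # gz
print trailing_type('README'), "\n";          # README (no dot in the name)
```

If a name without an extension should give no file type at all, one option is to require the dot in the pattern, for example /\.([^.]+)\z/.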






