develooper Front page | perl.module-authors | Postings from February 2005

Naming Proposal: WWW::Patent::Page (continued from earlier at comp.lang.perl.modules)

Wanda Anon
February 21, 2005 14:58
I have written a new module, WWW::Patent::Page, and
propose to submit it to CPAN.  Your comments would be welcome.

Does the name seem reasonable? I am happy to take
suggestions.  I think it is reasonable to have a
"Patent" namespace in WWW, since much patent
information is available on the WWW. For example,
searches of the prior art, patent family
relationships, patent applications via XML, etc.  With
a namespace, related modules may be grouped easily. 
One can imagine future modules like
"WWW::Patent::Apply", "WWW::Patent::Family", or
"WWW::Patent::Search" for interacting with various web

WWW::Patent::Page is alpha software and my first module,
and my intent is to see if the Perl community has any
interest in the idea.  It is rough around the edges,
but passes what tests it has.

The module provides a consistent way to obtain pages
of patent documents from various patent offices that
make them available on the WWW.  Typically, doing this
is relatively easy by hand, page by page, but takes a
bit of work if you want to automate it effectively
for many pages or documents.  The offices typically
make it hard to get the whole document, presumably
because supplying that is one source of revenue.

From this primitive module, users can stitch together
TIFF or PDF pages into multipage documents by whatever
method they prefer.

The module uses submodules, specific to separate
patent offices, and comes with working examples for
the USPTO and EPO, which between them supply granted
patents in HTML and TIFF (USPTO) and PDF (US, EP, and
much of the world...). Hopefully, other interested
users will create new or improved submodules and feed
them back into the distribution.
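
To illustrate the per-office idea, here is a minimal sketch of how requests might be routed to office-specific code. The names %office_handler and fetch_page are hypothetical and are not the module's actual internals; the handlers are stand-ins that do no network access, and the defaults (USPTO, htm, page 1) are the ones listed in the documentation below.

```perl
use strict;
use warnings;

# Hypothetical registry: office name => code ref that would fetch one page.
# These handlers only build a descriptive string; they fetch nothing.
my %office_handler = (
    USPTO     => sub { my (%arg) = @_; return "USPTO:$arg{doc_id}:$arg{format}:$arg{page}" },
    ESPACE_EP => sub { my (%arg) = @_; return "ESPACE_EP:$arg{doc_id}:$arg{format}:$arg{page}" },
);

# Apply defaults, look up the handler for the requested office, delegate.
sub fetch_page {
    my (%arg) = (office => 'USPTO', format => 'htm', page => 1, @_);
    my $handler = $office_handler{ $arg{office} }
        or die "No submodule registered for office '$arg{office}'\n";
    return $handler->(%arg);
}

print fetch_page(doc_id => '6123456'), "\n";
print fetch_page(doc_id => '6123456', office => 'ESPACE_EP',
                 format => 'tif', page => 2), "\n";
```

With a table like this, supporting another office would only mean adding one more entry, which is the kind of contribution I am hoping for.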

For casual users, this module should simplify life. 
Abusive users will likely find their IP address banned
by the patent office being spidered.
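
For anyone scripting many requests, a simple delay between fetches is one way to stay polite. The following is only a sketch: make_throttle is a hypothetical helper, not something the module provides, and the 0.2 second interval is an arbitrary example value rather than a limit published by any patent office.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Returns a code ref that sleeps as needed so that successive calls
# are at least $min_interval seconds apart.
sub make_throttle {
    my ($min_interval) = @_;
    my $last_call;
    return sub {
        if (defined $last_call) {
            my $wait = $min_interval - (time() - $last_call);
            sleep($wait) if $wait > 0;
        }
        $last_call = time();
        return;
    };
}

my $throttle = make_throttle(0.2);   # arbitrary example interval
for my $page (1 .. 3) {
    $throttle->();
    # A real script would request page $page here.
    print "would fetch page $page\n";
}
```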

Here is the documentation as it now stands:

    WWW::Patent::Page - retrieve a patent page (e.g. from United States
    Patent and Trademark Office (USPTO) website or the European Patent
    Office (ESPACE_EP). )

    Please see the test suite for working examples. The following is not
    guaranteed to be working or up-to-date.

      use WWW::Patent::Page;

      my $patent_document = WWW::Patent::Page->new(); # new object
      my $document1 =
            # defaults:     office  => 'USPTO',
            #               country => 'US',
            #               format  => 'htm',
            #               page    => '1',      # typically htm IS "1" page
            #               modules => qw/ us ep / ,

      my $document2 =
                            office  => 'ESPACE_EP' ,
                            format  => 'tif',
                            page    => 2 ,

      my $pages_known = $patent_document->pages_available(  # e.g. TIFF
                            document=> '6 123 456',

      Intent:  Use public sources to retrieve patent documents such as
      TIFF images of patent pages, html of patents, pdf, etc.
      Expandable for your office of interest by writing new submodules.
      Alpha release by newbie to find if there is any

      See also SYNOPSIS above
         Standard process for building & installing

              perl Build.PL
              ./Build test
              ./Build install

    Examples of use:

      $patent_document = WWW::Patent::Page->new(
                            doc_id  =>
                            office  => 'ESPACE_EP' ,
                            format  => 'tif',
                            page    => 2 ,
                            agent   => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',
    # 'Windows IE 6' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT

    # 'Windows Mozilla' => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',

    # 'Mac Safari' => 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85 (KHTML, like Gecko) Safari/85',

    # 'Mac Mozilla' => 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4a) Gecko/20030401',

    # 'Linux Mozilla' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4)

    # 'Linux Konqueror' => 'Mozilla/5.0 (compatible; Konqueror/3; Linux)',

      my %attributes = $patent_document->get_patent('all');  # hash of all

      my $document_id =
            # US6,654,321(B2)issued_2_Okada

      my $office_used = $patent_document->get_patent('office'); # ep

      my $country_used = $patent_document->get_patent('country'); #US

      my $doc_id_used = $patent_document->get_patent('doc_id');  # 6654321

      my $page_used = $patent_document->get_patent('page');  # 2

      my $kind_used = $patent_document->get_patent('kind');  # B2

      my $comment_used = $patent_document->get_patent('comment');  #

      my $format_used = $patent_document->get_patent('format'); #tif

      my $pages_total = $patent_document->get_patent('pages_available');   #

      my $terms_and_conditions = $patent_document->terms('us'); # and conditions

      my $document = $patent_document->get_patent('document'); # the loot
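
The identifier forms shown above ('6 123 456' and 'US6,654,321(B2)issued_2_Okada') suggest splitting a document ID into country, number, kind, and comment. Here is a rough sketch of one way that could be done. parse_doc_id and its regular expression are hypothetical illustrations, not the module's actual code, and the fallback country of 'US' is simply the default listed in the synopsis.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Patent::Page.
# Accepts e.g. 'US6,654,321(B2)issued_2_Okada' or '6 123 456'.
sub parse_doc_id {
    my ($raw) = @_;
    my ($country, $number, $kind, $comment) =
        $raw =~ /^([A-Z]{2})?([\d, ]+)(?:\(([A-Z]\d?)\))?(.*)$/
        or return;                       # no digits found: give up
    $number =~ tr/, //d;                 # drop commas and spaces
    return {
        country => defined $country ? $country : 'US',
        doc_id  => $number,
        kind    => $kind,                # undef when no '(B2)'-style suffix
        comment => $comment,
    };
}

my $parsed = parse_doc_id('US6,654,321(B2)issued_2_Okada');
for my $field (qw(country doc_id kind comment)) {
    my $value = defined $parsed->{$field} ? $parsed->{$field} : '';
    print "$field: $value\n";
}
```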

    Pre-alpha release, to gauge whether the Perl community has any interest.

    Code contributions, suggestions, and critiques are

    Error handling is undeveloped.

    By definition, a non-trivial program contains

    For United States Patents (US) via the USPTO (us), the 'kind' is ignored
    in method provide_doc

    Yes, please. Checks are best. Or email me at to
    arrange fund transfers.

            Wanda B. Anon
    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

    The full text of the license can be found in the LICENSE file included
    with this module.

    Andy Lester for WWW::Mechanize, that got me

    The authors of Finance::Quote, which served as an example of providing

    Erik Oliver for patentmailer, serving as an example of getting patent

    Howard P. Katseff of AT&T Laboratories for, version 2, a proxy
    that speaks LWP and understands proxies,

    and of course Larry and Randal and the gang.


  Subroutine _countries_known()
     Usage     : internal method only
     Purpose   : list all entities that could give a
     Returns   : ref to a hash with keys of abbreviations and values of
                 entities (usually a country)  ...
