Re: Any advice for a searchable web archiver ?

From: Eric Wong
Date: November 19, 2017 21:35
Subject: Re: Any advice for a searchable web archiver ?
Message ID: 20171119212836.GA26895@starla

Marc Chantreux <marc.chantreux@renater.fr> wrote:
> Hello,
> 
> As the Sympa community (http://www.sympa.org) has recently grown, we are
> thinking about revamping the whole UI and we would like to have
> a new web archiver based on:
> 
> * no default frontend but exposing the API through REST, websockets or
>   whatever.
> * maximizing the interactions between Sympa and CPAN
> * trying to avoid other dynamic language or JVM dependencies
>   (or considering them as temporary solutions)
> * being JMAP friendly (we bet on it to become a very healthy community)
> 
> My first idea was to use notmuch, PEP modules and Dancer on top of
> maildirs, then I discovered Dezi (inactive since 2015) and its use of
> Lucy (also used by the very active librecat project).
> 
> I know Dezi is a general search engine, but I hope that adding good
> email support to it would be better than reinventing the wheel.

public-inbox is Perl, uses Email::MIME, and (optionally) uses
Xapian like notmuch.  The Perl bits around search indexing are
ported from what I understood of the C++ code in notmuch.

The web part is PSGI and I consider the URL format a stable API:

	https://public-inbox.org/design_www.txt

I will probably add JSON support to it for external web services;
haven't looked into JMAP, yet...

There's also a standalone NNTP server based on Danga::Socket.

You can find an example of it for the git mailing list
<git@vger.kernel.org>:  https://public-inbox.org/git/

> Those are a lot of things to look into if I want to form a clear opinion
> on a good strategy. Any advice would be warmly welcome.

It's probably not a perfect match for you guys, but it's all AGPL-3+.
The whole thing (code AND data) is designed to be completely
replicable and forkable using git, so anybody can clone any instance
in its entirety.
