perl.cpan.testers.discuss | Postings from October 2019

Re: Data Retention Policies

From: James E Keenan
Date: October 28, 2019 11:15
Subject: Re: Data Retention Policies
Message ID: 20191028111455.8428.qmail@lists-nntp.develooper.com

On 10/26/19 9:49 PM, Doug Bell wrote:
> Okay, the general vibe I got from the responses is:
> 
> * Report metadata isn't useful enough without the report text
> * Data for old distributions matters for the people maintaining systems 
> that use them
> 
> I think the best path forward might be to only do these things:
> 
> 1. Archive the full report text for reports older than 5 years
> 2. Keep the full report text in the database for only 5 years
> 3. Keep metadata and statistics in the database forever
> 
> Then we can build a site that can read those old reports from the 
> archive files. Since the most common use-case for a visitor (and correct 
> me if I'm wrong) is to go look up the reports for a specific 
> distribution on a specific Perl/platform, no functionality is lost. With 
> development we can even make some filtering / searching of the archived 
> reports possible.
> 
> At the moment, keeping metadata forever should not be a huge issue: If I 
> fix the metadata to remove some duplicate data and normalize it a bit 
> better, I can even make it smaller.
> 
> The full data retention policy then becomes the following decision tree:
> 
> * Reports (full report data)
>   * Reports submitted >5 years ago
>     * Release on CPAN
>       - Report archived
>     * Release not on CPAN
>       - Report archived
>   * Reports submitted <5 years ago
>     * Release on CPAN
>       + Report available
>     * Release not on CPAN
>       + Report available
> * Metadata (release, Perl version, Perl architecture, OS name/version,
>   test reporter, date/time, pass/fail status)
>   * Reports submitted >5 years ago
>     * Release on CPAN
>       + Metadata available
>     * Release not on CPAN
>       + Metadata available
>   * Reports submitted <5 years ago
>     * Release on CPAN
>       + Metadata available
>     * Release not on CPAN
>       + Metadata available
> * Statistics (release, pass/fail count)
>   * Reports submitted >5 years ago
>     * Release on CPAN
>       + Statistics available
>     * Release not on CPAN
>       + Statistics available
>   * Reports submitted <5 years ago
>     * Release on CPAN
>       + Statistics available
>     * Release not on CPAN
>       + Statistics available
> 
> I'll start planning out the scripts needed to achieve this, and when I'm 
> ready to do something, I'll make an announcement and give some time for 
> additional comments.
> 
> Thanks,
> 
> 
> Doug Bell
> doug@preaction.me
> 
> 

Doug, thanks for thinking this through.  My own opinion:  you could 
s/5/3/g above and still meet 99% of our needs.
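
To make the comparison concrete, here is a rough sketch of the quoted decision tree with the cutoff as a parameter, so the 5-year and 3-year variants can be tried side by side. It is only an illustration, not anything from the CPAN Testers codebase: `Retention`, `retention_for`, and `cutoff_years` are hypothetical names, a year is approximated as 365 days, and treating a report exactly at the cutoff as still available is an assumption the tree does not settle.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Retention:
    """Where each kind of data lives for one report (hypothetical model)."""
    report_text: str  # "available" (in the database) or "archived" (archive files)
    metadata: str     # always "available" in the quoted tree
    statistics: str   # always "available" in the quoted tree


def retention_for(submitted: datetime, now: datetime,
                  on_cpan: bool = True, cutoff_years: int = 5) -> Retention:
    """Apply the quoted decision tree to a single report.

    `on_cpan` is accepted only to mirror the tree: every branch gives the
    same outcome whether or not the release is still on CPAN.
    """
    # Assumption: a year is approximated as 365 days.
    cutoff = now - timedelta(days=365 * cutoff_years)
    # Only the full report text moves; older reports go to the archive.
    text = "archived" if submitted < cutoff else "available"
    return Retention(report_text=text,
                     metadata="available",
                     statistics="available")


# Compare the 5-year proposal with the 3-year suggestion for one report.
now = datetime(2019, 10, 28)
report_date = datetime(2016, 1, 1)
print(retention_for(report_date, now, cutoff_years=5).report_text)
print(retention_for(report_date, now, cutoff_years=3).report_text)
```

Only the report text depends on the cutoff; metadata and statistics stay available in every branch, which is why changing 5 to 3 affects nothing but which reports are read from the archive files.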

Thanks.
jimk
