Re: Regex to detect natural language fragment

From: Chankey Pathak
Date: September 13, 2021 15:39
Subject: Re: Regex to detect natural language fragment
Message ID: CA+PMQkEWjn81tAZsok7QO8niB8C1sQN32AsBpfB7uLWSe=zvfw@mail.gmail.com

You can look into the NLP (natural language processing) modules on CPAN:
https://metacpan.org/search?q=nlp

On Mon, 13 Sept 2021 at 21:04, Julius Hamilton <juliushamilton100@gmail.com>
wrote:

> Hey,
>
> I'm not sure if this is possible, and if it's not, I'll explore a better
> way to do this.
>
> I would like to write a script which analyzes if a line of text is
> (likely) a broken natural language sentence, i.e., it is probably part of a
> sentence, even if the start or end is not present, rather than it being a
> fully "complete" linguistic entity, for example, a header of a section,
> which does not have a period at the end and is not really a sentence, yet
> is in a complete and unbroken form.
>
> I'm pretty sure that, in principle, this will require some kind of syntax
> parsing. I think I read somewhere that regular expressions, for some
> mathematical reason, cannot parse tree-like / nested structures such as HTML.
>
> Does anyone know what the most ubiquitous, standard tool is for
> analyzing nested linguistic structures? Is that an XML parser?
>
> Thanks very much,
> Julius
>
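
The distinction described in the quoted message (complete sentence vs. heading vs. broken fragment) can be roughly approximated with surface cues before reaching for a real parser. The following is only a minimal sketch in Python, not a linguistic analysis: the function name `classify`, the word list in `DANGLING_END`, and the `MAX_HEADING_WORDS` threshold are all assumptions made for illustration, and the rules are English-only. The regular expressions use only basic features, so they should carry over to Perl with little change.

```python
import re

# Assumption: a line of at most this many words with no final punctuation
# is treated as a heading rather than a cut-off sentence.
MAX_HEADING_WORDS = 8

# Starts like a sentence: optional opening quote/bracket, then a capital or digit.
STARTS_SENTENCE = re.compile(r'^["(\[]*[A-Z0-9]')
# Ends like a sentence: terminal punctuation, optional closing quote/bracket.
ENDS_SENTENCE = re.compile(r'[.!?][")\]]*$')
# Ends on something that normally continues: a comma, colon, hyphen,
# or a short function word (hypothetical, deliberately incomplete list).
DANGLING_END = re.compile(
    r'(?:[,;:\-]|\b(?:and|or|but|the|a|an|of|to|in|that|which|with|for))$',
    re.IGNORECASE,
)


def classify(line):
    """Return 'empty', 'sentence', 'heading', or 'fragment' for one line."""
    text = line.strip()
    if not text:
        return "empty"
    starts = bool(STARTS_SENTENCE.search(text))
    ends = bool(ENDS_SENTENCE.search(text))
    if starts and ends:
        return "sentence"
    # A lowercase start or a dangling ending suggests the line was cut.
    if not starts or DANGLING_END.search(text):
        return "fragment"
    # Clean start, no terminal punctuation: short lines look like headings.
    if len(text.split()) <= MAX_HEADING_WORDS:
        return "heading"
    return "fragment"


if __name__ == "__main__":
    for sample in (
        "Regex to detect natural language fragment",
        "sentence, even if the start or end is not present, rather than it being a",
        "This is a complete sentence.",
    ):
        print(classify(sample), "|", sample)
```

Heuristics like these will misclassify plenty of real text (abbreviations, lowercase headings, long titles), so they are best treated as a first filter ahead of a proper NLP toolkit.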
