TexeK

Textual Klassifiers.



Language Detection

In this section you can query a TexeK Klassifier designed to work as a language detector.
The Klassifier was built by simply declaring four classes, one for each target language.
Each class was then given three example texts extracted from the Project Gutenberg catalog (with the PG English preface templates stripped). The whole process took less than five minutes to deploy. No special criteria were used to select those texts, other than ensuring that none of them mixed languages.

To test the TexeK Klassifier, enter a short phrase in any of the four languages and press 'submit'. The Klassifier will try to detect the language of the phrase.
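
The "declare classes, then feed examples" workflow described above can be illustrated with a small sketch. This is not TexeK's implementation, which this page does not describe. It assumes a naive Bayes model over character trigrams as a stand-in technique, and the names NgramClassifier, add_example and classify are hypothetical. The two labels and the toy sentences are placeholders as well; the real Klassifier uses four languages and full Project Gutenberg texts.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a lowercased, space-padded text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramClassifier:
    """Hypothetical stand-in: multinomial naive Bayes over character n-grams."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # class label -> n-gram counts
        self.vocab = set()                  # every n-gram seen in training

    def add_example(self, label, text):
        """Declare a class implicitly by adding an example text to it."""
        grams = char_ngrams(text, self.n)
        self.counts[label].update(grams)
        self.vocab.update(grams)

    def classify(self, text):
        """Return the label with the highest log-likelihood (uniform priors)."""
        grams = char_ngrams(text, self.n)
        best_label, best_score = None, -math.inf
        for label, counter in self.counts.items():
            total = sum(counter.values())
            # Add-one smoothing so unseen n-grams do not zero out a class.
            score = sum(
                math.log((counter[g] + 1) / (total + len(self.vocab)))
                for g in grams
            )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (placeholders, not the texts used by the demo).
clf = NgramClassifier()
clf.add_example(
    "english",
    "the quick brown fox jumps over the lazy dog and the cat sleeps in the house",
)
clf.add_example(
    "spanish",
    "el perro corre por la calle y el gato duerme en la casa de la abuela",
)

print(clf.classify("the dog sleeps in the house"))
```

Character n-grams are a common choice for short phrases because they pick up spelling patterns even when few whole words from the training texts appear in the query.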


News Detection

In this section you can query a TexeK Klassifier designed to work as a news type detector.
This Klassifier was built by declaring five classes, one for each usual news type.
Training used 100 English news web pages for each section, taken from Google News between 6 and 15 June 2005.

The only selection constraint was to avoid web redirects and subscription-based news pages.

To test the TexeK Klassifier, enter the URL of an English news page and press 'submit'. The Klassifier will download the page and try to detect what kind of news it covers.
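
The download step can be sketched as follows. How TexeK actually fetches and cleans a page is not stated here, so this is only one plausible approach using the Python standard library: fetch the URL, drop script and style contents, and keep the visible text that would then be handed to a classifier. The names TextExtractor, html_to_text and fetch_page_text are hypothetical.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML document with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def fetch_page_text(url, timeout=10):
    """Download a page and return its visible text (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return html_to_text(html)
```

Note that urlopen follows HTTP redirects by default, so a caller that wants to mirror the "no redirects" rule used for the training pages would have to compare the final URL with the requested one.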