TexeK

Textual Klassifiers.




TexeK is a statistical text classification technology developed by EnSoft.

TexeK technology uses advanced statistical analysis methods to build 'klassifiers' of textual content in a very simple way.

TexeK does not rely on linguistic knowledge, so it is a language- and domain-independent classifier technology.

TexeK klassifiers work by detecting statistically relevant characteristics of the texts that are presented as 'samples' of a given class. To build a klassifier, all you have to do is declare a set of target classes and present a set of texts as examples of each class; the system will analyse those texts and compute their statistically relevant properties. The system is then ready to 'klassify' (compute the most probable class of) any incoming textual information.

The word 'texts' in the above paragraph refers to any textual unit: it could be a document, a web page, a .pdf, or a file in any of the multiple supported text formats.
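The workflow described above (declare classes, present sample texts, then classify new text) can be illustrated with a minimal sketch. TexeK's actual statistical method is not described here, so this example swaps in a multinomial Naive Bayes model over character trigrams, a common language-independent technique. The names `NaiveBayesSketch`, `learn`, `classify` and `char_ngrams` are hypothetical and are not part of TexeK.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (no linguistic knowledge used)."""
    text = " ".join(text.lower().split())  # normalise case and whitespace
    return [text[i:i + n] for i in range(max(len(text) - n + 1, 0))]


class NaiveBayesSketch:
    """Hypothetical stand-in for a 'klassifier'; NOT TexeK's actual algorithm."""

    def __init__(self, classes):
        self.classes = list(classes)                        # declared target classes
        self.counts = {c: Counter() for c in self.classes}  # n-gram counts per class
        self.samples = Counter()                            # sample texts seen per class
        self.vocab = set()                                  # all n-grams seen so far

    def learn(self, label, text):
        """Present one sample text as an example of class `label`."""
        grams = char_ngrams(text)
        self.counts[label].update(grams)
        self.vocab.update(grams)
        self.samples[label] += 1

    def classify(self, text):
        """Return the most probable class for `text`, or None if nothing was learned."""
        total_samples = sum(self.samples.values())
        grams = char_ngrams(text)
        best, best_score = None, -math.inf
        for c in self.classes:
            if self.samples[c] == 0:
                continue  # no examples were presented for this class
            # log prior + sum of log likelihoods with add-one smoothing
            score = math.log(self.samples[c] / total_samples)
            denom = sum(self.counts[c].values()) + len(self.vocab)
            for g in grams:
                score += math.log((self.counts[c][g] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


# Usage: declare the classes, present samples, then classify incoming text.
klassifier = NaiveBayesSketch(["english", "spanish"])
klassifier.learn("english", "the cat sat on the mat")
klassifier.learn("english", "the weather is nice today and the sun is shining")
klassifier.learn("spanish", "el gato está en la casa")
klassifier.learn("spanish", "hoy hace buen tiempo y el sol brilla")
print(klassifier.classify("the dog is on the mat"))
```

Because the features are raw character sequences rather than words or grammar, the same sketch works unchanged for any language or domain for which sample texts are available.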

TexeK is also a web-enabled technology: its klassifiers are able not only to learn from web documents, but also to be queried through standard web interfaces.

Being a language-agnostic technology, it can be used in many tasks, for example:

Client-server setups can be deployed with load-balancing capabilities for systems with a high demand level. TexeK klassifiers are also able to hide any textual information they manage, to ensure maximal privacy and security, even in open environments such as web-based setups.

TexeK is being developed on Unix/Linux platforms but, being a web-enabled technology, it can be queried from any other web-enabled system.


Legal notice:
TexeK technology is © (2005-2009) Joan Vilaseca Corbera.