TexeK is a statistical text classification technology developed by
EnSoft.
TexeK technology uses advanced statistical analysis methods to build
'klassifiers' of textual content in a very simple way.
TexeK does not use linguistic knowledge, so it is a language- and
domain-independent classifier technology.
TexeK klassifiers work by detecting statistically relevant characteristics
of the texts that are presented as 'samples' of a given
class. To build a klassifier, all you have to do is declare a set of
target classes and present a set of texts as examples of each class;
the system will analyse those texts and compute their statistically
relevant properties. The system is then ready to 'klassify' (compute the
most probable class of) any incoming textual information.
The word 'texts' in the above paragraph refers to any textual unit: it
could be a document, a web page, a .pdf, or a file in any of the
multiple supported text formats.
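
The declare-classes, present-samples, then klassify workflow described
above can be illustrated with a minimal sketch. It is not TexeK's actual
algorithm or interface, which are not documented here: it assumes a
multinomial naive Bayes model over character trigrams as a stand-in
statistical method that needs no linguistic knowledge. The names
NaiveBayesSketch, learn, classify and char_ngrams are hypothetical, and
the sample texts are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (no linguistic knowledge needed)."""
    text = " ".join(text.lower().split())  # normalise case and whitespace
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NaiveBayesSketch:
    """Hypothetical stand-in for a klassifier; NOT TexeK's actual algorithm."""

    def __init__(self, n=3):
        self.n = n
        self.ngram_counts = defaultdict(Counter)  # class -> n-gram frequencies
        self.sample_counts = Counter()            # class -> number of samples seen
        self.vocabulary = set()                   # every n-gram seen in training

    def learn(self, label, text):
        """Present one text as a sample of the class `label`."""
        grams = char_ngrams(text, self.n)
        self.ngram_counts[label].update(grams)
        self.sample_counts[label] += 1
        self.vocabulary.update(grams)

    def classify(self, text):
        """Return the most probable class for `text` (None if nothing was learned)."""
        grams = char_ngrams(text, self.n)
        total_samples = sum(self.sample_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, counts in self.ngram_counts.items():
            # Log prior plus summed log likelihoods, with add-one smoothing
            score = math.log(self.sample_counts[label] / total_samples)
            class_total = sum(counts.values())
            for gram in grams:
                score += math.log((counts[gram] + 1) / (class_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Illustrative training samples (invented for this sketch)
klassifier = NaiveBayesSketch()
klassifier.learn("spam", "Win a free prize now, click here to claim your money")
klassifier.learn("spam", "Cheap pills, free offer, click now")
klassifier.learn("ham", "The meeting is moved to Tuesday at ten")
klassifier.learn("ham", "Please review the attached report before the meeting")

result = klassifier.classify("Click here for your free prize")
print(result)
```

Character n-grams are used here because they work on any language or
domain without tokenisation rules, which matches the language-independent
spirit of the description; a production system would need far more
samples per class than this toy example.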
TexeK is also a web-enabled technology: its klassifiers can not only learn
from web documents but also be queried through standard web interfaces.
Being a language-agnostic technology, it can be used for many
tasks, for example:
- Spam detection.
- Topic or language detection.
- Mail or news classification, etc.
Client-server setups can be deployed with load-balancing capabilities
for systems with a high demand level. TexeK klassifiers
are also able to hide any textual information they manage, to ensure maximal
privacy and security, even in open environments such as web-based
setups.
TexeK is being developed on Unix/Linux platforms but, being a web-enabled
technology, it can be queried from any other web-enabled system.
Legal notice:
TexeK technology is © (2005-2009) Joan Vilaseca Corbera.