Abstract: |
Natural Language Processing (NLP) technologies are being used to analyze huge amounts of data. With the advent of new media and the mass adoption of social networking, the flow of information generated every second is the largest in history. Most of it consists of multimedia files; even so, a large portion of the information produced, especially on social networks, is textual. NLP solutions therefore need to be more robust than ever, with processing approaches that can keep pace with this constant production of information or, at least, provide better results than the procedures used previously. Taggers, or labelers, are a major component of NLP. Their function, explored in this work, is to observe and catalog the words in a text according to their morphosyntactic functions. This process is commonly called Part-Of-Speech Tagging (POST). In this context, Part-Of-Speech (POS) refers to identifying the words of a text and grouping them into predefined types. This grouping can be based on syntactic, morphological, or morphosyntactic criteria. Although processing speed is a valuable feature, accuracy should be the primary requirement for a tagger. The idea of obtaining semantic labels from text seems simple at first sight, but it presents several challenges. One of the major challenges in NLP is ambiguity. This problem, which occurs at several stages of natural language processing, is complex because it requires the processing application to have comprehensive knowledge and to use it as a tool for making the most correct choices. It is a classic problem, inherent to natural language, and has existed since the beginning of research in this area. Several proposals to minimize its consequences have been made since then. 
This paper lists some of the proposals found in the literature and adds the possibility of using MTMDD structures during the process, in search of a substantial performance gain. |
---|