Results 1 - 10 of 86
Overview of TweetLID: Tweet Language Identification at SEPLN 2014 Introducción a TweetLID: Tarea Compartida sobre Identificación de Idioma de Tuits en SEPLN 2014
"... Abstract: This article presents an overview of the TweetLID shared task and workshop, organized alongside SEPLN 2014. It briefly summarizes the data collection and annotation process, the development and evaluation of the shared task, and finally the results obtained by the participants. ..."
Cited by 1 (0 self)
TweetSafa: Tweet language identification TweetSafa: Identificación del lenguaje de tweets
"... Abstract: This paper describes the methodology used for the SEPLN 14 shared task of tweet language identification (TweetLID), as explained in (Iñaki San Vicente, 2014). The system consists of tweet preprocessing, the creation of dictionaries from N-grams, and ..."
two language recognition algorithms. Keywords: language recognition, tweet language. Abstract: This paper describes the methodology used for the SEPLN 14 shared task of tweet language identification (TweetLID), as explained in (Iñaki San Vicente, 2014). The system consists
Tweets language identification using feature weighting Identificación de idioma en tweets mediante pesado de términos
"... Abstract: This work describes a language detection method presented at the Twitter Language Identification Workshop (TweetLID-2014). The proposed method represents tweets as character trigrams weighted according to their relevance to each language. For the weighting of the ..."
Cited by 1 (0 self)
Finally, we analyze the results obtained. Keywords: tweets, language identification, feature weighting. Abstract: This paper describes the language identification method presented at the Twitter Language Identification Workshop (TweetLID-2014). The proposed method represents tweets
Language Identification for Creating Language-Specific Twitter Collections
"... Social media services such as Twitter offer an immense volume of real-world linguistic data. We explore the use of Twitter to obtain authentic user-generated text in low-resource languages such as Nepali, Urdu, and Ukrainian. Automatic language identification (LID) can be used to extract language-sp ..."
Cited by 22 (5 self)
-specific data from Twitter, but it is unclear how well LID performs on short, informal texts in low-resource languages. We address this question by annotating and releasing a large collection of tweets in nine languages, focusing on confusable languages using the Cyrillic, Arabic, and Devanagari scripts
Language variety identification in Spanish tweets
"... We study the problem of language variant identification, approximated by the problem of labeling tweets from Spanish speaking countries by the country from which they were posted. While this task is closely related to “pure” language identification, it comes with additional complications. We bui ..."
Cited by 1 (0 self)
Language Identification of Tweets Using LZW Compression
"... Language identification plays a major role in natural language processing applications and is commonly used as a pre-processing phase. Typical techniques employ n-gram models and require text parsing and cleanup such as punctuation removal, white-space normalization, etc. This paper investigate ..."
Nirmal: Automatic identification of software relevant tweets leveraging language model
- in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on. IEEE, 2015
"... Abstract—Twitter is one of the most widely used social media platforms today. It enables users to share and view short 140-character messages called “tweets”. About 284 million active users generate close to 500 million tweets per day. Such rapid generation of user generated content in large magnitu ..."
Cited by 1 (1 self)
with noise, we propose a novel approach named NIRMAL, which automatically identifies software relevant tweets from a collection or stream of tweets. Our approach is based on language modeling which learns a statistical model based on a training corpus (i.e., set of documents). We make use of a subset
Detecting the Gender of a Tweet Sender
"... ABSTRACT Social media has been in existence for over two decades and, increasingly, people are using it to communicate, connect, share content, and socialize across the globe. Given the huge amount of data generated by the great popularity of social media sites, opportunities have emerged for resea ..."
and Google+), but such information could be useful for targeting a specific audience for advertising, for personalizing content, and for legal investigation. These procedures, known as authorship identification, provide veritable information about the tweet author, for example, the gender of the author
Goal-Directed Elaboration of Requirements for a Meeting Scheduler
, 1995
"... Recently a number of requirements engineering languages and methods have flourished that do not only address what questions but also why, who and when questions. The objective of this paper is twofold: (i) to assess the strengths and weaknesses of one of these methodologies on a non-trivial benchmar ..."
Cited by 139 (10 self)
Analysis of Named Entity Recognition and Linking for Tweets
"... Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-depen ..."
-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new