Text Mining and Analytics: Concepts, Techniques, and Applications
Chapter 7: Text Mining and Analytics
Understanding Text Mining and Analytics
IBM’s Research Challenge
IBM Research set out to explore new ways for computer technology to affect science, business, and society, with the goal of advancing computer science while serving IBM’s business interests.
Defining Text Analytics
Text analytics encompasses a broad range of techniques, including information retrieval, information extraction, data mining, and Web mining, to extract meaningful insights from textual data.
Information Extraction in Text Mining
Information extraction involves identifying key phrases and relationships within text by searching for predefined patterns and sequences.
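The pattern-based search described above can be sketched with regular expressions. This is a minimal illustration, not how any particular tool works: the two patterns, the sample sentence, and the function name extract_facts are all assumptions made for this example.

```python
import re

# Hypothetical patterns for illustration; production systems use far richer rules.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
ACQUISITION_PATTERN = re.compile(
    r"(?P<buyer>[A-Z]\w+) acquired (?P<target>[A-Z]\w+)"
)


def extract_facts(text):
    """Return e-mail addresses and (buyer, target) pairs found in text."""
    emails = EMAIL_PATTERN.findall(text)
    acquisitions = [
        (match.group("buyer"), match.group("target"))
        for match in ACQUISITION_PATTERN.finditer(text)
    ]
    return {"emails": emails, "acquisitions": acquisitions}


# Made-up sample sentence containing one key phrase and one relationship.
sample = "Acme acquired Globex in 2009. Contact press@acme.example.com for details."
facts = extract_facts(sample)
print(facts)
```

The e-mail pattern finds a key phrase, while the "acquired" pattern captures a relationship between two entities. Both rely on surface sequences only, so they would miss paraphrases such as "Globex was bought by Acme".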
Natural Language Processing (NLP)
NLP, a subfield of artificial intelligence and computational linguistics, focuses on understanding and processing human language, converting it into computer-readable formats.
ECHELON Surveillance System
The ECHELON system is believed to be capable of intercepting and analyzing various forms of communication, including telephone calls, faxes, emails, and satellite transmissions.
Clustering Techniques
Query-Specific Clustering
This hierarchical clustering method organizes documents based on their relevance to a specific query, with the most relevant documents appearing in tightly knit clusters.
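One simple way to approximate this idea is to cluster documents hierarchically and then rank each cluster by how similar its members are to the query. The sketch below assumes bag-of-words vectors, cosine similarity, and average-linkage merging. These choices, the sample documents, and all function names are illustrative assumptions rather than a prescribed algorithm.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def linkage(c1, c2, vectors):
    """Average pairwise similarity between the members of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


def agglomerate(vectors):
    """Merge the two most similar clusters until one remains.

    Returns the merge history as (member_indices, similarity) tuples,
    so early entries are the tightest clusters.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = linkage(clusters[i], clusters[j], vectors)
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        sim, i, j = best
        merged = clusters[i] + clusters[j]
        merges.append((merged, sim))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


def query_relevance(cluster, vectors, query_vector):
    """Mean similarity of a cluster's documents to the query."""
    return sum(cosine(vectors[i], query_vector) for i in cluster) / len(cluster)


# Made-up documents and query for illustration.
docs = [
    "text mining extracts patterns from text",
    "text mining finds patterns in documents",
    "speech analytics studies audio recordings",
    "sentiment analysis studies opinions",
]
query = "text mining patterns"

vectors = [vectorize(d) for d in docs]
query_vector = vectorize(query)
merges = agglomerate(vectors)
top_cluster = max(merges, key=lambda m: query_relevance(m[0], vectors, query_vector))[0]
print("Most query-relevant cluster:", top_cluster)
```

In this toy data the two documents about text mining merge first and also score highest against the query, which mirrors the claim that the most relevant documents end up in a tight cluster.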
Popular Text Mining Software Tools
- ClearForest: Text analysis and visualization tools
- IBM SPSS Modeler: Data and text analytics toolkits
- Megaputer Text Analyst: Semantic analysis, summarization, clustering, and retrieval
- SAS Text Miner: Comprehensive text processing and analysis tools
- KXEN Text Coder: Text analytics solution for structured representation
- Statistica Text Mining: User-friendly text mining with visualization capabilities
- VantagePoint: Interactive graphical views and analysis tools
- WordStat: Analysis of textual information from open-ended questions and interviews
- Clarabridge: End-to-end solutions for customer experience management
Sentiment Analysis
Alternative Names
Sentiment analysis is also known as opinion mining, subjectivity analysis, and appraisal extraction.
Sentiment Analysis Process
- Sentiment Detection: Distinguishing facts from opinions
- N-P (Negative-Positive) Polarity Classification: Classifying opinions as positive, negative, or neutral
- Target Identification: Identifying the subject of the expressed sentiment
- Collection and Aggregation: Combining sentiment data points into a single measure
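The four steps above can be sketched as a small lexicon-based pipeline. This is only an illustration under strong assumptions: the word lexicon, the list of targets, the sample reviews, and the function names are all hypothetical, and real systems typically use trained models rather than word lookups.

```python
import re
from collections import defaultdict

# Tiny hypothetical lexicon and target list, for illustration only.
LEXICON = {"great": 1, "love": 1, "excellent": 1,
           "poor": -1, "terrible": -1, "slow": -1}
TARGETS = {"battery", "screen", "service"}


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def analyze(sentences):
    """Return the mean polarity per target across opinionated sentences."""
    scores = defaultdict(list)
    for sentence in sentences:
        tokens = tokenize(sentence)

        # Step 1, sentiment detection: skip sentences with no opinion words.
        opinion_words = [t for t in tokens if t in LEXICON]
        if not opinion_words:
            continue

        # Step 2, N-P polarity classification: sign of the summed word scores
        # (+1 positive, -1 negative, 0 neutral).
        total = sum(LEXICON[t] for t in opinion_words)
        polarity = (total > 0) - (total < 0)

        # Step 3, target identification: first known target in the sentence.
        target = next((t for t in tokens if t in TARGETS), "unknown")
        scores[target].append(polarity)

    # Step 4, collection and aggregation: one measure per target.
    return {target: sum(values) / len(values) for target, values in scores.items()}


# Made-up reviews; the fourth is a factual statement with no opinion words.
reviews = [
    "The battery is great.",
    "The screen is terrible and slow.",
    "I love the battery life.",
    "The phone was released in March.",
    "Service was poor.",
]
summary = analyze(reviews)
print(summary)
```

Averaging polarities per target is one of several possible aggregation choices. Weighted sums or counts of positive versus negative mentions would fit the same final step.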
Speech Analytics: Linguistic Approach
The linguistic approach in speech analytics focuses on explicit sentiment indicators and the context of spoken content within audio data.
