
T-LAB Tools for Text Analysis

T-LAB is available for Windows platforms (Windows 7, 8, 10).

Mac users interested in using T-LAB must have Microsoft Windows running on their computer.

During the pre-processing phase, T-LAB carries out the following treatments: corpus normalization, multi-word and stop-word detection, segmentation into elementary contexts (i.e. sentences or paragraphs), automatic lemmatization or stemming (see the table below), and key-term selection.
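The pre-processing steps listed above can be illustrated with a minimal Python sketch. This is not T-LAB's implementation or API: the stop-word list, the suffix-stripping rule and every function name are assumptions made for illustration, and the toy stemmer stands in for the real lemmatization and stemming resources.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; not T-LAB's actual list.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}

def normalize(text):
    """Corpus normalization: lowercase and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def split_sentences(text):
    """Naive segmentation into elementary contexts (here: sentences)."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p for p in parts if p]

def tokenize(sentence):
    """Extract alphabetic tokens and drop stop words."""
    words = re.findall(r"[^\W\d_]+", sentence)
    return [w for w in words if w not in STOP_WORDS]

def naive_stem(word):
    """Toy suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text, top_n=5):
    """Return the stemmed contexts and the top_n most frequent terms."""
    contexts = [
        [naive_stem(w) for w in tokenize(s)]
        for s in split_sentences(normalize(text))
    ]
    counts = Counter(w for context in contexts for w in context)
    key_terms = [w for w, _ in counts.most_common(top_n)]
    return contexts, key_terms
```

Calling `preprocess(text, top_n=2)` on a short text returns one token list per sentence plus the two most frequent stems. Key-term selection by raw frequency is the simplest possible choice here; the source does not say which criterion T-LAB uses.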

Subsequently T-LAB allows the integrated use of three kinds of tools for text analysis:

 


CO-OCCURRENCE ANALYSIS

Word Associations 
Comparison between Word pairs 
Co-Word Analysis and Concept Mapping 
Sequence Analysis 
Concordances 

THEMATIC ANALYSIS

Thematic Document Classification 
Thematic Analysis of Elementary Contexts 
Dictionary-Based Classification 
Modeling of Emerging Themes 
Key Contexts of Thematic Words 

COMPARATIVE ANALYSIS

Specificity Analysis 
Correspondence Analysis 
Multiple Correspondence Analysis 
Cluster Analysis 
Contingency Tables 

 

The table below summarizes the main characteristics of the software:

 

T-LAB Plus 2018 

Input format: Text in all languages, including those using ideograms (i.e. files in UTF-8 format)

Maximum size of a corpus: 90 MB

Document formats which can be processed: .txt, .doc, .docx, .pdf, .rtf, .html, .xls, .xlsx, .csv, .mdb, .accdb (N.B. Image-only PDF files must be processed using OCR software first)

Languages for which lemmatization or stemming is supported:
LEMMATIZATION: Catalan, Croatian, English, French, German, Italian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Spanish, Swedish, Ukrainian;
STEMMING: Arabic, Bengali, Bulgarian, Czech, Danish, Dutch, Finnish, Greek, Hindi, Hungarian, Indonesian, Marathi, Norwegian, Persian, Turkish.

All T-LAB functions allow charts and tables to be saved. Texts and documents may be analyzed and compared by means of user-defined variables. Currently, the number of available categorical variables is fixed at 50, and each variable can subdivide the corpus into as many as 150 subsets for comparison.
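As a rough illustration of these limits, the sketch below checks a set of per-document variable assignments against the two figures quoted above (50 variables, 150 subsets per variable). The data layout, the constant names and the `check_variables` function are hypothetical and are not part of T-LAB.

```python
# Limits quoted in the text; the names are assumptions for this sketch.
MAX_VARIABLES = 50
MAX_SUBSETS_PER_VARIABLE = 150

def check_variables(documents):
    """Return a list of problems found in the variable assignments.

    `documents` is a list of dicts mapping variable name -> category value,
    one dict per document in the corpus.
    """
    # Collect the distinct category values seen for each variable.
    values = {}
    for doc in documents:
        for name, value in doc.items():
            values.setdefault(name, set()).add(value)

    problems = []
    if len(values) > MAX_VARIABLES:
        problems.append(
            f"{len(values)} variables exceed the limit of {MAX_VARIABLES}"
        )
    for name, categories in values.items():
        if len(categories) > MAX_SUBSETS_PER_VARIABLE:
            problems.append(
                f"variable '{name}' has {len(categories)} subsets "
                f"(limit {MAX_SUBSETS_PER_VARIABLE})"
            )
    return problems
```

An empty result means the assignments fit within both limits; a check like this could be run on corpus metadata before import.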

The user interface, the contextual help and the manual are available in four languages: English, French, Spanish and Italian.

 

 

北京哲想软件有限公司