UMKC University Libraries

Introduction to Text Data Mining

This is a beginner's guide to the principles and concepts of text data mining (TDM). TDM is the computational and statistical analysis of large corpora of texts. In this guide you'll find brief descriptions of different types of text mining, some low bar

Machine Learning Tools

Topic Modeling


Visit the IS Lab Software page to see what software is available in campus labs.

  • R (also available for free online)
  • Python (also available for free online)

Natural Language Processing

  • Python (also available for free online)

Online Tools


  • stylo: Stylometric Multivariate Analyses is an R package to perform various analyses in the field of computational stylistics, authorship attribution, etc.

Network and Citation Analysis Tools


Visit the IS Lab Software page to see what software is available in campus labs.

  • Gephi (network analysis - also available online)

Qualitative Mark-Up and Annotation

  • ATLAS.ti 
  • NVivo 

Text Data Visualization Tools


Visit the IS Lab Software page to see what software is available in campus labs.

  • ATLAS.ti 
  • NVivo 
  • R (also available for free online)
  • Gephi (also available online)

Word Frequency Analysis Tools


Visit the IS Lab Software page to see what software is available in campus labs.

  • ATLAS.ti
  • NVivo 
  • R
  • SAS Text Miner 
  • Python
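
For readers new to word frequency analysis, here is a minimal sketch of the idea in Python, using only the standard library. The function name `word_frequencies`, the sample sentence, and the simple letters-only tokenizer are illustrative assumptions, not part of any tool listed above; real projects usually add steps such as stop-word removal and more careful tokenization.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most common words in text as (word, count) pairs."""
    # Lowercase the text, then pull out runs of letters and apostrophes.
    # This is a deliberately simple tokenizer (an assumption for this sketch).
    words = re.findall(r"[a-z']+", text.lower())
    # Counter tallies each word; most_common sorts by count, highest first.
    return Counter(words).most_common(top_n)


# Hypothetical sample text, used only to show the function in action.
sample = "Text mining finds patterns in text. Mining large text corpora takes tools."
print(word_frequencies(sample, top_n=2))
# → [('text', 3), ('mining', 2)]
```

To analyze a document of your own, read the file into a string and pass it to the function in place of `sample`.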