Advanced Natural Language Processing
Exploratory data analysis in large collections of text, with particular emphasis on techniques for text classification, text clustering, and topic identification.
 Hours: 3.0 Credit, 3.0 Lecture, 0.0 Lab
 Prerequisites: One or more of C S 401R, 478, 479, 677, Stat 551, 651 (or equivalents).
Course Outcomes: 

A student who has completed C S 679 will possess:

  • experience identifying a text mining problem, examining possible solutions, and implementing and validating one solution
  • experience conducting exploratory data analysis on large, real-world collections of text
  • familiarity with models, methods, and algorithms for unsupervised text analysis and text mining tasks
  • the ability to design, implement, and estimate parameters for statistical models for text mining tasks
  • understanding of unsupervised approaches, including approximate Bayesian reasoning, to document clustering and topic identification
  • experience with techniques for evaluating and visualizing the results of unsupervised learning processes
  • confidence in the student's own mathematical and statistical abilities
  • the ability to run meaningful experiments
  • familiarity with some of the text mining literature
  • readiness to conduct research that will advance the state of the art
  • preparation for careers in the field of text mining