Advanced Natural Language Processing
Exploratory data analysis in large collections of text, with particular emphasis on techniques for text classification, text clustering, and topic identification.
C S 679
 Hours: 3.0 Credit, 3.0 Lecture, 0.0 Lab
 Prerequisites: One or more of C S 401R, 478, 479, 677, Stat 551, 651 (or equivalents).
Course Outcomes: 


A student who has completed C S 679 will possess:

  • experience identifying a text mining problem, examining possible solutions, and implementing and validating one solution
  • experience conducting exploratory data analysis on large, real-world collections of text
  • familiarity with models, methods, and algorithms for unsupervised text analysis and text mining tasks
  • the ability to design, implement, and estimate parameters for statistical models for text mining tasks
  • understanding of unsupervised approaches, including approximate Bayesian reasoning, to document clustering and topic identification
  • experience with techniques for evaluating and visualizing the results of unsupervised learning processes
  • confidence in their own mathematical and statistical abilities
  • the ability to run meaningful experiments
  • familiarity with some of the text mining literature
  • readiness to conduct research that will advance the state of the art
  • preparation for careers in the field of text mining