File name: Practical Text Mining With Perl (zipped)
File size: 13.6MB
File format: ZIP
Updated: 2013-03-26 10:37:51
Perl regex
Perl, regular expressions, text processing, English edition.

This book introduces the basic ideas of text mining, a group of techniques for extracting useful information from one or more texts. It is a practical book that focuses on applications and examples. Although some statistics and mathematics are required, they are kept to a minimum, and what is used is explained.

This book, however, does make one demand: it assumes that you are willing to learn to write simple programs using Perl. This programming language is explicitly designed to work with text. In addition, it is open-source software that is available over the Web for free. That is, you can download the latest full-featured version of Perl right now and install it on all the computers you want without paying a cent.

Chapters 2 and 3 give the basics of Perl, including a detailed introduction to regular expressions, a text pattern-matching methodology used in a variety of programming languages, not just Perl. For each concept there are several examples of how to use it to analyze texts. Initial examples analyze short strings, for example, a few words or a sentence. Later examples use text from a variety of literary works, for example, the short stories of Edgar Allan Poe, Charles Dickens’s A Christmas Carol, Jack London’s The Call of the Wild, and Mary Shelley’s Frankenstein. All the texts used here are in the public domain, so you can download these for free, too. Finally, if you are interested in word games, Perl plus extensive word lists are a great combination, which is covered in chapter 3.

Chapters 4 through 8 each introduce a core idea used in text mining. For example, chapter 4 explains the basics of probability, and chapter 5 discusses the term-document matrix, which is an important tool from information retrieval.
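To make the two ideas named in the description concrete, here is a minimal sketch of how a regular expression can split text into words and how the resulting counts form a small term-document matrix. It is not taken from the book: the two sample sentences and the helper names `tokenize` and `term_document_matrix` are hypothetical, chosen only for illustration, and the sketch assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical toy documents; not taken from the book.
my @docs = (
    "The cat sat on the mat.",
    "The dog chased the cat.",
);

# Split a text into lowercase word tokens with a regular expression.
# In list context, a /g match returns every run of ASCII letters.
sub tokenize {
    my ($text) = @_;
    return map { lc $_ } $text =~ /[A-Za-z]+/g;
}

# Build term-by-document counts: $tdm{$term}[$doc_index] = count.
sub term_document_matrix {
    my @texts = @_;
    my %tdm;
    for my $i (0 .. $#texts) {
        for my $term (tokenize($texts[$i])) {
            # Start each new term with a row of zeros, one per document.
            $tdm{$term} //= [ (0) x @texts ];
            $tdm{$term}[$i]++;
        }
    }
    return %tdm;
}

# Print one row per term, one column per document.
my %tdm = term_document_matrix(@docs);
for my $term (sort keys %tdm) {
    printf "%-8s %s\n", $term, join(' ', @{ $tdm{$term} });
}
```

Each printed row is a term and each column a document, so a row such as the one for "dog" shows a zero for the first sentence and a one for the second. The simple letter-only pattern ignores apostrophes, hyphens, and non-ASCII letters; real texts would likely need a more careful pattern.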
[File preview]:
Practical Text Mining With Perl.pdf