n-grams

Research on N-Grams in Information Retrieval
http://www.cs.umbc.edu/ngram/

Using Statistical Properties of Text to Create Metadata
http://www.computer.org/conferences/meta96/crowder/onefile.html

Marc Damashek. Gauging Similarity with N-Grams: Language-Independent Categorization of Text. Science, Vol. 267, pp. 843-848, 10 February 1995.

National Security Agency: Information Sorting and Retrieval by Language or Topic.
This describes Marc Damashek's n-gram algorithm, the subject of the Science article cited above.