The directory includes 4 files:

orig_docs.rar: original movie reviews

database.txt: movie database created by us

processed_doc.rar: processed reviews in two parts:
		        unstructured.txt contains the list of unstructured words for each movie.
		        structured.txt contains the attribute terms for each movie, along with their candidate matches and the corresponding scores.

groundtruth.txt: contains the movie IDs for each document. It is used only for evaluation purposes.

