VARIABLES in WordsCount

GLOBALS by order of appearance

nl, tb, sp, lb, hy, null = "\n", "\t", " ", "#", "-", ""
    < string > some handy strings to use as variables
symbol_words = ['&']
    < list > list of acceptable single-character non-alphabetic words (strings)
base_path
    < string > sets the base directory from which the module is being run
char = re.compile(r"""[\s\.,:;!?"\(\)\[\]]+""")
    < _sre.SRE_Pattern > characters to help decide where to split words
total_words_dict
    < dictionary > *(1) will eventually contain all the words in all files
ans = "Y"
    < string > default answer, so the test passes if the user is not asked later
count = 0
    < integer > global count
low_range
    < integer > low end of the range, as a percentage of documents containing the word
high_range
    < integer > high end of the range, as a percentage of documents containing the word
stat_dir
    < boolean > whether the needed directories already exist
ans
    < string > result of user input on whether to create the needed directories
text_dir
    < string > name of the directory containing the original text documents
file_count
    < integer > total number of text files in text_dir
text_file
    < string > name of an individual item in text_dir; it is skipped if it is not a file (for example, a directory). An item that is a file but not a text file may break things.
content
    < string > complete text of the current file, txt_file
words
    < list > list of content split into basic words
good_words
    < list > list of refined words in an individual file (txt_file)
file_words_dict
    < dictionary > dictionary of word/count for the current txt_file
file_words_list
    < list > list of unique words from the keys of file_words_dict (current file)
total_words_dict
    < dictionary > *(1) see above. It actually gets created here, using the variables above along with some local variables within functions.
All unique words are now in the dictionary, total_words_dict.

pct_lst
    < list > list of all unique words with the % of documents in which they appear
list_range
    < list > list of all words whose % value falls in a given % range
pos_lst
neg_lst

* Notes

(1) The most important single object in the module. Each unique word is a key in the dictionary; its value holds the name of each file in which the word appears, along with a count of how many times the word appears in that file.
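
To make the shape of total_words_dict, pct_lst and list_range concrete, here is a minimal sketch in Python. It is not the module's actual code: the helper names (count_words, build_total, percent_of_documents, words_in_range), the lower-casing, the isalpha() filter and the inclusive range test are all assumptions made for illustration. Only the char pattern, symbol_words and the variable names come from the list above.

```python
import re

# Taken from the variable list above.
char = re.compile(r"""[\s\.,:;!?"\(\)\[\]]+""")
symbol_words = ['&']


def count_words(content):
    """Return {word: count} for one document (hypothetical helper)."""
    file_words_dict = {}
    for word in char.split(content.lower()):
        # Assumption: keep purely alphabetic words plus whitelisted symbols.
        # Empty strings produced by the split are dropped by this test too.
        if word.isalpha() or word in symbol_words:
            file_words_dict[word] = file_words_dict.get(word, 0) + 1
    return file_words_dict


def build_total(docs):
    """Build {word: {file name: count}} from {file name: text}."""
    total_words_dict = {}
    for text_file, content in docs.items():
        for word, n in count_words(content).items():
            total_words_dict.setdefault(word, {})[text_file] = n
    return total_words_dict


def percent_of_documents(total_words_dict, file_count):
    """Return a sorted list of (word, % of documents containing the word)."""
    return sorted(
        (word, 100.0 * len(files) / file_count)
        for word, files in total_words_dict.items()
    )


def words_in_range(pct_lst, low_range, high_range):
    """Keep entries with low_range <= % <= high_range (inclusive by assumption)."""
    return [(w, p) for w, p in pct_lst if low_range <= p <= high_range]
```

With two in-memory documents in place of files from text_dir, build_total({"a.txt": "The cat & the dog.", "b.txt": "The bird; the cat!"}) maps "the" to {"a.txt": 2, "b.txt": 2}, which is the word -> file -> count layout described in note (1).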