max_df is used for removing terms that appear too frequently, also known as “corpus-specific stop words”. For example:
max_df = 0.50 means “ignore terms that appear in more than 50% of the documents”.
max_df = 25 means “ignore terms that appear in more than 25 documents”.
The default max_df is 1.0, which means “ignore terms that appear in more than 100% of the documents”. Thus, the default setting does not ignore any terms.
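As a rough illustration, here is a minimal sketch of max_df in action. It assumes the parameter in question is the one on scikit-learn's CountVectorizer; the toy documents and the `vocabulary` helper are made up for this example.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus (hypothetical): "the" appears in 3 of the 4 documents (75%),
# every other term appears in exactly 1 document.
docs = [
    "the cat sat",
    "the dog barked",
    "the bird sang",
    "fish swim fast",
]


def vocabulary(**kwargs):
    """Fit a CountVectorizer on docs and return its sorted vocabulary."""
    return sorted(CountVectorizer(**kwargs).fit(docs).vocabulary_)


default_vocab = vocabulary()          # max_df=1.0: nothing is dropped
half_vocab = vocabulary(max_df=0.5)   # float: drop terms in more than 50% of docs
count_vocab = vocabulary(max_df=2)    # int: drop terms in more than 2 docs

print("the" in default_vocab)  # True
print("the" in half_vocab)     # False
print("the" in count_vocab)    # False
```

Note that a float is read as a proportion of documents and an integer as an absolute count, so max_df=1.0 and max_df=1 mean very different things.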
min_df is used for removing terms that appear too infrequently. For example:
min_df = 0.01 means “ignore terms that appear in fewer than 1% of the documents”.
min_df = 5 means “ignore terms that appear in fewer than 5 documents”.
The default min_df is 1, which means “ignore terms that appear in fewer than 1 document”. Thus, the default setting does not ignore any terms.
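The same kind of sketch works for min_df, again assuming scikit-learn's CountVectorizer; the corpus and the `kept_terms` helper are invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus (hypothetical). Document frequencies:
# apple = 3, banana = 3, cherry = 2, date = 1.
docs = [
    "apple banana",
    "apple cherry",
    "apple banana date",
    "banana cherry",
]


def kept_terms(**kwargs):
    """Fit a CountVectorizer on docs and return the terms it keeps, sorted."""
    return sorted(CountVectorizer(**kwargs).fit(docs).vocabulary_)


default_terms = kept_terms()            # min_df=1: every term is kept
count_terms = kept_terms(min_df=2)      # int: drop terms in fewer than 2 docs
ratio_terms = kept_terms(min_df=0.75)   # float: drop terms in fewer than 75% of docs

print(default_terms)  # ['apple', 'banana', 'cherry', 'date']
print(count_terms)    # ['apple', 'banana', 'cherry']
print(ratio_terms)    # ['apple', 'banana']
```

On a corpus this small, a fractional min_df translates to a very coarse threshold (0.75 of 4 documents is 3), so an integer count is usually easier to reason about.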