When I added more examples to the documents array in the tf-idf example, the wrong document was shown as the most similar. For me, with scikit-learn version 0.24.1, the cosine similarities don't include the input document, so each index 'i' in the similarity scores is one less than the index of the corresponding document in the documents array. The most similar document is therefore documents[highest_score_index + 1], not documents[highest_score_index].
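
Here is a minimal sketch of the off-by-one, not the original example's code. To keep it dependency-free, I've swapped in a plain term-count cosine similarity for the scikit-learn tf-idf pipeline, and I'm assuming the input document is documents[0] and is scored against documents[1:]. The corpus and the `cosine` helper are made up for illustration; only the names documents and highest_score_index come from the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (hypothetical helper)."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Made-up corpus; documents[0] is assumed to be the input document.
documents = [
    "the cat sat on the mat",
    "stock prices fell sharply today",
    "a cat sat on a mat",
    "rain is expected tomorrow",
]

vectors = [Counter(doc.split()) for doc in documents]

# The scores cover documents[1:] only, so scores[i] belongs to documents[i + 1].
scores = [cosine(vectors[0], v) for v in vectors[1:]]
highest_score_index = max(range(len(scores)), key=scores.__getitem__)

# Shift by one to map the score index back into the documents array.
most_similar = documents[highest_score_index + 1]
print(most_similar)
```

If the similarities were instead computed against the full array (input document included), the offset would go away, but the input document would then match itself with the top score and would have to be skipped.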