Unfortunately, I don’t believe you can definitively prove that your code is thread-safe through run-time testing. You can throw as many threads at it as you like, and it may or may not pass depending on how those threads happen to be scheduled; a passing run only shows that the problem didn’t surface that time.
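To illustrate, here is a minimal sketch of the kind of stress test I mean. The names (`RaceDemo`, `UnsafeCounter`, `SafeCounter`, `hammer`) are made up for this example and are not taken from your code. It hammers an unsynchronised counter and an `AtomicInteger`-backed one from several threads and compares the totals.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class RaceDemo {
    // Hypothetical counter: value++ is an unsynchronised read-modify-write.
    static class UnsafeCounter {
        private int value;
        void increment() { value++; }
        int get() { return value; }
    }

    // Hypothetical counter backed by an atomic increment.
    static class SafeCounter {
        private final AtomicInteger value = new AtomicInteger();
        void increment() { value.incrementAndGet(); }
        int get() { return value.get(); }
    }

    // Runs `task` perThread times on each of `threads` threads, released together.
    static void hammer(Runnable task, int threads, int perThread)
            throws InterruptedException {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // wait so that all workers begin at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < perThread; j++) {
                    task.run();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            t.join(); // join() makes the workers' writes visible to this thread
        }
    }

    static int runUnsafe(int threads, int perThread) throws InterruptedException {
        UnsafeCounter counter = new UnsafeCounter();
        hammer(counter::increment, threads, perThread);
        return counter.get();
    }

    static int runSafe(int threads, int perThread) throws InterruptedException {
        SafeCounter counter = new SafeCounter();
        hammer(counter::increment, threads, perThread);
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int expected = 8 * 100_000;
        System.out.println("unsafe: " + runUnsafe(8, 100_000) + " of " + expected);
        System.out.println("safe:   " + runSafe(8, 100_000) + " of " + expected);
    }
}
```

The safe counter always reaches the expected total. The unsafe one usually falls short, but how far short varies from run to run, and nothing stops it from occasionally hitting the exact total on a lightly loaded machine or with fewer iterations. A green run therefore tells you nothing about whether the race is there.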
Perhaps you should look at static analysis tools, such as PMD, which can examine how you’re using synchronisation and flag common usage problems.