Definition: The percentage of classes/methods/lines of code executed during testing; accordingly, there are class coverage, method coverage, and line coverage. Code coverage is generally a useful quality metric and an indicator of missing tests, but it guarantees neither good tests nor that all relevant cases are covered, so take it with a grain of salt.
Aim for 100% code coverage. Untested code is risky and bug-prone. Although 100% is rarely exactly achieved, it's a worthy goal that encourages thorough testing.
Meaningful Tests: Only add tests that add value, e.g. don't add tests just to achieve 100% code coverage or to test something that has already been tested.
Leverage code coverage tools that measure coverage automatically during test execution.
Failure Strictness: If even one test fails, it requires the developer's attention. The developer must not proceed with other work until the problem is fixed, and all tests must pass without exception.
Falsifiability = The ability to fail.
Wrongfully failing tests are easy to detect, because a good developer will pay attention to all failing tests and fix the problems before moving on.
Wrongfully passing tests lack the ability to fail properly and are hard to detect because everything seems fine, which is why preventing them deserves special attention. They do damage by appearing to validate functionality that they do not actually validate. A properly executed Test-Driven Development workflow is one of the best ways to avoid the kinds of problems described in this article.
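A minimal JUnit 5 sketch of the difference, using a hypothetical applyDiscount helper: the first test can never fail, no matter how broken the logic is, while the second is falsifiable.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class DiscountTest {

    // Hypothetical production code: applies a 10% discount.
    static double applyDiscount(double price) {
        return price * 0.9;
    }

    // Wrongfully passing: expected and actual come from the same call,
    // so the assertion can never fail, no matter what the code does.
    @Test
    void discountApplied_cannotFail() {
        double result = applyDiscount(100.0);
        assertEquals(result, result); // always true, even if the logic is wrong
    }

    // Falsifiable: the expected value is stated independently of the code
    // under test, so a broken implementation makes this test fail.
    @Test
    void discountApplied() {
        assertEquals(90.0, applyDiscount(100.0), 0.0001);
    }
}
```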
Determinism and Reproducibility: Repeated test runs should always produce the same result. The success of a test should never depend on anything other than the production code. Avoid sources of nondeterminism that could cause a test to fail even though the production code is fine. These include random number generators, threads, host-specific infrastructure (operating system, absolute paths, hardware characteristics such as I/O speed or CPU load), networking over the local network or the Internet (since the network or other hosts may sometimes be down), time/timestamps, and pre-existing values in the database or files on the host system that are not part of the source code.
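One common way to remove time as a source of nondeterminism is to inject a java.time.Clock instead of reading the system clock directly. A minimal JUnit 5 sketch, with a hypothetical Token class assumed for illustration:

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import org.junit.jupiter.api.Test;

class TokenTest {

    // Hypothetical production code: a token that expires one hour after creation.
    // It depends on an injected Clock instead of calling Instant.now() directly.
    static class Token {
        private final Instant expiresAt;

        Token(Clock clock) {
            this.expiresAt = Instant.now(clock).plusSeconds(3600);
        }

        boolean isExpired(Clock clock) {
            return Instant.now(clock).isAfter(expiresAt);
        }
    }

    @Test
    void tokenExpiresAfterOneHour() {
        // Fixed clocks make the test deterministic: the same inputs always
        // produce the same result, independent of the host's wall clock.
        Clock creation = Clock.fixed(Instant.parse("2024-01-01T00:00:00Z"), ZoneOffset.UTC);
        Clock later = Clock.fixed(Instant.parse("2024-01-01T02:00:00Z"), ZoneOffset.UTC);

        Token token = new Token(creation);

        assertTrue(token.isExpired(later));
    }
}
```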
Single Action Execution: The entire setup and testing process should require a single click or shell command. Otherwise, the tests will not be run regularly, which weakens the quality assurance of the production code.
Independence: Tests should be independent and not influenced by other tests; the order in which tests are executed should not matter.
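A minimal JUnit 5 sketch of independence: each test gets a fresh fixture, so no test can observe state left behind by another and the execution order does not matter.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

class ShoppingCartTest {

    // A fresh fixture is created for every test instead of sharing
    // mutable state across tests.
    private List<String> cart;

    @BeforeEach
    void setUp() {
        cart = new ArrayList<>();
    }

    @Test
    void startsEmpty() {
        assertEquals(0, cart.size());
    }

    @Test
    void addingAnItemIncreasesTheSize() {
        cart.add("book");
        assertEquals(1, cart.size());
    }
}
```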
Self-Containment: The result of unit and component tests should depend only on the source code and not on external factors. This helps to achieve reproducibility and independence.
Clean Test Code: Maintain the same high standards for test code as for production code. The only exception is execution speed: tests must run fast, but the test code itself does not need to be highly optimized, since it typically handles only small amounts of test data.
Easy Bug Tracing: Failed tests should clearly indicate the location and reason for the failure. Measures to achieve this could include using only one assertion per test, using error logs, and using unique exceptions and error messages where appropriate.
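A minimal JUnit 5 sketch (the parse helper is hypothetical): each test makes one focused assertion with a descriptive message, so a failure immediately points to what went wrong and where.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class ParserTest {

    // Hypothetical helper: splits "key=value" into its two parts.
    static String[] parse(String line) {
        return line.split("=", 2);
    }

    // One focused assertion per test: when a test fails, its name and
    // message point directly at the offending part.
    @Test
    void parsesTheKey() {
        assertEquals("host", parse("host=localhost")[0], "key part of 'host=localhost'");
    }

    @Test
    void parsesTheValue() {
        assertEquals("localhost", parse("host=localhost")[1], "value part of 'host=localhost'");
    }
}
```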
Test behavior, not implementation. Always test against the interfaces that the production code exposes and let those tests exercise the implementation behind them. The developer should be free to modify the code as long as it produces the correct results. This also reduces the coupling between test and production code, so fewer tests need to be adapted when the implementation changes. Avoid reflection and avoid weakening encapsulation by making private things public. However, weakening encapsulation is a lesser evil than not testing at all.
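A minimal JUnit 5 sketch of behavior-based testing, here against the standard Deque interface: the test relies only on the observable stack behavior, so the implementation behind the interface can be swapped without touching the test.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayDeque;
import java.util.Deque;
import org.junit.jupiter.api.Test;

class StackBehaviorTest {

    // The test uses only the public Deque interface; whether the
    // implementation is array-based or linked is irrelevant here.
    @Test
    void lastPushedElementIsPoppedFirst() {
        Deque<Integer> stack = new ArrayDeque<>();

        stack.push(1);
        stack.push(2);

        assertEquals(2, (int) stack.pop());
        assertEquals(1, (int) stack.pop());
    }
}
```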
'Program testing can be used to show the presence of bugs, but never to show their absence!' (Edsger W. Dijkstra).
Follow good testing practices to avoid a large number of bugs, but be aware that bugs may still occur and be prepared for them.
Requirement fulfillment over specific solutions: Don't limit tests to one specific solution when multiple valid solutions exist. Test for compliance with the requirements, not for one specific output; this keeps tests flexible enough to accept any valid solution. For example, in pathfinding problems, the minimum travel cost validates a solution, not the particular path chosen.
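A minimal JUnit 5 sketch of the pathfinding example, with a hypothetical findRoute function and Route record (the body is only a placeholder standing in for a real pathfinder): the test asserts the requirement, endpoints and minimum total cost, rather than one particular sequence of stops.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.List;
import org.junit.jupiter.api.Test;

class PathfinderTest {

    // Hypothetical result type: the chosen route and its total travel cost.
    record Route(List<String> stops, int totalCost) {}

    // Hypothetical system under test: any route with the minimum cost is valid.
    static Route findRoute(String from, String to) {
        return new Route(List.of("A", "B", "D"), 7); // placeholder implementation
    }

    @Test
    void findsARouteWithMinimumCost() {
        Route route = findRoute("A", "D");

        // Assert the requirement (correct endpoints and minimum total cost),
        // not one specific sequence of stops: several equally cheap routes
        // would all pass this test.
        assertEquals(7, route.totalCost());
        assertEquals("A", route.stops().get(0));
        assertEquals("D", route.stops().get(route.stops().size() - 1));
        assertTrue(route.stops().size() >= 2);
    }
}
```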
Fast: Tests should be small and fast. For example, there is usually no need to test a million data points; five may suffice, each covering one specific use case or edge case. The exceptions are load/stress tests, which are specifically designed to send huge amounts of data to an application.
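A minimal JUnit 5 sketch (assuming the junit-jupiter-params module is available): a handful of deliberately chosen boundary cases keeps the test fast while still covering the interesting behavior.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

class AbsoluteValueTest {

    // A few chosen cases (negative, zero, positive, extremes) run in
    // milliseconds and still cover the interesting behavior; millions of
    // random inputs would add time, not insight.
    @ParameterizedTest
    @CsvSource({
            "-5, 5",
            "0, 0",
            "5, 5",
            "-2147483647, 2147483647",
            "2147483647, 2147483647"
    })
    void absReturnsTheMagnitude(int input, int expected) {
        assertEquals(expected, Math.abs(input));
    }
}
```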
Specific Assertions: For example, assert exactly which type of exception is thrown or which error (message) is returned. If an assertion isn't specific enough, it may not be able to distinguish between similar problems, such as two different failures that throw the same exception type, and will therefore miss potential problems.
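A minimal JUnit 5 sketch with a hypothetical validateAge function: the test pins down both the exact exception type and its message.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

class AgeValidatorTest {

    // Hypothetical validator: rejects negative ages with a descriptive message.
    static void validateAge(int age) {
        if (age < 0) {
            throw new IllegalArgumentException("age must not be negative: " + age);
        }
    }

    @Test
    void rejectsNegativeAge() {
        // Assert the exact exception type and its message, so this failure
        // cannot be confused with a different IllegalArgumentException
        // thrown for another reason.
        IllegalArgumentException e =
                assertThrows(IllegalArgumentException.class, () -> validateAge(-1));
        assertEquals("age must not be negative: -1", e.getMessage());
    }
}
```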