Concurrency
Objects are abstractions of processing; threads are abstractions of timing.
Why Concurrency?
- Concurrency is a decoupling strategy. The what is decoupled from the when.
- Concurrency can improve the throughput and structure of an application.
Why Not Concurrency?
- Unclean: It is hard to write clean concurrent code, and it is even harder to test and debug.
- Design Changes: Concurrency does not always improve performance, but it always requires fundamental design changes.
- Extra Management: Concurrency adds management overhead, which costs performance and requires additional code.
- Complexity: Proper concurrency is complex, even for simple problems.
- Unreproducible: Concurrency bugs are usually not reproducible; therefore, they are often written off as one-time occurrences (cosmic rays, glitches, etc.) rather than treated as true defects, as they should be.
- Side-Effects: When threads operate on shared data that is not kept in sync, they can return incorrect results.
Defensive Concurrency Programming
- Single-Responsibility Principle: Concurrency is a reason to change in its own right, so concurrency-related code should be kept separate from other code.
- Separation of code: Changes to concurrent code should not be mixed with changes to sequential code. Therefore, keep concurrent code in source files separate from sequential code.
- Separation of change: Concurrent code has special problems that are different from, and often more serious than, those of sequential code. Therefore, concurrent and sequential code should be changed separately, not within the same commit or even the same branch.
- Principle of Least Privilege: Limit concurrent code to the resources it actually needs in order to avoid side effects, and minimize the amount of shared resources. Divide code and resources into smaller units so that access can be granted in a more granular, and therefore more restrictive, way.
- Data Copies: You can sometimes avoid shared resources by working with copies of data and treating them as read-only, or by giving each thread its own copy of the data, letting the threads compute partial results, and merging those results in a single thread. It is often worth creating a few extra objects to avoid concurrency problems.
- Independence: Threads should be as independent as possible. They should not share data or know anything about each other; instead, they should prefer to work with their own local variables. Try to partition data into independent subsets that can be processed by independent threads, possibly in different processes, as in the sketch below.
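A minimal sketch of the data-copy and independence advice above, assuming a hypothetical summing task (the class name, chunking scheme, and worker count are illustrative): each worker receives an immutable copy of its own slice of the input, computes a partial result, and the merge happens in the calling thread, so no mutable state is shared.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Each worker gets a read-only copy of its slice; results are merged in one thread.
    public class PartialSums {

        public static long sum(List<Integer> data, int workers) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(workers);
            try {
                int chunk = Math.max(1, (data.size() + workers - 1) / workers);
                List<Future<Long>> partials = new ArrayList<>();
                for (int i = 0; i < data.size(); i += chunk) {
                    // Independent, immutable copy of this worker's slice: nothing is shared.
                    List<Integer> slice = List.copyOf(data.subList(i, Math.min(i + chunk, data.size())));
                    Callable<Long> task = () -> slice.stream().mapToLong(Integer::longValue).sum();
                    partials.add(pool.submit(task));
                }
                long total = 0;
                for (Future<Long> partial : partials) {
                    total += partial.get(); // merge partial results in a single thread
                }
                return total;
            } finally {
                pool.shutdown();
            }
        }
    }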
Basic Knowledge
Before starting to write concurrent code, get familiar with the following basics:
- Libraries: Use the thread-safe collections provided by your platform (in Java, for example, java.util.concurrent). Prefer non-blocking solutions where possible. Be aware that many library classes are not thread-safe.
- Concepts: Mutual Exclusion, Deadlock, Livelock, Thread Pools, Semaphores, Locks, Race Conditions, Starvation
- Patterns: Producer-Consumer, Readers-Writers (a Producer-Consumer sketch follows this list)
- Algorithms: Study common algorithms and how they are used in solutions, for example the Dining Philosophers problem.
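As an illustration of the Producer-Consumer pattern and of letting a thread-safe library collection do the locking (the class name, queue capacity, and sentinel value are illustrative): a BlockingQueue from java.util.concurrent blocks producers when it is full and consumers when it is empty, so neither thread contains any synchronization code of its own.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Producer-Consumer via a thread-safe queue; the queue does all the locking.
    public class ProducerConsumerDemo {

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);

            Thread producer = new Thread(() -> {
                try {
                    for (int i = 0; i < 100; i++) {
                        queue.put(i); // blocks while the queue is full
                    }
                    queue.put(-1);    // sentinel value: "no more work"
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            Thread consumer = new Thread(() -> {
                try {
                    while (true) {
                        int item = queue.take(); // blocks while the queue is empty
                        if (item == -1) break;   // sentinel reached, stop consuming
                        System.out.println("consumed " + item);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            producer.start();
            consumer.start();
            producer.join();
            consumer.join();
        }
    }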
Synchronized Methods
Synchronized means that only one thread at a time can execute the method on a given object (the caller holds the object's lock), which prevents side effects from concurrent access.
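For illustration, a minimal (hypothetical) counter: synchronized makes the read-modify-write in increment() atomic, so concurrent callers cannot interleave inside the method and lose updates.

    // Minimal illustration: only one thread at a time can be inside a
    // synchronized method of the same Counter instance.
    public class Counter {
        private int value = 0;

        public synchronized void increment() {
            value++; // read, add, write as one critical section
        }

        public synchronized int value() {
            return value;
        }
    }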
- Avoid dependencies between synchronized methods: In concurrent code, dependencies such as one synchronized method calling another can cause subtle bugs like deadlocks, as well as performance issues.
- Avoid using more than one method on a shared object. If this is not possible, you have three options:
- Client-based locking: The client locks the server, calls all the server methods, and then releases the lock.
- Server-based locking: Create a method in the server that locks the server, calls all the methods, and then unlocks the server. A client can now safely call this new method.
- Adapted Server: Create an intermediate component that performs the locking. This is a variant of server-based locking for cases where the original server cannot be changed. Ideally, one would use thread-safe collections and wrap them behind extended interfaces.
- Server-based locking is preferred over client-based locking. With server-based locking, the class being used takes care of its own locking, so the user has nothing to worry about. With client-based locking, every user has to implement the locking manually, which is error-prone and difficult to maintain (see the sketch after this list).
- Keep synchronized sections small. Locks are expensive because they add administrative overhead and delay; on the other hand, critical sections need to be protected. Critical sections are pieces of code that only run correctly if they are not executed by multiple threads at the same time. Keeping synchronized sections small addresses both concerns: use synchronization only for small, critical code sections.
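A sketch of server-based versus client-based locking with a hypothetical shared object (IdSource, nextOrDefault, and the client helpers are illustrative names): calling hasNext() and next() as two separate steps is a compound operation that must not be interleaved, so either every client locks the object around both calls, or the server offers the compound operation itself under its own lock.

    // Hypothetical shared object. hasNext()/next() are individually synchronized,
    // but using them together is a compound operation that needs one lock.
    public class IdSource {
        private int next = 0;
        private final int limit;

        public IdSource(int limit) { this.limit = limit; }

        public synchronized boolean hasNext() { return next < limit; }
        public synchronized int next()        { return next++; }

        // Server-based locking: the server exposes the compound operation,
        // so clients need no locking code of their own. (Intrinsic locks are
        // reentrant, so calling the other synchronized methods here is safe.)
        public synchronized int nextOrDefault(int fallback) {
            return hasNext() ? next() : fallback;
        }
    }

    class Clients {
        // Client-based locking: every client must remember to do this correctly.
        static int takeWithClientLock(IdSource source) {
            synchronized (source) {
                return source.hasNext() ? source.next() : -1;
            }
        }

        // Preferred: let the server do the locking.
        static int takeWithServerLock(IdSource source) {
            return source.nextOrDefault(-1);
        }
    }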
Miscellaneous
- Performance: When a performance bottleneck is detected in an application, the cause can be I/O or the CPU. Increasing the number of threads reveals which of the two it is: an I/O-bound application benefits from more threads because they can do useful work while others wait, whereas a CPU-bound application does not.
- Stress Testing: A common type of test that determines the maximum throughput of an application by sending a large number of requests and examining the response times.
- Execution Paths: Always keep the number of possible execution paths in mind, that is, the number of possible interleavings of the instructions executed by two or more threads. For example, performing the same operation twice on an object with mutable state can unintentionally produce different results (see the sketch after this list).
- Write Shutdown Code Early: Shutting down an application requires the safe termination of all concurrent processes. Writing correct shutdown code is difficult, and writing it early is cheaper than retrofitting it later. Study the available algorithms.
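A small (hypothetical) demonstration of how interleaved execution paths cause non-determinism: value++ is a read-modify-write, so two unsynchronized threads can lose updates and the same program may print a different total on each run.

    // Two threads increment a shared counter without synchronization.
    // Interleavings of the read/add/write steps lose updates, so the printed
    // result is usually less than 2_000_000 and varies between runs.
    public class LostUpdates {
        private static int value = 0;

        public static void main(String[] args) throws InterruptedException {
            Runnable work = () -> {
                for (int i = 0; i < 1_000_000; i++) {
                    value++; // not atomic: read, add, write can interleave
                }
            };
            Thread a = new Thread(work);
            Thread b = new Thread(work);
            a.start();
            b.start();
            a.join();
            b.join();
            System.out.println(value); // often far less than 2_000_000
        }
    }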