-
Notifications
You must be signed in to change notification settings - Fork 1
/
rules.tex
130 lines (93 loc) · 6.63 KB
/
rules.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
\section{Rules for Multithreaded Programming}
\label{sec:rules}
In this section I'll try to summarize a few ``rules of thumb'' that one should keep in mind when
building a multithreaded program. Although using multiple threads can provide elegant and
natural solutions to some programming problems, they can also introduce race conditions and
other subtle, difficult to debug problems. Many of these problems can be avoided by following a
few simple rules.
\subsection{Shared Data}
As I described in Section \ref{sec:thread-synchronization}, when two threads try to access the
same data object there is a potential for problems. Normally modifying an object requires
several steps. While those steps are being carried out the object is typically not in a well
formed state. If another thread tries to access the object during that time, it will likely get
corrupt information. The entire program might have undefined behavior afterwards. This must be
avoided.
\subsubsection{What data is shared?}
\begin{enumerate}
\item Static duration data (data that lasts as long as the program does). This includes global
data and static local data. The case of static local data is only significant if two (or more)
threads execute the function containing the static local at the same time.
\item Dynamically allocated data that has had its address put into a static variable. For
example, if a function uses malloc() or new to allocate an object and then places the address
of that object into a variable that is accessible by more than one thread, the dynamically
allocated object will then be accessible by more than one thread.
\item The data members of a class object that has two (or more) of its member functions called
by different threads at the same time.
\end{enumerate}
\subsubsection{What data is not shared?}
\begin{enumerate}
\item Local variables. Even if two threads call the same function they will have different
copies of the local variables in that function. This is because the local variables are kept
on the stack and every thread has its own stack.
\item Function parameters. In languages like C the parameters to functions are also put on the
stack and thus every thread will have its own copy of those as well.
\end{enumerate}
Since local variables and function parameters can't be shared they are immune to race
conditions. Thus you should use local variables and function parameters whenever possible. Avoid
using global data. Be aware, however, that \emph{taking the address of a local variable and
passing that address to place where another thread can read it amounts to sharing the local
variable with that other thread}.
\subsubsection{What type of simultaneous access causes a problem?}
\begin{enumerate}
\item Whenever one thread tries to update an object, no other threads should be allowed to touch
the object (for either reading or writing). Mutual exclusion should be enforced with some sort
of mutex-like object (or by some other suitable means).
\end{enumerate}
\subsubsection{What type of simultaneous access is safe?}
\begin{enumerate}
\item If multiple threads only read the value of an object, there should be no problem. Be
aware, however, that complicated objects often update internal information even when, from the
outside, they are only being read. Some objects maintain a cache or keep track of usage
statistics internally even for reads. Simultaneous reads on such an object might not be safe.
\item If one thread writes to one object while another thread touches a totally independent
object, there should be no problem. Be aware, however, that many functions and objects do
share some data internally. What appears to be two separate objects might really be using a
shared data structure behind the scenes.
\item Certain types of objects are updated in an uninterruptable way. Thus simultaneous reads
and writes to such objects are safe because it is impossible for the object to be in an
inconsistent state during the update. Such updates are said to be \newterm{atomic}. The bad
news is that the types that support atomic updates are usually very simple (for example: int)
and there is no good way to know for sure exactly which types they are. The C standard
provides the type \snippet{sig\_atomic\_t} for this purpose. It is defined in
\snippet{<signal.h>} and is a kind of integer. Simultaneous updates to an object declared to
be \snippet{volatile sig\_atomic\_t} are safe. Mutexes are not necessary in this case.
\end{enumerate}
\subsection{What can I count on?}
Unless a function is specifically documented as being thread-safe, you should assume that it is
not. Many libraries make extensive use of static data internally and unless those libraries were
designed with multiple threads in mind that static data is probably not being properly protected
with mutexes.
Similarly you should regard the member functions of a class as unsafe for multiple threads
unless it has been specifically documented to be otherwise. It is easy to see that there might
be problems if two threads try to manipulate the same object. However, even if two threads try
to manipulate different objects there could still be problems. For various reasons, many classes
use internal static data or try to share implementation details among objects that appear to be
distinct from the outside.
You can count on the following:
\begin{enumerate}
\item The API functions of the operating system are thread-safe.
\item The POSIX thread standard requires that most functions in the C standard library be
thread-safe. There are a few exceptions which are documented as part of the standard.
\item Under Windows the C standard library is totally thread safe provided you use the correct
version of the library and you initialize it properly (if required).
\item C++ 1998 does not discuss threads so The thread safety of the C++ standard library is
vague and dependent on the compiler you are using. This has been corrected with the C++ 2011
(or later) standard which supports threads as part of the standard.
\end{enumerate}
If you use a non thread-safe function in one of your functions, your function will be rendered
non thread-safe as well. However, you are free to use a non thread-safe function in a
multithreaded program provided it is never called by two or more threads at the same time. You
can either arrange to use such functions in only one thread or protect calls to such functions
with mutexes. Keep in mind that many functions share data internally with other functions. If
you try to protect calls to a non thread-safe function with a mutex you must also protect calls
to all the other related functions with the same mutex. This is often difficult.