<h1>6.824 2015 Lecture 19: HubSpot</h1>
<p><strong>Note:</strong> These lecture notes were slightly modified from the ones posted on the
6.824 <a href="http://nil.csail.mit.edu/6.824/2015/schedule.html">course website</a> from
Spring 2015.</p>
<h2>Distributed systems in the real world</h2>
<p>Who builds distributed systems:</p>
<ul>
<li>SaaS market
<ul>
<li>Startups: CustomMade, Instagram, HubSpot</li>
<li>Mature: Akamai, Facebook, Twitter</li>
</ul></li>
<li>Enterprise market
<ul>
<li>Startup: Basho (Riak), Infinio, Hadapt</li>
<li>Mature: VMware, Vertica</li>
</ul></li>
<li>...and graduate students</li>
</ul>
<p>High-level components:</p>
<ul>
<li>front-end: load balancing routers</li>
<li>handlers, caching, storage, business services</li>
<li>infra-services: logging, updates, authentication</li>
</ul>
<p>Low-level components:</p>
<ul>
<li>RPCs (semantics, failure)</li>
<li>coordination (consensus, Paxos)</li>
<li>persistence (serialization semantics)</li>
<li>caching</li>
<li>abstractions (queues, jobs, workflows)</li>
</ul>
<h2>Building the thing</h2>
<p>Business needs will affect scale and architecture:</p>
<ul>
<li>dating website core data: OkCupid uses 2 beefy database servers</li>
<li>analytics distributed DB: Vertica/Netezza clusters have around 100 nodes</li>
<li>mid-size SaaS company: HubSpot uses around 100 single-node DBs or around
10 node HBase clusters
<ul>
<li>MySQL mostly</li>
</ul></li>
<li>Akamai, Facebook, Amazon: tens of thousands of machines</li>
</ul>
<p>Small SaaS startup:</p>
<ul>
<li>early on the best thing is to figure out if you have a good idea that people
would buy</li>
<li>typically use a platform like Heroku, Google App Engine, AWS, Joyent, CloudFoundry</li>
</ul>
<p>Midsized SaaS:</p>
<ul>
<li>need more control than what PaaS offers</li>
<li>scale may enable you to build better solutions more cheaply</li>
<li>open source solutions can help you</li>
</ul>
<p>Mature SaaS:</p>
<ul>
<li><a href="http://aphyr.com/tags/jepsen">Jepsen tool</a></li>
<li>"Ensure your design works if scale changes by 10x or 20x; the right solution
for x often not optimal for 100x", Jeff Dean</li>
</ul>
<p>How to think about your design:</p>
<ul>
<li>understand what your system needs to do and the semantics</li>
<li>understand the scale of your workload, then make rough estimates from basic
costs (L2 access time, network latency) to predict performance before you build</li>
</ul>
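<p>The estimation step above can be sketched as a few lines of arithmetic. This is only
an illustration: the constants are commonly cited order-of-magnitude figures rather than
measurements, and the handler shape (1000 memory lookups, 3 backend RPCs) is a made-up
assumption.</p>

```python
# Back-of-envelope latency estimate. The constants are commonly cited
# order-of-magnitude figures, not measurements of any particular system.
L2_CACHE_NS = 7                  # approximate L2 cache reference
MAIN_MEMORY_NS = 100             # approximate main-memory reference
DATACENTER_RTT_NS = 500_000      # approximate round trip inside one datacenter


def request_latency_ns(cache_lookups: int, memory_lookups: int, rpcs: int) -> int:
    """Estimate one request's latency, assuming its steps run sequentially."""
    return (cache_lookups * L2_CACHE_NS
            + memory_lookups * MAIN_MEMORY_NS
            + rpcs * DATACENTER_RTT_NS)


def max_sequential_rps(latency_ns: int) -> float:
    """Upper bound on requests per second for a single sequential worker."""
    return 1e9 / latency_ns


if __name__ == "__main__":
    # Hypothetical handler: 1000 memory lookups and 3 backend RPCs.
    latency = request_latency_ns(cache_lookups=0, memory_lookups=1000, rpcs=3)
    print(f"estimated latency: {latency / 1e6:.2f} ms")
    print(f"max throughput per worker: {max_sequential_rps(latency):.0f} req/s")
```

<p>Under these assumptions the three RPCs account for 1.5 ms of the 1.6 ms total, which
suggests where optimization effort would pay off first.</p>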
<h2>Running the thing</h2>
<ul>
<li>"telemetry beats event logging"
<ul>
<li>logs can be hard to understand: getting a good story out is difficult</li>
</ul></li>
<li>logging: first line of defense, doesn't scale well
<ul>
<li>logs on different machines</li>
<li>timestamps can be useless if clocks are not synchronized</li>
<li>lots of tools around logging</li>
<li>having log data in a queryable format tends to be very useful</li>
</ul></li>
<li>monitoring, telemetry, alerting
<ul>
<li>annotate code with timing and counting events</li>
<li>measure how big an in-memory queue is or how long a request takes; anything
you can count, you can track</li>
<li>can do telemetry at multiple granularities so we can break long requests
into smaller pieces and pinpoint problems</li>
</ul></li>
</ul>
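<p>A minimal sketch of annotating code with timing and counting events at more than one
granularity. The <code>Telemetry</code> class, the metric names and
<code>handle_request</code> are hypothetical; a real deployment would presumably ship
these values to a metrics service instead of keeping them in memory.</p>

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Telemetry:
    """In-process counters and timers (a stand-in for a real metrics client)."""

    def __init__(self, clock=time.perf_counter):
        self.counts = defaultdict(int)       # event name -> number of occurrences
        self.durations = defaultdict(list)   # timer name -> durations in seconds
        self._clock = clock

    def count(self, name, n=1):
        self.counts[name] += n

    @contextmanager
    def timer(self, name):
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the timed block raises.
            self.durations[name].append(self._clock() - start)


telemetry = Telemetry()


def handle_request(items):
    """Hypothetical handler, timed as a whole and per phase."""
    telemetry.count("request.count")
    with telemetry.timer("request.total"):
        with telemetry.timer("request.parse"):
            parsed = [str(i) for i in items]
        with telemetry.timer("request.store"):
            stored = len(parsed)
    return stored


if __name__ == "__main__":
    handle_request(range(1000))
    for name, samples in sorted(telemetry.durations.items()):
        print(f"{name}: {max(samples) * 1e3:.3f} ms (n={len(samples)})")
```

<p>Nesting the timers is what lets a slow <code>request.total</code> be attributed to
one of its phases.</p>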
<h2>Management: command and control</h2>
<ul>
<li>in classroom settings you don't have to set up a bunch of machines</li>
<li>as your business scales, new machines need to be set up => must automate</li>
<li>separate configuration from app</li>
<li>HubSpot uses a ZooKeeper-like system that allows apps to get config values</li>
<li>Maven for dependencies in Java</li>
<li>Jenkins for continuous integration testing</li>
</ul>
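<p>One way to picture separating configuration from the app is sketched below. The
notes do not describe how HubSpot's system works; <code>ConfigStore</code> is a
hypothetical in-memory stand-in for such a service, and the keys and URL format are
assumptions made for illustration.</p>

```python
from typing import Dict, Optional


class ConfigStore:
    """In-memory stand-in for a remote, ZooKeeper-like configuration service."""

    def __init__(self, values: Optional[Dict[str, str]] = None):
        self._values = dict(values or {})

    def set(self, key: str, value: str) -> None:
        self._values[key] = value

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        return self._values.get(key, default)


class App:
    """Hypothetical app that reads settings at use time instead of hard-coding them."""

    def __init__(self, config: ConfigStore):
        self._config = config

    def db_url(self) -> str:
        host = self._config.get("db.host", "localhost")
        port = int(self._config.get("db.port", "3306"))
        return f"mysql://{host}:{port}/app"


if __name__ == "__main__":
    config = ConfigStore({"db.host": "db1.internal"})
    app = App(config)
    print(app.db_url())
    config.set("db.host", "db2.internal")   # change config without redeploying the app
    print(app.db_url())
```

<p>Because the app looks values up on each use, an operator can repoint it at a
different database without rebuilding or redeploying the code.</p>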
<h2>Testing</h2>
<ul>
<li>automated testing makes it easy to verify newly introduced changes to your code</li>
<li>UI testing can be a little harder (simulate clicks, different layout in different browsers)
<ul>
<li>front end changes => must change tests?</li>
</ul></li>
</ul>
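<p>As a small illustration of automated testing, the sketch below uses Python's standard
<code>unittest</code> module. The function under test, <code>normalize_email</code>, is
hypothetical; the point is only that a change to it can be checked by rerunning the
suite.</p>

```python
import unittest


def normalize_email(raw: str) -> str:
    """Hypothetical function under test: trim whitespace and lowercase."""
    return raw.strip().lower()


class NormalizeEmailTest(unittest.TestCase):
    def test_strips_whitespace(self):
        self.assertEqual(normalize_email("  a@example.com \n"), "a@example.com")

    def test_lowercases(self):
        self.assertEqual(normalize_email("A@Example.COM"), "a@example.com")


def run_tests() -> bool:
    """Run the suite programmatically and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeEmailTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("all tests passed:", run_tests())
```

<p>A continuous integration job could run the same suite on every commit and block a
merge when <code>run_tests()</code> reports a failure.</p>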
<h2>Teams</h2>
<ul>
<li>people: how do you get together and build the thing</li>
<li>analogy: software engineering process is sort of like a distributed system
with unreliable components.
<ul>
<li>somehow must build reliable software on a reliable schedule</li>
</ul></li>
<li>gotta take care of your people: culture has to be amenable to people growing,
learning and failing</li>
</ul>
<h2>Process</h2>
<ul>
<li>waterfall: big design upfront and then implement it</li>
<li>agile/scrum: don't know the whole solution, need to iterate on designs</li>
<li>kanban:</li>
<li>lean:</li>
</ul>
<h2>Questions</h2>
<ul>
<li>making a big change on a fast-changing codebase
<ul>
<li>if you branch and merge your changes later, chances are the codebase has
changed drastically in the meantime</li>
<li>you can try deploying both branches side by side so that the new
branch can be tested in production</li>
</ul></li>
<li>culture changes with growth
<ul>
<li>need to pay attention to culture and happiness of employees</li>
<li>very important to measure happiness</li>
<li>having small teams might help because people can own projects</li>
</ul></li>
</ul>
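<p>The idea of running two branches in production, from the first question above, can be
sketched as deterministic traffic splitting. This is one possible approach and not
something the notes prescribe; the function names, the deployment labels and the
percentage are assumptions.</p>

```python
import hashlib


def bucket(request_key: str, buckets: int = 100) -> int:
    """Map a key (e.g. a user id) to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def choose_deployment(request_key: str, canary_percent: int) -> str:
    """Send roughly canary_percent of keys to the new branch's deployment."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "new" if bucket(request_key) < canary_percent else "stable"


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10_000)]
    to_new = sum(choose_deployment(k, canary_percent=5) == "new" for k in keys)
    print(f"{to_new} of {len(keys)} keys routed to the new deployment")
```

<p>Hashing the key, instead of choosing at random per request, keeps each user on the
same deployment across requests, which makes problems on the new branch easier to
reproduce.</p>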