Use a git repository for retrospectives #53

Closed
imbstack opened this issue Apr 11, 2017 · 23 comments

@imbstack
Contributor

imbstack commented Apr 11, 2017

Currently we track retrospectives in a couple ways:

  1. The lightweight team internal retrospectives we sometimes do which end up in etherpads somewhere ephemeral. (get it?)
  2. The heavyweight Platform Operations retrospectives in Google Docs that we've never really adopted. Now that we're not on that team anyway, it doesn't matter so much.

I propose that to replace these two processes, we have a single process in a git repository. We'll use the lightweight template we were using before to encourage more contributions. I've created an example repo at https://github.com/taskcluster/taskcluster-retrospectives. The instructions are in the README. I've created an example PR for my outage today in taskcluster/taskcluster-retrospectives#1.


Proposed Process

  • incident occurs
  • someone opens up a google doc (I agree these are better than etherpad for this sort of thing)
  • everyone dumps notes, info, etc. in there
  • incident is resolved
  • someone uses the google doc to write up a retrospective document, flags one or more other participants for review on a PR
  • Once it's generally agreed to be comprehensive and accurate, someone clicks "merge".
@imbstack imbstack self-assigned this Apr 11, 2017
@imbstack
Contributor Author

@gregarndt @djmitche thoughts on this?

@djmitche
Contributor

One key change that we've made recently, that I like a lot, is not having meetings for retrospectives unless they warrant it (lots of discussion, disagreement, or need for research).

This hits a nice balance -- there's still room to discuss in the PR, and everyone can choose their level of engagement, from "don't care" to "comment on every PR".

@jhford
Contributor

jhford commented Apr 13, 2017

It feels like we're using Github Issues as the new Wiki. How about wiki.mozilla.org?

@djmitche
Contributor

I think wikis miss the discussion portion (talk pages try to fill that gap, but I haven't ever seen them used in earnest at Mozilla). Retrospectives map nicely to the PR process: propose, discuss, modify, agree, commit. And that discussion is at the right level -- not yet-another-meeting, but not just a comments section.

@djmitche
Contributor

Further to the question about wiki -- we have been using etherpad for this, but etherpads are not very searchable and they haven't supported discussion when we do not have a meeting. I'm very much in favor of not having a meeting for every production issue, as in many cases the timeline and next steps are already fairly well understood right after the event.

@jhford what do you think?

@jonasfj

jonasfj commented Apr 25, 2017

I would rather see them on the mana...

Or as etherpads. Do these really have any value once the calendar events for the actions that resulted from them have expired?

@imbstack
Contributor Author

imbstack commented May 5, 2017

Yeah, they pretty much have value forever as far as I can tell. If we want to debate the value of having retrospectives at all, I guess we can do that instead? It seems like a well-established practice when running important services. If I were a "customer" of Taskcluster and an incident impacted my work in a negative way, I would want to know what happened and what is being done to fix it.

The process of creating one itself is quite valuable and I think that pull requests with reviews and such lend themselves quite well to the process. Having them in public is/will be important as we expand the platform to more than just people who have access to the mana. Those are my biggest arguments in favor of the Github approach.

Jonas-friendly bullet points:

  • Postmortems are for more than just the team and so should be public
  • PRs seem like a good way to do the discussion involved in drafting one asynchronously

@jonasfj

jonasfj commented May 5, 2017

Okay, I can agree to those bullet points...

Except, PRs can be a painful way to comment on text documents. I hate to say it, but Google Docs is better for text docs.

@djmitche djmitche self-assigned this May 31, 2017
@djmitche
Contributor

I'd like to make a decision on this one way or the other. Options on the table now are:

  • github repo
  • wiki.mozilla.org
  • mana
  • google docs
  • etherpads (no change)

My requirements:

  • public
  • allow asynchronous discussion (avoiding a meeting)
  • clear notion of when the postmortem is "finished" (PR is merged, in the github case)

I think we've narrowed to github repo and google docs. How can we decide between the two?

@gregarndt

I like having a place for historical reference (such as the github repo) and a lighter-weight way of documenting the incident while it's happening and refining it before putting it up somewhere. I tend to favor google docs because it gives the flexibility of real time collaboration, and you can comment on pieces of it fairly easily. It can also work for async discussions.

In the end I'm flexible enough to use anything and I think they have a lot of value. I can adapt to what the team finds the best fit.

@djmitche
Contributor

djmitche commented Jun 12, 2017

OK, my understanding is that the majority opinion is 'google docs' and the minority agrees to use whatever the majority decides. Assuming this is decided:

  • move all retros to a google docs folder
  • make that folder world-readable
  • link to that folder from our wiki page
  • build a simple template based on the existing etherpad template
  • update retro instructions to point to that folder

(FCP lasts until 6/20)

@petemoore
Member

So I think I favour a repository, since it is easy to grep, supports markup, and doesn't require signon (assuming you have it checked out, it is just a git pull). I personally hate using web-based text editors (etherpad, google docs, ...) - maybe because they are often slow/laggy for me.

I think a repo has more tools available to it too, e.g. mass cleanup, restructuring, and reorganising are all relatively painless. For example, if we decide to apply a new format across reports, we can easily script something for text-based content, but iterating through google docs is painful. I think google docs is ok when you have one or two docs, but when we continually add new docs for retrospectives, at some point it becomes painful to assimilate content across docs, search for text strings, etc. I also like that sticking it in a repo makes it highly discoverable for people browsing the taskcluster org. One less tool required.
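
To illustrate the "easy to grep" and "script something" points, here is a minimal sketch in Python. It assumes each retrospective is a Markdown file somewhere inside a local checkout of the repository; the function names `search_retrospectives` and `rename_heading` and the heading text in the usage note are hypothetical, not part of any existing tooling.

```python
from pathlib import Path


def search_retrospectives(repo_dir, needle):
    """Return (file name, line number, line) for each line containing needle."""
    matches = []
    # Assumption: retrospectives are *.md files anywhere under the checkout.
    for path in sorted(Path(repo_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            # Case-insensitive substring match, similar in spirit to `grep -i`.
            if needle.lower() in line.lower():
                matches.append((path.name, number, line.strip()))
    return matches


def rename_heading(repo_dir, old, new):
    """Replace every line equal to `old` with `new`; return the files changed."""
    changed = []
    for path in sorted(Path(repo_dir).rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        updated = [new if line.strip() == old else line for line in lines]
        if updated != lines:
            # Only rewrite files that actually contain the old heading.
            path.write_text("\n".join(updated) + "\n", encoding="utf-8")
            changed.append(path.name)
    return changed
```

Something like `search_retrospectives(".", "queue")` would list every mention of "queue" across past retrospectives, and `rename_heading(".", "## Timeline", "## Sequence of events")` would apply a format change to all of them in one pass. The resulting diff could then go through the usual PR review.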

@petemoore
Member

> I tend to favor google docs because it gives the flexibility of real time collaboration, and you can comment on pieces of it fairly easily. It can also work for async discussions.

What about using google docs, and then copying into github when the retrospective is completed?

This would meet the real-time collaboration need, but also solve the scalability need when we have a lot of retrospectives to search through / maintain etc.

@djmitche
Contributor

Pete's email was messed up, so he didn't get notified of this until just now :(

So we now have three in favor of repos (me, @imbstack, @petemoore) and two in favor of gdocs (@garndt, @jonasfj).

@djmitche
Contributor

@gregarndt, can you outline the "real-time collaboration" bit? One of the things I'd like to do is not have retrospective meetings anymore, so I'm not sure how "real-time" applies.

@gregarndt

What I was thinking is that during an incident we want to have as much lightweight interactive communication as possible, probably by using a google doc or etherpad; I really don't care either way. Once the incident is over, we can then move that over to a PR in the retrospective repo to commit for all of time.

@djmitche
Contributor

Ah, that's what @petemoore said too. So maybe we are all on the same page: (bulleted for Jonas)

  • incident occurs
  • someone opens up a google doc (I agree these are better than etherpad for this sort of thing)
  • everyone dumps notes, info, etc. in there
  • incident is resolved
  • someone uses the google doc to write up a retrospective document, flags one or more other participants for review on a PR
  • Once it's generally agreed to be comprehensive and accurate, someone clicks "merge".

I think the key difference is that we want to collaborate rapidly while the incident is going on, but the result of that collaboration is a barely organized pile of stuff that needs to be refashioned after the fact into a readable retrospective. That document can link to all the important stuff: follow-up bugs, bits of background data (like logs), IRC logs, etc.

@jonasfj

jonasfj commented Jun 22, 2017

Makes sense to me...

@djmitche
Contributor

OK, now I think we're all on the same page :)

@djmitche
Contributor

Final comment period until July 3 (because I don't want to be bothered with this during the all-hands)

@petemoore
Member

Love it.

@djmitche
Contributor

djmitche commented Jul 3, 2017

OK, repo is updated - or at least, there's a PR to update the README - and I've linked it from https://wiki.mozilla.org/TaskCluster/Operations

@djmitche djmitche closed this as completed Jul 3, 2017
@djmitche
Contributor

djmitche commented Feb 7, 2018

This RFC is stored as rfcs/0053-Use-a-git-repository-for-retrospectives.md

djmitche added a commit that referenced this issue Feb 7, 2018