Anonymous usage tracking #84

laszlocph · 2019-11-07T13:04:30Z

With consent, of course.

AkiraNorthstar · 2019-11-10T10:58:43Z

Which data should be transmitted?
If you need a tester, then I am happy to help.

laszlocph · 2019-11-10T13:29:24Z

Thanks for volunteering!

The full set of data that we are going to transfer is:

one part vanity metrics, to have things to celebrate: how many people run Woodpecker, with what kind of build volume, etc.
and one part data to support product decisions: which version control systems are used, whether a certain feature is used, etc. Woodpecker needs to form an identity over time, and we need to focus efforts on things that are used or that are strategic. We also need data to evaluate the strategic decisions.

This issue will be updated with the initial set of metrics, and I pledge to keep the list transparent at all times.

davidak · 2021-08-02T21:42:28Z

I like the UX of Syncthing. At first start, you are asked for consent to collect usage data with a preview of that data.

There is also a public page with a great visualization of the data: https://data.syncthing.net/

But don't make the mistakes Muse Group made with Audacity, like not asking for consent and using Google/Yandex for analytics.

audacity/audacity#835
audacity/audacity#889

I'm not sure if OpenTelemetry would be a usable tool for that.

anbraten · 2021-09-09T08:53:32Z

Home Assistant is doing a similar thing. This could be a good starting point:
https://www.home-assistant.io/integrations/analytics
https://github.com/home-assistant/analytics.home-assistant.io

Or OctoPrint:

https://github.com/OctoPrint/OctoPrint/blob/027d8f8069b86a7f5e8c185a2d8f294b631c2f08/src/octoprint/plugins/tracking/__init__.py
https://tracking.octoprint.org/
https://data.octoprint.org/

anbraten · 2022-07-18T21:26:04Z

Data which would be interesting to collect:

every 24h (first on start)

version
users counter
active repo counter
used forge
activated features?
executed pipelines counter
total pipeline execution time
connected agents counter
used agent backends
server and agent OSes
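The list above could be carried as a single JSON payload sent on start and then once per day. What follows is only a minimal sketch: the struct, its field names, `ReportInterval`, and `ReportLoop` are all hypothetical and not taken from Woodpecker's code base.

```go
package main

import (
	"encoding/json"
	"time"
)

// ReportInterval mirrors the "every 24h" idea from the list above.
const ReportInterval = 24 * time.Hour

// UsageReport is a hypothetical payload; every field name is an assumption.
type UsageReport struct {
	Version         string   `json:"version"`
	Users           int      `json:"users"`
	ActiveRepos     int      `json:"active_repos"`
	Forge           string   `json:"forge"`
	Features        []string `json:"features,omitempty"`
	Pipelines       int64    `json:"pipelines"`
	PipelineSeconds int64    `json:"pipeline_seconds"`
	Agents          int      `json:"agents"`
	AgentBackends   []string `json:"agent_backends,omitempty"`
	ServerOS        string   `json:"server_os"`
	AgentOSes       []string `json:"agent_oses,omitempty"`
}

// EncodeReport serializes a report for sending.
func EncodeReport(r UsageReport) ([]byte, error) {
	return json.Marshal(r)
}

// DecodeReport parses a report on the receiving side.
func DecodeReport(b []byte) (UsageReport, error) {
	var r UsageReport
	err := json.Unmarshal(b, &r)
	return r, err
}

// ReportLoop calls send once immediately ("first on start") and then once
// per interval until stop is closed.
func ReportLoop(stop <-chan struct{}, interval time.Duration, send func()) {
	send()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			send()
		}
	}
}
```

A server would start `ReportLoop(stop, ReportInterval, send)` in a goroutine, but only after the admin has opted in, and close `stop` on shutdown.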

6543 · 2022-09-01T14:32:17Z

We need some server that collects it and verifies that it's a legit request ... by doing a callback to see if the site exists?

anbraten · 2022-09-01T14:41:43Z

> We need some server that collects it and verifies that it's a legit request ... by doing a callback to see if the site exists?

That would require us to "collect" the public address of the instance, which we don't want, I guess, since the data should be anonymous. It would also require instances to be publicly reachable.

I recently had a look at Grafana and InfluxDB for this.

I think we should simply create a small Go server which takes HTTP requests and inserts that data into a connected db like InfluxDB. I would create an anonymous id at the first start of a server and save it to the server database. This id (do we even need that 🤔) would be used to send a request every x hours to our server, which simply adds an entry to the database (maybe directly aggregating it in the long term).

If we want to "protect" against abuse, we could add an IP-based limit: for example, you can only create a tracking id 10 times a day, and each tracking id is only allowed to report data every x hours. Somewhat like Let's Encrypt does it.
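As a rough illustration of that idea, here is a minimal sketch of a collector with an anonymous id. All names (`NewAnonymousID`, `Report`, `MemoryStore`, `Ingest`, `CollectHandler`) are hypothetical, and the in-memory store is only a stand-in for a real database such as InfluxDB.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"io"
	"net/http"
	"sync"
)

// NewAnonymousID returns 16 random bytes as hex. An instance would generate
// it once at first start and persist it in its own database.
func NewAnonymousID() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// Report is a hypothetical, trimmed-down payload.
type Report struct {
	ID      string `json:"id"`
	Version string `json:"version"`
}

// MemoryStore is an in-memory stand-in for the connected database.
type MemoryStore struct {
	mu      sync.Mutex
	reports []Report
}

// Add appends one report.
func (s *MemoryStore) Add(r Report) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.reports = append(s.reports, r)
}

// Len returns the number of stored reports.
func (s *MemoryStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.reports)
}

// Ingest validates a JSON body, stores it, and returns an HTTP status code.
func Ingest(store *MemoryStore, body []byte) int {
	var r Report
	if err := json.Unmarshal(body, &r); err != nil || r.ID == "" {
		return http.StatusBadRequest
	}
	store.Add(r)
	return http.StatusNoContent
}

// CollectHandler accepts POST requests with a body of at most 64 KiB.
func CollectHandler(store *MemoryStore) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		if req.Method != http.MethodPost {
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(http.MaxBytesReader(w, req.Body, 64<<10))
		if err != nil {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		w.WriteHeader(Ingest(store, body))
	})
}
```

Mounting it could look like `http.Handle("/report", CollectHandler(&MemoryStore{}))`, where the path is again an assumption. Keeping the validation in `Ingest` separates the storage decision from the HTTP plumbing, so the store can be swapped later.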
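The two limits described above could be sketched roughly as follows. This is an illustration under assumptions, not how Let's Encrypt implements its rate limits: `Limiter` and its methods are hypothetical, time is passed in as Unix seconds to keep the logic easy to test, and a real service would also need to prune old entries.

```go
package main

import "sync"

const secondsPerDay = 86400

// ipDay identifies one IP address within one UTC day (fixed window).
type ipDay struct {
	ip  string
	day int64
}

// Limiter enforces two hypothetical rules: a cap on new tracking ids per IP
// per day, and a minimum gap between reports from the same tracking id.
type Limiter struct {
	mu           sync.Mutex
	maxIDsPerDay int
	minReportGap int64 // seconds
	created      map[ipDay]int
	lastReport   map[string]int64
}

// NewLimiter builds a limiter with the given thresholds.
func NewLimiter(maxIDsPerDay int, minReportGapSeconds int64) *Limiter {
	return &Limiter{
		maxIDsPerDay: maxIDsPerDay,
		minReportGap: minReportGapSeconds,
		created:      make(map[ipDay]int),
		lastReport:   make(map[string]int64),
	}
}

// AllowNewID reports whether ip may register another tracking id today.
func (l *Limiter) AllowNewID(ip string, nowUnix int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	key := ipDay{ip: ip, day: nowUnix / secondsPerDay}
	if l.created[key] >= l.maxIDsPerDay {
		return false
	}
	l.created[key]++
	return true
}

// AllowReport reports whether the tracking id may submit data now.
func (l *Limiter) AllowReport(id string, nowUnix int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	last, seen := l.lastReport[id]
	if seen && nowUnix-last < l.minReportGap {
		return false
	}
	l.lastReport[id] = nowUnix
	return true
}
```

With `NewLimiter(10, 6*3600)`, an IP could register ten ids per day and each id could report at most once every six hours. Note that the limiter would hold IP addresses in memory for the current window, which is a trade-off against the anonymity goal discussed above.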

anbraten · 2022-09-01T15:07:23Z

Maybe we could also show a popup to an admin on first login, asking whether they want to send usage data, and generate the tracking id only after that.

qwerty287 · 2024-01-20T14:46:57Z

I'd like to ask some things about this again:

There's https://github.com/woodpecker-ci/analytics without real activity - do we still want to add usage tracking?

To be honest, I don't really see a value in it.
From @anbraten's comment about the data that should be sent:

version
users counter
active repo counter
executed pipelines counter
total pipeline execution time
connected agents counter

I don't really see how these can be used to improve development.

used forge
activated features?
used agent backends
server and agent OSes

For these, I can see a value, but this is data that should only be sent once. We can easily run a poll to find out how many users use which backend, which OS, etc.

anbraten · 2024-01-20T16:36:54Z

I think it would still be pretty helpful to get more insights about our users. For example, we always have to consider whether we need to add options on the repo level or the instance level. A user with an instance running on a Pi is totally fine updating pipeline configs or env vars; for larger companies or communities like Codeberg, it is often not possible to force all users onto a specific approach directly, so a completely different update approach is needed, where features like config versions could help. Or things like quotas / user-provided agents would be really helpful for large instances, but probably no Pi user cares about limits. We for sure have to provide both, but insights could help us to focus here. We could even do specific things like counting the number of repos that have pipeline option x set, allowing us to decide based on actual data whether we drop that option or have to keep it / provide an alternative.

I do, however, expect that quite a lot of users in the community are against any kind of tracking, whether it is totally public or not, which could make the data pretty unreliable for us.
Using surveys isn't an option for me. Just think about whether you would want to fill one out yourself, or take the "who is using wp" discussion as a reference.

anbraten mentioned this issue Mar 1, 2022

First ever Woodpecker community call #198

Closed

4 tasks

anbraten mentioned this issue Oct 21, 2022

Add endpoint to receive daily statistics woodpecker-ci/analytics#1

Open

anbraten mentioned this issue Nov 26, 2022

Roadmap #869

Closed

31 tasks

anbraten added this to next release Aug 30, 2023

anbraten moved this to Backlog in next release Aug 30, 2023

qwerty287 added this to the 3.x.x milestone Nov 4, 2023

qwerty287 added the feature add new functionality label Feb 11, 2024

zc-devs mentioned this issue Sep 3, 2024

Let linter check against vulnerable plugin list #4080

Open
