phd placeholder: Accountability Protocol for Future Decentralized Applications #4719
You can also see some double-spend experiments with network knowledge for Trustchain and Tribler: Issue #4634 |
I get the overall idea (which is nice!). I might have some comments to improve the presentation and clarity of your diagram:
|
First out of 20 blocks done: |
@qstokkink's work is basically the "Peer Discovery" module in @grimadas's architectural diagram above. With this module, we want to be sure that peers get to know as many other peers as possible. We still need to run tests with larger graphs. The next sprint will be on the next component of the network layer, which we call the "fraud-resilient overlay." Basically, this component will enable the nodes to build their 1-hop neighborhoods in the overlay network. The idea is to choose neighbors that are as diverse as possible with respect to the latency measurements, with the aim of decreasing the probability of a Sybil appearing in one's neighborhood (see details here). The graph below is a simple representation of how much we would decrease this probability. I used this latency dataset with 490 nodes and assumed that one of the peers creates 50 Sybils around itself. On the x-axis are the nodes. On the y-axis is the probability that the respective node has a Sybil as its first neighbor. The blue line indicates random selection of nodes; the orange line shows our strategy. Next sprint: Building overlay network
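A minimal sketch of this latency-diverse selection rule (the greedy heuristic and names are illustrative assumptions, not the actual module code): neighbors whose RTTs are maximally spread out are preferred, so a cluster of Sybils sharing nearly the same latency rarely occupies more than one slot.

```python
import random

def pick_diverse_neighbors(latencies, k):
    """Greedily pick k neighbors whose RTTs are spread out as much as possible.

    latencies: dict mapping peer_id -> measured RTT in ms.
    Sybils created by a single attacker tend to share (almost) the same RTT,
    so spreading selections over distinct latencies lowers the chance of
    picking several of them into the 1-hop neighborhood.
    """
    candidates = dict(latencies)
    first = random.choice(list(candidates))     # random seed to avoid deterministic bias
    chosen = [first]
    del candidates[first]
    while candidates and len(chosen) < k:
        # Take the candidate whose RTT is farthest from all already chosen RTTs.
        best = max(candidates,
                   key=lambda p: min(abs(candidates[p] - latencies[c]) for c in chosen))
        chosen.append(best)
        del candidates[best]
    return chosen

# Toy example: 50 Sybils at ~10 ms among 440 honest peers with spread-out latencies.
rtts = {f"sybil_{i}": 10 + random.random() for i in range(50)}
rtts.update({f"honest_{i}": random.uniform(5, 300) for i in range(440)})
print(pick_diverse_neighbors(rtts, k=10))
```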
|
By properly pruning the matching partners we get better results from the stable b-matching algorithm. Most of the nasty side effects (in terms of messaging) of periodic cleanup seem to be patched up by having a larger number of maximum matches. These results are for a network of 300 peers, where every peer has 180 open connections (of which roughly 120 are considered to have a unique latency). |
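As an aside, a greedy approximation of the pruned, latency-based matching described above could look roughly like this (a sketch under assumed data structures, not the experiment code; the real algorithm is a stable b-matching):

```python
def greedy_b_matching(latency, capacity, min_gap=1.0):
    """Greedy approximation of latency-based b-matching with pruning.

    latency: dict mapping frozenset({a, b}) -> RTT between peers a and b.
    capacity: maximum number of matches (b) per peer.
    min_gap: a candidate pair is pruned if its RTT is within min_gap ms of an
             already accepted match for either endpoint (non-unique latency).
    """
    matches = {p: [] for pair in latency for p in pair}
    # Consider the lowest-latency pairs first.
    for pair, rtt in sorted(latency.items(), key=lambda kv: kv[1]):
        a, b = tuple(pair)
        if len(matches[a]) >= capacity or len(matches[b]) >= capacity:
            continue
        # Pruning step: skip RTTs indistinguishable from an existing match.
        if any(abs(rtt - latency[frozenset({a, m})]) < min_gap for m in matches[a]) or \
           any(abs(rtt - latency[frozenset({b, m})]) < min_gap for m in matches[b]):
            continue
        matches[a].append(b)
        matches[b].append(a)
    return matches
```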
Noodle Protocol and Related Work: We are building an optimistic P2P system for enabling decentralised applications that values scalability and decentralisation the most. Being optimistic, we need to handle occasional frauds in the system.
Nice insights from the database world:
Membership protocol
Many systems assume full membership and/or full network knowledge. This is not realistic in a dynamic network. We opt for a partial view of the network. Such a partial view must not be biased and must work under churn. One of the most interesting approaches is based on gossip:
Of course, these protocols will not work if you have unlimited Sybils that can poison your local view and isolate you from the network. A combination of a Sybil mitigation strategy and Brahms is definitely a good idea.
Partnership selection
A membership protocol is typically used to get connected and make it hard to isolate a node or partition the network. But these protocols never argue how to select partners for long-term communication that would be optimal for you.
We can also select peers based on their capacity (declared bandwidth, or declared collateral balance for being a money router). We might then optimise the delivery of messages to a specific peer and build the system around that peer. Let's look at the concept of being interested in a peer as a topic in a PubSub system. Periodically, a peer declares an interest in peers as a vector of topics (or a vector of trust). The overlay network organises itself into a pubsub system that gives guarantees on convergence and risk. For example, if you see that a peer has too many subscribers with low capacity, you might prefer some other peer (see the sketch after this comment). For inspiration, you might look into: Some of the gossip protocols require a verifiable neighbourhood, which is either built based on hashes or is always limited. One of the main downsides of gossip is freeriding, which in our case is Byzantine behaviour (a peer not forwarding information to hide certain transactions). More on that in the accountability section.
Accountability of partners
If we say that information forwarding is crucial, we might want to track whether everybody is following the protocol and incentivise good behaviour. One of the earliest practical proposals is
Fraud resolution
Making P2P Accountable without Losing Privacy
We say that the resolution policy is application-specific, and we aim for deterministic resolution in P2P.
Sybil attacks and latency
We don't need to solve distinct identities and one-man-one-identity; group distinctness is enough:
The problem with these approaches is the beacons, which are a central point of the system. Another line of work that can help to detect Sybils is based on virtual coordinates. The most famous one is:
Another interesting paper is based on rings, which are similar to our buckets: Meridian: A Lightweight Network Location Service without Virtual Coordinates, 2005. In short, this is an area with established related work; we should be careful not to miss it. |
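To make the capacity / vector-of-topics partner selection mentioned earlier in this comment concrete, here is a tiny illustrative sketch (field names and the scoring rule are assumptions, not a specified protocol):

```python
from dataclasses import dataclass, field

@dataclass
class PeerAdvert:
    peer_id: str
    capacity: float                                 # declared bandwidth or collateral
    subscribers: set = field(default_factory=set)   # peers already interested in it

def partner_score(advert: PeerAdvert) -> float:
    """Prefer peers with spare capacity per current subscriber.

    A peer with many subscribers and low capacity is oversubscribed,
    so it scores low and we look for another partner.
    """
    return advert.capacity / (1 + len(advert.subscribers))

def choose_partners(adverts, k):
    """Pick the k most attractive partners from the received advertisements."""
    return sorted(adverts, key=partner_score, reverse=True)[:k]
```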
There should be no need to depend on declared bandwidth, as we partner based on latency. The less bandwidth you have available, the larger the RTT will be as you transact more in the system. This has already been widely researched in the congestion control sphere. We can look into something like BBR: Congestion-Based Congestion Control. |
I think we should distinguish at least three types of peers:
Even in the case of Tribler the peers are not homogeneous; there are different roles (exit node, relay node, etc.). I think that fundamentally changes how you build the overlay. |
There is a bunch of related work on accountability and reliable and scalable gossip/multicast. We are now working on a Sybil-Resilient Verifiable Neighborhood. Everybody follows the diversity protocol, which, together with random sampling (like Brahms), gives some protection from Eclipse attacks and network partitioning in the presence of Sybils in a static environment. This is enough to use gossip information dissemination and discover peers that we want to interact with, but it is not enough to give a risk estimation. We want to calculate the risk of getting an invalid token (double-spent), the risk of not getting paid (outdated balance estimation), and finally the time to detection. In other words, we want to answer three questions:
To answer all three questions and make a practical system is a hard task, if even possible. For us it is good enough to answer all of them probabilistically, e.g. instead of hard staleness we can go with eventually complete knowledge (k-staleness), but in this case it is very important to bound the dynamics of the partner set, i.e. the number of peers you can add/remove at a certain time. Suppose we have a function:
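The concrete function is not reproduced here; as a purely hypothetical illustration of how a probabilistic, k-staleness-style risk estimate could be shaped (names and formula are my assumptions, not the actual definition):

```python
def double_spend_risk(known_txs, estimated_total_txs, partner_churn_rate, rounds_since_sync):
    """Hypothetical risk estimate under eventually complete knowledge (k-staleness).

    known_txs / estimated_total_txs approximates how complete our view of the
    counterparty's history is; the more the partner set churns and the longer
    since the last sync, the staler (and riskier) that view becomes.
    """
    completeness = known_txs / max(estimated_total_txs, 1)
    staleness_penalty = partner_churn_rate * rounds_since_sync
    return max(0.0, min(1.0, (1 - completeness) + staleness_penalty))
```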
To answer these questions we have two directions:
Direct agreement among partners
It gives clear bounds and determinism on which we can calculate further transaction risks. It is only used when there is an update to the partner set. For a stable partnership it will be used only once.
Virtual agreement among partners
This is a more elegant approach that reuses Trustchain ideas, but it is not clear how quickly it will converge to a stable set and how to handle forks. Further, it requires synchronization of the chain and redundant interactions between peers. |
I agree. What I am arguing is that there is no need to specifically advertise bandwidth. By partnering based on latency, we already have the emergent effect of partnering with peers that have sufficient bandwidth. Regarding the partnering agreements:
I would reword this to say that you can prove to anyone that you have at least a certain peer set. Furthermore, it seems like the resource class of … Please also elaborate on how having distinct identities provides Sybil-proofness. |
We need a lock on the interaction partners, a certain commitment that a peer will interact only with them for a certain time/number of rounds. That will be used to calculate risks; no interaction is possible before providing this commitment. Every peer should eventually know about your commitment, so we can use gossip for that. This will also give guarantees on detection if done properly.
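A minimal sketch of what such a signed partner-set commitment could look like (the message format, field names and signing API are assumptions; any IPv8-style key with a sign(bytes) method would do):

```python
import hashlib
import json
import time

def make_commitment(private_key, my_peer_id, partner_ids, rounds_valid):
    """Create a signed lock on the partner set for a number of rounds.

    The commitment is gossiped so that eventually every peer knows with whom
    we promised to interact; interacting outside this set, or before
    publishing a commitment, becomes detectable misbehaviour.
    """
    payload = {
        "peer": my_peer_id,
        "partners": sorted(partner_ids),
        "valid_for_rounds": rounds_valid,
        "timestamp": int(time.time()),
    }
    blob = json.dumps(payload, sort_keys=True).encode()
    payload["digest"] = hashlib.sha256(blob).hexdigest()
    payload["signature"] = private_key.sign(blob).hex()   # assumed key API
    return payload
```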
It is an attempt to generalize the Sybil attack in our context. In our case one of the threats is Eclipse attacks by Sybils. Suppose after sampling you get a set of peers; now you want to test if this set is unique enough. You are required to estimate that at least … |
This is a crucial bit of information that was lacking in the original explanation. I can see that working, agreed.
I think we need a stricter definition of what a sybil identity is then. It is pointless to argue which measure we use for sybil identities, if we do not have this defined from the start. In the case of Bitcoin we can argue that its proof-of-work is very bad at avoiding sybils, as the rational actors are forming mining pools and have to artificially be kept under the magical 51% mark. The only thing protecting against one of these attacks in Bitcoin is the fact that the 51% attack would also devalue the coin itself. Back to the definition part: where do we draw the sybil line in Bitcoin? Is a Bitcoin sybil a single machine with a lot of CPU and multiple addresses? Is a Bitcoin sybil a factory hall filled with asics? Is a Bitcoin sybil the entire mining pool?
I will challenge the fact that it is even possible to do global verification of sybils without a single trusted party in control of all identities. |
Totally agree with you that it is tricky with mining pools. What you essentially have is multiple users colluding to contribute to one address; should they be considered Sybils or just colluders? But let's get back to our case. Our goal is to pick overlay neighbours that are good enough, and afterwards pick interaction partners that are good enough. Goodness on the overlay is based on forwarding information and not Eclipse-attacking you. Goodness of an interaction partner is based on ??? Capacity? Balance? Trust? Reputation? Depending on what we choose, the Sybil attack will change too |
Perhaps we can generalize this as the chance of your interaction partner finding any particular transaction you were involved in. Balance, trust and reputation are all derived from an (almost) complete set of transactions. |
For the next iteration we will focus on hardening the guarantees, giving quantitative bounds on finality.
Goals
The main goal is to build an accounting system which is simple yet powerful enough to support decentralised applications on top of it, such as financial markets, reputation systems, etc. Non-functional requirements:
Means:
Science behind it:
Optimizations: |
Noodle == much more than double-spending and witness lists. That was new to me. 2 years of work have been invested in the channels (the main outcome of 2 years of re-factoring). Switching to The Noodle Stack (with 5 layers it's more than a protocol) is quite disruptive and took me by surprise, as you obviously noticed (more discussion on 10 June). Set Reconciliation is a step towards the complexity of Dispersy with Bloom filters, something we wish to avoid. For a lot of use-cases (e.g. channels) you can assume there are sufficient "seeders" with 100% of the information and you can simply collect it. Everybody is required to "seed" their complete trustchain record history. A simple linear list. Can you support that easily? Channels only need linear downloads of bulk Magnet links plus filename(s). Therefore, for channels a single append-only linear list with logical deletes is sufficient. The current version requires a channel to be completely downloaded before records can be inserted. Partial downloads of subdirectories of interest are not supported. If we enhance the channel format so it can be downloaded in real time, inserted, and seeded partially, we have solved sufficient problems and can move on. For the roadmap and Kotlin implementation we first need documentation of the current channels format and commit structure. Questions to discuss over a few coffees: Do you envision The Noodle Stack supporting the bulk exchange of 1+ million magnet links with congestion control? How does this work relate to #3690, trustworthy gossip? How do you solve the problem that most witnesses will be offline 98% of the time (unfiltered)? Witness reliability filtering is probably a requirement, but also an attack vector? |
I agree. I kept this part as simple as possible. If you have an order, reconciliation is much easier and Merkle Trees and Bloom Filters can be avoided. However, a linear order, especially a global one, is not always possible or even desirable; for these cases we use reconciliation on a DAG. With Noodle we enforce a linear order on single-writer chains (personal chains - classical Trustchain). Community chains (multiple writers, single datapoint) optimistically link to converge towards a linear log, but nothing bad will happen if it stays a DAG.
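A rough sketch of how reconciliation on such a DAG can start from comparing frontiers, assuming each block carries the hashes of the blocks it links to (field names are illustrative, not the Noodle wire format):

```python
def frontier(blocks):
    """Hashes of blocks in the local DAG that no other known block links to.

    blocks: dict mapping block_hash -> {"links": [predecessor hashes], ...}.
    Two peers exchange frontiers to see which parts of the (possibly
    non-linear) community log they are missing, then pull the missing
    blocks and, transitively, any unknown predecessors.
    """
    referenced = {prev for blk in blocks.values() for prev in blk["links"]}
    return set(blocks) - referenced
```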
That would be definitely interesting to try.
Gossip is the heart and soul of Noodle. We just narrow down the gossip to only those who care (related to PubSub). In the current implementation I use push-pull gossip with reconciliation to guarantee fast convergence. Partly tested with a Wikipedia dataset; it can converge within 15-20 seconds under heavy load for 1000 peers.
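A toy sketch of one such push-pull round (in-memory objects, no networking; purely illustrative of the mechanism, not the implementation):

```python
import random

def gossip_round(me, peers, fanout=3):
    """One push-pull gossip round with reconciliation (sketch).

    `me` and each peer expose a `known` dict mapping tx_id -> transaction.
    Push: hand the partner everything it is missing; pull: take back
    everything we are missing, so both views converge within a few rounds.
    """
    for partner in random.sample(peers, min(fanout, len(peers))):
        # Push phase: offer what the partner does not have yet.
        for tx_id, tx in me.known.items():
            partner.known.setdefault(tx_id, tx)
        # Pull phase: fetch what we are missing from the partner.
        for tx_id, tx in list(partner.known.items()):
            me.known.setdefault(tx_id, tx)
```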
Witnesses are not required for the protocol to work (Guaranteed Eventual Detection). They are required to build critical applications where detection (+ grim trigger) might not be enough. Witnesses can be chosen as you like, but the best approach so far is to choose peers in the community and make sure they are diverse enough. The witness audit is enforced by a counterparty and their risk assessment, so in the end, they will decide to accept/decline the transaction. Witness availability is not a big issue since the witness set can be expanded up to everyone in the community. However, every community needs to make sure that there is at least some number of available peers in the community. But this is a research topic on its own, which I'll address later. |
Iteration progress and plans for the next sprint:
Repository
Noodle is moving to a separate repository for easier management. For now it is under my personal GitHub; I suggest moving it later to the Tribler organisation and including it as a reference in the main Tribler repo. Noodle builds on top of IPv8, with IPv8 as a requirement.
Development plans
Noodle will be continuously tested and improved. The protocol will be documented and published as a PyPI package. If successful, Noodle will join the pleiad of similar protocols: DAT, IPFS, Scuttlebutt. Short-term experiments and questions (1-2 months):
Medium-Long term research plans:
|
Just for my understanding, I'm translating the above into more abstract terms and MVP requirements for an accounting system which can underpin our live token economy.
Yes, using BitTorrent for bulk downloading of gossip information seems like a hack. It actually is a hack. Please read about this gossip work done by our master's student who was hired by Mozilla |
Well, I agree that BitTorrent is really hard to match when used as a multi-sourced FTP service, but BitTorrent is much more than that, which raises the question of why not use something different. Since it was designed for bulk downloads, the only way to decide when to use gossip vs. BitTorrent is to do heavy experimenting. My hypothesis is that gossip sync should be sufficient for most of the cases. As clients generally don't want or even don't need to download big bulk data, we might consider something more lightweight. I hope experiments will reveal that. |
Would it be a stretch too far to talk about flexible networks without rigid boundaries, instead of ‘weak ties’? Could be a unifying argument about the power of permissionless systems. |
A paper is in the works now, working title: "Universal Accounting of Goodness". Why accounting?
Insights from accounting theory
The main reason for accounting and unified rules is to address uncertainty and information asymmetry. We translate these properties to distributed systems, namely consistency and validity of information. But we also recognize that achieving 100% reliability is not always possible nor desirable, as information retrieval and storage comes at a resource cost. We model an accounting platform that is flexible and universal enough to support accounting for multiple domains. It allows users to choose different policies based on their risk model and adjust their behaviour accordingly. |
@grimadas Can you take 15 min today and post a progress update of the past summer months + all code.
The leading organisational principle of developed nations is the market mechanism with a certain level of government regulation. We introduce the first viable solution to the digital commons problem, based on accounting of social capital. To govern the commons collectively we use accounting and unified rules to address uncertainty and information asymmetry. We provide empirical data on our global database of social capital. 🥁 Intro Bami! (please push the tentative code) |
Requirements for sustainable cooperation
1. Maintenance of cooperation:
|
The main application focus of BAMI is contribution accounting to sustain cooperation. The requirements were given above, the main one being the identification of bad peers (Byzantine cheaters and freeloaders). I see several different ways to approach this application. Here are my thoughts:
What is a contribution?
A contribution is a pairwise action from one peer to another that will be recorded as a transaction with two actions: contribution T1: Contribution event - a double-signed entry. Can we do it differently?
We don't verify the event truthfulness.
Can we always identify freeloaders?
As it is connected with verification and having global knowledge - more no than yes. Only in specific applications under certain assumptions (honest, altruistic majority, global consensus). We don't prevent freeloaders with 100% accuracy.
How can trust algorithms help?
They can help in two ways: Ranking or Filtering. Depending on the application requirements, one might be preferred over the other. This allows us to achieve magical properties, but only if we view it from the perspective of an honest node:
The best candidates for these algorithms for now are variations of personalised hitting time (a rough sketch follows after this comment). We use trust algorithms to limit the effects of attacks.
Is that a global trust algorithm?
No. This is highly personalised, as the only trust anchor that a peer has are its 1-hop interactions. A peer can either verify all 1-hop peers directly, or it trusts them. (This is our assumption!) This leads to an interesting result - some peers might consider an honest peer a freeloader, and vice versa. However, we show that with enough interactions the work graph and trust algorithms lead to an evolutionarily stable state - freeloaders starving. The trust function is personalised. Global ranking or filtering emerges with time.
Where will trust algorithms not help?
Peers can still hide information and cheat by presenting only partial information to bias the trust algorithms. To protect from such attacks, peers need to exchange data on the target peer. To simplify and decrease time to detection we connect together peers by topic (interest in the target peer). This turns blind gossip into targeted gossip (similar to pub-sub). We use targeted gossip to minimise time to detection.
What do we do with freeloaders?
As the trust is personal, honest peers might temporarily see each other as suspicious. However, I believe that we should not block them, but instead give a limited service depending on the severity. For example, give less priority, or decrease speed. Thus, any freeloader would eventually starve. Punishment for freeloaders - service quality degradation.
What do we do with cheaters?
Cheating is more severe, as it is a change in the protocol with an objective proof of cheating (justified). Either more degradation of the service, or a temporary ban might be considered. Cheaters get a more severe punishment, but only after justification.
Is there a way back?
There should always be an option to get back that is more preferred than whitewashing. For example, cheaters might fix all the forks and continue. Freeloaders will get the same quality once their ranking has increased. Punishments are incentives for change.
Unaddressed issues and research questions.
Comments and more questions are more than welcome! |
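To illustrate the personalised hitting-time idea referenced above, here is a rough Monte Carlo sketch over the pairwise work graph (the graph representation and parameters are assumptions for illustration, not BAMI's implementation):

```python
import random

def hitting_probability(work_graph, me, target, walks=1000, max_steps=20):
    """Monte Carlo estimate of a personalised hitting-time-style trust score.

    work_graph: dict mapping peer -> list of peers it has done work for.
    The score is the fraction of bounded random walks starting at `me` that
    reach `target`. Peers reachable mainly through a Sybil region controlled
    by one attacker receive a low score from `me`'s point of view.
    """
    hits = 0
    for _ in range(walks):
        node = me
        for _ in range(max_steps):
            neighbours = work_graph.get(node)
            if not neighbours:
                break
            node = random.choice(neighbours)
            if node == target:
                hits += 1
                break
    return hits / walks
```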
Excellent thoughts! Please expand this into a 6-page IEEE paper before 21 September. Storyline: "Facilitating the emergence of truthful accounting". Our remarkably simple accounting mechanism does not record any details of interactions within a community, yet effectively suppresses freeriding, cheating and fraud. Our only mechanism is to irrefutably record the existence of an interaction between two community members. Paradoxically, the mere recording of this interaction, together with higher-level mechanisms, leads to the emergence of global cooperation and starvation of freeriders. We successfully devised rules with the complexity of tit-for-tat with... With one addition we can make our mechanism resilient against fake identities within a complete collaborative framework. New identities start with zero community standing and only receive service if they have made a net positive community contribution. Further details are outside the scope of this work :-) |
The Truth Convergence Paradox. Mechanisms that establish "single-event truthfulness" do not converge to fraud-free communities as well as mechanisms which ignore truthfulness at this level and operate only using partial information, random gossip, and optimistic probabilistic detection. |
Related work: trying to enforce single-event truthfulness at the cost of efficiency (destroys 300Mbps+ download capability of Libtorrent) "Treat-before-trick: Free-riding prevention for bittorrent-like peer-to-peer networks" |
Bami Blocks have been born! Nice work 🥇 Management concern: when the PhD work is finished the outcome is either: 1) a self-maintaining GitHub community or 2) dead code in Tribler. Please only make a modest change to Trustchain, and keep improving and expanding it as much as you like. But keep the first version of Bami integrated in Tribler clean of any non-Tribler features. Proposal: by 15 November there is a stable Tribler build on our Forum with Bami 1.0 ready. Time-boxed approach. List of potential issues to address (brainstorm):
Candidates for later improvement cycles (brainstorm), after the Bami 1.0 release:
|
After hard thinking and planning, I decided to follow the most straightforward yet powerful approach there is:
Simplest accounting mechanism for Tribler:
In total a peer will store only …
Reputation mechanism
The latest experiments and research showed a surprising result - a reputation mechanism, if made properly, provides a substantial level of robustness, even in the presence of Sybils and fake data. Hence we defer all security and robustness to the reputation mechanism, leaving accounting very, very simple. This reflects the vision and is sufficient to build a sharing economy with minimal complexity. |
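For concreteness, a minimal sketch of what such a barebones pairwise record and the derived balance could look like (field names are assumptions, not the deployed format):

```python
from dataclasses import dataclass

@dataclass
class BandwidthRecord:
    """One double-signed statement: `giver` uploaded `amount_mb` to `taker`."""
    giver: bytes            # public key of the contributing peer
    taker: bytes            # public key of the receiving peer
    amount_mb: int          # megabytes transferred in this interaction
    sequence: int           # position in the giver's personal chain
    signature_giver: bytes
    signature_taker: bytes

def net_contribution(records, peer):
    """Aggregate a peer's balance: total given minus total taken."""
    given = sum(r.amount_mb for r in records if r.giver == peer)
    taken = sum(r.amount_mb for r in records if r.taker == peer)
    return given - taken
```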
wow! Thank you for your Tribler-centric/lab-centric thinking. So with barebones accounting, latency diversity, peer ranking, competitive upload spots we get spam flooding protection and Sybil-resilience? We need more? |
Yes, according to the latest research insights, it provides sufficiently solid security and allows us to build the first version of a micro-economy based on sharing 💪. |
@grimadas Do you need this now? It's currently not part of IPv8's wild codebase (as the related research is not accepted anywhere yet). I can hook it in though. |
I don't think that we need the latency diversity for the first (barebones) version of the bandwidth ecosystem. I advise to consider this as a task for next iterations. I will also update our ongoing write-up with these key insights. |
Alright, I'll leave that out then. It would probably also only serve to hinder you anyway: the DAS-5 running 1000 instances per node would be detected as a Sybil attack 😄 |
The barebones accounting is now deployed successfully, dubbed LedgerZero. Please remain focused on an upload incentive. First make it work. Try to suppress the reflex of generalisation into a generic incentive mechanism. Nobody has ever created an indirect reciprocity mechanism that actually works and scales. The tragedy of the commons for 1 resource is unsolved. |
As discussed. Was the focus on freeriding for 16 years smart? In a quality community, do we then induce unconditional giving? Platforms engagement and sharing: |
Sybil attacks are now investigated by six isolated fields. They are not aware of each other. They lack the understanding of how fundamental their problem is. They are not fully aware of all prior and related work
|
Unified model of trust and intelligence (dream outcome)
Two isolated fields are probably much more closely aligned than anybody realises. Unbelievable (really) similarities exist between theories of trust and theories of mind. With decades of scientific effort we might unify the Evolutionary Connectionism architectures with our continued research into incentives and credit allocation (e.g. MeritRank) within collaborating humans (our ongoing research for the past 23 years 😲). See Evolutionary Connectionism: Algorithmic Principles Underlying the Evolution of Biological Organisation in Evo-Devo, Evo-Eco and Evolutionary Transitions from 2016. Prof. Levin's great writing from evolutionary biology: The collective intelligence of evolution and development, published in the collective intelligence open-access journal he co-founded. Great quotes: Reward functions appear in biological systems, in our collaborative systems based on MeritRank which address freeriding, and also in a state-of-the-art robot racing pilot. See the Nature Journal quote with the reward terms for reinforcement learning with collision avoidance. Update: another fascinating Nature paper studies credits. These are micro-level credits, used to update weights (see overview paper). The GitHub code for the backprop alternative.
Scientific Foundation for Offline Token: |
Future Post-PhD Roadmap:
This is a first draft plan for my future research/development. Realizing the dream of the 1992 Web of Trust. This involves development, deployment in Tribler and writing a research paper on it, specifically:
Non-functional requirements
Infinite interaction until you achieve these:
|
Working title: Project Noodles
A draft of the layers of the accountability tools for next-generation applications.
This is a semi-layered architecture draft for a better understanding of the concept:
Peer Counting and Sampling in Overlay Networks: Random Walk Methods
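A small illustrative sketch of random-walk-based peer sampling and size estimation in an overlay, assuming each peer can return its current neighbour list (the names and the collision-based estimator are generic textbook techniques, not a specific paper's method):

```python
import random
from collections import Counter

def random_walk_sample(start, get_neighbours, length=10):
    """Return the peer where a random walk of `length` hops ends.

    Sufficiently long walks approximate sampling from the stationary
    distribution of the overlay (biased towards high-degree peers, which
    can be corrected with Metropolis-Hastings style transition rules).
    """
    node = start
    for _ in range(length):
        neighbours = get_neighbours(node)
        if not neighbours:
            break
        node = random.choice(neighbours)
    return node

def estimate_network_size(start, get_neighbours, samples=200, length=10):
    """Crude population estimate from sample collisions (birthday paradox).

    With n roughly uniform samples and c observed colliding pairs, the
    network size is approximately n * (n - 1) / (2 * c).
    """
    seen = [random_walk_sample(start, get_neighbours, length) for _ in range(samples)]
    collisions = sum(c * (c - 1) // 2 for c in Counter(seen).values())
    return float("inf") if collisions == 0 else samples * (samples - 1) / (2 * collisions)
```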