-
Notifications
You must be signed in to change notification settings - Fork 451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using proximity to mitigate the Sybil attack #4481
Comments
Reading List (done):Trust, Reputation and Accounting Mechanisms
Evolutionary Game Theory To read:
I have written a pretty comprehensive introduction to the trust problem and have been thinking about the different existing trust and reputation mechanisms, such as
I wrote down all necessary requirements that our mechanism will have to fulfill and devised a mathematical framework for these. In the coming week I will finish the framework for trust and incorporate elements from Nowak's papers about the development of trust and reputation and evolutionary game theory, so that we can send the first draft to Japan. For the geographic proximity I have thought about this: There is a slight problem with this though. How many peers does A have to work with and where should these peers be located in order for no credible delay to exist with which S can convince everyone that S and B are not in the same place. I will perform some testing over the weekend and come up with some formulas in the next sprint. |
I would say this highly depends on the trust requirements of the application in which you deploy this system. It would be nice to have some model that determines: peer S is a Sybil, controlled by B with X% confidence. This is influenced by the number of peers you select for the "ping test", and the variance in ping times. |
Fascinating discussion outcome:
|
More on the social side of "social proximity": from the book: What's Mine Is Yours: The Rise of Collaborative Consumption, plus mentioned here. |
And more on the trustworthiness-proximity relation, from The concept of decentralized and secure electronic marketplace, 2008:
|
What if B controls a botnet? |
Generic scientific problem formulation attempt draft We demonstrate a strategy for cooperation in which only cooperators are long-term successful, given some assumptions. Our strategy is an "evolutionary stable strategy (ESS), meaning that if adopted by a population in a given environment, is impenetrable and cannot be invaded by any alternative strategy. We assume the existence of a secure identity function. The agent population is not assumed to be static. New agents continuously enter the system. The secure identity function ensure agents can't defect in one round, re-enter the system with a new identity, and defect in future rounds. The secure identity function creates a probability that previous defections can be linked to the misbehaving agent. In general, each agent in our system can rely on the secure identity function to test a certain property of another agent which is unequivocally true and is resilient to strategic manipulation. (protects against botnet, as correctly remarked by @MeadowHillSoftware) We assume the ability of agents to gossip successful acts of cooperation. We define this as the Each agents uses a The core of each agent strategy is to build personalized community of trusted collaborators. Each agent is either building this community (bootstrap phase) or maintaining it (bootstrapped). Specifically, each agent continuously maintains a small community (e.g. 50) of other agents it has build up a long-term reciprocal relationship with. This community is different for each agent. We make the cardinal assumption that these long-term cooperators are also truthfully reporting on the cooperative behavior of others. These collaborators are used to conduct The uniqueness of our strategy is the cryptographic protection of Our mechanism is very restrictive in the bootstrap phase. Reliable cooperators introduce us to their reliable cooperators. After the bootstrap phase, we continuously are introduced through gossip with strangers. We then expand our network into non-reciprocal agents, we transition into network reciprocity strategy. Emergent effect: successful clusters of cooperation grow, link with other clusters, and merge into a single connected cooperative cluster. ToDo: mathematically prove certain attacks to be unexploitable. |
Sounds like a "proof of good deed" or "proof of public service". |
Thnx for explaining your token cap idea:
Emergent property: certain regions have ran out of "trust"; no kindness to strangers, other regions trust strangers. |
There are two kinds of resources: the ones that can be stored (stockpiled) and the ones that can't. One cannot store bandwidth. Owning a PC that has some bandwidth available to it at all times is like owning a small personal hydro power plant: I don't care who uses my electricity as long as I (first) and my friends (second) have priority of using it at all times. I'm completely fine with giving my "free" power to strangers if no one else wants it, because I can't store it. At any moment, I cannot consume more bandwidth than my uplink allows me to, thus I can't "save up" some bandwith for the future. Basically, this is a post-scarcity economy. We don't need accounting. We need priority queues based on reciprocity. |
Excellent point. Many users have monthly caps, so bandwidth is finite. Unused bandwidth can be given away. The effort of leaving you laptop running overnight can be a burden. |
We now have this fraud-resilient overlay idea: Fraud Resiliant OverlayWe want to create an overlay network which makes it difficult for sybils to gain majority in a specific node's neighborhood. Motivation:
Input: Network Layer
Notes - Todos:
Related Works:
Where to use
|
Some minor comments on the model:
|
Comments about my discussions with Hisashi Ohtsuki at Sokendai university in Japan: Hisashi's idea was to introduce the notions of reputation scores in combination with behavioural strategy sets. Given a fixed but arbitrary reputation score, agents choose a particular strategy. Example Reputation Scores: Example Behaviour Strategies In Hisashi's and Nowak's case, reputation values are binary, whereby 0 is a bad reputation and 1 is a good reputation. The set of strategies is binary as well , with C standing for "cooperate" and D for "defect". A behaviour strategy is a method , which determines whether a node with a given reputation value wants to cooperate with another node. For example: A reputation dynamic is a function that assigns a node that has made a decision whether to cooperate or defect with another node a new reputation value, i.e. This yields reputation dynamics and behavioural strategies. Here Hisashi introduces the concept of a relative payoff of a behavioural strategy given a reputation dynamic. He then identifies the 8 most lucrative behaviour strategies, called "leading eight". It is our goal to find a continuous equivalent to this reputation dynamic, which additionally belongs to the leading 8. We explored a number of different graph theoretical centrality measures: EigenCentrality, KCores, Shortest Paths, Katz Centrality, Betweenness Centrality, Shapley Value, etc. We came to the conclusion that none of these metrics are truly sybil resistant and that an inversed approach might be smarter, i.e. find some kind of impossibility result. So far all of our attempts have been heuristic. Hisashi advised me to take a more theoretical approach and set up a more strictly defined set of axioms for such a dynamic function. Hisashi acknowledged that the problem was not as closely related to evolutionary biology as it first appears, as non-unique identities change the nature of the problem significantly. I'm currently devising such a set of axioms oriented along the idea of the leading eight: The highlighted tuple (d,p) would be the optimal in this case. So far, I have come up with 4: Cooperate, Punish, Apologise, Forgive:
If we choose a strict enough set of axioms we can maybe come up with a nice impossibility result. There are number of differences between our model and the model of Hisashi's work, namely:
|
Focus&brainstorm for thesis direction:
|
Please upload you current research notes here + add a brief 1 page with possible research directions; given discussion(s) we had. (e.g. something with strangers get lower priority) |
Research Notes, thesis draft and possible research directions have been added here |
remarks, as going through the current thesis draft. "2 Problem Description: Sybil Attacks and Misreports" 4.1 We improve upon the notation and model introduced by [PIMOTTE] and [SEUKEN]. Definition 5.1 "6.3.1 Allocation Policies" Define freeriding please Definition 5.5 (Misreport transactions). Define "stranger" and initial risk taking; people of the 'shadow of the graph'. No direct observation. Or another approach to a definition: if you know their interaction history they are no longer strangers. If the stranger shares their interaction history; this action transforms them into non-stranger. Come up with some nice proof. Suggestion bounded Stealing with parameter |
Thesis draft updated. |
Good meetings yesterday and today. A quick summary of my model (a more comprehensive discussion can be found in my thesis draft): The concept of work graph is repeated below: We begin by introducing what we call a similarity vector (might be renamed later). This is a vector for every node in the subjective work graph, consisting of attributes pertaining to the physical location of the node in question: The idea behind the similarity vector is that by default Sybil nodes all will have very similar values. The individual components should relate to ping times from the seed node (as well as from certain established nodes perhaps), location in the network tree (traceroutes) and IP-addresses, etc. Note that we want to keep this as generic as possible for now. Next, the proximity graph is derived from the similarity vectors as follows: This means that every set of nodes in the network that are very close to one-another, by standards of similarity vectors are grouped into one neighbourhood. Note that some false positives might arise, but not false negatives, i.e. some non-sybils will be grouped into the same neighbourhood as sybils, but no sybils should be overlooked. Later on our goal will be to optimise our similarity vector in such a way as to minimise the amount of false positives through use of a clever metric... Now, using the proximity graph we redefine the subjective work graph as follows: This means that all nodes that are considered Sybils by our similarity vector are collapsed into one single node, whereby all edges in between the potential Sybil nodes are dropped and we obtain a "proxy node" representing all Sybils as one. If this is implemented effectively, all fake edges in between malicious nodes are removed. In the next step we can rely on already established mechanisms of reputation, such as maxflow, netflow, hitting time, etc. So instead of looking like this: We would call this a "two-layer trust model" instead of our previous "one-layer trust model", i.e. trust models that are only based on the unmodified work graph. In the next week, we will have to come up with reasonable metrics for the similarity vectors as well as determine the best possible reputation mechanisms to applied on the modified work graph now. |
Thesis raw .PDF link ongoing discussion:
|
Summary of my progress and my meeting with Dion: The argument is that while no work is required to go from the above graph to the one below, node k gains some reputation through adding the edge (j,k). If the weight of the edge is large enough then it will hold . Hence, the attack must be a strongly beneficial as it holds and and therefore . However, this is actually incorrect as the work that corresponds to the edge (i,j) is not taken into account at all. Seeing as this (the attack edge) is an indispensable part of a sybil attack, I found myself disagreeing with the assertion of this theorem. I think that this kind of attack constitutes a weakly beneficial sybil attack at most, which is a pretty big difference. I introduced two new definitions: parallel-report responsiveness and serial-report responsiveness, based on which I have come up with two theorems that actually prove the existence of strongly beneficial sybil attacks under the assumption of these slightly more restrictive conditions. From this definition, in combination with the former definitions we can derive the existence of a strongly beneficial sybil attack. The second definition is given as follows: From this we can derive the theorem: The proofs to the are on pages 20 and 23 here. In the future we might be able to prove that any accounting mechanism which does not allow for parallel and serial report responsiveness could be strongly sybil resistance. In order to do this, we need to define the reward of a sybil attack more accurately. So far, I have come up with the definitions: and However, the upper definition is given in terms of work consumed, while the lower definition is given in terms of reputation gained. These two are not directly comparable and it is our goal to find a function that translates an increase of reputation into an expected value/amount of additional data that can be consumed. Over the course of the next week I will try to come up with a function for this. However, this is likely going to be a rather difficult thing to do as it is not deterministic, but rather probabilistic. Depending on the allocation policy a node will receive some data, if it is the node with the highest reputation among all nodes in the choice set. However, it does not have any influence over the nodes it is "up against". Hence an increase in reputation is only directly related to an increase in probability of being served, making this a rather difficult problem to solve. Finally, I discussed the model of physical proximity with Dion and he suggested that in order to avoid any excessive free-riding of Sybil agents (through being assigned to the same neighbourhood as a large number of honest nodes) I should somehow limit the number of nodes in a neighbourhood of the proximity graph. I will have a think about this problem as well over the course of the coming week. |
Goal of thesis: a theoretical proof of full Sybil resilience. We obtain this goal by imposing very harsh and restrictive conditions. Definition: Samaritans are nodes which give away resources for free to total strangers. After a node acted as a samaritan, they are no longer strangers. Definition: Positive net contribution. A particular node only collaborates with another node if the sum of all historical exchanges remains positive. Definition: Remoteness. Unless you have a positive net contribution, we consider you a stranger and will not interact. (now we have long-term tit-for-tat, very well studied stuff) Definition: Custodian responsibility. A custodian responsibility is an alternative form of transitive trust. Taking custodian responsibility mean introducing a stranger to a node to which you have a positive net contribution. The amount of custodian responsibility you take is subtracted from your positive net contribution and assigned to the stranger. The stranger then becomes a samaritan, through "vouching". (not incentive compatible, but we'll get there) Desired theorem: Definition: zero-cheating exposure is loosely defined as that you can never be cheated. Thus you are very paranoid, unforgiving, and only engage in positive net contributions. You never interact with strangers, only samaritans. Only interact to strangers with custodian responsibility. Thus there exists an initial investment, proof-of-work or attack edge. Only restrictive weak minimal transitive trust exists (add more weasel words). Open question: how do we restrict this or add incentives? Nodes primarily engage in direct reciprocity and ensure a positive net contribution. Desired theorem: |
@grimadas Please define the social profit incentive we just came up with. |
One idea to get further the idea of |
ISPs provide space for rent in their datacenters directly connected to the backbone. It is possible to strategically place a single node in each of these backbone facilities. Such a network will allow the attacker to emulate almost any latency mapping. |
|
Note that language is everything. Similarity vector, nice one. Proximity graph, you can make it a stronger concept by giving it a stronger name like clone graph used to estimate suspicious nodes likely controlled by a single entity. Something less strong does not communicate the strength that this concept represent, twin graph, coupling graph, equality graph or duplicates graph. Clone is a nice negative word that destroys individualism. Please add the "Beta reputation system" from 2002, credited as being the first such proposal in history. How is your model more evolved and powerful? Finally, please start considering getting your work in article form before the end of this year. We might be able to get into Auckland, New Zealand; see 2019 workshop version https://sites.google.com/view/trust2019/ |
Presentation on Friday went very well. You find the Powerpoint here. |
Mid-term presentation review. Result: happy. Swap "We seek a generic incentive mechanism for human cooperation", discuss this first; then file sharing? Slide 10: cite to related work in small font? Slide 19: challenge of open systems that fake identities are cheap; no central control point or Internet police. |
Somewhat related work: "The Economics of Trust and Reputation: A Primer, Lu´ıs M B Cabral" |
LATEST thesis is now 58 pages long Solid progress:
Good idea now to make each chapter Key Chapters:
Chapter 14 now contains great stuff: "Trust and Reputation are equivalent to the flow of water in between containers (equivalently capacitors). In an interaction, reputation will flow from the higher container to the lower one.". In December the thesis might be included in some form in some chapter(s):
|
LATEST thesis is now 67 pages long Solid progress:
|
Yes, please note that the current final thesis is found here. I transitioned to the official TU Delft thesis template a few months ago. Sorry if this caused any confusion. I will make sure to "tidy up" the repository in the days to come. |
Thesis finished! Final product can be found here. Thanks a lot everyone. |
Due to some merge conflicts my final thesis repository has now been moved here. |
Lets make this this the main weekly update issue. Target for coming time: a scientific publication. Also some experimental work with single Python script that can handle 120 million blocks? |
@synctext this thesis has been completed. Can we close the issue? |
We utilize the inherent fraud-resilience of proximity to stop unlimited fake identity creation
Close proximity is a property which is easy to validate. Also online it is possible to prove that somebody is less than 500km away. This enables us to distinguish close acquaintance from far away strangers.
Simple methods such as sending a large number and expecting it to be returned quickly offers a tamper-proof proximity measure. The laws of physics and limited speed of light prevent tampering and can prove that you are located nearby.
This master thesis will create a mathematical model which reflects the success of proximity-based defenses for a given strength of the Sybil attack.
We make a realistic assumption on attacker capabilities. It is far harder to create fake identities in close proximity to a victim than at any other random location on the Internet. We assume that attackers have finite resources and some financial constraints. The goal is to prove strong mathematical properties of defenses and attacks. In a generic cooperative agent model build trust after several successful interactions. We believe that a successful strategy to build cooperative cluster in an overwhelmingly defectors-dominated world is to do some initial risk taking when meeting strangers. By adding our proximity mechanism it is possible to build highly effective mechanisms to identify cooperators in close proximity plus test, discard non-cooperators, and mitigate (Sybil) attackers. We further assume cooperators can gossip about the collaboration levels of others using a tamper-proof data structure, for instance, our IETF Internet Standard draft trustchain.
Seuken proved that it is impossible to defend against the Sybil attack in the general case, in any system which has single-report responsiveness. This thesis work is meticulously designed to bypass that impossibility result and create a meaningful, effective and even highly efficient system without single-report responsiveness. We have looked into restricting communication to strangers using proximity for years, #2541. Prateek Mittal from Princeton exploited the temporal dynamics of the Sybil attack for a defense. This work also builds upon our our prior work by Pim Otte.
Desired outcome:
(still tentative, we just see how far we get with this master thesis direction)
The text was updated successfully, but these errors were encountered: